Title: Study on Pretreatment Methods of Stellar Spectral Data
Authors: Jiang, Bin; Chen, Bingrui; Zhao, Ziliang
Author affiliation: [Jiang, Bin; Chen, Bingrui; Zhao, Ziliang] Shandong Univ, Sch Mech Elect & Informat Engn, Weihai, Peoples R China.
Conference name: 4th IEEE International Conference on Big Data Analytics (ICBDA)
Conference date: MAR 15-18, 2019
Source: 2019 4TH IEEE INTERNATIONAL CONFERENCE ON BIG DATA ANALYTICS (ICBDA 2019)
Keywords: Big Data; Machine Learning; Auto-Encoder; PCA; Stellar spectra; SDSS
Abstract: Stellar spectral classification is an important part of the automatic recognition of celestial spectra. In the era of Big Data, many observatories produce PB-level volumes of spectra every day. Although much attention goes to new machine learning classification methods, spectral data preprocessing is also integral to improving classification accuracy. This paper compares different pretreatment methods and analyzes the steps of spectral data pretreatment, which include abnormal data detection, data standardization, and feature selection. We also compare the effects of two machine learning algorithms, PCA and Auto-Encoder, combined with different pretreatment methods on SDSS data. The experimental results show that applying the correct preprocessing methods to the SDSS data set improves classification accuracy regardless of the classifier used. In addition, for spectral data, PCA performs better than Auto-Encoder in most cases.
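
The pretreatment steps named in the abstract (abnormal data detection, data standardization, and PCA-based feature reduction) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the outlier rule (a median absolute deviation cut on each spectrum's median flux), the per-spectrum z-score scaling, the function names, the threshold `k`, and the synthetic input are all assumptions chosen for illustration, and no SDSS data is used.

```python
import numpy as np

def drop_abnormal(flux, k=5.0):
    # Assumed rule: discard spectra that contain non-finite pixels or whose
    # median flux lies more than k robust sigmas (MAD-based) from the sample median.
    flux = flux[np.isfinite(flux).all(axis=1)]
    level = np.median(flux, axis=1)
    center = np.median(level)
    mad = np.median(np.abs(level - center)) + 1e-12
    keep = np.abs(level - center) <= k * 1.4826 * mad
    return flux[keep]

def standardize(flux):
    # Assumed scheme: scale each spectrum to zero mean and unit variance.
    mean = flux.mean(axis=1, keepdims=True)
    std = flux.std(axis=1, keepdims=True)
    return (flux - mean) / np.where(std == 0, 1.0, std)

def pca_reduce(flux, n_components):
    # PCA via SVD of the column-centered matrix; rows of vt are the
    # principal axes, ordered by decreasing explained variance.
    centered = flux - flux.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

# Synthetic stand-in for a table of spectra (rows = spectra, columns = pixels).
rng = np.random.default_rng(0)
spectra = rng.normal(1.0, 0.1, size=(200, 50))
spectra[3] += 100.0       # an artificially bright outlier
spectra[7, 10] = np.nan   # a spectrum with one bad pixel

clean = drop_abnormal(spectra)
features = pca_reduce(standardize(clean), n_components=5)
```

The resulting `features` array has one row per retained spectrum and five columns, and could be passed to any classifier. An Auto-Encoder would replace `pca_reduce` with a learned nonlinear encoder; that variant is not sketched here.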