Title: Data Mining of Stellar Spectra with Emission Lines Based on Hadoop
Authors: Ge, Guozhou; Pan, Jingchang
Author affiliations: [Ge, Guozhou; Pan, Jingchang] Shandong Univ, Sch Mech Elect & Informat Engn, Weihai, Peoples R China.
Conference: IEEE Advanced Information Management, Communicates, Electronic and Automation Control Conference (IMCEC)
Conference dates: OCT 03-05, 2016
Source: PROCEEDINGS OF 2016 IEEE ADVANCED INFORMATION MANAGEMENT, COMMUNICATES, ELECTRONIC AND AUTOMATION CONTROL CONFERENCE (IMCEC 2016)
Keywords: LAMOST DR2; Massive spectra; Emission Line Stars (ELS); Hadoop cluster
Abstract: The Large Sky Area Multi-Object Fiber Spectroscopic Telescope (LAMOST) is a meridian reflecting Schmidt telescope that produces tens of thousands of spectra in each observation. LAMOST data release 2 (DR2), which comprises the spectra obtained from the pilot survey and the first two years of the regular survey, was released online in December 2014. This data set contains more than four million spectra of stars, galaxies, quasars, and unclassified objects. The LAMOST survey has provided astronomers with massive spectral data in which to search for rare, special stars such as cataclysmic variable stars (CVs) and Herbig Ae/Be stars. The spectra of these special stars contain emission lines, whose presence indicates that a star has undergone, or is still undergoing, an unstable ejection process. Finding such objects helps astronomers study stellar evolution. In this paper, we study a method for identifying emission line stars (ELS) and use Hadoop, a distributed, parallel technology for processing large data sets, to screen ELS spectra from the DR2 data set. In a parallel data mining experiment on a multi-node cluster, we obtained 51092 spectra with emission lines. The Hadoop cluster greatly improved the efficiency of identifying emission lines in stellar spectra, and this work provides a useful reference for similar massive spectral data processing problems in the future.
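The abstract does not state the detection criterion or the job layout, so the following is only a minimal sketch of how such a screening step might be split into a map phase and a reduce phase. Everything in it is an assumption made for illustration: the H-alpha test, the threshold `k`, the window sizes, and the names `has_emission`, `mapper`, and `reducer` are hypothetical, and plain Python generators stand in for the Hadoop map and reduce stages.

```python
import statistics

H_ALPHA = 6562.8  # H-alpha rest wavelength in Angstroms (assumed test line)


def has_emission(wavelengths, fluxes, center=H_ALPHA,
                 core=10.0, window=100.0, k=3.0):
    """Flag a spectrum whose peak flux near `center` exceeds the local
    continuum by more than k standard deviations (illustrative rule only)."""
    pairs = list(zip(wavelengths, fluxes))
    # Pixels inside the line core.
    core_flux = [f for w, f in pairs if abs(w - center) <= core]
    # Pixels on either side of the core, used to estimate the continuum.
    side_flux = [f for w, f in pairs if core < abs(w - center) <= window]
    if not core_flux or len(side_flux) < 2:
        return False
    continuum = statistics.median(side_flux)
    noise = statistics.stdev(side_flux)
    return max(core_flux) > continuum + k * noise


def mapper(records):
    """Map phase: emit (spectrum_id, 1) for each spectrum flagged as ELS."""
    for spec_id, wavelengths, fluxes in records:
        if has_emission(wavelengths, fluxes):
            yield spec_id, 1


def reducer(pairs):
    """Reduce phase: count the flagged spectra."""
    return sum(count for _, count in pairs)


def make_spectrum(with_line):
    """Build a synthetic spectrum with small deterministic ripple."""
    wavelengths = list(range(6400, 6701))
    fluxes = [1.0 + 0.01 * ((i % 3) - 1) for i in range(len(wavelengths))]
    if with_line:
        # Add a narrow bump around H-alpha to imitate an emission line.
        fluxes = [f + 1.0 if abs(w - 6563) <= 3 else f
                  for w, f in zip(wavelengths, fluxes)]
    return wavelengths, fluxes


if __name__ == "__main__":
    records = [("flat", *make_spectrum(False)),
               ("els", *make_spectrum(True))]
    print(list(mapper(records)))
    print(reducer(mapper(records)))
```

Because each spectrum is tested independently of the others, the map phase can run on any number of nodes at once, which is the property that makes this kind of screening a natural fit for a Hadoop cluster.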