Title: Mining unusual and rare stellar spectra from large spectroscopic survey data sets using the outlier-detection method
Authors: Wei, P.; Luo, A.; Li, Y.; Pan, J.; Tu, L.; Jiang, B.; Kong, X.; Shi, Z.; Yi, Z.; Wang, F.; Liu, J.; Zhao, Y.
Author affiliations: [Wei, P] Key Laboratory of Optical Astronomy, National Astronomical Observatories, Chinese Academy of Sciences, Beijing 100012, China, University of C…
Corresponding author address: [Wei, P] Chinese Acad Sci, Key Lab Opt Astron, Natl Astron Observ, Beijing 100012, Peoples R China.
Source: Monthly Notices of the Royal Astronomical Society
Keywords: methods: data analysis; surveys; binaries: spectroscopic; stars: carbon; stars: emission-line, Be; novae, cataclysmic variables
Abstract: The large number of spectra obtained from sky surveys such as the Sloan Digital Sky Survey (SDSS) and the survey executed by the Large sky Area Multi-Object fibre Spectroscopic Telescope (LAMOST, also called GuoShouJing Telescope) provides opportunities to search for peculiar or even unknown types of spectra. In response to the limitations of existing methods, this paper proposes a novel outlier-mining method, the Monte Carlo Local Outlier Factor (MCLOF), which can be used to highlight unusual and rare spectra in large spectroscopic survey data sets. The MCLOF method exposes outliers automatically and efficiently by marking each spectrum with a number, i.e. using an outlier index as a flag for an unusual and rare spectrum. The Local Outlier Factor (LOF) represents how unusual and rare a spectrum is compared with other spectra, and the Monte Carlo method is used to compute the global LOF for each spectrum by randomly selecting samples in each independent iteration. The MCLOF method is applied to over half a million stellar spectra (classified as STAR by the SDSS Pipeline) from SDSS data release 8 (DR8), and a total of 37 033 spectra with signal-to-noise ratio (S/N) ≥ 3 and outlier index ≥ 0.85 are selected as outliers. Some of these outliers are shown to be binary stars, emission-line stars, carbon stars and stars with unusual continua. The results show that the proposed method can efficiently highlight these unusual spectra in the survey data sets. In addition, some relatively rare and interesting spectra are selected, indicating that the proposed method can also be used to mine rare, even unknown, spectra. The proposed method is applicable not only to spectral survey data sets but also to other types of survey data sets. The spectra of all peculiar objects selected by the MCLOF method are available from a user-friendly website: http://sciwiki.lamost.org/Miningdr8/.
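Illustrative sketch (not from the paper): the abstract describes MCLOF only at a high level, as an LOF computed on randomly selected samples in independent iterations and combined into a global value per spectrum. The Python below is one possible reading of that idea, under several assumptions: each spectrum is already reduced to a short numeric feature vector, the per-iteration LOF values are combined by simple averaging, and a simplified LOF with exactly k neighbours is used. The function names `lof_scores` and `mclof_scores` and the parameters `k`, `sample_size` and `n_iter` are hypothetical, and the mapping from the averaged LOF to the paper's outlier index is not given in the abstract, so it is not attempted here.

```python
import math
import random


def lof_scores(points, k):
    """Simplified Local Outlier Factor for a small list of feature vectors.

    Assumption: exactly k neighbours are used per point (ties at the
    k-distance are not expanded as in the textbook definition).
    """
    n = len(points)
    assert 0 < k < n, "k must be smaller than the number of points"
    dist = [[math.dist(p, q) for q in points] for p in points]
    # Indices of the k nearest neighbours of each point, excluding itself.
    neighbours = [
        sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])[:k]
        for i in range(n)
    ]
    # Distance from each point to its k-th nearest neighbour.
    k_dist = [dist[i][neighbours[i][-1]] for i in range(n)]

    def local_density(i):
        # Inverse of the mean reachability distance to the neighbours;
        # the small constant guards against division by zero for duplicates.
        reach = [max(k_dist[j], dist[i][j]) for j in neighbours[i]]
        return 1.0 / (sum(reach) / k + 1e-12)

    density = [local_density(i) for i in range(n)]
    # LOF: mean neighbour density divided by the point's own density.
    return [
        sum(density[j] for j in neighbours[i]) / (k * density[i])
        for i in range(n)
    ]


def mclof_scores(points, k=5, sample_size=50, n_iter=200, seed=0):
    """Average LOF of each point over random subsamples (assumed scheme)."""
    rng = random.Random(seed)
    n = len(points)
    sample_size = min(sample_size, n)
    total = [0.0] * n
    count = [0] * n
    for _ in range(n_iter):
        # Draw a random subset without replacement and score it with LOF.
        idx = rng.sample(range(n), sample_size)
        scores = lof_scores([points[i] for i in idx], k)
        for i, s in zip(idx, scores):
            total[i] += s
            count[i] += 1
    # Points never drawn in any iteration get NaN rather than a made-up value.
    return [total[i] / count[i] if count[i] else float("nan") for i in range(n)]


if __name__ == "__main__":
    demo_rng = random.Random(1)
    # Toy data: a 2-D Gaussian cluster plus one far-away point (last index).
    data = [(demo_rng.gauss(0, 1), demo_rng.gauss(0, 1)) for _ in range(60)]
    data.append((10.0, 10.0))
    result = mclof_scores(data, k=5, sample_size=30, n_iter=200)
    print("most unusual index:", max(range(len(data)), key=lambda i: result[i]))
```

Working on subsamples keeps each LOF pass small, since the pairwise-distance step grows quadratically with the number of points scored at once; this is a plausible motivation for the Monte Carlo step on a survey of half a million spectra, though the abstract does not state it.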