• Spectroscopy and Spectral Analysis
  • Vol. 42, Issue 4, 1186 (2022)
Yu-qing YANG*, Jiang-hui CAI1,2,*, Hai-feng YANG1,*, Xu-jun ZHAO1, and Xiao-na YIN1
Author Affiliations
  • 1. School of Computer Science and Technology, Taiyuan University of Science and Technology, Taiyuan 030024, China
    DOI: 10.3964/j.issn.1000-0593(2022)04-1186-06
    Yu-qing YANG, Jiang-hui CAI, Hai-feng YANG, Xu-jun ZHAO, Xiao-na YIN. LAMOST Unknown Spectral Analysis Based on Influence Space and Data Field[J]. Spectroscopy and Spectral Analysis, 2022, 42(4): 1186

    Abstract

    This paper extracts the characteristics of low-quality spectra from the data classified as Unknown by the LAMOST DR5 pipeline and performs a clustering analysis on them. The main work includes: (1) Feature extraction based on influence space and the data field. First, a large number of small clusters are extracted from the low-SNR spectra based on influence space; second, the data field of each small cluster is calculated and the spectra are sorted by the resulting field values; finally, the sorted spectra and the members of their small clusters are traversed to obtain the characteristic spectra. (2) K-means clustering of the characteristic spectra, with statistics on the sky area, observational seeing, signal-to-noise ratio in each band, brightness, and spectrometer/fiber distribution for each class of targets. (3) Analysis of the clustering results for the low-SNR spectra. The cluster analysis divides all low-quality spectra into five categories: A. The SNR is low, or the spectrum differs from the traditional classification templates, but its category can still be determined by feature analysis (2.7%); B. Suspected characteristic lines or molecular bands that do not match the line table appear at the blue or red end of the spectrum (23.6%); C. The SNR at the blue end of the spectrum is very low and the noise in this wavelength region is strong, while in other wavelength regions the continuum and line features are weak (48%); D. Because of the splicing problem, a protrusion is visible in the spectrum between 5700 and 5900 Å, and the continuum and line features are poor at other wavelengths (24.2%); E. The spectrum contains many default values, so its category cannot be determined (1.5%).
The experimental results show that this method not only extracts the characteristic spectra of low-SNR spectra effectively but also clusters them well enough to reveal the causes of their low quality, providing a reference for planning spectral observations and for analyzing and processing low-SNR spectra.
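The pipeline summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and several details are assumptions: influence space is taken to be the set of mutual k-nearest neighbours (kNN intersected with reverse kNN), the data field is a Gaussian-like potential with unit masses, a characteristic spectrum is the mean of a spectrum and its not-yet-visited influence-space members, and K-means is a plain Lloyd iteration. The function names, the parameters `k` and `sigma`, and the synthetic data are all hypothetical.

```python
import numpy as np


def pairwise_distances(X):
    """Euclidean distance matrix between the rows of X."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def influence_space(D, k):
    """Boolean matrix M; M[i, j] is True when i and j are mutual k-nearest neighbours."""
    D = D.copy()
    np.fill_diagonal(D, np.inf)           # a point is not its own neighbour
    knn = np.argsort(D, axis=1)[:, :k]    # indices of the k nearest neighbours
    is_nn = np.zeros(D.shape, dtype=bool)
    rows = np.repeat(np.arange(D.shape[0]), k)
    is_nn[rows, knn.ravel()] = True
    return is_nn & is_nn.T                # kNN intersected with reverse kNN


def data_field_potential(D, sigma):
    """Assumed potential with unit masses: sum_j exp(-(d_ij / sigma)^2)."""
    return np.exp(-(D / sigma) ** 2).sum(axis=1)


def characteristic_spectra(X, k=3, sigma=1.0):
    """Visit spectra by decreasing potential; average each with its unvisited members."""
    D = pairwise_distances(X)
    members = influence_space(D, k)
    order = np.argsort(-data_field_potential(D, sigma))
    visited = np.zeros(len(X), dtype=bool)
    features = []
    for i in order:
        if visited[i]:
            continue
        group = members[i] & ~visited
        group[i] = True
        features.append(X[group].mean(axis=0))
        visited |= group
    return np.array(features)


def kmeans(X, n_clusters, n_iter=50, seed=0):
    """Plain Lloyd iteration with randomly chosen initial centres."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), n_clusters, replace=False)]
    for _ in range(n_iter):
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = d2.argmin(axis=1)
        for c in range(n_clusters):
            if np.any(labels == c):       # keep the old centre if a cluster is empty
                centers[c] = X[labels == c].mean(axis=0)
    return labels, centers


# Toy stand-in for flux vectors: two noisy groups of 20 "spectra", 50 pixels each.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.1, (20, 50)), rng.normal(1.0, 0.1, (20, 50))])
feats = characteristic_spectra(X, k=3, sigma=1.0)
labels, centers = kmeans(feats, n_clusters=2)
```

Averaging over mutual neighbours is one simple way to suppress noise before clustering. Real LAMOST spectra would also need normalization and masking of bad pixels, which this sketch leaves out.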