• Spectroscopy and Spectral Analysis
  • Vol. 39, Issue 3, 948 (2019)
LIU Zhong-bao1,2,*, LEI Yu-fei1, SONG Wen-ai2, ZHANG Jing2, WANG Jie3, and TU Liang-ping4
Author Affiliations
  • 1[in Chinese]
  • 2[in Chinese]
  • 3[in Chinese]
  • 4[in Chinese]
    DOI: 10.3964/j.issn.1000-0593(2019)03-0948-05
    LIU Zhong-bao, LEI Yu-fei, SONG Wen-ai, ZHANG Jing, WANG Jie, TU Liang-ping. Stellar Spectra Classification by Support Vector Machine with Unlabeled Data[J]. Spectroscopy and Spectral Analysis, 2019, 39(3): 948

    Abstract

    Stellar spectra classification is one of the active research topics in astronomical techniques and methods. With the continuous operation and improvement of observation instruments, researchers have obtained a very large number of spectra, which makes manual processing impractical. Data mining algorithms have therefore attracted increasing attention and have been applied to spectral data. In recent years, neural networks, self-organizing maps, association rules and other data mining algorithms have been used to classify stellar spectra. Among these algorithms, the Support Vector Machine (SVM) is especially popular because of its good learning capability and excellent classification performance. The basic idea of the standard SVM is to find an optimal separating hyperplane between the positive and negative samples. Training an SVM is a convex programming problem that can be posed as a quadratic programming (QP) problem, so it has a unique optimal solution. To further improve classification efficiency, the Twin Support Vector Machine (TSVM) has been proposed. It generates two non-parallel hyperplanes such that each hyperplane is close to one class and as far as possible from the other. The learning speed of TSVM is approximately four times that of the classical SVM. TSVM has received much attention because of its low computational complexity, and many variants of TSVM have been proposed in the literature. In stellar spectra classification, the classification model is built from observation data. The key step is labeling the spectra manually, which is time-consuming and painstaking. How to construct a spectra classification model from both labeled and unlabeled spectra is therefore a problem that deserves study. To classify stellar spectra effectively, the Twin Support Vector Machine with Unlabeled Data (TSVMUD) is proposed in this paper.
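    For reference, the standard SVM and the linear TSVM described above are usually written as the following optimization problems. This is a sketch of the textbook formulations from the general literature, not of TSVMUD itself, whose objective is not given in this abstract. The symbols are assumptions introduced here: $A$ and $B$ are the matrices of positive and negative samples, $e_1$ and $e_2$ are all-ones vectors of matching length, $C$, $c_1$, $c_2$ are penalty parameters, and $m$ is the total number of training samples.

```latex
% Soft-margin SVM: one QP over all m samples
\begin{align*}
\min_{w,\,b,\,\xi}\quad & \tfrac{1}{2}\lVert w\rVert^{2} + C\sum_{i=1}^{m}\xi_i \\
\text{s.t.}\quad & y_i\,(w^{\top}x_i + b) \ge 1 - \xi_i,\qquad \xi_i \ge 0 .
\end{align*}

% Linear TSVM: two smaller QPs, one hyperplane per class
\begin{align*}
\min_{w_1,\,b_1,\,\xi}\quad & \tfrac{1}{2}\lVert A w_1 + e_1 b_1\rVert^{2} + c_1\, e_2^{\top}\xi
  & \text{s.t.}\quad & -(B w_1 + e_2 b_1) + \xi \ge e_2,\quad \xi \ge 0, \\
\min_{w_2,\,b_2,\,\eta}\quad & \tfrac{1}{2}\lVert B w_2 + e_2 b_2\rVert^{2} + c_2\, e_1^{\top}\eta
  & \text{s.t.}\quad & (A w_2 + e_1 b_2) + \eta \ge e_1,\quad \eta \ge 0 .
\end{align*}

% A new sample x is assigned to the class whose hyperplane is nearer
\[
\operatorname{class}(x) = \arg\min_{k\in\{1,2\}} \frac{\lvert w_k^{\top}x + b_k\rvert}{\lVert w_k\rVert}.
\]

% Where "approximately four times faster" comes from, assuming a QP with
% m constraints costs O(m^3) and each class holds about m/2 samples
\[
2\left(\frac{m}{2}\right)^{3} = \frac{m^{3}}{4}.
\]
```

    Under these assumptions each TSVM problem carries constraints from only one class, which is why the cost ratio relative to a single QP over all $m$ samples is roughly one quarter.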
In TSVMUD, the stellar spectra are first divided into two parts, one for training and the other for testing. The proposed method is then applied to the training data to obtain the classification model. Finally, the model is verified on the spectra in the test dataset. TSVMUD not only preserves the advantage of low computational complexity, but also improves classification efficiency by taking both labeled and unlabeled data into consideration. Comparative experiments on the SDSS datasets verify that TSVMUD performs better than traditional classifiers such as SVM, TSVM and KNN (K Nearest Neighbor). However, TSVMUD has some limitations; for example, handling massive numbers of spectra remains difficult. Inspired by random sampling, we will study the adaptability of the proposed method in a big data environment using big data technologies.
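The split, train and verify procedure described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the classifier is KNN (one of the baselines named in the abstract) standing in for TSVMUD, the "spectra" are synthetic Gaussian vectors rather than SDSS data, and all function names and parameters (`make_synthetic_spectra`, `train_test_split`, `knn_predict`, `accuracy`, the 30% test fraction, `k=3`) are hypothetical.

```python
import random
from collections import Counter


def make_synthetic_spectra(n_per_class, n_bins, rng):
    """Build two well-separated synthetic classes (assumption: not real spectra)."""
    data = []
    for label, level in ((0, 0.0), (1, 5.0)):
        for _ in range(n_per_class):
            flux = [rng.gauss(level, 1.0) for _ in range(n_bins)]
            data.append((flux, label))
    return data


def train_test_split(data, test_fraction, rng):
    """Shuffle a copy of the data and split it into (train, test)."""
    shuffled = data[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def knn_predict(train, x, k=3):
    """Predict the majority label among the k nearest training samples."""
    def sq_dist(item):
        return sum((a - b) ** 2 for a, b in zip(item[0], x))

    nearest = sorted(train, key=sq_dist)[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def accuracy(train, test, k=3):
    """Fraction of test samples whose predicted label matches the true label."""
    correct = sum(knn_predict(train, x, k) == y for x, y in test)
    return correct / len(test)


rng = random.Random(0)  # fixed seed so the run is repeatable
data = make_synthetic_spectra(n_per_class=50, n_bins=5, rng=rng)
train, test = train_test_split(data, test_fraction=0.3, rng=rng)
acc = accuracy(train, test)
print(f"train={len(train)} test={len(test)} accuracy={acc:.2f}")
```

Swapping `knn_predict` for another classifier leaves the rest of the protocol unchanged, which is how a comparison across SVM, TSVM, KNN and a new method would typically be organized.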