• Spectroscopy and Spectral Analysis
  • Vol. 37, Issue 6, 1733 (2017)
YAN Sheng-ke1,*, YANG Hui-hua1,2, HU Bai-chao1, REN Chao-chao1, and LIU Zhen-bing1
Author Affiliations
  • 1[in Chinese]
  • 2[in Chinese]
    DOI: 103964/jissn1000-0593(2017)06-1733-06 Cite this Article
    YAN Sheng-ke, YANG Hui-hua, HU Bai-chao, REN Chao-chao, LIU Zhen-bing. Variable Selection Method of NIR Spectroscopy Based on Least Angle Regression and GA-PLS[J]. Spectroscopy and Spectral Analysis, 2017, 37(6): 1733

    Abstract

    Near infrared (NIR) spectra usually contain many wavelength variables, so direct or indirect variable selection is crucial for improving the stability and prediction performance of a model. Least angle regression (LAR) is a relatively new and efficient machine learning algorithm for regression analysis and variable selection. This paper combines LAR with the genetic algorithm-partial least squares (GA-PLS) algorithm to propose a wavelength selection method for spectral modeling that can effectively screen out a small number of wavelength points. First, LAR is used to eliminate multicollinearity among the variables in the full spectral region and obtain a reduced feature set; GA-PLS then selects variables from this reduced set to reduce the dimensionality further. To verify its validity, the method was applied to regression analysis of NIR spectra of tablets and gasoline. Variables were selected from the preprocessed spectra, and models were built for the content of the active ingredient (tablets) and C10 (gasoline). The optimal number of variables was only 7 in both applications, and the coefficient of determination for prediction (R2p) reached 0.9339 and 0.9519, respectively. Compared with the full-spectrum model, uninformative variable elimination (UVE), and the successive projections algorithm (SPA), the method requires fewer wavelength points and achieves better R2p and root mean square error of prediction (RMSEP). Therefore, LAR combined with GA-PLS not only picks out informative variables from NIR spectra, reducing the number of variables used for modeling and improving prediction accuracy, but also makes the model easier to interpret. The method can serve as an effective wavelength selection tool for designing dedicated spectrometers for specific application areas.