• Acta Optica Sinica
  • Vol. 39, Issue 9, 0930002 (2019)
Guanwen Li1,2,**, Xiaohong Gao*, Nengwen Xiao2, and Yunfei Xiao1
Author Affiliations
  • 1 Qinghai Provincial Key Laboratory of Physical Geography and Environmental Process, School of Geography Sciences, Qinghai Normal University, Xining, Qinghai 810008, China
  • 2 Chinese Research Academy of Environmental Sciences, Beijing 100012, China
    DOI: 10.3788/AOS201939.0930002
    Guanwen Li, Xiaohong Gao, Nengwen Xiao, Yunfei Xiao. Estimation of Soil Organic Matter Content Based on Characteristic Variable Selection and Regression Methods[J]. Acta Optica Sinica, 2019, 39(9): 0930002

    Abstract

    Soil hyperspectral data are voluminous and contain substantial redundant spectral information. This paper compares the prediction abilities of several characteristic variable selection methods for estimating soil organic matter content. Stability competitive adaptive reweighted sampling (sCARS), the successive projections algorithm (SPA), a genetic algorithm (GA), iteratively retaining informative variables (IRIV), and the combined sCARS-SPA method are used to select characteristic variables from the full spectral data. Partial least squares regression (PLSR), support vector machine (SVM), and random forest (RF) models are then built on both the selected characteristic bands and the full spectral bands to predict soil organic matter content. The results show that combining the PLSR and SVM models with variable selection improves both modeling efficiency and prediction ability relative to the full-band models. The accuracy of the RF model built on characteristic variables is not markedly improved, but the number of variables in the model is significantly reduced and the modeling efficiency is greatly improved. Overall, the RF model is more accurate than the SVM and PLSR models. The prediction model combining IRIV and RF uses only 63 variables; its coefficients of determination (R²) for the calibration and validation sets are 0.941 and 0.96, respectively, and its relative deviation (RPD) for the validation set is 4.8, indicating very good prediction capacity. Compared with modeling on the full bands, combining characteristic variable selection with regression methods effectively improves modeling efficiency while maintaining model accuracy.
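
    The two accuracy measures quoted above can be illustrated with a minimal sketch. It assumes the common soil-spectroscopy convention that RPD is the sample standard deviation of the observed values divided by the root-mean-square error of prediction; the abstract does not state the exact formula the authors used. The function names and the sample values are hypothetical and are not taken from the paper.

```python
import math
import statistics


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = statistics.fmean(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def rmse(observed, predicted):
    """Root-mean-square error between observed and predicted values."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def rpd(observed, predicted):
    """Assumed definition: sample SD of observations divided by RMSE."""
    return statistics.stdev(observed) / rmse(observed, predicted)


# Made-up validation values for illustration only (not data from the paper).
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.1, 1.9, 3.2, 3.8, 5.0]

print(f"R2  = {r_squared(observed, predicted):.3f}")
print(f"RPD = {rpd(observed, predicted):.2f}")
```

    Under this definition, a larger RPD means the prediction error is small relative to the natural spread of the measured values, which is why a high RPD is read as evidence of good prediction capacity.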