• Spectroscopy and Spectral Analysis
  • Vol. 43, Issue 10, 3302 (2023)
AN Bai-song, WANG Xue-mei, HUANG Xiao-yu, and KAWUQIATI Bai-shan
Author Affiliations
  • [in Chinese]
    DOI: 10.3964/j.issn.1000-0593(2023)10-3302-08
    AN Bai-song, WANG Xue-mei, HUANG Xiao-yu, KAWUQIATI Bai-shan. Hyperspectral Estimation of Soil Lead Content Based on Random Frog Band Selection Algorithm[J]. Spectroscopy and Spectral Analysis, 2023, 43(10): 3302

    Abstract

    Hyperspectral data contain a large amount of redundant information, which greatly reduces the accuracy of hyperspectral estimation. The purpose of this study is to identify the best algorithm for screening feature bands, in order to accurately monitor the content of the heavy metal lead in soil and to provide a reference for soil pollution prevention and control. Lead contents and spectral data from the oasis soils of the Weigan-Kuqa river delta in Xinjiang were used as data sources. A total of 92 valid soil samples were identified using the Monte Carlo cross-validation (MCCV) algorithm, and the spectral data processed by the first-order differential transformation of the reciprocal logarithm were selected through correlation analysis. The random frog (RF) algorithm was combined with the competitive adaptive reweighted sampling (CARS) algorithm, the iteratively retains informative variables (IRIV) algorithm and the successive projections algorithm (SPA) to construct the RF-CARS, RF-IRIV and RF-SPA algorithms for band screening. Taking the reflectance of the feature bands as the independent variable and the soil lead content as the dependent variable, extreme gradient boosting (XGBoost) and geographically weighted regression (GWR) were used to construct estimation models of soil lead content. The results show that: (1) Spectral transformation effectively enhances the sensitivity of the spectrum to lead content. The spectral features are pronounced after the first-order differential transformation of the reciprocal logarithm, and the correlation coefficient reaches up to 0.620 (p<0.001). (2) The RF-CARS, RF-IRIV and RF-SPA algorithms extract 6, 9 and 7 feature bands, respectively, from the hyperspectral data, all located in the near-infrared spectral region. All three algorithms have strong feature extraction ability and greatly reduce the redundant information in the spectral data. (3) The estimation models of soil lead content built on the RF-IRIV algorithm are more accurate and more stable than those built on RF-CARS and RF-SPA, showing that the RF-IRIV algorithm retains the bands related to soil lead content more accurately. In addition, the GWR model performs better than the XGBoost model, and the RF-IRIV-GWR model has good predictive ability and can serve as the optimal estimation model for soil lead content in the study area. The R², RMSE and RPD of the validation set of the RF-IRIV-GWR model are 0.892, 0.825 mg·kg⁻¹ and 3.09, respectively. The combination of the random frog (RF) and iteratively retains informative variables (IRIV) algorithms with geographically weighted regression (GWR) modeling therefore has certain advantages in quickly and accurately estimating soil lead content and can be used for dynamic monitoring of soil heavy metal pollution.
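
    The abstract names a spectral transformation (first-order differential of the reciprocal logarithm) and three validation metrics (R², RMSE, RPD) without giving formulas. The sketch below shows one common way to compute them and is not the authors' implementation. It assumes a base-10 logarithm, a finite-difference derivative via numpy.gradient, and RPD defined as the sample standard deviation of the observed values divided by RMSE. The function names and the toy data are hypothetical.

```python
import numpy as np


def log_inverse_first_derivative(reflectance, wavelengths):
    """First-order derivative of log10(1/R) along the band (last) axis.

    Assumption: base-10 logarithm and finite differences; the abstract
    does not specify either choice.
    """
    reflectance = np.asarray(reflectance, dtype=float)
    absorbance = np.log10(1.0 / reflectance)
    # np.gradient accepts the band coordinates, so uneven spacing is handled.
    return np.gradient(absorbance, wavelengths, axis=-1)


def validation_metrics(observed, predicted):
    """Return (R2, RMSE, RPD) for a validation set.

    Assumption: RPD = sample standard deviation of observed / RMSE.
    """
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    residuals = observed - predicted
    rmse = np.sqrt(np.mean(residuals ** 2))
    ss_res = np.sum(residuals ** 2)
    ss_tot = np.sum((observed - observed.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot
    rpd = np.std(observed, ddof=1) / rmse
    return r2, rmse, rpd


if __name__ == "__main__":
    # Toy spectrum (hypothetical): log10(1/R) is linear in wavelength,
    # so the transformed spectrum should be constant.
    wavelengths = np.arange(400.0, 2401.0, 10.0)
    reflectance = 10.0 ** (-0.001 * wavelengths)
    transformed = log_inverse_first_derivative(reflectance, wavelengths)
    print(transformed[:3])

    # Toy lead contents in mg/kg (hypothetical values, not from the study).
    r2, rmse, rpd = validation_metrics([1.0, 2.0, 3.0, 4.0], [1.5, 2.5, 3.5, 4.5])
    print(r2, rmse, rpd)
```

    Under this definition, RPD rises as RMSE shrinks relative to the spread of the observed lead contents, which is why it is reported alongside R² and RMSE when judging model stability.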