• Spectroscopy and Spectral Analysis
  • Vol. 42, Issue 9, 2740 (2022)
Xin TANG, Sheng-ling ZHOU*, Shi-ping ZHU*, Ling-kai MA, Quan ZHENG, and Jing PU
Author Affiliations
  • College of Engineering and Technology, Southwest University, Chongqing 402160, China
    DOI: 10.3964/j.issn.1000-0593(2022)09-2740-06
    Xin TANG, Sheng-ling ZHOU, Shi-ping ZHU, Ling-kai MA, Quan ZHENG, Jing PU. Analysis and Identification of Terahertz Tartaric Acid Spectral Characteristic Region Based on Density Functional Theory and Bootstrapping Soft Shrinkage Method[J]. Spectroscopy and Spectral Analysis, 2022, 42(9): 2740

    Abstract

    Terahertz time-domain spectra carry the chemical and physical information of a sample, but also background information related to equipment noise, sample state and environmental parameters. This mixed content can degrade model performance and reduce prediction accuracy. Extracting the characteristic information of the target component, eliminating redundant variables and screening the characteristic spectral regions from complex, overlapping and changing spectral data is therefore of great significance for the quantitative and qualitative analysis of terahertz spectra. In this paper, the THz absorption spectra of 342 L-tartaric acid samples with concentrations of 10%, 20%, 40%, 50%, 60% and 80% were collected. The B3LYP method of density functional theory (DFT) with the 6-31G* (d, p) basis set was used to optimize a monomolecular model of L-tartaric acid, and the terahertz spectral characteristics of this model were simulated theoretically. The molecular vibration modes corresponding to the characteristic absorption peaks were analyzed, and the absorption spectrum in the 0.2-1.6 THz band was obtained. The measured absorption spectrum agrees well with the theoretical calculation. The terahertz absorption spectrum of L-tartaric acid was then screened using bootstrapping soft shrinkage (BOSS), and the result was compared with competitive adaptive reweighted sampling (CARS-PLS), Monte Carlo uninformative variable elimination (MC-UVE-PLS) and interval partial least squares (iPLS) to find the better model for identifying characteristic spectral regions. The analysis indicates that the effective spectral region selected by the BOSS algorithm agrees better with the characteristic spectral region calculated by DFT. Modeling and regression analysis of the L-tartaric acid spectra were carried out using full-spectrum PLS, CARS-PLS, MC-UVE-PLS, iPLS and BOSS.
The experimental results indicate that all four spectral region screening methods improve prediction accuracy over the full-spectrum PLS model. The BOSS model improves the most: its cross-validation root-mean-square error (RMSECV), root-mean-square error of prediction (RMSEP), and determination coefficients of the test set (R²_test) and training set (R²_train) are 0.0260, 0.0260, 0.9881 and 0.9875, respectively, giving higher prediction accuracy and model stability than the other models. This study may therefore provide an effective method for rapid, quantitative detection based on terahertz spectroscopy.
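
The reported figures of merit can be illustrated with a minimal sketch, assuming the usual definitions: RMSE is the square root of the mean squared difference between reference and predicted concentrations (RMSECV when the predictions come from cross-validation, RMSEP when they come from a held-out prediction set), and the determination coefficient is 1 - SS_res / SS_tot. The function names and the sample values below are hypothetical and are not taken from the paper.

```python
import math


def rmse(y_true, y_pred):
    """Root-mean-square error between reference and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(y_true))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical mass fractions for illustration only (not data from the paper).
reference = [0.10, 0.20, 0.40, 0.50, 0.60, 0.80]
predicted = [0.12, 0.18, 0.41, 0.52, 0.57, 0.81]

print("RMSE:", rmse(reference, predicted))
print("R^2: ", r_squared(reference, predicted))
```

Applied to cross-validated predictions on the calibration samples, `rmse` would give an RMSECV-style value; applied to an independent prediction set, it would give an RMSEP-style value.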