• Spectroscopy and Spectral Analysis
  • Vol. 39, Issue 7, 2176 (2019)
CHEN Ying1, DI Yuan-jian1, TANG Xin-liang2, CUI Xing-ning1, GAO Xin-bei1, CAO Jing-gang1, and LI Shao-hua3
Author Affiliations
  • 1[in Chinese]
  • 2[in Chinese]
  • 3[in Chinese]
    DOI: 10.3964/j.issn.1000-0593(2019)07-2176-06
    CHEN Ying, DI Yuan-jian, TANG Xin-liang, CUI Xing-ning, GAO Xin-bei, CAO Jing-gang, LI Shao-hua. Combination Weight COD Concentration Prediction Model Based on BiPLS and SiPLS[J]. Spectroscopy and Spectral Analysis, 2019, 39(7): 2176

    Abstract

    Excessive organic matter in water causes serious environmental pollution and endangers human health. The traditional chemical method for determining chemical oxygen demand (COD) in water is time-consuming, which hinders rapid quantitative detection. To address this, this paper proposes a rapid, quantitative COD detection method that combines UV spectroscopy with a combination weight model. The prediction model uses the backward interval partial least squares (BiPLS) and synergy interval partial least squares (SiPLS) algorithms to screen the characteristic intervals of the UV spectra, and then builds a combination weight concentration prediction model from the weights of those intervals. Forty-five COD standard solution samples are measured. The COD UV spectral data are preprocessed with the first derivative and S-G (Savitzky-Golay) filtering to remove baseline drift and environmental noise, and the SPXY algorithm divides the data into calibration and prediction sets. The full spectral range is then screened with the BiPLS algorithm. Because the number of intervals strongly affects the BiPLS model, it is optimized: the spectrum is divided into 15 to 25 subintervals, a PLS model is built for each interval number, and the optimal number is chosen by the root mean square error of cross-validation (RMSECV). The model performs best with 18 intervals, from which 6 characteristic intervals are selected. The selected intervals are 2, 1, 3, 11, 7 and 6, and the corresponding wavelength ranges are 234~240, 262~268, 269~275, 290~296, 297~303 and 304~310 nm, respectively.
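The backward interval screening described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the names `split_intervals`, `backward_interval_selection` and `toy_score` are hypothetical, and the scoring function is a stand-in for what would in practice be the RMSECV of a PLS model fitted on the retained intervals.

```python
from typing import Callable, List, Sequence, Tuple


def split_intervals(n_points: int, n_intervals: int) -> List[range]:
    """Split indices 0..n_points-1 into contiguous, near-equal blocks."""
    base, extra = divmod(n_points, n_intervals)
    blocks, start = [], 0
    for i in range(n_intervals):
        size = base + (1 if i < extra else 0)  # spread the remainder over the first blocks
        blocks.append(range(start, start + size))
        start += size
    return blocks


def backward_interval_selection(
    n_intervals: int,
    score: Callable[[Sequence[int]], float],
) -> Tuple[List[int], float]:
    """Greedy backward elimination over interval indices.

    score(kept) returns an error for a model built on the kept intervals
    (lower is better). In BiPLS this would be the RMSECV of a PLS model;
    here it is whatever the caller supplies.
    """
    kept = list(range(n_intervals))
    best_subset, best_error = list(kept), score(kept)
    while len(kept) > 1:
        # Try dropping each remaining interval; keep the reduced set with the lowest error.
        trials = [(score([k for k in kept if k != j]), j) for j in kept]
        error, dropped = min(trials)
        kept.remove(dropped)
        if error < best_error:
            best_subset, best_error = list(kept), error
    return sorted(best_subset), best_error


def toy_score(kept: Sequence[int]) -> float:
    """Hypothetical error: intervals 1 and 4 are informative, the rest add noise."""
    informative = {1, 4}
    missing = len(informative - set(kept))
    noise = len(set(kept) - informative)
    return 1.0 * missing + 0.1 * noise


selected, error = backward_interval_selection(6, toy_score)
print(selected, error)
```

Repeating this loop for each candidate interval number (15 to 25 in the abstract) and comparing the best errors would correspond to the interval-number optimization; `split_intervals` only maps an interval index to the spectral columns it covers.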
These 6 characteristic wavelength ranges carry a large amount of spectral information and contribute greatly to the final prediction model. The 6 regions are then further screened and combined with the SiPLS algorithm: for each combination number, PLS models are built on the different possible interval combinations, the best result for that combination number is retained, and the error and correlation of the prediction models are compared across combinations. As a result, the 6 intervals are merged into 3 characteristic wavelength intervals: 234~240, 262~275 and 290~310 nm. The optimal numbers of factors of the PLS models for these three intervals are 4, 4 and 3, respectively. The interval combination step of the traditional SiPLS is improved: instead of directly combining the characteristic intervals as before, the three intervals are linearly combined according to their weights. The weights calculated by the weight formula are 0.509, 0.318 and 0.173, respectively. Finally, a linear combination weight COD concentration prediction model is established. To verify its accuracy, a PLS model over the full wavelength range, PLS models on single characteristic wavelength intervals, and a PLS model that directly combines the characteristic wavelength intervals are also built, and the models are evaluated by the square of the correlation coefficient (R2), the root mean square error between the predicted and true concentration values (RMSEC), and the predicted recovery (T).
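As a rough illustration of the linear combination step, the sketch below weights the predictions of three per-interval models with the weights reported above. The function name `combine_predictions` and the example concentrations are hypothetical; the abstract does not give the weight formula itself, so the weights are simply taken as constants.

```python
from typing import Sequence

# Weights reported in the abstract for the intervals
# 234~240, 262~275 and 290~310 nm, in that order.
WEIGHTS = (0.509, 0.318, 0.173)


def combine_predictions(
    predictions: Sequence[float],
    weights: Sequence[float] = WEIGHTS,
) -> float:
    """Weighted linear combination of per-interval model predictions."""
    if len(predictions) != len(weights):
        raise ValueError("one prediction is needed per weight")
    return sum(w * p for w, p in zip(weights, predictions))


# Hypothetical COD predictions from the three single-interval PLS models.
per_interval = (50.0, 52.0, 47.0)
print(combine_predictions(per_interval))
```

Because the three weights sum to 1, the combined value stays within the range spanned by the individual predictions, and the interval with the largest weight dominates the result.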
The verification results show that the squared correlation coefficient of the combined weight model reaches 0.999 7, clearly higher than the 0.968 0 of the model that directly combines the characteristic intervals. Its root mean square error of prediction is 0.532, which is 29.3% lower than that of the direct combination model, and its predicted recovery is 96.4%~103.1%, a significant improvement in prediction accuracy. The method is simple and feasible, produces no secondary pollution, and can provide technical support for on-line monitoring of COD concentration in water.
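The three evaluation measures named above could be computed along the following lines. This is a minimal sketch under stated assumptions: R2 is taken as the squared Pearson correlation between true and predicted values, recovery as predicted over true concentration in percent, and the sample values are invented for illustration, not data from the paper.

```python
import math
from typing import List, Sequence


def r_squared(true: Sequence[float], pred: Sequence[float]) -> float:
    """Square of the Pearson correlation coefficient."""
    n = len(true)
    mean_t, mean_p = sum(true) / n, sum(pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(true, pred))
    var_t = sum((t - mean_t) ** 2 for t in true)
    var_p = sum((p - mean_p) ** 2 for p in pred)
    return cov * cov / (var_t * var_p)


def rmse(true: Sequence[float], pred: Sequence[float]) -> float:
    """Root mean square error between predicted and true values."""
    return math.sqrt(sum((p - t) ** 2 for t, p in zip(true, pred)) / len(true))


def recovery_percent(true: Sequence[float], pred: Sequence[float]) -> List[float]:
    """Predicted recovery per sample: predicted / true * 100."""
    return [100.0 * p / t for t, p in zip(true, pred)]


# Hypothetical concentrations, for illustration only.
true_values = [10.0, 20.0, 30.0, 40.0]
predicted = [10.2, 19.6, 30.3, 39.9]
print(r_squared(true_values, predicted))
print(rmse(true_values, predicted))
print(recovery_percent(true_values, predicted))
```

A recovery range such as the reported 96.4%~103.1% would correspond to the minimum and maximum of the per-sample recovery list.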