• Spectroscopy and Spectral Analysis
  • Vol. 41, Issue 1, 188 (2021)
Ying CHEN1, Yang-mei XU1, Yuan-jian DI1, Xing-ning CUI1, Jie ZHANG1, Xin-de ZHOU1, Chun-yan XIAO1, and Shao-hua LI1
Author Affiliations
  • 1. Hebei Province Key Laboratory of Test/Measurement Technology and Instrument, School of Electrical Engineering, Yanshan University, Qinhuangdao 066004, China
    DOI: 10.3964/j.issn.1000-0593(2021)01-0188-06
    Ying CHEN, Yang-mei XU, Yuan-jian DI, Xing-ning CUI, Jie ZHANG, Xin-de ZHOU, Chun-yan XIAO, Shao-hua LI. COD Concentration Prediction Model Based on Multi-Spectral Data Fusion and GANs Algorithm[J]. Spectroscopy and Spectral Analysis, 2021, 41(1): 188

    Abstract

    Excessive concentrations of organic pollutants in water are harmful: they cause serious environmental pollution and endanger human health. Chemical oxygen demand (COD) can be used to characterize the degree of organic pollution in water. A quantitative prediction model of COD concentration based on the generative adversarial networks (GANs) algorithm is proposed, which combines ultraviolet (UV) and near-infrared (NIR) spectra through data-level data fusion (DLDF) and feature-level data fusion (FLDF). First, COD standard samples are prepared according to a certain concentration gradient, and the ultraviolet spectrum (190~310 nm) and near-infrared spectrum (830~2100 nm) of each standard sample are collected. The UV and NIR spectral data are pretreated with the first derivative and Savitzky-Golay (S-G) smoothing to eliminate baseline drift and interference noise. Then, data-level and feature-level fusion are carried out directly on the pretreated UV and NIR spectra, and the COD concentration prediction model is constructed with the GANs algorithm. The model is evaluated by the square of the correlation coefficient (R2), the root mean square error of prediction (RMSEP) between the predicted and true concentration values, and the prediction deviation. The results show that neither the FLDF model nor the DLDF model is ideal. The analysis shows that, because the data in the ultraviolet and near-infrared bands are unbalanced, the contribution of the ultraviolet spectrum to the model is masked by the near-infrared band, which makes the spectral fusion meaningless. To avoid this fusion failure, normalization of the mixed spectra is proposed. 
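The first-derivative S-G pretreatment mentioned above can be illustrated with a minimal sketch. It assumes a 5-point window and a quadratic fit, for which the first-derivative convolution weights are (-2, -1, 0, 1, 2)/10; the abstract does not state the window length or polynomial order actually used, and the function name `savgol_first_derivative` is hypothetical.

```python
def savgol_first_derivative(spectrum, step=1.0):
    """Smoothed first derivative of one spectrum (5-point quadratic S-G).

    Assumption: 5-point window, quadratic fit, evenly spaced wavelengths.
    `step` is the wavelength spacing. The two points at each edge are
    dropped, so the result is 4 points shorter than the input.
    """
    weights = (-2.0, -1.0, 0.0, 1.0, 2.0)  # S-G first-derivative weights
    half = len(weights) // 2
    if len(spectrum) < len(weights):
        raise ValueError("spectrum must contain at least 5 points")
    derivative = []
    for i in range(half, len(spectrum) - half):
        window = spectrum[i - half:i + half + 1]
        value = sum(w * v for w, v in zip(weights, window)) / (10.0 * step)
        derivative.append(value)
    return derivative


# Illustrative values only: a sloped line sitting on a constant baseline.
raw = [5.0 + 3.0 * x for x in range(10)]
slope = savgol_first_derivative(raw)
```

Because the weights sum to zero, adding a constant offset to the whole spectrum leaves the output unchanged, which is why a derivative pretreatment removes baseline drift of this kind.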
The effects of standard normal variate (SNV), maximum-minimum normalization (MMN) and vector normalization (VN) on the modeling are discussed. The normalized ultraviolet and near-infrared spectral data are then fused again under the given number of sub-intervals; the fused data are taken as the input X of the GANs model, and the measured COD value is taken as the output Y. Prediction models of COD concentration are established for each normalization method. The modeling results show that the choice of normalization method has a great influence on the hybrid spectral data fusion model, and that the prediction accuracy of both the data-level and feature-level fusion models is significantly improved compared with that before normalization, with maximum-minimum normalization giving the most obvious improvement. Finally, to verify the accuracy of the multi-spectral data fusion GANs prediction model, GANs prediction models based on a single spectral source are established, one on the full-wavelength ultraviolet band and one on the full-wavelength near-infrared band. The experimental results show that, for the feature-level spectral fusion model based on the ultraviolet and near-infrared spectra, the correlation coefficient is 0.9947 and the root mean square error of prediction is 0.976; the prediction error is reduced by 52.9% compared with data-level fusion, and the predicted recovery rate is 98.4%~103.1%, which is much better than the other groups. The model has strong generalization ability and high prediction accuracy. 
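The three normalization methods named above, followed by a data-level fusion step, might look like the sketch below. It assumes each method is applied per sample and per spectral block before fusion, and that data-level fusion is a plain concatenation of the two normalized blocks; the abstract does not give these details, and all function names are hypothetical. Feature-level fusion is not shown because the abstract does not describe how features are selected.

```python
import math


def snv(spectrum):
    """Standard normal variate: centre on the mean, scale by the sample std."""
    n = len(spectrum)
    mean = sum(spectrum) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in spectrum) / (n - 1))
    return [(v - mean) / std for v in spectrum]


def min_max(spectrum):
    """Maximum-minimum normalization: rescale the spectrum to [0, 1]."""
    lo, hi = min(spectrum), max(spectrum)
    return [(v - lo) / (hi - lo) for v in spectrum]


def vector_norm(spectrum):
    """Vector normalization: divide by the Euclidean length of the spectrum."""
    length = math.sqrt(sum(v * v for v in spectrum))
    return [v / length for v in spectrum]


def fuse_data_level(uv, nir, normalize=min_max):
    """Assumed data-level fusion: normalize each block, then concatenate."""
    return normalize(uv) + normalize(nir)


# Illustrative values only: the NIR block is about 50 times larger than the
# UV block, so without normalization it would dominate the fused vector.
uv_block = [0.02, 0.05, 0.03]
nir_block = [1.2, 2.4, 1.8]
fused = fuse_data_level(uv_block, nir_block)
```

After normalization both blocks span the same range, so neither band dominates the fused input purely because of its scale. None of the three functions guards against a constant spectrum, which would cause a division by zero.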
Compared with a monitoring model based on a single spectral source, data fusion of mixed spectra reflects more of the chemical information in water samples, reveals the degree of pollution of a water body more comprehensively, reflects differences among pollutants in a water body at different levels, and provides technical support for on-line monitoring of COD concentration in water.
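The evaluation quantities used in the abstract can be computed as in the sketch below. It assumes R2 is the squared Pearson correlation between measured and predicted concentrations (the abstract calls it the square of the correlation coefficient), RMSEP is the root of the mean squared prediction error, and the recovery rate is predicted over measured concentration in percent. The function names and sample values are hypothetical and are not data from the paper.

```python
import math


def r_squared(y_true, y_pred):
    """Squared Pearson correlation between measured and predicted values."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov * cov / (var_t * var_p)


def rmsep(y_true, y_pred):
    """Root mean square error of prediction."""
    n = len(y_true)
    return math.sqrt(sum((p - t) ** 2 for t, p in zip(y_true, y_pred)) / n)


def recovery_percent(y_true, y_pred):
    """Per-sample recovery rate: predicted / measured, in percent."""
    return [100.0 * p / t for t, p in zip(y_true, y_pred)]


# Illustrative concentrations only, not values from the paper.
measured = [10.0, 20.0, 30.0, 40.0]
predicted = [11.0, 19.0, 31.0, 39.0]
error = rmsep(measured, predicted)
recoveries = recovery_percent(measured, predicted)
```

A recovery rate close to 100% for every sample, together with a low RMSEP, is what the abstract reports for the feature-level fusion model.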