• Spectroscopy and Spectral Analysis
  • Vol. 39, Issue 7, 2171 (2019)
LI Zhi-hao*, SHEN Jun, BIAN Rui-hua, and ZHENG Jian
Author Affiliations
  • [in Chinese]
    DOI: 10.3964/j.issn.1000-0593(2019)07-2171-05
    LI Zhi-hao, SHEN Jun, BIAN Rui-hua, ZHENG Jian. Accuracy Comparison of the Machine Learning Algorithm Used to Raman Real Sample Collection in the Front Line of Public Security[J]. Spectroscopy and Spectral Analysis, 2019, 39(7): 2171

    Abstract

    Raman spectrometers are gradually entering service on the front line of public security, where they are mainly used to detect flammable, explosive and drug precursor chemicals. However, operators without professional training may not perform measurements under optimal conditions. Common problems such as defocusing, sample offset and short acquisition times can strongly affect the final spectral comparison. In this article, five mainstream machine learning algorithms were trained on, and used to classify, the raw data collected during actual inspections and case handling, and their accuracies were compared. The most accurate algorithm will be used to improve Raman spectral identification in future work. All data were collected with the EVA3000 Raman spectrometer developed by the Third Research Institute of the Ministry of Public Security, which has been deployed in certain provinces, cities, prefectures and counties across the country. Front-line inspection personnel periodically upload raw data to the EVA3000’s back-office management system, through which the raw data generated during actual inspections were collected. A total of 160 qualitatively identified samples, including phenylacetic acid, methylene chloride, ephedrine and nitrobenzene, were randomly drawn from the uploaded database. Training and prediction were repeated 40, 60, 100, 150, 200, 300 and 500 times with decision trees, random forests, AdaBoost, support vector machines (SVM) and artificial neural networks (ANNs), and the average accuracy of each algorithm was calculated. The experimental results show that, among the five algorithms, prediction accuracy on actual samples ranks roughly as random forest ≈ AdaBoost > decision tree > SVM > ANNs. The verification results are generally consistent with the experimental ones.
The accuracy of random forest is similar to that of AdaBoost because both algorithms repeatedly build new training sets from the original data: random forest by bootstrap resampling, and AdaBoost by increasing the weight of misclassified samples in the next round of training. SVM and ANNs, on the other hand, are perceptron-based algorithms. Among the current mainstream algorithms, methods built on bootstrap aggregating therefore appear better suited to training on data sampled from actual specimens. In the next step, the research team will continue to optimize the existing algorithms and implement them in the back-office management system for on-line detection. These results are significant for the further practical application of machine learning algorithms on the front line of public security.
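
The two resampling ideas mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the sample count of 8 and the pattern of misclassified samples are hypothetical, and the reweighting shown is the classic binary AdaBoost update, whereas a four-class problem like the one described would normally use a multiclass variant.

```python
import math
import random


def bootstrap_sample(n_samples, rng):
    """Bagging step (as in random forests): draw n_samples indices
    uniformly WITH replacement from the original training set."""
    return [rng.randrange(n_samples) for _ in range(n_samples)]


def adaboost_reweight(weights, misclassified):
    """One round of the classic binary AdaBoost weight update.

    weights       -- current sample weights, summing to 1
    misclassified -- booleans, True where the weak learner was wrong
    Returns (new_weights, alpha), where alpha is the weak learner's vote.
    """
    err = sum(w for w, wrong in zip(weights, misclassified) if wrong)
    if not 0.0 < err < 0.5:
        raise ValueError("weak learner needs a weighted error in (0, 0.5)")
    alpha = 0.5 * math.log((1.0 - err) / err)
    # Misclassified samples are scaled up, correct ones scaled down.
    raw = [w * math.exp(alpha if wrong else -alpha)
           for w, wrong in zip(weights, misclassified)]
    total = sum(raw)
    return [r / total for r in raw], alpha


rng = random.Random(0)
n = 8  # hypothetical number of training spectra

# Bagging: each tree sees a resampled copy of the data, all weights equal.
indices = bootstrap_sample(n, rng)

# Boosting: the data stay fixed, but the two misclassified spectra
# (positions 2 and 6, chosen arbitrarily) gain weight for the next round.
uniform = [1.0 / n] * n
wrong = [False, False, True, False, False, False, True, False]
new_weights, alpha = adaboost_reweight(uniform, wrong)
```

In this toy run the two misclassified samples end up carrying half of the total weight between them, so the next weak learner is pushed to get them right, while the bootstrap draw simply repeats some spectra and omits others without regard to earlier errors.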