• Spectroscopy and Spectral Analysis
  • Vol. 42, Issue 11, 3501 (2022)

Abstract

Combining spectroscopy with machine learning algorithms for the rapid identification of microplastics provides strong technical support for the field detection of microplastics, an emerging area that has attracted considerable attention. Near-infrared spectroscopy (NIRS) is fast, highly sensitive, and non-destructive, requires no sample pretreatment, and is widely used in chemical analysis, quality inspection, and other fields. This paper compares two machine learning classification algorithms, support vector machine (SVM) and Extreme Gradient Boosting (XGBoost), applied to near-infrared spectra, in order to build a classification model for fast and effective recognition of microplastics. Spectra of 20 standard microplastic samples were collected with a miniature near-infrared spectrometer: acrylonitrile butadiene styrene (ABS), polyacrylonitrile (PAN), polycarbonate (PC), polyethylene glycol terephthalate (PET), polymethyl methacrylate (PMMA), polypropylene (PP), polystyrene (PS), polyvinyl chloride (PVC), thermoplastic polyurethane (TPU), ethylene-vinyl acetate copolymer (EVA), polybutylene terephthalate (PBT), polycaprolactone (PCL), polyethersulfone (PES), polylactic acid (PLA), polyoxymethylene (POM), polyphenylene oxide (PPO), polyphenylene sulfide (PPS), polytetrafluoroethylene (PTFE), polyvinyl alcohol (PVA), and styrenic block copolymers (SBS). To prevent overfitting, 1 260 microplastic spectra were collected, each containing 512 data points. The XGBoost algorithm was used to rank the importance of the data points, and the 65 data points with the greatest influence on recognition accuracy were extracted.
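As a minimal illustration of the selection step described above, the following Python sketch keeps the 65 highest-ranked of 512 data points. The importance scores here are random placeholders, not values produced by XGBoost, and the helper names `top_k_indices` and `reduce_spectrum` are hypothetical; in practice the scores would presumably come from a fitted XGBoost model.

```python
import random


def top_k_indices(importances, k):
    """Return the indices of the k largest scores, in ascending index order."""
    ranked = sorted(range(len(importances)),
                    key=lambda i: importances[i], reverse=True)
    return sorted(ranked[:k])


def reduce_spectrum(spectrum, keep):
    """Keep only the selected data points of a single spectrum."""
    return [spectrum[i] for i in keep]


# Assumption: placeholder data. Real importances would come from a trained model,
# and real spectra from the spectrometer.
rng = random.Random(0)
N_POINTS, N_KEEP = 512, 65
importances = [rng.random() for _ in range(N_POINTS)]
spectrum = [rng.random() for _ in range(N_POINTS)]

keep = top_k_indices(importances, N_KEEP)
reduced = reduce_spectrum(spectrum, keep)
print(len(reduced))  # prints 65
```

Returning the indices in ascending order preserves the wavelength ordering of the retained points, which keeps the reduced spectrum easy to inspect.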
SVM and XGBoost were each used to build a rapid microplastic recognition model from the 65 data points retained after dimensionality reduction. GridSearchCV was used to tune the hyperparameters with the greatest influence on the XGBoost algorithm; the optimal values of n_estimators, learning_rate, min_child_weight, max_depth, and gamma were 700, 0.07, 1, 1, and 0.0, respectively. To improve the model's stability, recognition rate, and generalization ability, the model was evaluated with 10-fold cross-validation and a confusion matrix. The results show a recognition accuracy of 97% for the XGBoost model and 95% for the SVM model. In conclusion, the overall performance of the XGBoost model was better than that of the SVM model, which provides technical support for the rapid identification of microplastics in real samples.
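The evaluation protocol can be sketched in plain Python. The example below runs 10-fold cross-validation and pools the held-out predictions into a single confusion matrix. It is a sketch under stated assumptions: the data are synthetic, a nearest-centroid classifier stands in for the SVM and XGBoost models, hyperparameter search is omitted, and all function names are hypothetical.

```python
import random


def kfold_indices(n_samples, n_folds, seed=0):
    """Shuffle the sample indices and split them into n_folds near-equal folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[i::n_folds] for i in range(n_folds)]


def fit_centroids(X, y):
    """Stand-in classifier (not SVM or XGBoost): mean feature vector per class."""
    sums, counts = {}, {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, value in enumerate(row):
            acc[j] += value
        counts[label] = counts.get(label, 0) + 1
    return {c: [s / counts[c] for s in sums[c]] for c in sums}


def predict(centroids, row):
    """Assign the class whose centroid is nearest in squared Euclidean distance."""
    return min(centroids,
               key=lambda c: sum((a - b) ** 2 for a, b in zip(row, centroids[c])))


def cross_validate(X, y, n_classes, n_folds=10):
    """Return the confusion matrix pooled over all folds (rows: true, columns: predicted)."""
    cm = [[0] * n_classes for _ in range(n_classes)]
    for test_idx in kfold_indices(len(X), n_folds):
        held_out = set(test_idx)
        train = [i for i in range(len(X)) if i not in held_out]
        model = fit_centroids([X[i] for i in train], [y[i] for i in train])
        for i in test_idx:
            cm[y[i]][predict(model, X[i])] += 1
    return cm


def accuracy(cm):
    """Fraction of samples on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in cm)
    return sum(cm[i][i] for i in range(len(cm))) / total


# Assumption: synthetic, well-separated classes in place of real spectra.
rng = random.Random(42)
N_CLASSES, N_PER_CLASS, N_FEATURES = 3, 60, 4
X, y = [], []
for label in range(N_CLASSES):
    for _ in range(N_PER_CLASS):
        X.append([rng.gauss(5.0 * label, 0.5) for _ in range(N_FEATURES)])
        y.append(label)

cm = cross_validate(X, y, N_CLASSES)
for row in cm:
    print(row)
print("accuracy:", accuracy(cm))
```

Pooling predictions across folds means every sample is scored exactly once by a model that did not see it during training, so the off-diagonal entries show which polymer classes are confused with one another.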