• Spectroscopy and Spectral Analysis
  • Vol. 39, Issue 7, 2271 (2019)
JI Hai-yan1,2,*, REN Zhan-qi1,2, and RAO Zhen-hong3
Author Affiliations
  • 1[in Chinese]
  • 2[in Chinese]
  • 3[in Chinese]
    DOI: 10.3964/j.issn.1000-0593(2019)07-2271-07
    JI Hai-yan, REN Zhan-qi, RAO Zhen-hong. Discriminant Analysis of Millet from Different Origins Based on Hyperspectral Imaging Technology[J]. Spectroscopy and Spectral Analysis, 2019, 39(7): 2271

    Abstract

    Hyperspectral imaging technology has been widely used in the detection of agricultural products. This paper studies the non-destructive identification of millet samples from different regions based on hyperspectral imaging and machine learning algorithms. Millet samples from seven provinces were grouped into five categories by geographical region: Dongbei, Hebei, Shaanxi, Shandong, and Shanxi. A total of 23 samples were collected: 6 from Dongbei, 5 from Shanxi, and 4 each from Hebei, Shaanxi, and Shandong. Each sample was divided into 10 equal parts, and hyperspectral data in the 900 to 1 700 nm band were collected with a hyperspectral imager. To reduce the influence of uneven illumination and dark current, the collected hyperspectral data underwent black-and-white correction. ENVI software was used to select regions of interest (ROIs) in the millet hyperspectral images, with 9 ROIs selected for each part. The average spectrum within each ROI was calculated and used as one spectrum record of the sample. In total, 2 070 spectral curves were obtained (23 samples × 10 parts × 9 ROIs): 540 from Dongbei, 450 from Shanxi, and 360 each from Hebei, Shandong, and Shaanxi. Because scattering caused by the uneven sample surface distorts the true spectral information of millet, multiplicative scatter correction (MSC) pretreatment was applied to the original spectra. The corrected spectral data were then randomly divided into a training set and a test set, with the test set making up 30% of the data. Linear Discriminant Analysis (LDA) was used to visualize the spectral data of millet from different origins. The test set was then fed into the trained LDA model, and a confusion matrix of the prediction results was produced.
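The two preprocessing steps described above can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes the common reflectance formula (raw − dark) / (white − dark) for the black-and-white correction and the usual mean-spectrum reference for MSC. The function names and the toy spectra are hypothetical.

```python
import numpy as np

def black_white_correct(raw, white, dark):
    """Relative reflectance from raw intensity and white/dark reference frames.

    Assumes the common formula (raw - dark) / (white - dark).
    """
    raw, white, dark = (np.asarray(a, dtype=float) for a in (raw, white, dark))
    return (raw - dark) / (white - dark)

def msc(spectra, reference=None):
    """Multiplicative scatter correction for an (n_spectra, n_bands) array.

    Each spectrum x is regressed on the reference (by default the mean
    spectrum): x ~ intercept + slope * reference. The corrected spectrum
    is (x - intercept) / slope.
    """
    spectra = np.asarray(spectra, dtype=float)
    ref = spectra.mean(axis=0) if reference is None else np.asarray(reference, dtype=float)
    corrected = np.empty_like(spectra)
    for i, x in enumerate(spectra):
        slope, intercept = np.polyfit(ref, x, deg=1)  # highest degree first
        corrected[i] = (x - intercept) / slope
    return corrected

# Hypothetical toy data: one underlying spectrum seen with three different
# additive offsets and multiplicative gains (a crude stand-in for scatter).
grid = np.linspace(0.0, 6.0, 50)
base = np.linspace(0.2, 0.8, 50) + 0.05 * np.sin(grid)
gains = np.array([0.8, 1.0, 1.3])
offsets = np.array([0.05, 0.0, -0.02])
spectra = offsets[:, None] + gains[:, None] * base

corrected = msc(spectra)
reflectance = black_white_correct([30.0, 55.0], [110.0, 110.0], [10.0, 10.0])
```

Because the toy spectra differ only by an offset and a gain, MSC maps all three onto the same curve (the mean spectrum); real spectra would keep their chemical differences after correction.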
The results showed that LDA achieved prediction accuracies of 0.84 and 0.99 for Shaanxi and Shanxi, but only 0.68, 0.68, and 0.40 for Dongbei, Hebei, and Shandong. Recursive feature elimination (RFE) was therefore used to select useful spectral information, remove redundant information, and improve prediction accuracy. RFE combined with a support vector machine (SVM) and with Logistic Regression (LR) was used to compare the discrimination of millet from different regions. The training set of millet spectral data was fed into the SVM-RFE and LR-RFE models, and the optimal feature subsets were selected using the micro-averaged F-value of the model (Micro_F) with 3-fold cross validation. LR-RFE selected 74 wavelengths with a Micro_F of 0.59, while SVM-RFE selected 220 wavelengths with a Micro_F of 0.66. The selected feature subsets were then applied to the test set, which was fed into the SVM and LR models respectively; the confusion matrix of the predictions and the receiver operating characteristic (ROC) curve of each model were used for evaluation. The prediction accuracy of SVM-RFE was 1, 0.37, 0.72, 0, and 1 for Dongbei, Hebei, Shaanxi, Shandong, and Shanxi, and the area under the ROC curve (AUC) was 0.82, 0.92, 0.93, 0.70, and 0.99 respectively. The prediction accuracy of LR-RFE was 0.92, 0, 0.97, 0, and 0.80, and the AUC was 0.72, 0.74, 0.94, 0.66, and 0.88 respectively. These results show that the overall classification performance of the SVM-RFE model was better than that of LR-RFE, although LR-RFE discriminated the Shaanxi class better than SVM-RFE. Neither model could effectively discriminate the Hebei and Shandong categories. Compared with LDA, the prediction accuracy of these two models was improved.
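The feature-selection procedure described above (RFE wrapped around SVM or LR, scored by micro-averaged F with 3-fold cross validation, then evaluated on a 30% test set) can be sketched with scikit-learn. This is only an illustration under stated assumptions: the data are synthetic (`make_classification`), not the millet spectra, and the linear SVM kernel, the stratified split, the `random_state` values, and the helper name `select_and_score` are all choices made for the example rather than details given in the abstract.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFECV
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

N_CLASSES = 5    # five origin categories in the abstract
N_FEATURES = 20  # stand-in for the spectral bands (assumption)

# Synthetic stand-in data; NOT the millet spectra.
X, y = make_classification(
    n_samples=300, n_features=N_FEATURES, n_informative=6, n_redundant=2,
    n_classes=N_CLASSES, n_clusters_per_class=1, random_state=0,
)

# 30% test set as in the abstract; stratification is an added assumption.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0,
)

def select_and_score(estimator):
    """Run RFE with 3-fold CV on micro-F, then score each class on the test set."""
    selector = RFECV(estimator, step=1, cv=3, scoring="f1_micro")
    selector.fit(X_train, y_train)
    y_pred = selector.predict(X_test)  # uses only the selected features
    cm = confusion_matrix(y_test, y_pred, labels=np.arange(N_CLASSES))
    # Fraction of each true class predicted correctly (row-normalized diagonal).
    per_class = cm.diagonal() / cm.sum(axis=1)
    return selector.n_features_, per_class

results = {
    "SVM-RFE": select_and_score(SVC(kernel="linear")),
    "LR-RFE": select_and_score(LogisticRegression(max_iter=1000)),
}
for name, (n_selected, per_class) in results.items():
    print(name, n_selected, np.round(per_class, 2))
```

The row-normalized diagonal is one plausible reading of the per-class accuracies quoted in the abstract; the numbers printed here come from synthetic data and are not comparable to the reported results.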