• Spectroscopy and Spectral Analysis
  • Vol. 41, Issue 12, 3857 (2021)
Long DUAN 1, Tian-ying YAN 1, Jiang-li WANG 2,3, Wei-xin YE 1, Wei CHEN 1, Pan GAO 1,2,*, and Xin LÜ 2,3,*
Author Affiliations
  • 1. College of Information Science and Technology, Shihezi University, Shihezi 832003, China
  • 2. The Key Laboratory of Oasis Eco-Agriculture, Xinjiang Production and Construction Corps, Shihezi 832003, China
    DOI: 10.3964/j.issn.1000-0593(2021)12-3857-07
    Long DUAN, Tian-ying YAN, Jiang-li WANG, Wei-xin YE, Wei CHEN, Pan GAO, Xin LÜ. Combine Hyperspectral Imaging and Machine Learning to Identify the Age of Cotton Seeds[J]. Spectroscopy and Spectral Analysis, 2021, 41(12): 3857

    Abstract

    Precision cotton seeding is now widely used in the Xinjiang Production and Construction Corps. It reliably meets the agronomic standard of one seed per hole, but it also places higher demands on the screening of high-quality cotton seeds. To avoid the drop in germination rate caused by low-vigor seeds carried over from previous years, machine learning combined with near-infrared (NIR) hyperspectral imaging (HSI) can identify the harvest year of cotton seeds accurately, allowing rapid, nondestructive screening. A total of 1,440 cotton seeds with no visible difference in appearance were used as samples: 360 seeds from each of 2016, 2017, 2018, and 2019, divided into training, validation, and test sets at a ratio of 3:1:1. Hyperspectral images covering 915-1,698 nm were acquired in batches of 60 seeds, and average spectra were extracted as the raw data, restricted to 1,002-1,602 nm to remove the obvious noise at both ends of the range. The Savitzky-Golay (SG) smoothing algorithm was used to preprocess the spectra, and the principal component analysis loading (PCA-loading) method was used to select 13 effective wavelengths. Six classification models were built on both the full spectra and the effective wavelengths: logistic regression (LR), partial least squares discriminant analysis (PLS-DA), support vector machine (SVM), recurrent neural network (RNN), long short-term memory network (LSTM), and convolutional neural network (CNN). With the full spectra, the test-set identification accuracies of the six models were 96.27%, 98.98%, 99.32%, 96.95%, 97.63%, and 100%, respectively, with the CNN and SVM models performing best.
With the effective wavelengths, the test-set identification accuracies of the six models were 93.56%, 97.29%, 98.30%, 95.25%, 94.24%, and 99.66%, respectively, and the CNN and SVM models again gave the best classification results. These results show that all six models can identify the cotton seed year with high accuracy when the full spectra are used, and that the accuracy of the CNN and SVM models remains above 98% when only the effective wavelengths are used. The deep learning methods were generally better than the traditional machine learning methods, although the traditional methods still maintained good identification accuracy. Therefore, combining NIR hyperspectral imaging with machine learning can achieve high-precision identification of the cotton seed year, providing a theoretical foundation and a method for selecting high-quality cotton seeds in precision sowing.
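
The preprocessing steps named in the abstract (a 3:1:1 split per year, SG smoothing, and PCA-loading wavelength selection) can be illustrated with a minimal NumPy sketch. It is not the authors' code. The spectra are random placeholders, and the band count (200), the SG window and polynomial order, the number of principal components, and the rule of ranking bands by their largest absolute loading are all assumptions made for illustration; the abstract does not state these details. The function names are hypothetical.

```python
import numpy as np

def savgol_smooth(spectra, window=7, polyorder=2):
    """Savitzky-Golay smoothing of each row (window and order are assumed values)."""
    half = window // 2
    # Design matrix of a local polynomial on offsets -half..half.
    design = np.vander(np.arange(-half, half + 1), polyorder + 1, increasing=True)
    # Row 0 of the pseudo-inverse gives the weights of the fitted center value.
    weights = np.linalg.pinv(design)[0]
    padded = np.pad(spectra, ((0, 0), (half, half)), mode="edge")
    # The smoothing weights are symmetric, so convolution equals correlation.
    return np.stack([np.convolve(row, weights, mode="valid") for row in padded])

def split_3_1_1(labels, rng):
    """Split sample indices 3:1:1 into train/validation/test within each class."""
    train, val, test = [], [], []
    for cls in np.unique(labels):
        idx = rng.permutation(np.flatnonzero(labels == cls))
        n_train = len(idx) * 3 // 5
        n_val = len(idx) // 5
        train.extend(idx[:n_train])
        val.extend(idx[n_train:n_train + n_val])
        test.extend(idx[n_train + n_val:])
    return np.array(train), np.array(val), np.array(test)

def pca_loading_wavelengths(spectra, wavelengths, n_components=3, n_select=13):
    """Pick the bands with the largest absolute loadings on the leading PCs.

    Assumed simplification: the ranking rule and n_components are illustrative.
    """
    centered = spectra - spectra.mean(axis=0)
    # Rows of vt are the principal-component loading vectors.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    score = np.abs(vt[:n_components]).max(axis=0)
    top = np.sort(np.argsort(score)[-n_select:])
    return wavelengths[top]

rng = np.random.default_rng(0)
labels = np.repeat([2016, 2017, 2018, 2019], 360)      # 1,440 seeds, 360 per year
wavelengths = np.linspace(1002, 1602, 200)             # band count is assumed
spectra = rng.normal(size=(labels.size, wavelengths.size))  # synthetic placeholder data

smoothed = savgol_smooth(spectra)
train, val, test = split_3_1_1(labels, rng)
selected = pca_loading_wavelengths(smoothed[train], wavelengths)
print(len(train), len(val), len(test), selected.size)  # 864 288 288 13
```

Fitting the PCA on the training rows only keeps the validation and test sets out of the wavelength selection. Whether the original study did the same is not stated in the abstract.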