• Spectroscopy and Spectral Analysis
  • Vol. 42, Issue 1, 282 (2022)
Xiao-kang ZHAO*, Xin ZHAO, Qi-bing ZHU*, and Min HUANG
Author Affiliations
  • Key Laboratory of Advanced Process Control for Light Industry (Ministry of Education), Jiangnan University, Wuxi 214122, China
    DOI: 10.3964/j.issn.1000-0593(2022)01-0282-10
    Xiao-kang ZHAO, Xin ZHAO, Qi-bing ZHU, Min HUANG. A Model Construction Method of Spectral Nondestructive Detection for Apple Quality Based on Unsupervised Active Learning[J]. Spectroscopy and Spectral Analysis, 2022, 42(1): 282

    Abstract

    Non-destructive detection of agricultural product and food quality by near-infrared spectroscopy amounts to building a machine learning model that maps a sample's spectral information to its quality parameters. A model with good generalization performance usually requires a large number of labeled samples. Spectral information is relatively easy to acquire, but labeling samples with quality parameters is destructive and costly in both time and money. Active learning reduces the number of labeled samples in the training set by selecting the most valuable samples for labeling rather than selecting them at random. The learner thus controls which samples enter the training set instead of passively accepting them. Many active learning algorithms exist for classification tasks, but relatively few for regression tasks, and most of those are supervised: they need a small number of labeled samples to train an initial model. This paper proposes a training sample selection strategy based on unsupervised active learning. First, hierarchical agglomerative clustering partitions the unlabeled spectral dataset (without standard values) into clusters that capture its diversity. Then, the locally linear reconstruction method selects the most representative samples in each cluster to form the training set, and a partial least squares regression model built on this training set predicts the unlabeled samples. To evaluate the proposed method, partial least squares prediction models for soluble solids content and firmness were constructed from near-infrared spectral data of three apple varieties collected over two years.
The experimental results show that the proposed method outperforms existing sample selection strategies, effectively improving model accuracy and reducing the destructive physical and chemical experiments needed for model training. Compared with random sampling (RS), the traditional Kennard-Stone (KS) algorithm, and sample set partitioning based on joint x-y distances (SPXY), the proposed method achieved the best performance. With 200 samples selected as the training set, the root mean square error of the soluble solids content prediction models based on the proposed unsupervised active learning algorithm is 2.0%~13.2% lower than with the other three algorithms, and the root mean square error of the firmness prediction models is 1.2%~15.7% lower.