• Spectroscopy and Spectral Analysis
  • Vol. 38, Issue 6, 1766 (2018)
LI Sheng-fang1,2,*, JIA Min-zhi1, and DONG Da-ming2,3
Author Affiliations
  • 1[in Chinese]
  • 2[in Chinese]
  • 3[in Chinese]
    DOI: 10.3964/j.issn.1000-0593(2018)06-1766-06
    LI Sheng-fang, JIA Min-zhi, DONG Da-ming. Fast Measurement of Sugar in Fruits Using Near Infrared Spectroscopy Combined with Random Forest Algorithm[J]. Spectroscopy and Spectral Analysis, 2018, 38(6): 1766

    Abstract

    In recent years, many researchers have studied near-infrared (NIR) spectroscopy for measuring sugar content and other internal quality attributes of fruit, and some commercial instruments have been produced. However, because NIR spectra are complex, models built from them often transfer poorly: a model is typically valid only for a particular species, or even a single variety. Random forest (RF) is an ensemble algorithm based on decision trees that improves prediction accuracy by combining many classification and regression tree (CART) models. Compared with partial least squares (PLS), multiple linear regression (MLR), and similar methods, RF is better able to model nonlinear data. To account for the randomness of the RF model, we optimized it by tuning the number of decision trees (ntree) and the number of split variables (mtry). In this study, we used RF to predict the sugar content of different types of fruit (apple and pear). For a single type of fruit, both RF and PLS gave good modeling and prediction results. For different types of fruit, however, RF significantly improved the predictive ability of the model: R2 rose from 0.878 for the PLS model to 0.999 for the RF model, and RMSEC fell from 0.453 to 0.015. The optimal RF model was also evaluated on an independent test set, where R2 was 0.731 for the PLS model and 0.888 for the RF model, with reported RMSEC values of 1.148 and 0.334, respectively. RF thus showed a significant advantage in predicting sugar content across multiple types of fruit. These results indicate that RF can be applied to measure the sugar content of fruit by NIR spectroscopy, thereby addressing the problem of model universality and transferability.
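The R2 and RMSE figures quoted in the abstract can be illustrated with a minimal sketch. It assumes R2 is the coefficient of determination (1 minus the ratio of residual to total sum of squares) and RMSE is the root mean squared error; the abstract does not state which definitions the authors used. The function names and the sample values are hypothetical and are not data from the paper.

```python
import math


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root mean squared error between reference and predicted values."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical reference sugar values and model predictions (made-up numbers).
reference = [10.0, 12.0, 14.0]
predicted = [11.0, 12.0, 13.0]

print("R2   =", r_squared(reference, predicted))
print("RMSE =", rmse(reference, predicted))
```

Computed on the calibration samples, the second quantity corresponds to RMSEC; computed on held-out samples, the same formula gives a prediction error. A higher R2 together with a lower RMSE is the pattern the abstract reports for RF relative to PLS.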