• Spectroscopy and Spectral Analysis
  • Vol. 42, Issue 7, 2148 (2022)
Ping JIANG1, Hao-xiang LU2, and Zhen-bing LIU2,*
Author Affiliations
  • 1. School of Computer and Information Technology, Guangxi Police College, Nanning 530028, China
  • 2. College of Computer and Information Security, Guilin University of Electronic Technology, Guilin 541004, China
    DOI: 10.3964/j.issn.1000-0593(2022)07-2148-08
    Ping JIANG, Hao-xiang LU, Zhen-bing LIU. Drugs Identification Using Near-Infrared Spectroscopy Based on Random Forest and CatBoost[J]. Spectroscopy and Spectral Analysis, 2022, 42(7): 2148
    Fig. 1. The structure of RF-CatBoost
    Fig. 2. NIR spectra of pretreated cefixime tablets
    Fig. 3. Covariance matrix of drug NIR data before (a) and after (b) pretreatment
    Fig. 4. Classification accuracy of CatBoost with different numbers of decision trees on datasets of different sizes in group A (a) and group B (b)
    Fig. 5. Standard deviations of each model on datasets of different sizes in group A (a) and group B (b)
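    | Manufacturer | Non-aluminum-plastic packaging | Aluminum-plastic packaging | Total |
    |---|---|---|---|
    | Hunan Fangsheng Pharmaceutical Co., Ltd. | 54 | 54 | 108 |
    | Jiangsu Chia Tai Qingjiang Pharmaceutical Co., Ltd. | 63 | 56 | 119 |
    | Shandong Lukang Pharmaceutical Co., Ltd. | 51 | 40 | 91 |
    | Shandong Luoxin Pharmaceutical Co., Ltd. | 48 | 48 | 96 |
    | Total | 216 | 198 | 414 |
    Table 1. Near-infrared spectral data of cefixime tablets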
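    | Dataset | Total samples | Positive samples | Negative samples |
    |---|---|---|---|
    | A | 40 | 15 | 25 |
    | A | 60 | 20 | 40 |
    | A | 80 | 25 | 55 |
    | A | 100 | 30 | 70 |
    | A | 120 | 35 | 85 |
    | A | 140 | 40 | 100 |
    | A | 160 | 45 | 115 |
    | A | 180 | 50 | 130 |
    | B | 30 | 10 | 20 |
    | B | 50 | 15 | 35 |
    | B | 70 | 20 | 50 |
    | B | 90 | 25 | 65 |
    | B | 110 | 30 | 80 |
    | B | 130 | 35 | 95 |
    | B | 150 | 40 | 110 |
    | B | 170 | 45 | 125 |
    Table 2. Configuration of training sets of different sizes in groups A and B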
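    | Group | Training/test set | ELM | SWELM | SVM | BP | Boosting | CatBoost | RF-CatBoost |
    |---|---|---|---|---|---|---|---|---|
    | A | 40/176 | 92.36 | 92.33 | 93.68 | 89.36 | 94.44 | 94.99 | 96.79 |
    | A | 60/156 | 93.65 | 93.88 | 94.09 | 90.31 | 94.95 | 95.03 | 97.89 |
    | A | 80/136 | 94.99 | 95.11 | 95.35 | 91.01 | 96.22 | 96.85 | 98.82 |
    | A | 100/116 | 96.03 | 96.29 | 96.88 | 91.85 | 97.05 | 97.52 | 99.05 |
    | A | 120/96 | 97.88 | 97.95 | 97.23 | 92.99 | 97.99 | 97.89 | 99.95 |
    | A | 140/76 | 97.99 | 97.64 | 98.88 | 93.35 | 98.85 | 98.98 | 100 |
    | A | 160/56 | 98.05 | 98.01 | 99.05 | 94.99 | 99.01 | 99.02 | 100 |
    | A | 180/36 | 98.88 | 98.91 | 99.01 | 95.89 | 99.18 | 99.35 | 100 |
    | B | 30/168 | 91.28 | 90.99 | 92.34 | 88.75 | 92.86 | 93.95 | 95.97 |
    | B | 50/148 | 92.69 | 91.67 | 93.38 | 90.31 | 94.01 | 94.98 | 96.59 |
    | B | 70/128 | 93.19 | 93.11 | 94.20 | 91.11 | 95.19 | 96.82 | 98.79 |
    | B | 90/108 | 94.25 | 94.26 | 95.38 | 91.85 | 96.21 | 96.98 | 99.92 |
    | B | 110/88 | 94.95 | 95.89 | 96.21 | 92.99 | 96.89 | 97.99 | 100 |
    | B | 130/68 | 95.92 | 97.22 | 98.09 | 93.35 | 97.96 | 98.09 | 100 |
    | B | 150/48 | 97.95 | 98.09 | 98.89 | 94.99 | 98.39 | 98.19 | 100 |
    | B | 170/28 | 98.88 | 98.85 | 99.00 | 95.89 | 99.08 | 98.51 | 100 |
    Table 3. Classification accuracy of each model on datasets of different sizes in groups A and B (%)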
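The split sizes and accuracies in Table 3 lend themselves to a quick arithmetic check. The sketch below uses values transcribed by hand from Table 3. It confirms that every training/test split sums to a constant group size: 216 for group A and 198 for group B. These match the two packaging totals in Table 1, although the captions do not state that correspondence. It also computes the average accuracy margin of RF-CatBoost over plain CatBoost. The names `group_a`, `group_b`, `split_totals`, and `mean_gain` are illustrative and do not come from the paper.

```python
# Values transcribed from Table 3 (accuracy in %).
# Each row: (training size, test size, CatBoost accuracy, RF-CatBoost accuracy)
group_a = [
    (40, 176, 94.99, 96.79),
    (60, 156, 95.03, 97.89),
    (80, 136, 96.85, 98.82),
    (100, 116, 97.52, 99.05),
    (120, 96, 97.89, 99.95),
    (140, 76, 98.98, 100.0),
    (160, 56, 99.02, 100.0),
    (180, 36, 99.35, 100.0),
]
group_b = [
    (30, 168, 93.95, 95.97),
    (50, 148, 94.98, 96.59),
    (70, 128, 96.82, 98.79),
    (90, 108, 96.98, 99.92),
    (110, 88, 97.99, 100.0),
    (130, 68, 98.09, 100.0),
    (150, 48, 98.19, 100.0),
    (170, 28, 98.51, 100.0),
]


def split_totals(rows):
    """Return the set of distinct (training + test) totals across all rows."""
    return {train + test for train, test, _, _ in rows}


def mean_gain(rows):
    """Average of (RF-CatBoost - CatBoost) accuracy, in percentage points."""
    gains = [rf - cb for _, _, cb, rf in rows]
    return sum(gains) / len(gains)


if __name__ == "__main__":
    print("Group A split totals:", split_totals(group_a))
    print("Group B split totals:", split_totals(group_b))
    print("Group A mean gain: %.2f points" % mean_gain(group_a))
    print("Group B mean gain: %.2f points" % mean_gain(group_b))
```

On these transcribed values, the mean margin is about 1.61 percentage points in group A and 1.97 in group B.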
    | Group | Training/test set | ELM | SWELM | SVM | BP | Boosting | CatBoost | RF-CatBoost |
    |---|---|---|---|---|---|---|---|---|
    | A | 40/176 | 0.0094 | 0.0031 | 0.0170 | 38.0988 | 15.9288 | 8.7170 | 6.0883 |
    | A | 60/156 | 0.0130 | 0.0050 | 0.0321 | 38.3397 | 17.9854 | 15.9030 | 7.4083 |
    | A | 80/136 | 0.0138 | 0.0158 | 0.0633 | 38.4498 | 20.1116 | 23.1474 | 9.2374 |
    | A | 100/116 | 0.0211 | 0.0171 | 0.1138 | 38.8647 | 22.2730 | 30.4410 | 9.3659 |
    | A | 120/96 | 0.0311 | 0.0302 | 0.1518 | 39.8008 | 24.8846 | 38.0902 | 10.2973 |
    | A | 140/76 | 0.0442 | 0.0401 | 0.2198 | 40.8850 | 26.6906 | 45.0626 | 10.9888 |
    | A | 160/56 | 0.0567 | 0.0597 | 0.2878 | 41.2978 | 28.5652 | 52.3820 | 12.0834 |
    | A | 180/36 | 0.0787 | 0.0740 | 0.3595 | 42.3807 | 30.6612 | 59.5264 | 12.5308 |
    | B | 30/168 | 0.0175 | 0.0094 | 0.0073 | 2.1710 | 2.5106 | 1.4962 | 0.4771 |
    | B | 50/148 | 0.0535 | 0.0237 | 0.0243 | 4.1599 | 3.3227 | 2.3957 | 1.1641 |
    | B | 70/128 | 0.1275 | 0.0533 | 0.0523 | 6.2080 | 4.0817 | 3.3258 | 2.0313 |
    | B | 90/108 | 0.2155 | 0.0983 | 0.1027 | 8.3506 | 5.3362 | 4.4220 | 2.9074 |
    | B | 110/88 | 0.3391 | 0.1670 | 0.1778 | 10.4160 | 6.1743 | 5.4066 | 3.8312 |
    | B | 130/68 | 0.4958 | 0.2602 | 0.2953 | 12.4040 | 6.8879 | 6.4118 | 4.7850 |
    | B | 150/48 | 0.7061 | 0.4063 | 0.4393 | 14.3572 | 7.6945 | 7.3631 | 5.7797 |
    | B | 170/28 | 0.9120 | 0.6168 | 0.6430 | 16.3228 | 8.5655 | 8.3464 | 6.8291 |
    Table 4. Running time of each model on datasets of different sizes in group A and group B (s)