• Infrared and Laser Engineering
  • Vol. 51, Issue 4, 20220166 (2022)
Yiduo Li, Zibo Guo, Kai Liu, and Xiaoyao Sun
Author Affiliations
  • School of Computer Science and Technology, Xidian University, Xi'an 710071, China
    DOI: 10.3788/IRLA20220166
    Yiduo Li, Zibo Guo, Kai Liu, Xiaoyao Sun. Mixed-precision quantization for neural networks based on error limit (Invited)[J]. Infrared and Laser Engineering, 2022, 51(4): 20220166
    Fig. 1. (a) 8-bit quantization process for a deep learning convolution[6]; (b) Trend of the maximum weight values in the first 20 layers of the YOLOv5s network; (c) Distribution of the activation maximum and the cutoff value during quantization of YOLOv5s
    Fig. 2. Framework of network hierarchical policy methodology
    Fig. 3. Example of COCO dataset detection results
    | Quantization method | Operation |
    |---|---|
    | $q(w, b_i) = \mathrm{round}(w / s)$ | Multiplication |
    | $q(w, b_i) = \mathrm{round}(w \times 2^{fl})$ | Shift (displacement) |
    Table 1. Product quantization method and shift quantization method
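The two rows of Table 1 can be compared with a minimal Python sketch. The source gives only the two rounding formulas; how $s$ and $fl$ are chosen from the bit width $b_i$ is not stated here, so the symmetric signed range, the max-based scale, and the helper names `clamp`, `quantize_multiply`, and `quantize_shift` are all assumptions made for illustration.

```python
import math

def clamp(q, bits):
    # Assumption: signed integer range for a `bits`-wide value.
    lo, hi = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    return max(lo, min(hi, q))

def quantize_multiply(weights, bits):
    # q = round(w / s) with an arbitrary real scale s.
    # Assumption: s maps the largest magnitude onto the largest positive integer.
    max_abs = max(abs(w) for w in weights)
    s = max_abs / (2 ** (bits - 1) - 1)
    # Python's round() sends exact ties to the even integer; the source
    # does not specify tie handling.
    return [clamp(round(w / s), bits) for w in weights], s

def quantize_shift(weights, bits):
    # q = round(w * 2**fl): the scale is a power of two, so rescaling
    # can be done with a bit shift instead of a multiplication.
    # Assumption: fl is the largest integer for which max|w| * 2**fl still fits.
    max_abs = max(abs(w) for w in weights)
    fl = math.floor(math.log2((2 ** (bits - 1) - 1) / max_abs))
    return [clamp(round(w * 2 ** fl), bits) for w in weights], fl

weights = [0.5, -0.2, 0.1, -0.03]  # toy values, not taken from the paper
q_mul, s = quantize_multiply(weights, 8)
q_shift, fl = quantize_shift(weights, 8)
```

In this toy case the multiplication variant uses the full 8-bit range (the largest weight maps to 127), while the shift variant is restricted to a power-of-two scale and maps it to 64. That coarser use of the range is one plausible reason for the accuracy gap between the two columns of Table 2, though the source does not state the cause.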
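    | Network model | Dataset | Bit width | mAP@0.5-0.95, shift (displacement) | mAP@0.5-0.95, multiplication |
    |---|---|---|---|---|
    | YOLOv5s | VOC | 8 | 63.4% | 77.9% |
    | YOLOv5s | VOC | 7 | 26.5% | 68.8% |
    | YOLOv5s | VOC | 6 | 4.6% | 39.5% |
    | YOLOv5s | VOC | 32 | 81.8% | 81.8% |
    Table 2. Performance of different quantization methods on the VOC2007 dataset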
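    | Metric | Truncation method | 8-bit | 7-bit | 6-bit | 5-bit | 32-bit |
    |---|---|---|---|---|---|---|
    | mAP | MAX | 78.9% | 67.4% | 46.7% | 4.0% | 82.6% |
    | mAP | MSE | 82.7% | 76.0% | 69.0% | 31.7% | 82.6% |
    Table 3. Network accuracy before and after quantization with different truncation methods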
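The difference between the MAX and MSE truncation rows in Table 3 can be illustrated with a small Python sketch. It assumes that MAX clips at the largest absolute value and that MSE picks the clipping threshold with the lowest mean squared quantization error. The grid search, the symmetric quantizer, and the names `fake_quantize`, `clip_max`, and `clip_mse` are hypothetical and not taken from the source.

```python
def fake_quantize(x, clip, bits):
    # Symmetric uniform quantize-then-dequantize, clipping at +/- clip.
    levels = 2 ** (bits - 1) - 1
    s = clip / levels
    q = max(-levels, min(levels, round(x / s)))
    return q * s

def mse(values, clip, bits):
    # Mean squared error between the values and their quantized versions.
    return sum((v - fake_quantize(v, clip, bits)) ** 2 for v in values) / len(values)

def clip_max(values):
    # MAX truncation: keep the full observed range.
    return max(abs(v) for v in values)

def clip_mse(values, bits, steps=100):
    # MSE truncation, sketched as a grid search over fractions of the maximum.
    top = clip_max(values)
    candidates = [top * k / steps for k in range(1, steps + 1)]
    return min(candidates, key=lambda c: mse(values, c, bits))

# Toy activations: many values in [-1, 1] plus a single outlier at 4.0.
acts = [0.002 * i for i in range(-500, 501)] + [4.0]
t_max = clip_max(acts)
t_mse = clip_mse(acts, bits=4)
```

With an outlier present, the MSE threshold is smaller than the maximum: clipping the outlier costs some error on one value but gives finer steps for the bulk of the distribution. This trade-off matters more at low bit widths, which matches the direction of the gap in Table 3.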
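    | γ | Compression ratio | Average bit | mAP |
    |---|---|---|---|
    | 0.08 | 4.93 | 6.49 | 79.6% |
    | 0.10 | 5.13 | 6.23 | 77.8% |
    | 0.125 | 5.74 | 5.57 | 72.3% |
    | 0.142 | 6.11 | 5.23 | 62.8% |
    | 0.166 | 6.31 | 5.07 | 63.3% |
    | 0.20 | 7.14 | 4.48 | 21.0% |
    Table 4. Error limit parameter γ value comparison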
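The compression ratios in Table 4 are numerically consistent with dividing a 32-bit baseline width by the average bit width. This relation is inferred from the tabulated values and is not stated in the source. The Python sketch below checks it. The parameter-weighted definition of the average bit width and the two-layer example are assumptions for illustration only.

```python
def average_bit(layer_params, layer_bits):
    # Assumption: average bit width is weighted by each layer's parameter count.
    total = sum(layer_params)
    return sum(n * b for n, b in zip(layer_params, layer_bits)) / total

def compression_ratio(avg_bit, full_bits=32):
    # Inferred relation: ratio of the full-precision width to the average width.
    return full_bits / avg_bit

# (average bit, compression ratio) pairs copied from Table 4.
table4 = [(6.49, 4.93), (6.23, 5.13), (5.57, 5.74),
          (5.23, 6.11), (5.07, 6.31), (4.48, 7.14)]
max_gap = max(abs(compression_ratio(a) - r) for a, r in table4)

# Hypothetical two-layer network: 1000 weights at 8 bits, 3000 weights at 4 bits.
toy_avg = average_bit([1000, 3000], [8, 4])
```

Every row of Table 4 agrees with this relation to within 0.01, which is the precision of the tabulated values.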
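    | Dataset | Method | Bit | γ | mAP@0.5 | mAP@0.5-0.95 | Model size |
    |---|---|---|---|---|---|---|
    | COCO | Unified bit | 7 | - | 0.567 | 0.345 | 6.35 |
    | COCO | Unified bit | 6 | - | 0.503 | 0.301 | 5.45 |
    | COCO | Unified bit | 5 | - | 0.386 | 0.215 | 4.54 |
    | COCO | Mixed bit | 6.49 | 0.08 | 0.602 | 0.368 | 5.89 |
    | COCO | Mixed bit | 5.57 | 0.125 | 0.546 | 0.322 | 5.05 |
    | COCO | Mixed bit | 5.07 | 0.166 | 0.446 | 0.260 | 4.60 |
    | COCO | Ori model | 32 | - | 0.636 | 0.411 | 29.07 |
    | VOC2011 | Unified bit | 7 | - | 0.950 | 0.732 | 6.35 |
    | VOC2011 | Unified bit | 6 | - | 0.925 | 0.643 | 5.45 |
    | VOC2011 | Unified bit | 5 | - | 0.533 | 0.295 | 4.54 |
    | VOC2011 | Mixed bit | 6.49 | 0.08 | 0.950 | 0.706 | 5.89 |
    | VOC2011 | Mixed bit | 5.57 | 0.125 | 0.981 | 0.669 | 5.05 |
    | VOC2011 | Mixed bit | 5.07 | 0.166 | 0.782 | 0.456 | 4.60 |
    | VOC2011 | Ori model | 32 | - | 0.950 | 0.786 | 29.07 |
    Table 5. Test results of different quantization methods on the COCO and VOC2011 datasets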
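    | Dataset | Method | Bit | mAP@0.5 | Aeroplane | Bicycle | Bird | Boat | Bottle | Chair | Dog | Person | Sheep | Train | Tvmonitor |
    |---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
    | VOC2011 | Unite | 5 | 0.782 | 0.753 | 0.435 | 0.497 | 0.995 | 0.801 | 0.995 | 0.249 | 0.897 | 0.995 | 0.995 | 0.995 |
    | VOC2011 | Mixed | 5 | 0.533 | 0.232 | 0.324 | 0.497 | 0.484 | 0.209 | 0.995 | 0.332 | 0.455 | 0.995 | 0.995 | 0.34 |
    Table 6. Per-category detection accuracy on the VOC2011 dataset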