• Laser & Optoelectronics Progress
  • Vol. 57, Issue 12, 120005 (2020)
Zhongjing Duan1, Shaobo Li1,2,*, Jianjun Hu2, Jing Yang2, and Zheng Wang2
Author Affiliations
  • 1Key Laboratory of Advanced Manufacturing Technology of Ministry of Education, Guizhou University, Guiyang, Guizhou 550025, China
  • 2School of Mechanical Engineering, Guizhou University, Guiyang, Guizhou 550025, China
    DOI: 10.3788/LOP57.120005
    Zhongjing Duan, Shaobo Li, Jianjun Hu, Jing Yang, Zheng Wang. Review of Deep Learning Based Object Detection Methods and Their Mainstream Frameworks[J]. Laser & Optoelectronics Progress, 2020, 57(12): 120005
    Fig. 1. LeNet network structure[9]
    Fig. 2. AlexNet network structure[10]
    Fig. 3. VGG16 network structure[15]
    Fig. 4. GoogLeNet network structure[19]
    Fig. 5. Basic module of ResNet network[20]
    Fig. 6. DenseNet network structure[21]
    Fig. 7. Overview of object detection algorithms proposed from November 2013 to October 2019
    Fig. 8. R-CNN implementation flow chart[22]
    Fig. 9. Schematic diagram of SPP-Net implementation[24]
    Fig. 10. Fast R-CNN implementation flow chart[25]
    Fig. 11. Flow chart of Faster R-CNN implementation[27]
    Fig. 12. Implementation process comparison of R-CNN, Fast R-CNN, and Faster R-CNN
    Fig. 13. Flow chart of Mask R-CNN implementation[33]
    Fig. 14. FSAF implementation flow chart
    | Model | Method used | Disadvantages | Improvements |
    |---|---|---|---|
    | R-CNN | 1) Region proposal (selective search, SS); 2) feature extraction (ConvNet); 3) classification (SVM); 4) regression (candidate bounding boxes) | 1) Complex training steps; 2) training and testing are slow and take up a lot of disk space; 3) CNN features are not learned or updated during SVM and regression training | 1) Raises mAP from 34.3% (DPM HSC) to 66%; 2) combines region proposals with a convolutional network |
    | Fast R-CNN | 1) Region proposal (SS); 2) feature extraction (ConvNet); 3) classification (softmax); 4) Bbox regression (multi-task loss function) | 1) Region proposals are still extracted with SS (taking 2-3 s); 2) difficult to meet real-time requirements; 3) the GPU is used, but the region proposal method runs on the CPU | 1) mAP increases by 4 percentage points from 66%; 2) training and testing speeds are improved |
    | Faster R-CNN | 1) Region proposal network (RPN); 2) feature extraction (ConvNet); 3) classification (softmax); 4) Bbox regression (multi-task loss function) | 1) Real-time object detection is not achieved; 2) generating region proposals and then classifying them is computationally expensive | 1) Generating proposal boxes with a convolutional network takes only 10 ms; 2) detection accuracy and speed are improved; 3) implements an end-to-end object detection framework |
    Table 1. Comparison of R-CNN, Fast R-CNN, and Faster R-CNN
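The "multi-task loss function" that Table 1 lists for Fast R-CNN and Faster R-CNN combines a classification term with a bounding-box regression term. The following is a minimal NumPy sketch for a single region of interest, assuming the standard formulation from the Fast R-CNN paper (log loss on the softmax output plus smooth L1 on the box offsets, with the box term skipped for background). The function names and the default weight `lam` are illustrative and are not taken from this review.

```python
import numpy as np

def smooth_l1(x):
    """Elementwise smooth L1: 0.5 * x^2 if |x| < 1, else |x| - 0.5."""
    x = np.abs(np.asarray(x, dtype=float))
    return np.where(x < 1.0, 0.5 * x ** 2, x - 0.5)

def multi_task_loss(class_probs, true_class, box_pred, box_target, lam=1.0):
    """Loss for one region of interest (RoI).

    class_probs: softmax output over K + 1 classes (index 0 = background).
    true_class:  ground-truth class index u.
    box_pred:    predicted offsets (tx, ty, tw, th) for class u.
    box_target:  regression targets (vx, vy, vw, vh).
    lam:         weight balancing the two terms (illustrative default).
    """
    # Classification term: negative log-probability of the true class.
    cls_loss = -np.log(class_probs[true_class])
    # Background RoIs have no ground-truth box, so the box term is dropped.
    if true_class == 0:
        return float(cls_loss)
    diff = np.asarray(box_pred, dtype=float) - np.asarray(box_target, dtype=float)
    loc_loss = smooth_l1(diff).sum()
    return float(cls_loss + lam * loc_loss)
```

Because both terms are computed from the same shared features, classification and box regression can be trained jointly, in contrast to the separate SVM and regression stages listed for R-CNN.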