Review of Deep Learning Based Object Detection Methods and Their Mainstream Frameworks

Zhongjing Duan; Shaobo Li; Jianjun Hu; Jing Yang; Zheng Wang

doi:10.3788/LOP57.120005

Laser & Optoelectronics Progress
Vol. 57, Issue 12, 120005 (2020)

Zhongjing Duan¹, Shaobo Li¹,²,*, Jianjun Hu², Jing Yang², and Zheng Wang²

Author Affiliations

¹Key Laboratory of Advanced Manufacturing Technology of Ministry of Education, Guizhou University, Guiyang, Guizhou 550025, China

²School of Mechanical Engineering, Guizhou University, Guiyang, Guizhou 550025, China

DOI: 10.3788/LOP57.120005

Zhongjing Duan, Shaobo Li, Jianjun Hu, Jing Yang, Zheng Wang. Review of Deep Learning Based Object Detection Methods and Their Mainstream Frameworks[J]. Laser & Optoelectronics Progress, 2020, 57(12): 120005

Fig. 1. LeNet network structure^[9]

Fig. 2. AlexNet network structure^[10]

Fig. 3. VGG16 network structure^[15]

Fig. 4. GoogLeNet network structure^[19]

Fig. 5. Basic module of ResNet network^[20]

Fig. 6. DenseNet network structure^[21]

Fig. 7. Overview of target detection algorithms proposed from November 2013 to October 2019

Fig. 8. R-CNN implementation flow chart^[22]

Fig. 9. Schematic diagram of SPP-Net implementation^[24]

Fig. 10. Fast R-CNN implementation flow chart^[25]

Fig. 11. Flow chart of Faster R-CNN implementation^[27]

Fig. 12. Implementation process comparison of R-CNN, Fast R-CNN, and Faster R-CNN

Fig. 13. Flow chart of Mask R-CNN implementation^[33]

Fig. 14. FSAF implementation flow chart

Model	Method used	Disadvantages	Improvements
R-CNN	1) Region proposal (selective search, SS); 2) feature extraction (ConvNet); 3) classification (SVM); 4) regression (candidate Bbox)	1) Complex training steps; 2) training and testing are slow and take up a lot of disk space; 3) CNN features are not learned and updated during SVM and regression training	1) mAP raised from 34.3% (DPM HSC) to 66%; 2) region proposals and a convolutional network are used
Fast R-CNN	1) Region proposal (SS); 2) feature extraction (ConvNet); 3) classification (softmax); 4) Bbox regression (multi-task loss function)	1) Region proposals are still extracted with SS (taking 2-3 s); 2) difficult to meet real-time requirements; 3) a GPU is used, but the region proposal method runs on the CPU	1) mAP is increased by 4 percentage points from 66%; 2) training and testing speeds are improved
Faster R-CNN	1) Region proposal network (RPN); 2) feature extraction (ConvNet); 3) classification (softmax); 4) Bbox regression (multi-task loss function)	1) Real-time object detection is not achieved; 2) generating region proposals and then classifying them requires a large amount of computation	1) Generating proposal boxes with a convolutional network takes only 10 ms; 2) detection accuracy and speed are improved; 3) an end-to-end object detection framework is implemented

Table 1. Comparison of R-CNN, Fast R-CNN, and Faster R-CNN
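
The "multi-task loss function" listed in Table 1 for Fast R-CNN and Faster R-CNN can be illustrated with a minimal Python sketch. It assumes the formulation given in the Fast R-CNN paper: a softmax log loss on the predicted class plus a smooth L1 loss on the four box offsets, with the box term skipped for background (taken here to be class 0). The function names, the plain-list inputs, and the weight `lam` are illustrative assumptions and are not taken from any particular framework.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def smooth_l1(x):
    """Smooth L1: quadratic for |x| < 1, linear otherwise."""
    ax = abs(x)
    return 0.5 * x * x if ax < 1.0 else ax - 0.5


def multitask_loss(logits, true_class, pred_box, target_box, lam=1.0):
    """Classification loss plus box-regression loss for one region of interest.

    logits     -- raw scores, one per class (index 0 assumed to be background)
    true_class -- ground-truth class index
    pred_box   -- predicted offsets (tx, ty, tw, th) for the true class
    target_box -- regression targets (vx, vy, vw, vh)
    lam        -- assumed weight balancing the two terms
    """
    probs = softmax(logits)
    cls_loss = -math.log(probs[true_class])
    if true_class == 0:
        # Background regions have no ground-truth box, so only classify.
        return cls_loss
    loc_loss = sum(smooth_l1(p - t) for p, t in zip(pred_box, target_box))
    return cls_loss + lam * loc_loss


if __name__ == "__main__":
    # Two classes with equal scores, foreground region, two non-zero offsets.
    print(multitask_loss([0.0, 0.0], 1, [0.5, 0.0, 0.0, 3.0], [0.0, 0.0, 0.0, 0.0]))
```

In the usage line, the classification term is log 2 and the box term is 0.125 + 2.5, so the printed value is about 3.318. Training both heads against one summed loss is what lets Fast R-CNN drop the separate SVM and regressor stages that Table 1 lists for R-CNN.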
