Improved Tiny YOLOv4 Algorithm and Its Application in Pedestrian Detection

Yong Xuan; Chao Han; Wenhan Sha

doi:10.3788/LOP202259.1215002
