Review of Deep Learning Based Object Detection Methods and Their Mainstream Frameworks

Zhongjing Duan; Shaobo Li; Jianjun Hu; Jing Yang; Zheng Wang

doi:10.3788/LOP57.120005

[1] Yin H P, Chen B, Chai Y et al. Vision-based object detection and tracking: a review[J]. Acta Automatica Sinica, 42, 1466-1489(2016).

[2] Zhang X Y, Gao H B, Zhao J H et al. Overview of deep learning intelligent driving methods[J]. Journal of Tsinghua University(Science and Technology), 58, 438-444(2018).

[3] Li H B, Xu C Y, Hu C C. Improved real-time vehicle detection method based on YoLOV3[J]. Laser & Optoelectronics Progress, 57, 101507(2020).

[4] Li X, Shi B B, Liu Y et al. Multi-target recognition method based on improved YOLOv2 model[J]. Laser & Optoelectronics Progress, 57, 101010(2020).

[5] Wang D C, Chen X N, Zhao F et al. Vehicle detection algorithm based on convolutional neural network and RGB-D images[J]. Laser & Optoelectronics Progress, 56, 181003(2019).

[6] Gowsikhaa D, Abirami S, Baskaran R. Automated human behavior analysis from surveillance videos: a survey[J]. Artificial Intelligence Review, 42, 747-765(2014).

[7] Huang K Q, Chen X T, Kang Y F et al. Intelligent visual surveillance: a review[J]. Chinese Journal of Computers, 38, 1093-1118(2015).

[8] Li S B, Yang J, Wang Z et al. -04-04)[2019-11-26]. https:∥doi.org/10. 16383/j.aas., c180538(2019).

[9] LeCun Y, Bottou L, Bengio Y et al. Gradient-based learning applied to document recognition[J]. Proceedings of the IEEE, 86, 2278-2324(1998).

[10] Krizhevsky A, Sutskever I, Hinton G E. ImageNet classification with deep convolutional neural networks. [C]∥Proceedings of the 25th International Conference on Neural Information Processing Systems, December 3-6, 2012, Lake Tahoe, Nevada. New York: ACM, 1, 1097-1105(2012).

[12] Yang G C, Yang J, Li S B et al. Modified CNN algorithm based on Dropout and ADAM optimizer[J]. Journal of Huazhong University of Science and Technology (Natural Science Edition), 46, 122-127(2018).

[13] Nair V, Hinton G E. Rectified linear units improve restricted Boltzmann machines. [C]∥Proceedings of the 27th International Conference on Machine Learning(ICML), Haifa, 807-814(2010).

[14] Yang J, Li S B, Gao Z et al. Real-time recognition method for 0.8 cm darning needle and KR22 bearing based on convolution neural network and data increase[J]. Applied Sciences, 8, 1857(2018).

[15] Simonyan K. -09-04)[2019-11-26]. https:∥arxiv.org/abs/1409.1556v1.(2014).

[16] Shelhamer E, Long J, Darrell T. Fully convolutional networks for semantic segmentation[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 39, 640-651(2017).

[17] Ronneberger O, Fischer P. -05-18)[2019-11-26]. https:∥arxiv., org/abs/1505, 04597(2015).

[18] Badrinarayanan V, Handa A. -11-01)[2019-11-26]. http:∥de.arxiv., org/pdf/1511, 00561(2015).

[19] Szegedy C, Liu W, Jia Y et al. -09-17)[2019-11-26]. https:∥arxiv.org/, abs/1409, 4842(2014).

[20] He K, Zhang X, Ren S et al. -12-10)[2019-11-26]. https:∥ arxiv., org/abs/1512, 03385(2015).

[21] Huang G, Liu Z, Laurens V D M et al. -08-25)[2019-11-26]. https:∥arxiv., org/abs/1608, 06993(2016).

[22] Girshick R, Donahue J, Darrell T, semantic segmentation[EB/OL] et al. -11-11)[2019-11-26]. https:∥arxiv., org/abs/1311, 2524(2013).

[23] Uijlings J R R, Gevers T et al. Segmentation as selective search for object recognition. [C]∥2011 International Conference on Computer Vision, November 6-13, 2011, Barcelona, Spain. IEEE, 154-171(2011).

[24] He K M, Zhang X Y, Ren S Q et al. Spatial pyramid pooling in deep convolutional networks for visual recognition[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 37, 1904-1916(2015). http://www.sciencedirect.com/science/article/pii/S0031320315004252

[25] Girshick R. Fast R-CNN[EB/OL]. -04-30)[2019-11-26]. https:∥arxiv., org/abs/1504, 08083(2015).

[26] Scholkopf B, Platt J. -12-04)[2019-11-26]. https:∥dl.acm.org/ citation.cfm?id=2976462.(2006).

[27] Ren S Q, He K M, Girshick R et al. Faster R-CNN: towards real-time object detection with region proposal networks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 39, 1137-1149(2017).

[28] Hosang J, Benenson R, Dollar P et al. What makes for effective detection proposals?[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 38, 814-830(2016).

[29] Dai J, Li Y, He K et al. -05-20)[2019-11-26]. https:∥arxiv.org/abs/1605.06409?context=cs.(2016).

[30] Agrawal P, Girshick R. -07-07)[2019-11-26]. https:∥arxiv., org/abs/1407, 1610(2014).

[31] Hinton G E, Srivastava N, Krizhevsky A et al. Improving neural networks by preventing co-adaptation of feature detectors[J]. Computer Science, 3, 212-223(2012).

[32] Lin T Y, Dollar P, Girshick R et al. -12-09)[2019-11-26]. https:∥arxiv., org/abs/1612, 03144(2016).

[33] He K M, Gkioxari G, Dollar P et al. Mask R-CNN[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 42, 386-397(2020).

[34] Xie S, Girshick R, Dollar P et al. -11-16)[2019-11-26]. https:∥arxiv., org/abs/1611, 05431(2016).

[35] Peng C, Xiao T, Li Z M et al. -11-20)[2019-11-26]. https:∥arxiv., org/abs/1711, 07240(2017).

[36] Qin L K, Gong Y F, Tang T Q et al. Training deep nets with progressive batch normalization on multi-GPUs[J]. International Journal of Parallel Programming, 47, 373-387(2019).

[37] Gidaris S. -05-07)[2019-11-26]. https:∥arxiv., org/abs/1505, 01749(2015).

[38] Kong T, Yao A B, Chen Y R, joint object detection[EB/OL] et al. -04-03)[2019-11-26]. https:∥arxiv., org/abs/1604, 00600(2016).

[39] Yang B, Yan J J, Zhen L et al. -04-12)[2019-11-26]. https:∥arxiv.org/abs/, 1604, 03239(2016).

[40] Wang X, Shrivastava A. -04-11)[2019-11-26]. https:∥arxiv., org/abs/1704, 03414(2017).

[41] Li Z M, Peng C, Yu G et al. -11-20)[2019-11-26]. https:∥arxiv., org/abs/1711, 07264(2017).

[42] Cai Z W. -12-03)[2019-11-26]. https:∥arxiv., org/abs/1712, 00726(2017).

[43] Singh B. -11-22)[2019-11-26]. https:∥arxiv., org/abs/1711, 08189(2017).

[44] Ghiasi G, Lin T Y, Pang R et al. -04-16)[2019-11-26]. https:∥arxiv., org/abs/1904, 07392(2019).

[45] Li Y, Chen Y, Wang N et al. -01-07) https:∥arxiv.org/abs/1901.01892?context=cs[2019-11-26]. CV.(2019).

[46] Sermanet P, Eigen D, Zhang X, detection using convolutional networks[EB/OL] et al. -12-21)[2019-11-26]. https:∥arxiv., org/abs/1312, 6229(2013).

[47] Redmon J, Divvala S, Girshick R, real-time object detection[EB/OL] et al. -06-08)[2019-11-26]. https:∥arxiv., org/abs/1506, 02640(2015).

[48] Liu W, Anguelov D, Erhan D et al[M]. SSD: single shot MultiBox detector, 21-37(2016).

[49] Chen L C, Papandreou G, Kokkinos I et al. DeepLab: semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 40, 834-848(2018).

[50] Fu C Y, Lin W, Ranga A et al. -01-23)[2019-11-26]. https:∥arxiv.org/abs/1701.06659v1.(2017).

[51] Redmon J, Farhadi A. YOLO9000: better, faster, stronger. [C]∥2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), July 21-26, 2017. Honolulu, HI. IEEE, 6517-6525(2017).

[52] Lin T Y, Goyal P, Girshick R et al. -08-07)[2019-11-26]. https:∥arxiv.org/abs/1708.02002v2.(2017).

[53] Redmon J. -04-08)[2019-11-26]. https:∥arxiv., org/abs/1804, 02767(2018).

[54] Najibi M, Rastegari M. -12-24)[2019-11-26]. https:∥arxiv., org/abs/1512, 07729(2015).

[55] Jeong J, Park H. -05-26)[2019-11-26]. https:∥arxiv.org/abs/1705.09587v1.(2017).

[56] Shen Z Q, Liu Z, Li G et al. -08-03) https:∥arxiv.org/abs/1708.01241?context=cs[2019-11-26]. LG.(2017).

[57] Kong T, Sun F C, Yao A B et al. -07-06)[2019-11-26]. https:∥arxiv.org/abs/1707.01691?context=cs.(2017).

[58] Zhou P, Ni B B, Geng C et al. -04-05)[2019-11-26]. http:∥ openaccess.thecvf.com/content_cvpr_2018/papers/Zhou_Scale-Transferrable_Object_Detection_CVPR_2018_p- aper.pdf.(2019).

[59] Lin T Y, Dollar P, Girshick R et al. -12-09)[2019-11-26]. https:∥arxiv.org/abs/1612.03144?context=cs.(2016).

[60] Zhao Q J, Sheng T, Wang Y T et al. -11-12)[2019-11-26]. https:∥arxiv., org/abs/1811, 04533(2018).

[61] Zhu C C, He Y H. -03-02)[2019-11-26]. https:∥arxiv.org/abs/1903.00621v1.(2019).

[62] Tian Z, Shen C H, Chen H et al. -04-02)[2019-11-26]. https:∥arxiv., org/abs/1904, 01355(2019).

[63] Kong T, Sun F C, Liu H P et al. -04-08)[2019-11-26]. https:∥arxiv., org/abs/1904, 03797(2019).

[64] Huang L, Yang Y, Deng Y et al. -09-16)[2019-11-26]. https:∥arxiv., org/abs/1509, 04874(2015).

[65] Law H. -08-03)[2019-11-26]. https:∥arxiv. org/abs/1808.01244v1.(2018).

[66] Zhou X Y, Zhuo J C, center points[EB/OL]. -01-23)[2019-11-26]. https:∥arxiv.org/abs/1901.08043v1.(2019).

[67] Duan K W, Bai S, Xie L X et al. -04-17)[2019-11-26]. https:∥arxiv.org/abs/1904.08189?context=cs.(2019).

[68] Yu J H, Jiang Y, Wang Z Y et al. -08-04)[2019-11-26]. https:∥arxiv., org/abs/1608, 01471(2016).

[69] Zhou X Y, Wang D Q, Krähenbühl P et al[2019-11-26]. 2019-04-16) https:∥arxiv.org/abs/, 07850v1(1904).

[70] Lu X, Li B Y, Yue Y X et al. -11-29)[2019-11-26]. https:∥arxiv.org/abs/1811.12030v1.(2018).

[71] Law H, Teng Y, Russakovsky O et al. -04-180)[2019-11-26]. https:∥arxiv., org/abs/1904, 08900(2019).