Multi-Scale Infrared Pedestrian Detection Based on Deep Attention Mechanism

Bin Zhao; Chunping Wang; Qiang Fu; Yichao Chen

doi:10.3788/AOS202040.0504001

[1] Liu S T, Jiang N, Liu Z X et al. Saliency detection of infrared image based on region covariance and global feature[J]. Journal of Systems Engineering and Electronics, 29, 483-490(2018).

[2] Cai Y F, Liu Z, Wang H et al. Saliency-based pedestrian detection in far infrared images[J]. IEEE Access, 5, 5013-5019(2017).

[3] Hintermüller M, Wu T. Robust principal component pursuit via inexact alternating minimization on matrix manifolds[J]. Journal of Mathematical Imaging and Vision, 51, 361-377(2015).

[4] Shu X B, Porikli F, Ahuja N. Robust orthonormal subspace learning: efficient recovery of corrupted low-rank matrices[C]∥2014 IEEE Conference on Computer Vision and Pattern Recognition, June 23-28, 2014, Columbus, OH, USA. New York: IEEE, 3874-3881(2014).

[5] Ye X C, Yang J Y, Sun X et al. Foreground-background separation from video clips via motion-assisted matrix restoration[J]. IEEE Transactions on Circuits and Systems for Video Technology, 25, 1721-1734(2015).

[6] Cherapanamjeri Y, Gupta K, Jain P. Nearly optimal robust matrix completion[C]∥Proceedings of the 34th International Conference on Machine Learning, August 6-11, 2017, Sydney, NSW, Australia, 70, 797-805(2017).

[7] Sobral A, Javed S, Jung S K et al. Online stochastic tensor decomposition for background subtraction in multispectral video sequences[C]∥2015 IEEE International Conference on Computer Vision Workshop (ICCVW), December 7-13, 2015, Santiago, Chile. New York: IEEE, 946-953(2015).

[8] Girshick R, Donahue J, Darrell T et al. Rich feature hierarchies for accurate object detection and semantic segmentation[C]∥2014 IEEE Conference on Computer Vision and Pattern Recognition, June 23-28, 2014, Columbus, OH, USA. New York: IEEE, 580-587(2014).

[9] Girshick R. Fast R-CNN[C]∥2015 IEEE International Conference on Computer Vision (ICCV), December 7-13, 2015, Santiago, Chile. New York: IEEE, 1440-1448(2015).

[10] Ren S Q, He K M, Girshick R et al. Faster R-CNN: towards real-time object detection with region proposal networks[C]∥Advances in Neural Information Processing Systems, December 7-12, 2015, Montreal, Quebec, Canada. New York: Curran Associates, 91-99(2015).

[11] Redmon J, Divvala S, Girshick R et al. You only look once: unified, real-time object detection[C]∥2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 27-30, 2016, Las Vegas, NV, USA. New York: IEEE, 779-788(2016).

[12] Redmon J, Farhadi A. YOLO9000: better, faster, stronger[C]∥2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), July 21-26, 2017, Honolulu, HI, USA. New York: IEEE, 6517-6525(2017).

[13] Redmon J, Farhadi A. YOLOv3: an incremental improvement[EB/OL]. (2018-04-08)[2019-09-22]. https://arxiv.xilesou.top/abs/1804.02767.

[14] Liu W, Anguelov D, Erhan D et al. SSD: single shot MultiBox detector[M]∥Leibe B, Matas J, Sebe N, et al. Computer vision-ECCV 2016. Lecture notes in computer science. Cham: Springer, 9905, 21-37(2016).

[15] Fu C Y, Liu W, Ranga A et al. DSSD: deconvolutional single shot detector[EB/OL]. (2017-01-23)[2019-09-22]. https://arxiv.xilesou.top/abs/1701.06659.

[16] Wang F, Jiang M Q, Qian C et al. Residual attention network for image classification[C]∥2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), July 21-26, 2017, Honolulu, HI, USA. New York: IEEE, 6450-6458(2017).

[17] Hu J, Shen L, Sun G. Squeeze-and-excitation networks[C]∥2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, June 18-23, 2018, Salt Lake City, UT, USA. New York: IEEE, 7132-7141(2018).

[18] Woo S, Park J, Lee J Y et al. CBAM: convolutional block attention module[M]∥Ferrari V, Hebert M, Sminchisescu C, et al. Computer vision-ECCV 2018. Lecture notes in computer science. Cham: Springer, 11211, 3-19(2018).

[19] Oktay O, Schlemper J, Folgoc L L et al. Attention U-Net: learning where to look for the pancreas[EB/OL]. (2018-05-20)[2019-09-22]. https://arxiv.xilesou.top/abs/1804.03999.

[20] Tang X, Du D K, He Z Q et al. PyramidBox: a context-assisted single shot face detector[M]∥Ferrari V, Hebert M, Sminchisescu C, et al. Computer vision-ECCV 2018. Lecture notes in computer science. Cham: Springer, 11213, 812-828(2018).

[21] Qin J, Wang M H. Fast pedestrian proposal generation algorithm using online Gaussian model[J]. Acta Optica Sinica, 36, 1115001(2016).

[22] Zhao P R, Wu X Y, Tang X Y et al. An algorithm of small object detection region proposal search based on GN splitting[J]. Acta Optica Sinica, 38, 0915005(2018).

[23] Cheung W, Hamarneh G. n-SIFT: n-dimensional scale invariant feature transform[J]. IEEE Transactions on Image Processing, 18, 2012-2021(2009).

[24] Dalal N, Triggs B. Histograms of oriented gradients for human detection[C]∥2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05), June 20-25, 2005, San Diego, CA, USA. New York: IEEE, 886-893(2005).

[25] Zhang C J, Liu J, Liang C et al. Image classification using Harr-like transformation of local features with coding residuals[J]. Signal Processing, 93, 2111-2118(2013).

[26] Ye G L, Sun S Y, Gao K J et al. Nighttime pedestrian detection based on faster region convolution neural network[J]. Laser & Optoelectronics Progress, 54, 081003(2017).

[27] Aimar A, Mostafa H, Calabrese E et al. NullHop: a flexible convolutional neural network accelerator based on sparse representations of feature maps[J]. IEEE Transactions on Neural Networks and Learning Systems, 30, 644-656(2019).

[28] He K M, Zhang X Y, Ren S Q et al. Deep residual learning for image recognition[C]∥2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 27-30, 2016, Las Vegas, NV, USA. New York: IEEE, 770-778(2016).

[29] Szegedy C, Vanhoucke V, Ioffe S et al. Rethinking the inception architecture for computer vision[C]∥2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 27-30, 2016, Las Vegas, NV, USA. New York: IEEE, 2818-2826(2016).

[30] Lin T Y, Dollár P, Girshick R et al. Feature pyramid networks for object detection[C]∥2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), July 21-26, 2017, Honolulu, HI, USA. New York: IEEE, 936-944(2017).

[31] Ioffe S, Szegedy C. Batch normalization: accelerating deep network training by reducing internal covariate shift[C]∥32nd International Conference on Machine Learning, July 6-11, 2015, Lille, France. USA: MLR Press, 448-456(2015).

[32] Dai J, Li Y, He K et al. R-FCN: object detection via region-based fully convolutional networks[C]∥Advances in Neural Information Processing Systems, December 5-10, 2016, Barcelona, Spain. New York: Curran Associates, 379-387(2016).