• Acta Optica Sinica
  • Vol. 40, Issue 5, 0504001 (2020)
Bin Zhao, Chunping Wang*, Qiang Fu, and Yichao Chen
Author Affiliations
  • Department of Electronic and Optical Engineering, Shijiazhuang Campus of Army Engineering University, Shijiazhuang, Hebei 050003, China
    DOI: 10.3788/AOS202040.0504001
    Bin Zhao, Chunping Wang, Qiang Fu, Yichao Chen. Multi-Scale Infrared Pedestrian Detection Based on Deep Attention Mechanism[J]. Acta Optica Sinica, 2020, 40(5): 0504001
    References

    [1] Liu S T, Jiang N, Liu Z X et al. Saliency detection of infrared image based on region covariance and global feature[J]. Journal of Systems Engineering and Electronics, 29, 483-490(2018).

    [2] Cai Y F, Liu Z, Wang H et al. Saliency-based pedestrian detection in far infrared images[J]. IEEE Access, 5, 5013-5019(2017).

    [3] Hintermüller M, Wu T. Robust principal component pursuit via inexact alternating minimization on matrix manifolds[J]. Journal of Mathematical Imaging and Vision, 51, 361-377(2015).

    [4] Shu X B, Porikli F, Ahuja N. Robust orthonormal subspace learning: efficient recovery of corrupted low-rank matrices[C]∥2014 IEEE Conference on Computer Vision and Pattern Recognition, June 23-28, 2014, Columbus, OH, USA. New York: IEEE, 3874-3881(2014).

    [5] Ye X C, Yang J Y, Sun X et al. Foreground-background separation from video clips via motion-assisted matrix restoration[J]. IEEE Transactions on Circuits and Systems for Video Technology, 25, 1721-1734(2015).

    [6] Cherapanamjeri Y, Gupta K, Jain P. Nearly optimal robust matrix completion[C]∥Proceedings of the 34th International Conference on Machine Learning, August 6-11, 2017, Sydney, NSW, Australia, 70, 797-805(2017).

    [7] Sobral A, Javed S, Jung S K et al. Online stochastic tensor decomposition for background subtraction in multispectral video sequences[C]∥2015 IEEE International Conference on Computer Vision Workshop (ICCVW), December 7-13, 2015, Santiago, Chile. New York: IEEE, 946-953(2015).

    [8] Girshick R, Donahue J, Darrell T et al. Rich feature hierarchies for accurate object detection and semantic segmentation[C]∥2014 IEEE Conference on Computer Vision and Pattern Recognition, June 23-28, 2014, Columbus, OH, USA. New York: IEEE, 580-587(2014).

    [9] Girshick R. Fast R-CNN[C]∥2015 IEEE International Conference on Computer Vision (ICCV), December 7-13, 2015, Santiago, Chile. New York: IEEE, 1440-1448(2015).

    [10] Ren S Q, He K M, Girshick R et al. Faster R-CNN: towards real-time object detection with region proposal networks[C]∥Advances in Neural Information Processing Systems, December 7-12, 2015, Montreal, Quebec, Canada. New York: Curran Associates, 91-99(2015).

    [11] Redmon J, Divvala S, Girshick R et al. You only look once: unified, real-time object detection[C]∥2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 27-30, 2016, Las Vegas, NV, USA. New York: IEEE, 779-788(2016).

    [12] Redmon J, Farhadi A. YOLO9000: better, faster, stronger[C]∥2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), July 21-26, 2017, Honolulu, HI, USA. New York: IEEE, 6517-6525(2017).

    [13] Redmon J, Farhadi A. YOLOv3: an incremental improvement[EB/OL]. (2018-04-08)[2019-09-22]. https://arxiv.xilesou.top/abs/1804.02767.

    [14] Liu W, Anguelov D, Erhan D et al. SSD: single shot MultiBox detector[M]. ∥Leibe B, Matas J, Sebe N, et al. Computer vision-ECCV 2016. Lecture notes in computer science. Cham: Springer, 9905, 21-37(2016).

    [15] Fu C Y, Liu W, Ranga A et al. DSSD: deconvolutional single shot detector[EB/OL]. (2017-01-23)[2019-09-22]. https://arxiv.xilesou.top/abs/1701.06659.

    [16] Wang F, Jiang M Q, Qian C et al. Residual attention network for image classification[C]∥2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), July 21-26, 2017, Honolulu, HI, USA. New York: IEEE, 6450-6458(2017).

    [17] Hu J, Shen L, Sun G. Squeeze-and-excitation networks[C]∥2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, June 18-23, 2018, Salt Lake City, UT, USA. New York: IEEE, 7132-7141(2018).

    [18] Woo S, Park J, Lee J Y et al. CBAM: convolutional block attention module[M]. ∥Ferrari V, Hebert M, Sminchisescu C, et al. Computer vision-ECCV 2018. Lecture notes in computer science. Cham: Springer, 11211, 3-19(2018).

    [19] Oktay O, Schlemper J, Folgoc L L et al. Attention U-Net: learning where to look for the pancreas[EB/OL]. (2018-05-20)[2019-09-22]. https://arxiv.xilesou.top/abs/1804.03999.

    [20] Tang X, Du D K, He Z Q et al. PyramidBox: a context-assisted single shot face detector[M]. ∥Ferrari V, Hebert M, Sminchisescu C, et al. Computer vision-ECCV 2018. Lecture notes in computer science. Cham: Springer, 11213, 812-828(2018).

    [21] Qin J, Wang M H. Fast pedestrian proposal generation algorithm using online Gaussian model[J]. Acta Optica Sinica, 36, 1115001(2016).

    [22] Zhao P R, Wu X Y, Tang X Y et al. An algorithm of small object detection region proposal search based on GN splitting[J]. Acta Optica Sinica, 38, 0915005(2018).

    [23] Cheung W, Hamarneh G. n-SIFT: n-dimensional scale invariant feature transform[J]. IEEE Transactions on Image Processing, 18, 2012-2021(2009).

    [24] Dalal N, Triggs B. Histograms of oriented gradients for human detection[C]∥2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05), June 20-25, 2005, San Diego, CA, USA. New York: IEEE, 8588935(2005).

    [25] Zhang C J, Liu J, Liang C et al. Image classification using Harr-like transformation of local features with coding residuals[J]. Signal Processing, 93, 2111-2118(2013).

    [26] Ye G L, Sun S Y, Gao K J et al. Nighttime pedestrian detection based on faster region convolution neural network[J]. Laser & Optoelectronics Progress, 54, 081003(2017).

    [27] Aimar A, Mostafa H, Calabrese E et al. NullHop: a flexible convolutional neural network accelerator based on sparse representations of feature maps[J]. IEEE Transactions on Neural Networks and Learning Systems, 30, 644-656(2019).

    [28] He K M, Zhang X Y, Ren S Q et al. Deep residual learning for image recognition[C]∥2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 27-30, 2016, Las Vegas, NV, USA. New York: IEEE, 770-778(2016).

    [29] Szegedy C, Vanhoucke V, Ioffe S et al. Rethinking the inception architecture for computer vision[C]∥2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 27-30, 2016, Las Vegas, NV, USA. New York: IEEE, 2818-2826(2016).

    [30] Lin T Y, Dollár P, Girshick R et al. Feature pyramid networks for object detection[C]∥2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), July 21-26, 2017, Honolulu, HI, USA. New York: IEEE, 936-944(2017).

    [31] Ioffe S, Szegedy C. Batch normalization: accelerating deep network training by reducing internal covariate shift[C]∥32nd International Conference on Machine Learning, July 6-11, 2015, Lille, France. USA: MLR Press, 448-456(2015).

    [32] Dai J, Li Y, He K et al. R-FCN: object detection via region-based fully convolutional networks[C]∥Advances in Neural Information Processing Systems, December 5-10, 2016, Barcelona, Spain. New York: Curran Associates, 379-387(2016).
