[7] Lin T Y, Dollár P, Girshick R et al. Feature pyramid networks for object detection[C]. //2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), July 21-26, 2017, Honolulu, HI, USA, 936-944(2017).
[8] Liu W, Anguelov D, Erhan D et al. SSD: single shot MultiBox detector[M]. //Leibe B, Matas J, Sebe N, et al. Computer vision-ECCV 2016. Lecture notes in computer science, 9905, 21-37(2016).
[9] Redmon J, Divvala S, Girshick R et al. You only look once: unified, real-time object detection[C]. //2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 27-30, 2016, Las Vegas, NV, USA, 779-788(2016).
[12] Cao J L, Pang Y W, Han J G et al. Hierarchical shot detector[C]. //2019 IEEE/CVF International Conference on Computer Vision (ICCV), October 27-November 2, 2019, Seoul, Korea (South), 9704-9713(2019).
[13] Lin T Y, Goyal P, Girshick R et al. Focal loss for dense object detection[C]. //2017 IEEE International Conference on Computer Vision (ICCV), October 22-29, 2017, Venice, Italy, 2999-3007(2017).
[15] Tian Z, Shen C H, Chen H et al. FCOS: fully convolutional one-stage object detection[C]. //2019 IEEE/CVF International Conference on Computer Vision (ICCV), October 27-November 2, 2019, Seoul, Korea (South), 9626-9635(2019).
[16] He K M, Zhang X Y, Ren S Q et al. Deep residual learning for image recognition[C]. //2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 27-30, 2016, Las Vegas, NV, USA, 770-778(2016).
[17] Yu F, Wang D Q, Shelhamer E et al. Deep layer aggregation[C]. //2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 18-23, 2018, Salt Lake City, UT, USA, 2403-2412(2018).
[18] Newell A, Yang K Y, Deng J. Stacked hourglass networks for human pose estimation[M]. //Leibe B, Matas J, Sebe N, et al. Computer vision-ECCV 2016. Lecture notes in computer science, 9912, 483-499(2016).
[22] Rothe R, Guillaumin M, Van Gool L. Non-maximum suppression for object detection by passing messages between windows[M]. //Cremers D, Reid I, Saito H, et al. Computer vision-ACCV 2014. Lecture notes in computer science, 9003, 290-306(2015).
[23] Everingham M, Eslami S M A, Van Gool L et al. The pascal visual object classes challenge: a retrospective[J]. International Journal of Computer Vision, 111, 98-136(2015).
[24] Lin T Y, Maire M, Belongie S et al. Microsoft COCO: common objects in context[M]. //Fleet D, Pajdla T, Schiele B, et al. Computer vision-ECCV 2014. Lecture notes in computer science, 8693, 740-755(2014).
[25] Russakovsky O, Deng J, Su H et al. ImageNet large scale visual recognition challenge[J]. International Journal of Computer Vision, 115, 211-252(2015).
[26] Glorot X, Bengio Y. Understanding the difficulty of training deep feedforward neural networks[J]. Journal of Machine Learning Research, 9, 249-256(2010).
[28] Tan M X, Pang R M, Le Q V. EfficientDet: scalable and efficient object detection[C]. //2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 13-19, 2020, Seattle, WA, USA, 10778-10787(2020).