Position-sensitive Transformer aerial image object detection model

Daxiang LI; Jiani XIN; Ying LIU

doi:10.37188/OPE.20243205.0727

[1] W ZHU, L K WANG, Z B JIN et al. Lightweight small object detection network with attention mechanism. Optics and Precision Engineering, 30(8), 998-1010(2022). (in Chinese). doi: 10.37188/OPE.20223008.0998

[2] L L FAN, H W ZHAO, H Y ZHAO et al. Survey of target detection based on deep convolutional neural networks. Optics and Precision Engineering, 28(5), 1152-1164(2020). (in Chinese)

[3] S Q REN, K M HE, R GIRSHICK et al. Faster R-CNN: towards real-time object detection with region proposal networks. IEEE Transactions on Pattern Analysis and Machine Intelligence, 39, 1137-1149(2017).

[4] Z W CAI, N VASCONCELOS. Cascade R-CNN: delving into high quality object detection, 18, 6154-6162(2018).

[5] W LIU, D ANGUELOV, D ERHAN et al. SSD: Single Shot Multibox Detector. Computer Vision-ECCV 2016, 21-37(2016).

[6] A BOCHKOVSKIY, C Y WANG, H LIAO. YOLOv4: Optimal Speed and Accuracy of Object Detection. arXiv preprint(2020).

[7] C YANG, Z H HUANG, N Y WANG. QueryDet: cascaded sparse query for accelerating high-resolution small object detection, 18, 13658-13667(2022).

[8] W T LI, Y J CHEN, K X HU et al. Oriented RepPoints for aerial object detection, 18, 1829-1838(2022).

[9] D LIANG, Q X GENG, Z Q WEI et al. Anchor retouching via model interaction for robust object detection in aerial images. IEEE Transactions on Geoscience and Remote Sensing, 60, 1-13(2022).

[10] H LAW, J DENG. CornerNet: detecting objects as paired keypoints. International Journal of Computer Vision, 128, 642-656(2020).

[11] Z TIAN, C H SHEN, H CHEN et al. FCOS: fully convolutional one-stage object detection, 9626-9635(2019).

[12] P W DAI, S Y YAO, Z K LI et al. ACE: anchor-free corner evolution for real-time arbitrarily-oriented object detection. IEEE Transactions on Image Processing, 31, 4076-4089(2022).

[13] N CARION, F MASSA, G SYNNAEVE et al. End-to-end Object Detection with Transformers. Computer Vision-ECCV 2020, 213-229(2020).

[14] X Z ZHU, W J SU, L W LU et al. Deformable DETR: deformable transformers for end-to-end object detection, 1-14(2021).

[15] F LI, H ZHANG, S L LIU et al. DN-DETR: accelerate DETR training by introducing query DeNoising, 18, 13609-13617(2022).

[16] Q B HOU, D Q ZHOU, J S FENG. Coordinate attention for efficient mobile network design, 13713-13722(2021).

[17] A DOSOVITSKIY, L BEYER, A KOLESNIKOV et al. An image is worth 16x16 words: transformers for image recognition at scale, 15-35(2021).

[18] A VASWANI, N SHAZEER, N PARMAR et al. Attention is all you need, 6000-6010(2017).

[19] H W KUHN. The Hungarian method for the assignment problem. Naval Research Logistics Quarterly, 2, 83-97(1955).

[20] P F ZHU, L Y WEN, D W DU et al. Vision Meets Drones: Past, Present and Future(2020).

[21] T Y LIN, P GOYAL, R GIRSHICK et al. Focal loss for dense object detection. IEEE Transactions on Pattern Analysis and Machine Intelligence, 42, 318-327(2020).

[22] Z H ZHENG, P WANG, W LIU et al. Distance-IoU loss: faster and better learning for bounding box regression. Proceedings of the AAAI Conference on Artificial Intelligence, 34, 12993-13000(2020).

[23] T Y LIN, M MAIRE, S BELONGIE et al. Microsoft COCO: Common Objects in Context. Computer Vision-ECCV 2014, 740-755(2014).

[24] J HU, L SHEN, G SUN. Squeeze-and-excitation networks, 18, 7132-7141(2018).

[25] J PARK, J Y LEE et al. BAM: Bottleneck attention module. arXiv preprint(2018).

[26] J PARK, J Y LEE et al. CBAM: Convolutional block attention module, 3-19(2018).

[27] Z H DAI, Z L YANG, Y M YANG et al. Transformer-XL: attentive language models beyond a fixed-length context, 2978-2988(2019).

[28] Z H HUANG, D LIANG, P XU et al. Improve transformer models with better relative position embeddings, 3327-3335(2020).

[29] Y WU, Y P CHEN, L YUAN et al. Rethinking classification and localization for object detection, 13, 10186-10195(2020).

[30] D RUKHOVICH, K SOFIIUK, D GALEEV et al. IterDet: Iterative Scheme for Object Detection in Crowded Environments. Lecture Notes in Computer Science, 344-354(2021).

[31] W SUN, L DAI, X R ZHANG et al. RSOD: real-time small object detection algorithm in UAV-based traffic monitoring. Applied Intelligence, 52, 8448-8463(2022).

[32] Y T LI, Q S FAN, H S HUANG et al. A modified YOLOv8 detection network for UAV aerial image recognition. Drones, 7, 304(2023).

[33] W H WANG, E Z XIE, X LI et al. PVT v2: improved baselines with pyramid vision transformer. Computational Visual Media, 8, 415-424(2022).
