• Optics and Precision Engineering
  • Vol. 32, Issue 5, 727 (2024)
Daxiang LI, Jiani XIN*, and Ying LIU
Author Affiliations
  • College of Communication and Information Engineering, Xi'an University of Posts and Telecommunications, Xi'an 710121, China
    DOI: 10.37188/OPE.20243205.0727
    Daxiang LI, Jiani XIN, Ying LIU. Position-sensitive Transformer aerial image object detection model[J]. Optics and Precision Engineering, 2024, 32(5): 727
    References

    [1] ZHU W, WANG L K, JIN Z B, et al. Lightweight small object detection network with attention mechanism[J]. Optics and Precision Engineering, 2022, 30(8): 998-1010. (in Chinese). doi: 10.37188/OPE.20223008.0998

    [2] FAN L L, ZHAO H W, ZHAO H Y, et al. Survey of target detection based on deep convolutional neural networks[J]. Optics and Precision Engineering, 2020, 28(5): 1152-1164. (in Chinese)

    [3] S Q REN, K M HE, R GIRSHICK et al. Faster R-CNN: towards real-time object detection with region proposal networks. IEEE Transactions on Pattern Analysis and Machine Intelligence, 39, 1137-1149(2017).

    [4] Z W CAI, N VASCONCELOS. Cascade R-CNN: delving into high quality object detection, 18, 6154-6162(2018).

    [5] W LIU, D ANGUELOV, D ERHAN et al. SSD: single shot multibox detector. Computer Vision-ECCV 2016, 21-37(2016).

    [6] A BOCHKOVSKIY, C Y WANG, H LIAO. YOLOv4: optimal speed and accuracy of object detection. arXiv preprint(2020).

    [7] C YANG, Z H HUANG, N Y WANG. QueryDet: cascaded sparse query for accelerating high-resolution small object detection, 18, 13658-13667(2022).

    [8] W T LI, Y J CHEN, K X HU et al. Oriented RepPoints for aerial object detection, 18, 1829-1838(2022).

    [9] D LIANG, Q X GENG, Z Q WEI et al. Anchor retouching via model interaction for robust object detection in aerial images. IEEE Transactions on Geoscience and Remote Sensing, 60, 1-13(2022).

    [10] H LAW, J DENG. CornerNet: detecting objects as paired keypoints. International Journal of Computer Vision, 128, 642-656(2020).

    [11] Z TIAN, C H SHEN, H CHEN et al. FCOS: fully convolutional one-stage object detection, 9626-9635(2019).

    [12] P W DAI, S Y YAO, Z K LI et al. ACE: anchor-free corner evolution for real-time arbitrarily-oriented object detection. IEEE Transactions on Image Processing, 31, 4076-4089(2022).

    [13] N CARION, F MASSA, G SYNNAEVE et al. End-to-end object detection with transformers. Computer Vision-ECCV 2020, 213-229(2020).

    [14] X Z ZHU, W J SU, L W LU et al. Deformable DETR: deformable transformers for end-to-end object detection, 1-14(2021).

    [15] F LI, H ZHANG, S L LIU et al. DN-DETR: accelerate DETR training by introducing query DeNoising, 18, 13609-13617(2022).

    [16] Q B HOU, D Q ZHOU, J S FENG. Coordinate attention for efficient mobile network design, 13713-13722(2021).

    [17] A DOSOVITSKIY, L BEYER, A KOLESNIKOV et al. An image is worth 16x16 words: transformers for image recognition at scale, 15-35(2021).

    [18] A VASWANI, N SHAZEER, N PARMAR et al. Attention is all you need, 6000-6010(2017).

    [19] H W KUHN. The Hungarian method for the assignment problem. Naval Research Logistics Quarterly, 2, 83-97(1955).

    [20] P F ZHU, L Y WEN, D W DU et al. Vision meets drones: past, present and future(2020).

    [21] T Y LIN, P GOYAL, R GIRSHICK et al. Focal loss for dense object detection. IEEE Transactions on Pattern Analysis and Machine Intelligence, 42, 318-327(2020).

    [22] Z H ZHENG, P WANG, W LIU et al. Distance-IoU loss: faster and better learning for bounding box regression. Proceedings of the AAAI Conference on Artificial Intelligence, 34, 12993-13000(2020).

    [23] T Y LIN, M MAIRE, S BELONGIE et al. Microsoft COCO: common objects in context. Computer Vision-ECCV 2014, 740-755(2014).

    [24] J HU, L SHEN, G SUN. Squeeze-and-excitation networks, 18, 7132-7141(2018).

    [25] J PARK, S WOO, J Y LEE et al. BAM: bottleneck attention module. arXiv preprint(2018).

    [26] S WOO, J PARK, J Y LEE et al. CBAM: convolutional block attention module, 3-19(2018).

    [27] Z H DAI, Z L YANG, Y M YANG et al. Transformer-XL: attentive language models beyond a fixed-length context, 2978-2988(2019).

    [28] Z H HUANG, D LIANG, P XU et al. Improve transformer models with better relative position embeddings, 3327-3335(2020).

    [29] Y WU, Y P CHEN, L YUAN et al. Rethinking classification and localization for object detection, 13, 10186-10195(2020).

    [30] D RUKHOVICH, K SOFIIUK, D GALEEV et al. IterDet: iterative scheme for object detection in crowded environments. Lecture Notes in Computer Science, 344-354(2021).

    [31] W SUN, L DAI, X R ZHANG et al. RSOD: real-time small object detection algorithm in UAV-based traffic monitoring. Applied Intelligence, 52, 8448-8463(2022).

    [32] Y T LI, Q S FAN, H S HUANG et al. A modified YOLOv8 detection network for UAV aerial image recognition. Drones, 7, 304(2023).

    [33] W H WANG, E Z XIE, X LI et al. PVT v2: improved baselines with pyramid vision transformer. Computational Visual Media, 8, 415-424(2022).
