• Laser & Optoelectronics Progress
  • Vol. 61, Issue 2, 0211023 (2024)
Ruijiao Jin1,2,†, Kun Wang1,2,†, Minhao Liu1,2, Xichao Teng1,2, Zhang Li1,2,* and Qifeng Yu1,2
Author Affiliations
  • 1College of Aerospace Science and Engineering, National University of Defense Technology, Changsha 410000, Hunan, China
  • 2Hunan Key Laboratory of Image Measurement and Vision Navigation, Changsha 410000, Hunan, China
    DOI: 10.3788/LOP240502
    Ruijiao Jin, Kun Wang, Minhao Liu, Xichao Teng, Zhang Li, Qifeng Yu. DETR with Improved DeNoising Training for Multi-Scale Oriented Object Detection in Optical Remote Sensing Images (Invited)[J]. Laser & Optoelectronics Progress, 2024, 61(2): 0211023
    Fig. 1. Overall architecture of AO2DINO
    Fig. 2. Multi-scale rotated deformable attention module
    Fig. 3. Comparison of attention heatmaps between AO2DINO (left) and ReDet (right)
    Fig. 4. Self-adaption assigner. (a) Positive and negative assigner of AO2DINO (left) and DINO (right); (b) overlapping of rotated boxes
    Fig. 5. Calculation of Rotated IoU
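
As a rough illustration of what a rotated IoU computation involves, the sketch below clips one rotated rectangle against the other with the Sutherland-Hodgman algorithm and measures the overlap with the shoelace formula. This is one common geometric approach and not necessarily the procedure drawn in Fig. 5. The box format `(cx, cy, w, h, angle_deg)`, the counter-clockwise angle convention, and all function names are assumptions made for this example.

```python
import math

def box_corners(cx, cy, w, h, angle_deg):
    """Corners of a rotated box in counter-clockwise order.
    Assumed convention: angle in degrees, counter-clockwise from the +x axis."""
    a = math.radians(angle_deg)
    c, s = math.cos(a), math.sin(a)
    half = [(-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2)]
    return [(cx + x * c - y * s, cy + x * s + y * c) for x, y in half]

def polygon_area(pts):
    """Signed area by the shoelace formula (positive for CCW polygons)."""
    n = len(pts)
    return 0.5 * sum(
        pts[i][0] * pts[(i + 1) % n][1] - pts[(i + 1) % n][0] * pts[i][1]
        for i in range(n)
    )

def clip(subject, clipper):
    """Sutherland-Hodgman clipping of `subject` against a convex CCW `clipper`."""
    output = subject
    n = len(clipper)
    for i in range(n):
        if not output:
            break
        a, b = clipper[i], clipper[(i + 1) % n]

        def side(p):
            # > 0 when p lies to the left of the directed edge a -> b (inside).
            return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])

        current, output = output, []
        for j in range(len(current)):
            p, q = current[j], current[(j + 1) % len(current)]
            sp, sq = side(p), side(q)
            if sp >= 0:
                output.append(p)
            if (sp > 0 and sq < 0) or (sp < 0 and sq > 0):
                t = sp / (sp - sq)  # where the segment p -> q crosses the edge
                output.append((p[0] + t * (q[0] - p[0]), p[1] + t * (q[1] - p[1])))
    return output

def rotated_iou(box1, box2):
    """IoU of two rotated boxes given as (cx, cy, w, h, angle_deg)."""
    inter = abs(polygon_area(clip(box_corners(*box1), box_corners(*box2))))
    union = box1[2] * box1[3] + box2[2] * box2[3] - inter
    return inter / union if union > 0 else 0.0
```

For example, `rotated_iou((0, 0, 2, 2, 0), (1, 0, 2, 2, 0))` evaluates to about 1/3: the two squares overlap in a 1 × 2 strip, and the union area is 6.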
    Fig. 6. Periodicity of angle. (a) Ideal representation of bounding boxes; (b) the predicted angle differs from the ideal angle by 90°; (c) the predicted angle differs from the ideal angle by 180°
    Fig. 7. Edge exchangeability
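
The two ambiguities named in Figs. 6 and 7 can be checked numerically. The sketch below assumes boxes are parameterized as `(cx, cy, w, h, angle_deg)` with the angle measured counter-clockwise, and shows that adding 180° to the angle, or swapping the two edges while adding 90°, describes the same rectangle. The helper `to_long_edge` maps every equivalent parameterization to a single long-edge form with the angle in [-90°, 90°); both function names and this particular canonical form are assumptions for illustration, not the representation used by the authors.

```python
import math

def corner_set(cx, cy, w, h, angle_deg, ndigits=6):
    """Sorted, rounded corner coordinates, so that equal rectangles compare equal."""
    a = math.radians(angle_deg)
    c, s = math.cos(a), math.sin(a)
    pts = []
    for dx, dy in ((-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2)):
        x = round(cx + dx * c - dy * s, ndigits) + 0.0  # "+ 0.0" turns -0.0 into 0.0
        y = round(cy + dx * s + dy * c, ndigits) + 0.0
        pts.append((x, y))
    return sorted(pts)

def to_long_edge(w, h, angle_deg):
    """Canonical form (assumed): w >= h and angle in [-90, 90)."""
    if w < h:
        # Edge exchangeability: swapping the edges shifts the angle by 90 degrees.
        w, h, angle_deg = h, w, angle_deg + 90.0
    # Angle periodicity: a rectangle is unchanged by a 180-degree rotation.
    angle_deg = (angle_deg + 90.0) % 180.0 - 90.0
    return w, h, angle_deg

reference = corner_set(10, 5, 4, 2, 30)
same_after_180 = corner_set(10, 5, 4, 2, 210) == reference   # angle + 180
same_after_swap = corner_set(10, 5, 2, 4, 120) == reference  # swapped edges, angle + 90
```

Both comparisons are true, although the raw parameter vectors differ substantially. A plain L1 loss on `(w, h, angle)` would therefore penalize predictions that describe exactly the right box.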
    Fig. 8. Principle of KFIoU
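
A minimal numeric sketch of the KFIoU idea follows, based on the commonly cited formulation: each rotated box is modeled as a 2-D Gaussian, the overlap is the Gaussian obtained from the Kalman filter update (written here in its equivalent information form), and the score is a ratio of the corresponding box areas. It assumes the two box centers already coincide, uses plain Python rather than a tensor library, and the function names are hypothetical. It is not the authors' implementation.

```python
import math

def box_to_cov(w, h, angle_deg):
    """Covariance R * diag(w^2/4, h^2/4) * R^T of a rotated box,
    returned as (a, b, d) for the symmetric matrix [[a, b], [b, d]]."""
    t = math.radians(angle_deg)
    c, s = math.cos(t), math.sin(t)
    l1, l2 = w * w / 4.0, h * h / 4.0
    return (l1 * c * c + l2 * s * s, (l1 - l2) * c * s, l1 * s * s + l2 * c * c)

def det(m):
    a, b, d = m
    return a * d - b * b

def inv(m):
    a, b, d = m
    k = det(m)
    return (d / k, -b / k, a / k)

def area(m):
    """Area of the box represented by the covariance: 2^n * sqrt(det), with n = 2."""
    return 4.0 * math.sqrt(det(m))

def kfiou(box1, box2):
    """KFIoU of two center-aligned boxes given as (w, h, angle_deg)."""
    s1, s2 = box_to_cov(*box1), box_to_cov(*box2)
    i1, i2 = inv(s1), inv(s2)
    # Kalman update S1 - S1 (S1 + S2)^-1 S1, written as (S1^-1 + S2^-1)^-1.
    s3 = inv((i1[0] + i2[0], i1[1] + i2[1], i1[2] + i2[2]))
    v1, v2, v3 = area(s1), area(s2), area(s3)
    return v3 / (v1 + v2 - v3)
```

Under these assumptions two identical boxes give a value of 1/3, which is the upper bound of this measure in two dimensions, and a 4 × 2 box against the same box rotated by 90° gives 0.25.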
    Fig. 9. Comparison of test results of different methods on DOTAv1.0 dataset
    Fig. 10. Adaptability of AO2DINO on DIOR-R dataset
    Fig. 11. Comparison of dense small object detection performance on DOTAv1.0 dataset
    | Configuration | Model |
    | --- | --- |
    | Operating system | Ubuntu 20.04 |
    | GPU | NVIDIA GeForce RTX-4080Ti |
    | Hardware configuration | i9-10920X |
    | Environment | Python 3.8, PyTorch 1.7.1, CUDA 11.2 |
    Table 1. Experimental software and hardware configuration
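    | Category | Rotated RetinaNet (3×) | R3Det (3×) | Rotated FCOS (3×) | Rotated Faster R-CNN (3×) | ReDet (3×) | Rotated D-DETR (3×) | AO2DETR (3×) | ARS-DETR (3×) | AO2DINO (1×) | AO2DINO (3×) |
    | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
    | Detector type | One-stage | One-stage | One-stage | Two-stage | Two-stage | DETR-like | DETR-like | DETR-like | DETR-like | DETR-like |
    | mAP | 69.23 | 73.40 | 72.45 | 73.96 | 74.03 | 63.42 | 70.91 | 73.79 | 72.16 | 74.07 |
    | PL | 87.33 | 89.24 | 88.52 | 89.09 | 88.94 | 78.95 | 87.99 | 86.61 | 86.33 | 86.87 |
    | BD | 78.91 | 83.32 | 77.54 | 78.28 | 78.07 | 68.64 | 79.46 | 77.26 | 76.79 | 81.91 |
    | BR | 46.45 | 48.03 | 47.06 | 48.93 | 51.19 | 32.57 | 45.74 | 48.84 | 49.52 | 48.25 |
    | GTF | 69.81 | 72.52 | 63.78 | 71.54 | 72.76 | 55.17 | 66.64 | 66.76 | 63.43 | 72.90 |
    | SV | 67.72 | 77.52 | 80.42 | 74.01 | 74.26 | 72.53 | 78.90 | 78.38 | 77.43 | 79.92 |
    | LV | 62.34 | 76.72 | 80.50 | 74.99 | 78.08 | 57.77 | 73.90 | 78.96 | 62.83 | 63.24 |
    | SH | 73.59 | 86.48 | 87.34 | 85.90 | 87.44 | 73.71 | 73.30 | 87.40 | 84.54 | 85.87 |
    | TC | 90.85 | 90.89 | 90.39 | 90.84 | 90.84 | 88.36 | 90.40 | 90.61 | 90.12 | 88.23 |
    | BC | 82.79 | 82.33 | 77.83 | 86.87 | 80.79 | 75.46 | 80.55 | 82.76 | 83.92 | 82.89 |
    | ST | 79.37 | 83.51 | 84.13 | 85.03 | 78.59 | 79.34 | 85.89 | 82.19 | 84.82 | 86.87 |
    | SBF | 59.62 | 60.96 | 55.45 | 57.97 | 60.85 | 45.36 | 55.19 | 54.02 | 55.94 | 61.17 |
    | RA | 61.89 | 63.09 | 65.84 | 69.74 | 64.22 | 53.78 | 63.62 | 62.61 | 67.22 | 65.97 |
    | HA | 65.01 | 67.58 | 65.02 | 68.10 | 76.84 | 52.94 | 51.83 | 72.64 | 68.11 | 65.60 |
    | SP | 67.76 | 69.27 | 72.77 | 71.28 | 72.79 | 66.35 | 70.15 | 72.80 | 71.89 | 77.39 |
    | HC | 44.95 | 49.50 | 49.17 | 56.88 | 54.85 | 50.38 | 60.04 | 64.96 | 75.40 | 77.13 |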
    Table 2. Comparison of different models on the DOTAv1.0 dataset
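    | Category | Rotated RetinaNet (3×) | R3Det (3×) | Rotated FCOS (3×) | GWD (3×) | KLD (3×) | Rotated Faster R-CNN (3×) | ReDet (3×) | ARS-DETR (3×) | AO2DINO (1×) | AO2DINO (3×) |
    | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
    | Detector type | One-stage | One-stage | One-stage | One-stage | One-stage | Two-stage | Two-stage | DETR-like | DETR-like | DETR-like |
    | mAP | 54.83 | 61.91 | 63.21 | 60.31 | 64.63 | 63.41 | 63.81 | 65.90 | 60.54 | 65.94 |
    | APL | 59.54 | 62.55 | 62.31 | 69.68 | 66.52 | 63.07 | 63.22 | 65.82 | 63.93 | 68.78 |
    | APO | 25.03 | 43.44 | 42.18 | 28.83 | 46.80 | 40.22 | 44.18 | 53.40 | 42.21 | 48.83 |
    | BF | 70.08 | 71.72 | 75.34 | 74.32 | 71.76 | 71.89 | 72.11 | 74.22 | 73.24 | 74.32 |
    | BC | 81.01 | 81.84 | 81.32 | 81.49 | 81.43 | 81.36 | 81.26 | 81.11 | 83.57 | 84.49 |
    | BR | 28.26 | 36.49 | 39.26 | 29.62 | 40.81 | 39.67 | 43.83 | 42.13 | 40.39 | 41.62 |
    | CH | 72.02 | 72.63 | 74.89 | 72.67 | 78.25 | 72.51 | 72.72 | 76.23 | 63.65 | 72.67 |
    | ESA | 55.35 | 79.50 | 77.42 | 76.45 | 79.23 | 79.19 | 79.10 | 82.24 | 64.91 | 76.45 |
    | ETS | 56.77 | 64.41 | 68.67 | 63.14 | 66.63 | 69.45 | 69.78 | 71.52 | 68.98 | 69.14 |
    | DAM | 21.26 | 27.02 | 26.00 | 27.13 | 29.01 | 26.00 | 28.45 | 38.90 | 33.45 | 34.13 |
    | GF | 65.70 | 77.36 | 73.94 | 77.19 | 78.68 | 77.93 | 78.69 | 75.91 | 71.24 | 71.19 |
    | GTF | 70.28 | 77.17 | 78.73 | 78.94 | 80.19 | 82.28 | 77.18 | 77.91 | 77.03 | 78.94 |
    | HA | 30.52 | 40.53 | 41.28 | 39.11 | 44.88 | 46.91 | 48.24 | 33.03 | 42.67 | 43.11 |
    | OP | 44.37 | 53.33 | 54.19 | 42.18 | 57.23 | 53.90 | 56.81 | 57.02 | 66.65 | 66.18 |
    | SH | 77.02 | 79.66 | 80.61 | 79.10 | 80.91 | 81.03 | 81.17 | 84.82 | 85.43 | 86.10 |
    | STA | 59.01 | 69.22 | 66.92 | 70.41 | 74.17 | 75.77 | 69.17 | 69.71 | 69.80 | 70.41 |
    | STO | 59.39 | 61.10 | 69.17 | 58.69 | 68.02 | 62.54 | 62.73 | 72.20 | 62.34 | 62.69 |
    | TC | 81.18 | 81.54 | 87.20 | 81.52 | 81.48 | 81.42 | 81.42 | 80.33 | 72.98 | 81.66 |
    | TS | 38.43 | 52.18 | 52.31 | 47.78 | 54.63 | 54.50 | 54.90 | 58.91 | 54.55 | 55.78 |
    | VE | 39.10 | 43.57 | 47.08 | 44.47 | 47.80 | 43.17 | 44.04 | 51.52 | 49.80 | 50.47 |
    | WM | 61.58 | 64.13 | 65.21 | 62.36 | 64.41 | 65.73 | 66.37 | 70.73 | 68.21 | 69.36 |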
    Table 3. Comparison of different models on the DIOR-R dataset
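    | Method | Epoch | DOTAv1.0 SH | DOTAv1.0 SV | DIOR-R SH | DIOR-R VE | DIOR-R WM |
    | --- | --- | --- | --- | --- | --- | --- |
    | R3Det | 3× | 86.48 | 77.52 | 79.66 | 43.57 | 64.13 |
    | ReDet | 3× | 87.44 | 74.26 | 81.17 | 44.04 | 66.37 |
    | ARS-DETR | 3× | 87.40 | 78.38 | 84.82 | 51.52 | 70.73 |
    | AO2DINO | 3× | 85.87 | 79.92 | 86.10 | 50.47 | 69.36 |
    | AO2DINO | 1× | 84.54 | 77.43 | 85.43 | 49.80 | 68.21 |
    | AO2DINO-ms | | 88.90 | 79.98 | 87.57 | 50.66 | 70.68 |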
    Table 4. mAP of dense small target detection on DOTAv1.0 and DIOR-R datasets
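    | CDN | MS-RDA | SAA | KFIoU | AP50 /% | AP75 /% |
    | --- | --- | --- | --- | --- | --- |
    | | | | | 67.12 | 33.35 |
    | | | | | 68.90 (+1.78) | 38.70 (+5.35) |
    | | | | | 71.06 (+3.94) | 40.15 (+6.80) |
    | | | | | 70.29 (+3.17) | 36.65 (+3.30) |
    | | | | | 72.16 (+5.04) | 41.80 (+8.45) |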
    Table 5. Ablation experiments of AO2DINO's component on DOTAv1.0 dataset
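
The parenthesized gains in Table 5 are differences from the first row. The short check below recomputes them from the AP values listed in the table; the variable and function names are chosen here for illustration only.

```python
# AP50 / AP75 values copied from Table 5; the first row serves as the baseline.
BASELINE = (67.12, 33.35)
ABLATION_ROWS = [
    (68.90, 38.70),
    (71.06, 40.15),
    (70.29, 36.65),
    (72.16, 41.80),
]

def gains(baseline, rows):
    """Per-row (AP50, AP75) improvement over the baseline, rounded to 2 decimals."""
    base50, base75 = baseline
    return [(round(ap50 - base50, 2), round(ap75 - base75, 2)) for ap50, ap75 in rows]

table5_gains = gains(BASELINE, ABLATION_ROWS)
# -> [(1.78, 5.35), (3.94, 6.8), (3.17, 3.3), (5.04, 8.45)]
```

The recomputed values agree with the gains printed in the table. In every row the AP75 gain is larger than the AP50 gain.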
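    | Baseline | Scale | ResNet50 | Swin-T | AP50 /% | AP75 /% |
    | --- | --- | --- | --- | --- | --- |
    | AO2DINO | 4 scale | | | 72.16 | 41.80 |
    | AO2DINO | 4 scale | | | 72.50 | 42.10 |
    | AO2DINO | 5 scale | | | 72.54 | 41.73 |
    | AO2DINO | 5 scale | | | 72.68 | 42.21 |
    | AO2DINO | multi-scale | | | 75.77 | 44.29 |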
    Table 6. Comparative experiment of AO2DINO with different scales on DOTAv1.0 dataset
    | Loss function | DOTAv1.0 | DIOR-R |
    | --- | --- | --- |
    | L1 loss | 67.12 | 53.50 |
    | GWD | 70.01 | 55.56 |
    | KLD | 69.82 | 55.91 |
    | KFIoU | 70.29 | 56.02 |
    Table 7. AP50 of different loss functions on DOTAv1.0 dataset and DIOR-R dataset