• Optical Instruments
  • Vol. 45, Issue 6, 14 (2023)
Yingwei TANG, Rongfu ZHANG*, Ran DING, and Jie ZHANG
Author Affiliations
  • School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China
    DOI: 10.3969/j.issn.1005-5630.202302030011
    Yingwei TANG, Rongfu ZHANG, Ran DING, Jie ZHANG. MSA-Net: few-shot object detection with multi-stage attention mechanism[J]. Optical Instruments, 2023, 45(6): 14
    Fig. 1. Learning strategy for a few-shot object detection framework based on meta-learning
    Fig. 2. General framework of the MSA model
    Fig. 3. Traditional Faster RCNN architecture [22]
    Fig. 4. Gradient backpropagation decoupling mechanism
    Fig. 5. Attention based distillation module
    Fig. 6. Multi-scale attention module
    Fig. 7. Example of a model trained under MS COCO 10-shot setting on VOC 2007 dataset
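    | Model | Split 1 k=1 | Split 1 k=2 | Split 1 k=3 | Split 1 k=5 | Split 1 k=10 | Split 2 k=1 | Split 2 k=2 | Split 2 k=3 | Split 2 k=5 | Split 2 k=10 | Split 3 k=1 | Split 3 k=2 | Split 3 k=3 | Split 3 k=5 | Split 3 k=10 | Mean |
    |---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
    | FRCN [22] | 9.9 | 15.6 | 21.6 | 28.0 | 35.6 | 9.4 | 13.8 | 17.4 | 21.9 | 29.8 | 8.1 | 13.9 | 19.0 | 23.9 | 31.0 | 19.93 |
    | Deformable-DETR [25] | 5.6 | 13.3 | 21.7 | 34.2 | 45.0 | 10.9 | 13.0 | 18.4 | 27.3 | 39.4 | 7.3 | 16.6 | 20.8 | 32.2 | 41.8 | 23.17 |
    | RepMet [25] | 26.1 | 32.9 | 34.4 | 38.6 | 41.3 | 17.2 | 22.1 | 23.4 | 28.3 | 35.8 | 27.5 | 31.1 | 31.5 | 34.4 | 37.2 | 30.79 |
    | FSRW [20] | 14.8 | 15.5 | 26.7 | 33.9 | 47.2 | 15.7 | 15.3 | 22.7 | 30.1 | 40.5 | 21.3 | 25.6 | 28.4 | 42.8 | 45.9 | 34.08 |
    | Meta RCNN [11] | 19.9 | 25.5 | 35.0 | 45.7 | 51.5 | 10.4 | 19.4 | 29.6 | 34.8 | 45.4 | 14.3 | 18.2 | 27.5 | 41.2 | 48.1 | 31.10 |
    | TFA [9] | 25.3 | 36.4 | 42.1 | 47.9 | 52.8 | 18.3 | 27.5 | 30.9 | 34.1 | 39.5 | 17.9 | 27.2 | 34.3 | 40.8 | 45.6 | 34.71 |
    | LSTD [19] | 8.2 | 11.0 | 12.4 | 29.1 | 38.5 | 11.4 | 13.8 | 15.0 | 15.7 | 31.0 | 12.6 | 18.5 | 25.0 | 27.3 | 36.3 | 20.39 |
    | FSDet [20] | 24.2 | 35.3 | 42.2 | 49.1 | 57.4 | 21.6 | 24.6 | 31.9 | 37.0 | 45.7 | 21.2 | 30.0 | 37.2 | 43.8 | 49.6 | 36.72 |
    | MSA-Net | 26.4 | 41.5 | 47.6 | 49.7 | 58.9 | 17.8 | 26.6 | 35.4 | 38.1 | 46.5 | 21.4 | 36.1 | 39.6 | 45.6 | 49.9 | 38.74 |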
    Table 1. Few-shot object detection performance on VOC 2007 test set
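The Mean column of Table 1 can be checked with a few lines of Python. This is a minimal sketch, assuming the Mean is the unweighted average of the 15 per-split, per-shot scores in a row; that definition is an assumption, not something the page states. The names `table1_rows` and `row_mean` are illustrative only, and the scores are copied from the FRCN and MSA-Net rows.

```python
# Scores copied from Table 1: Split 1, Split 2, Split 3, each at 1/2/3/5/10 shots.
table1_rows = {
    "FRCN": [9.9, 15.6, 21.6, 28.0, 35.6,
             9.4, 13.8, 17.4, 21.9, 29.8,
             8.1, 13.9, 19.0, 23.9, 31.0],
    "MSA-Net": [26.4, 41.5, 47.6, 49.7, 58.9,
                17.8, 26.6, 35.4, 38.1, 46.5,
                21.4, 36.1, 39.6, 45.6, 49.9],
}


def row_mean(scores):
    """Unweighted mean rounded to two decimals (assumed definition of 'Mean')."""
    return round(sum(scores) / len(scores), 2)


table1_means = {model: row_mean(scores) for model, scores in table1_rows.items()}
for model, mean in table1_means.items():
    print(f"{model}: {mean:.2f}")
```

Under that assumption the two rows give 19.93 and 38.74, matching the values listed in the Mean column for FRCN and MSA-Net.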
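    | Model | k=10 AP | k=10 AP50 | k=10 AP75 | k=30 AP | k=30 AP50 | k=30 AP75 | Average AP | Average AP50 | Average AP75 |
    |---|---|---|---|---|---|---|---|---|---|
    | Faster RCNN [20] | 5.5 | 10.0 | 5.5 | 7.4 | 13.1 | 7.4 | 6.45 | 11.55 | 6.45 |
    | Meta-YOLO [20] | 5.6 | 12.3 | 4.6 | 9.1 | 19.0 | 7.6 | 7.35 | 15.65 | 6.1 |
    | Meta Det [10] | 7.1 | 14.6 | 6.1 | 11.3 | 21.7 | 8.1 | 9.2 | 18.15 | 7.1 |
    | Meta RCNN [11] | 8.7 | 19.1 | 6.6 | 12.4 | 25.3 | 10.8 | 10.55 | 22.2 | 8.7 |
    | TFA [9] | 9.1 | 17.1 | 8.8 | 12.1 | 22.0 | 12.0 | 10.6 | 19.55 | 10.4 |
    | MPSR [16] | 9.8 | 17.9 | 9.7 | 14.1 | 25.4 | 14.2 | 11.95 | 21.65 | 11.95 |
    | SRR-FSD [26] | 11.3 | 23.0 | 9.8 | 14.7 | 29.3 | 13.5 | 13.0 | 26.15 | 13.85 |
    | FSCE [27] | 11.1 | - | 9.8 | 15.3 | - | 14.2 | 13.2 | - | 12.0 |
    | Deformable-DETR [28] | 11.7 | 19.6 | 12.1 | 16.3 | 27.2 | 16.7 | 14.0 | 23.4 | 14.4 |
    | MSA-Net | 14.5 | 25.1 | 15.2 | 16.1 | 27.6 | 16.9 | 15.3 | 26.35 | 16.05 |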
    Table 2. Few-shot object detection performance on MS COCO test set
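The Average columns of Table 2 appear to be the midpoint of the k=10 and k=30 results for each metric. The sketch below checks this for the MSA-Net row only; the averaging rule is an assumption inferred from the numbers, and `msa_net_coco` and `average_over_shots` are hypothetical names used for illustration.

```python
# MSA-Net results copied from Table 2 (MS COCO, k = 10 and k = 30 shots).
msa_net_coco = {
    "k=10": {"AP": 14.5, "AP50": 25.1, "AP75": 15.2},
    "k=30": {"AP": 16.1, "AP50": 27.6, "AP75": 16.9},
}


def average_over_shots(results):
    """Per-metric mean of the k=10 and k=30 settings (assumed definition of 'Average')."""
    metrics = results["k=10"].keys()
    return {
        metric: round((results["k=10"][metric] + results["k=30"][metric]) / 2, 2)
        for metric in metrics
    }


msa_net_average = average_over_shots(msa_net_coco)
print(msa_net_average)
```

For this row the computed averages are 15.3 (AP), 26.35 (AP50), and 16.05 (AP75), which agree with the Average columns of the MSA-Net row.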
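    Component columns: FRCN, Decoupled-layer, ABD, Multi-scale Attention (the per-row component marks did not survive extraction)
    | Base | Novel k=1 | Novel k=2 | Novel k=3 | Novel k=5 | Novel k=10 |
    |---|---|---|---|---|---|
    | 56.3 | 9.9 | 15.6 | 21.6 | 28.0 | 35.6 |
    | 68.6 | 13.5 | 22.5 | 24.7 | 29.9 | 40.1 |
    | 74.5 | 17.2 | 27.9 | 33.1 | 35.7 | 45.3 |
    | 75.8 | 15.0 | 25.8 | 28.3 | 32.9 | 43.9 |
    | 74.6 | 21.5 | 35.2 | 44.5 | 45.9 | 54.7 |
    | 72.9 | 19.0 | 33.7 | 41.3 | 43.2 | 50.1 |
    | 79.8 | 23.4 | 37.1 | 45.4 | 48.6 | 56.3 |
    | 82.0 | 26.4 | 41.5 | 47.6 | 49.7 | 58.9 |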
    Table 3. Effect of improved modules on mAP (AP50)
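To read the ablation at a glance, the gain of the last row over the first can be computed directly. This sketch assumes the first row of Table 3 is the plain FRCN baseline and the last row is the full model with all modules enabled. That reading is an assumption; it is suggested by the novel-class values, which match the Split 1 entries of FRCN and MSA-Net in Table 1. The variable names are illustrative.

```python
# First and last rows of Table 3 (Base, then Novel at 1/2/3/5/10 shots).
columns = ["Base", "Novel k=1", "Novel k=2", "Novel k=3", "Novel k=5", "Novel k=10"]
first_row = [56.3, 9.9, 15.6, 21.6, 28.0, 35.6]   # assumed: FRCN baseline
last_row = [82.0, 26.4, 41.5, 47.6, 49.7, 58.9]   # assumed: all modules enabled

# Absolute gain in mAP (AP50) points, rounded to one decimal like the table.
gains = {
    name: round(full - base, 1)
    for name, base, full in zip(columns, first_row, last_row)
}

for name, gain in gains.items():
    print(f"{name}: +{gain:.1f}")
```

Under these assumptions the gain is 25.7 points on base classes and between 16.5 and 26.0 points on novel classes, depending on the shot count.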