MSA-Net: few-shot object detection with multi-stage attention mechanism

Yingwei TANG; Rongfu ZHANG; Ran DING; Jie ZHANG

doi:10.3969/j.issn.1005-5630.202302030011


Optical Instruments
Vol. 45, Issue 6, 14 (2023)


Yingwei TANG, Rongfu ZHANG^*, Ran DING, and Jie ZHANG

Author Affiliations

School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China



Yingwei TANG, Rongfu ZHANG, Ran DING, Jie ZHANG. MSA-Net: few-shot object detection with multi-stage attention mechanism[J]. Optical Instruments, 2023, 45(6): 14


Fig. 1. Learning strategy for a few-shot object detection framework based on meta-learning


Fig. 2. General framework of the MSA model


Fig. 3. Traditional Faster RCNN architecture^[22]


Fig. 4. Gradient backpropagation decoupling mechanism


Fig. 5. Attention-based distillation module


Fig. 6. Multi-scale attention module


Fig. 7. Example of a model trained under the MS COCO 10-shot setting on the VOC 2007 dataset


Model	Split 1					Split 2					Split 3					Mean
Model	1-shot	2-shot	3-shot	5-shot	10-shot	1-shot	2-shot	3-shot	5-shot	10-shot	1-shot	2-shot	3-shot	5-shot	10-shot	Mean
FRCN^[22]	9.9	15.6	21.6	28.0	35.6	9.4	13.8	17.4	21.9	29.8	8.1	13.9	19.0	23.9	31.0	19.93
Deformable-DETR^[25]	5.6	13.3	21.7	34.2	45.0	10.9	13.0	18.4	27.3	39.4	7.3	16.6	20.8	32.2	41.8	23.17
RepMet^[25]	26.1	32.9	34.4	38.6	41.3	17.2	22.1	23.4	28.3	35.8	27.5	31.1	31.5	34.4	37.2	30.79
FSRW^[20]	14.8	15.5	26.7	33.9	47.2	15.7	15.3	22.7	30.1	40.5	21.3	25.6	28.4	42.8	45.9	28.43
Meta RCNN ^[11]	19.9	25.5	35.0	45.7	51.5	10.4	19.4	29.6	34.8	45.4	14.3	18.2	27.5	41.2	48.1	31.10
TFA^[9]	25.3	36.4	42.1	47.9	52.8	18.3	27.5	30.9	34.1	39.5	17.9	27.2	34.3	40.8	45.6	34.71
LSTD^[19]	8.2	11.0	12.4	29.1	38.5	11.4	13.8	15.0	15.7	31.0	12.6	18.5	25.0	27.3	36.3	20.39
FSDet^[20]	24.2	35.3	42.2	49.1	57.4	21.6	24.6	31.9	37.0	45.7	21.2	30.0	37.2	43.8	49.6	36.72
MSA-Net	26.4	41.5	47.6	49.7	58.9	17.8	26.6	35.4	38.1	46.5	21.4	36.1	39.6	45.6	49.9	38.74

Table 1. Few-shot object detection performance on VOC 2007 test set
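
The Mean column of Table 1 is consistent with a plain arithmetic mean over the 15 per-setting scores in each row (three splits, five shot counts each). The short sketch below recomputes it for two rows copied from the table; the dictionary layout and the helper name `row_mean` are illustrative assumptions, not part of the original article.

```python
# Scores copied from Table 1: Split 1, Split 2, Split 3,
# each with 1/2/3/5/10 shots (15 values per model).
rows = {
    "FRCN": [9.9, 15.6, 21.6, 28.0, 35.6,
             9.4, 13.8, 17.4, 21.9, 29.8,
             8.1, 13.9, 19.0, 23.9, 31.0],
    "MSA-Net": [26.4, 41.5, 47.6, 49.7, 58.9,
                17.8, 26.6, 35.4, 38.1, 46.5,
                21.4, 36.1, 39.6, 45.6, 49.9],
}


def row_mean(scores):
    """Arithmetic mean of one table row, rounded to two decimals.

    Hypothetical helper for illustration only.
    """
    return round(sum(scores) / len(scores), 2)


means = {model: row_mean(scores) for model, scores in rows.items()}
print(means)
```

The same helper can be applied to any other row of the table to check its reported mean.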

Model	k=10			k=30			Average
Model	AP	AP50	AP75	AP	AP50	AP75	AP	AP50	AP75
Faster RCNN^[20]	5.5	10.0	5.5	7.4	13.1	7.4	6.45	11.55	6.45
Meta-YOLO^[20]	5.6	12.3	4.6	9.1	19.0	7.6	7.35	15.65	6.1
Meta Det^[10]	7.1	14.6	6.1	11.3	21.7	8.1	9.2	18.15	7.1
Meta RCNN^[11]	8.7	19.1	6.6	12.4	25.3	10.8	10.55	22.2	8.7
TFA^[9]	9.1	17.1	8.8	12.1	22.0	12.0	10.6	19.55	10.4
MPSR^[16]	9.8	17.9	9.7	14.1	25.4	14.2	11.95	21.65	11.95
SRR-FSD^[26]	11.3	23.0	9.8	14.7	29.3	13.5	13.0	26.15	11.65
FSCE^[27]	11.1	—	9.8	15.3	—	14.2	13.2	—	12.0
Deformable-DETR^[28]	11.7	19.6	12.1	16.3	27.2	16.7	14.0	23.4	14.4
MSA-Net	14.5	25.1	15.2	16.1	27.6	16.9	15.3	26.35	16.05

Table 2. Few-shot object detection performance on MS COCO test set

FRCN	Decoupled-layer	ABD	Multi-scale Attention	Base	Novel (1-shot)	Novel (2-shot)	Novel (3-shot)	Novel (5-shot)	Novel (10-shot)
√				56.3	9.9	15.6	21.6	28.0	35.6
√	√			68.6	13.5	22.5	24.7	29.9	40.1
√		√		74.5	17.2	27.9	33.1	35.7	45.3
√			√	75.8	15.0	25.8	28.3	32.9	43.9
√	√	√		74.6	21.5	35.2	44.5	45.9	54.7
√	√		√	72.9	19.0	33.7	41.3	43.2	50.1
√		√	√	79.8	23.4	37.1	45.4	48.6	56.3
√	√	√	√	82.0	26.4	41.5	47.6	49.7	58.9

Table 3. Effect of improved modules on mAP (AP50)
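
One way to read Table 3 is as the gain of each configuration over the plain FRCN baseline. The sketch below does this for the novel-class 10-shot column only, using values copied from the table; the variable names and the choice of column are assumptions made for illustration.

```python
# Novel-class 10-shot scores copied from Table 3.
baseline_10shot = 35.6  # FRCN only

variants_10shot = {
    "Decoupled-layer": 40.1,
    "ABD": 45.3,
    "Multi-scale Attention": 43.9,
    "All three modules": 58.9,
}

# Absolute gain over the baseline, rounded to one decimal
# to match the precision of the table.
gains = {
    name: round(score - baseline_10shot, 1)
    for name, score in variants_10shot.items()
}

for name, gain in gains.items():
    print(f"{name}: +{gain}")
```

Swapping in another column of the table (for example the base-class scores) only requires changing the baseline and the per-variant values.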
