Author Affiliations
1 Institute of Optics and Electronics, Chinese Academy of Sciences, Chengdu, Sichuan 610209, China
2 University of Chinese Academy of Sciences, Beijing 100049, China
Fig. 1. Faster R-CNN network architecture
Fig. 2. FSOIC network architecture
Fig. 3. Detection results based on TFA
Fig. 4. Attention-FPN network architecture
Fig. 5. Channel attention module
Fig. 6. FSOIC algorithm class template generation module
Fig. 7. Feature metric space
Fig. 8. Performance comparison of the detection results
Fig. 9. Detection results under occlusion in the 10-shot task
Fig. 10. Detection results in the 10-shot task. (a) Detection results of the Faster R-CNN network based on TFA; (b) Detection results of the Faster R-CNN network using the online inference calibration module; (c) Detection results of the Faster R-CNN network using the online inference calibration module with the Attention-FPN network added
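| Shot | Backbone | Regressor | Classifier | Attention-FPN | RPN | ROI |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | × | √ | √ | × | × | × |
| 2 | × | √ | √ | × | × | × |
| 3 | × | √ | √ | √ | × | √ |
| 5 | × | √ | √ | √ | √ | √ |
| 10 | × | √ | √ | √ | √ | √ |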
Table 1. Hierarchical freezing mechanism
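| Dataset | Shot | Number of categories | Initial learning rate | Batch size | Learning-rate decay ratio | Number of decay steps | Iterations |
| --- | --- | --- | --- | --- | --- | --- | --- |
| VOC | 1 | 20 | 0.001 | 16 | 0.1 | 1 | 6000 |
| VOC | 2 | 20 | 0.001 | 16 | 0.1 | 1 | 7000 |
| VOC | 3 | 20 | 0.001 | 16 | 0.1 | 2 | 8000 |
| VOC | 5 | 20 | 0.001 | 16 | 0.5 | 2 | 9000 |
| VOC | 10 | 20 | 0.001 | 16 | 0.5 | 2 | 13000 |
| COCO | 10 | 80 | 0.001 | 16 | 0.3 | 1 | 30000 |
| COCO | 30 | 80 | 0.001 | 16 | 0.3 | 1 | 40000 |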
Table 2. Experimental settings for each dataset
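The settings in Table 2 can be read as a small training configuration. The sketch below is illustrative only: it transcribes the per-shot values and computes the learning rate that remains once every decay step has been applied. The names `VOC_SETTINGS`, `COCO_SETTINGS`, and `final_learning_rate` are hypothetical, and the rule that each decay multiplies the learning rate by the decay ratio is an assumption, since the table does not state the schedule or the iterations at which decay occurs.

```python
# Values transcribed from Table 2. The multiplicative decay rule and all
# names below are assumptions made for illustration, not the authors' code.
BASE_LR = 0.001   # initial learning rate (both datasets)
BATCH_SIZE = 16   # batch size (both datasets)

# shot -> (decay ratio, number of decay steps, iterations)
VOC_SETTINGS = {
    1: (0.1, 1, 6000),
    2: (0.1, 1, 7000),
    3: (0.1, 2, 8000),
    5: (0.5, 2, 9000),
    10: (0.5, 2, 13000),
}
COCO_SETTINGS = {
    10: (0.3, 1, 30000),
    30: (0.3, 1, 40000),
}


def final_learning_rate(base_lr, decay_ratio, num_decays):
    """Learning rate after all decay steps, assuming each step multiplies by decay_ratio."""
    return base_lr * decay_ratio ** num_decays


for name, settings in (("VOC", VOC_SETTINGS), ("COCO", COCO_SETTINGS)):
    for shot, (ratio, num_decays, iterations) in settings.items():
        lr = final_learning_rate(BASE_LR, ratio, num_decays)
        print(f"{name} {shot}-shot: {iterations} iterations, final lr = {lr:.2e}")
```

Under this assumption, the higher-shot VOC tasks (5 and 10 shot) end training at a larger learning rate than the 1-shot task, because their decay ratio of 0.5 is much milder than 0.1.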
| Method | Year | Set 1, 1-shot | Set 1, 2-shot | Set 1, 3-shot | Set 1, 5-shot | Set 1, 10-shot | Set 2, 1-shot | Set 2, 2-shot | Set 2, 3-shot | Set 2, 5-shot | Set 2, 10-shot | Set 3, 1-shot | Set 3, 2-shot | Set 3, 3-shot | Set 3, 5-shot | Set 3, 10-shot |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| LSTD[26] | AAAI 18 | 8.2 | 1.0 | 12.4 | 29.1 | 38.5 | 11.4 | 3.8 | 5.0 | 15.7 | 31.0 | 12.6 | 8.5 | 15.0 | 27.3 | 36.3 |
| MetaDet[40] | ICCV 19 | 18.9 | 20.6 | 30.2 | 36.8 | 49.6 | 21.8 | 23.1 | 27.8 | 31.7 | 43.0 | 20.6 | 23.9 | 29.4 | 43.9 | 44.1 |
| Meta R-CNN[15] | ICCV 19 | 19.9 | 25.5 | 35.0 | 45.7 | 51.5 | 10.4 | 19.4 | 29.6 | 34.8 | 45.4 | 14.3 | 18.2 | 27.5 | 41.2 | 48.1 |
| RepMet[28] | CVPR 19 | 26.1 | 32.9 | 34.4 | 38.6 | 41.3 | 17.2 | 22.1 | 23.4 | 28.3 | 35.8 | 27.5 | 31.1 | 31.5 | 34.4 | 37.2 |
| FSRW[37] | ICCV 19 | 14.8 | 15.5 | 26.7 | 33.9 | 47.2 | 15.7 | 15.3 | 22.7 | 30.1 | 40.5 | 21.3 | 25.6 | 28.4 | 42.8 | 45.9 |
| FSDetView[42] | ECCV 20 | 24.2 | 35.3 | 42.2 | 49.1 | 57.4 | 21.6 | 24.6 | 31.9 | 37.0 | 45.7 | 21.2 | 30.0 | 37.2 | 43.8 | 49.6 |
| TFA w/cos[44] | ICML 20 | 39.8 | 36.1 | 44.7 | 55.7 | 56.0 | 23.5 | 26.9 | 34.1 | 35.1 | 39.1 | 30.8 | 34.8 | 42.8 | 49.5 | 49.8 |
| MPSR[51] | ECCV 20 | 41.7 | - | 51.4 | 55.2 | 61.8 | 24.4 | - | 39.2 | 39.9 | 47.8 | 35.6 | - | 42.3 | 48.0 | 49.7 |
| TFA w/cos+Halluc[18] | CVPR 21 | 45.1 | 44.0 | 44.7 | 55.0 | 55.9 | 23.2 | 27.5 | 35.1 | 34.9 | 39.0 | 30.5 | 35.1 | 41.4 | 49.0 | 49.3 |
| TIP[41] | CVPR 21 | 27.7 | 36.5 | 43.3 | 50.2 | 59.6 | 22.7 | 30.1 | 33.8 | 40.9 | 46.9 | 21.7 | 30.6 | 38.1 | 44.5 | 50.9 |
| FSCE[25] | CVPR 21 | 44.2 | 43.8 | 51.4 | 61.9 | 63.4 | 27.3 | 29.5 | 43.5 | 44.2 | 50.2 | 37.2 | 41.9 | 47.5 | 54.6 | 58.5 |
| Retentive R-CNN[45] | CVPR 21 | 42.4 | 45.8 | 45.9 | 53.7 | 56.1 | 21.7 | 27.8 | 35.2 | 37.0 | 40.3 | 30.2 | 37.6 | 43.0 | 49.7 | 50.1 |
| Meta-DETR[38] | IEEE 22 | 35.1 | 49.0 | 53.2 | 57.4 | 62.0 | 27.9 | 32.3 | 38.4 | 43.2 | 51.8 | 34.9 | 41.8 | 47.1 | 54.1 | 58.2 |
| AGCM[33] | IEEE 22 | 40.3 | - | - | 58.5 | 59.9 | 27.5 | - | - | 49.3 | 50.6 | 42.1 | - | - | 54.2 | 58.2 |
| FSOIC(Ours) | | 46.6 | 53.4 | 56.6 | 62.0 | 64.5 | 25.7 | 30.5 | 43.8 | 45.9 | 53.3 | 42.4 | 44.9 | 49.5 | 56.6 | 58.8 |
Table 3. Performance comparison of few-shot object detection algorithms on the VOC novel-class split sets
| Method | Year | Novel AP (10-shot) | Novel AP (30-shot) |
| --- | --- | --- | --- |
| LSTD[26] | AAAI 18 | 3.2 | 6.7 |
| FSRW[37] | ICCV 19 | 5.6 | 9.1 |
| MPSR[51] | ECCV 20 | 9.8 | 14.1 |
| TFA w/cos[44] | ICML 20 | 10.0 | 13.7 |
| Retentive R-CNN[45] | CVPR 21 | 10.5 | 13.8 |
| FSCE[25] | CVPR 21 | 11.9 | 16.4 |
| FSOIC(Ours) | | 12.7 | 16.7 |
Table 4. Performance comparison of few-shot object detection algorithms on the COCO dataset
| Method | FPN+4*ROI | Fine-tune RPN | Online calibration | Channel attention | Novel Set 1, 1-shot | Novel Set 1, 3-shot | Novel Set 1, 10-shot |
| --- | --- | --- | --- | --- | --- | --- | --- |
| TFA w/cos[44] | - | - | - | - | 39.8 | 44.7 | 56.0 |
| FSOIC(Ours) | √ | × | × | × | 43.6 | 52.2 | 62.5 |
| FSOIC(Ours) | √ | √ | × | × | 44.1 | 53.0 | 63.2 |
| FSOIC(Ours) | √ | √ | √ | × | 45.7 | 54.2 | 64.2 |
| FSOIC(Ours) | √ | × | √ | √ | 46.2 | 54.9 | 62.8 |
| FSOIC(Ours) | √ | √ | × | √ | 44.7 | 54.0 | 61.7 |
| FSOIC(Ours) | √ | √ | √ | √ | 46.6 | 56.6 | 64.5 |
Table 5. Performance comparison of the ablation experiments
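To make the ablation in Table 5 easier to read, the gain of each configuration over the TFA w/cos baseline can be computed directly from the reported Novel Set 1 scores. The sketch below is only an illustration of that arithmetic; the names `TFA_BASELINE`, `FSOIC_FPN_ONLY`, `FSOIC_FULL`, and `gains` are hypothetical, and the numbers are copied from the first, second, and last rows of the table.

```python
# Novel Set 1 scores for the 1-, 3-, and 10-shot tasks, copied from Table 5.
# Variable and function names are illustrative assumptions.
SHOTS = (1, 3, 10)
TFA_BASELINE = (39.8, 44.7, 56.0)    # TFA w/cos
FSOIC_FPN_ONLY = (43.6, 52.2, 62.5)  # FSOIC with FPN+4*ROI only
FSOIC_FULL = (46.6, 56.6, 64.5)      # FSOIC with all four components


def gains(method, baseline):
    """Per-shot difference between a method and the baseline, rounded to one decimal."""
    return tuple(round(m - b, 1) for m, b in zip(method, baseline))


fpn_gain = gains(FSOIC_FPN_ONLY, TFA_BASELINE)
full_gain = gains(FSOIC_FULL, TFA_BASELINE)

for shot, g_fpn, g_full in zip(SHOTS, fpn_gain, full_gain):
    print(f"{shot}-shot: FPN+4*ROI only +{g_fpn}, full FSOIC +{g_full}")
```

On these numbers the full configuration improves on the baseline by 6.8, 11.9, and 8.5 points for the 1-, 3-, and 10-shot tasks, with the FPN+4*ROI change alone accounting for 3.8, 7.5, and 6.5 of those points.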