Author Affiliations
Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, Yunnan 650500, China
Fig. 1. Structure of Faster R-CNN algorithm
Fig. 2. Region proposal network
Fig. 3. Problem with ROI pooling
Fig. 4. Problem with non-maximum suppression algorithm
Fig. 5. Detection results under normal conditions. (a) Faster R-CNN; (b) add Soft-NMS+crop_and_resize; (c) add data enhancement; (d) our algorithm
Fig. 6. Detection results on a grayscale image. (a) Faster R-CNN; (b) add Soft-NMS+crop_and_resize; (c) add data enhancement; (d) our algorithm
Fig. 7. Detection results with multiple overlapping targets. (a) Faster R-CNN; (b) add Soft-NMS+crop_and_resize; (c) add data enhancement; (d) our algorithm
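The captions above contrast standard non-maximum suppression with Soft-NMS. As a rough illustration of the general idea only, the following is a minimal sketch of the Gaussian variant of Soft-NMS in plain Python. The function names, the `(x1, y1, x2, y2)` box layout, and the values `sigma=0.5` and `score_thresh=0.001` are assumptions made for this example and are not taken from the paper; the authors' exact configuration may differ.

```python
import math


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def soft_nms(boxes, scores, sigma=0.5, score_thresh=0.001):
    """Gaussian Soft-NMS sketch (hypothetical helper, not the paper's code).

    Instead of deleting boxes that overlap the current best box, their
    scores are multiplied by exp(-IoU^2 / sigma). Boxes whose score falls
    below score_thresh are discarded.

    Returns a list of (box_index, rescored_value) in selection order.
    """
    remaining = [(i, float(s)) for i, s in enumerate(scores)]
    kept = []
    while remaining:
        # Select the box with the highest current score.
        best_pos = max(range(len(remaining)), key=lambda k: remaining[k][1])
        best_idx, best_score = remaining.pop(best_pos)
        kept.append((best_idx, best_score))

        # Decay the scores of the other boxes according to their overlap.
        decayed = []
        for idx, score in remaining:
            overlap = iou(boxes[best_idx], boxes[idx])
            score *= math.exp(-(overlap ** 2) / sigma)
            if score >= score_thresh:
                decayed.append((idx, score))
        remaining = decayed
    return kept


if __name__ == "__main__":
    boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
    scores = [0.9, 0.8, 0.7]
    for idx, score in soft_nms(boxes, scores):
        print(idx, round(score, 3))
```

In this toy input, the second box overlaps the first with an IoU of about 0.68. Hard NMS with a 0.5 threshold would delete it, whereas the sketch keeps it with a reduced score, which is the behavior that helps when several targets overlap.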
| Algorithm | Backbone | Training set | Testing set | mAP /% |
|---|---|---|---|---|
| Fast R-CNN | VGG-16 | VOC2007 | VOC2007 | 66.90 |
| Faster R-CNN | VGG-16 | VOC2007 | VOC2007 | 69.90 |
| SSD300 | VGG-16 | VOC2007 | VOC2007 | 68.00 |
| YOLO | GoogleNet | VOC2007 | VOC2007 | 63.40 |
| Data enhancement | VGG-16 | VOC2007 | VOC2007 | 70.90 |
| Soft-NMS+crop_and_resize | VGG-16 | VOC07++ | VOC2007 | 73.10 |
| Ours | VGG-16 | VOC07++ | VOC2007 | 76.40 |
Table 1. Test results on the PASCAL VOC2007
| Input size | Backbone | Training set | Testing set | mAP /% |
|---|---|---|---|---|
| 128², 256², 521² | VGG-16 | VOC07++ | VOC2007 | 76.40 |
| 64², 128², 256², 521² | VGG-16 | VOC07++ | VOC2007 | 77.69 |
| 32², 64², 128², 256², 521² | VGG-16 | VOC07++ | VOC2007 | 77.63 |

Table 2. Detection results on the PASCAL VOC07++ dataset at different scales
| Algorithm | Backbone | Training set | Testing set | mAP /% |
|---|---|---|---|---|
| Fast R-CNN | VGG-16 | VOC07+12 | VOC2007 | 70.00 |
| Faster R-CNN | VGG-16 | VOC07+12 | VOC2007 | 73.20 |
| Faster R-CNN | ResNet-101 | VOC07+12 | VOC2007 | 76.40 |
| MR-CNN | ResNet-101 | VOC07+12 | VOC2007 | 78.20 |
| ION | VGG-16 | VOC07+12 | VOC2007 | 76.50 |
| YOLO | GoogleNet | VOC07+12 | VOC2007 | 63.40 |
| YOLOV2 | Darknet-19 | VOC07+12 | VOC2007 | 78.60 |
| SSD300 | VGG-16 | VOC07+12 | VOC2007 | 77.20 |
| Data enhancement | VGG-16 | VOC07+12 | VOC2007 | 75.80 |
| Soft-NMS+crop_and_resize | VGG-16 | VOC07+++12 | VOC2007 | 78.40 |
| Ours | VGG-16 | VOC07+++12 | VOC2007 | 81.20 |
Table 3. Test results on PASCAL VOC07+12 test set
| Input size | Backbone | Training set | Testing set | mAP /% |
|---|---|---|---|---|
| 128², 256², 521² | VGG-16 | VOC07+++12 | VOC2007 | 81.22 |
| 64², 128², 256², 521² | VGG-16 | VOC07+++12 | VOC2007 | 83.00 |
| 32², 64², 128², 256², 521² | VGG-16 | VOC07+++12 | VOC2007 | 82.94 |
Table 4. Detection results on PASCAL VOC07+++12 at different scales
| Algorithm | Training set | IoU 0.50:0.95 | IoU 0.50 | IoU 0.75 | Image size S | Image size M | Image size L |
|---|---|---|---|---|---|---|---|
| Fast R-CNN | train | 19.70 | 35.90 | - | - | - | - |
| Faster R-CNN | train | 20.50 | 39.90 | 19.40 | 4.10 | 20.00 | 35.80 |
| Faster R-CNN | train | 21.90 | 42.70 | - | - | - | - |
| ION[18] | train | 23.60 | 43.20 | 23.60 | 6.40 | 24.10 | 38.30 |
| Faster R-CNN | trainval35 | 24.20 | 45.30 | 23.50 | 7.70 | 26.40 | 37.10 |
| SSD300 | trainval35 | 23.20 | 41.20 | 23.40 | 5.30 | 23.20 | 39.60 |
| SSD512 | trainval35 | 26.80 | 46.50 | 27.80 | 9.00 | 28.90 | 41.90 |
| YOLOV2[19] | trainval35 | 21.60 | 44.00 | 19.20 | 5.00 | 22.40 | 35.50 |
| Ours | trainval35 | 26.60 | 47.20 | 27.00 | 11.40 | 30.80 | 37.10 |

Table 5. mAP of different algorithms on COCO2014 (unit: %)
| Algorithm | Training set | Number of iterations: 1 | Number of iterations: 10 | Number of iterations: 100 | Image size S | Image size M | Image size L |
|---|---|---|---|---|---|---|---|
| Faster R-CNN | train | 21.30 | 29.50 | 30.10 | 7.30 | 32.10 | 52.00 |
| ION | train | 23.20 | 32.70 | 33.50 | 10.10 | 37.70 | 53.60 |
| Faster R-CNN | trainval35 | 23.80 | 34.00 | 34.60 | 12.00 | 38.50 | 54.40 |
| SSD300 | trainval35 | 22.50 | 33.20 | 35.30 | 9.60 | 37.60 | 56.50 |
| SSD512 | trainval35 | 24.80 | 37.50 | 39.80 | 14.00 | 43.50 | 59.00 |
| YOLOV2 | trainval35 | 20.70 | 31.60 | 33.30 | 9.80 | 36.50 | 54.40 |
| Ours | trainval35 | 25.50 | 38.30 | 39.30 | 19.70 | 45.50 | 55.40 |

Table 6. mAR of different algorithms on COCO2014 (unit: %)