Fig. 1. Information extracted by the convolutional structure of each layer of the CNN
Fig. 2. Generator network model for coarse-grained network
Fig. 3. Discriminator network model for coarse-grained network
Fig. 4. Discriminator network model for fine-grained network
Fig. 5. Objects to be detected, extracted by cropping
Fig. 6. Model of object detection network
Fig. 7. Part of the dataset
Fig. 8. Loss function values for different models in coarse-grained network. (a) Discriminator network; (b) generator network
Fig. 9. Loss function values for different models in fine-grained network. (a) Discriminator network; (b) generator network
Fig. 10. Airplane images produced by the fine-grained network
Fig. 11. Change in loss function value during the training process
Fig. 12. Part of the detection results
Fig. 13. Loss function value curves during the training process. (a) With GAN for pretraining; (b) without GAN for pretraining
Fig. 14. Comparison of the mAP of different network models
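| Pretraining | Training step | Loss |
|---|---|---|
| With GAN | 200 | 0.26 |
| With GAN | 500 | 0.14 |
| With GAN | 1000 | 0.14 |
| With GAN | 2000 | 0.13 |
| Without GAN | 500 | 0.50 |
| Without GAN | 1000 | 0.40 |
| Without GAN | 2000 | 0.36 |
| Without GAN | 5000 | 0.19 |
| Without GAN | 8000 | 0.35 |
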
Table 1. Loss function value variation with the number of training steps
| Labeled data | SSD | Faster-RCNN | YOLOv3 | Without coarse-grained network | Without fine-grained network | With GAN |
|---|---|---|---|---|---|---|
| 100 | 38.40 | 35.26 | 37.47 | 56.82 | 35.73 | 60.80 |
| 300 | 43.85 | 38.75 | 42.06 | 67.10 | 40.21 | 70.93 |
| 500 | 48.71 | 42.92 | 47.83 | 75.31 | 42.80 | 76.19 |
| 1000 | 57.69 | 50.38 | 58.04 | 75.49 | 51.82 | 77.27 |
| 2000 | 68.06 | 65.73 | 69.41 | 75.93 | 64.73 | 77.49 |
| 3000 | 76.50 | 74.51 | 76.62 | 76.45 | 74.06 | 77.93 |
| 5000 | 78.04 | 77.28 | 78.12 | 77.52 | 76.71 | 78.17 |

Table 2. Effect of sample size on detection accuracy (mAP /%)
| Method | SSD | Faster-RCNN | YOLOv3 | Ours |
|---|---|---|---|---|
| FPS /(frame·s⁻¹) | 46 | 9 | 35 | 49 |

Table 3. Detection speed of different detection methods