Fig. 1. Structure of YOLOv8 network
Fig. 2. Overall structure of TFF module
Fig. 3. Structure of TS-YOLOv8 network
Fig. 4. Sample images from the datasets
Fig. 5. Variation trend of mAP50. (a) On the AFO dataset; (b) on the SeaDronesSee dataset
Fig. 6. Comparison of detection results and heatmaps on the AFO test set by baseline model and proposed model. (a) Input images; (b)‒(c) detection results; (d)‒(e) heatmaps
Fig. 7. Comparison of detection results and heatmaps on the SeaDronesSee test set by baseline model and proposed model. (a) Input images; (b)‒(c) detection results; (d)‒(e) heatmaps

Parameter | Configuration |
---|---|
Python version | 3.11 |
PyTorch version | 2.0.0 |
CUDA version | 11.7 |
Learning rate | 0.01 |
Optimizer | Stochastic gradient descent (SGD) |
Batch size | 16 |
Epochs | 100 |
Momentum | 0.937 |
Weight decay | 5×10⁻⁴ |

Table 1. Experimental configuration and parameter settings

Algorithm | Parameters /10⁶ | GFLOPs | P /% | R /% | mAP50 /% | mAP95 /% | FPS /(frame·s⁻¹) |
---|---|---|---|---|---|---|---|
Faster RCNN | 136.79 | 369.84 | 42.96 | 47.97 | 41.01 | 17.39 | 17 |
SSD | 24.28 | 61.25 | 72.26 | 19.96 | 44.10 | 22.26 | 39 |
YOLOv3 | 61.55 | 65.63 | 83.70 | 46.12 | 53.99 | 20.69 | 35 |
YOLOv4 | 63.97 | 59.99 | 68.28 | 20.82 | 35.79 | 13.80 | 62 |
YOLOv5 | 46.17 | 108.31 | 94.01 | 51.01 | 57.83 | 31.69 | 73 |
YOLOX | 54.15 | 155.69 | 94.31 | 75.26 | 76.26 | 46.10 | 69 |
YOLOv7 | 37.22 | 105.20 | 87.40 | 76.76 | 84.88 | 44.80 | 81 |
RetinaNet | 36.43 | 146.91 | 77.69 | 28.06 | 35.24 | 19.36 | 71 |
CenterNet | 32.67 | 70.22 | 90.89 | 52.02 | 61.71 | 28.89 | 82 |
RT-DETR | 42.31 | 136.74 | 89.45 | 83.79 | 88.09 | 47.68 | 108 |
YOLOv8 | 11.64 | 28.43 | 90.10 | 84.69 | 89.54 | 53.67 | 114 |
TS-YOLOv8 | 12.17 | 29.06 | 94.54 | 91.56 | 95.14 | 61.05 | 110 |

Table 2. Overall performance of different algorithms on the AFO dataset

Algorithm | AP50 (Human) | AP50 (Board) | AP50 (Boat) | AP50 (Buoy) | AP50 (Sailboat) | AP50 (Kayak) | mAP50 |
---|---|---|---|---|---|---|---|
Faster RCNN | 19.26 | 69.32 | 45.78 | — | 34.01 | 77.69 | 41.01 |
SSD | 22.12 | 87.83 | 31.83 | 10.06 | 34.13 | 78.63 | 44.10 |
YOLOv3 | 62.11 | 87.92 | 30.34 | 30.08 | 31.11 | 82.40 | 53.99 |
YOLOv4 | 32.49 | 78.54 | 12.90 | — | 28.70 | 62.13 | 35.79 |
YOLOv5 | 82.04 | 97.90 | 46.53 | 16.65 | 15.80 | 88.05 | 57.83 |
YOLOX | 91.43 | 98.88 | 58.10 | 86.20 | 34.69 | 88.23 | 76.26 |
YOLOv7 | 82.72 | 98.72 | 68.40 | 79.34 | 80.88 | 99.24 | 84.88 |
RetinaNet | 4.29 | 74.36 | 37.72 | — | 33.79 | 61.26 | 35.24 |
CenterNet | 65.54 | 82.38 | 55.02 | 45.85 | 33.56 | 87.93 | 61.71 |
RT-DETR | 84.78 | 98.26 | 77.51 | 84.18 | 84.92 | 98.79 | 88.07 |
YOLOv8 | 82.75 | 98.19 | 80.64 | 88.18 | 87.92 | 99.54 | 89.54 |
TS-YOLOv8 | 89.82 | 99.14 | 91.54 | 94.48 | 96.28 | 99.60 | 95.14 |

Table 3. AP of each category in the AFO dataset detected by different algorithms
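
The mAP50 column of Table 3 can be read as the arithmetic mean of the six per-category AP50 values. The following is a minimal sketch that checks this for two rows of the table. The helper `mean_ap` and its `missing_as` parameter are illustrative names, not part of the original work, and treating a "—" entry as 0 is an assumption that happens to reproduce the reported value for Faster RCNN.

```python
# Per-category AP50 values (in %) copied from Table 3, in the order
# Human, Board, Boat, Buoy, Sailboat, Kayak.
# None marks a category reported as "—" in the table.
ap50 = {
    "TS-YOLOv8": [89.82, 99.14, 91.54, 94.48, 96.28, 99.60],
    "Faster RCNN": [19.26, 69.32, 45.78, None, 34.01, 77.69],
}


def mean_ap(per_class, missing_as=0.0):
    """Average per-class AP values, rounded to two decimals.

    Missing entries (None) are replaced by `missing_as`; using 0 is an
    assumption made for this sketch, not something stated in the source.
    """
    values = [missing_as if v is None else v for v in per_class]
    return round(sum(values) / len(values), 2)


map50 = {name: mean_ap(values) for name, values in ap50.items()}
print(map50)
```

Under these assumptions the sketch gives 95.14 for TS-YOLOv8 and 41.01 for Faster RCNN, matching the mAP50 column.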

Model | Parameters /10⁶ | GFLOPs | P /% | R /% | mAP50 /% | mAP95 /% |
---|---|---|---|---|---|---|
YOLOv8 | 11.64 | 28.43 | 90.10 | 84.69 | 89.54 | 53.67 |
YOLOv8+EMA | 11.64 | 28.40 | 91.07 | 85.79 | 89.75 | 53.69 |
YOLOv8+SimAM | 11.64 | 28.43 | 91.29 | 85.30 | 90.06 | 55.01 |
YOLOv8+CA | 11.75 | 28.91 | 90.95 | 85.90 | 90.18 | 54.92 |
YOLOv8+GAM | 11.71 | 28.77 | 90.10 | 84.97 | 89.66 | 53.70 |
YOLOv8+CBAM | 11.64 | 28.53 | 92.10 | 86.36 | 91.09 | 55.15 |
YOLOv8+SA | 11.64 | 28.43 | 91.91 | 87.02 | 91.45 | 55.74 |

Table 4. Comparison of different attention mechanisms

YOLOv8 | TFF | SA | Parameters /10⁶ | GFLOPs | P /% | R /% | mAP50 /% | mAP95 /% | FPS /(frame·s⁻¹) |
---|---|---|---|---|---|---|---|---|---|
√ | | | 11.64 | 28.43 | 90.10 | 84.69 | 89.54 | 53.67 | 114 |
√ | √ | | 12.17 | 29.06 | 92.35 | 88.61 | 93.46 | 59.79 | 112 |
√ | | √ | 11.64 | 28.43 | 91.91 | 87.02 | 91.45 | 55.74 | 112 |
√ | √ | √ | 12.17 | 29.06 | 94.54 | 91.56 | 95.14 | 61.05 | 110 |

Table 5. Overall performance of each ablation algorithm on the AFO dataset

YOLOv8 | TFF | SA | AP50 (Human) | AP50 (Board) | AP50 (Boat) | AP50 (Buoy) | AP50 (Sailboat) | AP50 (Kayak) | mAP50 |
---|---|---|---|---|---|---|---|---|---|
√ | | | 82.75 | 99.19 | 80.64 | 78.18 | 90.92 | 99.54 | 89.54 |
√ | √ | | 88.27 | 99.29 | 92.52 | 86.60 | 94.49 | 99.56 | 93.46 |
√ | | √ | 87.09 | 98.48 | 87.27 | 84.35 | 91.98 | 99.53 | 91.45 |
√ | √ | √ | 89.82 | 99.14 | 91.54 | 94.48 | 96.28 | 99.60 | 95.14 |

Table 6. AP of each category in the AFO dataset detected by each ablation algorithm

YOLOv8 | TFF | SA | Parameters /10⁶ | GFLOPs | P /% | R /% | mAP50 /% | mAP95 /% | FPS /(frame·s⁻¹) |
---|---|---|---|---|---|---|---|---|---|
√ | | | 11.64 | 28.43 | 88.42 | 82.82 | 86.87 | 57.61 | 110 |
√ | √ | | 12.17 | 29.06 | 90.37 | 86.14 | 90.09 | 61.91 | 107 |
√ | | √ | 11.64 | 28.43 | 88.93 | 85.75 | 88.71 | 59.47 | 108 |
√ | √ | √ | 12.17 | 29.06 | 91.66 | 87.59 | 91.34 | 63.53 | 106 |

Table 7. Overall performance of each ablation algorithm on the SeaDronesSee dataset

YOLOv8 | TFF | SA | AP50 (Swimmer) | AP50 (Boat) | AP50 (Jetski) | AP50 (Life jacket) | AP50 (Buoy) | mAP50 |
---|---|---|---|---|---|---|---|---|
√ | | | 80.15 | 92.63 | 90.76 | 81.01 | 89.80 | 86.87 |
√ | √ | | 85.75 | 94.13 | 92.93 | 85.08 | 92.56 | 90.09 |
√ | | √ | 83.99 | 93.49 | 91.86 | 83.05 | 91.16 | 88.71 |
√ | √ | √ | 87.04 | 95.41 | 94.60 | 86.38 | 93.27 | 91.34 |

Table 8. AP of each category in the SeaDronesSee dataset detected by each ablation algorithm