Remote-Sensing Image Object Detection Based on Improved YOLOv8 Algorithm

Xiuzai Zhang; Tao Shen; Dai Xu

doi:10.3788/LOP231803

[1] Zou Z X, Chen K Y, Shi Z W et al. Object detection in 20 years: a survey[EB/OL]. https://arxiv.org/abs/1905.05055

[2] Redmon J, Divvala S, Girshick R et al. You only look once: unified, real-time object detection[C], 779-788(2016).

[3] Liu W, Anguelov D, Erhan D et al. SSD: single shot MultiBox detector[M]//Leibe B, Matas J, Sebe N et al. Computer vision-ECCV 2016. Lecture notes in computer science, 9905, 21-37(2016).

[4] Girshick R. Fast R-CNN[C], 1440-1448(2016).

[5] Liu T, Ding X Y, Zhang B B et al. Improved YOLOv5 for remote sensing image detection[J]. Computer Engineering and Applications, 59, 253-261(2023).

[6] Zhang Z, Bai J H, Tian Q. Image rotating objects detection based on single level feature pyramid[J]. Computer Engineering and Applications, 59, 235-242(2023).

[7] Yuan Y M, Bai H Y, Guo H W et al. HourglassNet: an improved FCOS algorithm for remote sensing target detection[J]. Journal of Nanjing University of Science and Technology, 46, 719-727, 741(2022).

[8] Redmon J, Farhadi A. YOLOv3: an incremental improvement[EB/OL]. https://arxiv.org/abs/1804.02767

[9] Lou H T, Duan X H, Guo J M et al. DC-YOLOv8: small-size object detection algorithm based on camera sensor[J]. Electronics, 12, 2323(2023).

[10] Neubeck A, Van Gool L. Efficient non-maximum suppression[C], 850-855(2006).

[11] Zheng Z H, Wang P, Liu W et al. Distance-IoU loss: faster and better learning for bounding box regression[J]. Proceedings of the AAAI Conference on Artificial Intelligence, 34, 12993-13000(2020).

[12] Rezatofighi H, Tsoi N, Gwak J et al. Generalized intersection over union: a metric and a loss for bounding box regression[C], 658-666(2020).

[13] Tong Z J, Chen Y H, Xu Z W et al. Wise-IoU: bounding box regression loss with dynamic focusing mechanism[EB/OL]. https://arxiv.org/abs/2301.10051

[14] Hu J, Shen L, Sun G. Squeeze-and-excitation networks[C], 7132-7141(2018).

[15] Woo S, Park J, Lee J Y et al. CBAM: convolutional block attention module[M]//Ferrari V, Hebert M, Sminchisescu C et al. Computer vision-ECCV 2018. Lecture notes in computer science, 11211, 3-19(2018).

[16] Liu Y C, Shao Z R, Hoffmann N. Global attention mechanism: retain information to enhance channel-spatial interactions[EB/OL]. https://arxiv.org/abs/2112.05561

[17] Tolstikhin I, Houlsby N, Kolesnikov A et al. MLP-mixer: an all-MLP architecture for vision[EB/OL]. https://arxiv.org/abs/2105.01601

[18] He K M, Zhang X Y, Ren S Q et al. Deep residual learning for image recognition[C], 770-778(2016).

[19] Dai J F, Qi H Z, Xiong Y W et al. Deformable convolutional networks[C], 764-773(2017).

[20] Zhu X Z, Hu H, Lin S et al. Deformable ConvNets V2: more deformable, better results[C], 9300-9308(2019).

[21] Xia G S, Bai X, Ding J et al. DOTA: a large-scale dataset for object detection in aerial images[C], 3974-3983(2018).

[22] Xiao Z F, Liu Q, Tang G F et al. Elliptic Fourier transformation-based histograms of oriented gradients for rotationally invariant object detection in remote-sensing images[J]. International Journal of Remote Sensing, 36, 618-644(2015).
