Civil Drone Detection Based on Deep Convolutional Neural Networks: a Survey

Xin YANG; Gang WANG; Liang LI; Shaogang LI; Jin GAO; Yizheng WANG

[1] WANG J, LIU Y, SONG H. Counter-unmanned aircraft system(s) (C-UAS): State of the art, challenges, and future trends[J]. IEEE Aerospace and Electronic Systems Magazine, 2021, 36(3): 4-29.

[2] LI Xiaoping, LEI Songze, ZHANG Boxing, et al. Fast aerial UAV detection using improved inter-frame difference and SVM[C]//Journal of Physics: Conference Series. IOP Publishing, 2019, 1187(3): 032082.

[3] WANG C, WANG T, WANG E, et al. Flying small target detection for anti-UAV based on a Gaussian mixture model in a compressive sensing domain[J]. Sensors, 2019, 19(9): 2168.

[4] Seidaliyeva U, Akhmetov D, Ilipbayeva L, et al. Real-time and accurate drone detection in a video with a static background[J]. Sensors, 2020, 20(14): 3856.

[5] ZHAO W, CHEN X, CHENG J, et al. An application of scale-invariant feature transform in iris recognition[C]//Proceedings of the IEEE/ACIS 12th International Conference on Computer and Information Science, IEEE, 2013: 219-222.

[6] SHU C, DING X, FANG C. Histogram of the oriented gradient for face recognition[J]. Tsinghua Science and Technology, 2011, 16(2): 216-224.

[7] SHEN Y K, CHIU C T. Local binary pattern orientation based face recognition[C]//Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, IEEE, 2015: 1091-1095.

[8] YUAN Xiaofang, WANG Yaonan. Parameter selection of support vector machine for function approximation based on chaos optimization[J]. Journal of Systems Engineering and Electronics, 2008, 19(1): 191-197.

[9] FENG J, WANG L, Sugiyama M, et al. Boosting and margin theory[J]. Frontiers of Electrical and Electronic Engineering, 2012, 7(1): 127-133.

[10] WEI L, HONG Z, Gui-Jin H. NMS-based blurred image sub-pixel registration[C]//Proceedings of the International Conference on Image Analysis and Signal Processing. IEEE, 2011: 98-101.

[12] Bosquet B, Mucientes M, Brea V M. STDNet: exploiting high resolution feature maps for small object detection[J]. Engineering Applications of Artificial Intelligence, 2020, 91: 103615.

[13] SUN H, YANG J, SHEN J, et al. TIB-Net: Drone detection network with tiny iterative backbone[J]. IEEE Access, 2020, 8: 130697-130707.

[14] LIU L, OUYANG W, WANG X, et al. Deep learning for generic object detection: a survey[J]. International Journal of Computer Vision, 2020, 128(2): 261-318.

[15] Girshick R, Donahue J, Darrell T, et al. Rich feature hierarchies for accurate object detection and semantic segmentation[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2014: 580-587.

[16] Krizhevsky A, Sutskever I, Hinton G E. ImageNet classification with deep convolutional neural networks[C]//Proceedings of the Advances in Neural Information Processing Systems, 2012, 25: 1097-1105.

[17] HE K, ZHANG X, REN S, et al. Spatial pyramid pooling in deep convolutional networks for visual recognition[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2015, 37(9): 1904-1916.

[18] Girshick R. Fast R-CNN[C]//Proceedings of the IEEE International Conference on Computer Vision, 2015: 1440-1448.

[19] Zeiler M D, Fergus R. Visualizing and understanding convolutional networks[C]//Proceedings of the European Conference on Computer Vision, 2014: 818-833.

[20] Simonyan K, Zisserman A. Very deep convolutional networks for large-scale image recognition[J/OL]. arXiv preprint arXiv:1409.1556, 2014.

[21] REN S, HE K, Girshick R, et al. Faster R-CNN: Towards real-time object detection with region proposal networks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2016, 39(6): 1137-1149.

[22] Bell S, Lawrence Zitnick C, Bala K, et al. Inside-outside net: detecting objects in context with skip pooling and recurrent neural networks[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016: 2874-2883.

[23] LE Q V, Jaitly N, Hinton G E. A simple way to initialize recurrent networks of rectified linear units[J/OL]. arXiv preprint arXiv:1504.00941, 2015.

[24] DAI J, LI Y, HE K, et al. R-FCN: Object detection via region-based fully convolutional networks[J/OL]. arXiv preprint arXiv:1605.06409, 2016.

[25] HE K, ZHANG X, REN S, et al. Deep residual learning for image recognition[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016: 770-778.

[26] LIN T Y, Dollár P, Girshick R, et al. Feature pyramid networks for object detection[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017: 2117-2125.

[27] HE K, Gkioxari G, Dollár P, et al. Mask R-CNN[C]//Proceedings of the IEEE International Conference on Computer Vision, 2017: 2961-2969.

[28] XIE S, Girshick R, Dollár P, et al. Aggregated residual transformations for deep neural networks[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017: 1492-1500.

[29] LIU S, QI L, QIN H, et al. Path aggregation network for instance segmentation[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018: 8759-8768.

[30] LI Y, CHEN Y, WANG N, et al. Scale-aware trident networks for object detection[C]//Proceedings of the IEEE International Conference on Computer Vision, 2019: 6054-6063.

[31] DUAN K, XIE L, QI H, et al. Corner proposal network for anchor-free, two-stage object detection[C]//European Conference on Computer Vision. Springer, Cham, 2020: 399-416.

[32] Newell A, YANG K, DENG J. Stacked hourglass networks for human pose estimation[C]//Proceedings of the European Conference on Computer Vision, 2016: 483-499.

[33] Redmon J, Divvala S, Girshick R, et al. You only look once: unified, real-time object detection[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016: 779-788.

[34] Szegedy C, LIU W, JIA Y, et al. Going deeper with convolutions[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015: 1-9.

[35] LIU W, Anguelov D, Erhan D, et al. SSD: single shot multibox detector[C]//Proceedings of the European Conference on Computer Vision. Springer, 2016: 21-37.

[36] Redmon J, Farhadi A. YOLO9000: Better, faster, stronger[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017: 7263-7271.

[37] LIN T Y, Goyal P, Girshick R, et al. Focal loss for dense object detection[C]//Proceedings of the IEEE International Conference on Computer Vision, 2017: 2980-2988.

[38] Redmon J, Farhadi A. YOLOv3: An incremental improvement[J/OL]. arXiv preprint arXiv:1804.02767, 2018.

[39] ZHOU P, NI B, GENG C, et al. Scale-transferrable object detection[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018: 528-537.

[40] HUANG G, LIU Z, Van Der Maaten L, et al. Densely connected convolutional networks[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017: 4700-4708.

[41] LAW H, DENG J. CornerNet: Detecting objects as paired keypoints[C]//Proceedings of the European Conference on Computer Vision, 2018: 734-750.

[42] Bochkovskiy A, WANG C Y, LIAO H Y M. YOLOv4: Optimal speed and accuracy of object detection[J/OL]. arXiv preprint arXiv:2004.10934, 2020.

[43] Carion N, Massa F, Synnaeve G, et al. End-to-end object detection with transformers[C]//European Conference on Computer Vision. Springer, Cham, 2020: 213-229.

[44] JIANG N, WANG K, PENG X, et al. Anti-UAV: A large multi-modal benchmark for UAV tracking[J/OL]. arXiv preprint arXiv:2101.08466, 2021.

[45] ZHAO J, WANG G, LI J, et al. The 2nd Anti-UAV Workshop & Challenge: Methods and results[J/OL]. arXiv preprint arXiv:2108.09909, 2021.

[46] Coluccia A, Fascista A, Schumann A, et al. Drone-vs-Bird detection challenge at IEEE AVSS2019[C]//Proceedings of the IEEE International Conference on Advanced Video and Signal Based Surveillance. IEEE, 2019: 1-7.

[47] WU M, XIE W, SHI X, et al. Real-time drone detection using deep learning approach[C]//Proceedings of the International Conference on Machine Learning and Intelligent Communications, 2018: 22-32.

[48] ZHAO W, ZHANG Q, LI H, et al. Low-altitude UAV detection method based on one-staged detection framework[C]//Proceedings of the International Conference on Advances in Computer Technology, Information Science and Communications. IEEE, 2020: 112-117.

[49] Magoulianitis V, Ataloglou D, Dimou A, et al. Does deep super-resolution enhance UAV detection?[C]//Proceedings of the IEEE International Conference on Advanced Video and Signal Based Surveillance. IEEE, 2019: 1-6.

[50] Kim J, Kwon Lee J, Mu Lee K. Accurate image super-resolution using very deep convolutional networks[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016: 1646-1654.

[51] Craye C, Ardjoune S. Spatio-temporal semantic segmentation for drone detection[C]//Proceedings of the IEEE International Conference on Advanced Video and Signal Based Surveillance. IEEE, 2019: 1-5.

[52] Ronneberger O, Fischer P, Brox T. U-Net: Convolutional networks for biomedical image segmentation[C]//Proceedings of the International Conference on Medical Image Computing and Computer-assisted Intervention, 2015: 234-241.

[53] Aker C. End-to-end Networks for Detection and Tracking of Micro Unmanned Aerial Vehicles[D]. Ankara, Turkey: Middle East Technical University, 2018.

[56] Cohen M B, Elder S, Musco C, et al. Dimensionality reduction for k-means clustering and low rank approximation[C]//Proceedings of the Forty-Seventh Annual ACM Symposium on Theory of Computing, 2015: 163-172.

[57] Saqib M, Khan S D, Sharma N, et al. A study on detecting drones using deep convolutional neural networks[C]//Proceedings of the IEEE International Conference on Advanced Video and Signal Based Surveillance. IEEE, 2017: 1-5.

[58] Nalamati M, Kapoor A, Saqib M, et al. Drone detection in long-range surveillance videos[C]//Proceedings of the IEEE International Conference on Advanced Video and Signal Based Surveillance. IEEE, 2019: 1-6.

[59] Aker C, Kalkan S. Using deep networks for drone detection[C]//Proceedings of the IEEE International Conference on Advanced Video and Signal Based Surveillance. IEEE, 2017: 1-6.

[63] CUI Z, YANG J, JIANG S, et al. An infrared small target detection algorithm based on high-speed local contrast method[J]. Infrared Physics & Technology, 2016, 76: 474-481.

[64] ZHAO Y, PAN H, DU C, et al. Bilateral two-dimensional least mean square filter for infrared small target detection[J]. Infrared Physics & Technology, 2014, 65: 17-23.

[65] Lange H. Real-time contrasted target detection for IR imagery based on a multiscale top hat filter[C]//Signal Processing, Sensor Fusion, and Target Recognition VIII. International Society for Optics and Photonics, 1999, 3720: 214-226.

[66] BAI X, ZHOU F, ZHANG S, et al. Top-Hat by the reconstruction operation-based infrared small target detection[C]//Proceedings of the International Conference in Electrics, Communication and Automatic Control Proceedings, 2012: 867-873.

[70] Horn B K P, Schunck B G. Determining optical flow[C]//Techniques and Applications of Image Understanding. International Society for Optics and Photonics, 1981, 281: 319-331.

[71] Lucas B D, Kanade T. An iterative image registration technique with an application to stereo vision[C]//Proceedings of the International Joint Conference on Artificial Intelligence, 1981: 674-679.

[72] Dosovitskiy A, Fischer P, Ilg E, et al. FlowNet: Learning optical flow with convolutional networks[C]//Proceedings of the IEEE International Conference on Computer Vision, 2015: 2758-2766.

[73] Ilg E, Mayer N, Saikia T, et al. FlowNet 2.0: Evolution of optical flow estimation with deep networks[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017: 2462-2470.

[74] Teed Z, DENG J. RAFT: Recurrent all-pairs field transforms for optical flow[C]//Proceedings of the European Conference on Computer Vision, 2020: 402-419.

[75] ZHU X, XIONG Y, DAI J, et al. Deep feature flow for video recognition[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017: 2349-2358.

[76] ZHU X, WANG Y, DAI J, et al. Flow-guided feature aggregation for video object detection[C]//Proceedings of the IEEE International Conference on Computer Vision, 2017: 408-417.

[77] Rozantsev A, Lepetit V, Fua P. Flying objects detection from a single moving camera[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015: 4128-4136.

[78] Bertinetto L, Valmadre J, Henriques J F, et al. Fully-convolutional siamese networks for object tracking[C]//Proceedings of the European Conference on Computer Vision, 2016: 850-865.

[79] Stewart R, Andriluka M, Ng A Y. End-to-end people detection in crowded scenes[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016: 2325-2333.

[80] ZHAO B, ZHAO B, TANG L, et al. Deep spatial-temporal joint feature representation for video object detection[J]. Sensors, 2018, 18(3): 774.