UAV Formation Control Based on Deep Reinforcement Learning

ZHAO Qi; ZHEN Ziyang; GONG Huajun; HU Zhou; DONG Aixin

doi:10.3969/j.issn.1671-637x.2022.10.006

[3] XU Y, ZHEN Z. Multivariable adaptive distributed leader-follower flight control for multiple UAVs formation[J]. The Aeronautical Journal, 2017, 121(1241): 877-900.

[6] HUNG S M, GIVIGI S N. A Q-learning approach to flocking with UAVs in a stochastic environment[J]. IEEE Transactions on Cybernetics, 2017, 47(1): 186-197.

[7] WANG C, WANG J, ZHANG X. A deep reinforcement learning approach to flocking and navigation of UAVs in large-scale complex environment[C]//IEEE Global Conference on Signal and Information Processing. Anaheim, CA: IEEE, 2018: 1228-1232.

[8] WANG C, YAN C, XIANG X, et al. A continuous actor-critic reinforcement learning approach to flocking with fixed-wing UAVs[C]//The Eleventh Asian Conference on Machine Learning. [S.l.]: ACML, 2019: 64-79.

[9] HUNG S M, GIVIGI S N, NOURELDIN A. A Dyna-Q(λ) approach to flocking with fixed-wing UAVs in a stochastic environment[C]//IEEE International Conference on Systems, Man, and Cybernetics. Hong Kong: IEEE, 2015: 1918-1923.

[11] PACHTER M, D'AZZO J J, DARGAN J L. Automatic formation flight control[J]. Journal of Guidance, Control, and Dynamics, 1994, 17(6): 1380-1383.

[13] VAN HASSELT H, GUEZ A, SILVER D. Deep reinforcement learning with double Q-learning[C]//Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence. Phoenix: AAAI, 2016: 2094-2100.

[14] KRSE B J A.Learning from delayed rewards［J］.Robotics and Autonomous Systems,1995,15(4):233-235.

[15] MNIH V, KAVUKCUOGLU K, SILVER D, et al. Human-level control through deep reinforcement learning[J]. Nature, 2015, 518(7540): 529-533.
