• Electronics Optics & Control
  • Vol. 29, Issue 10, 29 (2022)
ZHAO Qi1, ZHEN Ziyang1, GONG Huajun1, HU Zhou2, and DONG Aixin1
Author Affiliations
  • 1[in Chinese]
  • 2[in Chinese]
    DOI: 10.3969/j.issn.1671-637x.2022.10.006
    ZHAO Qi, ZHEN Ziyang, GONG Huajun, HU Zhou, DONG Aixin. UAV Formation Control Based on Deep Reinforcement Learning[J]. Electronics Optics & Control, 2022, 29(10): 29
    References

    [3] XU Y,ZHEN Z.Multivariable adaptive distributed leader-follower flight control for multiple UAVs formation[J].The Aeronautical Journal,2017,121(1241):877-900.

    [6] HUNG S M,GIVIGI S N.A Q-learning approach to flocking with UAVs in a stochastic environment[J].IEEE Transactions on Cybernetics,2017,47(1):186-197.

    [7] WANG C,WANG J,ZHANG X.A deep reinforcement learning approach to flocking and navigation of UAVs in large-scale complex environment[C]//IEEE Global Conference on Signal and Information Processing.Anaheim,CA:IEEE,2018:1228-1232.

    [8] WANG C,YAN C,XIANG X,et al.A continuous actor-critic reinforcement learning approach to flocking with fixed-wing UAVs[C]//The Eleventh Asian Conference on Machine Learning.[S.l.]:ACML,2019:64-79.

    [9] HUNG S M,GIVIGI S N,NOURELDIN A.A Dyna-Q(λ) approach to flocking with fixed-wing UAVs in a stochastic environment[C]//IEEE International Conference on Systems,Man,and Cybernetics.Hong Kong:IEEE,2015:1918-1923.

    [11] PACHTER M,D’AZZO J J,DARGAN J L.Automatic formation flight control[J].Journal of Guidance,Control,and Dynamics,1994,17(6):1380-1383.

    [13] VAN HASSELT H,GUEZ A,SILVER D.Deep reinforcement learning with double Q-learning[C]//Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence.Phoenix:AAAI,2016:2094-2100.

    [14] KRSE B J A.Learning from delayed rewards[J].Robotics and Autonomous Systems,1995,15(4):233-235.

    [15] MNIH V,KAVUKCUOGLU K,SILVER D,et al.Human-level control through deep reinforcement learning[J].Nature,2015,518(7540):529-533.