• Electronics Optics & Control
  • Vol. 30, Issue 8, 1 (2023)
YANG Xiuxia, WANG Chenlei, ZHANG Yi, YU Hao, and JIANG Zijie
    DOI: 10.3969/j.issn.1671-637x.2023.08.001
    YANG Xiuxia, WANG Chenlei, ZHANG Yi, YU Hao, JIANG Zijie. UAV Path Planning Based on Reverse Reinforcement Learning[J]. Electronics Optics & Control, 2023, 30(8): 1

    Abstract

    In planning safe, collision-free paths for UAVs, the Deep Deterministic Policy Gradient (DDPG) algorithm suffers from slow convergence and from the difficulty of designing a reward function. To address these problems, a UAV path planning algorithm based on inverse (reverse) reinforcement learning that integrates expert demonstration trajectories is proposed. First, a dataset of demonstration trajectories is collected in simulator software while an expert manipulates the UAV to avoid obstacles. Second, a hybrid sampling mechanism mixes high-quality expert demonstration data into the self-exploration data when updating the network parameters, which reduces the exploration cost of the algorithm. Finally, the maximum entropy inverse reinforcement learning algorithm is used to compute the optimal reward function implied by the experts' experience, which solves the problem that a reward function is difficult to design for complex tasks. Comparative experimental results show that the improved algorithm trains more efficiently and achieves better obstacle avoidance performance.
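
    The abstract gives no implementation details, so the following is only a minimal sketch of how the hybrid sampling step and the maximum entropy reward update might look. All names (`hybrid_sample`, `maxent_reward_gradient`, `expert_ratio`) are hypothetical. The fixed mixing ratio, the linear reward over trajectory features, and the finite set of sampled trajectories used to approximate the model expectation are assumptions made for illustration, not the authors' method.

```python
import math
import random


def hybrid_sample(expert_buffer, agent_buffer, batch_size,
                  expert_ratio=0.25, rng=random):
    """Draw a mini-batch that mixes expert and self-explored transitions.

    expert_ratio is a hypothetical knob (fraction of the batch taken from
    expert demonstrations); the paper's actual value is not given.
    """
    if not 0.0 <= expert_ratio <= 1.0:
        raise ValueError("expert_ratio must be in [0, 1]")
    # Cap each share by what the corresponding buffer actually holds.
    n_expert = min(int(round(batch_size * expert_ratio)), len(expert_buffer))
    n_agent = min(batch_size - n_expert, len(agent_buffer))
    batch = rng.sample(expert_buffer, n_expert) + rng.sample(agent_buffer, n_agent)
    rng.shuffle(batch)  # avoid a fixed expert-first ordering in the batch
    return batch


def maxent_reward_gradient(expert_features, sampled_features, theta):
    """Gradient of the MaxEnt IRL log-likelihood for a linear reward.

    Assumes reward(tau) = theta . f(tau) and approximates the model
    distribution P(tau) ~ exp(theta . f(tau)) over a finite set of
    sampled trajectories. Returns (expert mean features) minus
    (model-expected features).
    """
    dim = len(theta)
    scores = [sum(t * f for t, f in zip(theta, feats))
              for feats in sampled_features]
    top = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    expert_mean = [sum(f[k] for f in expert_features) / len(expert_features)
                   for k in range(dim)]
    model_mean = [sum(w * f[k] for w, f in zip(weights, sampled_features)) / total
                  for k in range(dim)]
    return [e - m for e, m in zip(expert_mean, model_mean)]
```

    Under these assumptions, a training loop would call `hybrid_sample` for each DDPG update and periodically take a gradient ascent step on `theta` using `maxent_reward_gradient`, so that the learned reward moves toward explaining the expert trajectories.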