[1] JISHA V R, GHOSE D.Frontier based goal seeking for robots in unknown environments[J].Journal of Intelligent & Robotic Systems, 2012, 67(3/4):229-254.
[2] PANOV A I, YAKOVLEV K S, SUVOROV R.Grid path planning with deep reinforcement learning:preliminary results[J].Procedia Computer Science, 2018, 123:347-353.
[3] ZHELO O, ZHANG J W, TAI L, et al.Curiosity-driven exploration for mapless navigation with deep reinforcement learning[EB/OL].(2018-05-14)[2022-03-05].https://arxiv.org/abs/1804.00456.
[4] WANG G Q, ZHENG X Y, ZHAO H T, et al.Unmanned aerial vehicles path planning based on deep reinforcement learning[C]//The International Conference on Natural Computation, Fuzzy Systems and Knowledge Discovery.Cham: Springer, 2019:81-88.
[8] HASSELT H V, GUEZ A, SILVER D.Deep reinforcement learning with double Q-learning[EB/OL].(2018-05-14)[2022-03-05].https://arxiv.org/abs/1509.06461.
[9] HOCHREITER S, SCHMIDHUBER J.Long short-term memory[J].Neural Computation, 1997, 9(8):1735-1780.
[10] SCHAUL T, QUAN J, ANTONOGLOU I, et al.Prioritized experience replay[EB/OL].(2016-02-25)[2022-03-05].https://arxiv.org/abs/1511.05952.
[11] HOU Y N, LIU L F, WEI Q, et al.A novel DDPG method with prioritized experience replay[C]//IEEE International Conference on Systems, Man, and Cybernetics (SMC).Banff, AB:IEEE, 2017:316-321.