[3] ZENG Z, SAMMUT K, LIAN L, et al. A comparison of optimization techniques for AUV path planning in environments with ocean currents[J]. Robotics and Autonomous Systems, 2016, 82(C): 61-72.
[5] XIN J, ZHAO H, LIU D, et al. Application of deep reinforcement learning in mobile robot path planning[C]//Chinese Automation Congress (CAC). Jinan: CAC, 2017: 7112-7116.
[8] ISHIGURO H, HAGITA N, SHINOZAWA K, et al. Behavior selection and environment recognition methods for humanoids based on sensor history[C]//IEEE/RSJ International Conference on Intelligent Robots and Systems. Beijing: IEEE, 2006: 3468-3473.
[9] BROCKMAN G, CHEUNG V, PETTERSSON L, et al. OpenAI Gym[J]. arXiv preprint arXiv:1606.01540, 2016.
[11] SCHULMAN J, WOLSKI F, DHARIWAL P, et al. Proximal policy optimization algorithms[J]. arXiv preprint arXiv:1707.06347, 2017.
[12] RAFFIN A, HILL A, ERNESTUS M, et al. Stable Baselines3[Z]. GitHub repository, 2019.