[3] YIN H, YANG Y F, ZHAO Y. Self-tuning controller design for a 2-DOF flight attitude simulator[J]. Electric Machines and Control, 2018, 22(4): 105-112. (in Chinese)
[5] RECHT B. A tour of reinforcement learning: the view from continuous control[Z/OL]. https://arxiv.org/pdf/1806.09460.pdf. [2018-09-09].
[6] LIU D, YANG G H. Model-free adaptive control design for nonlinear discrete-time processes with reinforcement learning techniques[J]. International Journal of Systems Science, 2018, 49(11): 2298-2308.
[7] LEVINE S. Reinforcement learning and control as probabilistic inference: tutorial and review[Z/OL]. https://arxiv.org/abs/1805.00909. [2018-05-20].
[8] ZHANG T. Research on Path Planning Method of Quadrotor UAV Based on Reinforcement Learning[D]. Harbin: Harbin Institute of Technology, 2018. (in Chinese)
[9] KAELBLING L P, LITTMAN M L, MOORE A W. Reinforcement learning: a survey[J]. Journal of Artificial Intelligence Research, 1996, 4(1): 237-285.
[10] CHUA K, CALANDRA R, MCALLISTER R, et al. Deep reinforcement learning in a handful of trials using probabilistic dynamics models[Z/OL]. https://arxiv.org/abs/1805.12114. [2018-11-02].
[11] DEISENROTH M, RASMUSSEN C. PILCO: A model-based and data-efficient approach to policy search[C]. International Conference on Machine Learning. Omnipress, 2011.
[12] SUTTON R S, BARTO A G. Reinforcement Learning: An Introduction[M]. Second Edition. London: The MIT Press, 2016: 78-88.
[13] DEISENROTH M P, RASMUSSEN C E, FOX D. Learning to control a low-cost manipulator using data-efficient reinforcement learning[C]. Robotics: Science and Systems VII. MIT Press, 2011.
[14] DEISENROTH M P. Efficient Reinforcement Learning using Gaussian Processes[D]. Karlsruhe: Karlsruhe Institute of Technology, 2015.