[2] KONDA V R, TSITSIKLIS J N. Actor-critic algorithms[C]//Advances in Neural Information Processing Systems, 2000: 1008-1014.
[3] LILLICRAP T P, HUNT J J, PRITZEL A, et al. Continuous control with deep reinforcement learning[J]. arXiv preprint arXiv:1509.02971, 2015.