• Electronics Optics & Control
  • Vol. 29, Issue 2, 53 (2022)
DAI Xiaoqing1 and ZHAO Xu2
Author Affiliations
  • 1[in Chinese]
  • 2[in Chinese]
    DOI: 10.3969/j.issn.1671-637x.2022.02.012
    DAI Xiaoqing, ZHAO Xu. An Online Q-Learning Algorithm for a Model-Free Infinite Horizon System[J]. Electronics Optics & Control, 2022, 29(2): 53
    References

    [1] MNIH V, KAVUKCUOGLU K, SILVER D, et al. Human-level control through deep reinforcement learning[J]. Nature, 2015, 518: 529-533.

    [2] LILLICRAP T P, HUNT J J, PRITZEL A, et al. Continuous control with deep reinforcement learning[J]. IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences, 2015. doi: 10.1016/S1098-3015(10)67722-4.

    [3] MOGHADAM R, LEWIS F L. Output-feedback H∞ quadratic tracking control of linear systems using reinforcement learning[J]. International Journal of Adaptive Control & Signal Processing, 2019, 33(2): 628-640.

    [5] SUTTON R S, BARTO A G. Reinforcement learning: an introduction[M]. Cambridge, Massachusetts: MIT Press, 1998.

    [8] REN H, ZHANG H, WEN Y, et al. Integral reinforcement learning off-policy method for solving nonlinear multi-player nonzero-sum games with saturated actuator[J]. Neurocomputing, 2019, 335(28): 96-104.

    [9] ARAGON-GMEZ R, CLEMPNER J B.Traffic-signal control reinforcement learning approach for continuous-time Markov games[J].Engineering Applications of Artificial Intelligence, 2020, 89: 103415.

    [13] JIANG Y, JIANG Z P. Computational adaptive optimal control for continuous-time linear systems with completely unknown dynamics[J]. Automatica, 2012, 48(10): 2699-2704.
