Journal of Electronic Science and Technology
Vol. 23, Issue 1, 100290 (2025)
Jian-Dong Yao, Wen-Bin Hao, Zhi-Gao Meng*, Bo Xie, Jian-Hua Chen, and Jia-Qi Wei
Author Affiliations
  • State Grid Sichuan Electric Power Company Chengdu Power Supply Company, Chengdu, 610041, China
    DOI: 10.1016/j.jnlest.2024.100290
    Jian-Dong Yao, Wen-Bin Hao, Zhi-Gao Meng, Bo Xie, Jian-Hua Chen, Jia-Qi Wei. Adaptive multi-agent reinforcement learning for dynamic pricing and distributed energy management in virtual power plant networks[J]. Journal of Electronic Science and Technology, 2025, 23(1): 100290
    References

    [1] Asmus P.. Microgrids, virtual power plants and our distributed energy future. Electr. J., 23, 72-82(2010).

    [2] Pudjianto D., Ramsay C., Strbac G.. Virtual power plant and system integration of distributed energy resources. IET Renew. Power Gen., 1, 10-16(2007).

    [3] Yavuz L., Önen A., Muyeen S.M., Kamwa I.. Transformation of microgrid to virtual power plant―a comprehensive review. IET Gener. Transm. Dis., 13, 1994-2005(2019).

    [4] P. Lombardi, M. Powalko, K. Rudion, Optimal operation of a virtual power plant, in: Proc. of IEEE Power & Energy Society General Meeting, Calgary, Canada, 2009, pp. 1–6.

    [5] Zamani A.G., Zakariazadeh A., Jadid S.. Day-ahead resource scheduling of a renewable energy based virtual power plant. Appl. Energ., 169, 324-340(2016).

    [6] Khaksari A., Steriotis K., Makris P., Tsaousoglou G., Efthymiopoulos N., Varvarigos E.. Electricity market equilibria analysis on the value of demand-side flexibility portfolios’ mix and the strategic demand aggregators’ market power. Sustain. Energy Grids, 38, 101399(2024).

    [7] Yu S.-Y., Fang F., Liu Y.-J., Liu J.-Z.. Uncertainties of virtual power plant: problems and countermeasures. Appl. Energ., 239, 454-470(2019).

    [8] Y.H. Peng, Big Data, Machine Learning Challenges of High Dimensionality in Financial Administration, Ph.D. dissertation, University of Brasilia, Brasilia, 2019.

    [9] Tushar W., Chai B., Yuen C., Smith D.B., Wood K.L., Yang Z.-Y.. Three-party energy management with distributed energy resources in smart grid. IEEE T. Ind. Electron., 62, 2487-2498(2015).

    [10] Kazemi M., Zareipour H., Amjady N., Rosehart W.D., Ehsan M.. Operation scheduling of battery storage systems in joint energy and ancillary services markets. IEEE T. Sustain. Energ., 8, 1726-1735(2017).

    [11] Wang H., Huang J.-W.. Incentivizing energy trading for interconnected microgrids. IEEE T. Smart Grid, 9, 2647-2657(2018).

    [12] Tushar W., Yuen C., Mohsenian-Rad H., Saha T., Poor H.V., Wood K.L.. Transforming energy networks via peer-to-peer energy trading: the potential of game-theoretic approaches. IEEE Signal Proc. Mag., 35, 90-111(2018).

    [13] Wang Z.-Y., Chen B.-K., Wang J.-H., Begovic M.M., Chen C.. Coordinated energy management of networked microgrids in distribution systems. IEEE T. Smart Grid, 6, 45-53(2015).

    [14] Wang Y.-P., Saad W., Han Z., Poor H.V., Başar T.. A game-theoretic approach to energy trading in the smart grid. IEEE T. Smart Grid, 5, 1439-1450(2014).

    [15] Mohsenian-Rad A.H., Wong V.W.S., Jatskevich J., Schober R., Leon-Garcia A.. Autonomous demand-side management based on game-theoretic energy consumption scheduling for the future smart grid. IEEE T. Smart Grid, 1, 320-331(2010).

    [16] Saad W., Han Z., Poor H.V., Başar T.. Game-theoretic methods for the smart grid: an overview of microgrid systems, demand-side management, and smart grid communications. IEEE Signal Proc. Mag., 29, 86-105(2012).

    [17] Papavasiliou A., Mou Y.-T., Cambier L., Scieur D.. Application of stochastic dual dynamic programming to the real-time dispatch of storage under renewable supply uncertainty. IEEE T. Sustain. Energ., 9, 547-558(2019).

    [18] L. Buşoniu, R. Babuška, B. De Schutter, Multi-agent reinforcement learning: an overview, in: D. Srinivasan, L.C. Jain (Eds.), Innovations in Multi-Agent Systems and Applications – 1, Springer, Berlin, Germany, 2010, pp. 183–221.

    [19] J.N. Foerster, I.M. Assael, N. de Freitas, S. Whiteson, Learning to communicate with deep multi-agent reinforcement learning, in: Proc. of the 30th Intl. Conf. on Neural Information Processing Systems, Red Hook, USA, 2016, pp. 2145–2153.

    [20] K.Q. Zhang, Z.R. Yang, T. Başar, Multi-agent reinforcement learning: a selective overview of theories and algorithms, in: K.G. Vamvoudakis, Y. Wan, F.L. Lewis, D. Cansever (Eds.), Handbook of Reinforcement Learning and Control, Springer, Cham, Switzerland, 2021, pp. 321–384.

    [21] Nguyen T.T., Nguyen N.D., Nahavandi S.. Deep reinforcement learning for multiagent systems: a review of challenges, solutions, and applications. IEEE T. Cybernetics, 50, 3826-3839(2020).

    [22] Hu J.-L., Wellman M.P.. Nash Q-learning for general-sum stochastic games. J. Mach. Learn. Res., 4, 1039-1069(2003).

    [23] M. Lauer, M.A. Riedmiller, An algorithm for distributed reinforcement learning in cooperative multi-agent systems, in: Proc. of the 17th Intl. Conf. on Machine Learning, Stanford, USA, 2000, pp. 535–542.

    [24] R. Lowe, Y. Wu, A. Tamar, J. Harb, P. Abbeel, I. Mordatch, Multi-agent actor-critic for mixed cooperative-competitive environments, in: Proc. of the 31st Intl. Conf. on Neural Information Processing Systems, Long Beach, USA, 2017, pp. 6382–6393.

    [25] M. Tokic, G. Palm, Value-difference based exploration: adaptive control between epsilon-greedy and softmax, in: Proc. of the 34th Annual German Conf. on AI, Berlin, Germany, 2011, pp. 335–346.

    [26] Wu H.-C., Qiu D.-W., Zhang L.-Y., Sun M.-Y.. Adaptive multi-agent reinforcement learning for flexible resource management in a virtual power plant with dynamic participating multi-energy buildings. Appl. Energ., 374, 123998(2024).

    [27] Liu W.-R., Zhuang P., Liang H., Peng J., Huang Z.-W.. Distributed economic dispatch in microgrids based on cooperative reinforcement learning. IEEE T. Neur. Net. Lear., 29, 2192-2203(2018).

    [28] Xu H.-C., Sun H.-B., Nikovski D., Kitamura S., Mori K., Hashimoto H.. Deep reinforcement learning for joint bidding and pricing of load serving entity. IEEE T. Smart Grid, 10, 6366-6375(2019).

    [29] Chen T., Su W.-C.. Indirect customer-to-customer energy trading with reinforcement learning. IEEE T. Smart Grid, 10, 4338-4348(2019).

    [30] Wang H.-W., Huang T.-W., Liao X.-F., Abu-Rub H., Chen G.. Reinforcement learning in energy trading game among smart microgrids. IEEE T. Ind. Electron., 63, 5109-5119(2016).

    [31] Y.D. Yang, J.Y. Hao, M.Y. Sun, Z. Wang, C.J. Fan, G. Strbac, Recurrent deep multiagent Q-learning for autonomous brokers in smart grid, in: Proc. of the 27th Intl. Joint Conf. on Artificial Intelligence, Stockholm, Sweden, 2018, pp. 569–575.

    [32] Vázquez-Canteli J.R., Nagy Z.. Reinforcement learning for demand response: a review of algorithms and modeling techniques. Appl. Energ., 235, 1072-1089(2019).

    [33] Tushar W., Saha T.K., Yuen C. et al. A motivational game-theoretic approach for peer-to-peer energy trading in the smart grid. Appl. Energ., 243, 10-20(2019).

    [34] Ye Y.-J., Qiu D.-W., Sun M.-Y., Papadaskalopoulos D., Strbac G.. Deep reinforcement learning for strategic bidding in electricity markets. IEEE T. Smart Grid, 11, 1343-1355(2020).

    [35] Yang Q.-L., Wang G., Sadeghi A., Giannakis G.B., Sun J.. Two-timescale voltage control in distribution grids using deep reinforcement learning. IEEE T. Smart Grid, 11, 2313-2323(2019).

    [36] Mbuwir B.V., Ruelens F., Spiessens F., Deconinck G.. Battery energy management in a microgrid using batch reinforcement learning. Energies, 10, 1846(2017).

    [37] Mocanu E., Mocanu D.C., Nguyen P.H., Liotta A., Webber M.E., Gibescu M.. On-line building energy optimization using deep reinforcement learning. IEEE T. Smart Grid, 10, 3698-3708(2019).

    [38] Da Silva F.L., Nishida C.E.H., Roijers D.M., Costa A.H.R.. Coordination of electric vehicle charging through multiagent reinforcement learning. IEEE T. Smart Grid, 11, 2347-2356(2020).

    [39] P. Vytelingum, S.D. Ramchurn, T.D. Voice, A. Rogers, N.R. Jennings, Trading agents for the smart electricity grid, in: Proc. of the 9th Intl. Conf. on Autonomous Agents and Multiagent Systems, Toronto, Canada, 2010, pp. 897–904.

    [40] Chakraborty S., Okabe T.. Robust energy storage scheduling for imbalance reduction of strategically formed energy balancing groups. Energy, 114, 405-417(2016).

    [41] Glanois C., Weng P., Zimmer M. et al. A survey on interpretable reinforcement learning. Mach. Learn., 113, 5847-5890(2024).

    [42] G. Dalal, K. Dvijotham, M. Vecerik, T. Hester, C. Paduraru, Y. Tassa, Safe exploration in continuous action spaces [Online]. Available: https://arxiv.org/abs/1801.08757, January 2018.

    [43] Buşoniu L., Babuška R., De Schutter B.. A comprehensive survey of multiagent reinforcement learning. IEEE T. Systems, Man, and Cybernetics, Part C (Applications and Reviews), 38, 156-172(2008).

    [44] Du W., Ding S.. A survey on multi-agent deep reinforcement learning: from the perspective of challenges and applications. Artif. Intell. Rev., 54, 3215-3238(2021).

    [45] Gronauer S., Diepold K.. Multi-agent deep reinforcement learning: a survey. Artif. Intell. Rev., 55, 895-943(2022).

    [46] Lezama F., Soares J., Hernandez-Leal P., Kaisers M., Pinto T., Vale Z.. Local energy markets: paving the path toward fully transactive energy systems. IEEE T. Power Syst., 34, 4081-4088(2019).

    [47] T.P. Lillicrap, J.J. Hunt, A. Pritzel, et al., Continuous control with deep reinforcement learning [Online]. Available: https://arxiv.org/abs/1509.02971, July 2019.

    [48] J.N. Foerster, G. Farquhar, T. Afouras, N. Nardelli, S. Whiteson, Counterfactual multi-agent policy gradients, in: Proc. of the 32nd AAAI Conf. on Artificial Intelligence, New Orleans, USA, 2018, pp. 2974–2982.

    [49] A. Vaswani, N. Shazeer, N. Parmar, et al., Attention is all you need, in: Proc. of the 31st Intl. Conf. on Neural Information Processing Systems, Long Beach, USA, 2017, pp. 6000–6010.

    [50] J.C. Jiang, C. Dun, T.J. Huang, Z.Q. Lu, Graph convolutional reinforcement learning, in: Proc. of the 8th Intl. Conf. on Learning Representations, Addis Ababa, Ethiopia, 2020, pp. 1–13.

    [51] S. Iqbal, F. Sha, Actor-attention-critic for multi-agent reinforcement learning, in: Proc. of the 36th Intl. Conf. on Machine Learning, Long Beach, USA, 2019, pp. 2961–2970.

    [52] T. Schaul, J. Quan, I. Antonoglou, D. Silver, Prioritized experience replay [Online]. Available: https://arxiv.org/abs/1511.05952, February 2016.

    [53] M. Plappert, R. Houthooft, P. Dhariwal, et al., Parameter space noise for exploration, in: Proc. of the 6th Intl. Conf. on Learning Representations, Vancouver, Canada, 2018, pp. 1–18.

    [54] Uhlenbeck G.E., Ornstein L.S.. On the theory of the Brownian motion. Phys. Rev., 36, 823-841(1930).

    [55] D.P. Kingma, J. Ba, Adam: a method for stochastic optimization, in: Proc. of the 3rd Intl. Conf. on Learning Representations, San Diego, USA, 2015, p. 6.

    [56] T.P. Lillicrap, J.J. Hunt, A. Pritzel, et al., Continuous control with deep reinforcement learning, in: Proc. of the 4th Intl. Conf. on Learning Representations, San Juan, Puerto Rico, 2016, pp. 1–14.

    [57] J. Nocedal, S.J. Wright, Numerical Optimization, 2nd ed., Springer, New York, USA, 2006.

    [58] J. Achiam, D. Held, A. Tamar, P. Abbeel, Constrained policy optimization, in: Proc. of the 34th Intl. Conf. on Machine Learning, Sydney, Australia, 2017, pp. 22–31.

    [59] F. Berkenkamp, M. Turchetta, A.P. Schoellig, A. Krause, Safe model-based reinforcement learning with stability guarantees, in: Proc. of the 31st Intl. Conf. on Neural Information Processing Systems, Long Beach, USA, 2017, pp. 908–919.

    [60] Z. Sheebaelhamd, K. Zisis, A. Nisioti, D. Gkouletsos, D. Pavllo, J. Kohler, Safe deep reinforcement learning for multi-agent systems with continuous action spaces [Online]. Available: https://arxiv.org/abs/2108.03952, 2021.

    [61] A.P. Dobos, PVWatts Version 5 Manual, National Renewable Energy Laboratory, Golden, USA, 2014.

    [62] Draxl C., Clifton A., Hodge B.-M., McCaa J.. The Wind Integration National Dataset (WIND) toolkit. Appl. Energ., 151, 355-366(2015).

    [63] S. Ong, N. Clark, Commercial and Residential Hourly Load Profiles for All TMY3 Locations in the United States, DOE Open Energy Data Initiative (OEDI), National Renewable Energy Lab. (NREL), Golden, USA, 2014.

    [64] Weron R.. Electricity price forecasting: a review of the state-of-the-art with a look into the future. Int. J. Forecasting, 30, 1030-1081(2014).

    [65] Dong L., Tu S.-Q., Li Y., Pu T.-J.. A Stackelberg game model for dynamic pricing and energy management of multiple virtual power plants using metamodel-based optimization method. Power Syst. Technol., 44, 973-981(2020).

    [66] Parisio A., Rikos E., Glielmo L.. A model predictive control approach to microgrid operation optimization. IEEE T. Contr. Syst. T., 22, 1813-1827(2014).

    [67] Stock S., Babazadeh D., Becker C.. Applications of artificial intelligence in distribution power system operation. IEEE Access, 9, 150098-150119(2021).

    [68] J. Snoek, H. Larochelle, R.P. Adams, Practical Bayesian optimization of machine learning algorithms, in: Proc. of the 25th Intl. Conf. on Neural Information Processing Systems, Lake Tahoe, USA, 2012, pp. 2951–2959.

    [69] P. Henderson, R. Islam, P. Bachman, J. Pineau, D. Precup, D. Meger, Deep reinforcement learning that matters, in: Proc. of the 32nd AAAI Conf. on Artificial Intelligence, New Orleans, USA, 2018, pp. 3207–3214.

    [70] P. Moritz, R. Nishihara, S. Wang, et al., Ray: a distributed framework for emerging AI applications, in: Proc. of the 13th USENIX Conf. on Operating Systems Design and Implementation, Carlsbad, USA, 2018, pp. 561–577.

    [71] Roesch M., Linder C., Zimmermann R., Rudolf A., Hohmann A., Reinhart G.. Smart grid for industry using multi-agent reinforcement learning. Appl. Sci., 10, 6900(2020).

    [72] Duan J.-J., Shi D., Diao R.-S. et al. Deep-reinforcement-learning-based autonomous voltage control for power grid operations. IEEE T. Power Syst., 35, 814-817(2020).

    [73] Xu B., Luan W.-P., Yang J. et al. Integrated three-stage decentralized scheduling for virtual power plants: a model-assisted multi-agent reinforcement learning method. Appl. Energ., 376, 123985(2024).

    [74] Zhang Z.-D., Zhang D.-X., Qiu R.C.. Deep reinforcement learning for power system applications: an overview. CSEE J. Power Energy, 6, 213-225(2019).

    [75] B. McMahan, E. Moore, D. Ramage, S. Hampson, B.A. y Arcas, Communication-efficient learning of deep networks from decentralized data, in: Proc. of the 20th Intl. Conf. on Artificial Intelligence and Statistics, Fort Lauderdale, USA, 2017, pp. 1273–1282.

    [76] A. Gupta, C. Devin, Y.X. Liu, P. Abbeel, S. Levine, Learning invariant feature spaces to transfer skills with reinforcement learning, in: Proc. of the 5th Intl. Conf. on Learning Representations, Toulon, France, 2017, pp. 1–14.
