• Journal of Electronic Science and Technology
  • Vol. 23, Issue 1, 100290 (2025)
Jian-Dong Yao, Wen-Bin Hao, Zhi-Gao Meng*, Bo Xie, Jian-Hua Chen and Jia-Qi Wei
Author Affiliations
  • State Grid Sichuan Electric Power Company Chengdu Power Supply Company, Chengdu, 610041, China
    DOI: 10.1016/j.jnlest.2024.100290
    Jian-Dong Yao, Wen-Bin Hao, Zhi-Gao Meng, Bo Xie, Jian-Hua Chen, Jia-Qi Wei. Adaptive multi-agent reinforcement learning for dynamic pricing and distributed energy management in virtual power plant networks[J]. Journal of Electronic Science and Technology, 2025, 23(1): 100290

    Abstract

    This paper presents a novel approach to dynamic pricing and distributed energy management in virtual power plant (VPP) networks using multi-agent reinforcement learning (MARL). As the energy landscape evolves towards greater decentralization and renewable integration, traditional optimization methods struggle to address the inherent complexities and uncertainties. Our proposed MARL framework enables adaptive, decentralized decision-making for both the distribution system operator and individual VPPs, optimizing economic efficiency while maintaining grid stability. We formulate the problem as a Markov decision process and develop a custom MARL algorithm that leverages actor-critic architectures and experience replay. Extensive simulations across diverse scenarios demonstrate that our approach consistently outperforms baseline methods, including Stackelberg game models and model predictive control, achieving an 18.73% reduction in costs and a 22.46% increase in VPP profits. The MARL framework shows particular strength in scenarios with high renewable energy penetration, where it improves system performance by 11.95% compared with traditional methods. Furthermore, our approach demonstrates superior adaptability to unexpected events and mis-predictions, highlighting its potential for real-world implementation.
    $ \underset{{{\bf{a}}}_{\text{DSO}}}{\mathrm{max}}\;{J}_{\text{DSO}}=\sum _{t=1}^{T}\left({R}_{t}^{\text{market}}-{C}_{t}^{\text{market}}+{R}_{t}^{\text{VPP}}-{P}_{t}^{\text{stability}}-{P}_{t}^{\text{emissions}}\right) $(1)

    $ \underset{{{\bf{a}}}_{i}}{\mathrm{max}}\;{J}_{i}^{\text{VPP}}=\sum _{t=1}^{T}\left({R}_{t}^{\text{sell}}-{C}_{t}^{\text{buy}}-{C}_{t}^{\text{operation}}-{P}_{t}^{\text{deviation}}\right) $(2)

    $ {{\mathbf{a}}_{{\text{DSO}}{\mathrm{,}}t}} = \left[ {\lambda _{t + 1}^{{\text{DA}}{\mathrm{,}}s}{\mathrm{,}}{\text{ }}\lambda _{t + 1}^{{\text{DA}}{\mathrm{,}}b}{\mathrm{,}}{\text{ }}{\delta _t}} \right] \in {\mathcal{A}_{{\text{DSO}}}} $(3)

    $ \lambda _t^{W{\mathrm{,}}s} \leq \lambda _t^{{\text{DA}}{\mathrm{,}}s} \leq \lambda _t^{{\text{DA}}{\mathrm{,}}b} \leq \lambda _t^{W{\mathrm{,}}b} $(4)
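
The price band in (4) can be enforced on a raw DSO action before it is published. The following is a minimal sketch: the function name `project_prices` and the midpoint tie-break used when the clipped selling price ends up above the buying price are assumptions made for illustration, not the authors' procedure.

```python
def project_prices(w_sell, w_buy, da_sell, da_buy):
    """Clamp proposed day-ahead prices into the wholesale band of Eq. (4)."""
    if w_sell > w_buy:
        raise ValueError("wholesale band is empty")
    # Clip each day-ahead price into [w_sell, w_buy].
    da_sell = min(max(da_sell, w_sell), w_buy)
    da_buy = min(max(da_buy, w_sell), w_buy)
    # Assumed tie-break: collapse both prices to their midpoint
    # if the selling price still exceeds the buying price.
    if da_sell > da_buy:
        da_sell = da_buy = 0.5 * (da_sell + da_buy)
    return da_sell, da_buy
```

For example, with a wholesale band of 30 to 80, a proposal of (20, 90) is clipped to (30, 80).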

    $ {{\mathbf{a}}_{i{\mathrm{,}}t}} = \left[ {P_{i{\mathrm{,}}t}^{{\text{VPP}}{\mathrm{,}}s}{\mathrm{,}}{\text{ }}P_{i{\mathrm{,}}t}^{{\text{VPP}}{\mathrm{,}}b}{\mathrm{,}}{\text{ }}P_{i{\mathrm{,}}t}^{{\text{MT}}}{\mathrm{,}}{\text{ }}P_{i{\mathrm{,}}t}^{{\text{ES}}}{\mathrm{,}}{\text{ }}P_{i{\mathrm{,}}t}^{{\text{IL}}}} \right] \in {\mathcal{A}_{{\text{VPP}}{\mathrm{,}}i}} $(5)

    $ 0 \leq P_{i{\mathrm{,}}t}^{{\text{VPP}}{\mathrm{,}}s} \leq {\theta _{i{\mathrm{,}}t}}P_{i{\mathrm{,}}{\rm{max}} }^{{\text{VPP}}} $(6a)

    $ 0 \leq P_{i{\mathrm{,}}t}^{{\text{VPP}}{\mathrm{,}}b} \leq \left( {1 - {\theta _{i{\mathrm{,}}t}}} \right)P_{i{\mathrm{,}}{\rm{max}} }^{{\text{VPP}}} $(6b)

    $ 0 \leq P_{i{\mathrm{,}}t}^{{\text{MT}}} \leq P_{i{\mathrm{,}}{\rm{max}} }^{{\text{MT}}} $(6c)

    $ P_{i{\mathrm{,}}{\rm{min}} }^{{\text{ES}}} \leq P_{i{\mathrm{,}}t}^{{\text{ES}}} \leq P_{i{\mathrm{,}}{\rm{max}} }^{{\text{ES}}} $(6d)

    $ 0 \leq P_{i{\mathrm{,}}t}^{{\text{IL}}} \leq P_{i{\mathrm{,}}{\rm{max}} }^{{\text{IL}}} $(6e)

    $ {{\bf{s}}}_{t+1}=f\left({{\bf{s}}}_{t}{\mathrm{,}}\text{ }{{\bf{a}}}_{\text{DSO}{\mathrm{,}}t}{\mathrm{,}}\text{ }{\left\{{{\bf{a}}}_{i{\mathrm{,}}t}\right\}}_{i\in \mathcal{V}}{\mathrm{,}}\text{ }{w}_{t}{\mathrm{,}}\text{ }{\epsilon}_{t}\right) $(7)

    $ {{\mathbf{p}}_{t + 1}} = {g_{\mathbf{p}}}\left( {{{\mathbf{p}}_t}{\mathrm{,}}{\text{ }}{{\mathbf{a}}_{{\text{DSO}}{\mathrm{,}}t}}{\mathrm{,}}{\text{ }}{{\left\{ {{{\mathbf{a}}_{i{\mathrm{,}}t}}} \right\}}_{i \in \mathcal{V}}}{\mathrm{,}}{\text{ }}w_t^{\mathbf{p}}} \right) $(8)

    $ {{\bf{x}}}_{i{\mathrm{,}}t+1}={g}_{{\bf{x}}}\left({{\bf{x}}}_{i{\mathrm{,}}t}{\mathrm{,}}\text{ }{{\bf{a}}}_{i{\mathrm{,}}t}{\mathrm{,}}\text{ }{w}_{t}^{{\bf{x}}}{\mathrm{,}}\text{ }{\epsilon}_{t}^{{\bf{x}}}\right){\mathrm{,}}\text{ }\forall i\in \mathcal{V} $(9)

    $ {{\bf{y}}}_{t+1}={g}_{{\bf{y}}}\left({{\bf{y}}}_{t}{\mathrm{,}}\text{ }{\left\{{{\bf{a}}}_{i{\mathrm{,}}t}\right\}}_{i\in \mathcal{V}}{\mathrm{,}}\text{ }{w}_{t}^{{\bf{y}}}{\mathrm{,}}\text{ }{\epsilon}_{t}^{{\bf{y}}}\right) $(10)

    $ r_{{\text{DSO}}}^t = r_{{\text{profit}}}^t + r_{{\text{stability}}}^t + r_{{\text{renewable}}}^t - r_{{\text{emissions}}}^t $(11)

    $ {r}_{\text{profit}}^{t}=\sum _{i\in \mathcal{V}}\left({\lambda }_{t}^{\text{DA}{\mathrm{,}}b}{P}_{i{\mathrm{,}}t}^{\text{VPP}{\mathrm{,}}b}-{\lambda }_{t}^{\text{DA}{\mathrm{,}}s}{P}_{i{\mathrm{,}}t}^{\text{VPP}{\mathrm{,}}s}\right)+\left({\lambda }_{t}^{W{\mathrm{,}}s}{P}_{t}^{\text{DSO}{\mathrm{,}}s}-{\lambda }_{t}^{W{\mathrm{,}}b}{P}_{t}^{\text{DSO}{\mathrm{,}}b}\right) $(12)

    $ r_{{\text{stability}}}^t = {\alpha _f}\exp \left( { - {\beta _f}\left| {{f_t} - {f_{{\text{nominal}}}}} \right|} \right) + {\alpha _v}\exp \left( { - {\beta _v}\left\| {{{\mathbf{v}}_t} - {{\mathbf{v}}_{{\text{nominal}}}}} \right\|} \right) $(13)
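
The stability term in (13) rewards small frequency and voltage deviations through two decaying exponentials. A minimal sketch follows; the function name and the choice of the Euclidean norm for the voltage deviation are assumptions, since the norm type is not stated here.

```python
import math


def stability_reward(f, f_nom, v, v_nom, alpha_f, beta_f, alpha_v, beta_v):
    """Stability reward of Eq. (13) for scalar frequency and a voltage vector."""
    freq_dev = abs(f - f_nom)
    # Assumption: Euclidean norm over the bus-voltage deviation vector.
    volt_dev = math.sqrt(sum((a - b) ** 2 for a, b in zip(v, v_nom)))
    return (alpha_f * math.exp(-beta_f * freq_dev)
            + alpha_v * math.exp(-beta_v * volt_dev))
```

With zero deviation the reward reaches its maximum, alpha_f + alpha_v, and it decays toward zero as either deviation grows.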

    $ {r}_{\text{renewable}}^{t}=\gamma \frac{\displaystyle\sum _{i\in \mathcal{V}}\left({R}_{i{\mathrm{,}}t}^{W}+{R}_{i{\mathrm{,}}t}^{\text{PV}}\right)}{{D}_{t}} $(14)

    $ {r}_{\text{emissions}}^{t}=\delta \displaystyle\sum _{i\in \mathcal{V}}{e}_{i}\left({P}_{i{\mathrm{,}}t}^{\text{MT}}\right) $(15)

    $ r_i^t = r_{{\text{profit}}{\mathrm{,}}i}^t - r_{{\text{cost}}{\mathrm{,}}i}^t - r_{{\text{deviation}}{\mathrm{,}}i}^t $(16)

    $ r_{{\text{profit}}{\mathrm{,}}i}^t = \lambda _t^{{\text{DA}}{\mathrm{,}}s}P_{i{\mathrm{,}}t}^{{\text{VPP}}{\mathrm{,}}s} - \lambda _t^{{\text{DA}}{\mathrm{,}}b}P_{i{\mathrm{,}}t}^{{\text{VPP}}{\mathrm{,}}b} $(17)

    $ r_{{\text{cost}}{\mathrm{,}}i}^t = C_i^{{\text{MT}}}\left( {P_{i{\mathrm{,}}t}^{{\text{MT}}}} \right) + C_i^{{\text{ES}}}\left( {P_{i{\mathrm{,}}t}^{{\text{ES}}}} \right) + C_i^{{\text{IL}}}\left( {P_{i{\mathrm{,}}t}^{{\text{IL}}}} \right) $(18)

    $ r_{{\text{deviation}}{\mathrm{,}}i}^t = {\mu _i}\left| {P_{i{\mathrm{,}}t}^{{\text{VPP}}} - P_{i{\mathrm{,}}t}^{{\text{PV}}{\mathrm{,}}{\text{forecast}}}} \right| $(19)

    $ \lambda _t^{W{\mathrm{,}}s} \leq \lambda _t^{{\text{DA}}{\mathrm{,}}s} \leq \lambda _t^{{\text{DA}}{\mathrm{,}}b} \leq \lambda _t^{W{\mathrm{,}}b}{\mathrm{,}}{\text{ }}\forall t $(20)

    $ {f_{{\rm{min}} }} \leq {f_t} \leq {f_{{\rm{max}} }}{\mathrm{,}}{\text{ }}\forall t $(21a)

    $ {{\mathbf{v}}_{{\rm{min}} }} \leq {{\mathbf{v}}_t} \leq {{\mathbf{v}}_{{\rm{max}} }}{\mathrm{,}}{\text{ }}\forall t $(21b)

    $ \sum _{i\in \mathcal{V}}{P}_{i{\mathrm{,}}t}^{\text{VPP}}+{P}_{t}^{\text{DSO}{\mathrm{,}}b}-{P}_{t}^{\text{DSO}{\mathrm{,}}s}=0{\mathrm{,}}\text{ }\forall t $(22)

    $ 0 \leq P_{i{\mathrm{,}}t}^{{\text{MT}}} \leq P_{i{\mathrm{,}}{\text{max}}}^{{\text{MT}}}{\mathrm{,}}{\text{ }}\forall t $(23a)

    $ P_{i{\mathrm{,}}t}^W \leq r_{i{\mathrm{,}}t}^W{\mathrm{,}}{\text{ }}\forall t $(23b)

    $ P_{i{\mathrm{,}}t}^{{\text{PV}}} \leq r_{i{\mathrm{,}}t}^{{\text{PV}}}{\mathrm{,}}{\text{ }}\forall t $(23c)

    $ P_{i{\mathrm{,}}{\rm{min}} }^{{\text{ES}}} \leq P_{i{\mathrm{,}}t}^{{\text{ES}}} \leq P_{i{\mathrm{,}}{\rm{max}} }^{{\text{ES}}}{\mathrm{,}}{\text{ }}\forall t $(24a)

    $ S_{i{\mathrm{,}}{\rm{min}} }^{{\text{ES}}} \leq S_{i{\mathrm{,}}t}^{{\text{ES}}} \leq S_{i{\mathrm{,}}{\rm{max}} }^{{\text{ES}}}{\mathrm{,}}{\text{ }}\forall t $(24b)

    $ S_{i{\mathrm{,}}t + 1}^{{\text{ES}}} = S_{i{\mathrm{,}}t}^{{\text{ES}}} - \frac{{{{\Delta }}t}}{{E_{i{\mathrm{,}}{\rm{max}} }^{{\text{ES}}}}}P_{i{\mathrm{,}}t}^{{\text{ES}}}{\mathrm{,}}{\text{ }}\forall t $(24c)
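
The state-of-charge recursion in (24c) implies that positive storage power discharges the unit. A minimal sketch of one step with the bound check of (24b) is given below; the function name, the default bounds of 0 and 1, and the units (kW, h, kWh) are assumptions for illustration.

```python
def soc_step(soc, p_es, dt, e_max, soc_min=0.0, soc_max=1.0):
    """One step of Eq. (24c); positive p_es lowers the state of charge."""
    nxt = soc - (dt / e_max) * p_es
    # Eq. (24b): reject actions that would leave the admissible SOC range.
    if not (soc_min <= nxt <= soc_max):
        raise ValueError("state-of-charge bound (24b) violated")
    return nxt
```

For instance, discharging 10 kW for 1 h from a 100 kWh unit at 50% leaves roughly 40%.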

    $ 0 \leq P_{i{\mathrm{,}}t}^{{\text{IL}}} \leq P_{i{\mathrm{,}}{\rm{max}} }^{{\text{IL}}}{\mathrm{,}}\;\forall t $(25)

    $ P_{i{\mathrm{,}}t}^{{\text{VPP}}} + P_{i{\mathrm{,}}t}^{{\text{MT}}} + P_{i{\mathrm{,}}t}^W + P_{i{\mathrm{,}}t}^{{\text{PV}}} + P_{i{\mathrm{,}}t}^{{\text{ES}}} + P_{i{\mathrm{,}}t}^{{\text{IL}}} = {d_{i{\mathrm{,}}t}}{\mathrm{,}}{\text{ }}\forall t $(26)

    $ \underset{{\pi }_{\text{DSO}}{\mathrm{,}}{\left\{{\pi }_{i}\right\}}_{i\in \mathcal{V}}}{\mathrm{max}}\mathbb{E}\left[\sum _{t=1}^{T}{\gamma }^{t}\left({r}_{\text{DSO}}^{t}+\sum _{i\in \mathcal{V}}{r}_{i}^{t}\right)\right] $(27)

    $ {{\bf{a}}}_{t}={\pi }_{\theta } \left({{\bf{s}}}_{t}\right)+{\epsilon}_{t} $(28)

    $ {Q}_{\phi } \left({{\bf{s}}}_{t}{\mathrm{,}}\;{{\bf{a}}}_{t}\right)=\mathbb{E}\left[\sum _{k=0}^{\infty }{\gamma }^{k}{r}_{t+k}\mid {{\bf{s}}}_{t}{\mathrm{,}}\;{{\bf{a}}}_{t}\right] $(29)

    $ {e_t} = {E_\psi } \left( {{{\mathbf{s}}_t}} \right) = {\text{FC}} \left( {\left[ {{E_{{\text{market}}}} \left( {{{\mathbf{p}}_t}} \right){\mathrm{;}}{\text{ }}{E_{{\text{VPP}}}} \left( {{{\left\{ {{{\mathbf{x}}_{i{\mathrm{,}}t}}} \right\}}_{i \in \mathcal{V}}}} \right){\mathrm{;}}{\text{ }}{E_{{\text{grid}}}} \left( {{{\mathbf{y}}_t}} \right)} \right]} \right) $(30)

    $ {E_{{\text{market}}}} \left( {{{\mathbf{p}}_t}} \right) = {\text{MLP}} \left( {{{\mathbf{p}}_t}} \right) $(31)

    $ {E_{{\text{VPP}}}} \left( {{{\left\{ {{{\mathbf{x}}_{i{\mathrm{,}}t}}} \right\}}_{i \in \mathcal{V}}}} \right) = {\text{LSTM}} \left( {{\text{CNN}} \left( {{{\left\{ {{{\mathbf{x}}_{i{\mathrm{,}}t}}} \right\}}_{i \in \mathcal{V}}}} \right)} \right) $(32)

    $ {E_{{\text{grid}}}} \left( {{{\mathbf{y}}_t}} \right) = {\text{GNN}} \left( {{{\mathbf{y}}_t}} \right) $(33)

    $ {{\mathbf{a}}_t} = {\pi _\theta } \left( {{e_t}} \right) $(34a)

    $ {Q_\phi } \left( {{{\mathbf{s}}_t}{\mathrm{,}}{\text{ }}{{\mathbf{a}}_t}} \right) = {Q_\phi } \left( {{e_t}{\mathrm{,}}{\text{ }}{{\mathbf{a}}_t}} \right) $(34b)

    $ {\nabla _{{\theta _i}}}J({\theta _i}) = {\mathbb{E}_{{\mathbf{s}}{\mathrm{,}}{\mathbf{a}} \sim \mathcal{D}}}\left[ {{{\left. {{\nabla _{{\theta _i}}}{\pi _i}({{\mathbf{a}}_i} \mid {{\mathbf{s}}_i}){\nabla _{{{\mathbf{a}}_i}}}{Q_i}({\mathbf{s}}{\mathrm{,}}\;{{\mathbf{a}}_1}{\mathrm{,}}\;{{\mathbf{a}}_2}{\mathrm{,}}\; \cdots {\mathrm{,}}\;{{\mathbf{a}}_N})} \right|}_{{{\mathbf{a}}_i} = {\pi _i}({{\mathbf{s}}_i})}}} \right] $(35)

    $ \mathcal{L}({\theta _i}) = {\mathbb{E}_{{\mathbf{s}}{\mathrm{,}}{\mathbf{a}}{\mathrm{,}}r{\mathrm{,}}{\mathbf{s}}' \sim \mathcal{D}}}\left[ {{{\left( {{Q_i}({\mathbf{s}}{\mathrm{,}}\;{{\mathbf{a}}_1}{\mathrm{,}}\;{{\mathbf{a}}_2}{\mathrm{,}}\; \cdots {\mathrm{,}}\;{{\mathbf{a}}_N}) - y} \right)}^2}} \right] $(36)

    $ {z}_{i}=\sum _{j\ne i}{\alpha }_{i{\mathrm{,}}j}{{\bf{h}}}_{j} $(37)

    $ {\alpha }_{i{\mathrm{,}}j}=\frac{\mathrm{exp}\left({\eta }_{i{\mathrm{,}}j}\right)}{\displaystyle\sum _{k\ne i}\mathrm{exp}\left({\eta }_{i{\mathrm{,}}k}\right)} $(38)

    $ {\eta _{i{\mathrm{,}}j}} = f({{\mathbf{q}}_i}{\mathrm{,}}\;{{\mathbf{k}}_j}) $(39)
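
Equations (37) to (39) describe an attention step: pairwise scores are normalized with a softmax over the other agents and used to weight their hidden vectors. The sketch below illustrates this for one agent. The scoring function is left generic in (39), so the scaled dot product used here is an assumption, as are the function and argument names.

```python
import math


def attention_message(i, queries, keys, values):
    """Return (weights, z_i) for agent i, following Eqs. (37)-(39)."""
    others = [j for j in range(len(keys)) if j != i]
    d = len(queries[i])
    # Assumed scoring function f(q_i, k_j): scaled dot product.
    scores = [sum(q * k for q, k in zip(queries[i], keys[j])) / math.sqrt(d)
              for j in others]
    # Softmax over j != i, shifted by the max score for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    weights = {j: e / total for j, e in zip(others, exps)}
    # Weighted sum of the other agents' vectors, Eq. (37).
    dim = len(values[others[0]])
    z = [sum(weights[j] * values[j][c] for j in others) for c in range(dim)]
    return weights, z
```

When all scores are equal, the weights are uniform and the message is the plain average of the other agents' vectors.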

    $ {\mathcal{P}}(i)=\frac{p_{i}^{\mu}}{\displaystyle\sum_{k} p_{k}^{\mu}} $(40)
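
Equation (40) turns replay priorities into sampling probabilities, with the exponent controlling how strongly high-priority transitions are favored. A minimal sketch using only the standard library is shown below; the helper names are hypothetical, and importance-sampling corrections are not covered.

```python
import random


def sampling_probabilities(priorities, mu):
    """Eq. (40): P(i) = p_i^mu / sum_k p_k^mu."""
    powered = [p ** mu for p in priorities]
    total = sum(powered)
    return [p / total for p in powered]


def sample_indices(priorities, mu, batch_size, rng=random):
    """Draw replay-buffer indices (with replacement) according to Eq. (40)."""
    probs = sampling_probabilities(priorities, mu)
    return rng.choices(range(len(priorities)), weights=probs, k=batch_size)
```

Setting the exponent to 0 recovers uniform sampling, while 1 samples in direct proportion to the priorities.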

    $ {\boldsymbol{\omega}^{{\text{new}}}} = {\boldsymbol{\omega}^{{\text{curr}}}} + \mathcal{N}(0{\mathrm{,}}\;{\sigma ^2}{\mathbf{I}}) $(41)

    $ {\text{d}}{x_t} = \theta \left( {\mu - {x_t}} \right){\text{d}}t + \sigma {\text{d}}{W_t} $(42)
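
The stochastic differential equation in (42) is an Ornstein-Uhlenbeck process, commonly used to generate temporally correlated exploration noise. A minimal sketch of one Euler-Maruyama discretization step follows; the discretization scheme and the function name are assumptions, and `theta` here is the mean-reversion rate of (42), not a policy parameter.

```python
import math
import random


def ou_step(x, theta, mu, sigma, dt, rng=random):
    """One Euler-Maruyama step of Eq. (42)."""
    # The Wiener increment over dt is Gaussian with variance dt.
    dw = math.sqrt(dt) * rng.gauss(0.0, 1.0)
    return x + theta * (mu - x) * dt + sigma * dw
```

With `sigma` set to 0 the process simply relaxes toward `mu`, which is a convenient sanity check.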

    $ {\nabla _{{\theta _i}}}J\left( {{\theta _i}} \right) = {\mathbb{E}_{{\mathbf{s}} \sim \mathcal{D}}}\left[ {{{\left. {{\nabla _{{\theta _i}}}{\pi _i}\left( {{{\mathbf{s}}_i}} \right){\nabla _{{{\mathbf{a}}_i}}}{Q_i}\left( {{\mathbf{s}}{\mathrm{,}}\;{{\mathbf{a}}_1}{\mathrm{,}}\;{{\mathbf{a}}_2}{\mathrm{,}}\; \cdots {\mathrm{,}}\;{{\mathbf{a}}_N}} \right)} \right|}_{{{\mathbf{a}}_i} = {\pi _i}\left( {{{\mathbf{s}}_i}} \right)}}} \right] $(43)

    $ {\theta _i} \leftarrow {\theta _i} + \xi {\text{Adam}}\left( {{\nabla _{{\theta _i}}}J \left( {{\theta _i}} \right)} \right) $(44)

    $ \mathcal{L} \left( {{\phi _i}} \right) = {\mathbb{E}_{{\mathbf{s}}{\mathrm{,}}{\mathbf{a}}{\mathrm{,}}r{\mathrm{,}}{\mathbf{s}}' \sim \mathcal{D}}}\left[ {{{\left( {{Q_i} \left( {{\mathbf{s}}{\mathrm{,}}\;{{\mathbf{a}}_1}{\mathrm{,}}\;{{\mathbf{a}}_2}{\mathrm{,}}\; \cdots {\mathrm{,}}\;{{\mathbf{a}}_N}} \right) - y} \right)}^2}} \right] $(45)

    $ y = {r_i} + \gamma Q_i^\prime \left( {{\mathbf{s}}'{\mathrm{,}}\;\pi _1^\prime \left( {{\mathbf{s}}_1^\prime } \right){\mathrm{,}}\;\pi _2^\prime \left( {{\mathbf{s}}_2^\prime } \right){\mathrm{,}}\; \cdots {\mathrm{,}}\;\pi _N^\prime \left( {{\mathbf{s}}_N^\prime } \right)} \right) $(46)
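
The target in (46) evaluates the target critic of agent i on the next joint state, with every agent's next action supplied by its own target policy; (45) is then the mean squared error against that target. The sketch below treats policies and critic as plain callables. All names and the callable interfaces are assumptions for illustration.

```python
def td_target(reward_i, gamma, next_state, next_obs, target_actors, target_critic_i):
    """Eq. (46): y = r_i + gamma * Q'_i(s', pi'_1(s'_1), ..., pi'_N(s'_N))."""
    # Each target policy acts on its own next local observation.
    next_actions = [pi(obs) for pi, obs in zip(target_actors, next_obs)]
    return reward_i + gamma * target_critic_i(next_state, next_actions)


def critic_loss(q_values, targets):
    """Sample estimate of Eq. (45): mean squared TD error over a batch."""
    return sum((q - y) ** 2 for q, y in zip(q_values, targets)) / len(targets)
```

In a full implementation the callables would be neural networks and the loss would be minimized by gradient descent on the critic parameters.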

    $ \phi _i^\prime \leftarrow \tau {\phi _i} + (1 - \tau )\phi _i^\prime $(47a)

    $ \theta _i^\prime \leftarrow \tau {\theta _i} + (1 - \tau )\theta _i^\prime $(47b)
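
The soft updates in (47a) and (47b) move each target parameter a fraction tau of the way toward its online counterpart. A minimal sketch over flat parameter lists is given below; the list representation and the function name are assumptions.

```python
def soft_update(target, online, tau):
    """Eqs. (47a)/(47b): target <- tau * online + (1 - tau) * target."""
    return [tau * w + (1.0 - tau) * w_t for w, w_t in zip(online, target)]
```

A small tau keeps the targets slowly varying, while tau equal to 1 copies the online parameters outright.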

    $ {\tilde{r}}_{i}({\bf{s}}{\mathrm{,}}\text{ }{\bf{a}}{\mathrm{,}}\text{ }\lambda )={r}_{i}({\bf{s}}{\mathrm{,}}\text{ }{\bf{a}})-\sum _{j}{\lambda }_{j}\mathrm{max}\left(0{\mathrm{,}}\text{ }{g}_{j}({\bf{s}}{\mathrm{,}}\text{ }{\bf{a}})\right) $(48)

    $ {\lambda _j} \leftarrow {\lambda _j} + \beta {\rm{max}} \left( {0{\mathrm{,}}{\text{ }}{g_j}({\mathbf{s}}{\mathrm{,}}{\text{ }}{\mathbf{a}})} \right) $(49)
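
Equations (48) and (49) form a penalty scheme: the reward is reduced by the weighted positive part of each constraint value, and each multiplier grows whenever its constraint is violated. A minimal sketch follows, where `g_values` holds the constraint values g_j(s, a); the function names are hypothetical.

```python
def penalized_reward(reward, multipliers, g_values):
    """Eq. (48): subtract lambda_j * max(0, g_j) for every constraint j."""
    return reward - sum(lam * max(0.0, g) for lam, g in zip(multipliers, g_values))


def update_multipliers(multipliers, g_values, beta):
    """Eq. (49): raise lambda_j in proportion to the violation of constraint j."""
    return [lam + beta * max(0.0, g) for lam, g in zip(multipliers, g_values)]
```

Satisfied constraints (g_j at or below zero) contribute no penalty and leave their multipliers unchanged.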

    $ \mathop {{\rm{max}} }\limits_{{\theta _i}} {J_{{R_i}}} \left( {{\theta _i}} \right)\;{\text{ s}}{\text{.t}}{\text{. }}\;{J_{{C_j}}} \left( {{\theta _i}} \right) \leq {d_j}{\mathrm{,}}{\text{ }}\forall j $(50)

    $ \lambda _t^{W{\mathrm{,}}s} \leq \lambda _t^{{\text{DA}}{\mathrm{,}}s} \leq \lambda _t^{{\text{DA}}{\mathrm{,}}b} \leq \lambda _t^{W{\mathrm{,}}b} $(51)

    $ {{\bf{a}}}_{\text{safe}}=\mathrm{arg}\;\underset{{{\bf{a}}}^{\prime }}{\mathrm{min}}\;{\left\| {{\bf{a}}}^{\prime }-{\bf{a}} \right\|}_{2}^{2}\;\text{ s.t. }\;{g}_{j} \left({\bf{s}}{\mathrm{,}}\;{{\bf{a}}}^{\prime }\right)\le 0{\mathrm{,}}\text{ }\forall j $(52)
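
The projection in (52) generally requires a constrained optimization solver. In the special case where every constraint is a simple lower or upper bound on one action component, as in (6c) to (6e), the Euclidean projection reduces to component-wise clipping. The sketch below covers only that special case and is not a general solver; the function name is hypothetical.

```python
def project_to_box(action, lower, upper):
    """Closest point to `action` (in the 2-norm) inside the box [lower, upper]."""
    return [min(max(a, lo), hi) for a, lo, hi in zip(action, lower, upper)]
```

Actions that already satisfy the bounds are returned unchanged, so the safety layer only intervenes when needed.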

    $ {P_{i{\mathrm{,}}t}^{{\text{MT}}} = P_{i{\mathrm{,}}{\rm{max}} }^{{\text{MT}}}\varsigma \left( {{f_{{\text{MT}}}} \left( {{{\mathbf{s}}_t}} \right)} \right)} $(53a)

    $ {P_{i{\mathrm{,}}t}^{{\text{ES}}} = P_{i{\mathrm{,}}{\rm{max}} }^{{\text{ES}}}{\rm{tanh}} \left( {{f_{{\text{ES}}}} \left( {{{\mathbf{s}}_t}} \right)} \right)} $(53b)

    $ {P_{i{\mathrm{,}}t}^{{\text{IL}}} = P_{i{\mathrm{,}}{\rm{max}} }^{{\text{IL}}}\varsigma \left( {{f_{{\text{IL}}}} \left( {{{\mathbf{s}}_t}} \right)} \right)} $(53c)
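
Equations (53a) to (53c) bound the actions by construction: raw network outputs are squashed and then scaled by the device limits. The sketch below assumes that the squashing function written as ς is the logistic sigmoid, which is not defined in this listing; function and argument names are also hypothetical.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def squash_actions(f_mt, f_es, f_il, p_mt_max, p_es_max, p_il_max):
    """Map raw outputs to (P_MT, P_ES, P_IL) as in Eqs. (53a)-(53c)."""
    p_mt = p_mt_max * sigmoid(f_mt)      # in [0, p_mt_max]
    p_es = p_es_max * math.tanh(f_es)    # in [-p_es_max, p_es_max]
    p_il = p_il_max * sigmoid(f_il)      # in [0, p_il_max]
    return p_mt, p_es, p_il
```

Note that the tanh mapping yields a symmetric storage range, which matches the bound in (6d) only when the minimum storage power equals the negative of the maximum.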

    $ {r_{{\text{penalty}}}} = - \alpha {\rm{max}} {\left( {0{\mathrm{,}}{\text{ }}\left| {f(t) - {f_{{\text{nominal}}}}} \right| - {f_{{\text{tolerance}}}}} \right)^2} $(54)
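
The penalty in (54) is zero inside a tolerance band around the nominal frequency and grows quadratically outside it. A minimal sketch is given below; the function name and the numbers in the usage note are illustrative assumptions.

```python
def frequency_penalty(f, f_nominal, f_tolerance, alpha):
    """Eq. (54): quadratic penalty on the deviation beyond the tolerance band."""
    excess = max(0.0, abs(f - f_nominal) - f_tolerance)
    return -alpha * excess ** 2
```

For example, with a 50 Hz nominal frequency, a 0.2 Hz tolerance and alpha of 100, a reading of 50.5 Hz gives a penalty of about -9, while 50.1 Hz gives none.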
