An Improved QMIX Network Based on Gradient Entropy Regularization

LU Rui; PENG Pengfei

doi:10.3969/j.issn.1671-637x.2023.04.015


Electronics Optics & Control
Vol. 30, Issue 4, 78 (2023)

Abstract

When a cooperative multi-agent system lacks individual reward signals, the contributions of different agents cannot be distinguished, which leads to low cooperation efficiency. To solve this problem, a discriminability evaluation index for credit allocation is introduced under the value decomposition paradigm, and a method based on gradient entropy regularization is proposed to achieve highly discriminable credit allocation. On this basis, an improved QMIX network is proposed as a multi-agent deep reinforcement learning algorithm. The corresponding simulation environment is built with the SMAC multi-agent learning environment and StarCraft II's built-in map editor. The results show that the improved QMIX network achieves higher learning efficiency and better overall performance than the original QMIX network, and that it is better suited to cooperative multi-agent reinforcement learning in partially observable environments.
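
The abstract does not give the formula for the regularizer, so the following is only an illustrative sketch of one plausible reading. It assumes that the credit given to each agent is measured by the gradient of the mixed value Q_tot with respect to that agent's individual value Q_i, that these gradient magnitudes are normalized into a distribution, and that the entropy of this distribution is added to the training loss so that lower entropy (more concentrated, more discriminable credit) is encouraged. The function names, the normalization, the sign of the penalty, and the weight `lam` are all assumptions made for illustration, not the authors' definitions.

```python
import math


def gradient_entropy(grads, eps=1e-12):
    """Entropy of normalized per-agent gradient magnitudes (hypothetical index).

    grads: one number per agent, standing in for d(Q_tot)/d(Q_i).
    Returns a value in [0, log(n)]; lower means the credit is concentrated
    on fewer agents under this assumed definition.
    """
    if not grads:
        raise ValueError("grads must contain at least one agent gradient")
    mags = [abs(g) for g in grads]
    total = sum(mags)
    if total <= eps:
        # All gradients vanish: treat the credit as uniformly spread.
        return math.log(len(mags))
    probs = [m / total for m in mags]
    # Terms with p == 0 contribute nothing to the entropy.
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def regularized_loss(td_loss, grads, lam=0.1):
    """Assumed objective: TD loss plus lam times the gradient entropy."""
    return td_loss + lam * gradient_entropy(grads)
```

Under these assumptions, equal gradients for all agents give the maximum entropy log(n), while a gradient concentrated on a single agent gives zero entropy and leaves the TD loss unchanged. In an actual QMIX implementation the gradients would come from automatic differentiation through the mixing network rather than from a hand-written list.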

Keywords

credit allocation; gradient entropy; multi-agent reinforcement learning

LU Rui, PENG Pengfei. An Improved QMIX Network Based on Gradient Entropy Regularization[J]. Electronics Optics & Control, 2023, 30(4): 78
