Sparse Discrete Communication Learning for Multi-Agent Cooperation Through Backpropagation
Abstract
Recent approaches to multi-agent reinforcement learning (MARL) with inter-agent communication have often overlooked important considerations of real-world communication networks, such as limits on bandwidth. In this paper, we propose an approach to learning sparse discrete communication through backpropagation in the context of MARL, in which agents are incentivized to communicate as little as possible while still achieving high reward. Building on our prior work on differentiable discrete communication learning, we develop a regularization-inspired message-length penalty term that encourages agents to send shorter messages and avoid unnecessary communications. To this end, we introduce a variable-length message code that provides agents with a general means of modulating message length while keeping the overall learning objective differentiable. We present simulation results on a partially-observable robot navigation task, where we first show how our approach allows agents to learn sparse communication behavior while still solving the task. We finally demonstrate that our approach can learn an effective sparse communication behavior even from demonstrations of an expert (potentially communication-free) policy.
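The paper itself does not publish code on this page, but the following minimal sketch illustrates one way to realize the kind of differentiable message-length penalty the abstract describes: each message position carries a "halt" probability, the expected message length is computed in closed form from those probabilities, and a weighted penalty is added to the task loss. The parameterization, the names halt_logits and beta, and the weight value are all illustrative assumptions, not the authors' exact formulation.

```python
import torch

def expected_message_length(halt_logits):
    """Differentiable expected length of a variable-length message.

    halt_logits: (batch, max_len) tensor; the logit at position t gives
    the probability the agent stops transmitting after bit t. This
    halting parameterization is an assumption for illustration.
    """
    p_halt = torch.sigmoid(halt_logits)               # P(stop at position t)
    p_continue = torch.cumprod(1.0 - p_halt, dim=-1)  # P(no halt through position t)
    # Bit t is transmitted iff no halt occurred at positions 0..t-1;
    # the first bit is always sent (minimum message length of 1).
    ones = torch.ones_like(p_continue[..., :1])
    p_sent = torch.cat([ones, p_continue[..., :-1]], dim=-1)
    return p_sent.sum(dim=-1)                         # expected bits per message

def total_loss(task_loss, halt_logits, beta=0.01):
    """Task objective plus a sparsity penalty on expected message length.

    beta trades off task performance against communication cost; its
    value here is a placeholder, not one reported in the paper.
    """
    return task_loss + beta * expected_message_length(halt_logits).mean()
```

Because the penalty is a smooth function of the halt probabilities, gradients from the communication cost flow back through the same objective as the task reward, which is the property the abstract emphasizes.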
BibTeX
@conference{Freed-2020-128169,
author = {Benjamin Freed and Rohan James and Guillaume Sartoretti and Howie Choset},
title = {Sparse Discrete Communication Learning for Multi-Agent Cooperation Through Backpropagation},
booktitle = {Proceedings of (IROS) IEEE/RSJ International Conference on Intelligent Robots and Systems},
year = {2020},
month = {October},
pages = {7993--7998},
publisher = {IEEE/RSJ},
keywords = {communication, multi-agent reinforcement learning, gradient backpropagation},
}