Sparse Discrete Communication Learning for Multi-Agent Cooperation Through Backpropagation

Benjamin Freed, Rohan James, Guillaume Sartoretti, and Howie Choset

Conference Paper, Proceedings of (IROS) IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 7993 - 7998, October, 2020

Abstract

Recent approaches to multi-agent reinforcement learning (MARL) with inter-agent communication have often overlooked important considerations of real-world communication networks, such as limits on bandwidth. In this paper, we propose an approach to learning sparse discrete communication through backpropagation in the context of MARL, in which agents are incentivized to communicate as little as possible while still achieving high reward. Building on our prior work on differentiable discrete communication learning, we develop a regularization-inspired message-length penalty term that encourages agents to send shorter messages and avoid unnecessary communications. To this end, we introduce a variable-length message code that provides agents with a general means of modulating message length while keeping the overall learning objective differentiable. We present simulation results on a partially observable robot navigation task, where we first show that our approach learns sparse communication behavior while still solving the task. Finally, we demonstrate that our approach can learn an effective sparse communication behavior from demonstrations of an expert (potentially communication-free) policy.

BibTeX

@conference{Freed-2020-128169,
author = {Benjamin Freed and Rohan James and Guillaume Sartoretti and Howie Choset},
title = {Sparse Discrete Communication Learning for Multi-Agent Cooperation Through Backpropagation},
booktitle = {Proceedings of (IROS) IEEE/RSJ International Conference on Intelligent Robots and Systems},
year = {2020},
month = {October},
pages = {7993 - 7998},
publisher = {IEEE/RSJ},
keywords = {communication, multi-agent reinforcement learning, gradient backpropagation},
}

Copyright notice: This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. These works may not be reposted without the explicit permission of the copyright holder.