Coding for Distributed Multi-Agent Reinforcement Learning

A recent paper by members of the DCIST alliance develops a multi-agent reinforcement learning (MARL) algorithm that uses coding theory to mitigate straggler effects in distributed training. Stragglers are delayed, non-responsive, or compromised compute nodes; they are common in distributed learning systems because of communication bottlenecks and adversarial conditions. Coding techniques have been used to speed up distributed computation tasks, such as matrix multiplication and inverse problems, in the presence of stragglers. The proposed coded distributed learning framework can be applied with any policy gradient method to train policies for MARL problems in the presence of stragglers. The authors develop a coded distributed version of multi-agent deep deterministic policy gradient (MADDPG), a state-of-the-art MARL algorithm. To understand the benefits of coding in distributed MARL, they investigate several coding schemes: the maximum distance separable (MDS) code, the random sparse code, the replication-based code, and the regular low-density parity-check (LDPC) code. All of these schemes were implemented in simulation on several multi-robot problems, including cooperative navigation, predator-prey, physical deception, and keep-away tasks. The approach achieves the same training accuracy while significantly speeding up the training of policy gradient algorithms in the presence of stragglers.
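
The idea of tolerating stragglers with an MDS code can be illustrated with a small sketch. This is not the paper's implementation: it is a generic toy in which each learner returns a linear combination of per-agent updates, and a central node decodes from whichever learners respond. The function names (`vandermonde_code`, `encode`, `decode`), the Vandermonde construction, and the integer "updates" are all assumptions made for illustration. Exact rational arithmetic is used so that the round trip is exact; a real system would work with floating-point parameters.

```python
from fractions import Fraction
from itertools import combinations


def vandermonde_code(num_learners, num_agents):
    """Hypothetical encoding matrix with rows [1, x, x^2, ...], x = 1..num_learners.

    Any num_agents rows form an invertible Vandermonde matrix, which gives the
    MDS property: results from any num_agents learners suffice for decoding.
    """
    return [[Fraction(x) ** k for k in range(num_agents)]
            for x in range(1, num_learners + 1)]


def encode(code_row, agent_updates):
    """What one learner returns: a linear combination of per-agent updates."""
    dim = len(agent_updates[0])
    return [sum(c * u[d] for c, u in zip(code_row, agent_updates))
            for d in range(dim)]


def decode(code, results):
    """Recover all per-agent updates from {learner_index: coded_result}.

    Solves C_S X = Y_S by Gauss-Jordan elimination, where S is the set of
    the first num_agents responsive learners.
    """
    num_agents = len(code[0])
    rows = sorted(results)[:num_agents]
    if len(rows) < num_agents:
        raise ValueError("too few learner results to decode")
    # Augmented matrix [C_S | Y_S].
    aug = [list(code[j]) + list(results[j]) for j in rows]
    for col in range(num_agents):
        # Find a nonzero pivot and move it onto the diagonal.
        pivot = next(r for r in range(col, num_agents) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        # Eliminate this column from every other row.
        for r in range(num_agents):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[num_agents:] for row in aug]


# Toy setup: 3 agents with 2-dimensional updates, 5 learners.
updates = [[1, 2], [3, 4], [5, 6]]
code = vandermonde_code(5, 3)
coded = {j: encode(code[j], updates) for j in range(5)}

# Learners 1 and 3 straggle; decode from the other three.
received = {j: coded[j] for j in (0, 2, 4)}
recovered = decode(code, received)

# The same holds for every choice of 3 responsive learners out of 5.
all_subsets_ok = all(
    decode(code, {j: coded[j] for j in subset}) == updates
    for subset in combinations(range(5), 3)
)
```

With 5 learners and 3 agents, this toy code tolerates any 2 stragglers. The encoding matrix here is dense, so every learner does work for every agent; sparser codes of the kinds named above trade some straggler tolerance for less redundant computation per learner.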

Capability: T3C1D: Optimal control & reinforcement learning with information theoretic objectives

Points of Contact: Nikolay Atanasov (PI), Baoqian Wang, and Junfei Xie



Citation: B. Wang, J. Xie, and N. Atanasov, “Coding for Distributed Multi-Agent Reinforcement Learning,” IEEE International Conference on Robotics and Automation (ICRA), 2021.