NeurIPS 2020

Scalable Multi-Agent Reinforcement Learning for Networked Systems with Average Reward

Meta Review

The paper has been extensively discussed, and the reviewers agree that it has merit and that the rebuttal clarifies a number of questions they raised (e.g. the difference between the underlying framework of the proposed method and that of mean field RL). The general consensus is to recommend acceptance. The reviewers would nevertheless like the authors to clarify the following in the paper: contrary to the authors' claim, Theorem 2 does not really depend on Theorem 1, as it only assumes the exponential decay property, which Theorem 1 only widens.