NeurIPS 2020

Cooperative Multi-player Bandit Optimization

Meta Review

The paper proposes an algorithm for cooperative multi-agent games in which players try to maximize total reward. All reviewers found the problem setting interesting and well-motivated. The two biggest concerns were the clarity of the writing and how to select M when G(t) is unknown. The former was largely addressed by the authors, as the reviewers confirmed both in discussion and in the post-rebuttal sections of their reviews, and the scores were adjusted accordingly. The latter, however, remains problematic, as all reviewers agreed. They also agreed that it is not so serious as to prevent publication, but the authors should say something about how one could choose M when G(t) is not known. Finally, some reviewers appreciated the experiments included in the author response, and one noted that the paper would be improved by including these results in the final version. I trust that the authors will give strong consideration to the feedback about the presentation when preparing the final version.