NeurIPS 2020

Delay and Cooperation in Nonstochastic Linear Bandits

Meta Review

This paper presents the first optimal (up to logarithmic factors) regret bound for the adversarial linear bandit problem with delayed feedback (with a fixed delay). Using this approach, results on cooperative bandits are also extended to the linear bandit setting. All reviewers were very positive about the paper. (Besides taking into account the suggestions of the reviewers, please move the Broader Impact section to its designated place in the final version.)