Review for NeurIPS paper: Collapsing Bandits and Their Application to Public Health Intervention

NeurIPS 2020

Collapsing Bandits and Their Application to Public Health Intervention

Meta Review

The reviewers are all enthusiastic about the paper, though to varying degrees. The paper's main contribution is to the problem of planning for a class of partially observed restless bandits in which each arm has two states and the transition probabilities satisfy a monotone structure. The paper argues that this structure arises naturally in several applications and presents numerical results for one such setting involving medical interventions. It shows that, under this structure, the restless bandit is Whittle-indexable. Although the paper does not address a learning component, the hope is that this structural characterization will open up avenues for further work on learning good policies when there is a priori uncertainty about the restless Markov decision processes.