The reviewers mostly agree that the paper makes valuable contributions to the study of learning in blocking bandits without purely stochastic assumptions, and that it systematically investigates the hardness of both planning and learning depending on the information structure available to the learner (advance information vs. online bandit feedback). The reviewers engaged in a detailed discussion after the author feedback was received, during which several illuminating observations and suggestions were raised. In view of the positive signals from the reviewers, I recommend acceptance. I request that the author(s) pay close attention to the additional feedback from the reviewers and incorporate their suggestions when preparing the final version, especially those from R1 and R2, whose comments were quite insightful.