NeurIPS 2019
Sun Dec 8th through Sat the 14th, 2019 at Vancouver Convention Center
Paper ID:8183
Title:Categorized Bandits

Reviewer 1

The paper is clear and well organized. The technical content seems simple and correct. The biggest weakness is that they exploit the fact that categories are known, while in the literature I found more general works (with unknown categories see Regional Multi-Armed Bandits, Zhiyang Wang et al. 2018; or with similarities information see Contextual Bandits with Similarity Information, Aleksandrs Slivkins 2009). Moreover, I did not understand why the three dominance relations should be relevant. According to the defined regret, they compete against the policy always pulling the best arm mu_1^1. Therefore, I wonder why they do not reject all the categories i: mu_1^i < mu_1^1. I do not see any original technical aspect, the novelty seems to be in the dominance relations but I do not see why they are really necessary. As more general approaches exist, I do not see how this project can be an useful starting point for additional research projects. ==== I have carefully read the authors comments. I still have doubts concerning the dominance relations. I do think that they are too strong to represent real situations. Besides that, I think that the proposed solution does not fully address the proposed setting.

Reviewer 2

This paper studies on three different levels of assumptions and provides both lower bounds and upper bounds for them. I mainly have several concerns. 1. The upper and lower bounds are stating *with ... assumption*. But what is known and what is unknown exactly to the learning agent? Are there uncomparable pairs under these assumptions? 2. If we regard these three assumptions as several examples of dominance spectrum over categories, then it seems that there is a big gap between strong dominance and first-order dominance, as well as the first-order dominance and no dominance. Can you give some discussions, or intuitions, for what kind of dominance should lie in these gaps? 3. It looks to me that these assumptions are quite strong. I guess in real applications, most categories are not comparable for these levels of relations, which limits the applicability of the algorithm. Can you provide some discussions on possible solutions for the case that these assumptions don't hold? 4. Why the regret curves are straight lines in Figure 3(a)? Perhaps more rounds would help show the learning behavior. Also, it would be better to also plot error bars. --- I have read the rebuttal. The concern is on the reasonability of the dominance assumptions. These dominances would be more reasonable if there are additional dominance types to formulate a good spectrum. Also from the rebuttal, I realize the theorems need knowledge of the dominance type, which is usually impossible.

Reviewer 3

The setting introduced in this paper is novel and interesting. I don't think adding the group sparse dominance criterion adds much value, it is also unrealistic from an applications point of view. The idea of categorized bandits is not new as the authors note. While the related work covers most of the papers, I would like to bring the authors' attention to for completeness, which tries to cluster and rank the clusters (as opposed to this work where the clusters are pre-specified), and uses the strong dominance relation between the clusters. The CATSE algorithm can be improved for the group-sparse case. According to the equation after line 218, a category becomes active if there exists an arm whose lower bound is greater than 0 - at this point the algorithm has identified the winning category and stops. Until this point, CATSE pulls all arms. However, categories where there exist arms whose upper bound is less than 0 need not be pulled, and can be eliminated. I would like to see details of murphy sampling (code that I can read would be great). Other suggestions: 1) add error bars to the plots 2) use dotted for 0 and solid for s in Fig 3a for easy distinguishing between the two dominances. == After rebuttal == I have read the author's response. The consensus we had was that the dominance relations are strong and impractical.