NeurIPS 2019
Sun Dec 8th through Sat the 14th, 2019 at Vancouver Convention Center
This paper proposes algorithms for cumulative regret minimization with respect to the best arm in the best category. Each arm belongs to a single known category and the categories can be ordered such that they can eliminate each other. The authors study several notions of category dominance. They prove lower bounds and also propose a single elimination algorithm for all notions, which is analyzed. The general consensus is that this paper studies an interesting problem, although contrived. The first two notions of dominance, group-sparse and strong domination, are too restrictive to be practical. The last notion is interesting. But even this seems too restrictive given the following observation. Under first-order dominance, CATSE behaves as follows. Unless the best category has been determined, plays all arms in all categories that have not been eliminated yet. Therefore, to show that CATSE is sound, it is only necessary to assume that the mean of the best category is above those of the other categories, which is weaker than first-order dominance.