The paper received three positive reviews and one lukewarm, slightly negative review. The lukewarm review was very informative, and the concerns it raised were largely acknowledged by the authors. The main concern was that the contribution beyond previous work is incremental: it fills a gap in the literature but does not make a highly impactful contribution. I agree with this criticism even after reading the authors' response. The authors argue that their use of the stochastic dominance technique is novel; however, the use of stochastic optimism in previous bandit papers (e.g., several papers by Osband and Van Roy) involves similar arguments. Having said that, I do think that the paper is well written, that the results are correct, and that they will be of interest to the community working on this problem. Overall, the paper meets the bar for publication.