All reviewers had a positive overall impression of this paper and highlighted several salient features of the work:
+ a generic solution to the important problem of coping with label noise
+ an interesting, somewhat novel approach backed by theoretical guarantees
+ encouraging empirical results

In terms of weaknesses, it was pointed out that the empirical comparisons are done on only three datasets (CIFAR-10, CIFAR-100, and MNIST), and that the discussion of Theorem 1 could be improved. The former appears reasonable for a novel idea with a core theoretical contribution. On the latter, the authors are encouraged to incorporate elements of their response when updating the discussion around Theorem 1.