NeurIPS 2019
Sun Dec 8th through Sat the 14th, 2019 at Vancouver Convention Center
Paper ID:3704
Title:Are Anchor Points Really Indispensable in Label-Noise Learning?

Reviewer 1

The paper proposes a new method for learning a label-noise transition matrix from a noisy dataset. Unlike previously proposed approaches (e.g. the forward method), the proposed approach is shown to be less dependent on anchor points, which are defined to be examples whose softmax output would be extremely focused on the correct class in the noiseless domain. The main idea is to use importance reweighting to account for the discrepancy between the output of a noise-free classifier and a noisy one, and to allow learnable modifications to the initially estimated transition matrix. The experiments show modest improvements over the SoA on MNIST and CIFAR-10 under low and medium levels of label noise. When anchor points are purposefully removed, the improvements on CIFAR-10 and CIFAR-100 are more substantial, illustrating that the approach is less dependent on the presence of anchor points.

Originality: The improvement over the forward method is somewhat incremental, but the idea is novel as far as I know.

Clarity: There are a few language errors in the paper, but overall it is well written and the explanation of the approach is clear.

Significance: While the results on CIFAR-10 look good, the results on CIFAR-100 are a little less convincing, particularly in the case of 50% label noise, where the error bars overlap. The results on Clothing1M demonstrate a clear improvement over the SoA.
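For readers of this review, the main idea summarized above (importance reweighting combined with a learnable correction to the transition matrix) can be illustrated with a small NumPy sketch. This is my own reading under stated assumptions, not the authors' implementation: the function name `reweighted_loss`, the argument `delta_T`, and the convention `T[i, j] ~ P(noisy = j | clean = i)` are all hypothetical choices made for illustration, and in practice `delta_T` would be learned jointly with the classifier rather than passed in.

```python
import numpy as np

def reweighted_loss(clean_probs, noisy_labels, T, delta_T=None, eps=1e-12):
    """Importance-reweighted cross-entropy (illustrative sketch only).

    clean_probs : (n, c) softmax outputs, read as estimates of P(clean label | x)
    noisy_labels: (n,) integer labels observed in the noisy dataset
    T           : (c, c) initial transition matrix, T[i, j] ~ P(noisy=j | clean=i)
                  (this row/column convention is an assumption)
    delta_T     : (c, c) correction added to T; hypothetical stand-in for the
                  learnable modification mentioned in the review
    """
    if delta_T is None:
        delta_T = np.zeros_like(T)
    T_revised = T + delta_T

    # Forward step: noisy posterior P(noisy=j | x) = sum_i P(clean=i | x) * T[i, j]
    noisy_probs = clean_probs @ T_revised

    rows = np.arange(len(noisy_labels))
    p_clean = clean_probs[rows, noisy_labels]  # clean posterior at the observed label
    p_noisy = noisy_probs[rows, noisy_labels]  # noisy posterior at the observed label

    # Importance weight: ratio of clean to noisy posterior at the observed label
    weights = p_clean / (p_noisy + eps)

    # Weighted cross-entropy of the clean prediction against the noisy label
    cross_entropy = -np.log(p_clean + eps)
    return float(np.mean(weights * cross_entropy)), weights


# Toy usage: two classes, 20% symmetric flips assumed in T
probs = np.array([[0.9, 0.1], [0.2, 0.8]])
labels = np.array([0, 1])
T_sym = np.array([[0.8, 0.2], [0.2, 0.8]])
loss, w = reweighted_loss(probs, labels, T_sym)
```

With an identity transition matrix the weights reduce to 1 and the loss is the ordinary cross-entropy, which is a convenient sanity check on the sketch.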

Reviewer 2

The paper addresses an important problem and proposes an original approach that avoids the use of anchor points. The method is assessed both theoretically and experimentally.

Reviewer 3

This paper studies the challenging problem of learning from data with noisy labels. In particular, a novel transition revision (T-Revision) method is proposed, which does not require anchor points. T-Revision can effectively learn transition matrices that lead to better classifiers. A deep-learning-based risk-consistent estimator is designed to tune the transition matrix accurately. Experimental results on multiple benchmark datasets show that T-Revision outperforms the state-of-the-art methods. This paper makes a significant contribution to the label-noise learning problem. The proposed method is well motivated and clearly presented. Technical details are easy to follow, and a theoretical analysis of the generalization error is provided. Moreover, the implementation details of the proposed method are also provided, which will be very helpful in reproducing the reported results. My comments are as follows.

1. The difference between the two categories of label-noise learning algorithms is mentioned in Section 1. It would be helpful if the authors could elaborate on the advantages of risk-/classifier-consistent algorithms.
2. In the experiments, the Sym-50 setting on MNIST still yields good performance (about 98%). I wonder what the performance would be in extreme cases with even more label noise.
3. I'm curious whether the proposed mechanism can be easily extended to the semi-supervised setting.