Sun Dec 8th through Sat the 14th, 2019 at Vancouver Convention Center
The present paper proposes a method for detecting adversarial examples after the training phase. The idea is to compare the feature attribution map statistics of natural and adversarial examples. To obtain this result, the paper unifies two major learning frameworks: Semi-Supervised Learning and Distributionally Robust Learning. The authors also propose a new complexity measure that is an adversarial extension of the Rademacher complexity, together with its semi-supervised analogue. Namely, this measure is related to the “need of supervision”. Based on this theoretical approach, the paper also proposes a new algorithm, which comes with convergence guarantees. The paper is interesting and addresses an important problem for the machine learning community. Most of the reviewers agree that the proposed result is of good interest and that the paper is quite well written, although they would have liked to see more explanation of the proposed theoretical concepts. I strongly suggest including in the camera-ready version the example the authors provide in the rebuttal. Note that a code segment in the supplementary file includes a name that is likely to be that of one of the authors. This was detected by one of the reviewers. However, since the reviewers report that this had no effect on their scoring, and since it is, in my opinion, an unintentional error, I have decided not to take it into account in my final decision.