NeurIPS 2020

Hold me tight! Influence of discriminative features on deep network boundaries

Meta Review

This paper provides several interesting and novel insights into the relationship between the discriminativeness of input features and the decision boundaries of deep networks. For example, the directions along which an input image lies closest to the decision boundary are the directions of the discriminative features. Experiments were performed on MNIST, CIFAR-10 and ImageNet. Further findings also shed light on how adversarial training modifies the decision boundaries to improve model robustness. Reviewers agreed that this work is particularly relevant and interesting to the adversarial example community.