Approximate Feature Collisions in Neural Nets

Part of Advances in Neural Information Processing Systems 32 (NeurIPS 2019)


Authors

Ke Li, Tianhao Zhang, Jitendra Malik

Abstract

Work on adversarial examples has shown that neural nets are surprisingly sensitive to adversarially chosen changes of small magnitude. In this paper, we show the opposite: neural nets can be surprisingly insensitive to adversarially chosen changes of large magnitude. We observe that this phenomenon can arise from the intrinsic properties of the ReLU activation function. As a result, two very different examples can share the same feature activations and therefore the same classification decision. We refer to this phenomenon as feature collision and the corresponding examples as colliding examples. We find that colliding examples are quite abundant: we empirically demonstrate the existence of polytopes of approximately colliding examples in the neighbourhood of practically any example.
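The ReLU mechanism behind feature collision can be illustrated directly for a single layer: ReLU zeroes out every negative pre-activation, so moving an input along the null space of the weight rows of the active units leaves all feature activations unchanged, provided the inactive units' pre-activations stay negative. The sketch below is a minimal illustration under assumed conditions, using a random hypothetical layer rather than the paper's trained networks or its actual search procedure:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical single ReLU layer y = max(Wx + b, 0); weights are random
# stand-ins, not taken from the paper.
n_in, n_out = 8, 5
W = rng.standard_normal((n_out, n_in))
b = rng.standard_normal(n_out)

def features(x):
    return np.maximum(W @ x + b, 0.0)

x1 = rng.standard_normal(n_in)
z1 = W @ x1 + b
active = z1 > 0  # units where ReLU acts as identity; the rest output 0

# A direction in the null space of the active units' weight rows leaves
# their pre-activations (and hence the ReLU features) exactly unchanged.
_, _, vt = np.linalg.svd(W[active])
null_dir = vt[int(active.sum()):][0]

# Pick a step size small enough that every inactive pre-activation stays
# negative, so those units keep outputting 0.
delta = W @ null_dir
limits = [-z1[i] / delta[i] for i in np.where(~active)[0] if delta[i] > 1e-12]
t = 0.9 * min(limits) if limits else 1.0

x2 = x1 + t * null_dir
print("input change:  ", np.abs(x2 - x1).max())              # clearly nonzero
print("feature change:", np.abs(features(x2) - features(x1)).max())  # ~0
```

Because n_out < n_in here, the active rows always have a nontrivial null space, so a genuinely different input with identical features is guaranteed to exist for this layer; the paper's contribution is showing that entire polytopes of such approximately colliding examples exist for deep trained networks.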