Part of Advances in Neural Information Processing Systems 33 (NeurIPS 2020)
Pranjal Awasthi, Himanshu Jain, Ankit Singh Rawat, Aravindan Vijayaraghavan
Adversarial robustness measures the susceptibility of a classifier to imperceptible perturbations made to its inputs at test time. In this work we highlight the benefits of natural low-rank representations, which often exist for real data such as images, for training neural networks with certified robustness guarantees.
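As a concrete, purely illustrative example of such a representation (not the paper's exact setup), flattened images can be projected onto a low-dimensional principal subspace; the rank k = 32 and the use of scikit-learn below are assumptions.

    import numpy as np
    from sklearn.decomposition import PCA

    def fit_low_rank_projection(X_train, k=32):
        """Fit a rank-k PCA on flattened images X_train of shape (n, d) and
        return the (k, d) projection matrix onto the principal subspace."""
        pca = PCA(n_components=k)
        pca.fit(X_train)
        return pca.components_  # rows span the low-rank subspace

    # Usage (illustrative): Z = X @ P.T gives the k-dimensional codes, and
    # X @ P.T @ P reconstructs the images within the low-rank subspace.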
Our first contribution is for certified robustness to perturbations measured in the ℓ_2 norm. We exploit low-rank data representations to provide improved guarantees over state-of-the-art randomized-smoothing-based approaches on standard benchmark datasets such as CIFAR-10 and CIFAR-100.
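For context, randomized smoothing certifies an ℓ_2 radius by aggregating a base classifier's predictions on Gaussian-noised copies of the input. The sketch below follows the standard Cohen et al. (2019) recipe rather than this paper's exact procedure; the noise level, sample count, and classifier interface are assumptions, and a rigorous certificate would replace the point estimate with a confidence lower bound.

    import numpy as np
    from scipy.stats import norm

    def smoothed_predict_and_radius(base_classifier, x, sigma=0.25, n_samples=1000):
        """Monte Carlo sketch of randomized smoothing: predict the majority class
        under Gaussian noise and report the Cohen-style L2 radius sigma * Phi^{-1}(p)."""
        noisy = x + sigma * np.random.randn(n_samples, x.shape[0])
        preds = np.array([base_classifier(z) for z in noisy])  # integer class labels
        counts = np.bincount(preds)
        top = int(counts.argmax())
        p_hat = min(counts[top] / n_samples, 1.0 - 1e-6)  # clip to keep the radius finite
        radius = sigma * norm.ppf(p_hat) if p_hat > 0.5 else 0.0
        return top, radius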
Our second contribution is for the more challenging setting of certified robustness to perturbations measured in the ℓ_∞ norm. We demonstrate empirically that natural low-rank representations have inherent robustness properties that can be leveraged to provide significantly better guarantees for certified ℓ_∞ robustness in those representations. Our certificate of ℓ_∞ robustness relies on a natural quantity involving the ∞ → 2 matrix operator norm associated with the representation, to translate robustness guarantees from ℓ_2 to ℓ_∞ perturbations.
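To make the translation concrete: if P maps inputs to the low-rank representation, then any ℓ_∞ perturbation δ satisfies ||Pδ||_2 ≤ ||P||_{∞→2} ||δ||_∞, so an ℓ_2 certificate of radius r in the representation yields an ℓ_∞ certificate of radius r / ||P||_{∞→2}. The sketch below uses the crude bound ||P||_{∞→2} ≤ √n · ||P||_2 as a stand-in for the paper's tighter bound (see the next paragraph); the function name is illustrative.

    import numpy as np

    def linf_radius_from_l2(P, l2_radius):
        """Translate an L2 certificate of radius l2_radius (in representation space)
        into an L_inf certificate in input space, via an upper bound on ||P||_{inf->2}."""
        n = P.shape[1]
        inf_to_2_ub = np.sqrt(n) * np.linalg.norm(P, 2)  # ||P||_{inf->2} <= sqrt(n) * spectral norm
        return l2_radius / inf_to_2_ub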
A key technical ingredient behind our certificates is a fast algorithm, based on the multiplicative weights update method, that provably upper bounds the above matrix norm. Our algorithmic guarantees improve upon the state of the art for this problem and may be of independent interest.
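The paper's multiplicative-weights algorithm is not reproduced here. As a slower reference point only, an upper bound of the same kind can be obtained from the standard SDP relaxation of max_{x ∈ {±1}^n} x^T A^T A x, solved with an off-the-shelf solver; the use of cvxpy and the function name below are assumptions, not the paper's method.

    import numpy as np
    import cvxpy as cp

    def sdp_upper_bound_inf_to_2(A):
        """Upper bound ||A||_{inf->2} by relaxing max_{x in {-1,1}^n} x^T (A^T A) x
        to max { trace(A^T A X) : X PSD, diag(X) = 1 } and taking a square root."""
        n = A.shape[1]
        M = A.T @ A
        X = cp.Variable((n, n), PSD=True)
        prob = cp.Problem(cp.Maximize(cp.trace(M @ X)), [cp.diag(X) == 1])
        prob.solve()  # requires an SDP-capable solver such as SCS
        return float(np.sqrt(prob.value))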