Review for NeurIPS paper: Consistency Regularization for Certified Robustness of Smoothed Classifiers

NeurIPS 2020

Consistency Regularization for Certified Robustness of Smoothed Classifiers

Review 1

Summary and Contributions: This paper studies the problem of training Gaussian-smoothed robust classifiers, an approach first proposed by Cohen et al. 2019. Following recent work by Zhai et al. 2020, the paper proposes a consistency regularization method to improve training. Specifically, the authors follow the natural-robust error decomposition of Zhai et al. 2020 and use a different surrogate loss for the certified robust error. The paper is very well written.
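For readers unfamiliar with the setting, the following is a minimal sketch of Gaussian smoothing as introduced by Cohen et al. 2019. The smoothed classifier predicts the class the base classifier returns most often under Gaussian input noise, and the certified L2 radius is sigma * Phi^{-1}(p_top) when the top-class probability p_top exceeds 1/2. The 1-D `base_classifier`, the function names, and the use of the raw empirical frequency (rather than a statistical lower confidence bound, which a real certification procedure would need) are illustrative assumptions and are not taken from the paper under review.

```python
import random
from collections import Counter
from statistics import NormalDist


def base_classifier(x):
    """Hypothetical base classifier f: a fixed 1-D threshold rule."""
    return 1 if x > 0.0 else 0


def smoothed_predict(x, sigma, n_samples, rng):
    """Monte Carlo estimate of the smoothed classifier at x.

    Returns the majority class under x + N(0, sigma^2) noise and the
    empirical frequency of that class.
    """
    votes = Counter(
        base_classifier(x + rng.gauss(0.0, sigma)) for _ in range(n_samples)
    )
    top_class, top_count = votes.most_common(1)[0]
    return top_class, top_count / n_samples


def certified_radius(p_top, sigma):
    """L2 radius sigma * Phi^{-1}(p_top); zero when p_top <= 1/2."""
    if p_top <= 0.5:
        return 0.0
    # Clamp so the inverse CDF stays finite when every vote agrees.
    p_top = min(p_top, 1.0 - 1e-9)
    return sigma * NormalDist().inv_cdf(p_top)


rng = random.Random(0)
predicted, frequency = smoothed_predict(1.0, 0.5, 2000, rng)
radius = certified_radius(frequency, 0.5)
```

The radius grows with the top-class frequency under noise, which is why training methods in this line of work try to make the base classifier's predictions stable across noisy copies of the same input.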

Strengths: Simplicity of the algorithm, very good experimental results overall.

Weaknesses: Somewhat limited novelty.

Correctness: I have verified that the mathematical and algorithmic claims are correct.

Clarity: The paper is well written.

Relation to Prior Work: The comparison to the prior work is thorough and complete.

Reproducibility: Yes

Additional Feedback: The proposed method is very simple yet very effective, improving over several strong baselines from NeurIPS 2019 and ICLR 2020. Another benefit is that it requires minimal computational overhead (only about 2x the training time of the original Cohen et al. 2019 method, and much faster than SmoothAdv and MACER). I would like to highlight that this method looks natural and elegant: it only requires 2-3 lines of modification over MACER or Gaussian training, but improves a lot in terms of performance and efficiency. My only reservation is that this paper mostly follows the surrogate loss framework proposed by Zhai et al. 2020, which somewhat limits the novelty of this work.
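To give a sense of how small such a modification can be, here is a hedged sketch of one plausible consistency penalty added to Gaussian training. It averages the predictions over m noisy copies of an input and penalizes the KL divergence from that average to each individual prediction. The exact form of the paper's regularizer is not reproduced in this review, so the functions below, the weight `lam`, and the choice of KL direction are assumptions for illustration only.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def consistency_penalty(noisy_logits):
    """Mean KL(mean prediction || prediction on each noisy copy).

    noisy_logits: list of m logit vectors, one per noisy copy x + delta_i.
    """
    probs = [softmax(z) for z in noisy_logits]
    m = len(probs)
    mean_prob = [sum(p[k] for p in probs) / m for k in range(len(probs[0]))]
    return sum(kl_divergence(mean_prob, p) for p in probs) / m


def cross_entropy(logits, label):
    """Standard cross-entropy of one logit vector against an integer label."""
    return -math.log(softmax(logits)[label] + 1e-12)


def total_loss(noisy_logits, label, lam):
    """Gaussian-training cross-entropy plus lam times the consistency penalty."""
    ce = sum(cross_entropy(z, label) for z in noisy_logits) / len(noisy_logits)
    return ce + lam * consistency_penalty(noisy_logits)
```

In this sketch the penalty is zero whenever all noisy copies receive identical predictions, and in particular whenever only a single noisy copy is used, so it only has an effect for m of at least 2.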

Review 2

Summary and Contributions: The paper tackles the problem of learning a classifier robust to L2-ball attacks, by introducing a consistency regularizer that enforces consistency of the classifier's predictions with the smoothed classifier's predictions over the L2-ball. Performance of this new regularizer is evaluated extensively on several datasets, including ImageNet, where it outperforms competing approaches, in addition to being more computationally efficient.

Strengths:
- The proposed regularizer is easy to incorporate and results in a method that is more efficient than competing approaches.
- Extensive evaluation on a range of datasets, in particular ImageNet, showing performance at scale.
- Ablation study comparing different design choices for the regularizer and the sensitivity of performance to hyperparameters.
- Certifiable robustness is a topic of strong interest to the community and society in general.

Weaknesses:
- Experimental results seem to be from a single run. Given that some of the differences are not very large, results from multiple runs should ideally be included to show the variance in metrics.
- "Clean" accuracy and accuracy under small perturbations are worse than for MACER; this could compromise practical applications where clean accuracy is also important.

Correctness: Seems fine.

Clarity: The paper is generally easy to read, but a few parts were confusing:
- Line 104: the "sufficient condition" did not seem to be defined here, and was referred to again on line 107. What is this condition?
- The last expectation in equation (6): should this just be the probability associated with the indicator instead of an expectation?
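On the second point, a general identity may help: the expectation of an indicator equals the probability of the indicated event, so if the last term of equation (6) is an expectation of an indicator over the noise, the two readings coincide. The notation below (base classifier f, smoothed prediction \hat{f}(x), Gaussian noise \delta) is assumed for illustration and is not copied from the paper.

```latex
\mathbb{E}_{\delta}\!\left[\mathbf{1}\!\left[f(x+\delta) = \hat{f}(x)\right]\right]
  = \mathbb{P}_{\delta}\!\left(f(x+\delta) = \hat{f}(x)\right)
```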

Relation to Prior Work: Yes, there is a section in the paper dedicated to this.

Reproducibility: Yes

Additional Feedback: Overall, this paper presents an efficient approach to training L2-robust models that outperforms existing approaches in the large-perturbation regime. While the experiments could be improved with multiple runs, I thought they were extensive and included analyses of different design choices. Releasing code/models would further improve the reproducibility of the work.
Additional comments:
- Why does m have to be larger than 1? How does the method perform with m=1?
- The analysis resulting in Figure 1 focuses on the log-probability gap, or logit margin, of the various classifiers. However, this is not the only factor contributing to robustness in the case of deep neural networks, which perform a highly non-linear mapping from inputs to logits; the distance to the decision boundary in input space (or input margin) is what we really care about, and it is related to the logit margin by the Lipschitzness of the mapping from input to logits; see "Lipschitz-Margin Training: Scalable Certification of Perturbation Invariance for Deep Neural Networks", NeurIPS 2018, for a discussion. This analysis therefore does not really give the full picture of what is happening with the different algorithms.
=== Post Rebuttal Comments ===
I have read the rebuttal. Thank you for the clarifications.
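The distinction between logit margin and input margin raised in the comment on Figure 1 can be illustrated with a toy case. For a bias-free linear classifier the L2 distance to the nearest decision boundary has a closed form, so one can check directly that rescaling the weights inflates the logit margin while leaving the input margin unchanged. The linear model and the function names below are assumptions chosen for simplicity and say nothing about the networks used in the paper.

```python
import math


def logits(weights, x):
    """Logits of a bias-free linear classifier, one weight row per class."""
    return [sum(w_k * x_k for w_k, x_k in zip(w, x)) for w in weights]


def logit_margin(weights, x):
    """Gap between the top logit and the runner-up logit."""
    z = logits(weights, x)
    top = max(range(len(z)), key=z.__getitem__)
    runner_up = max(z[j] for j in range(len(z)) if j != top)
    return z[top] - runner_up


def input_margin(weights, x):
    """Exact L2 distance from x to the nearest decision boundary.

    Valid for this linear model only; assumes no two weight rows are equal.
    """
    z = logits(weights, x)
    top = max(range(len(z)), key=z.__getitem__)
    distances = []
    for j in range(len(z)):
        if j == top:
            continue
        diff = [a - b for a, b in zip(weights[top], weights[j])]
        norm = math.sqrt(sum(d * d for d in diff))
        distances.append((z[top] - z[j]) / norm)
    return min(distances)


x = [2.0, 1.0]
small_weights = [[1.0, 0.0], [0.0, 1.0]]
large_weights = [[10.0, 0.0], [0.0, 10.0]]  # same boundary, 10x larger logits
```

Here `large_weights` gives a logit margin ten times that of `small_weights` at the same point, yet both classifiers have the same input margin, since the norm of the weight difference (the relevant Lipschitz factor in the linear case) grows by the same factor.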

Review 3

Summary and Contributions: This paper proposes a regularization term in the loss function to provide so-called "certified robustness" for deep neural networks. The regularization term is simply a cross-entropy term and introduces relatively small overhead, but the performance increase is significant.
------------
Post rebuttal: Most of my concerns are addressed and I think it is a good submission.

Strengths: 1) This paper is well-written and self-contained. 2) The major part of the theoretical grounding seems sound. 3) Experiment results are convincing and comprehensive.

Weaknesses: 1) Some notation does not seem to be defined before use, and I have questions on Eq. 6 and Eq. 7 (see questions below). 2) The broader impact section seems to be missing some items required by the NeurIPS template, e.g. I do not see answers to "Who may benefit from this research", "Who may be put at disadvantage from this research" and "What are the consequences of failure of the system". The broader impact section might need a full revision before publication.

Correctness: Can you clearly state the gap between Eq. 6 and Eq. 7? From my perspective, Eq. 6 means maximizing P_\delta(f(x+\delta)=\hat{f}(x)) and Eq. 7 is the standard cross-entropy. However, P_\delta(f(x+\delta)=\hat{f}(x)) is not a monotone function of the cross-entropy, since there are cases where models have different cross-entropy losses but achieve the same P_\delta(f(x+\delta)=\hat{f}(x)). The authors should state the gap between Eq. 6 and Eq. 7 more clearly.
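A small numeric check makes this concern concrete. The sketch below compares the fraction of noisy copies whose argmax matches a target class (an empirical stand-in for the agreement probability) with the mean cross-entropy over the same copies. The three sets of predicted probabilities are made-up illustrative values, not results from the paper, and the function names are hypothetical.

```python
import math


def agreement_rate(noisy_probs, target):
    """Fraction of noisy copies whose argmax prediction equals target."""
    hits = sum(
        1 for p in noisy_probs if max(range(len(p)), key=p.__getitem__) == target
    )
    return hits / len(noisy_probs)


def mean_cross_entropy(noisy_probs, target):
    """Average of -log p[target] over the noisy copies."""
    return sum(-math.log(p[target]) for p in noisy_probs) / len(noisy_probs)


# Made-up predictions on four noisy copies of one input, target class 0.
model_a = [[0.6, 0.4]] * 4                      # always right, low confidence
model_b = [[0.99, 0.01]] * 4                    # always right, high confidence
model_c = [[0.999, 0.001]] * 3 + [[0.45, 0.55]]  # wrong on one copy
```

Models A and B agree with the target on every copy yet have different cross-entropy, while model C has lower cross-entropy than model A despite a lower agreement rate, so neither quantity determines the ordering of the other.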

Clarity: Generally yes. I have the following questions/suggestions: 1) What does the bold 1 in Eq. 2 and Eq. 5 mean? This does not seem to be standard notation and should be explained. 2) Adding references to the legend of Fig. 1 would make it clearer. When I reached Fig. 1, I could not tell that "consistency" is the proposed method.

Relation to Prior Work: Yes.

Reproducibility: Yes

Additional Feedback: Why is SmoothAdv + Consistency not considered in Table 2 for the ImageNet dataset?

Review 4

Summary and Contributions:
- The paper shows that a simple consistency regularization term added to a standard training scheme surprisingly improves the certified robustness of smoothed classifiers.
- Owing to its simplicity, the method allows faster training of robust classifiers.
- Even though the results are similar to those of other approaches, the speed-up in training is a desirable property.
- Observations show that the method's single hyperparameter is stable across multiple choices, demonstrating the method's robustness/insensitivity to hyperparameters.

Strengths:
- Compared to other approaches, this method offers significantly lower training cost with fewer hyperparameters.
- Using the proposed method with SmoothAdv [30] further improves the certified robustness (although it is significantly more expensive). This shows that the method can be combined with other methods in different ways and still show its benefits.
- Results on a range of different experiments, including ImageNet.
- Experimental results are convincing, at times showing the benefit of the consistency regularization, though the main strength lies in speeding up the training process.

Weaknesses: - It would have been interesting to see results on multiple models for the same dataset. It is not uncommon for a method to behave very differently across network architectures.

Correctness: Yes

Clarity: Yes

Relation to Prior Work: Yes

Reproducibility: Yes

Additional Feedback: