NeurIPS 2020

Once-for-All Adversarial Training: In-Situ Tradeoff between Robustness and Accuracy for Free


Review 1

Summary and Contributions: This paper explores the balance between robustness and accuracy by proposing a training method whose objective combines a classification loss and a robustness loss as weighted regularisation terms. The approach makes the training algorithm conditioned on lambda, the hyper-parameter that weights the two terms.

Strengths: A framework for adversarial training that can balance multiple regularisation terms.

Weaknesses: Based on the idea of conditioning, some detailed study is conducted, although the key technical methods all come from other papers. For example, the initial implementation uses Feature-wise Linear Modulation from [31], but the performance is found to degrade significantly; a dual batch normalization (dual BN) from [16] is then adopted. Afterwards, the framework is extended to a second dimension, efficiency. Technically, my concern is with the seemingly small number of sampled lambda values and the fact that the lambda values are sampled uniformly. Is there any theoretical argument that such sampling can actually give a sufficiently good picture of what is going on in terms of hyper-parameter tuning? More specifically, it is unclear to me how to validate the resulting framework beyond simple comparisons such as those in Fig 5 and Fig 6; at the least, I am not convinced that the balance between SA and RA is so simple. It would also be useful to discuss the cost of this framework: from Fig 3, it looks to me as if the method trains a network several times larger than the original network. Based on the above observations, I think the problem itself is important and the general approach is interesting, but I have reservations about novelty: given that the work follows the conditional training approach and is adapted from other existing methods ([31], [16], etc.), the novelty is not as significant. It is also unclear to me how well this approach can be used to study the robustness-accuracy balance; more theoretical argument might be needed.
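For concreteness, my reading of the conditional training procedure is roughly the following (a minimal sketch, not the authors' code; the `model(x, lam)` interface, the discrete lambda set, and the attack hyper-parameters are my assumptions):

```python
import torch
import torch.nn.functional as F

# Discrete set of trade-off weights, sampled uniformly per minibatch (my assumption).
LAMBDAS = [0.0, 0.2, 0.5, 0.8, 1.0]

def pgd_attack(model, x, y, lam, eps=8/255, alpha=2/255, steps=10):
    """Standard L_inf PGD; conditioning the attack on lam is an assumption."""
    x_adv = x + torch.empty_like(x).uniform_(-eps, eps)
    for _ in range(steps):
        x_adv = x_adv.detach().requires_grad_(True)
        loss = F.cross_entropy(model(x_adv, lam), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = x + (x_adv - x).clamp(-eps, eps)
        x_adv = x_adv.clamp(0.0, 1.0)  # assumes inputs in [0, 1]
    return x_adv.detach()

def train_step(model, optimizer, x, y):
    # Sample one lambda for this minibatch and weight the two loss terms by it.
    lam = LAMBDAS[torch.randint(len(LAMBDAS), (1,)).item()]
    x_adv = pgd_attack(model, x, y, lam)
    loss = (1 - lam) * F.cross_entropy(model(x, lam), y) \
           + lam * F.cross_entropy(model(x_adv, lam), y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Even under this reading, my question stands: why should a handful of uniformly sampled lambda values suffice to cover the whole trade-off curve?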

Correctness: The method looks interesting to me, although I remain somewhat unconvinced that it can solve, or is the ultimate tool for analysing, the robustness-accuracy balance.

Clarity: It is well written.

Relation to Prior Work: Related work is discussed and cited.

Reproducibility: Yes

Additional Feedback:


Review 2

Summary and Contributions: This paper describes a novel “in-situ” flexible model that can trade off between accuracy and robustness (as well as compactness, as an extension) at test time, based on user specification. For the first time, such a problem is raised, formulated, and addressed in the adversarial defense literature.

Strengths: The method was designed to tackle an important and under-addressed problem: trading off accuracy and robustness with adversarial training, without re-training. The proposed solution is a plug-n-play extension to standard AT, based on “model-data space joint sampling” (as the authors call it). Another interesting extension is considered for co-optimizing the network width. I think the overall methodology is sound and novel. The Figure 2 visualization of BN features explains the challenge of packing different SA-RA models into one, and Supplement D shows that dual BN is indeed a key knob behind the method's success. Experimental results show that CAT (with dual BN) works well in general, being comparable to traditional AT trained separately for each individual lambda value. The authors also present a large number of ablation experiments, making their conclusions more convincing.
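For readers less familiar with the dual BN component, my mental model is simply two parallel sets of BN statistics, one for clean and one for adversarial inputs, selected by a flag at forward time (an illustrative sketch, not the authors' code):

```python
import torch.nn as nn

class DualBatchNorm2d(nn.Module):
    """Two BatchNorm2d branches with separate running statistics and affine
    parameters: one for clean inputs, one for adversarial inputs.  Which
    branch is used is chosen at forward time.  Illustrative only."""

    def __init__(self, num_features):
        super().__init__()
        self.bn_clean = nn.BatchNorm2d(num_features)
        self.bn_adv = nn.BatchNorm2d(num_features)

    def forward(self, x, adversarial=False):
        return self.bn_adv(x) if adversarial else self.bn_clean(x)
```

During training, the clean part of the objective would presumably route through bn_clean and the PGD examples through bn_adv, which matches the Figure 2 observation that the two feature distributions have very different statistics.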

Weaknesses: IMHO, the SA-RA trade-off seems most useful when a human is placed in the loop to decide how “aggressive” the predictor should behave. For example, besides standard image classification, it would be meaningful to benchmark the proposed approach on medical image classification, or other human-in-the-loop decision making, to show its real benefits. It seems lambda is specified by the user at test time, but I am not sure what the principle or rule of thumb is for users to select a lambda value that achieves their desired performance. There is still room for performance improvement: the results show that retraining can still boost SA-RA at some data points. Also, CATS is (understandably) somewhat outperformed by PGD-AT, so jointly optimizing three objectives seems to be challenging. There is no discussion of training costs. Does CAT take longer to train than PGD-AT?
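On the rule-of-thumb question, one simple recipe I could imagine (purely hypothetical on my part, not something proposed in the paper) is to sweep lambda on a held-out set and take the smallest value that meets a robustness target:

```python
from typing import Callable, Iterable, Tuple

def pick_lambda(eval_robust_acc: Callable[[float], float],
                ra_target: float,
                lambdas: Iterable[float] = (0.0, 0.2, 0.5, 0.8, 1.0)) -> Tuple[float, float]:
    """Return the smallest lambda whose held-out robust accuracy meets the target,
    falling back to the most robust setting.  `eval_robust_acc` is a hypothetical
    callback that evaluates the calibrated model at a given lambda."""
    lams = sorted(lambdas)
    for lam in lams:
        ra = eval_robust_acc(lam)
        if ra >= ra_target:
            return lam, ra
    return lams[-1], eval_robust_acc(lams[-1])
```

Even with such a recipe, an explicit discussion of how users should choose lambda, and of the training cost relative to PGD-AT, would strengthen the paper.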

Correctness: Yes

Clarity: Yes

Relation to Prior Work: Yes

Reproducibility: Yes

Additional Feedback:


Review 3

Summary and Contributions: The paper proposes an empirical framework called calibratable adversarial training. It allows for user-specified calibration of the desired robustness level, depending on the test-time use case, without re-training. The experiments show that, with just “once for all” training, the method impressively achieves similar or superior performance compared to dedicatedly trained models at various configurations.

Strengths: This is the first framework proposed for a new goal: controlling the trade-off between accuracy and adversarial accuracy “in-situ”. The motivation is clear and interesting: this new goal is meaningful for quickly exploring a model's accuracy-robustness performance spectrum, avoiding repetitive training to exhaustively sweep hyperparameters. It also enables future applications where the perceiving agent may switch between “optimistic” and “conservative” from time to time, in reaction to a dynamic environment. The framework is enabled by taking the trade-off parameter as a conditional input and training with this additional parameter sampled anew at each minibatch (the loss is adapted accordingly). This is a sound idea, and the authors also discuss the impact of the sampling strategy in their experiments. The idea is extended to incorporate another parameter controlling model complexity, so the methodology appears to be general. The experiments compare against strong PGD-AT baselines; the authors describe their protocols in detail, and the results look convincing (comparable to adversarial training while avoiding re-training). I like that the authors plot the SA-RA frontiers, which nicely summarize the big picture of the algorithms' performance spectrum. From the supplementary material, the Jacobian visualizations and the demonstrated robustness to more unseen attack types are also noted.

Weaknesses: This is a strong paper that I feel overall positive about. No particular weakness needs to be addressed during the rebuttal; just some nitpicks and suggestions:
- I am quite curious what will happen if you feed in a lambda outside the [0,1] range at test time.
- For the role of dual BN in adversarial robustness, a better reference than [16] is “Intriguing properties of adversarial training at scale”, ICLR 2020 (from the same authors).
- As one key building block, the introduction of the FiLM layer is underwhelming; more details should be included to make the paper self-contained (I sketch the general FiLM idea below for concreteness).
- As an experimental paper, it would be nicer to demonstrate results on larger datasets. This is just a suggestion; I understand adversarial training at ImageNet scale cannot be completed easily in a short time.
- (minor) You might want to discuss a recent paper of similar taste, “Once-for-All: Train One Network and Specialize it for Efficient Deployment on Diverse Hardware Platforms”, although admittedly the two works solve different questions.
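For reference, here is my rough sketch of what a FiLM-style conditioning layer does in general (the general idea from [31], not the authors' implementation; shapes, sizes, and names are illustrative):

```python
import torch
import torch.nn as nn

class FiLM(nn.Module):
    """Feature-wise Linear Modulation: a small MLP maps the conditioning input
    (here, the scalar trade-off lambda) to per-channel scale and shift that
    modulate an intermediate feature map.  This sketches the general idea of
    [31]; it is not the authors' implementation."""

    def __init__(self, num_channels, cond_dim=1, hidden=16):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(cond_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 2 * num_channels),
        )

    def forward(self, feat, lam):
        # feat: (N, C, H, W) feature map; lam: (N, cond_dim) conditioning input
        gamma, beta = self.mlp(lam).chunk(2, dim=1)
        return feat * gamma[:, :, None, None] + beta[:, :, None, None]

# Example: modulate a (4, 64, 8, 8) feature map with lambda = 0.5 for every sample
feat = torch.randn(4, 64, 8, 8)
lam = torch.full((4, 1), 0.5)
out = FiLM(64)(feat, lam)
```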

Correctness: Yes

Clarity: This paper is well written, and the logic flow is easy to follow. Contributions are laid out clearly. Figures 1 to 3 are nicely done and helpful for understanding the algorithms.

Relation to Prior Work: Yes

Reproducibility: Yes

Additional Feedback:


Review 4

Summary and Contributions: This paper studies an important problem in adversarial machine learning: how to flexibly switch between different levels of the robustness-accuracy tradeoff without retraining the models. Motivated by an existing work that applies batch normalization separately to clean and adversarial data, this paper proposes a calibratable adversarial training (CAT) scheme to achieve different levels of in-situ robustness-accuracy tradeoff all at once, in a single round of training. At test time, this allows the user to obtain any level of robustness/accuracy by specifying a tradeoff parameter. A slimming parameter is further incorporated into the CAT framework for runtime efficiency. The proposed methods are verified on three datasets, along with an explanation for the clean-adversarial batch norm separation.

----------- Post-rebuttal update: I am happy to raise my score to 6 after the rebuttal. Most of my concerns have been addressed to some extent. While I still think the unified formulation is not strictly precise, I appreciate the authors' attempts. Overall, the idea proposed in this paper is very novel.

Strengths:
1. This paper is very well written and easy to read.
2. The proposed CAT/CATS training schemes are well motivated and could benefit a wide range of industrial applications.
3. The idea of batch norm separation can be easily implemented and seems to work well with adversarial training.
4. Complete experimental evaluation and explanations.

Weaknesses:
1. The novelty of separating clean vs. adversarial statistics via isolated batch normalization is limited, considering that a similar technique has been proposed in [16]. Although [16] is not about adversarial training, it technically shows that data statistics can be separated without harming learning (it even improves learning).
2. The proposed CAT/CATS training is very similar to ensembling, but with shared weights for the Conv layers, which is also similar to the weight sharing across child networks in neural architecture search (NAS). I do acknowledge the contribution of transferring these ideas to adversarial training.
3. The demonstrated robustness-accuracy-runtime tradeoffs in Figures 4 and 6 are not even across different parameter values. RA does not change much when SA is small, but suddenly drops to a much lower level once SA exceeds certain values. This suggests the proposed methods did not really address the trade-off issue, but simply put together different subnetworks.
4. The formulations of existing adversarial training methods in Equations (1) and (2) are wrong! In Eq. (1), the left side is not equal to the right side: the left is a minimization problem, while the right is just an empirical error, missing \min_{\theta}. The formulation of PGD-AT in Equation (2) is wrong: L_c = 0 for PGD-AT, and for TRADES the weighting is not of the (1-\lambda)/\lambda form (the weights do not sum to one). These formulations have been well summarized in one ICLR20 paper [2]. Please double-check the original formulation used in [6], its Equation (2.1). There is also no L_c term in MMA; see its Equation (3). (The standard PGD-AT and TRADES objectives are recalled below for reference.)
5. A follow-up issue: I am not sure what PGD-AT with \lambda != 1 means. For example, in Figure 5 the authors test different \lambda values for PGD-AT, and it is hard to interpret what that means (it is no longer PGD-AT if \lambda != 1).
6. The evaluation metric RA uses the same PGD attack for both training and testing, which is a bit too weak; a stronger PGD attack should be used for testing (for example, train with PGD-10 and test with PGD-40). This may not change the conclusions of the paper, though. Also, the experiments in Figure 5 should be done on CIFAR-10 rather than SVHN; otherwise it is hard to interpret the real difference compared to standard PGD-AT, as most of the understanding of PGD-AT in this field is based on CIFAR-10.
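For reference (point 4), the standard objectives as I would write them; this is how these formulations are usually stated, not a transcription of the paper's Equations (1)-(2):

```latex
% PGD-AT (Madry et al.): pure min-max, with no separate clean-loss term
\min_{\theta} \; \mathbb{E}_{(x,y)\sim\mathcal{D}}
  \Big[ \max_{\|\delta\|_\infty \le \epsilon} \mathcal{L}\big(f_\theta(x+\delta),\, y\big) \Big]

% TRADES: clean loss plus a KL robustness term with a free weight beta
% (written 1/\lambda in the TRADES paper); the two weights need not sum to one
\min_{\theta} \; \mathbb{E}_{(x,y)\sim\mathcal{D}}
  \Big[ \mathcal{L}\big(f_\theta(x),\, y\big)
      + \beta \max_{\|\delta\|_\infty \le \epsilon}
        \mathrm{KL}\big(f_\theta(x) \,\|\, f_\theta(x+\delta)\big) \Big]
```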

Correctness: Claims and method are correct, though there are formulation issues.

Clarity: Yes, very well written.

Relation to Prior Work: Yes.

Reproducibility: Yes

Additional Feedback: