NeurIPS 2020

Adversarially Robust Few-Shot Learning: A Meta-Learning Approach


Review 1

Summary and Contributions: The proposed work addresses model robustness in the few-shot regime. The authors show that naturally trained meta-learners are not robust to adversarial examples and develop a robust meta-learner that uses adversarial examples on the query set to improve robustness.

Strengths: The authors develop a simple, algorithm-agnostic method to improve robustness in the context of few-shot learning and show that models trained using the Adversarial Querying (AQ) method are more robust than their naturally trained counterparts. Another interesting observation is that transfer learning from an adversarially trained model has lower performance than AQ. They also discuss the effectiveness of commonly used methods from the adversarial examples literature, such as preprocessing and denoising blocks, in the context of few-shot learning.

Weaknesses: Although the proposed approach is simple, some experimental findings could be explained better. From the results of Tables 7 and 8, it seems that perturbing support data does not provide any advantage (in the case of the Omniglot dataset, it is even less robust than the naturally trained model). Similarly, it is unclear why AQ is more robust than transfer learning from adversarially trained models. Previous works [1, 2] have shown that transfer learning can be a very good baseline for few-shot classification, so the significant reduction in natural accuracy is an interesting observation. [1] Chen et al., "A Closer Look at Few-Shot Classification," arXiv:1904.04232. [2] Dhillon et al., "A Baseline for Few-Shot Image Classification," arXiv:1909.02729.

Correctness: Yes

Clarity: The structure and writing quality of the paper could be improved. The use of many small tables makes the paper a little difficult to read and makes the results from different experimental settings difficult to summarize. The best results from the algorithm appear in Table 9, which could have been included along with the main results.

Relation to Prior Work: Yes, the authors have highlighted how the proposed work is different from previous approaches.

Reproducibility: Yes

Additional Feedback: A few additional concerns:
- Most of the experiments are done using the backbone from R2-D2. It would be interesting to see the effect of using a larger backbone on the transfer learning experiments.
- The attack algorithms used for evaluation are based on the l_\infty norm; does the robustness hold across other norms as well?
- In Tables 16 and 17, we expect adversarial accuracy to increase as 1/lambda increases. However, we see a drop in robustness. Why?
- In L164, it is mentioned that "Attacking only support data can be seen as maximizing clean test accuracy when fine-tuned in robust manner". It is unclear why that should occur.
To summarize, the proposed approach is interesting and could encourage further research toward robust models for few-shot learning. However, the reasoning for some of the experimental findings is unclear, and the paper needs to be restructured for easier understanding.
------
Update: The authors have addressed my concerns about the experiments with transfer learning and the additional attack norm. The novelty of this work lies in perturbing only query data, but it seems natural that perturbing support data as well should increase robustness, which is not observed. This concern was shared by other reviewers as well, and more description/justification was required to explain it. For the META-TRADES experiments as well, the original paper [30] showed that only natural accuracy decreases while robust accuracy improves when increasing 1/lambda; this is different from the network behaving as a constant function. The authors are encouraged to look into this in more detail.
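For context on the 1/lambda trade-off discussed above: the TRADES objective from [30] weights a robustness regularizer by 1/lambda. The sketch below is from memory of that paper, not taken from the paper under review; f_theta is the classifier, B(x, epsilon) the perturbation ball, and L a surrogate loss:

```latex
\min_{\theta} \; \mathbb{E}_{(x,y)} \left[ \mathcal{L}\big(f_\theta(x),\, y\big)
  \;+\; \frac{1}{\lambda} \max_{x' \in \mathcal{B}(x,\, \epsilon)}
  \mathcal{L}\big(f_\theta(x),\, f_\theta(x')\big) \right]
```

Increasing 1/lambda shifts weight from the clean-accuracy term to the robustness term, which is why one would expect adversarial accuracy to rise, as the review notes.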


Review 2

Summary and Contributions: Summary: This work develops an approach to producing a meta-learner robust to adversarial examples in few-shot classification. The main difference (and the novelty of this work) between the proposed Algorithm 2 and the classical meta-learner in Algorithm 1 is that adversarial queries are constructed in the inner loop of meta-learning. The authors thoroughly investigate the causes of adversarial vulnerability. They also demonstrate the superior robust performance of the proposed method, relative to robust transfer learning, on Mini-ImageNet and CIFAR-FS for few-shot image classification. Contributions: 1. This paper proposes a new method called adversarial querying, in which adversarial queries are constructed in the inner loop of meta-learning. 2. This paper thoroughly investigates the causes of adversarial vulnerability for four meta-learning algorithms, which is quite impressive to me.
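A minimal sketch of one adversarial-querying episode as the reviews collectively describe it (fine-tune on clean support data, then meta-update on PGD-perturbed query data). This is a first-order simplification for brevity; the paper's Algorithm 2 differentiates through the fine-tuning step, which this sketch does not, and all names here are illustrative, not taken from the authors' code:

```python
import copy
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps=8/255, step=2/255, iters=20):
    """l_inf PGD: ascend the loss from a random start, projecting back into the
    eps-ball each step. Assumes inputs are images scaled to [0, 1]."""
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1)
    for _ in range(iters):
        x_adv = x_adv.detach().requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv + step * grad.sign()
        x_adv = x + (x_adv - x).clamp(-eps, eps)  # project into the eps-ball
        x_adv = x_adv.clamp(0, 1)
    return x_adv.detach()

def adversarial_query_episode(model, meta_opt, support, query,
                              inner_lr=0.01, inner_steps=5):
    """One episode: inner loop on CLEAN support data, meta-loss on ADVERSARIAL
    query data. meta_opt is assumed to be an optimizer over model.parameters()."""
    (xs, ys), (xq, yq) = support, query
    learner = copy.deepcopy(model)                    # task-specific copy
    inner_opt = torch.optim.SGD(learner.parameters(), lr=inner_lr)
    for _ in range(inner_steps):                      # inner loop: clean support set
        inner_opt.zero_grad()
        F.cross_entropy(learner(xs), ys).backward()
        inner_opt.step()
    xq_adv = pgd_attack(learner, xq, yq)              # attack ONLY the query set
    learner.zero_grad()
    meta_loss = F.cross_entropy(learner(xq_adv), yq)  # outer-loop adversarial loss
    meta_loss.backward()
    for p, lp in zip(model.parameters(), learner.parameters()):
        p.grad = None if lp.grad is None else lp.grad.clone()  # first-order meta-gradient
    meta_opt.step()
    return meta_loss.item()
```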

Strengths: 1. The proposed idea of constructing adversarial queries in the inner loop of meta-learning is simple, and the good performance resulting from this idea is demonstrated. 2. The investigation of the adversarial vulnerability of the four meta-learning algorithms is thorough. To me, this is the most impressive strength of this work.

Weaknesses: 1. This work is mostly empirical, as with many papers in this field; the results would be more convincing to me if some theoretical investigation were provided, although I understand this may be hard. 2. Only four meta-learning algorithms (MAML, R2-D2, MetaOptNet, and ProtoNet) are tested with the proposed idea; more state-of-the-art meta-learning algorithms could be considered.

Correctness: To me, the conclusions drawn from empirical results are reasonable, and the empirical studies are thorough and impressive.

Clarity: The subsections in Sections 4 and 5 feel somewhat disconnected to me, but I think the paper is well written overall.

Relation to Prior Work: The paper explains the relationship with related work, including ADML [29], a meta-learner based on MAML that is robust to adversarial attacks. A more detailed comparison between the ADML algorithm and the proposed Algorithm 2 would make the difference between these two adversarial meta-learners, as well as the novelty of the proposed method, clearer, but I understand the limits of space.

Reproducibility: Yes

Additional Feedback: ######## After reading the other reviewers' comments and the authors' responses, I have decided to lower my score, due to the concerns about the novelty and the empirical investigation of this work.


Review 3

Summary and Contributions: This paper addresses the issue that few-shot learning methods are highly vulnerable to adversarial examples. The authors show via PGD attacks that naturally trained meta-learning methods are not robust. They then propose adversarial querying to address this issue. The authors find that meta-learning models are most robust when the feature extractor is fixed and only the last layer is retrained during fine-tuning. They also show that the choice of classification head significantly impacts robustness.
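A minimal illustration of the fixed-feature-extractor fine-tuning described above (freeze the backbone, retrain only the classification head on the support set); the module and function names are hypothetical, not the authors':

```python
import torch
import torch.nn.functional as F

def finetune_last_layer(backbone, head, support_x, support_y, lr=0.01, steps=100):
    """Freeze the feature extractor; retrain only the classification head."""
    for p in backbone.parameters():
        p.requires_grad_(False)
    opt = torch.optim.SGD(head.parameters(), lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        logits = head(backbone(support_x))  # gradients flow only into the head
        F.cross_entropy(logits, support_y).backward()
        opt.step()
    return head
```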

Strengths:
- The problem this paper studies is important.
- This paper is the first to propose a meta-learning approach for robust few-shot learning, which is novel.
- This paper compares AQ with transfer learning from adversarially trained models and finds that AQ is better.
- This paper further studies the robustness-accuracy trade-off, fine-tuning only the last layer, the R2-D2 architecture, etc.
In summary, I think this paper studies the robustness of few-shot learning well and investigates the AQ method in depth.

Weaknesses: The main algorithm is somewhat simple. It is essentially adversarial training applied within a meta-learning framework, which is standard for few-shot learning. My concern is the novelty of the algorithm. But the problem is interesting, and this paper takes the first step toward developing adversarially robust methods for few-shot applications.

Correctness: Yes. I have not seen anything wrong so far, but I am not an expert in this area.

Clarity: I suggest the authors show at least one visual example, rather than just accuracy numbers, to better illustrate the problem and the method's performance.

Relation to Prior Work: Yes. This work is the first one to use meta-learning to achieve robustness for few-shot learning. It is clear that the method of this work is novel and different from previous works.

Reproducibility: Yes

Additional Feedback:


Review 4

Summary and Contributions: The paper proposes to make few-shot learning on image classification tasks (MiniImageNet, Omniglot, and CIFAR-FS) robust to PGD-based adversarial attacks. To this end, the authors propose a simple solution: optimise the loss in the outer loop of any meta-learning algorithm over adversarially perturbed query data points. The authors provide an extensive empirical evaluation using four baseline meta-learning algorithms (MAML, ProtoNet, R2-D2, and MetaOptNet) and compare the performance of their adversarial querying method across different configurations. Their results show that AQ does improve adversarial robustness to PGD attacks on few-shot learning tasks.
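In symbols, the min-max objective this summary describes could be written as follows (a sketch under assumed notation, not copied from the paper: S and Q are the support and query sets of a sampled task, theta'(S) the parameters after inner-loop fine-tuning on S, and epsilon the attack budget):

```latex
\min_{\theta} \; \mathbb{E}_{(S,\,Q)} \left[ \max_{\|\delta\|_\infty \le \epsilon}
  \mathcal{L}\big(F_{\theta'(S)},\, Q + \delta\big) \right]
```

The inner maximization over delta is approximated with PGD, and only the query set Q is perturbed; the support set S used for fine-tuning stays clean.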

Strengths:
1) To the best of my knowledge, this is the first work considering the problem of adversarial robustness of few-shot learning in image classification tasks. This work can prove to be an important starting point for robustness research on few-shot learning.
2) I really like the initial evaluation supporting the claim of why existing methods are not very robust to PGD attacks; this provides motivation for developing better meta-learning algorithms. Also, their empirical comparison in Table 6, showing meta-learning approaches to be more robust than transfer learning approaches, is insightful.
3) The empirical results do indicate that adversarial querying improves the robustness of few-shot learning on MiniImageNet, Omniglot, and CIFAR-FS.

Weaknesses: [Edit after Author Response] I thank the authors for acknowledging the suggestions for merging the tables, improving the captions, and moving Algorithm 1 to the Appendix.
1) Discussing the findings of Tables 7 and 8 and providing reasoning for them is fairly important, since this detail forms the crux of your simple idea. The author response does not elaborate much on this, but rather restates the observations from Table 8.
2) While I thank the authors for performing more experiments on state-of-the-art meta-learning approaches like MCT and mentioning that AQ on MCT reduces the drop in natural accuracy, the current results in the paper using other meta-learning approaches do show a large drop in natural accuracy. This certainly diminishes the practical use of AQ.
-------
I agree that the paper has some good positive points. However, I am currently slightly inclined towards rejection, primarily for the following reasons:
1) The core idea of this paper is very simple and straightforward. Though the authors note that they are the first to do it, I am unsure whether this work counts as a novel enough contribution for the NeurIPS community.
2) While it is common knowledge that adversarial training leads to a drop in natural accuracy while improving adversarial robustness, the drop is usually small and not very significant. In contrast, from the results in Tables 4 and 5 vs. Table 2, it appears that using AQ causes a big drop (sometimes almost 15-20%) in natural accuracy. While the adversarial robustness increases, such a big drop in natural accuracy does not support the AQ technique being very useful practically.
3) Apart from providing some empirical results in Tables 7 and 8, the authors give very little reasoning/exploration of why only the query data, and not the support data or both (query + support data), is perturbed in their proposed technique. This seems to be the crux of their simple idea, but it is not well justified. I would have liked a better intuitive explanation, and possibly some theoretical justification, as to why fine-tuning in the outer loop with adversarially perturbed query data alone is sufficient to make the model robust to previously unseen few-shot (support, query) pairs.
4) Instead of having so many small individual tables, I would rather have a single big table comparing the different methods/experiments (especially since almost all the tables are over the same two datasets). There are two main reasons for this: a) having separate tables makes it difficult to compare the improvements of a technique against techniques whose results are in a separate table; b) several results repeat across different tables, which seems unnecessary and space-filling.

Correctness: Yes, the methods used by the authors, along with the empirical evaluation, are correct, though lacking justification in places.

Clarity: For the most part, the paper is well written. The tables, however, are sometimes tough to comprehend, especially since none of the table captions briefly describe the findings from the results. Further, in some places the tables are not referenced from within the text, so it becomes slightly confusing which tables contain which results (especially Tables 7 and 8).

Relation to Prior Work: The paper clearly discusses prior work and related work of few-shot learning and adversarial attacks and learning.

Reproducibility: Yes

Additional Feedback: Questions:
1) I am surprised at the big difference in A_{adv} values between "MAML adv. query" and "MAML adv. query and support" in Table 8. Can you provide any intuition/justification for this observation?
Suggestions:
1) The table captions are not very informative. Most of the table titles contain the same boilerplate strings, which are unnecessary. Instead, since there are so many tables, you should indicate in one line what each table is trying to evaluate, and give a one-line description in words of the findings from the table's results. You do not need to repeat everywhere that "nat" and "adv" refer to natural and robust accuracy (this can be mentioned just once in the first table).
2) The basic description of the meta-learning algorithm (Algorithm 1) consumes too much space without adding anything novel and, in my opinion, should be moved to the Appendix.


Review 5

Summary and Contributions: The paper studies the performance of meta-learning algorithms for few-shot learning under white- and black-box adversarial attacks. Systematic experiments with adversarial learning (mainly with projected gradient descent (PGD) attacks) lead the authors to propose training meta-learners with adversarially modified query data (constructed with PGD). The proposed method shows favorable results with several meta-learners (MAML, ProtoNet, R2-D2, MetaOptNet) on a few datasets (Mini-ImageNet, Omniglot, and CIFAR-FS) under white-box attacks. Furthermore, favorable results are also demonstrated in comparison to architectural and pre-processing based defenses.

Strengths: The proposed approach is intuitive and simple to implement. It can be used with any episodically trained meta-learner. The proposed approach is motivated by systematic experiments. Additional experiments with black-box attacks are presented. Comparisons are made with a wide range of other defenses.

Weaknesses: The proposed approach is somewhat obvious; it is an almost out-of-the-box application of [17] to episodic training. In this sense, the contributions of this paper are mostly to provide baseline empirical results.

Correctness: Questions/Issues:
1- The proposed adversarially trained approach performs considerably worse when no adversarial attack is present (57.87% vs. 73.01%); why is that? Since an attacker can choose to attack or simply not change the input, a reasonable performance measure is the average performance (A_{nat} + A_{adv}) / 2, which yields 44.69% on 5-shot Mini-ImageNet compared to 36.51% for the baseline (R2-D2) -- still demonstrating the advantage of the proposed method, though not as great as when comparing A_{adv} alone (see the worked arithmetic after this list).
2- The proposed approach was not demonstrated on meta-learners that minimize a *support* set loss during meta-training. For example, instead of minimizing the generalization error, one can minimize the leave-one-out validation error (I believe Reptile does this). It is not clear whether this method would apply in that setting; it would be nice to see an experiment on this.
3- The results in each section are not necessarily complete. Certain sections present results on some datasets (and shots) and not others. For example, Section 4.2 presents results only on 5-shot Mini-ImageNet; the supporting evidence for query-only augmentation could be made stronger with results on other datasets. Also in the same section: MAML 5-shot on Mini-ImageNet should be 63%, not the reported 60%. The same holds for the experiment in Section 4.1 -- where are the analogous experiments on other datasets?
4- Finally, it should be noted that the datasets used in this paper are somewhat small and do not test out-of-distribution generalization. I would like to see results on other datasets (e.g., Meta-Dataset), which contain far more data and test out-of-distribution generalization.
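For concreteness, the averaged-metric arithmetic in point 1, with the A_{adv} values back-solved from the stated averages (an illustration, not numbers quoted from the paper):

```latex
\text{AQ:}\quad \frac{A_{nat} + A_{adv}}{2} = \frac{57.87 + 31.51}{2} \approx 44.69
\qquad
\text{R2-D2:}\quad \frac{73.01 + 0.01}{2} \approx 36.51
```

The near-zero implied A_{adv} of the naturally trained baseline is consistent with the reviews' observation that naturally trained meta-learners are not robust to PGD attacks.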

Clarity: The paper is mostly well written. A few minor points: 1- Copyright control is not a safety-critical application. 2- Section 2.3 needs more clarification: is the term "retraining" used to mean "fine-tuning"? Are the layers in question re-initialized randomly? Line 97: "against a weak attacker" -- what kind of attack? How weak?

Relation to Prior Work: As far as the paper mentions, the only prior work on adversarial meta-learning is [29]. I am not aware of other works myself.

Reproducibility: Yes

Additional Feedback: