Summary and Contributions: The paper tackles the problem of unsupervised domain adaptation, in which no target labels are observed. The authors propose EAML, an unsupervised meta-adaptation framework claimed to handle domain shift and catastrophic forgetting effectively, and the experiments support this claim.
Strengths:
- The paper claims that EAML applies to continuously evolving domains, which gives it broader practical impact.
- The method is described in detail at each hierarchical step.
- Theorem 1 is sound, and its upper bound is subsequently put to use for representation learning.
Weaknesses:
- Three datasets are employed, but detailed quantitative results are presented only for MNIST. The ablation study for Vehicles and the t-SNE visualization for Caltran are useful, yet they cannot substitute for such results; the authors should also clarify why these results are missing. Likewise, the ablation study and t-SNE results for the other datasets could be included in the supplementary material.
- Figure 5 does not clearly show the superiority of EAML over JAN Merge; plotting both in the same chart would make the comparison easier to read.
Correctness: The derivations look correct. For the experiments, however, the MNIST and Vehicles datasets used in the paper do not seem to evolve continuously enough. Experiments on more continuously evolving datasets would make the paper and its claims more convincing.
Clarity: The paper is written clearly, and the points it makes are easy to follow.
Relation to Prior Work: The authors discussed the prior work, covering classic discrete-domain methods and continuous domain adaptation, and noted the differences from this work.
Reproducibility: Yes
Additional Feedback: Post Rebuttal: I acknowledge your great effort in providing detailed responses to all reviewers' concerns and would like to increase my score.
Summary and Contributions: This paper presents a new domain adaptation setting and proposes a meta-learning-based domain adaptation framework to address it. In the new setting, the target domain's unlabeled data evolve over time, and the model is required to adapt to the continually evolving target domain without forgetting. The authors propose the Evolution Adaptive Meta-Learning (EAML) framework, which includes a meta-objective to learn representations for adapting to continually evolving target data and a meta-adapter for adapting to the current target without forgetting previous targets. Experiments demonstrate the effectiveness of the proposed method.
Strengths:
+ A new, practical domain adaptation setting that combines domain adaptation and continual learning.
+ A novel meta-learning framework that can capture the structure of the evolving target domain and adapt to continually evolving target data without catastrophic forgetting.
+ Insightful analysis of the learned meta-representation.
Weaknesses:
- It is not clear how to balance domain adaptation against learning without forgetting.
- In Figure 3, why does JAN Merge underperform EAML-full and EAML-rep at the beginning? Are the hyperparameters well tuned?
- There is no comparison with other continuous domain adaptation work such as [2,10,14].
- There is no discussion of a very relevant work [a]: Z. Wu, et al., ACE: Adapting to Changing Environments for Semantic Segmentation, ICCV 2019. How does the meta-adapter proposed in this paper compare with the AdaIN used in [a]? (See the AdaIN sketch after this list.)
- How does the method handle the case where there is large intra-domain variance within the target domain and abrupt changes over time?
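For reference on the ACE comparison above, here is a minimal sketch of AdaIN (Huang & Belongie, 2017), the feature re-normalization that ACE builds on; it shows only the standard operation, not the paper's meta-adapter, and the function name is illustrative.

```python
import torch

def adain(content: torch.Tensor, style: torch.Tensor, eps: float = 1e-5):
    """Adaptive Instance Normalization: re-normalize content features to
    match the per-channel mean/std of style features. Inputs (N, C, H, W)."""
    n, c = content.shape[:2]
    c_flat = content.view(n, c, -1)
    s_flat = style.view(style.shape[0], style.shape[1], -1)
    c_mean, c_std = c_flat.mean(-1, keepdim=True), c_flat.std(-1, keepdim=True) + eps
    s_mean, s_std = s_flat.mean(-1, keepdim=True), s_flat.std(-1, keepdim=True) + eps
    out = s_std * (c_flat - c_mean) / c_std + s_mean
    return out.view_as(content)
```

Unlike a learned meta-adapter, AdaIN carries no trainable per-target state, which is part of why this comparison would be informative.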
Correctness: Yes, sounds correct.
Clarity: It is clear and well organized.
Relation to Prior Work: Most relevant work is discussed, but one very relevant work is missing: Z. Wu, et al., ACE: Adapting to Changing Environments for Semantic Segmentation, ICCV 2019.
Reproducibility: Yes
Additional Feedback:
Summary and Contributions: The paper studies evolving domain adaptation, where the domain continuously shifts over time. The paper proposes EAML (Evolution Adaptive Meta-Learning), which appears to be inspired by MAML: representation updates are performed in the outer loop, while the adapter module and classifier are adapted in the inner loop. In addition, the paper proposes a meta-adapter (an adapter over the adapter modules) to overcome overfitting.
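To make this structure concrete, below is a minimal first-order sketch of such a bi-level loop as I understand it; all names (`rep`, `adapter`, `classifier`, `loss_fn`) are illustrative assumptions rather than the authors' actual API, and the meta-adapter is omitted for brevity.

```python
import copy
import torch

def eaml_style_meta_step(rep, adapter, classifier, trajectory, meta_opt,
                         inner_lr=0.01, inner_steps=1):
    """One first-order meta-step over a sampled trajectory of evolving
    target batches. Inner loop: clone and adapt the adapter + classifier
    on each target stage; outer loop: update the shared representation
    `rep` so sequential adaptation works across the whole trajectory.
    `trajectory` is a list of (x_target, unsupervised_loss_fn) stages,
    and `meta_opt` is an optimizer over rep.parameters() only."""
    fast_ada, fast_cls = copy.deepcopy(adapter), copy.deepcopy(classifier)
    inner_opt = torch.optim.SGD(
        list(fast_ada.parameters()) + list(fast_cls.parameters()), lr=inner_lr)

    meta_loss = 0.0
    for x_t, loss_fn in trajectory:
        feats = rep(x_t).detach()            # representation frozen in inner loop
        for _ in range(inner_steps):         # adapt only adapter + classifier
            inner_opt.zero_grad()
            loss_fn(fast_cls(fast_ada(feats))).backward()
            inner_opt.step()
        # evaluate the adapted model, letting gradients flow into `rep`
        meta_loss = meta_loss + loss_fn(fast_cls(fast_ada(rep(x_t))))

    meta_opt.zero_grad()
    meta_loss.backward()                     # outer update of the representation
    meta_opt.step()
```

For example, one could construct `meta_opt = torch.optim.Adam(rep.parameters(), lr=1e-4)` so the outer step touches only the shared representation while the per-stage clones absorb the adaptation.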
Strengths:
- More realistic benchmarks (Evolving Vehicles and Caltran) are used in addition to the simpler rotated MNIST.
- Moderate novelty: multiple levels of meta-learning are proposed.
- The results seem quite interpretable, with the t-SNE visualization showing the evolution of the representation.
Weaknesses:
- The model does not consistently outperform the baselines across settings (e.g., Table 1).
- More discussion of, and comparison with, the continual learning literature is needed.
Correctness: Seems reasonable.
Clarity: Mostly. I wish the paper described the baselines in more detail; I do not feel I have a good grasp of how the baselines handle the adaptation.
Relation to Prior Work: I think the relation is quite clear from how the model is presented. However, the paper could better acknowledge the related work that inspires the proposed approach. For instance, the method is an adaptation of MAML with a modified inner loop that trains only the adapter and the classifier, along with other meta components. This seems somewhat inspired by Raghu et al. (2019), which shows that adapting only the classifier in the inner loop is largely sufficient (although this paper also adapts the adapter network, which is slightly different); a small sketch of this distinction follows. Overall, the related work section is quite abridged. Continual learning has received much attention recently, so the section should be expanded to give additional context on what the literature has done.
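To illustrate the distinction above, here is a compact sketch of which parameter groups each variant adapts in its inner loop. The module attributes (`model.adapter`, `model.classifier`) are hypothetical names for illustration, not the paper's actual interface.

```python
import torch.nn as nn

def inner_loop_params(model: nn.Module, variant: str):
    """Select the parameters updated in the meta-learning inner loop.

    MAML adapts everything; ANIL (Raghu et al., 2019) adapts only the
    classifier head; the reviewed paper's variant additionally adapts
    an adapter module on top of the frozen representation.
    """
    if variant == "maml":
        return list(model.parameters())                 # all layers
    if variant == "anil":
        return list(model.classifier.parameters())      # head only
    if variant == "eaml":
        return (list(model.adapter.parameters())        # adapter + head
                + list(model.classifier.parameters()))
    raise ValueError(f"unknown variant: {variant}")
```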
Reproducibility: Yes
Additional Feedback: The broader impact section needs much more consideration.
Summary and Contributions: This paper studies the problem of domain adaptation with an evolving target domain. Its contributions are:
- A new problem setting, evolving domain adaptation (EDA).
- A meta-adaptation framework that enables the learner to adapt to continually evolving target domains without forgetting.
Strengths:
- The problem of evolving domain adaptation (EDA) is important and practical, yet underexplored.
- The proposed framework is elegant and logical.
- The ablation study in Figure 3 provides many insights into the approach.
Weaknesses:
- Problem setting: the motivation for combining domain adaptation with online learning could be further elaborated. Online learning is mainly used when training over the entire dataset is computationally infeasible, whereas domain adaptation typically uses only a few examples for adaptation, so it is usually feasible to simply store all the data and train offline.
- Performance: it is unclear how the proposed online setting compares against the offline setting (an upper bound), but I suspect the margin is large. As shown in Table 1, the performance of the proposed framework is weak, making the online setting impractical to choose.
- Experiments: a comparison with existing incremental/online learning techniques is missing. It would be interesting to see baseline methods that combine existing domain adaptation and incremental learning techniques.
- Comparison with previous work: more comments are provided in the "Relation to Prior Work" section.
Correctness: Yes. The claims, method, and empirical methodology seem correct.
Clarity: Yes, the paper is well written overall. Suggestions are provided in the "Additional Feedback" section below.
Relation to Prior Work: The differences from previous contributions are not clearly discussed.
- Several previous papers on domain adaptation with evolving domains are neither cited nor compared against in the experiments, including "Incremental Adversarial Domain Adaptation for Continually Changing Environments" and "Incremental Evolving Domain Adaptation".
- The difference from other continuous domain adaptation methods is unclear; it seems previous continuous domain adaptation approaches could be adapted to the proposed evolving domain adaptation (EDA) setting.
- The authors did not compare numerically against previous meta-learning-based techniques such as MAML.
Reproducibility: Yes
Additional Feedback:
- In lines 116-118, the notations f_theta and h_theta are a bit confusing.
- The meanings of "rep", "ada", and "full" should be explained in the caption of Table 1; they are not explained until Section 4.4.
- Network architecture details should be provided for reproducibility.