Review for NeurIPS paper: Counterfactual Contrastive Learning for Weakly-Supervised Vision-Language Grounding

NeurIPS 2020

Counterfactual Contrastive Learning for Weakly-Supervised Vision-Language Grounding

Meta Review

All reviewers recommend acceptance, to varying degrees, after reviewing the author response. The submission addresses weakly-supervised vision-language grounding and proposes a novel counterfactual contrastive learning objective. Some initial weaknesses concerning the comparison with hard-negative-style approaches have been addressed in the rebuttal. I encourage the authors to include these results and the other suggested revisions in future versions.