Review for NeurIPS paper: Self-supervised Co-Training for Video Representation Learning

NeurIPS 2020

Self-supervised Co-Training for Video Representation Learning

Meta Review

This paper presents an approach to learning video representations via a contrastive learning framework. All the reviewers like the proposed approach, calling it intuitive and a step in the right direction. Several concerns were also raised: (a) comparison to CMC; (b) relation to prior work; (c) reproducibility; (d) UberNCE being an upper bound. The authors submitted a strong rebuttal. They provided comparisons to CMC and promised to improve the discussion of related work and to release code. The UberNCE argument remains a concern, but this is a simple change. The AC agrees with the reviewers and recommends acceptance. Please make all the changes suggested by the reviewers in the camera-ready version.