Sun Dec 8th through Sat the 14th, 2019 at Vancouver Convention Center
The work does not include original ideas; it is exclusively a collection of previous ideas combined in a rather classical way. Major remarks: Equation (6) makes the loss non-smooth and non-differentiable, and the authors do not discuss how they handle this. I assume they use the typical approach of selecting the active 'case' in the forward pass and then back-propagating through the resulting fixed smooth function. Lines 129-130: A temperature parameter should be tried to impose sparseness. In addition, the Gumbel trick can be used here and should be tried. Minor remarks: Lines 218-220 are unclear: the first part says that the authors pretrain the AE, but the next sentence says that they train without any pre-training. Line 120: 'that take' (typo).
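To make the temperature/Gumbel suggestion concrete, here is a minimal sketch of the Gumbel-softmax relaxation, where lowering the temperature pushes the output toward a sparse, near-one-hot vector. The function name, seed, and example logits are illustrative and not taken from the paper under review.

```python
import numpy as np

def gumbel_softmax(logits, temperature, rng):
    """Sample a relaxed one-hot vector via the Gumbel-softmax trick.

    Lower temperatures concentrate mass on a single entry (sparser);
    higher temperatures spread mass toward a uniform distribution.
    """
    # Gumbel(0, 1) noise: -log(-log(U)) for U ~ Uniform(0, 1)
    gumbel = -np.log(-np.log(rng.uniform(size=logits.shape)))
    y = (logits + gumbel) / temperature
    # Numerically stabilized softmax
    y = y - y.max()
    e = np.exp(y)
    return e / e.sum()

rng = np.random.default_rng(0)
logits = np.array([1.0, 2.0, 0.5])
sharp = gumbel_softmax(logits, temperature=0.1, rng=rng)   # near one-hot
soft = gumbel_softmax(logits, temperature=10.0, rng=rng)   # near uniform
```

In training one would typically anneal the temperature downward, so the relaxation starts smooth and gradually approaches the discrete, sparse regime.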
Self-supervised learning is an interesting topic to work on. This paper presents several constraints/losses to regularize the training of deep nets, including region-level localization, concentration regularization, and orthogonal regularization. The experimental results show that the proposed method outperforms the main competing baseline (a CVPR'19 paper).
1) The paper is very well written and easy to follow. 2) The approach being new, the intuition behind every model decision is well explained, and it was easy to understand the model architecture design, the loss functions, etc. 3) The paper thoroughly reviews previous work in tracking and fine-grained correspondence and clearly communicates the novelty of this paper. 4) The paper will be very significant, as it offers a new way to think about self-supervised learning in videos.