NeurIPS 2020

How to Characterize The Landscape of Overparameterized Convolutional Neural Networks


Meta Review

This paper studies the landscape of overparameterized convolutional networks and argues that their training dynamics can be analyzed by comparing the trajectories of feature distributions. Using "network grafting" as a metric, it shows that the feature distribution trajectories of two networks with the same architecture but different initializations remain close during training. The paper also shows that although the landscape is non-convex with respect to the trainable parameters, it can be reformulated as a convex function with respect to the features.

Reviewers rate the paper as top 50%, marginally above the acceptance threshold, and marginally above the acceptance threshold. They find that the paper is well written and proposes a novel and appealing perspective for analyzing training dynamics. However, there was a lack of clarity about the claim of convexity, which the authors clarified in the rebuttal; adding those clarifications to the paper is needed. I also note that there are earlier papers on landscape analysis showing that the non-convex objective over the parameters can be tightly connected to a convex objective over the output space (https://arxiv.org/pdf/1506.07540). Finally, it would also be good to include the discussion from the rebuttal about using other standard statistical metrics. Overall, there is agreement that this is a good paper.