NeurIPS 2020

An analytic theory of shallow networks dynamics for hinge loss classification

Meta Review

There were initially three positive reviews with high confidence and one negative review with low confidence; the latter was revised upward after the author response. On the basis of the review reports, as well as my own reading of the paper, I also evaluate this paper positively. It contains original and interesting contributions, including:

- finding a case in which the training dynamics of a single-hidden-layer neural network are analytically solvable in the mean-field limit,
- exploring the lazy and rich regimes by controlling the parameter \alpha,
- providing an O(1/M) correction to the analytical solution, leading to a spread in the dynamics,
- providing an O(1/N) correction to the analytical solution, leading to overfitting.

These items, and the insights gained from them, compensate for the main weakness: the linearly separable model with spherically distributed samples is simplistic (a choice made for the sake of analytical tractability). I thus recommend acceptance of this paper.

Minor points:

- "asses" should read "assess".
- \theta(.) appearing in equations (6) and (7) is undefined.