Sun, Dec 8 through Sat, Dec 14, 2019, at the Vancouver Convention Center
This paper proposes a new framework for bounding the generalization error of fully connected neural networks. The authors show that, for sufficiently smooth activation functions, the number of examples required to achieve good generalization error scales sublinearly with the total number of parameters in the network, a significant improvement over the previous state-of-the-art bounds. The analytical tools based on description length are very interesting and could be applicable to the analysis of other multi-layer non-convex models. All three reviewers are uniformly enthusiastic about this work, which is likely to attract considerable attention and to catalyze further research.