Sun Dec 8th through Sat the 14th, 2019 at Vancouver Convention Center
The reviews show strong support for the paper. While much "folklore knowledge" exists around the implicit regularization of SGD (e.g., towards approximately linear models), the paper does a very good job of formalizing and answering relevant questions in a fruitful, yet simple, information-theoretic framework. Some of the suggestions for improvement should be taken seriously, but all in all the paper makes a valuable contribution towards understanding the interplay of optimization and representational power (types of functions).