NIPS 2018
Sun Dec 2nd through Sat Dec 8th, 2018, at Palais des Congrès de Montréal

### Reviewer 1

This paper studies the problem of minimizing a convex function of a measure. The authors focus on the particle gradient descent algorithm and show that, in the limit of infinitely many particles and an infinitesimally small learning rate (i.e., following the gradient flow), the algorithm converges to a measure that is the global minimizer of the problem. Training a one-hidden-layer neural network and sparse spikes deconvolution can both be cast into the framework of this paper.

I feel the title is slightly misleading, since the crucial property of global convergence relies on convexity rather than over-parameterization. The theoretical results are also in some sense weak: 1) they require infinitely many particles, which could be exponentially many; 2) they do not provide a convergence rate, so the algorithm might require exponentially many steps (and thus be intractable).

Nevertheless, the idea and introduction of particle gradient descent and optimization over measures are quite interesting and innovative. As one of the earliest works in this vein, I believe the results already merit acceptance to NIPS.

Minor suggestion: some effort could be spent on making the details more transparent and easier to follow. In the part explaining why training neural networks falls within the formulation of this paper, it might be easier to understand if the authors directly stated what the functions R and G are in that setting. Also, the main paper never seems to officially define what 1-homogeneous or 2-homogeneous means, even though these are very important assumptions for the theorems.
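To make the setup concrete: the objective is roughly of the form "minimize F(μ) over measures μ", and particle gradient descent approximates μ by an average of m particles and runs plain gradient descent on the particle positions. Below is a minimal sketch of this idea in the one-hidden-layer network instantiation mentioned above, assuming a squared loss and ReLU units; the function names, the loss, and the step-size scaling are illustrative assumptions, not the paper's actual formulation or code.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(z):
    return np.maximum(z, 0.0)

def forward(W, c, X):
    # The mean over particles approximates the integral against mu:
    # f(x) = (1/m) * sum_j c_j * relu(w_j . x).
    return relu(X @ W.T) @ c / len(c)

def particle_gradient_descent(X, y, m=200, lr=0.1, steps=2000):
    """Gradient descent on m particles (w_j, c_j); illustrative sketch."""
    d = X.shape[1]
    W = rng.normal(size=(m, d))  # particle positions (hidden weights)
    c = rng.normal(size=m)       # particle weights (output weights)
    n = len(y)
    for _ in range(steps):
        resid = forward(W, c, X) - y        # gradient of the squared loss
        act = relu(X @ W.T)                 # (n, m) hidden activations
        mask = (X @ W.T) > 0                # ReLU subgradient
        grad_c = act.T @ resid / (n * m)
        grad_W = ((resid[:, None] * mask) * c).T @ X / (n * m)
        # Scaling the step by m (one assumption of this sketch) makes each
        # particle move at a rate independent of m, mimicking the gradient
        # flow on measures that the paper analyzes in the m -> infinity limit.
        c -= lr * m * grad_c
        W -= lr * m * grad_W
    return W, c

# Toy usage: fit an arbitrary target from random inputs.
n, d = 100, 5
X = rng.normal(size=(n, d))
y = np.sin(X[:, 0])
W, c = particle_gradient_descent(X, y)
```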