NeurIPS 2019
Sun Dec 8th through Sat Dec 14th, 2019, Vancouver Convention Center
Paper ID 8196: Hamiltonian descent for composite objectives

### Reviewer 1

To minimize a function $f$ in a continuous-time setting, this paper provides an ODE that converges linearly. The ODE is built from the Hamiltonian flow, well known in mechanics, to which a perturbation is added. The Hamiltonian is used as a Lyapunov function, and the perturbation is necessary: without it, the Hamiltonian is conserved. However, the perturbation is often intractable. The authors therefore consider the case where $f$ is composite, i.e., $f = h \circ A + g$, and build a Hamiltonian from the duality gap of this problem. Using some tricks in the definition of the Hamiltonian, the Hamiltonian dynamics become tractable, and the proposed dynamics yield an ODE under which the duality gap converges linearly to zero. Moreover, a careful discretization of this ODE leads to the well-known ADMM (both the damped flow and a generic ADMM iteration are sketched below).

The derivations are elegant and the paper is pleasant to read. The approach of this paper is original and insightful. It provides a new way to look at primal-dual algorithms, which are usually seen as discretizations of monotone flows; see:

- Peypouquet, J. and Sorin, S. Evolution equations for maximal monotone operators: asymptotic analysis in continuous and discrete time. arXiv preprint arXiv:0905.1270, 2009.
- Bianchi, P., Hachem, W., and Salim, A. A constant step Forward-Backward algorithm involving random maximal monotone operators. arXiv preprint arXiv:1702.04144, 2017.
- Condat, L. A primal-dual splitting method for convex optimization involving Lipschitzian, proximable and linear composite terms. Journal of Optimization Theory and Applications, 158(2):460-479, 2013.

---

I appreciate that the authors will cover the PDHG algorithm in the camera-ready version. If PDHG can be covered, I expect that Vu-Condat can also be covered. Regarding the references provided in this review, please note that I am not asking the authors to add all of them to the paper (some may be relevant while others may not); I pointed them out for future work.
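For concreteness, here is a minimal sketch of the kind of perturbed (damped) Hamiltonian flow described above, assuming a separable Hamiltonian $H(x, p) = f(x) + k(p)$ with a convex kinetic energy $k$ minimized at $0$ and a damping coefficient $\gamma > 0$; these choices are illustrative, not necessarily the paper's exact construction.

```latex
% Illustrative damped (perturbed) Hamiltonian dynamics -- an assumed form
% consistent with the description above, not necessarily the paper's exact
% construction: H(x, p) = f(x) + k(p), with damping gamma > 0.
\begin{align*}
  \dot{x}_t &= \nabla_p H(x_t, p_t) = \nabla k(p_t), \\
  \dot{p}_t &= -\nabla_x H(x_t, p_t) - \gamma\, p_t = -\nabla f(x_t) - \gamma\, p_t.
\end{align*}
% Along the flow, dH/dt = -gamma * <grad k(p), p> <= 0 when k is convex and
% minimized at 0, so H is a Lyapunov function; with gamma = 0 the flow
% conserves H exactly, which is why the perturbation is necessary.
```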
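Since the review emphasizes that a careful discretization of the ODE recovers ADMM, the following minimal sketch gives the standard ADMM iteration for the composite problem $\min_x g(x) + h(Ax)$ as a reference point. The instantiation $g(x) = \frac{1}{2}\|x - b\|^2$, $h = \lambda\|\cdot\|_1$, and all parameter values are assumptions for illustration; this is not the specific discretization derived in the paper.

```python
import numpy as np

def admm_composite(A, b, lam=0.1, rho=1.0, iters=200):
    """Standard ADMM for min_x g(x) + h(Ax), instantiated with the
    illustrative choices g(x) = 0.5*||x - b||^2 and h = lam*||.||_1;
    a generic reference iteration, not the paper's derivation."""
    m, n = A.shape
    x, z, u = np.zeros(n), np.zeros(m), np.zeros(m)  # z stands in for Ax; u is the scaled dual
    M = np.eye(n) + rho * A.T @ A  # x-update solves (I + rho A^T A) x = b + rho A^T (z - u)
    for _ in range(iters):
        x = np.linalg.solve(M, b + rho * A.T @ (z - u))
        v = A @ x + u
        z = np.sign(v) * np.maximum(np.abs(v) - lam / rho, 0.0)  # prox of (lam/rho)*||.||_1
        u = u + A @ x - z  # dual update enforcing the constraint Ax = z
    return x

# Toy usage on random data
rng = np.random.default_rng(0)
A, b = rng.standard_normal((20, 10)), rng.standard_normal(10)
x_star = admm_composite(A, b)
```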

### Reviewer 2

Although the paper devotes considerable space to recalling background on Hamiltonian dynamics, it proposes a new research direction that deserves to be explored. The proposed Hamiltonian descent does indeed seem to provide better results than gradient descent, and the numerical results in two simple regression scenarios support this claim (a toy illustration of the comparison follows).
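To make the comparison concrete, here is a toy sketch contrasting the two schemes on a random ill-conditioned least-squares problem. The explicit discretization of the damped Hamiltonian flow below (with kinetic energy $k(p) = \frac{1}{2}\|p\|^2$) is essentially a momentum method, and the problem data, step size, and damping are assumed values, not the paper's experimental setup.

```python
import numpy as np

# Toy comparison on ill-conditioned least squares: plain gradient descent
# versus an explicit discretization of damped Hamiltonian dynamics with
# k(p) = 0.5*||p||^2 (essentially a momentum method). All data and
# parameters are illustrative assumptions, not the paper's experiments.
rng = np.random.default_rng(1)
A = rng.standard_normal((50, 20)) * np.logspace(0, -2, 20)  # decaying column scales
b = rng.standard_normal(50)
f = lambda x: 0.5 * np.sum((A @ x - b) ** 2)
grad = lambda x: A.T @ (A @ x - b)

s = np.linalg.svd(A, compute_uv=False)
L, mu = s[0] ** 2, s[-1] ** 2                    # smoothness / strong convexity of f
h, gamma = 1.0 / np.sqrt(L), 2.0 * np.sqrt(mu)   # time step and damping (assumed)

x_gd = np.zeros(20)
x_h, p = np.zeros(20), np.zeros(20)
for _ in range(300):
    x_gd -= grad(x_gd) / L                 # gradient descent with step 1/L
    p -= h * (grad(x_h) + gamma * p)       # damped momentum update
    x_h += h * p                           # position update
print(f"gradient descent:    f = {f(x_gd):.4e}")
print(f"Hamiltonian descent: f = {f(x_h):.4e}")
```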