Necessary and Sufficient Geometries for Gradient Methods

Part of: Advances in Neural Information Processing Systems 32 (NeurIPS 2019) pre-proceedings

Authors

Conference Event Type: Oral

Abstract

We study the impact of the constraint set and gradient geometry on the convergence of online and stochastic methods for convex optimization, providing a characterization of the geometries for which stochastic gradient and adaptive gradient methods are (minimax) optimal. In particular, we show that when the constraint set is quadratically convex, diagonally preconditioned stochastic gradient methods are minimax optimal. We further provide a converse showing that when the constraints are not quadratically convex (for example, any $\ell_p$-ball for $p < 2$), these methods are far from optimal. Based on this, we provide concrete recommendations for when one should use adaptive, mirror, or stochastic gradient methods.
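
As a concrete illustration of the method families the abstract compares, the following is a minimal sketch, not the authors' code: it contrasts plain projected SGD with an AdaGrad-style diagonally preconditioned step on a toy least-squares problem over an $\ell_\infty$-ball (box) constraint, the classical setting in which such diagonal methods are known to perform well. The objective, step sizes, and function names (sgd_step, adagrad_step, project_box) are illustrative assumptions, not from the paper.

    import numpy as np

    rng = np.random.default_rng(0)
    d, radius, eta = 10, 1.0, 0.5
    A = rng.normal(size=(200, d))                  # toy least-squares data (illustrative)
    x_star = rng.uniform(-radius, radius, size=d)  # ground truth inside the box
    b = A @ x_star + 0.1 * rng.normal(size=200)

    def stochastic_grad(x):
        # Unbiased gradient estimate from a single random least-squares term.
        i = rng.integers(len(b))
        return (A[i] @ x - b[i]) * A[i]

    def project_box(x, r):
        # Projection onto the l_inf-ball {x : |x_j| <= r}; for a box this reduces to
        # coordinate-wise clipping in both the Euclidean and diagonal geometries.
        return np.clip(x, -r, r)

    def sgd_step(x, g, t):
        # Plain projected SGD with a 1/sqrt(t) step size.
        return project_box(x - eta / np.sqrt(t) * g, radius)

    def adagrad_step(x, g, s):
        # Diagonally preconditioned (AdaGrad-style) step: per-coordinate step sizes
        # proportional to the inverse root of the accumulated squared gradients.
        s = s + g ** 2
        return project_box(x - eta * g / (np.sqrt(s) + 1e-12), radius), s

    x_sgd, x_ada, s = np.zeros(d), np.zeros(d), np.zeros(d)
    for t in range(1, 2001):
        x_sgd = sgd_step(x_sgd, stochastic_grad(x_sgd), t)
        x_ada, s = adagrad_step(x_ada, stochastic_grad(x_ada), s)

    print("projected SGD        ||x - x*|| =", np.linalg.norm(x_sgd - x_star))
    print("diag. preconditioned ||x - x*|| =", np.linalg.norm(x_ada - x_star))

A box constraint is used here because both the Euclidean and the diagonally weighted projections onto a box reduce to coordinate-wise clipping, so the preconditioned update keeps a one-line projection; for constraint sets such as $\ell_p$-balls with $p < 2$, which the abstract places outside the quadratically convex regime, this sketch would not be the recommended method.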