Natasha 2: Faster Non-Convex Optimization Than SGD

Allen-Zhu, Zeyuan

Loading [MathJax]/jax/output/CommonHTML/jax.js

Natasha 2: Faster Non-Convex Optimization Than SGD

Part of Advances in Neural Information Processing Systems 31 (NeurIPS 2018)

Bibtex Metadata Paper Reviews

Authors

Zeyuan Allen-Zhu

Abstract

We design a stochastic algorithm to find $\varepsilon$ -approximate local minima of any smooth nonconvex function in rate $O(\varepsilon^{-3.25})$ , with only oracle access to stochastic gradients. The best result before this work was $O(\varepsilon^{-4})$ by stochastic gradient descent (SGD).

Natasha 2: Faster Non-Convex Optimization Than SGD

Authors

Abstract

Name Change Policy