A Stochastic Gradient Method with an Exponential Convergence Rate for Finite Training Sets

Part of Advances in Neural Information Processing Systems 25 (NIPS 2012)


Nicolas Roux, Mark Schmidt, Francis Bach


We propose a new stochastic gradient method for optimizing the sum of a finite set of smooth functions, where the sum is strongly convex. While standard stochastic gradient methods converge at sublinear rates for this problem, the proposed method incorporates a memory of previous gradient values in order to achieve a linear convergence rate. In a machine learning context, numerical experiments indicate that the new algorithm can dramatically outperform standard algorithms, both in terms of optimizing the training error and reducing the test error quickly.
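
The idea of keeping a memory of previous gradient values can be illustrated with a minimal sketch. The abstract does not state the update rule, so the code below assumes a stochastic-average-gradient style step: at each iteration one function is sampled, its stored gradient is refreshed, and the iterate moves along the average of all stored gradients. The function name `sag_least_squares`, the scalar least-squares objective, the toy data, and the step size 1/(16L) are all illustrative assumptions, not details taken from the abstract.

```python
import random


def sag_least_squares(a, b, step, iters, seed=0):
    """Sketch: minimize (1/n) * sum_i 0.5 * (a[i] * x - b[i])**2 over a scalar x.

    Assumed update (not specified in the abstract): keep the last gradient
    seen for every f_i and step along the average of the stored gradients.
    """
    rng = random.Random(seed)
    n = len(a)
    x = 0.0
    grad_memory = [0.0] * n  # most recent gradient evaluated for each f_i
    grad_sum = 0.0           # running sum of the entries of grad_memory
    for _ in range(iters):
        i = rng.randrange(n)                 # sample one function uniformly
        g = a[i] * (a[i] * x - b[i])         # gradient of f_i at the current x
        grad_sum += g - grad_memory[i]       # swap the old stored gradient for the new one
        grad_memory[i] = g
        x -= step * grad_sum / n             # step along the averaged memory
    return x


# Hypothetical toy data; the objective is strongly convex because some a[i] != 0.
a = [1.0, 2.0, 3.0, 4.0]
b = [2.0, 4.1, 5.9, 8.0]
L = max(ai * ai for ai in a)  # largest per-function smoothness constant

x_sag = sag_least_squares(a, b, step=1.0 / (16 * L), iters=20000)
x_star = sum(ai * bi for ai, bi in zip(a, b)) / sum(ai * ai for ai in a)  # closed-form minimizer
```

Each iteration evaluates only one gradient, as in a standard stochastic gradient method, but the step uses information from all n functions through the stored values. On this toy problem `x_sag` can be compared against the closed-form minimizer `x_star`.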