Efficient Learning using Forward-Backward Splitting

Part of Advances in Neural Information Processing Systems 22 (NIPS 2009)

Bibtex Metadata Paper Supplemental


Yoram Singer, John C. Duchi


We describe, analyze, and experiment with a new framework for empirical loss minimization with regularization. Our algorithmic framework alternates between two phases. On each iteration we first perform an {\em unconstrained} gradient descent step. We then cast and solve an instantaneous optimization problem that trades off minimization of a regularization term while keeping close proximity to the result of the first phase. This yields a simple yet effective algorithm for both batch penalized risk minimization and online learning. Furthermore, the two phase approach enables sparse solutions when used in conjunction with regularization functions that promote sparsity, such as $\ell_1$. We derive concrete and very simple algorithms for minimization of loss functions with $\ell_1$, $\ell_2$, $\ell_2^2$, and $\ell_\infty$ regularization. We also show how to construct efficient algorithms for mixed-norm $\ell_1/\ell_q$ regularization. We further extend the algorithms and give efficient implementations for very high-dimensional data with sparsity. We demonstrate the potential of the proposed framework in experiments with synthetic and natural datasets.