Part of Advances in Neural Information Processing Systems 32 (NeurIPS 2019)
Yair Carmon, Yujia Jin, Aaron Sidford, Kevin Tian
We present a randomized primal-dual algorithm that solves the problem minx maxy y^T A x to additive error epsilon in time nnz(A) + sqrt{nnz(A) n} / epsilon, for matrix A with larger dimension n and nnz(A) nonzero entries. This improves the best known exact gradient methods by a factor of sqrt{nnz(A) / n} and is faster than fully stochastic gradient methods in the accurate and/or sparse regime epsilon < sqrt{n / nnz(A)$. Our results hold for x,y in the simplex (matrix games, linear programming) and for x in an \ell_2 ball and y in the simplex (perceptron / SVM, minimum enclosing ball). Our algorithm combines the Nemirovski's "conceptual prox-method" and a novel reduced-variance gradient estimator based on "sampling from the difference" between the current iterate and a reference point.