ADMM without a Fixed Penalty Parameter: Faster Convergence with New Adaptive Penalization

Part of Advances in Neural Information Processing Systems 30 (NIPS 2017)

Bibtex Metadata Paper Reviews Supplemental

Authors

Yi Xu, Mingrui Liu, Qihang Lin, Tianbao Yang

Abstract

Alternating direction method of multipliers (ADMM) has received tremendous interest for solving numerous problems in machine learning, statistics and signal processing. However, it is known that the performance of ADMM and many of its variants is very sensitive to the penalty parameter of a quadratic penalty applied to the equality constraints. Although several approaches have been proposed for dynamically changing this parameter during the course of optimization, they do not yield theoretical improvement in the convergence rate and are not directly applicable to stochastic ADMM. In this paper, we develop a new ADMM and its linearized variant with a new adaptive scheme to update the penalty parameter. Our methods can be applied under both deterministic and stochastic optimization settings for structured non-smooth objective function. The novelty of the proposed scheme lies at that it is adaptive to a local sharpness property of the objective function, which marks the key difference from previous adaptive scheme that adjusts the penalty parameter per-iteration based on certain conditions on iterates. On theoretical side, given the local sharpness characterized by an exponent $\theta\in(0, 1]$, we show that the proposed ADMM enjoys an improved iteration complexity of $\widetilde O(1/\epsilon^{1-\theta})$\footnote{$\widetilde O()$ suppresses a logarithmic factor.} in the deterministic setting and an iteration complexity of $\widetilde O(1/\epsilon^{2(1-\theta)})$ in the stochastic setting without smoothness and strong convexity assumptions. The complexity in either setting improves that of the standard ADMM which only uses a fixed penalty parameter. On the practical side, we demonstrate that the proposed algorithms converge comparably to, if not much faster than, ADMM with a fine-tuned fixed penalty parameter.