(Nearly) Optimal Algorithms for Private Online Learning in Full-information and Bandit Settings

Part of Advances in Neural Information Processing Systems 26 (NIPS 2013)

Bibtex Metadata Paper Reviews Supplemental


Abhradeep Guha Thakurta, Adam Smith


We provide a general technique for making online learning algorithms differentially private, in both the full information and bandit settings. Our technique applies to algorithms that aim to minimize a \emph{convex} loss function which is a sum of smaller convex loss terms, one for each data point. We modify the popular \emph{mirror descent} approach, or rather a variant called \emph{follow the approximate leader}. The technique leads to the first nonprivate algorithms for private online learning in the bandit setting. In the full information setting, our algorithms improve over the regret bounds of previous work. In many cases, our algorithms (in both settings) matching the dependence on the input length, $T$, of the \emph{optimal nonprivate} regret bounds up to logarithmic factors in $T$. Our algorithms require logarithmic space and update time.