Part of Advances in Neural Information Processing Systems 35 (NeurIPS 2022) Main Conference Track
Raman Arora, Raef Bassily, Cristóbal Guzmán, Michael Menart, Enayat Ullah
We study the problem of (ϵ,δ)-differentially private learning of linear predictors with convex losses. We provide results for two subclasses of loss functions. The first case is when the loss is smooth and non-negative but not necessarily Lipschitz (such as the squared loss). For this case, we establish an upper bound on the excess population risk of ˜O(‖, where n is the number of samples, d is the dimension of the problem, and w^* is the minimizer of the population risk. Apart from the dependence on \Vert w^\ast\Vert, our bound is essentially tight in all parameters. In particular, we show a lower bound of \tilde{\Omega}\left(\frac{1}{\sqrt{n}} + {\min\left\{\frac{\Vert w^*\Vert^{4/3}}{(n\epsilon)^{2/3}}, \frac{\sqrt{d}\Vert w^*\Vert}{n\epsilon}\right\}}\right). We also revisit the previously studied case of Lipschitz losses \cite{SSTT21}. For this case, we close the gap in the existing work and show that the optimal rate is (up to log factors) \Theta\left(\frac{\Vert w^*\Vert}{\sqrt{n}} + \min\left\{\frac{\Vert w^*\Vert}{\sqrt{n\epsilon}},\frac{\sqrt{\text{rank}}\Vert w^*\Vert}{n\epsilon}\right\}\right), where \text{rank} is the rank of the design matrix. This improves over existing work in the high privacy regime. Finally, our algorithms involve a private model selection approach that we develop to enable attaining the stated rates without a-priori knowledge of \Vert w^*\Vert.