A Unified Fast Gradient Clipping Framework for DP-SGD

Kong, Weiwei; Munoz Medina, Andres

Advances in Neural Information Processing Systems 36 (NeurIPS 2023) Main Conference Track

Abstract

A well-known numerical bottleneck in the differentially-private stochastic gradient descent (DP-SGD) algorithm is the computation of the gradient norm for each example in a large input batch. When the loss function in DP-SGD involves an intermediate linear operation, existing methods in the literature have proposed decompositions of gradients that are amenable to fast norm computations. In this paper, we present a framework that generalizes this approach to arbitrary (possibly nonlinear) intermediate operations. Moreover, we show that for certain operations, such as fully-connected and embedding layer computations, the runtime and storage costs of existing decompositions can be further improved using certain components of our framework. Finally, preliminary numerical experiments are given to demonstrate the substantial effects of these improvements.
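
To illustrate the kind of decomposition the abstract refers to for a linear intermediate operation, the following is a minimal sketch, not the framework of this paper. It assumes the simplest setting: each example contributes a single input vector x to a layer z = W x, so the per-example gradient with respect to W is the outer product of g (the gradient of the loss with respect to z) and x, whose Frobenius norm equals the product of the two vector norms. The function names and the toy batch are hypothetical.

```python
import math


def naive_grad_norm(g, x):
    """Materialize the per-example gradient outer(g, x), then take its Frobenius norm."""
    grad = [[gi * xj for xj in x] for gi in g]  # costs len(g) * len(x) storage
    return math.sqrt(sum(v * v for row in grad for v in row))


def fast_grad_norm(g, x):
    """Use ||outer(g, x)||_F = ||g||_2 * ||x||_2 without forming the matrix."""
    g_norm = math.sqrt(sum(gi * gi for gi in g))
    x_norm = math.sqrt(sum(xj * xj for xj in x))
    return g_norm * x_norm  # costs len(g) + len(x) work, no extra storage


# Hypothetical batch of (upstream gradient g, layer input x) pairs, one per example.
batch = [
    ([1.0, -2.0], [3.0, 0.0, 4.0]),
    ([0.5, 0.5], [1.0, 2.0, 2.0]),
]

naive = [naive_grad_norm(g, x) for g, x in batch]
fast = [fast_grad_norm(g, x) for g, x in batch]
```

Both lists hold the same per-example norms, but the second avoids storing a full gradient matrix for every example, which is the source of the bottleneck in large batches.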
