Jiong Zhang, Hsiang-Fu Yu, Inderjit S. Dhillon
Deep neural networks have yielded superior performance in many contemporary applications. However, the gradient computation in a deep model with millions of instances leads to a lengthy training process even with modern GPU/TPU hardware acceleration. In this paper, we propose AutoAssist, a simple framework to accelerate training of a deep neural network. Typically, as the training procedure evolves, the amount of improvement by a stochastic gradient update varies dynamically with the choice of instances in the mini-batch. In AutoAssist, we utilize this fact and design an instance shrinking operation that is used to filter out instances with relatively low marginal improvement to the current model; thus the computationally intensive gradient computations are performed on informative instances as much as possible. Specifically, we train a very lightweight Assistant model jointly with the original deep network, which we refer to as Boss. The Assistant model is designed to gauge the importance of a given instance with respect to the current Boss such that the shrinking operation can be applied in the batch generator. With careful design, we train the Boss and Assistant in a nonblocking and asynchronous fashion such that overhead is minimal. To demonstrate the effectiveness of AutoAssist, we conduct experiments on two contemporary applications: image classification using ResNets with varied number of layers, and neural machine translation using LSTMs, ConvS2S and Transformer models. For each application, we verify that AutoAssist leads to significant reduction in training time; in particular, 30% to 40% of the total operation count can be reduced which leads to faster convergence and a corresponding decrease in training time.