Geometry of Early Stopping in Linear Networks

Dodier, Robert

Geometry of Early Stopping in Linear Networks

Part of Advances in Neural Information Processing Systems 8 (NIPS 1995)

Bibtex Metadata Paper

Authors

Robert H. Dodier

Abstract

A theory of early stopping as applied to linear models is presented. The backpropagation learning algorithm is modeled as gradient descent in continuous time. Given a training set and a validation set, all weight vectors found by early stopping must lie on a cer(cid:173) tain quadric surface, usually an ellipsoid. Given a training set and a candidate early stopping weight vector, all validation sets have least-squares weights lying on a certain plane. This latter fact can be exploited to estimate the probability of stopping at any given point along the trajectory from the initial weight vector to the least(cid:173) squares weights derived from the training set, and to estimate the probability that training goes on indefinitely. The prospects for extending this theory to nonlinear models are discussed.

Geometry of Early Stopping in Linear Networks

Authors

Abstract

Name Change Policy