Part of Advances in Neural Information Processing Systems 8 (NIPS 1995)
Ansgar West, David Saad
An adaptive back-propagation algorithm is studied and compared with gradient descent (standard back-propagation) for on-line learning in two-layer neural networks with an arbitrary number of hidden units. Within a statistical mechanics framework, both numerical studies and a rigorous analysis show that the adaptive back-propagation method trains faster than gradient descent: it breaks the symmetry between hidden units more efficiently and converges more quickly to optimal generalization.
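The abstract does not spell out the update rules, but the on-line setting it describes can be illustrated with a minimal sketch under stated assumptions: a two-layer "soft committee machine" student trained on examples labelled by a fixed teacher of the same architecture, tanh hidden units, and an adaptive variant realized here by evaluating the backward-pass derivative at a rescaled pre-activation with a hypothetical gain parameter `beta`. The architecture choice, activation, and the exact form of the adaptive modification are illustrative assumptions, not the paper's precise formulation.

```python
import numpy as np

rng = np.random.default_rng(0)

N, K = 100, 3          # input dimension, number of hidden units
eta = 0.1              # learning rate
beta = 4.0             # hypothetical gain for the adaptive backward pass (assumption)

# Teacher: fixed two-layer network generating the target labels.
B = rng.standard_normal((K, N)) / np.sqrt(N)
# Student: same architecture, trained on-line (one fresh example per step).
J = rng.standard_normal((K, N)) * 1e-3

g = np.tanh
def g_prime(x):
    return 1.0 - np.tanh(x) ** 2

def output(W, xi):
    """Two-layer network with unit hidden-to-output weights."""
    return g(W @ xi).sum()

def online_step(J, xi, zeta, adaptive=False):
    """One on-line weight update on a single example (xi, zeta)."""
    x = J @ xi                      # hidden-unit pre-activations
    delta = zeta - g(x).sum()       # output error
    # Standard back-propagation uses g'(x); the adaptive variant is sketched
    # here as rescaling the argument of the backward derivative.
    slope = g_prime(beta * x) if adaptive else g_prime(x)
    return J + (eta / N) * delta * np.outer(slope, xi)

for t in range(100_000):
    xi = rng.standard_normal(N)     # fresh random input each step
    zeta = output(B, xi)            # teacher label
    J = online_step(J, xi, zeta, adaptive=True)

# Generalization error estimated on held-out examples.
test = rng.standard_normal((1000, N))
err = 0.5 * np.mean((g(test @ J.T).sum(1) - g(test @ B.T).sum(1)) ** 2)
print(f"estimated generalization error: {err:.4f}")
```

Setting `adaptive=False` gives plain on-line gradient descent for comparison; the claimed advantage of the adaptive method is a quicker escape from the symmetric phase, in which the hidden units of the student remain nearly interchangeable.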