Part of Advances in Neural Information Processing Systems 2 (NIPS 1989)
The learning dynamics of the back-propagation algorithm are investigated when complexity constraints are added to the standard Least Mean Square (LMS) cost function. It is shown that loss of generalization performance due to overtraining can be avoided when using such complexity constraints. Furthermore, "energy," hidden representations and weight distributions are observed and compared during learning. An attempt is made at explaining the results in terms of linear and non-linear effects in relation to the gradient descent learning algorithm.
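As a rough illustration of what adding a complexity constraint to the LMS cost function can look like, the sketch below combines a sum-of-squares error with a penalty on weight magnitudes. The abstract does not state the form of the constraints, so the quadratic (weight-decay) penalty, the function names `constrained_cost` and `constrained_grad_step`, and the parameters `lam` and `lr` are all assumptions chosen for illustration, not the paper's formulation.

```python
import numpy as np

def constrained_cost(y_pred, y_true, weights, lam=1e-3):
    """LMS error plus an assumed complexity penalty (quadratic weight decay).

    `weights` is a list of weight arrays; `lam` scales the penalty.
    The penalty form is a stand-in, not taken from the paper.
    """
    lms = 0.5 * np.sum((y_pred - y_true) ** 2)
    penalty = sum(np.sum(w ** 2) for w in weights)
    return lms + lam * penalty

def constrained_grad_step(w, grad_lms, lam=1e-3, lr=0.1):
    """One gradient descent step on the constrained cost for one weight array.

    `grad_lms` is the gradient of the LMS term with respect to `w`
    (for example, as computed by back-propagation). The gradient of
    lam * sum(w**2) with respect to w is 2 * lam * w.
    """
    return w - lr * (grad_lms + 2.0 * lam * w)
```

Under this assumed penalty, each update shrinks the weights toward zero in addition to following the error gradient, which is one common way of limiting effective network complexity during long training runs.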