Steven Nowlan, Geoffrey E. Hinton
One way of simplifying neural networks so they generalize better is to add an extra t.erm 10 the error fUll ction that will penalize complexit.y. \Ve propose a new penalt.y t.erm in which the dist rihution of weight values is modelled as a mixture of multiple gaussians . C nder this model, a set of weights is simple if the weights can be clustered into subsets so that the weights in each cluster have similar values . We allow the parameters of the mixture model to adapt at t.he same time as t.he network learns. Simulations demonstrate that this complexity term is more effective than previous complexity terms.