Part of Advances in Neural Information Processing Systems 9 (NIPS 1996)
David Barber, Christopher Williams
The full Bayesian method for applying neural networks to a prediction problem is to set up the prior/hyperprior structure for the net and then perform the necessary integrals. However, these integrals are not tractable analytically, and Markov Chain Monte Carlo (MCMC) methods are slow, especially if the parameter space is high-dimensional. Using Gaussian processes we can approximate the weight space integral analytically, so that only a small number of hyperparameters need be integrated over by MCMC methods. We have applied this idea to classification problems, obtaining excellent results on the real-world problems investigated so far.
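The split described above, an analytic integral for fixed hyperparameters followed by an average over a few hyperparameter samples, can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions: it uses Gaussian process regression with Gaussian noise, where the integral is exact, rather than classification, where the non-Gaussian likelihood requires a further approximation; the squared-exponential kernel is an illustrative choice; and the list `hyper_samples` is a hand-written stand-in for samples that an MCMC run over the hyperparameters would produce. All function and variable names are hypothetical.

```python
import numpy as np

def rbf_kernel(a, b, length_scale, signal_var):
    """Squared-exponential covariance between 1-D input arrays a and b."""
    sq_dist = (a[:, None] - b[None, :]) ** 2
    return signal_var * np.exp(-0.5 * sq_dist / length_scale ** 2)

def gp_predict_mean(x_train, y_train, x_test, length_scale, signal_var, noise_var):
    """Closed-form GP posterior mean for one fixed hyperparameter setting."""
    k_tt = rbf_kernel(x_train, x_train, length_scale, signal_var)
    k_st = rbf_kernel(x_test, x_train, length_scale, signal_var)
    # Solve (K + noise * I) alpha = y instead of forming an explicit inverse.
    alpha = np.linalg.solve(k_tt + noise_var * np.eye(len(x_train)), y_train)
    return k_st @ alpha

def averaged_prediction(x_train, y_train, x_test, hyper_samples):
    """Average the analytic predictions over a set of hyperparameter samples."""
    preds = [gp_predict_mean(x_train, y_train, x_test, *h) for h in hyper_samples]
    return np.mean(preds, axis=0)

# Toy data (assumption): a noiseless sine wave on [0, 1].
x_train = np.linspace(0.0, 1.0, 8)
y_train = np.sin(2.0 * np.pi * x_train)
x_test = np.array([0.25, 0.5])

# Stand-in for MCMC output: (length_scale, signal_var, noise_var) tuples.
hyper_samples = [(0.2, 1.0, 1e-4), (0.3, 1.2, 1e-4), (0.25, 0.9, 1e-4)]

prediction = averaged_prediction(x_train, y_train, x_test, hyper_samples)
```

Each call to `gp_predict_mean` costs one linear solve in the number of training points, so the sampler only has to explore the three hyperparameters rather than a high-dimensional weight space.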