I. Guyon, B. Boser, V. Vapnik
Large VC-dimension classifiers can learn difficult tasks, but are usually impractical because they generalize well only if they are trained with huge quantities of data. In this paper we show that even high-order polynomial classifiers in high dimensional spaces can be trained with a small amount of training data and yet generalize better than classifiers with a smaller VC-dimension. This is achieved with a maximum margin algorithm (the Generalized Portrait). The technique is applicable to a wide variety of classifiers, including Perceptrons, polynomial classifiers (sigma-pi unit net(cid:173) works) and Radial Basis Functions. The effective number of parameters is adjusted automatically by the training algorithm to match the complexity of the problem. It is shown to equal the number of those training patterns which are closest patterns to the decision boundary (supporting patterns). Bounds on the generalization error and the speed of convergence of the al(cid:173) gorithm are given. Experimental results on handwritten digit recognition demonstrate good generalization compared to other algorithms.