Part of Advances in Neural Information Processing Systems 16 (NIPS 2003)
Thomas R. Strohmann, Andrei Belitski, Gregory Grudic, Dennis DeCoste
The Minimax Probability Machine Classification (MPMC) framework [Lanckriet et al., 2002] builds classifiers by minimizing the maximum probability of misclassification, and gives direct estimates of the probabilistic accuracy bound Ω. The only assumptions that MPMC makes are that good estimates of the means and covariance matrices of the classes exist. However, as with Support Vector Machines, MPMC is computationally expensive and requires extensive cross validation experiments to choose kernels and kernel parameters that give good performance. In this paper we address the computational cost of MPMC by proposing an algorithm that constructs nonlinear sparse MPMC (SMPMC) models by incrementally adding basis functions (i.e. kernels) one at a time, greedily selecting at each step the one that maximizes the accuracy bound Ω. SMPMC automatically chooses both kernel parameters and feature weights without using computationally expensive cross validation. The SMPMC algorithm therefore simultaneously addresses the problems of kernel selection and feature selection (i.e. feature weighting), based solely on maximizing the accuracy bound Ω. Experimental results indicate that we obtain reliable bounds Ω, as well as test set accuracies that are comparable to state-of-the-art classification algorithms.
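The following is a minimal sketch of the greedy construction the abstract describes, not the authors' exact algorithm. Everything beyond the greedy loop is an assumption for illustration: Gaussian kernel candidates centered at randomly chosen training points, a small grid of widths, and a Fisher-style weight vector w = (Σx + Σy + εI)⁻¹(μx − μy) used as a cheap stand-in for the exact MPM direction (which requires a second-order cone program). For any fixed w, Ω(w) = κ²/(1 + κ²) with κ = wᵀ(μx − μy) / (√(wᵀΣx w) + √(wᵀΣy w)) is still a valid worst-case accuracy bound for that hyperplane, so the greedy comparison of candidates remains meaningful.

```python
# Sketch of greedy sparse MPMC under the assumptions stated above.
import numpy as np

def omega(Fx, Fy, eps=1e-6):
    """Worst-case accuracy bound Omega for a Fisher-style direction,
    computed from class-conditional feature matrices Fx, Fy (rows = samples)."""
    mx, my = Fx.mean(axis=0), Fy.mean(axis=0)
    Sx = np.cov(Fx, rowvar=False) + eps * np.eye(Fx.shape[1])
    Sy = np.cov(Fy, rowvar=False) + eps * np.eye(Fy.shape[1])
    w = np.linalg.solve(Sx + Sy, mx - my)   # proxy for the exact MPM direction
    kappa = (w @ (mx - my)) / (np.sqrt(w @ Sx @ w) + np.sqrt(w @ Sy @ w))
    return kappa**2 / (1.0 + kappa**2)

def greedy_smpmc(X, y, n_basis=10, widths=(0.5, 1.0, 2.0), seed=0):
    """Add one Gaussian basis function at a time, keeping the candidate
    (center, width) that most increases the bound Omega."""
    rng = np.random.default_rng(seed)
    Xp, Xn = X[y == 1], X[y == -1]
    basis = []                                        # chosen (center, width) pairs
    Fp, Fn = np.empty((len(Xp), 0)), np.empty((len(Xn), 0))
    for _ in range(n_basis):
        best = None
        centers = X[rng.choice(len(X), size=min(20, len(X)), replace=False)]
        for c in centers:                             # candidate kernel centers
            for s in widths:                          # candidate kernel widths
                kp = np.exp(-np.sum((Xp - c)**2, axis=1) / (2 * s**2))
                kn = np.exp(-np.sum((Xn - c)**2, axis=1) / (2 * s**2))
                b = omega(np.column_stack([Fp, kp]), np.column_stack([Fn, kn]))
                if best is None or b > best[0]:
                    best = (b, c, s, kp, kn)
        _, c, s, kp, kn = best                        # greedily keep the best basis
        basis.append((c, s))
        Fp, Fn = np.column_stack([Fp, kp]), np.column_stack([Fn, kn])
    return basis, omega(Fp, Fn)
```

Calling greedy_smpmc(X, y) on data with labels in {−1, +1} returns the selected basis functions together with the final bound Ω; because each candidate is scored directly by Ω, kernel parameters are chosen without any cross validation, which is the point the abstract emphasizes.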