Multiclass Boosting: Theory and Algorithms

Part of Advances in Neural Information Processing Systems 24 (NIPS 2011)

Bibtex Metadata Paper Supplemental


Mohammad Saberian, Nuno Vasconcelos


The problem of multiclass boosting is considered. A new framework,based on multi-dimensional codewords and predictors is introduced. The optimal set of codewords is derived, and a margin enforcing loss proposed. The resulting risk is minimized by gradient descent on a multidimensional functional space. Two algorithms are proposed: 1) CD-MCBoost, based on coordinate descent, updates one predictor component at a time, 2) GD-MCBoost, based on gradient descent, updates all components jointly. The algorithms differ in the weak learners that they support but are both shown to be 1) Bayes consistent, 2) margin enforcing, and 3) convergent to the global minimum of the risk. They also reduce to AdaBoost when there are only two classes. Experiments show that both methods outperform previous multiclass boosting approaches on a number of datasets.