Boosting Algorithms as Gradient Descent

Part of Advances in Neural Information Processing Systems 12 (NIPS 1999)



Llew Mason, Jonathan Baxter, Peter Bartlett, Marcus Frean


We provide an abstract characterization of boosting algorithms as gradient descent on cost-functionals in an inner-product function space. We prove convergence of these functional-gradient-descent algorithms under quite weak conditions. Following previous theoretical results bounding the generalization performance of convex combinations of classifiers in terms of general cost functions of the margin, we present a new algorithm (DOOM II) for performing a gradient descent optimization of such cost functions. Experiments on several data sets from the UC Irvine repository demonstrate that DOOM II generally outperforms AdaBoost, especially in high-noise situations, and that the overfitting behaviour of AdaBoost is predicted by our cost functions.
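
To make the functional-gradient view concrete, here is a minimal sketch of boosting as gradient descent on a margin cost. It is an illustration only and not the DOOM II algorithm. It assumes a logistic cost c(z) = log(1 + exp(-z)) of the margin z = y F(x), one-dimensional threshold stumps as the base class, and a fixed step size in place of a line search. All function names are hypothetical.

```python
import math


def logistic_cost_derivative(margin):
    """Derivative of the assumed cost c(z) = log(1 + exp(-z)) at margin z."""
    return -1.0 / (1.0 + math.exp(margin))


def make_stumps(xs):
    """Enumerate threshold stumps (t, s) on 1-D inputs: f(x) = s if x > t else -s."""
    points = sorted(set(xs))
    thresholds = [points[0] - 1.0]
    thresholds += [(a + b) / 2.0 for a, b in zip(points, points[1:])]
    return [(t, s) for t in thresholds for s in (1, -1)]


def stump_predict(stump, x):
    t, s = stump
    return s if x > t else -s


def functional_gradient_boost(xs, ys, rounds=50, step=0.5):
    """Greedy descent on (1/m) * sum_i c(y_i F(x_i)) over combinations of stumps."""
    m = len(xs)
    stumps = make_stumps(xs)
    ensemble = []        # list of (weight, stump) pairs defining F
    scores = [0.0] * m   # current values F(x_i) on the training set

    for _ in range(rounds):
        # Negative derivative of the cost at each example's margin.
        # For a decreasing cost these values act as non-negative example weights.
        neg_grad = [-logistic_cost_derivative(y * f) for y, f in zip(ys, scores)]

        def alignment(stump):
            # Inner product of the stump with the negative functional gradient,
            # evaluated on the training sample.
            return sum(
                g * y * stump_predict(stump, x)
                for g, y, x in zip(neg_grad, ys, xs)
            ) / m

        best = max(stumps, key=alignment)
        if alignment(best) <= 0.0:
            break  # no base function is a descent direction
        ensemble.append((step, best))
        scores = [f + step * stump_predict(best, x) for f, x in zip(scores, xs)]

    return ensemble


def predict(ensemble, x):
    total = sum(w * stump_predict(stump, x) for w, stump in ensemble)
    return 1 if total > 0 else -1
```

Calling `functional_gradient_boost(xs, ys)` with labels in {-1, +1} returns the weighted stumps, and `predict` takes the sign of their combination. Substituting a different margin cost only requires changing `logistic_cost_derivative`, which is the sense in which the choice of cost function determines the boosting algorithm.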