Part of Advances in Neural Information Processing Systems 12 (NIPS 1999)
Yishay Mansour, David McAllester
It is known that decision tree learning can be viewed as a form of boosting. However, existing boosting theorems for decision tree learning allow only binary-branching trees and the generalization to multi-branching trees is not immediate. Practical decision tree al(cid:173) gorithms, such as CART and C4.5, implement a trade-off between the number of branches and the improvement in tree quality as measured by an index function. Here we give a boosting justifica(cid:173) tion for a particular quantitative trade-off curve. Our main theorem states, in essence, that if we require an improvement proportional to the log of the number of branches then top-down greedy con(cid:173) struction of decision trees remains an effective boosting algorithm.