Multiclass Learning Approaches: A Theoretical Comparison with Implications

Part of Advances in Neural Information Processing Systems 25 (NIPS 2012)

Bibtex Metadata Paper Supplemental


Amit Daniely, Sivan Sabato, Shai Shwartz


We theoretically analyze and compare the following five popular multiclass classification methods: One vs. All, All Pairs, Tree-based classifiers, Error Correcting Output Codes (ECOC) with randomly generated code matrices, and Multiclass SVM. In the first four methods, the classification is based on a reduction to binary classification. We consider the case where the binary classifier comes from a class of VC dimension $d$, and in particular from the class of halfspaces over $\reals^d$. We analyze both the estimation error and the approximation error of these methods. Our analysis reveals interesting conclusions of practical relevance, regarding the success of the different approaches under various conditions. Our proof technique employs tools from VC theory to analyze the \emph{approximation error} of hypothesis classes. This is in sharp contrast to most, if not all, previous uses of VC theory, which only deal with estimation error.