Distribution-Calibrated Hierarchical Classification

Part of Advances in Neural Information Processing Systems 22 (NIPS 2009)


Ofer Dekel


While many advances have already been made on the topic of hierarchical classification learning, we take a step back and examine how a hierarchical classification problem should be formally defined. We pay particular attention to the fact that many arbitrary decisions go into the design of the label taxonomy that is provided with the training data, and that this taxonomy is often unbalanced. We correct this problem by using the data distribution to calibrate the hierarchical classification loss function. This distribution-based correction must be done with care, to avoid introducing unmanageable statistical dependencies into the learning problem. This leads us off the beaten path of binomial-type estimation and into the uncharted waters of geometric-type estimation. We present a new calibrated definition of statistical risk for hierarchical classification, an unbiased geometric estimator for this risk, and a new algorithmic reduction from hierarchical classification to cost-sensitive classification.