Trading off Consistency and Dimensionality of Convex Surrogates for Multiclass Classification

Part of Advances in Neural Information Processing Systems 37 (NeurIPS 2024) Main Conference Track

Authors

Enrique Nueve, Bo Waggoner, Dhamma Kimpara, Jessie Finocchiaro

Abstract

In multiclass classification over $n$ outcomes, we typically optimize some surrogate loss $L : \mathbb{R}^d \times \mathcal{Y} \to \mathbb{R}$ assigning real-valued error to predictions in $\mathbb{R}^d$. In this paradigm, outcomes must be embedded into the reals with dimension $d \approx n$ in order to design a consistent surrogate loss. Consistent losses are well-motivated theoretically, yet for large $n$, such as in information retrieval and structured prediction tasks, their optimization may be computationally infeasible. In practice, outcomes are typically embedded into some $\mathbb{R}^d$ for $d \ll n$, with little known about their suitability for multiclass classification. We investigate two approaches for trading off consistency and dimensionality in multiclass classification while using a convex surrogate loss. We first formalize partial consistency when the optimized surrogate has dimension $d \ll n$. We then check if partial consistency holds under a given embedding and low-noise assumption, providing insight into when to use a particular embedding into $\mathbb{R}^d$. Finally, we present a new method to construct (fully) consistent losses with $d \ll n$ out of multiple problem instances. Our practical approach leverages parallelism to sidestep lower bounds on $d$.
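To make the $d \ll n$ paradigm concrete, the sketch below illustrates the pipeline the abstract describes: embed $n$ outcomes as codewords in $\mathbb{R}^d$, fit a predictor by minimizing a convex surrogate, and decode a report $u \in \mathbb{R}^d$ back to an outcome. The random-sign codebook, squared-loss surrogate, and nearest-codeword link are illustrative assumptions for this sketch, not the embeddings, losses, or construction analyzed in the paper.

```python
# Minimal sketch (not the paper's method) of surrogate optimization with a
# low-dimensional outcome embedding: n outcomes, embedding dimension d << n.
import numpy as np

rng = np.random.default_rng(0)
n, d, p, m = 32, 8, 16, 2000  # n outcomes, d << n, p features, m samples

# Assumed embedding: each of the n outcomes gets a random sign codeword in R^d.
C = rng.choice([-1.0, 1.0], size=(n, d))

# Synthetic data: features X, labels y in {0, ..., n-1}.
X = rng.normal(size=(m, p))
y = rng.integers(0, n, size=m)

# Convex surrogate L(u, y) = ||u - C[y]||^2 over predictions u = W^T x in R^d;
# for a linear model this is least squares, solvable in closed form.
W, *_ = np.linalg.lstsq(X, C[y], rcond=None)

# Link (decode): map a surrogate prediction u in R^d to the nearest codeword's
# class. Consistency asks whether this recovers the Bayes-optimal outcome; with
# d << n it can fail on some distributions, which is what motivates the paper's
# notion of *partial* consistency under low-noise assumptions.
U = X @ W                                            # predictions in R^d
dists = ((U[:, None, :] - C[None, :, :]) ** 2).sum(axis=2)
y_hat = dists.argmin(axis=1)
print("train accuracy:", (y_hat == y).mean())
```

Under this reading, the paper's final contribution can be viewed as running several such low-dimensional instances in parallel and combining them so that full consistency is recovered, sidestepping lower bounds that a single embedding into $\mathbb{R}^d$ would face.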