A general agnostic active learning algorithm

Part of Advances in Neural Information Processing Systems 20 (NIPS 2007)

Bibtex Metadata Paper

Authors

Sanjoy Dasgupta, Daniel J. Hsu, Claire Monteleoni

Abstract

We present an agnostic active learning algorithm for any hypothesis class of bounded VC dimension under arbitrary data distributions. Most previ- ous work on active learning either makes strong distributional assumptions, or else is computationally prohibitive. Our algorithm extends the simple scheme of Cohn, Atlas, and Ladner [1] to the agnostic setting, using re- ductions to supervised learning that harness generalization bounds in a simple but subtle manner. We provide a fall-back guarantee that bounds the algorithm’s label complexity by the agnostic PAC sample complexity. Our analysis yields asymptotic label complexity improvements for certain hypothesis classes and distributions. We also demonstrate improvements experimentally.