Nicolò Cesa-bianchi, Claudio Gentile, Luca Zaniboni
We provide a worst-case analysis of selective sampling algorithms for learning linear threshold functions. The algorithms considered in this paper are Perceptron-like algorithms, i.e., algorithms which can be efﬁ- ciently run in any reproducing kernel Hilbert space. Our algorithms ex- ploit a simple margin-based randomized rule to decide whether to query the current label. We obtain selective sampling algorithms achieving on average the same bounds as those proven for their deterministic coun- terparts, but using much fewer labels. We complement our theoretical ﬁndings with an empirical comparison on two text categorization tasks. The outcome of these experiments is largely predicted by our theoreti- cal results: Our selective sampling algorithms tend to perform as good as the algorithms receiving the true label after each classiﬁcation, while observing in practice substantially fewer labels.