Transfer learning for text classification

Part of Advances in Neural Information Processing Systems 18 (NIPS 2005)


Authors

Chuong B. Do, Andrew Y. Ng

Abstract

Linear text classification algorithms work by computing an inner product between a test document vector and a parameter vector. In many such algorithms, including naive Bayes and most TFIDF variants, the parameters are determined by some simple, closed-form function of training set statistics; we call this mapping from statistics to parameters the parameter function. Much research in text classification over the last few decades has consisted of manual efforts to identify better parameter functions. In this paper, we propose an algorithm for automatically learning this function from related classification problems. The parameter function found by our algorithm then defines a new learning algorithm for text classification, which we can apply to novel classification tasks. We find that our learned classifier outperforms existing methods on a variety of multiclass text classification tasks.
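To make the notion of a parameter function concrete, here is a minimal sketch of a hand-designed one of the kind the abstract mentions: multinomial naive Bayes, where the parameter for each word is a smoothed log-probability computed in closed form from per-class word counts. This is not the learned parameter function proposed in the paper. The function names, the toy vocabulary and counts, the smoothing constant `alpha`, and the omission of class priors (equivalent to assuming uniform priors) are all illustrative assumptions.

```python
import math


def naive_bayes_parameter_function(class_word_counts, alpha=1.0):
    """Map training-set statistics (per-class word counts) to one
    parameter vector per class: smoothed log word probabilities.

    alpha is an assumed Laplace smoothing constant.
    """
    params = []
    for counts in class_word_counts:
        total = sum(counts) + alpha * len(counts)
        params.append([math.log((c + alpha) / total) for c in counts])
    return params


def classify(params, doc_vector):
    """Linear classification: return the class whose parameter vector
    has the largest inner product with the document vector."""
    scores = [
        sum(theta_i * x_i for theta_i, x_i in zip(theta, doc_vector))
        for theta in params
    ]
    return max(range(len(scores)), key=scores.__getitem__)


# Hypothetical toy data. Vocabulary: ["goal", "match", "vote", "election"].
train_counts = [
    [8, 6, 1, 0],  # class 0 (e.g. sports): word counts over its training documents
    [0, 1, 7, 9],  # class 1 (e.g. politics)
]
theta = naive_bayes_parameter_function(train_counts)

test_doc = [2, 1, 0, 0]  # word counts of an unseen document
predicted = classify(theta, test_doc)
```

Swapping `naive_bayes_parameter_function` for a different mapping from statistics to parameters changes the classifier while leaving `classify` untouched, which is the separation the abstract relies on.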

$\arg\max_{k \in \{1,\ldots,K\}} \sum^{n}$