Part of Advances in Neural Information Processing Systems 15 (NIPS 2002)
Naonori Ueda, Kazumi Saito
We propose probabilistic generative models, called parametric mixture models (PMMs), for the multiclass, multi-labeled text categorization problem. Conventionally, the binary classification approach has been employed, in which whether or not text belongs to a category is judged by the binary classifier for every category. In contrast, our approach can simultaneously detect multiple categories of text using PMMs. We derive efficient learning and prediction algorithms for PMMs. We also empirically show that our method could significantly outperform the conventional binary methods when applied to multi-labeled text categorization using real World Wide Web pages.