Part of Advances in Neural Information Processing Systems 14 (NIPS 2001)
Martin Szummer, Tommi Jaakkola
To classify a large number of unlabeled examples we combine a lim- ited number of labeled examples with a Markov random walk represen- tation over the unlabeled examples. The random walk representation ex- ploits any low dimensional structure in the data in a robust, probabilistic manner. We develop and compare several estimation criteria/algorithms suited to this representation. This includes in particular multi-way clas- sification with an average margin criterion which permits a closed form solution. The time scale of the random walk regularizes the representa- tion and can be set through a margin-based criterion favoring unambigu- ous classification. We also extend this basic regularization by adapting time scales for individual examples. We demonstrate the approach on synthetic examples and on text classification problems.