Part of Advances in Neural Information Processing Systems 17 (NIPS 2004)
Adrian Corduneanu, Tommi Jaakkola
We provide a principle for semi-supervised learning based on optimizing the rate of communicating labels for unlabeled points with side information. The side information is expressed in terms of identities of sets of points or regions with the purpose of biasing the labels in each region to be the same. The resulting regularization objective is convex, has a unique solution, and the solution can be found with a pair of local propagation operations on graphs induced by the regions. We analyze the properties of the algorithm and demonstrate its performance on document classification tasks.