Part of Advances in Neural Information Processing Systems 22 (NIPS 2009)
Yang Wang, Gholamreza Haffari, Shaojun Wang, Greg Mori
We propose a novel information theoretic approach for semi-supervised learning of conditional random fields. Our approach defines a training objective that combines the conditional likelihood on labeled data and the mutual information on unlabeled data. Different from previous minimum conditional entropy semi-supervised discriminative learning methods, our approach can be naturally cast into the rate distortion theory framework in information theory. We analyze the tractability of the framework for structured prediction and present a convergent variational training algorithm to defy the combinatorial explosion of terms in the sum over label configurations. Our experimental results show that the rate distortion approach outperforms standard $l_2$ regularization and minimum conditional entropy regularization on both multi-class classification and sequence labeling problems.