Spectral Learning of General Weighted Automata via Constrained Matrix Completion

Balle, Borja; Mohri, Mehryar

Advances in Neural Information Processing Systems 25 (NIPS 2012)

Abstract

Many tasks in text and speech processing and computational biology require estimating functions mapping strings to real numbers. A broad class of such functions can be defined by weighted automata. Spectral methods based on the singular value decomposition of a Hankel matrix have been recently proposed for learning a probability distribution represented by a weighted automaton from a training sample drawn according to this same target distribution. In this paper, we show how spectral methods can be extended to the problem of learning a general weighted automaton from a sample generated by an arbitrary distribution. The main obstruction to this approach is that, in general, some entries of the Hankel matrix may be missing. We present a solution to this problem based on solving a constrained matrix completion problem. Combining these two ingredients, matrix completion and spectral method, a whole new family of algorithms for learning general weighted automata is obtained. We present generalization bounds for a particular algorithm in this family. The proofs rely on a joint stability analysis of matrix completion and spectral learning.
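To make the obstruction concrete, the following minimal sketch builds a Hankel matrix from a finite labeled sample and lists the entries the sample leaves unobserved. Everything in it is an illustrative assumption rather than the paper's construction: the function name `empirical_hankel`, the toy sample (strings over {a, b} labeled with their number of a's), and the choice of the same three strings as both prefixes and suffixes.

```python
from typing import Dict, List, Optional, Tuple


def empirical_hankel(
    sample: Dict[str, float],
    prefixes: List[str],
    suffixes: List[str],
) -> List[List[Optional[float]]]:
    """Entry (p, s) holds the observed value of the string p + s,
    or None when that string never appears in the sample."""
    return [[sample.get(p + s) for s in suffixes] for p in prefixes]


# Hypothetical labeled sample: each string is mapped to its number of 'a's.
# The strings "b" and "bb" are deliberately absent.
sample = {"": 0.0, "a": 1.0, "aa": 2.0, "ab": 1.0, "ba": 1.0}

# Hypothetical basis, used for both rows (prefixes) and columns (suffixes).
basis = ["", "a", "b"]

H = empirical_hankel(sample, basis, basis)

# Collect the (prefix, suffix) positions with no observed value.
missing: List[Tuple[str, str]] = [
    (p, s)
    for i, p in enumerate(basis)
    for j, s in enumerate(basis)
    if H[i][j] is None
]

for row in H:
    print(row)
print("missing entries:", missing)
```

In this toy case the positions (λ, b) and (b, λ) are both missing and both correspond to the same string b, so any completion that is to remain a Hankel matrix has to assign them the same value. A plain singular value decomposition cannot be applied until such gaps are filled, which is where a completion step would come in.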
