Part of Advances in Neural Information Processing Systems 11 (NIPS 1998)
Lawrence Saul, Mazin Rahim
We investigate a probabilistic framework for automatic speech recognition based on the intrinsic geometric properties of curves. In particular, we analyze the setting in which two variables, one continuous (x) and one discrete (s), evolve jointly in time. We suppose that the vector x traces out a smooth multidimensional curve and that the variable s evolves stochastically as a function of the arc length traversed along this curve. Since arc length does not depend on the rate at which a curve is traversed, this gives rise to a family of Markov processes whose predictions, Pr[s|x], are invariant to nonlinear warpings of time. We describe the use of such models, known as Markov processes on curves (MPCs), for automatic speech recognition, where x are acoustic feature trajectories and s are phonetic transcriptions. On two tasks, recognizing New Jersey town names and connected alpha-digits, we find that MPCs yield lower word error rates than comparably trained hidden Markov models.
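The invariance claim rests on a simple geometric fact: the arc length of a curve is the same no matter how quickly the curve is traversed. The following is a minimal numerical sketch of that fact only, not of the MPC model itself. It assumes a plain Euclidean metric and a hypothetical two-dimensional trajectory (a unit circle); the names `arc_length` and `curve` and the cubic time warp are illustrative choices, not taken from the paper.

```python
import numpy as np

def arc_length(points):
    """Approximate arc length as the sum of Euclidean segment lengths.

    `points` has one sample per row. The Euclidean metric is an assumption
    made for this illustration.
    """
    diffs = np.diff(points, axis=0)
    return float(np.linalg.norm(diffs, axis=1).sum())

def curve(t):
    """Hypothetical smooth trajectory: the unit circle, for t in [0, 1]."""
    angle = 2.0 * np.pi * t
    return np.stack([np.cos(angle), np.sin(angle)], axis=1)

# Uniform sampling of the time axis.
t = np.linspace(0.0, 1.0, 20001)

# A monotonic nonlinear warp of [0, 1] onto itself: the same curve is
# traversed slowly at first and quickly near the end.
warped_t = t ** 3

length_uniform = arc_length(curve(t))
length_warped = arc_length(curve(warped_t))

print(length_uniform, length_warped)
```

Both values approximate the circumference 2π and agree up to discretization error, even though the two samplings traverse the circle at very different rates. A process whose state transitions are driven by accumulated arc length therefore sees the same input under either sampling, which is the sense in which the predictions are invariant to time warping.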