William Huang, Richard P. Lippmann
Two approaches were explored that integrate neural net classifiers with Hidden Markov Model (HMM) speech recognizers. Both attempt to improve speech pattern discrimination while retaining the temporal processing advantages of HMMs. One approach used neural nets to provide second-stage discrimination following an HMM recognizer. On a small vocabulary task, Radial Basis Function (RBF) and back-propagation neural nets reduced the error rate substantially (from 7.9% to 4.2% for the RBF classifier). In a larger vocabulary task, neural net classifiers did not reduce the error rate. They did, however, outperform Gaussian, Gaussian mixture, and k-nearest neighbor (KNN) classifiers. In the other approach, neural nets functioned as low-level acoustic-phonetic feature extractors. When classifying phonemes based on single 10 msec frames, discriminant RBF neural net classifiers outperformed Gaussian mixture classifiers. Performance, however, differed little when classifying phones by accumulating scores across all frames in phonetic segments using a single-node HMM recognizer.
-This work was sponsored by the Department of the Air Force and the Air Force Office of
HMM Speech Recognition with Neural Net Discrimination