Search for Information Bearing Components in Speech

Part of Advances in Neural Information Processing Systems 12 (NIPS 1999)

Howard Yang, Hynek Hermansky


In this paper, we use mutual information to characterize the distributions of phonetic and speaker/channel information in a time-frequency space. The mutual information (MI) between the phonetic label and one feature, and the joint mutual information (JMI) between the phonetic label and two or three features, are estimated. Miller's bias formulas for entropy and mutual information estimates are extended to include higher-order terms. The MI and the JMI for speaker/channel recognition are also estimated. The results are complementary to those for phonetic classification. Our results show how the phonetic information is locally spread and how the speaker/channel information is globally spread in time and frequency.
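As an illustration of the kind of estimate the abstract describes, the following is a minimal sketch of a plug-in MI estimate between a discrete label and one quantized feature, with the standard first-order Miller bias correction. It is not the authors' implementation: the function names, the use of a count table as input, and the choice of nats are assumptions made here, and the higher-order correction terms developed in the paper are not included.

```python
import numpy as np


def plugin_entropy(counts):
    """Plug-in (maximum-likelihood) entropy in nats from an array of counts."""
    counts = np.asarray(counts, dtype=float).ravel()
    total = counts.sum()
    probs = counts[counts > 0] / total  # empty bins contribute nothing
    return float(-np.sum(probs * np.log(probs)))


def mutual_information(joint_counts, correct_bias=True):
    """Estimate I(label; feature) in nats from a 2-D table of co-occurrence counts.

    Rows index the label, columns index the quantized feature value.
    (Hypothetical helper; the table layout is an assumption of this sketch.)
    """
    joint = np.asarray(joint_counts, dtype=float)
    n_samples = joint.sum()
    row_counts = joint.sum(axis=1)  # marginal counts of the label
    col_counts = joint.sum(axis=0)  # marginal counts of the feature

    mi = (plugin_entropy(row_counts)
          + plugin_entropy(col_counts)
          - plugin_entropy(joint))

    if correct_bias:
        # First-order Miller correction only: each plug-in entropy is biased
        # downward by roughly (occupied_bins - 1) / (2 * n_samples), so the
        # plug-in MI is biased upward by the combination below.
        m_joint = np.count_nonzero(joint)
        m_rows = np.count_nonzero(row_counts)
        m_cols = np.count_nonzero(col_counts)
        mi -= (m_joint - m_rows - m_cols + 1) / (2.0 * n_samples)
    return float(mi)


if __name__ == "__main__":
    # Toy count tables, invented for illustration only.
    independent = [[25, 25], [25, 25]]
    dependent = [[50, 0], [0, 50]]
    print(mutual_information(independent, correct_bias=False))
    print(mutual_information(dependent, correct_bias=False))
```

For a real feature, the count table would come from quantizing the feature values and tallying them against the labels. The correction matters most when the number of occupied bins is not small relative to the number of samples, which is the regime where joint estimates over two or three features become difficult.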