Predicting Speech Intelligibility from a Population of Neurons

Part of Advances in Neural Information Processing Systems 16 (NIPS 2003)

Bibtex Metadata Paper


Jeff Bondy, Ian Bruce, Suzanna Becker, Simon Haykin


Simon Haykin

Dept. of Electrical Engineering

McMaster University

A major issue in evaluating speech enhancement and hearing compensation algorithms is to come up with a suitable metric that predicts intelligibility as judged by a human listener. Previous methods such as the widely used Speech Transmission Index (STI) fail to account for masking effects that arise from the highly nonlinear cochlear transfer function. We therefore propose a Neural Articulation speech intelligibility from the instantaneous neural spike rate over time, produced when a signal is processed by an auditory neural model. By using a well developed model of the auditory periphery and detection theory we show that human perceptual discrimination closely matches the modeled distortion in the instantaneous spike rates of the auditory nerve. In highly rippled frequency transfer conditions the NAI’s prediction error is 8% versus the STI’s prediction error of 10.8%.