Comparing human and automatic speech recognition in a perceptual restoration experiment

Research output: Contribution to journal › Article › Scientific › peer-review

Researchers

Research units

  • University of Sheffield

Abstract

Speech that has been distorted by introducing spectral or temporal gaps is still perceived as continuous and complete by human listeners, so long as the gaps are filled with additive noise of sufficient intensity. When such perceptual restoration occurs, the speech is also more intelligible than when no noise has been added in the gaps. This observation has motivated so-called 'missing data' systems for automatic speech recognition (ASR), but there have been few attempts to determine whether such systems are a good model of perceptual restoration in human listeners. Accordingly, the current paper evaluates missing data ASR in a perceptual restoration task. We evaluated two systems that use a new approach to bounded marginalisation in the cepstral domain, and a bounded conditional mean imputation method. Both methods model available speech information as a clean-speech posterior distribution that is subsequently passed to an ASR system. The proposed missing data ASR systems were evaluated using distorted speech, in which spectro-temporal gaps were optionally filled with additive noise. Speech recognition performance of the proposed systems was compared against that of a baseline ASR system and against human speech recognition performance on the same task. We conclude that missing data methods improve speech recognition performance in a manner that is consistent with perceptual restoration in human listeners.
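The two techniques named in the abstract can be illustrated with a minimal sketch. It assumes the textbook spectral-domain formulation with a single diagonal Gaussian per state, which is not the cepstral-domain approach proposed in the paper. All function names, the `mask` convention (True means reliable) and the default lower bound of 0 are assumptions made for illustration.

```python
import math


def norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def norm_cdf(z):
    """Standard normal cumulative distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def bounded_marginal_loglik(obs, mask, mean, std, lower=0.0):
    """Log-likelihood of one frame under a diagonal Gaussian (illustrative).

    Reliable components contribute the Gaussian density at the observed value.
    Unreliable components contribute the Gaussian probability mass between
    `lower` and the observed value, treating the observation as an upper
    bound on the clean-speech energy.
    """
    total = 0.0
    for y, reliable, mu, sd in zip(obs, mask, mean, std):
        if reliable:
            total += math.log(norm_pdf((y - mu) / sd) / sd)
        else:
            mass = norm_cdf((y - mu) / sd) - norm_cdf((lower - mu) / sd)
            total += math.log(max(mass, 1e-300))  # guard against log(0)
    return total


def bounded_conditional_mean(y, mu, sd, lower=0.0):
    """Mean of a Gaussian truncated to [lower, y] (illustrative).

    Serves as the imputed clean-speech value for an unreliable component.
    """
    a = (lower - mu) / sd
    b = (y - mu) / sd
    mass = norm_cdf(b) - norm_cdf(a)
    if mass < 1e-12:
        # Almost no probability mass inside the bounds: fall back to midpoint.
        return 0.5 * (lower + y)
    return mu + sd * (norm_pdf(a) - norm_pdf(b)) / mass
```

In this sketch a noise-filled gap supplies an informative upper bound for the unreliable components, so marginalisation or imputation can draw on the speech model within that bound. How the bounds and the resulting clean-speech posterior are carried into the cepstral domain is specific to the paper and is not reproduced here.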

Details

Original language: English
Pages (from-to): 14-31
Number of pages: 18
Journal: Computer Speech and Language
Volume: 35
Publication status: Published - 11 Jul 2016
MoE publication type: A1 Journal article-refereed

Research areas

  • Automatic speech recognition, Missing data, Observation uncertainties, Perceptual restoration, Uncertainty propagation

ID: 1492969