Comparing human and automatic speech recognition in a perceptual restoration experiment

Ulpu Remes*, Ana Ramírez López, Lauri Juvela, Kalle Palomäki, Guy J. Brown, Paavo Alku, Mikko Kurimo

*Corresponding author for this work

Research output: Journal article › Article › Scientific › Peer-reviewed

Abstract

Speech that has been distorted by introducing spectral or temporal gaps is still perceived as continuous and complete by human listeners, so long as the gaps are filled with additive noise of sufficient intensity. When such perceptual restoration occurs, the speech is also more intelligible compared to the case in which noise has not been added in the gaps. This observation has motivated so-called 'missing data' systems for automatic speech recognition (ASR), but there have been few attempts to determine whether such systems are a good model of perceptual restoration in human listeners. Accordingly, the current paper evaluates missing data ASR in a perceptual restoration task. We evaluated two systems that use a new approach to bounded marginalisation in the cepstral domain, and a bounded conditional mean imputation method. Both methods model available speech information as a clean-speech posterior distribution that is subsequently passed to an ASR system. The proposed missing data ASR systems were evaluated using distorted speech, in which spectro-temporal gaps were optionally filled with additive noise. Speech recognition performance of the proposed systems was compared against a baseline ASR system, and with human speech recognition performance on the same task. We conclude that missing data methods improve speech recognition performance in a manner that is consistent with perceptual restoration in human listeners.
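For readers unfamiliar with the two missing data techniques named in the abstract, the following is a minimal sketch of the underlying idea only. It is not the cepstral-domain approach the paper proposes. It assumes a single scalar feature modelled by one Gaussian with known mean and standard deviation, and it assumes the noisy observation in a gap acts as an upper bound on the unobserved clean-speech value. The function names are hypothetical and chosen for illustration.

```python
import math


def std_normal_pdf(z):
    """Density of the standard normal distribution at z."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def std_normal_cdf(z):
    """Cumulative distribution of the standard normal, via erfc for tail accuracy."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def bounded_marginal(upper, mean, std):
    """Bounded marginalisation (illustrative): probability mass of
    x ~ N(mean, std^2) lying at or below the observed bound `upper`."""
    return std_normal_cdf((upper - mean) / std)


def bounded_conditional_mean(upper, mean, std):
    """Bounded conditional mean imputation (illustrative): E[x | x <= upper]
    for x ~ N(mean, std^2), i.e. the mean of an upper-truncated Gaussian."""
    beta = (upper - mean) / std
    mass = std_normal_cdf(beta)
    if mass == 0.0:
        # The bound is so far below the mean that the mass underflows;
        # the truncated mean approaches the bound itself in that limit.
        return upper
    return mean - std * std_normal_pdf(beta) / mass


if __name__ == "__main__":
    # Hypothetical numbers: model mean 0.0, std 1.0, observed bound -1.0.
    print(bounded_marginal(-1.0, 0.0, 1.0))
    print(bounded_conditional_mean(-1.0, 0.0, 1.0))
```

In this simplified setting, marginalisation replaces the likelihood of an unreliable feature with the probability that the clean value lies under the bound, while imputation replaces the feature itself with its expected value given the bound. The imputed value never exceeds the bound and approaches the model mean when the bound is uninformative.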

Original language: English
Pages: 14-31
Number of pages: 18
Journal: Computer Speech and Language
Volume: 35
Status: Published - 11 Jul 2016
MoE (OKM) publication type: A1 Original article in a scientific journal
