Statistical methods for incomplete speech data

Research output: Thesis › Doctoral Thesis › Collection of Articles

Abstract

Speech can be represented as an observation matrix where each element corresponds to a certain speech feature. However, when speech is mixed with environmental sounds, some features cannot be observed and the observation matrix remains incomplete. The missing values are a problem because incomplete observations can support incorrect conclusions and because most applications cannot process incomplete data. Methods that are used to handle incomplete observations are called missing-data methods. This thesis presents an overview of missing-data methods and discusses their application in noise-robust automatic speech recognition. We therefore assume that the speech observations are incomplete due to environmental sounds. The methods studied in this work substitute unobserved feature values with estimates calculated from the incomplete observations and the statistical dependencies between the observed and unobserved features. This is called missing-data imputation. The main research directions include imputation methods that utilise temporal dependencies between observations and imputation methods that associate feature estimates with uncertainties. The experiments conducted in this work indicate that temporal dependencies and imputation uncertainties improve automatic speech recognition performance when speech is corrupted with environmental noise.

The thesis also discusses narrowband telephone speech and bandwidth extension. Narrowband speech can be considered incomplete since observations associated with certain features are not included in the narrowband transmission. Bandwidth extension means that the narrowband observations are converted into wideband observations, which include more features. The bandwidth extension methods evaluated in this work estimate wideband observations from narrowband observations and the statistical dependencies between narrowband and wideband features.
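The idea of estimating an unobserved feature from an observed one, together with an uncertainty for the estimate, can be illustrated with a minimal sketch. It assumes a joint Gaussian over exactly one observed and one missing feature, which is a textbook simplification and not necessarily the model used in the thesis. The class `BivariateGaussian`, the function `impute`, and all parameter values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class BivariateGaussian:
    """Joint Gaussian over one observed and one missing feature (assumed model)."""
    mean_obs: float
    mean_mis: float
    var_obs: float
    var_mis: float
    cov: float  # covariance between the observed and the missing feature


def impute(model: BivariateGaussian, x_obs: float) -> Tuple[float, float]:
    """Return the conditional mean and variance of the missing feature.

    The conditional mean serves as the imputed value and the conditional
    variance as the uncertainty associated with that estimate.
    """
    gain = model.cov / model.var_obs
    estimate = model.mean_mis + gain * (x_obs - model.mean_obs)
    uncertainty = model.var_mis - gain * model.cov
    return estimate, uncertainty


# Illustrative parameters, not taken from the thesis.
model = BivariateGaussian(mean_obs=0.0, mean_mis=1.0,
                          var_obs=4.0, var_mis=9.0, cov=3.0)
estimate, uncertainty = impute(model, x_obs=2.0)
print(estimate, uncertainty)
```

The stronger the dependency between the two features (larger `cov`), the further the estimate moves from the prior mean and the smaller the remaining uncertainty becomes. With `cov=0.0` the estimate falls back to `mean_mis` and the uncertainty stays at `var_mis`.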
Original language: English
Qualification: Doctor's degree
Awarding Institution
  • Aalto University
Supervisors/Advisors
  • Kurimo, Mikko, Supervisor
  • Palomäki, Kalle, Advisor
Publisher
Print ISBNs: 978-952-60-6936-4
Electronic ISBNs: 978-952-60-6937-1
Publication status: Published - 2016
MoE publication type: G5 Doctoral dissertation (article)

Keywords

  • automatic speech recognition
  • missing-data methods
  • noise robustness
  • observation uncertainties
