Interdisciplinary research on statistical parametric speech synthesis

  • Juvela, Lauri (Project Member)
  • Airaksinen, Manu (Project Member)
  • Bäckström, Tom (Project Member)
  • Pohjalainen, Jouni (Project Member)
  • Gowda, Dhananjaya (Project Member)
  • Jokinen, Emma (Project Member)
  • Alku, Paavo (Principal investigator)
  • Bollepalli, Bajibabu (Project Member)
  • Saeidi, Rahim (Project Member)
  • Raitio, Tuomo (Project Member)
  • Kakouros, Sofoklis (Project Member)

Project Details

Description

An interdisciplinary research project is proposed to develop statistical text-to-speech synthesis (TTS) technologies. We will focus on the core of TTS, the vocoder, which is the parametric block that generates synthetic speech. We will search for completely new vocoding techniques based on a physiologically motivated modelling approach. The models studied use glottal inverse filtering (GIF), a computational method that separates speech into the glottal excitation and the vocal tract. In particular, the project aims at new automatic GIF-based vocoders that outperform current methods, especially in the parameterization of challenging data such as female or child speech. The vocoders developed will be integrated into synthesis platforms to generate speech from arbitrary texts. The project is expected to improve the naturalness of spoken interaction systems and therefore has many potential ICT-related applications (e.g., speech-to-speech translation and assistive technology).
Status: Finished
Effective start/end date: 01/01/2015 to 31/12/2017

Collaborative partners

  • Aalto University (lead)
  • SA: Research funding (other) (Project partner)
  • SA: Research funding (other) (Project partner)
  • Suomen Akatemia (Project partner)
