Speaking style conversion from normal to Lombard speech using a glottal vocoder and Bayesian GMMs

Ana Ramirez Lopez, Shreyas Seshadri, Lauri Juvela, Okko Räsänen, Paavo Alku

Research output: Article in book/conference proceedings, Conference article in proceedings, Scientific, peer-reviewed

16 Citations (Scopus)
372 Downloads (Pure)

Abstract

Speaking style conversion is the technology of converting natural speech signals from one style to another. In this study, we focus on normal-to-Lombard conversion. This can be used, for example, to enhance the intelligibility of speech in noisy environments. We propose a parametric approach that uses a vocoder to extract speech features. These features are mapped using Bayesian GMMs from utterances spoken in normal style to the corresponding features of Lombard speech. Finally, the mapped features are converted to a Lombard speech waveform with the vocoder. Two vocoders were compared in the proposed normal-to-Lombard conversion: a recently developed glottal vocoder that decomposes speech into glottal flow excitation and vocal tract, and the widely used STRAIGHT vocoder. The conversion quality was evaluated in two subjective listening tests measuring subjective similarity and naturalness. The similarity test results show that the system is able to convert normal speech into Lombard speech for the two vocoders. However, the subjective naturalness of the converted Lombard speech was clearly better using the glottal vocoder in comparison to STRAIGHT.
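
The mapping step described in the abstract can be illustrated with a minimal sketch. The abstract does not spell out the exact mapping rule, so the code below assumes one common choice: minimum mean-square-error regression under a joint-density GMM over paired (normal, Lombard) features, reduced here to a single scalar feature. The function names, the dictionary keys, and all parameter values are hypothetical and chosen only for illustration; in practice the mixture parameters would be estimated from parallel training data, for example by variational Bayesian inference.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a univariate normal distribution."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def convert_feature(x, components):
    """Map a normal-style feature x to its expected Lombard-style value.

    Each component is a dict with the hypothetical keys
    weight, mean_x, mean_y, var_x and cov_xy, describing one
    Gaussian of a joint GMM over (normal, Lombard) feature pairs.
    """
    # Posterior probability of each component given the source feature.
    likelihoods = [
        c["weight"] * gaussian_pdf(x, c["mean_x"], c["var_x"]) for c in components
    ]
    total = sum(likelihoods)
    posteriors = [lik / total for lik in likelihoods]

    # Per-component linear regression, blended by the posteriors.
    return sum(
        p * (c["mean_y"] + c["cov_xy"] / c["var_x"] * (x - c["mean_x"]))
        for p, c in zip(posteriors, components)
    )


# Made-up parameters for a log-F0-like feature (not taken from the paper).
example_gmm = [
    {"weight": 0.5, "mean_x": 4.7, "mean_y": 4.9, "var_x": 0.01, "cov_xy": 0.008},
    {"weight": 0.5, "mean_x": 5.2, "mean_y": 5.35, "var_x": 0.01, "cov_xy": 0.008},
]

converted = convert_feature(4.7, example_gmm)
```

In a full system this mapping would be applied frame by frame to each vocoder feature stream before resynthesis; the scalar case is shown only to keep the arithmetic visible.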
Original language: English
Title of host publication: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Publisher: International Speech Communication Association (ISCA)
Pages: 1363-1367
Number of pages: 5
Volume: 2017-August
ISBN (print): 978-1-5108-4876-4
Publication status: Published - Aug 2017
MoE publication type: A4 Article in conference proceedings
Event: Interspeech - Stockholm, Sweden
Duration: 20 Aug 2017 to 24 Aug 2017
Conference number: 18
http://www.interspeech2017.org/

Publication series

Name: Interspeech: Annual Conference of the International Speech Communication Association
ISSN (electronic): 1990-9772

Conference

Conference: Interspeech
Country/Territory: Sweden
City: Stockholm
Period: 20/08/2017 to 24/08/2017