Augmented CycleGANs for continuous scale normal-to-Lombard speaking style conversion

Research output: Article in book/conference proceedings, conference contribution, scientific, peer-reviewed


Abstract

Lombard speech is a speaking style associated with increased vocal effort that humans naturally use to improve intelligibility in the presence of noise. It is therefore desirable to have a system capable of converting speech from normal to Lombard style. Moreover, it would be useful if one could adjust the degree of Lombardness in the converted speech so that the system is more adaptable to different noise environments. In this study, we propose the use of recently developed Augmented cycle-consistent adversarial networks (Augmented CycleGANs) for conversion between normal and Lombard speaking styles. The proposed system provides smooth control over the degree of Lombardness of the mapped utterances by traversing different points in the latent space of the trained model. We utilize a parametric approach that uses the Pulse Model in Log domain (PML) vocoder to extract features from normal speech, which are then mapped to Lombard-style features using the Augmented CycleGAN. Finally, the mapped features are converted to Lombard speech with PML. The model is trained on multi-language data recorded in different noise conditions, and we compare its effectiveness to a previously proposed CycleGAN system in experiments on the intelligibility and quality of the mapped speech.
Original language: English
Title of host publication: Proceedings of Interspeech
Publisher: International Speech Communication Association
Pages: 2838-2842
DOI (permanent link): https://doi.org/10.21437/Interspeech.2019-1681
Status: Published - 2019
Ministry of Education and Culture (OKM) publication type: A4 Article in conference proceedings
Event: Interspeech - Graz, Austria
Duration: 15 September 2019 to 19 September 2019
https://www.interspeech2019.org/

Publication series

Name: Interspeech - Annual Conference of the International Speech Communication Association
ISSN (electronic): 2308-457X

Conference

Conference: Interspeech
Country: Austria
City: Graz
Period: 15/09/2019 to 19/09/2019


  • Projects

    • 3 Finished

    Interdisciplinary research project on parametric speech synthesis

    Murtola, T., Bollepalli, B., Nonavinakere Prabhakera, N., Juvela, L., Airaksinen, M., Bäckström, T. & Alku, P.

    01/01/2018 to 24/01/2020

    Project: Academy of Finland: Other research funding

    Context-dependent computational foundations of human and machine language learning

    Räsänen, O.

    31/12/2017 to 31/12/2017

    Project: Academy of Finland: Other research funding

    ACLEW: Mapping and analysis of children's language experiences on a global scale

    Räsänen, O. & Seshadri, S.

    01/06/2017 to 19/05/2020

    Project: Academy of Finland: Other research funding

    Cite this

    Seshadri, S., Juvela, L., Alku, P., & Räsänen, O. (2019). Augmented CycleGANs for continuous scale normal-to-Lombard speaking style conversion. In Proceedings of Interspeech (pp. 2838-2842). (Interspeech - Annual Conference of the International Speech Communication Association). International Speech Communication Association. https://doi.org/10.21437/Interspeech.2019-1681