Gelp: GAN-excited linear prediction for speech synthesis from mel-spectrogram

Lauri Juvela, Bajibabu Bollepalli, Junichi Yamagishi, Paavo Alku

Research output: Chapter in book/conference proceedings, Conference article in proceedings, Scientific, peer-reviewed

33 Citations (Scopus)
225 Downloads (Pure)

Abstract

Recent advances in neural network-based text-to-speech have reached human-level naturalness in synthetic speech. The present sequence-to-sequence models can directly map text to mel-spectrogram acoustic features, which are convenient for modeling, but present additional challenges for vocoding (i.e., waveform generation from the acoustic features). High-quality synthesis can be achieved with neural vocoders, such as WaveNet, but such autoregressive models suffer from slow sequential inference. Meanwhile, their existing parallel inference counterparts are difficult to train and require increasingly large model sizes. In this paper, we propose an alternative training strategy for a parallel neural vocoder utilizing generative adversarial networks, and integrate a linear predictive synthesis filter into the model. Results show that the proposed model achieves significant improvement in inference speed, while outperforming a WaveNet in copy-synthesis quality.
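The linear predictive synthesis filter mentioned in the abstract can be illustrated with a minimal sketch. It shows only the generic all-pole recursion s[n] = e[n] - sum_k a_k s[n-k] and its inverse, not the model from the paper: the function names, the single coefficient, and the impulse excitation are assumptions chosen for illustration. In the proposed system the excitation would come from the GAN-trained generator and the filter would be derived from the acoustic features.

```python
def lp_synthesis(excitation, lpc):
    """All-pole synthesis filter 1/A(z), with A(z) = 1 + sum_k a_k z^-k.

    excitation: list of floats e[n] (a placeholder here; hypothetical input)
    lpc:        list [a_1, ..., a_p] of predictor coefficients (hypothetical values)
    Returns s[n] = e[n] - sum_k a_k * s[n-k].
    """
    out = []
    for n, e in enumerate(excitation):
        acc = e
        for k, a in enumerate(lpc, start=1):
            if n - k >= 0:
                acc -= a * out[n - k]
        out.append(acc)
    return out


def lp_analysis(signal, lpc):
    """Inverse (analysis) filter A(z): recovers e[n] = s[n] + sum_k a_k * s[n-k]."""
    residual = []
    for n, s in enumerate(signal):
        acc = s
        for k, a in enumerate(lpc, start=1):
            if n - k >= 0:
                acc += a * signal[n - k]
        residual.append(acc)
    return residual


# Illustrative use: a unit impulse through a one-pole filter (a_1 = -0.5)
# yields a decaying envelope; the analysis filter undoes it.
impulse = [1.0, 0.0, 0.0, 0.0]
speech_like = lp_synthesis(impulse, [-0.5])
recovered = lp_analysis(speech_like, [-0.5])
```

Because the synthesis filter carries the spectral envelope, a generator in such a setup only has to produce the spectrally flatter excitation signal, which is the usual motivation for source-filter vocoding.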

Original language: English
Title: Proceedings of Interspeech
Publisher: International Speech Communication Association (ISCA)
Pages: 694-698
Number of pages: 5
Volume: 2019-September
DOI - permanent links
Status: Published - 1 Jan 2019
MoE publication type: A4 Article in conference proceedings
Event: Interspeech - Graz, Austria
Duration: 15 Sep 2019 to 19 Sep 2019
https://www.interspeech2019.org/

Publication series

Name: Interspeech - Annual Conference of the International Speech Communication Association
ISSN (electronic): 2308-457X

Conference

Conference: Interspeech
Country/Territory: Austria
City: Graz
Period: 15/09/2019 to 19/09/2019
Internet address
