Abstract
Recent advances in speech synthesis suggest that limitations such as the lossy nature of the amplitude spectrum with minimum phase approximation and the over-smoothing effect in acoustic modeling can be overcome by using advanced machine learning approaches. In this paper, we build a framework in which we can fairly compare new vocoding and acoustic modeling techniques with conventional approaches by means of a large scale crowdsourced evaluation. Results on acoustic models showed that generative adversarial networks and an autoregressive (AR) model performed better than a normal recurrent network and the AR model performed best. Evaluation on vocoders by using the same AR acoustic model demonstrated that a Wavenet vocoder outperformed classical source-filter-based vocoders. Particularly, generated speech waveforms from the combination of AR acoustic model and Wavenet vocoder achieved a similar score of speech quality to vocoded speech.
| Original language | English |
|---|---|
| Title of host publication | 2018 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2018 - Proceedings |
| Place of publication | United States |
| Publisher | IEEE |
| Pages | 4804-4808 |
| Number of pages | 5 |
| Volume | 2018-April |
| ISBN (electronic) | 978-1-5386-4658-8 |
| ISBN (print) | 978-1-5386-4659-5 |
| DOI (permanent links) | |
| Status | Published - 10 Sep 2018 |
| OKM publication type | A4 Article in conference proceedings |
| Event | IEEE International Conference on Acoustics, Speech, and Signal Processing - Calgary, Canada. Duration: 15 Apr 2018 → 20 Apr 2018. https://2018.ieeeicassp.org/ |
Publication series
| Name | Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing |
|---|---|
| ISSN (electronic) | 2379-190X |
Conference
| Conference | IEEE International Conference on Acoustics, Speech, and Signal Processing |
|---|---|
| Abbreviated title | ICASSP |
| Country/Territory | Canada |
| City | Calgary |
| Period | 15/04/2018 → 20/04/2018 |
| Web address | |
Fingerprint
Dive into the research topics of 'A Comparison of Recent Waveform Generation and Acoustic Modeling Methods for Neural-Network-Based Speech Synthesis'. Together they form a unique fingerprint.