Abstract
This study explores various speech data augmentation methods for the task of noise-robust fundamental frequency (F0) estimation with neural networks. The explored augmentation strategies are split into two groups: additive noise and channel-based augmentation, and vocoder-based augmentation. In vocoder-based augmentation, a glottal vocoder is used to enhance the accuracy of the ground truth F0 used for training the neural network, as well as to expand the diversity of the training data in terms of the F0 patterns and vocal tract lengths of the talkers. Evaluations on the PTDB-TUG corpus indicate that noise and channel augmentation can be used to greatly increase the noise robustness of trained models, and that vocoder-based ground truth enhancement further increases model performance. For smaller datasets, vocoder-based diversity augmentation can also be used to increase performance. The best-performing proposed method greatly outperformed the compared F0 estimation methods in terms of noise robustness.
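As a rough illustration of the additive noise and channel augmentation described in the abstract, the following minimal sketch mixes noise into a signal at a chosen signal-to-noise ratio and then passes the mixture through a simple filter standing in for a channel. Everything here is an assumption made for illustration, not the authors' implementation: the function names `mix_at_snr` and `one_pole_lowpass`, the one-pole filter as a channel model, the 5 dB SNR, the order of the two steps, and the synthetic sine and white-noise signals.

```python
import math
import random


def rms(signal):
    """Root-mean-square level of a sequence of samples."""
    return math.sqrt(sum(x * x for x in signal) / len(signal))


def mix_at_snr(speech, noise, snr_db):
    """Add noise to speech, scaled so the mixture has the requested SNR in dB."""
    if len(noise) < len(speech):
        raise ValueError("noise must be at least as long as speech")
    noise = noise[:len(speech)]
    noise_rms = rms(noise)
    if noise_rms == 0.0:
        return list(speech)
    # Gain that brings the noise to the level implied by the target SNR.
    gain = rms(speech) / (noise_rms * 10.0 ** (snr_db / 20.0))
    return [s + gain * n for s, n in zip(speech, noise)]


def one_pole_lowpass(signal, alpha):
    """Hypothetical channel stand-in: y[n] = alpha * x[n] + (1 - alpha) * y[n-1]."""
    out, prev = [], 0.0
    for x in signal:
        prev = alpha * x + (1.0 - alpha) * prev
        out.append(prev)
    return out


# Toy input: a 120 Hz sine standing in for voiced speech, plus white noise.
rng = random.Random(0)
fs = 16000
speech = [math.sin(2.0 * math.pi * 120.0 * n / fs) for n in range(fs)]
noise = [rng.gauss(0.0, 1.0) for _ in range(fs)]

noisy = mix_at_snr(speech, noise, snr_db=5.0)
augmented = one_pole_lowpass(noisy, alpha=0.5)
```

In a setup like this, the F0 labels of the clean signal would be reused unchanged for the augmented copy, since adding noise or filtering does not alter the underlying pitch.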
Original language | English |
---|---|
Title of host publication | 44th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2019; Brighton; United Kingdom; 12-17 May 2019 : Proceedings |
Publisher | IEEE |
Pages | 6485 - 6489 |
Number of pages | 5 |
ISBN (Electronic) | 978-1-4799-8131-1 |
ISBN (Print) | 978-1-4799-8132-8 |
Publication status | Published - 1 May 2019 |
MoE publication type | A4 Conference publication |
Event | IEEE International Conference on Acoustics, Speech, and Signal Processing, Brighton, United Kingdom; Duration: 12 May 2019 → 17 May 2019; Conference number: 44 |
Publication series
Name | IEEE International Conference on Acoustics Speech and Signal Processing |
---|---|
Publisher | IEEE |
ISSN (Print) | 1520-6149 |
ISSN (Electronic) | 2379-190X |
Conference
Conference | IEEE International Conference on Acoustics, Speech, and Signal Processing |
---|---|
Abbreviated title | ICASSP |
Country/Territory | United Kingdom |
City | Brighton |
Period | 12/05/2019 → 17/05/2019 |
Fingerprint
Dive into the research topics of 'Data augmentation strategies for neural network F0 estimation'. Together they form a unique fingerprint.
Projects
- 4 Finished
- Interdisciplinary research on statistical parametric speech synthesis
  Alku, P. (Principal investigator)
  01/01/2018 → 31/12/2019
  Project: Academy of Finland: Other research funding
- RIB: Rhythms in Infant Brain: Wearables for Computational Diagnostics and Mobile Monitoring of Treatment
  Räsänen, O. (Principal investigator)
  01/01/2018 → 31/12/2020
  Project: Academy of Finland: Other research funding
- Computational basis of contextually grounded language acquisition in humans and machines
  Räsänen, O. (Principal investigator)
  31/12/2017 → 31/08/2023
  Project: Academy of Finland: Other research funding