Data augmentation strategies for neural network F0 estimation

Research output: Chapter in Book/Report/Conference proceedingConference contributionScientificpeer-review

Researchers

Research units

  • Tampere University

Abstract

This study explores various speech data augmentation methods for the task of noise-robust fundamental frequency (F0) estimation with neural networks. The explored augmentation strategies are split into additive noise and channel -based augmentation and into vocoder-based augmentation methods. In vocoder-based augmentation, a glottal vocoder is used to enhance the accuracy of ground truth F0 used for training of the neural network, as well as to expand the training data diversity in terms of F0 patterns and vocal tract lengths of the talkers. Evaluations on the PTDB-TUG corpus indicate that noise and channel augmentation can be used to greatly increase the noise robustness of trained models, and that vocoder-based ground truth enhancement further increases model performance. For smaller datasets, vocoder-based diversity augmentation can also be used to increase performance. The best-performing proposed method greatly outperformed the compared F0 estimation methods in terms of noise robustness.

Details

Original languageEnglish
Title of host publication2019 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2019 - Proceedings
Publication statusPublished - 1 May 2019
MoE publication typeA4 Article in a conference publication
EventIEEE International Conference on Acoustics, Speech, and Signal Processing - Brighton, United Kingdom
Duration: 12 May 201917 May 2019
Conference number: 44

Publication series

Name IEEE International Conference on Acoustics Speech and Signal Processing
PublisherIEEE
ISSN (Print)1520-6149
ISSN (Electronic)2379-190X

Conference

ConferenceIEEE International Conference on Acoustics, Speech, and Signal Processing
Abbreviated titleICASSP
CountryUnited Kingdom
CityBrighton
Period12/05/201917/05/2019

Download statistics

No data available

ID: 32097800