Abstract
Automatic syllable count estimation (SCE) is used in a variety of applications, ranging from speaking rate estimation to detecting social activity from wearable microphones and developmental research that quantifies the speech heard by language-learning children in different environments. Most previously used SCE methods have relied on heuristic digital signal processing (DSP), and only a small number of bi-directional long short-term memory (BLSTM) approaches have applied modern machine learning to the SCE task. This letter presents a novel end-to-end method called SylNet for automatic syllable counting from speech, built on recent developments in neural network architectures. We describe how the entire model can be optimized directly to minimize SCE error on the training data without annotations aligned at the syllable level, and how it can be adapted to new languages using limited speech data with known syllable counts. Experiments on several different languages reveal that SylNet generalizes to languages beyond its training data and further improves with adaptation. It also outperforms several previously proposed methods for syllabification, including end-to-end BLSTMs.
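As a rough illustration of what optimizing on counts alone can look like, the sketch below sums hypothetical per-frame outputs into an utterance-level count and compares it with the annotated total, so no syllable boundaries are needed. This is an assumption-laden toy, not the SylNet architecture or its actual loss, neither of which is specified on this page; the function names `predicted_count`, `count_loss`, and `mean_relative_error` are invented for the example.

```python
from typing import Sequence


def predicted_count(frame_outputs: Sequence[float]) -> float:
    """Utterance-level count as the sum of per-frame outputs (negatives clipped to 0)."""
    return sum(max(0.0, y) for y in frame_outputs)


def count_loss(frame_outputs: Sequence[float], true_count: int) -> float:
    """Absolute error between the predicted and annotated syllable counts.

    Only the total count per utterance is required as supervision;
    no syllable-level alignment enters the computation.
    """
    return abs(predicted_count(frame_outputs) - true_count)


def mean_relative_error(pred_counts: Sequence[float],
                        true_counts: Sequence[int]) -> float:
    """Mean absolute count error relative to the true count, in percent.

    Utterances with a true count of zero are skipped to avoid division by zero.
    """
    errors = [abs(p - t) / t for p, t in zip(pred_counts, true_counts) if t > 0]
    return 100.0 * sum(errors) / len(errors)


if __name__ == "__main__":
    # Hypothetical frame outputs for one utterance annotated with 2 syllables.
    frames = [0.5, 0.5, 0.0, 1.0]
    print(count_loss(frames, true_count=2))
    print(mean_relative_error([3, 5], [4, 5]))
```

In a trainable model the same count-level error would be backpropagated through the network that produces the frame outputs; adaptation to a new language would then amount to continuing that training on a small set of utterances with known counts.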
| Original language | English |
|---|---|
| Pages (from-to) | 1359-1363 |
| Number of pages | 5 |
| Journal | IEEE Signal Processing Letters |
| Volume | 26 |
| Issue number | 9 |
| DOIs | |
| Publication status | Published - Sept 2019 |
| MoE publication type | A1 Journal article-refereed |
Funding
This work was supported by the Academy of Finland under Grants 312105 and 314602. The associate editor coordinating the review of this manuscript and approving it for publication was Dr. Tomoki Toda.
Keywords
- syllable count estimation
- end-to-end learning
- deep learning
- speech processing
- SEGMENTATION
Fingerprint
Dive into the research topics of 'SylNet: An Adaptable End-to-End Syllable Count Estimator for Speech'. Together they form a unique fingerprint.
Projects
- 2 Finished
- Computational basis of contextually grounded language acquisition in humans and machines
  Räsänen, O. (Principal investigator)
  31/12/2017 → 31/08/2023
  Project: Academy of Finland: Other research funding
- ACLEW: Analyzing Child Language Experiences Around the World
  Räsänen, O. (Principal investigator) & Seshadri, S. (Project Member)
  01/06/2017 → 31/05/2020
  Project: Academy of Finland: Other research funding