Abstract
Prior studies on the automatic classification of voice quality have mainly used support vector machine (SVM) classifiers with the acoustic speech signal as input. Recently, one voice quality classification study was published that used neck surface accelerometer (NSA) and speech signals as inputs to SVMs with hand-crafted glottal source features. The present study examines simultaneously recorded NSA and speech signals in the classification of three voice qualities (breathy, modal, and pressed) using convolutional neural networks (CNNs) as the classifier. The study has two goals: (1) to investigate which of the two signals (NSA vs. speech) is more useful in the classification task, and (2) to determine whether deep learning-based CNN classifiers with spectrogram and mel-spectrogram features improve classification accuracy compared to SVM classifiers using hand-crafted glottal source features. The results indicated that the NSA signal yielded better classification of the voice qualities than the speech signal, and that the CNN classifier outperformed the SVM classifiers by large margins. The best mean classification accuracy was achieved with the mel-spectrogram as input to the CNN classifier (93.8% for NSA and 90.6% for speech).
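As a rough illustration of the mel-spectrogram features mentioned in the abstract, the following sketch computes one log-mel frame with the Python standard library only. It is not the authors' pipeline: the HTK-style mel formula, the Hann window, the 8 kHz sample rate, the 256-sample frame, the 10 mel bands, and all function names are assumptions chosen for the example.

```python
import cmath
import math


def hz_to_mel(f_hz):
    # HTK-style mel formula (assumed; the record does not state which variant was used)
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_filterbank(n_mels, n_fft, sample_rate):
    """Triangular filters as a list of n_mels rows, each with n_fft // 2 + 1 weights."""
    n_bins = n_fft // 2 + 1
    mel_max = hz_to_mel(sample_rate / 2.0)
    # n_mels + 2 band edges, equally spaced on the mel scale from 0 Hz to Nyquist
    edges_hz = [mel_to_hz(mel_max * i / (n_mels + 1)) for i in range(n_mels + 2)]
    bin_hz = [k * sample_rate / n_fft for k in range(n_bins)]
    bank = []
    for m in range(1, n_mels + 1):
        lo, mid, hi = edges_hz[m - 1], edges_hz[m], edges_hz[m + 1]
        row = []
        for f in bin_hz:
            rising = (f - lo) / (mid - lo)
            falling = (hi - f) / (hi - mid)
            row.append(max(0.0, min(rising, falling)))
        bank.append(row)
    return bank


def power_spectrum(frame):
    """Naive DFT power spectrum of a Hann-windowed frame (slow, but fine for a demo)."""
    n = len(frame)
    windowed = [x * 0.5 * (1.0 - math.cos(2.0 * math.pi * i / n))
                for i, x in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):
        s = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                for i, x in enumerate(windowed))
        spectrum.append(abs(s) ** 2)
    return spectrum


def log_mel_frame(frame, sample_rate, n_mels):
    """Log of the mel-band energies for a single frame."""
    bank = mel_filterbank(n_mels, len(frame), sample_rate)
    power = power_spectrum(frame)
    # Small offset avoids log(0) in bands that receive no energy
    return [math.log(sum(w * p for w, p in zip(row, power)) + 1e-10)
            for row in bank]


# Demo input: a 1 kHz sine sampled at 8 kHz (illustrative values only)
sample_rate = 8000
frame = [math.sin(2.0 * math.pi * 1000.0 * i / sample_rate) for i in range(256)]
features = log_mel_frame(frame, sample_rate, n_mels=10)
```

For the 1 kHz test tone, the largest value in `features` falls in the band whose triangle is centred closest to 1 kHz. Stacking such frames over time gives the two-dimensional mel-spectrogram that a CNN would take as input; in practice an FFT library would replace the naive DFT.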
Original language | English |
---|---|
Title of host publication | Proceedings of Interspeech'22 |
Publisher | International Speech Communication Association (ISCA) |
Pages | 5253 - 5257 |
Number of pages | 5 |
Volume | 2022-September |
DOIs | |
Publication status | Published - 2022 |
MoE publication type | A4 Article in a conference publication |
Event | Interspeech - Incheon, Korea, Republic of. Duration: 18 Sept 2022 → 22 Sept 2022 |
Publication series
Name | Annual Conference of the International Speech Communication Association |
---|---|
Publisher | International Speech Communication Association |
ISSN (Print) | 1990-9772 |
ISSN (Electronic) | 2308-457X |
Conference
Conference | Interspeech |
---|---|
Country/Territory | Korea, Republic of |
City | Incheon |
Period | 18/09/2022 → 22/09/2022 |
Keywords
- Voice quality
- neck surface accelerometer
- Melspectrogram
- computational paralinguistics
- CNNs
Fingerprint
Research topics of 'Convolutional Neural Networks for Classification of Voice Qualities from Speech and Neck Surface Accelerometer Signals'.
Projects
- 1 Active
- HEART: Speech-based biomarking of heart failure
Alku, P., Javanmardi, F., Mittapalle, K., Tirronen, S., Kadiri, S., Pohjalainen, H. & Kodali, M.
01/09/2020 → 31/08/2024
Project: Academy of Finland: Other research funding