Abstract
The present study investigates the use of 1-dimensional (1-D) and 2-dimensional (2-D) spectral feature representations in voice pathology detection with several classical machine learning (ML) and recent deep learning (DL) classifiers. Four widely used spectral feature representations (static mel-frequency cepstral coefficients (MFCCs), dynamic MFCCs, spectrogram and mel-spectrogram) are derived from voice signals in both 1-D and 2-D form. Three widely used ML classifiers (support vector machine (SVM), random forest (RF) and AdaBoost) and three DL classifiers (deep neural network (DNN), long short-term memory (LSTM) network, and convolutional neural network (CNN)) are used with the 1-D feature representations. In addition, CNN classifiers are built using the 2-D feature representations. The widely used HUPA database is used in the pathology detection experiments. Experimental results revealed that using the CNN classifier with the 2-D feature representations yielded better accuracy than using the ML and DL classifiers with the 1-D feature representations. The best performance was achieved by the 2-D CNN classifier based on dynamic MFCCs, which reached a detection accuracy of 81%.
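To make the 1-D versus 2-D distinction concrete, the following is a minimal sketch rather than the authors' pipeline. It assumes that the 2-D form is a frequency-by-time matrix and that the 1-D form is obtained by averaging that matrix over frames; the abstract does not state how the 1-D form is derived, so this is an illustrative assumption. The function names, frame settings, and the synthetic test tone are all hypothetical, and only the plain log-magnitude spectrogram is shown.

```python
import numpy as np


def log_spectrogram(signal, frame_len=400, hop=160, n_fft=512):
    """Return a 2-D log-magnitude spectrogram, shape (n_fft // 2 + 1, n_frames).

    Frame length, hop, and FFT size are illustrative values, not taken from the paper.
    """
    n_frames = 1 + (len(signal) - frame_len) // hop
    window = np.hamming(frame_len)
    # Slice the signal into overlapping, windowed frames (one frame per row).
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    magnitude = np.abs(np.fft.rfft(frames, n=n_fft, axis=1))
    # Small offset avoids log(0); transpose to (frequency, time).
    return np.log(magnitude + 1e-10).T


def to_1d(feature_2d):
    """Collapse the time axis by averaging (an assumption, not the paper's stated method)."""
    return feature_2d.mean(axis=1)


# Synthetic one-second 150 Hz tone as a stand-in for a sustained vowel recording.
sr = 16000
t = np.arange(sr) / sr
voice = np.sin(2 * np.pi * 150 * t)

spec_2d = log_spectrogram(voice)  # matrix input, e.g. for a 2-D CNN
spec_1d = to_1d(spec_2d)          # vector input, e.g. for SVM, RF or a DNN
print(spec_2d.shape, spec_1d.shape)
```

Under these assumptions, the 2-D matrix preserves how the spectrum evolves across frames, while the 1-D vector keeps only a per-frequency summary.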
Original language | English |
---|---|
Title of host publication | INTERSPEECH 2022 |
Publisher | International Speech Communication Association (ISCA) |
Pages | 2173 - 2177 |
Number of pages | 5 |
Volume | 2022-September |
Publication status | Published - Sept 2022 |
MoE publication type | A4 Article in a conference publication |
Event | Interspeech, Incheon, Korea, Republic of. Duration: 18 Sept 2022 → 22 Sept 2022 |
Publication series
Name | Interspeech |
---|---|
Publisher | International Speech Communication Association |
ISSN (Print) | 1990-9772 |
ISSN (Electronic) | 2308-457X |
Conference
Conference | Interspeech |
---|---|
Country/Territory | Korea, Republic of |
City | Incheon |
Period | 18/09/2022 → 22/09/2022 |
Fingerprint
Dive into the research topics of 'Comparing 1-dimensional and 2-dimensional spectral feature representations in voice pathology detection using machine learning and deep learning classifiers'. Together they form a unique fingerprint.
Projects
HEART: Speech-based biomarking of heart failure
Alku, P., Javanmardi, F., Mittapalle, K., Tirronen, S., Kadiri, S., Pohjalainen, H. & Kodali, M.
01/09/2020 → 31/08/2024
Project: Academy of Finland: Other research funding