Dysarthric speech classification from coded telephone speech using glottal features

Research output: Contribution to journalArticle

Standard

Harvard

APA

Vancouver

Author

Bibtex - Download

@article{50a9cb90f7de4ab38adc73776470d179,
title = "Dysarthric speech classification from coded telephone speech using glottal features",
abstract = "This paper proposes a new dysarthric speech classification method from coded telephone speech using glottal features. The proposed method utilizes glottal features, which are efficiently estimated from coded telephone speech using a recently proposed deep neural net-based glottal inverse filtering method. Two sets of glottal features were considered: (1) time- and frequency-domainparameters and (2) parameters based on principal component analysis (PCA). In addition, acoustic features are extracted from coded telephone speech using the openSMILE toolkit. The proposed method utilizes both acoustic and glottal features extracted from coded speech utterances and their corresponding dysarthric/healthy labels to train support vector machine classifiers. Separateclassifiers are trained using both individual, and the combination of glottal and acoustic features. The coded telephone speech used in the experiments is generated using the adaptive multi-rate codec, which operates in two transmission bandwidths: narrowband (300 Hz - 3.4 kHz) and wideband (50 Hz - 7 kHz). The experiments were conducted using dysarthric and healthy speech utterances of the TORGO and universal access speech (UA-Speech) databases. Classification accuracy results indicated the effectiveness of glottal features in the identification of dysarthria from coded telephone speech. The results also showed that the glottal features in combination with the openSMILE-based acoustic features resulted in improved classification accuracies, which validate the complementary nature of glottal features. The proposed dysarthric speech classification method can potentially be employed in telemonitoring application for identifying the presence of dysarthria from coded telephone speech.",
keywords = "Dysarthric speech, glottal parameters, glottal source estimation, glottal inverse filtering, openSMILE, support vector machines, telemonitoring",
author = "{Nonavinakere Prabhakera}, Narendra and Paavo Alku",
year = "2019",
month = "7",
doi = "10.1016/j.specom.2019.04.003",
language = "English",
volume = "110",
pages = "47--55",
journal = "Speech Communication",
issn = "0167-6393",
publisher = "Elsevier",

}

RIS - Download

TY - JOUR

T1 - Dysarthric speech classification from coded telephone speech using glottal features

AU - Nonavinakere Prabhakera, Narendra

AU - Alku, Paavo

PY - 2019/7

Y1 - 2019/7

N2 - This paper proposes a new dysarthric speech classification method from coded telephone speech using glottal features. The proposed method utilizes glottal features, which are efficiently estimated from coded telephone speech using a recently proposed deep neural net-based glottal inverse filtering method. Two sets of glottal features were considered: (1) time- and frequency-domainparameters and (2) parameters based on principal component analysis (PCA). In addition, acoustic features are extracted from coded telephone speech using the openSMILE toolkit. The proposed method utilizes both acoustic and glottal features extracted from coded speech utterances and their corresponding dysarthric/healthy labels to train support vector machine classifiers. Separateclassifiers are trained using both individual, and the combination of glottal and acoustic features. The coded telephone speech used in the experiments is generated using the adaptive multi-rate codec, which operates in two transmission bandwidths: narrowband (300 Hz - 3.4 kHz) and wideband (50 Hz - 7 kHz). The experiments were conducted using dysarthric and healthy speech utterances of the TORGO and universal access speech (UA-Speech) databases. Classification accuracy results indicated the effectiveness of glottal features in the identification of dysarthria from coded telephone speech. The results also showed that the glottal features in combination with the openSMILE-based acoustic features resulted in improved classification accuracies, which validate the complementary nature of glottal features. The proposed dysarthric speech classification method can potentially be employed in telemonitoring application for identifying the presence of dysarthria from coded telephone speech.

AB - This paper proposes a new dysarthric speech classification method from coded telephone speech using glottal features. The proposed method utilizes glottal features, which are efficiently estimated from coded telephone speech using a recently proposed deep neural net-based glottal inverse filtering method. Two sets of glottal features were considered: (1) time- and frequency-domainparameters and (2) parameters based on principal component analysis (PCA). In addition, acoustic features are extracted from coded telephone speech using the openSMILE toolkit. The proposed method utilizes both acoustic and glottal features extracted from coded speech utterances and their corresponding dysarthric/healthy labels to train support vector machine classifiers. Separateclassifiers are trained using both individual, and the combination of glottal and acoustic features. The coded telephone speech used in the experiments is generated using the adaptive multi-rate codec, which operates in two transmission bandwidths: narrowband (300 Hz - 3.4 kHz) and wideband (50 Hz - 7 kHz). The experiments were conducted using dysarthric and healthy speech utterances of the TORGO and universal access speech (UA-Speech) databases. Classification accuracy results indicated the effectiveness of glottal features in the identification of dysarthria from coded telephone speech. The results also showed that the glottal features in combination with the openSMILE-based acoustic features resulted in improved classification accuracies, which validate the complementary nature of glottal features. The proposed dysarthric speech classification method can potentially be employed in telemonitoring application for identifying the presence of dysarthria from coded telephone speech.

KW - Dysarthric speech

KW - glottal parameters

KW - glottal source estimation

KW - glottal inverse filtering

KW - openSMILE

KW - support vector machines

KW - telemonitoring

U2 - 10.1016/j.specom.2019.04.003

DO - 10.1016/j.specom.2019.04.003

M3 - Article

VL - 110

SP - 47

EP - 55

JO - Speech Communication

JF - Speech Communication

SN - 0167-6393

ER -

ID: 32820526