Abstract
Nasal and approximant consonants are often confused with each other. Despite differences in their production mechanisms, the two sound classes exhibit similar low-frequency behavior and lack significant high-frequency content. The present study uses a spectral representation obtained from zero time windowing (ZTW) analysis of speech to distinguish between them. This instantaneous spectral representation has good resolution at resonances, which helps highlight differences in the acoustic response of the vocal tract system for these sounds. The ZTW spectra around glottal closure instants are averaged to derive parameters for classifying the sounds in continuous speech. A set of parameters based on the dominant resonances, center of gravity, band energy ratio, and cumulative spectral sum at low frequencies is derived from the average spectrum. The paper proposes two classifiers: a knowledge-based approach and a trained support vector machine. Both are tested on utterances from different English speakers in the TIMIT dataset. The proposed methods achieve an average classification accuracy of 90% between the two classes in continuous speech.
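To make the parameter set concrete, the following is a minimal sketch of how such quantities could be computed from an averaged magnitude spectrum. It is not the authors' implementation. The function name `spectral_parameters`, the 1 kHz band split, and the exact definitions of each parameter are assumptions made for illustration, since the abstract does not specify them. The input is taken to be any one-sided magnitude spectrum rather than a ZTW spectrum.

```python
import numpy as np


def spectral_parameters(avg_spectrum, sample_rate=16000, split_hz=1000.0):
    """Illustrative parameters from a one-sided magnitude spectrum.

    Assumptions (not from the paper): bins are evenly spaced from 0 Hz to
    sample_rate / 2, and split_hz separates the "low" and "high" bands.
    """
    spec = np.asarray(avg_spectrum, dtype=float)
    freqs = np.linspace(0.0, sample_rate / 2.0, num=spec.size)
    power = spec ** 2
    total = power.sum()
    if total <= 0.0:
        raise ValueError("spectrum has no energy")

    low = freqs < split_hz
    low_energy = power[low].sum()
    high_energy = power[~low].sum()

    return {
        # Frequency of the largest spectral peak (a stand-in for the
        # dominant resonance).
        "dominant_hz": float(freqs[np.argmax(spec)]),
        # Power-weighted mean frequency (spectral center of gravity).
        "cog_hz": float((freqs * power).sum() / total),
        # Low-band energy over high-band energy; the small floor avoids
        # division by zero when the high band is empty.
        "band_energy_ratio": float(low_energy / max(high_energy, 1e-12)),
        # Fraction of the total energy accumulated below split_hz.
        "low_cumulative": float(low_energy / total),
    }
```

A feature vector built from these values for each segment could then be passed to a hand-written rule set or to a support vector machine, mirroring the two classifier types named in the abstract.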
Original language | English |
---|---|
Title of host publication | Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH |
Publisher | International Speech Communication Association (ISCA) |
Pages | 177-181 |
Number of pages | 5 |
Publication status | Published - 2018 |
MoE publication type | A4 Conference publication |
Event | Interspeech, Hyderabad International Convention Centre, Hyderabad, India. Duration: 2 Sept 2018 → 6 Sept 2018. http://interspeech2018.org/ |
Publication series
Name | Interspeech |
---|---|
ISSN (Print) | 1990-9772 |
ISSN (Electronic) | 2308-457X |
Conference
Conference | Interspeech |
---|---|
Country/Territory | India |
City | Hyderabad |
Period | 02/09/2018 → 06/09/2018 |