Mel-frequency cepstral coefficients of voice source waveforms for classification of phonation types in speech

Sudarsana Reddy Kadiri, Paavo Alku

Research output: Chapter in Book/Report/Conference proceeding › Conference article in proceedings › Scientific › peer-review

Abstract

Voice source characteristics vary across phonation types because of differences in the tension of the laryngeal muscles and in respiratory effort. This study investigates the use of mel-frequency cepstral coefficients (MFCCs) derived from voice source waveforms for the classification of phonation types in speech. The cepstral coefficients are computed from two source waveforms: (1) glottal flow waveforms estimated by the quasi-closed phase (QCP) glottal inverse filtering method and (2) approximate voice source waveforms obtained using the zero frequency filtering (ZFF) method. QCP estimates voice source waveforms based on the source-filter decomposition, whereas ZFF yields source waveforms without explicitly computing that decomposition. Experiments using MFCCs computed from the two source waveforms show improved accuracy in the classification of phonation types compared to existing voice source features and conventional MFCC features. Further, the proposed features are observed to carry information complementary to the existing features.
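
The ZFF branch of the pipeline described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names (`zff_source`, `mel_filterbank`, `mfcc`), the fixed pitch-period guess, the frame and hop lengths, and the filter and coefficient counts are all assumptions chosen for illustration. The QCP branch is omitted because it requires a full glottal inverse filtering stage.

```python
import numpy as np

def zff_source(speech, fs, pitch_period_ms=8.0, n_trend_passes=3):
    """Approximate voice source via zero frequency filtering (illustrative sketch)."""
    x = np.diff(speech, prepend=speech[0]).astype(np.float64)  # difference the signal
    y = x
    for _ in range(4):            # two cascaded 0-Hz resonators == four integrators
        y = np.cumsum(y)
    # Local-mean window of roughly 1.5 pitch periods (pitch period is an assumed value).
    half = int(round(0.75 * pitch_period_ms * 1e-3 * fs))
    win = np.ones(2 * half + 1)
    count = np.convolve(np.ones(len(y)), win, mode="same")  # samples under the window
    for _ in range(n_trend_passes):                          # remove polynomial trend
        y = y - np.convolve(y, win, mode="same") / count
    peak = np.max(np.abs(y))
    return y / peak if peak > 0 else y

def mel_filterbank(n_filters, n_fft, fs):
    """Triangular filters spaced uniformly on the mel scale."""
    hz_to_mel = lambda f: 2595.0 * np.log10(1.0 + f / 700.0)
    mel_to_hz = lambda m: 700.0 * (10.0 ** (m / 2595.0) - 1.0)
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(fs / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / fs).astype(int)
    fb = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(n_filters):
        lo, mid, hi = bins[i], bins[i + 1], bins[i + 2]
        fb[i, lo:mid] = (np.arange(lo, mid) - lo) / max(mid - lo, 1)   # rising edge
        fb[i, mid:hi] = (hi - np.arange(mid, hi)) / max(hi - mid, 1)   # falling edge
    return fb

def mfcc(signal, fs, frame_ms=25.0, hop_ms=5.0, n_filters=26, n_ceps=13):
    """MFCCs: framing, Hamming window, power spectrum, log mel energies, DCT-II."""
    frame_len = int(fs * frame_ms / 1000)
    hop = int(fs * hop_ms / 1000)
    n_fft = 1 << (frame_len - 1).bit_length()          # next power of two
    n_frames = 1 + (len(signal) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hamming(frame_len)
    power = np.abs(np.fft.rfft(frames, n_fft, axis=1)) ** 2
    log_mel = np.log(power @ mel_filterbank(n_filters, n_fft, fs).T + 1e-10)
    basis = np.cos(np.pi * np.outer(np.arange(n_ceps), np.arange(n_filters) + 0.5) / n_filters)
    return log_mel @ basis.T                            # shape: (n_frames, n_ceps)
```

For a speech array `speech` sampled at `fs` Hz, `mfcc(zff_source(speech, fs), fs)` would give one row of cepstral coefficients per frame, which could then be passed to a classifier. In practice the trend-removal window is usually set from an estimate of the speaker's average pitch period rather than a fixed value, and the running sums grow quickly, so long recordings may need to be processed in segments.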

Original language: English
Title of host publication: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Publisher: International Speech Communication Association (ISCA)
Pages: 2508-2512
Number of pages: 5
Volume: 2019-September
Publication status: Published - 1 Jan 2019
MoE publication type: A4 Conference publication
Event: Interspeech - Graz, Austria
Duration: 15 Sept 2019 to 19 Sept 2019
https://www.interspeech2019.org/

Publication series

Name: Interspeech - Annual Conference of the International Speech Communication Association, INTERSPEECH
ISSN (Electronic): 2308-457X

Conference

Conference: Interspeech
Country/Territory: Austria
City: Graz
Period: 15/09/2019 to 19/09/2019

Keywords

  • Glottal inverse filtering
  • Phonation type
  • Speech analysis
  • Voice quality
  • Voice source
  • Zero frequency filtering
