Breathy to tense voice discrimination using zero-time windowing cepstral coefficients (ZTWCCs)

Sudarsana Reddy Kadiri, B. Yegnanarayana

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, Scientific, peer-review

11 Citations (Scopus)

Abstract

In this paper, we consider breathy to tense voices, which are often considered to be opposite ends of a voice quality continuum. Along with these, other aspects of a speaker's voice play an important role in conveying information to the listener, such as mood, attitude and emotional state. The glottal pulse characteristics in different phonation types vary due to the tension of the laryngeal muscles together with the respiratory effort. In the present study, we derive features that capture the effects of excitation on the vocal tract system through a signal processing method called zero-time windowing (ZTW). The ZTW method gives an instantaneous spectrum that captures changes in the speech production mechanism while providing higher spectral resolution. The cepstral coefficients derived from the ZTW method are used for the classification of phonation types. Along with zero-time windowing cepstral coefficients (ZTWCCs), we use excitation source features derived from the zero frequency filtering (ZFF) method. The excitation features used are: strength of excitation, energy of excitation, loudness measure and ZFF signal energy. Classification experiments using ZTWCC and excitation features reveal a significant improvement in the detection of phonation type compared to existing voice quality features and MFCC features.
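As a rough illustration of the cepstral step mentioned in the abstract, the sketch below computes real cepstral coefficients from a single frame as the inverse DFT of its log-magnitude spectrum. It is not the authors' ZTW implementation: the Hann window, the frame length, the FFT size, the number of coefficients, and the function name `cepstral_coefficients` are all assumptions chosen for illustration, and the test signal is synthetic.

```python
import numpy as np

def cepstral_coefficients(frame, n_coeffs=13, n_fft=512):
    """Real cepstrum of one frame: inverse DFT of the log-magnitude spectrum."""
    # Assumption: a Hann window stands in for the ZTW window used in the paper.
    windowed = frame * np.hanning(len(frame))
    # Magnitude spectrum of the zero-padded frame.
    spectrum = np.abs(np.fft.rfft(windowed, n=n_fft))
    # A small floor avoids taking log(0).
    log_spectrum = np.log(spectrum + 1e-10)
    # The inverse DFT of the log-magnitude spectrum gives the real cepstrum.
    cepstrum = np.fft.irfft(log_spectrum, n=n_fft)
    # Keep only the low-order coefficients as the feature vector.
    return cepstrum[:n_coeffs]

# Hypothetical 25 ms test frame at 16 kHz (two sinusoids, not real speech).
fs = 16000
t = np.arange(int(0.025 * fs)) / fs
frame = np.sin(2 * np.pi * 150 * t) + 0.5 * np.sin(2 * np.pi * 300 * t)
coeffs = cepstral_coefficients(frame)
```

In a classification setting such as the one the abstract describes, a vector like `coeffs` would be computed per analysis instant and passed to a classifier together with the excitation features; the choice of classifier is not stated in the abstract.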

Original language: English
Title of host publication: Interspeech
Publisher: International Speech Communication Association
Pages: 232-236
Number of pages: 5
Volume: 2018-September
Publication status: Published - 2018
MoE publication type: A4 Article in a conference publication
Event: Interspeech - Hyderabad International Convention Centre, Hyderabad, India
Duration: 2 Sep 2018 to 6 Sep 2018
http://interspeech2018.org/

Publication series

Name: Interspeech
Publisher: International Speech Communication Association
ISSN (Print): 2308-457X

Conference

Conference: Interspeech
Country/Territory: India
City: Hyderabad
Period: 02/09/2018 to 06/09/2018

Keywords

  • Excitation source
  • Phonation type
  • Speech analysis
