Whispered Speech Detection Using Fusion of Group-Delay-Based Subband Modulation Spectrum and Correntropy Features

Jinfang Wang, Yongqiang Shang, Shuangshuang Jiang, Dhananjaya Gowda, Ke Lv

Research output: Contribution to journal, Letter, Scientific, peer-review

2 Citations (Scopus)


In this letter, we propose a novel fusion feature for detection of whispered speech in noisy environments using a group-delay-based instantaneous spectrum analysis. The fusion feature has two components, namely, subband modulation spectrum (SMS)-based features and subband correntropy (SCE) features, both extracted from the instantaneous spectrum. The instantaneous spectrum estimation uses zero-time windowing for improved temporal resolution and group-delay computation for improved spectral resolution, compared to traditional discrete-Fourier-transform-based spectrum estimation. The SMS features capture the spectral representation of the subband energy time trajectories, while the SCE features model the fluctuations in those trajectories. The SMS captures both the short-term and long-term spectral characteristics of whispered speech and is known to provide good separation between speech and noise components. The correntropy features help capture the dynamics of the vocal tract system to discriminate noisy whisper from noise. Whispered speech detection experiments using support vector machine models and the proposed features indicate promising performance under low signal-to-noise conditions.
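The two feature families named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a generic (frames x bins) power spectrogram as input rather than the zero-time-windowed, group-delay-based instantaneous spectrum, and the function names, the equal-width band split, the unnormalized Gaussian kernel, the kernel width sigma, and the 100 frames/s rate are all hypothetical choices made for illustration.

```python
import numpy as np

def subband_energies(spectrogram, n_bands):
    """Sum a (frames x bins) power spectrogram into n_bands contiguous subbands."""
    bands = np.array_split(spectrogram, n_bands, axis=1)
    return np.stack([b.sum(axis=1) for b in bands], axis=1)  # frames x n_bands

def modulation_spectrum(trajectory):
    """Magnitude spectrum of a mean-removed subband energy time trajectory."""
    centered = trajectory - trajectory.mean()
    return np.abs(np.fft.rfft(centered))

def correntropy(trajectory, max_lag, sigma=1.0):
    """Sample autocorrentropy for lags 0..max_lag.

    Uses an unnormalized Gaussian kernel (an assumption), so lag 0 equals 1.
    """
    x = np.asarray(trajectory, dtype=float)
    out = np.empty(max_lag + 1)
    for lag in range(max_lag + 1):
        diff = x[lag:] - x[:len(x) - lag]
        out[lag] = np.mean(np.exp(-diff**2 / (2.0 * sigma**2)))
    return out

# Synthetic stand-in for a spectrogram: every bin shares a 4 Hz energy modulation.
frame_rate = 100                      # frames per second (assumed)
n_frames = 200
t = np.arange(n_frames) / frame_rate
envelope = 1.0 + 0.5 * np.sin(2 * np.pi * 4 * t)
spec = np.outer(envelope, np.ones(64))

E = subband_energies(spec, n_bands=8)             # shape (200, 8)
ms = modulation_spectrum(E[:, 0])
mod_freqs = np.fft.rfftfreq(n_frames, d=1.0 / frame_rate)
peak_hz = mod_freqs[np.argmax(ms)]                # modulation frequency of the peak
c = correntropy(E[:, 0], max_lag=5)
```

In a detection setup along the lines the abstract describes, per-subband vectors such as `ms` and `c` would be concatenated into one fused feature vector and passed to a classifier such as a support vector machine; the exact feature dimensions and fusion scheme are not given here.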

Original language: English
Article number: 7491318
Pages (from-to): 1042-1046
Number of pages: 5
Journal: IEEE Signal Processing Letters
Issue number: 8
Publication status: Published - 1 Aug 2016
MoE publication type: A1 Journal article-refereed


  • correntropy
  • group delay
  • instantaneous spectrum analysis
  • modulation spectrum
  • Whispered speech detection
  • zero-time windowing

