Whispered Speech Detection Using Fusion of Group-Delay-Based Subband Modulation Spectrum and Correntropy Features

Jinfang Wang, Yongqiang Shang, Shuangshuang Jiang, Dhananjaya Gowda, Ke Lv

Research output: Contribution to journal › Letter › Scientific › peer-reviewed

1 Citation (Scopus)

Abstract

In this letter, we propose a novel fusion feature for detecting whispered speech in noisy environments using group-delay-based instantaneous spectrum analysis. The fusion feature has two components, subband modulation spectrum (SMS) features and subband correntropy (SCE) features, both extracted from the instantaneous spectrum. The instantaneous spectrum estimation uses zero-time windowing for improved temporal resolution and group-delay computation for improved spectral resolution, compared with traditional discrete-Fourier-transform-based spectrum estimation. The SMS features capture the spectral representation of the subband energy time trajectories, while the SCE features model the fluctuations in those trajectories. The SMS captures both the short-term and long-term spectral characteristics of whispered speech and is known to provide good separation between speech and noise components. The correntropy features help capture the dynamics of the vocal tract system to discriminate noisy whisper from noise. Whispered speech detection experiments using support vector machine models and the proposed features indicate promising performance under low signal-to-noise conditions.
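
The relationship between the two feature types can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes a plain Hann-windowed DFT as a stand-in for the zero-time-windowed, group-delay-based instantaneous spectrum, and the function names, frame length, hop size, number of subbands, kernel width, and lag range are all hypothetical choices made only for illustration. The correntropy here is the unnormalized Gaussian-kernel form, so its value at lag 0 is 1.

```python
import numpy as np

def subband_energy_trajectories(signal, frame_len=256, hop=64, n_bands=8):
    """Per-frame energy in each subband, shape (n_bands, n_frames).

    Assumption: a Hann-windowed DFT stands in for the group-delay-based
    instantaneous spectrum described in the abstract.
    """
    n_frames = 1 + (len(signal) - frame_len) // hop
    window = np.hanning(frame_len)
    frames = np.stack([signal[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    bands = np.array_split(power, n_bands, axis=1)  # equal-width bands (assumed)
    return np.stack([band.sum(axis=1) for band in bands])

def modulation_spectrum(trajectories):
    """Magnitude spectrum of each mean-removed subband energy trajectory."""
    centered = trajectories - trajectories.mean(axis=1, keepdims=True)
    return np.abs(np.fft.rfft(centered, axis=1))

def correntropy(x, max_lag=10, sigma=1.0):
    """Unnormalized Gaussian-kernel correntropy for lags 0..max_lag."""
    x = (x - x.mean()) / (x.std() + 1e-12)  # standardize the trajectory
    values = np.empty(max_lag + 1)
    for lag in range(max_lag + 1):
        diff = x[lag:] - x[:len(x) - lag]
        values[lag] = np.mean(np.exp(-diff ** 2 / (2.0 * sigma ** 2)))
    return values

def fusion_features(signal, max_lag=10):
    """Concatenate modulation-spectrum and correntropy features."""
    log_energy = np.log(subband_energy_trajectories(signal) + 1e-12)
    sms = modulation_spectrum(log_energy)
    sce = np.stack([correntropy(t, max_lag) for t in log_energy])
    return np.concatenate([sms.ravel(), sce.ravel()])

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    noise = rng.standard_normal(4000)  # synthetic input, not real speech
    print(fusion_features(noise).shape)
```

A vector like this could then be passed to a support vector machine classifier (for example `sklearn.svm.SVC`), as the abstract describes; the classifier settings used in the letter are not stated here.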

Original language: English
Article number: 7491318
Pages: 1042-1046
Number of pages: 5
Journal: IEEE Signal Processing Letters
Volume: 23
Issue number: 8
DOIs (permanent links)
Status: Published - 1 August 2016
MoE publication type: A1 Journal article, refereed
