Normal-to-shouted speech spectral mapping for speaker recognition under vocal effort mismatch

Ana Ramírez López, Rahim Saeidi, Lauri Juvela, Paavo Alku

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; Scientific; peer-review

6 Citations (Scopus)

Abstract

Speaker recognition performance degrades substantially when there is a vocal effort mismatch (e.g. shouted vs. normal speech) between test and enrollment utterances. Such a mismatch is often encountered, for example, in forensic speaker recognition. This paper introduces a novel spectral mapping method which, when employed jointly with a statistical mapping technique, converts the Mel-frequency band energies of normal speech towards their counterparts in shouted speech. The aim is to obtain more robust speaker recognition performance by tackling vocal effort mismatch between enrollment and test utterances. The processing is performed on the speech signal before feature extraction. The proposed approach was evaluated by testing the performance of a state-of-the-art i-vector-based speaker recognition system with and without applying the spectral mapping to the enrollment data. The results show that pre-processing with the proposed approach considerably improves correct identification rates.

Original language: English
Title of host publication: 2017 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2017 - Proceedings
Publisher: IEEE
Pages: 4940-4944
Number of pages: 5
ISBN (Electronic): 9781509041176
Publication status: Published - 16 Jun 2017
MoE publication type: A4 Article in a conference publication
Event: IEEE International Conference on Acoustics, Speech, and Signal Processing - New Orleans, United States
Duration: 5 Mar 2017 to 9 Mar 2017

Publication series

Name: Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing
Publisher: IEEE
ISSN (Electronic): 2379-190X

Conference

Conference: IEEE International Conference on Acoustics, Speech, and Signal Processing
Abbreviated title: ICASSP
Country/Territory: United States
City: New Orleans
Period: 05/03/2017 to 09/03/2017

Keywords

  • shouted speech
  • speaker recognition
  • spectral mapping
  • vocal effort mismatch
