Time-varying autoregressions for speaker verification in reverberant conditions

Ville Vestman, Dhananjaya Gowda, Md Sahidullah, Paavo Alku, Tomi Kinnunen

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, Scientific, peer-review

5 Citations (Scopus)
154 Downloads (Pure)

Abstract

In poor room acoustics, speech signals received by a microphone can be corrupted by delayed versions of themselves that are reflected from room surfaces (e.g. walls, floor). This phenomenon, reverberation, degrades the accuracy of automatic speaker verification systems by causing a mismatch between training and testing conditions. Since reverberation causes temporal smearing of the signal, one way to tackle its effects is robust feature extraction, particularly feature extraction based on long-term temporal processing. This approach has previously been adopted in the form of the 2-dimensional autoregressive (2DAR) feature extraction scheme, which uses frequency domain linear prediction (FDLP). In 2DAR, FDLP processing is followed by time domain linear prediction (TDLP). In the current study, we propose modifying the latter part of the 2DAR feature extraction scheme by replacing TDLP with time-varying linear prediction (TVLP) to add an extra layer of temporal processing. Our speaker verification experiments using the proposed features with the text-dependent RedDots corpus show small but consistent improvements in clean and reverberant conditions (up to 6.5%) over the 2DAR features and large improvements over the MFCC features in reverberant conditions (up to 46.5%).
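
As a rough illustration of the FDLP step mentioned in the abstract, the following minimal Python sketch applies autocorrelation-method linear prediction to the DCT of a signal, so that the resulting all-pole model describes a smooth temporal envelope rather than a spectral envelope. It is not the authors' implementation: the function names (`lp_coefficients`, `dct2`, `fdlp_envelope`), the model order, and the use of a full-band DCT without sub-band windowing are assumptions made for illustration, and the TDLP/TVLP stage of the 2DAR scheme is not shown.

```python
import numpy as np

def lp_coefficients(x, order):
    """Autocorrelation-method linear prediction (illustrative helper).

    Returns a with a[0] == 1 such that the prediction error is
    e[n] = x[n] + a[1]*x[n-1] + ... + a[order]*x[n-order].
    """
    x = np.asarray(x, dtype=float)
    # Autocorrelation at lags 0..order.
    r = np.array([np.dot(x[: len(x) - k], x[k:]) for k in range(order + 1)])
    # Toeplitz normal equations: R a = -r[1:].
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    a = np.linalg.solve(R, -r[1:])
    return np.concatenate(([1.0], a))

def dct2(x):
    """Unnormalized DCT-II computed with an explicit cosine matrix."""
    n = len(x)
    k = np.arange(n)[:, None]
    m = np.arange(n)[None, :]
    return np.cos(np.pi * (m + 0.5) * k / n) @ np.asarray(x, dtype=float)

def fdlp_envelope(x, order):
    """Temporal envelope estimate: LP applied to the DCT of x.

    The all-pole response 1/|A|^2 is evaluated on a grid of angles in
    (0, pi) that corresponds to the time samples of x.
    """
    n = len(x)
    a = lp_coefficients(dct2(x), order)
    w = np.pi * (np.arange(n) + 0.5) / n
    A = np.exp(-1j * np.outer(w, np.arange(order + 1))) @ a
    return 1.0 / np.abs(A) ** 2

# Usage: a noise burst centred at sample 154 of a 512-sample frame.
rng = np.random.default_rng(0)
t = np.arange(512)
frame = np.exp(-0.5 * ((t - 154) / 12.0) ** 2) * rng.standard_normal(512)
envelope = fdlp_envelope(frame, order=8)
print("envelope peak at sample", int(np.argmax(envelope)))
```

Applying the same `lp_coefficients` helper directly to a time-domain frame gives a conventional spectral all-pole model; applying it to the DCT swaps the roles of time and frequency, which is the idea FDLP relies on.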
Original language: English
Title of host publication: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Publisher: International Speech Communication Association
Pages: 1512-1516
Number of pages: 5
Volume: 2017-August
ISBN (Print): 978-1-5108-4876-4
DOIs
Publication status: Published - Aug 2017
MoE publication type: A4 Article in a conference publication
Event: Interspeech - Stockholm, Sweden
Duration: 20 Aug 2017 to 24 Aug 2017
Conference number: 18
http://www.interspeech2017.org/

Publication series

Name: Interspeech: Annual Conference of the International Speech Communication Association
ISSN (Electronic): 1990-9772

Conference

Conference: Interspeech
Country: Sweden
City: Stockholm
Period: 20/08/2017 to 24/08/2017
Internet address

Keywords

  • speaker recognition
  • autoregressive modeling
  • autocorrelation domain time-varying linear prediction
