Postfiltering with Complex Spectral Correlations for Speech and Audio Coding

Sneha Das, Tom Bäckström

Research output: Chapter in Book/Report/Conference proceeding; Conference article in proceedings; Scientific; peer-review

7 Citations (Scopus)
288 Downloads (Pure)

Abstract

State-of-the-art speech codecs achieve a good compromise between quality, bitrate and complexity. However, retaining performance outside the target bitrate range remains challenging. To improve performance, many codecs use pre- and postfiltering techniques to reduce the perceptual effect of quantization noise. In this paper, we propose a postfiltering method that attenuates quantization noise by using the complex spectral correlations of speech signals. Since conventional speech codecs cannot transmit information with temporal dependencies, as transmission errors could result in severe error propagation, we model the correlations offline and employ them at the decoder, hence removing the need to transmit any side information. Objective evaluation indicates an average 4 dB improvement in the perceptual SNR of signals using the context-based postfilter with respect to the noisy signal, and an average 2 dB improvement relative to the conventional Wiener filter. These results are confirmed by an improvement of up to 30 MUSHRA points in a subjective listening test.
Original language: English
Title of host publication: Interspeech
Subtitle of host publication: Annual Conference of the International Speech Communication Association
Publisher: International Speech Communication Association (ISCA)
Pages: 3538-3542
Number of pages: 5
DOIs
Publication status: Published - Sept 2018
MoE publication type: A4 Conference publication
Event: Interspeech - Hyderabad International Convention Centre, Hyderabad, India
Duration: 2 Sept 2018 to 6 Sept 2018
http://interspeech2018.org/

Publication series

Name: Interspeech
ISSN (Electronic): 1990-9772

Conference

Conference: Interspeech
Country/Territory: India
City: Hyderabad
Period: 02/09/2018 to 06/09/2018
