Postfiltering Using Log-Magnitude Spectrum for Speech and Audio Coding

Sneha Das, Tom Bäckström

Research output: Chapter in Book/Report/Conference proceeding › Conference article in proceedings › Scientific › peer-review

Abstract

Advanced coding algorithms yield high-quality signals with good coding efficiency within their target bit-rate ranges, but their performance suffers outside the target range. At lower bit rates, performance degrades because the decoded signals are sparse, which gives the signal a perceptually muffled and distorted character. Standard codecs reduce such distortions by applying noise-filling and post-filtering methods. In this paper, we propose a post-processing method based on modeling the inherent time-frequency correlation in the log-magnitude spectrum. The goal is to improve the perceptual SNR of the decoded signals and to reduce the distortions caused by signal sparsity. Objective measures show an average improvement of 1.5 dB for input perceptual SNR in the range of 4 to 18 dB. The improvement is especially prominent in components that had been quantized to zero.
Original language: English
Title of host publication: Interspeech
Subtitle of host publication: Annual Conference of the International Speech Communication Association
Publisher: International Speech Communication Association (ISCA)
Pages: 3543-3547
Number of pages: 5
Publication status: Published - Sept 2018
MoE publication type: A4 Conference publication
Event: Interspeech - Hyderabad International Convention Centre, Hyderabad, India
Duration: 2 Sept 2018 to 6 Sept 2018
http://interspeech2018.org/

Publication series

Name: Interspeech
ISSN (Electronic): 1990-9772

Conference

Conference: Interspeech
Country/Territory: India
City: Hyderabad
Period: 02/09/2018 to 06/09/2018