Dithered Quantization for Frequency-Domain Speech and Audio Coding

Tom Bäckström, Johannes Fischer, Sneha Das

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; Scientific; peer-review

4 Citations (Scopus)
179 Downloads (Pure)

Abstract

A common issue in coding speech and audio in the frequency domain, which appears with decreasing bitrate, is that quantization levels become increasingly sparse. With low accuracy, high-frequency components are typically quantized to zero, which leads to a muffled output signal and musical noise. Bandwidth extension and noise-filling methods attempt to treat the problem by inserting noise of similar energy to the original signal, at the cost of a low signal-to-noise ratio. Dithering methods, however, provide an alternative approach in which both accuracy and energy are retained. We propose a hybrid coding approach where low-energy samples are quantized using dithering instead of the conventional uniform quantizer. For dithering, we apply 1-bit quantization in a randomized sub-space. We further show that the output energy can be adjusted to the desired level using a scaling parameter. Objective measurements and listening tests demonstrate the advantages of the proposed methods.
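
The idea of 1-bit quantization in a randomized sub-space, followed by an energy adjustment, can be illustrated roughly as below. This is a minimal sketch under stated assumptions, not the authors' implementation: the Gram-Schmidt construction of the random basis, the seed shared between encoder and decoder, the energy passed as side information, and the names `random_orthonormal_rows`, `encode_signs`, and `decode_signs` are all hypothetical choices made for illustration.

```python
import math
import random


def random_orthonormal_rows(n, rng):
    """Return n orthonormal rows of length n (Gram-Schmidt on Gaussian draws)."""
    rows = []
    while len(rows) < n:
        v = [rng.gauss(0.0, 1.0) for _ in range(n)]
        for q in rows:
            dot = sum(a * b for a, b in zip(v, q))
            v = [a - dot * b for a, b in zip(v, q)]
        norm = math.sqrt(sum(a * a for a in v))
        if norm > 1e-9:  # skip the (very unlikely) degenerate draw
            rows.append([a / norm for a in v])
    return rows


def encode_signs(x, seed):
    """Project x onto a seeded random basis and keep 1 bit (the sign) per coefficient."""
    basis = random_orthonormal_rows(len(x), random.Random(seed))
    return [1 if sum(q_i * x_i for q_i, x_i in zip(q, x)) >= 0 else -1
            for q in basis]


def decode_signs(signs, seed, target_energy):
    """Back-project the sign bits and rescale the result to target_energy."""
    n = len(signs)
    # Assumption: the decoder regenerates the same basis from the shared seed.
    basis = random_orthonormal_rows(n, random.Random(seed))
    y = [sum(s * q[i] for s, q in zip(signs, basis)) for i in range(n)]
    # The basis is orthonormal, so the energy of y equals n before scaling.
    gain = math.sqrt(target_energy / n)
    return [gain * v for v in y]
```

With this construction the reconstruction always correlates positively with a nonzero input, and the scaling step sets its energy to whatever target the decoder is given. Here that target is the input energy, but a smaller gain could be used to trade energy for lower error.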
Original language: English
Title of host publication: Proceedings of Interspeech
Place of Publication: International
Publisher: International Speech Communication Association
Pages: 3533-3537
Publication status: Published - Sep 2018
MoE publication type: A4 Article in a conference publication
Event: Interspeech - Hyderabad International Convention Centre, Hyderabad, India
Duration: 2 Sep 2018 - 6 Sep 2018
http://interspeech2018.org/

Publication series

Name: Interspeech
ISSN (Electronic): 2308-457X

Conference

Conference: Interspeech
Country: India
City: Hyderabad
Period: 02/09/2018 - 06/09/2018
