Blind recovery of perceptual models in distributed speech and audio coding

Tom Bäckström, Florin Ghido, Johannes Fischer

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Scientific › peer-review

11 Citations (Scopus)

Abstract

Perceptual models are a central part of speech and audio codecs: they describe the relative perceptual importance of errors in different elements of the signal representation. In practice, a perceptual model consists of signal-dependent weighting factors which are used in the quantization of each element. For optimal performance, we would like to use the same perceptual model at the decoder. Because the perceptual model is signal-dependent, however, it is not known in advance at the decoder, so audio codecs generally transmit the model explicitly, at the cost of increased bit consumption. In this work we present an alternative method which recovers the perceptual model at the decoder from the transmitted signal without any side information. The approach will be especially useful in distributed sensor networks and the Internet of Things, where the added bit-consumption cost of transmitting a perceptual model increases with the number of sensors.

Original language: English
Title of host publication: Proceedings of the Annual Conference of the International Speech Communication Association
Pages: 2483-2487
Number of pages: 5
Volume: 08-12-September-2016
DOIs
Publication status: Published - 1 Jan 2016
MoE publication type: A4 Article in a conference publication
Event: Interspeech - San Francisco, United States
Duration: 8 Sep 2016 - 12 Sep 2016
Conference number: 17

Publication series

Name: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Publisher: International Speech Communication Association
ISSN (Print): 2308-457X

Conference

Conference: Interspeech
Country: United States
City: San Francisco
Period: 08/09/2016 - 12/09/2016

Keywords

  • Auditory perception
  • Distributed sensor networks
  • Envelope modelling
  • Internet of things
  • Speech analysis
