Blind recovery of perceptual models in distributed speech and audio coding

Tom Bäckström, Florin Ghido, Johannes Fischer

Research output: Article in book/conference proceedings; Conference contribution; Scientific; peer-reviewed

11 Citations (Scopus)

Abstract

Perceptual models are a central part of speech and audio codecs: they describe the relative perceptual importance of errors in different elements of the signal representation. In practice, the perceptual models consist of signal-dependent weighting factors which are used in the quantization of each element. For optimal performance, we would like to use the same perceptual model at the decoder. Because the perceptual model is signal-dependent, however, it is not known in advance at the decoder, so audio codecs generally transmit this model explicitly, at the cost of increased bit consumption. In this work we present an alternative method which recovers the perceptual model at the decoder from the transmitted signal without any side information. The approach will be especially useful in distributed sensor networks and the Internet of Things, where the added bit-consumption cost of transmitting a perceptual model increases with the number of sensors.

Original language: English
Title: Proceedings of the Annual Conference of the International Speech Communication Association
Pages: 2483-2487
Number of pages: 5
Volume: 08-12-September-2016
DOI - permanent links
Status: Published - 1 January 2016
MoE publication type: A4 Article in conference proceedings
Event: Interspeech - San Francisco, United States
Duration: 8 September 2016 to 12 September 2016
Conference number: 17

Publication series

Name: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Publisher: International Speech Communication Association
ISSN (print): 2308-457X

Conference

Conference: Interspeech
Country: United States
City: San Francisco
Period: 08/09/2016 to 12/09/2016
