Arithmetic coding of speech and audio spectra using TCX based on linear predictive spectral envelopes

Tom Backstrom, Christian R. Helmrich

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Scientific › peer-review

15 Citations (Scopus)

Abstract

Unified speech and audio codecs often use a frequency-domain coding technique of the transform coded excitation (TCX) type. TCX is based on modeling the speech source with a linear predictor, spectral weighting by a perceptual model, and entropy coding of the frequency components. While previous approaches have used neighbouring frequency components to form a probability model for the entropy coder of spectral components, we propose to use the magnitude of the linear predictor to estimate the variance of spectral components. Since the linear predictor is transmitted in any case, this method does not require any additional side information. Subjective measurements show that the proposed method gives a statistically significant improvement in perceptual quality when the bit-rate is held constant. Consequently, the proposed method has been adopted into the 3GPP Enhanced Voice Services speech coding standard.
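The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation or the EVS algorithm. It assumes that the envelope magnitude |1/A(e^{jω})| of the linear predictor, multiplied by a hypothetical `gain`, serves as the standard deviation of each spectral component, and that quantized components follow a Laplacian distribution. The Laplacian model, the bin-centre frequencies and all function names are assumptions made for illustration only.

```python
import cmath
import math


def lpc_envelope(a, num_bins):
    """Return |1/A(e^{jw})| at bin centres w_k = pi*(k+0.5)/num_bins.

    `a` holds the predictor polynomial coefficients [1, a1, ..., ap].
    """
    env = []
    for k in range(num_bins):
        w = math.pi * (k + 0.5) / num_bins
        A = sum(c * cmath.exp(-1j * w * n) for n, c in enumerate(a))
        env.append(1.0 / abs(A))
    return env


def symbol_probability(q, sigma):
    """Probability that a zero-mean Laplacian with std `sigma` rounds to integer q.

    The Laplacian model is an assumption of this sketch.
    """
    b = sigma / math.sqrt(2.0)  # Laplacian scale for standard deviation sigma
    m = abs(q)
    if m == 0:
        return 1.0 - math.exp(-0.5 / b)
    # Mass of the interval [m-0.5, m+0.5] on one side of the distribution.
    return 0.5 * (math.exp(-(m - 0.5) / b) - math.exp(-(m + 0.5) / b))


def estimated_bits(quantized, envelope, gain=1.0):
    """Ideal code length in bits, -sum(log2 p), under the envelope-driven model."""
    total = 0.0
    for q, e in zip(quantized, envelope):
        p = max(symbol_probability(q, gain * e), 1e-300)  # floor avoids log2(0)
        total -= math.log2(p)
    return total
```

An arithmetic coder driven by these probabilities would spend roughly `estimated_bits(...)` bits on a frame. Because the decoder can recompute `lpc_envelope` from the transmitted predictor, the probability model itself needs no extra side information, which is the point the abstract makes.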

Original language: English
Title of host publication: 2015 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2015 - Proceedings
Publisher: IEEE
Pages: 5127-5131
Number of pages: 5
Volume: 2015-August
ISBN (Electronic): 9781467369978
Publication status: Published - 1 Jan 2015
MoE publication type: A4 Article in a conference publication
Event: IEEE International Conference on Acoustics, Speech, and Signal Processing - Brisbane, Australia
Duration: 19 Apr 2015 – 24 Apr 2015
Conference number: 40

Conference

Conference: IEEE International Conference on Acoustics, Speech, and Signal Processing
Abbreviated title: ICASSP
Country: Australia
City: Brisbane
Period: 19/04/2015 – 24/04/2015

Keywords

  • arithmetic coding
  • frequency domain coding
  • speech and audio coding
