End-to-End Optimized Multi-Stage Vector Quantization of Spectral Envelopes for Speech and Audio Coding

Research output: Article in book/conference proceedings › Conference article in proceedings › Scientific › peer-reviewed

1 Citation (Scopus)
267 Downloads (Pure)


Spectral envelope modeling is an instrumental part of speech and audio codecs and can be used to enable efficient entropy coding of spectral components. Overall optimization of codecs, including envelope models, has, however, been difficult due to the complicated interactions between different modules of the codec. In this paper, we study an end-to-end optimization methodology that optimizes all modules in a codec jointly with respect to each other while capturing these complex interactions with a global loss function. For the quantization of the spectral envelope parameters at a fixed bitrate, we use multistage vector quantization, which gives high quality yet has a computational complexity that is realistic for embedded devices. The obtained results demonstrate benefits in terms of PESQ and PSNR in comparison to both the 3GPP EVS codec and our recently proposed PyAWNeS codec.
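As a rough illustration of the multistage vector quantization mentioned in the abstract, the following minimal sketch quantizes a parameter vector stage by stage, where each stage encodes the residual left by the previous one. It is not the authors' implementation: the function names `msvq_encode` and `msvq_decode`, the greedy nearest-neighbor search, the random (untrained) codebooks, and the stage sizes are all assumptions made for illustration, and the end-to-end training of the codebooks is not shown.

```python
import numpy as np


def msvq_encode(x, codebooks):
    """Greedy multistage VQ: each stage quantizes the residual of the previous stage.

    codebooks is a list of arrays, each of shape (codebook_size, dim).
    Returns one codevector index per stage.
    """
    residual = np.asarray(x, dtype=float).copy()
    indices = []
    for cb in codebooks:
        # Nearest codevector to the current residual (squared Euclidean distance).
        dists = np.sum((cb - residual) ** 2, axis=1)
        idx = int(np.argmin(dists))
        indices.append(idx)
        residual = residual - cb[idx]
    return indices


def msvq_decode(indices, codebooks):
    """Reconstruct the vector as the sum of the selected codevectors."""
    return np.sum([cb[i] for cb, i in zip(codebooks, indices)], axis=0)


# Hypothetical setup: 16 envelope parameters, three stages with random codebooks.
rng = np.random.default_rng(0)
dim = 16
sizes = [64, 32, 16]  # assumed stage sizes: 6 + 5 + 4 = 15 bits per vector
codebooks = [rng.standard_normal((n, dim)) * 0.5 ** s for s, n in enumerate(sizes)]

envelope = rng.standard_normal(dim)  # stand-in for real envelope parameters
indices = msvq_encode(envelope, codebooks)
reconstruction = msvq_decode(indices, codebooks)

bits = sum(int(np.log2(n)) for n in sizes)  # fixed bitrate: index widths only
mse = float(np.mean((envelope - reconstruction) ** 2))
print(indices, bits, mse)
```

The bit cost depends only on the codebook sizes, which is what makes the rate fixed, and the search cost grows with the sum of the stage sizes rather than their product, which is the usual reason multistage designs are attractive on constrained hardware.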
Title of host publication: 22nd Annual Conference of the International Speech Communication Association, INTERSPEECH 2021
Publisher: International Speech Communication Association (ISCA)
ISBN (electronic): 9781713836902
Status: Published - Sep 2021
MoE publication type: A4 Article in conference proceedings
Event: Interspeech - Brno, Czech Republic
Duration: 30 Aug 2021 to 3 Sep 2021
Conference number: 22


Name: Annual Conference of the International Speech Communication Association
ISSN (print): 1990-9772
ISSN (electronic): 2308-457X



