Stochastic Optimization of Vector Quantization Methods in Application to Speech and Image Processing

Research output: Chapter in Book/Report/Conference proceeding › Conference article in proceedings › Scientific › peer-review

Abstract

Vector quantization (VQ) methods have been used in a wide range of applications for speech, image, and video data. While classic VQ methods often use expectation maximization, in this paper we investigate stochastic optimization employing our recently proposed noise substitution in vector quantization technique. We consider three variants of VQ (additive VQ, residual VQ, and product VQ) and evaluate their quality, complexity, and bitrate in speech coding, image compression, approximate nearest neighbor search, and a selection of toy examples. Our experimental results demonstrate the trade-offs among accuracy, complexity, and bitrate, so that, with our open-source implementations and complexity calculator, the best vector quantization method can be chosen for a particular problem.
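For orientation, the following is a minimal NumPy sketch of how two of the variants named in the abstract, residual VQ and product VQ, encode a single vector, together with one possible reading of noise substitution. It is not the authors' open-source implementation. The function names, the random codebooks, and the sizes are hypothetical, and the noise-substitution step assumes that the quantization error is replaced during training by a random vector of the same norm. Additive VQ is omitted because its encoding requires a joint search over all codebooks.

```python
import numpy as np

def nearest(codebook, x):
    """Index of the codebook row closest to x in Euclidean distance."""
    return int(np.argmin(np.sum((codebook - x) ** 2, axis=1)))

def residual_vq(x, codebooks):
    """Quantize in stages: each stage codes the residual left by earlier stages."""
    recon = np.zeros_like(x)
    indices = []
    for cb in codebooks:
        i = nearest(cb, x - recon)
        indices.append(i)
        recon = recon + cb[i]
    return indices, recon

def product_vq(x, codebooks):
    """Split x into equal sub-vectors and quantize each with its own codebook."""
    parts = np.split(x, len(codebooks))
    indices = [nearest(cb, p) for cb, p in zip(codebooks, parts)]
    recon = np.concatenate([cb[i] for cb, i in zip(codebooks, indices)])
    return indices, recon

def noise_substitution(x, x_quantized, rng):
    """Assumed form: swap the quantization error for random noise of equal norm."""
    noise = rng.standard_normal(x.shape)
    scale = np.linalg.norm(x - x_quantized) / np.linalg.norm(noise)
    return x + scale * noise

# Hypothetical toy setup: 8-dimensional input, 4 codebooks of 16 entries each.
rng = np.random.default_rng(0)
dim, stages, size = 8, 4, 16
x = rng.standard_normal(dim)
rvq_books = [rng.standard_normal((size, dim)) * 0.5 ** s for s in range(stages)]
pq_books = [rng.standard_normal((size, dim // stages)) for _ in range(stages)]

rvq_idx, rvq_hat = residual_vq(x, rvq_books)
pq_idx, pq_hat = product_vq(x, pq_books)
x_train = noise_substitution(x, rvq_hat, rng)

# Both variants transmit one index per codebook, so the bitrate is the same here.
bits = stages * int(np.log2(size))
print("bits per vector:", bits)
print("residual VQ error:", np.linalg.norm(x - rvq_hat))
print("product VQ error:", np.linalg.norm(x - pq_hat))
```

In this toy setup both variants spend the same number of bits per vector, while residual VQ searches full-dimensional codebooks and product VQ searches only sub-vector codebooks. That gap in search cost at equal bitrate is the kind of accuracy, complexity, and bitrate trade-off the abstract refers to.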
Original language: English
Title of host publication: International Conference on Acoustics, Speech, and Signal Processing
Publisher: IEEE
Number of pages: 5
ISBN (Electronic): 978-1-7281-6327-7
DOIs
Publication status: Published - 2023
MoE publication type: A4 Conference publication
Event: IEEE International Conference on Acoustics, Speech, and Signal Processing - Rhodes Island, Greece
Duration: 4 Jun 2023 to 10 Jun 2023

Publication series

Name: Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing
ISSN (Electronic): 2379-190X

Conference

Conference: IEEE International Conference on Acoustics, Speech, and Signal Processing
Abbreviated title: ICASSP
Country/Territory: Greece
City: Rhodes Island
Period: 04/06/2023 to 10/06/2023

Keywords

  • Complexity
  • Machine learning
  • Rate-distortion
  • Vector quantization
