TY - GEN
T1 - Stochastic Optimization of Vector Quantization Methods in Application to Speech and Image Processing
AU - Vali, Mohammadhassan
AU - Bäckström, Tom
PY - 2023
Y1 - 2023
AB - Vector quantization (VQ) methods have been used in a wide range of applications for speech, image, and video data. While classic VQ methods often use expectation maximization, in this paper we investigate the use of stochastic optimization, employing our recently proposed noise substitution in vector quantization (NSVQ) technique. We consider three variants of VQ, namely additive VQ, residual VQ, and product VQ, and evaluate their quality, complexity, and bitrate in speech coding, image compression, approximate nearest neighbor search, and a selection of toy examples. Our experimental results demonstrate the trade-offs among accuracy, complexity, and bitrate, so that, using our open-source implementations and complexity calculator, the best vector quantization method can be chosen for a particular problem.
KW - Complexity
KW - Machine learning
KW - Rate-distortion
KW - Vector quantization
U2 - 10.1109/ICASSP49357.2023.10096204
DO - 10.1109/ICASSP49357.2023.10096204
M3 - Conference article in proceedings
T3 - Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing
BT - International Conference on Acoustics, Speech, and Signal Processing
PB - IEEE
T2 - IEEE International Conference on Acoustics, Speech, and Signal Processing
Y2 - 4 June 2023 through 10 June 2023
ER -