TY - JOUR
T1 - KLANN: Linearising Long-Term Dynamics in Nonlinear Audio Effects Using Koopman Networks
AU - Huhtala, Ville
AU - Juvela, Lauri
AU - Schlecht, Sebastian J.
N1 - Publisher Copyright: Authors
PY - 2024/4/16
Y1 - 2024/4/16
N2 - In recent years, neural network-based black-box modeling of nonlinear audio effects has improved considerably. Present convolutional and recurrent models can model audio effects with long-term dynamics, but the models require many parameters, thus increasing the processing time. In this paper, we propose KLANN, a Koopman-Linearised Audio Neural Network structure that lifts a one-dimensional signal (mono audio) into a high-dimensional approximately linear state-space representation with nonlinear mapping, and then uses differentiable biquad filters to predict linearly within the lifted state-space. Results show that the proposed models match the high performance of the state-of-the-art neural models while having a more compact architecture, reducing the number of parameters by tenfold, and having interpretable components.
AB - In recent years, neural network-based black-box modeling of nonlinear audio effects has improved considerably. Present convolutional and recurrent models can model audio effects with long-term dynamics, but the models require many parameters, thus increasing the processing time. In this paper, we propose KLANN, a Koopman-Linearised Audio Neural Network structure that lifts a one-dimensional signal (mono audio) into a high-dimensional approximately linear state-space representation with nonlinear mapping, and then uses differentiable biquad filters to predict linearly within the lifted state-space. Results show that the proposed models match the high performance of the state-of-the-art neural models while having a more compact architecture, reducing the number of parameters by tenfold, and having interpretable components.
KW - Closed box
KW - Discrete Fourier transforms
KW - Filters
KW - Frequency-domain analysis
KW - Logic gates
KW - Time-domain analysis
KW - Training
UR - http://www.scopus.com/inward/record.url?scp=85190726420&partnerID=8YFLogxK
U2 - 10.1109/LSP.2024.3389465
DO - 10.1109/LSP.2024.3389465
M3 - Article
AN - SCOPUS:85190726420
SN - 1070-9908
VL - 31
SP - 1169
EP - 1173
JO - IEEE Signal Processing Letters
JF - IEEE Signal Processing Letters
ER -