A Variational Y-Autoencoder for Disentangling Gesture and Material of Interaction Sounds

Simon Schwär*, Meinard Müller, Sebastian J. Schlecht

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference article in proceedings › Scientific › peer-review


Abstract

Appropriate sound effects are an important aspect of immersive virtual experiences. Particularly in mixed reality scenarios, it may be desirable to change the acoustic properties of a naturally occurring interaction sound (e.g., the sound of a metal spoon scraping a wooden bowl) to a sound matching the characteristics of the corresponding interaction in the virtual environment (e.g., using wooden tools in a porcelain bowl). In this paper, we adapt the concept of a Y-Autoencoder (YAE) to the domain of sound effect analysis and synthesis. The YAE model makes it possible to disentangle the gesture and material properties of sound effects with a weakly supervised training strategy in which only an identifier label for the material of each training example is given. We show that such a model can resynthesize sound effects after exchanging the material label of an encoded example, yielding perceptually meaningful synthesis results with relatively low computational effort. By introducing a variational regularization for the encoded gesture, as well as an adversarial loss, we can further use the model to generate new and varying sound effects with the material characteristics of the training data, while the analyzed audio signal can originate from interactions with unknown materials.
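The sketch below illustrates the Y-shaped encoder-decoder idea described in the abstract: a variational "gesture" latent, a learned embedding of the weak material label, and resynthesis after swapping that label. It is a minimal illustration assuming PyTorch and per-frame spectral features; all module names, dimensions, and loss weights are hypothetical and the adversarial loss is omitted, so it does not reproduce the authors' implementation.

# Minimal sketch of a Y-Autoencoder for gesture/material disentanglement.
# Assumptions: per-frame spectral features, illustrative layer sizes,
# no adversarial loss; not the authors' implementation.
import torch
import torch.nn as nn

class YAutoencoder(nn.Module):
    def __init__(self, n_features=128, gesture_dim=16, n_materials=8, material_dim=8):
        super().__init__()
        # Shared encoder trunk on feature frames.
        self.encoder = nn.Sequential(
            nn.Linear(n_features, 256), nn.ReLU(),
            nn.Linear(256, 128), nn.ReLU(),
        )
        # Variational "gesture" branch: mean and log-variance of the latent.
        self.gesture_mu = nn.Linear(128, gesture_dim)
        self.gesture_logvar = nn.Linear(128, gesture_dim)
        # "Material" branch conditioned on the weak material identifier label.
        self.material_embedding = nn.Embedding(n_materials, material_dim)
        # Decoder reconstructs features from gesture latent + material code.
        self.decoder = nn.Sequential(
            nn.Linear(gesture_dim + material_dim, 256), nn.ReLU(),
            nn.Linear(256, n_features),
        )

    def encode(self, x):
        h = self.encoder(x)
        return self.gesture_mu(h), self.gesture_logvar(h)

    def reparameterize(self, mu, logvar):
        # Standard VAE reparameterization trick for the gesture latent.
        std = torch.exp(0.5 * logvar)
        return mu + std * torch.randn_like(std)

    def decode(self, z_gesture, material_id):
        m = self.material_embedding(material_id)
        return self.decoder(torch.cat([z_gesture, m], dim=-1))

    def forward(self, x, material_id):
        mu, logvar = self.encode(x)
        z = self.reparameterize(mu, logvar)
        return self.decode(z, material_id), mu, logvar

# Usage: reconstruct with the original material labels, then resynthesize
# the same encoded gesture with different material labels ("material swap").
model = YAutoencoder()
x = torch.randn(4, 128)                        # batch of feature frames
mat = torch.tensor([0, 1, 2, 3])               # original material labels
x_hat, mu, logvar = model(x, mat)
recon = nn.functional.mse_loss(x_hat, x)
kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
loss = recon + 0.01 * kl                       # adversarial term omitted here
x_swapped = model.decode(model.reparameterize(*model.encode(x)),
                         torch.tensor([3, 2, 1, 0]))  # swapped material labels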

Original language: English
Title of host publication: AES International Conference on Audio for Virtual and Augmented Reality, AVAR 2022
Publisher: Curran Associates Inc.
Pages: 205-214
Number of pages: 10
ISBN (Electronic): 978-1-7138-5972-7
Publication status: Published - 15 Aug 2022
MoE publication type: A4 Conference publication
Event: AES International Conference on Audio for Virtual and Augmented Reality - DigiPen Institute of Technology, Redmond, United States
Duration: 15 Aug 2022 - 17 Aug 2022
Conference number: 4
https://aes2.org/events-calendar/avar-2022/

Publication series

Name: Proceedings of the AES International Conference
Volume: 2022-August

Conference

Conference: AES International Conference on Audio for Virtual and Augmented Reality
Abbreviated title: AES AVAR
Country/Territory: United States
City: Redmond
Period: 15/08/2022 - 17/08/2022
Internet address: https://aes2.org/events-calendar/avar-2022/
