GANSpaceSynth: A Hybrid Generative Adversarial Network Architecture for Organising the Latent Space using a Dimensionality Reduction for Real-Time Audio Synthesis

Koray Tahiroğlu, Miranda Kastemaa, Oskar Koli

    Research output: Contribution to conference › Paper › Scientific › peer-review


    Abstract

    Generative Adversarial Networks (GANs) make it possible in the audio domain to represent timbre as vectors in a high-dimensional latent space. In common GAN models, however, the musician's control over timbre is mostly limited to sampling random points from the space and interpolating between them. In this paper, we present a novel hybrid GAN architecture that allows musicians to explore the GAN latent space in a more controlled manner, identifying the audio features in the trained checkpoints and giving an opportunity to specify particular audio features to be present or absent in the generated audio samples. We extend the paper with a detailed description of our GANSpaceSynth method and present the Hallu composition tool as an application of this hybrid method in computer music practices.
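
    As a rough illustration of the approach the abstract describes (a minimal sketch, not the authors' implementation): GANSpaceSynth follows the GANSpace idea of applying PCA to activations sampled from a trained generator and transferring the principal components back into latent-space directions, which can then be weighted to emphasise or suppress discovered audio features. In the sketch below, the random first-layer matrix and the final generator call are hypothetical placeholders for a trained GANSynth checkpoint.

        import numpy as np
        from sklearn.decomposition import PCA

        LATENT_DIM = 256                     # assumed latent size (GANSynth uses 256)
        rng = np.random.default_rng(0)

        # 1) Sample many latent vectors from the model's Gaussian prior.
        z = rng.standard_normal((10_000, LATENT_DIM))

        # 2) Push them through an early generator layer; here a fixed random
        #    projection stands in for the real checkpoint's first layer.
        W = rng.standard_normal((LATENT_DIM, 512))
        acts = np.tanh(z @ W)

        # 3) PCA organises the activation space along its main directions
        #    of variation.
        pca = PCA(n_components=8)
        coords = pca.fit_transform(acts)     # (N, 8) PCA coordinates

        # 4) Transfer the components to latent space by least-squares
        #    regression, giving 8 latent directions, one per feature axis.
        U, *_ = np.linalg.lstsq(coords, z - z.mean(axis=0), rcond=None)

        # 5) Steer: positive weights emphasise a discovered feature in the
        #    generated audio, negative weights suppress it.
        weights = np.array([2.0, -1.5, 0.0, 0.5, 0.0, 0.0, 0.0, 0.0])
        z_edit = z.mean(axis=0) + weights @ U

        # audio = generator(z_edit)          # hypothetical trained GANSynth call
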
    Original language: English
    Pages: 10
    DOIs
    Publication status: Published - 19 Jul 2021
    MoE publication type: Not Eligible
    Event: Conference on AI Music Creativity - Graz, Austria
    Duration: 18 Jul 2021 – 22 Jul 2021
    Conference number: 2
    https://aimc2021.iem.at/

    Conference

    Conference: Conference on AI Music Creativity
    Abbreviated title: AIMC
    Country/Territory: Austria
    City: Graz
    Period: 18/07/2021 – 22/07/2021
    Internet address: https://aimc2021.iem.at/

    Keywords

    • GANSpaceSynth
    • AI-terity
    • Artificial Intelligence (AI)
    • Deep Learning
    • new interfaces for musical expression
    • Digital musical instruments

    Field of art

    • Composition
    • Performance
