Spatial Audio Feature Discovery with Convolutional Neural Networks

Etienne Thuillier, Hannes Gamper, Ivan J. Tashev

    Research output: Chapter in Book/Report/Conference proceeding › Conference article in proceedings › Scientific › peer-review

    26 Citations (Scopus)
    377 Downloads (Pure)

    Abstract

    The advent of mixed reality consumer products brings about a pressing need to develop and improve spatial sound rendering techniques for a broad user base. Despite a large body of prior work, the precise nature and importance of various sound localization cues and how they should be personalized for an individual user to improve localization performance is still an open research problem. Here we propose training a convolutional neural network (CNN) to classify the elevation angle of spatially rendered sounds and employing Layer-wise Relevance Propagation (LRP) on the trained CNN model. LRP provides saliency maps that can be used to identify spectral features used by the network for classification. These maps, in addition to the convolution filters learned by the CNN, are discussed in the context of listening tests reported in the literature. The proposed approach could potentially provide an avenue for future studies on modeling and personalization of head-related transfer functions (HRTFs).
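As a rough illustration of how relevance propagation yields a saliency map over input features, here is a minimal sketch of the LRP epsilon rule on a tiny fully connected ReLU network in plain Python. It is not the authors' model or code: the layer sizes, weights, input values, and the helper names `dense`, `relu`, and `lrp_epsilon` are all hypothetical, and a dense network stands in for the CNN described above.

```python
# Minimal LRP (epsilon rule) sketch on a toy dense ReLU network.
# Everything here (weights, sizes, names) is invented for illustration;
# it is not the network or data from the paper.

def dense(x, W, b):
    """Pre-activations z_j = sum_i x_i * W[i][j] + b[j]."""
    return [sum(x[i] * W[i][j] for i in range(len(x))) + b[j]
            for j in range(len(b))]

def relu(z):
    return [max(0.0, v) for v in z]

def lrp_epsilon(x, W, b, relevance_out, eps=1e-9):
    """Redistribute relevance from a layer's outputs back to its inputs."""
    z = dense(x, W, b)
    # Stabilise the denominator away from zero while keeping its sign.
    s = [relevance_out[j] / (z[j] + (eps if z[j] >= 0 else -eps))
         for j in range(len(z))]
    return [x[i] * sum(W[i][j] * s[j] for j in range(len(s)))
            for i in range(len(x))]

# Hypothetical input: 4 "spectral bins", 3 hidden units, 2 "elevation classes".
x = [0.2, 1.0, 0.5, 0.0]
W1 = [[0.5, -0.3, 0.1], [0.8, 0.2, -0.5], [-0.4, 0.6, 0.3], [0.1, 0.1, 0.1]]
b1 = [0.0, 0.0, 0.0]
W2 = [[1.0, -1.0], [0.5, 0.5], [-0.3, 0.8]]
b2 = [0.0, 0.0]

# Forward pass.
h = relu(dense(x, W1, b1))
logits = dense(h, W2, b2)
winner = max(range(len(logits)), key=lambda k: logits[k])

# Backward relevance pass: start with all relevance on the winning class score.
r_out = [logits[k] if k == winner else 0.0 for k in range(len(logits))]
r_hidden = lrp_epsilon(h, W2, b2, r_out)
r_input = lrp_epsilon(x, W1, b1, r_hidden)

# r_input is the per-bin "saliency map" for this toy input.
print("relevance per input bin:", r_input)
print("sum of relevance:", sum(r_input), "winning logit:", logits[winner])
```

Because the biases are zero and `eps` is tiny, the relevance summed over the input bins matches the winning class score, which is the conservation property that makes such maps interpretable as a decomposition of the network's output.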

    Original language: English
    Title of host publication: 2018 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2018 - Proceedings
    Publisher: IEEE
    Pages: 6797-6801
    Number of pages: 5
    Volume: 2018-April
    ISBN (Electronic): 978-1-5386-4658-8
    ISBN (Print): 978-1-5386-4659-5
    Publication status: Published - 10 Sept 2018
    MoE publication type: A4 Conference publication
    Event: IEEE International Conference on Acoustics, Speech, and Signal Processing - Calgary, Canada
    Duration: 15 Apr 2018 to 20 Apr 2018
    https://2018.ieeeicassp.org/

    Publication series

    Name: Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing
    ISSN (Electronic): 2379-190X

    Conference

    Conference: IEEE International Conference on Acoustics, Speech, and Signal Processing
    Abbreviated title: ICASSP
    Country/Territory: Canada
    City: Calgary
    Period: 15/04/2018 to 20/04/2018

    Keywords

    • Acoustic feature discovery
    • Deep Taylor Decomposition
    • HRTF personalization
    • Spatial sound
    • Virtual reality
