Abstract
The advent of mixed reality consumer products brings a pressing need to develop and improve spatial sound rendering techniques for a broad user base. Despite a large body of prior work, the precise nature and importance of the various sound localization cues, and how they should be personalized for an individual user to improve localization performance, remain open research problems. Here we propose training a convolutional neural network (CNN) to classify the elevation angle of spatially rendered sounds and applying Layer-wise Relevance Propagation (LRP) to the trained CNN model. LRP provides saliency maps that can be used to identify the spectral features the network uses for classification. These maps, together with the convolution filters learned by the CNN, are discussed in the context of listening tests reported in the literature. The proposed approach could provide an avenue for future studies on the modeling and personalization of head-related transfer functions (HRTFs).
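To make the idea of relevance propagation concrete, the following is a minimal sketch and not the authors' implementation. It assumes a tiny fully connected ReLU network with random, untrained weights in place of the trained CNN, a random non-negative vector standing in for a magnitude spectrum, and the epsilon variant of LRP chosen purely for illustration (the keywords also mention Deep Taylor Decomposition, which this sketch does not implement). All names and sizes (`forward`, `lrp_epsilon`, `n_bins`, and so on) are hypothetical.

```python
import numpy as np

def forward(weights, biases, x):
    """Return the input to every layer, followed by the output logits."""
    activations = [x]
    for i, (W, b) in enumerate(zip(weights, biases)):
        z = activations[-1] @ W + b
        is_last = i == len(weights) - 1
        # ReLU on hidden layers, linear output layer.
        activations.append(z if is_last else np.maximum(z, 0.0))
    return activations

def lrp_epsilon(weights, biases, x, target, eps=1e-6):
    """Redistribute the target logit back onto the input features."""
    activations = forward(weights, biases, x)
    # Start with all relevance on the chosen output unit.
    relevance = np.zeros_like(activations[-1])
    relevance[target] = activations[-1][target]
    # Walk the layers from output to input.
    for W, b, a in zip(weights[::-1], biases[::-1], activations[-2::-1]):
        z = a @ W + b
        z = z + eps * np.where(z >= 0, 1.0, -1.0)  # stabilizer, avoids division by zero
        s = relevance / z
        relevance = a * (W @ s)  # share of each input in each pre-activation
    return relevance

# Assumed toy setup: 16 "frequency bins", 8 hidden units, 3 elevation classes.
rng = np.random.default_rng(0)
n_bins, n_hidden, n_classes = 16, 8, 3
weights = [rng.normal(size=(n_bins, n_hidden)),
           rng.normal(size=(n_hidden, n_classes))]
biases = [np.zeros(n_hidden), np.zeros(n_classes)]
x = np.abs(rng.normal(size=n_bins))  # stand-in for a magnitude spectrum

logits = forward(weights, biases, x)[-1]
target = int(np.argmax(logits))
relevance = lrp_epsilon(weights, biases, x, target)
```

With zero biases, the relevance scores sum (up to the stabilizer) to the logit of the chosen class, so each entry of `relevance` can be read as the contribution of one input bin to that decision. A saliency map over a spectral input is this same vector arranged along the frequency axis.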
Original language | English |
---|---|
Title of host publication | 2018 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2018 - Proceedings |
Publisher | IEEE |
Pages | 6797-6801 |
Number of pages | 5 |
Volume | 2018-April |
ISBN (Electronic) | 978-1-5386-4658-8 |
ISBN (Print) | 978-1-5386-4659-5 |
Publication status | Published - 10 Sept 2018 |
MoE publication type | A4 Conference publication |
Event | IEEE International Conference on Acoustics, Speech, and Signal Processing - Calgary, Canada; Duration: 15 Apr 2018 → 20 Apr 2018; https://2018.ieeeicassp.org/ |
Publication series
Name | Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing |
---|---|
ISSN (Electronic) | 2379-190X |
Conference
Conference | IEEE International Conference on Acoustics, Speech, and Signal Processing |
---|---|
Abbreviated title | ICASSP |
Country/Territory | Canada |
City | Calgary |
Period | 15/04/2018 → 20/04/2018 |
Keywords
- Acoustic feature discovery
- Deep Taylor Decomposition
- HRTF personalization
- Spatial sound
- Virtual reality