Functional Gaze Prediction in Egocentric Video

Si-Ahmed Naas, Xiaolan Jiang, Stephan Sigg, Yusheng Ji

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Scientific › peer-review

1 Citation (Scopus)
187 Downloads (Pure)


Streaming 360° videos to a head-mounted display (HMD) client is challenging due to the high network resource consumption and computational load involved. Existing approaches rely on gaze point prediction or image saliency features extracted from the field of view (FoV), but FoV extraction is computationally demanding in real-time scenarios. We propose a functional gaze prediction system that addresses these issues by relying on a tiling scheme for gaze prediction. We condition gaze point prediction on virtual reality (VR) content and on long short-term memory (LSTM)-encoded eye movement history. Further, we encode optical flow and saliency maps of RGB frames with a VGG16 convolutional neural network (CNN). Future gaze points are then predicted using a novel sinusoidal encoding technique. In experiments, our tile-based approach outperforms state-of-the-art FoV-based schemes in terms of both computational load and predicted gaze position.
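The abstract does not describe the sinusoidal encoding in detail. A minimal sketch of what a multi-frequency sin/cos encoding of a normalized gaze coordinate might look like, in the style of transformer positional encodings, is shown below (the function name, dimensionality, and frequency schedule are all assumptions, not the authors' published method):

```python
import math

def sinusoidal_encode(x, dims=8, max_freq=16.0):
    """Map a normalized coordinate x in [0, 1] to a multi-frequency
    sin/cos feature vector (hypothetical sketch; the paper's exact
    encoding is not published in this abstract)."""
    feats = []
    n_freqs = dims // 2
    for i in range(n_freqs):
        # Geometric frequency schedule from 1 up to max_freq.
        freq = max_freq ** (i / max(n_freqs - 1, 1))
        feats.append(math.sin(2 * math.pi * freq * x))
        feats.append(math.cos(2 * math.pi * freq * x))
    return feats

# Encode a gaze point (u, v) in normalized screen coordinates;
# the concatenated vector could feed an LSTM or MLP predictor.
gaze = (0.25, 0.75)
encoding = sinusoidal_encode(gaze[0]) + sinusoidal_encode(gaze[1])
```

Each coordinate yields `dims` features, so a 2-D gaze point becomes a 16-dimensional vector here; all components lie in [-1, 1].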
Original language: English
Title of host publication: 18th International Conference on Advances in Mobile Computing and Multimedia, MoMM2020 - Proceedings
Editors: Pari Delir Haghighi, Ivan Luiz Salvadori, Matthias Steinbauer, Ismail Khalil, Gabriele Kotsis
Number of pages: 8
ISBN (Electronic): 9781450389242
Publication status: Published - 30 Nov 2020
MoE publication type: A4 Article in a conference publication
Event: International Conference on Advances in Mobile Computing & Multimedia - Chiang Mai, Thailand
Duration: 30 Nov 2020 – 2 Dec 2020
Conference number: 18


Conference: International Conference on Advances in Mobile Computing & Multimedia
Abbreviated title: MoMM
City: Chiang Mai


Keywords:

  • 360° video
  • convolutional neural network
  • gaze prediction
  • machine learning
  • pervasive HMD interaction
  • virtual and augmented reality


