Utilising Kronecker Decomposition and Tensor-based Multi-view Learning to predict where people are looking in images

Kitsuchart Pasupa*, Sandor Szedmak

*Corresponding author for this work

Research output: Contribution to journal › Article › Scientific › peer-review

4 Citations (Scopus)


Eye movement data collection is expensive and laborious, and the collected data usually contain missing values. When eye movement data are collected from a set of images viewed by different users, it may not be possible to collect data from every user for every image: one or more views may not be represented for an image. We assume that the relationships among the views can be learnt from the whole collection of views (or items). The task is then to reproduce the missing part of the incomplete items from the relationships derived from the complete items and the known part of these items. Using certain properties of tensor algebra, we showed that this problem can be formulated consistently as a regression-type learning task. Furthermore, there is a maximum-margin-based optimisation framework in which this problem can be solved in a tractable way. This problem is similar to learning to predict where a person is looking in an image. We therefore proposed an algorithm called "Tensor-based Multi-View Learning" (TMVL) in this paper. We also proposed a technique for improving prediction by introducing a new feature set obtained from the Kronecker decomposition of the image fused with the user's eye movement data. Using this new feature set improves prediction performance markedly. The proposed approach proved more effective than two well-known saliency detection techniques.
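The abstract does not say how the Kronecker decomposition is computed, so the following is only a minimal sketch of one standard formulation: fitting a matrix A by a single Kronecker product of two smaller factors B and C in the least-squares sense, using alternating updates. The function names (`kron`, `inner`, `nearest_kronecker`), the factor shapes, and the tiny example matrix standing in for an image are all assumptions made for illustration; they are not taken from the paper.

```python
def kron(B, C):
    """Kronecker product of two matrices stored as lists of rows."""
    return [[b * c for b in b_row for c in c_row] for b_row in B for c_row in C]


def inner(X, Y):
    """Frobenius inner product of two equally sized matrices."""
    return sum(x * y for x_row, y_row in zip(X, Y) for x, y in zip(x_row, y_row))


def nearest_kronecker(A, shape_b, shape_c, iters=50):
    """Fit A ~ kron(B, C) in the least-squares sense by alternating updates."""
    (m1, n1), (m2, n2) = shape_b, shape_c
    if len(A) != m1 * m2 or any(len(row) != n1 * n2 for row in A):
        raise ValueError("A must have shape (m1*m2) x (n1*n2)")
    # blocks[i][j] is the m2 x n2 block of A that B[i][j] * C should match.
    blocks = [[[row[j * n2:(j + 1) * n2] for row in A[i * m2:(i + 1) * m2]]
               for j in range(n1)] for i in range(m1)]
    # Start from the largest-norm block so the iteration never stalls at zero.
    C = max((blk for blk_row in blocks for blk in blk_row),
            key=lambda X: inner(X, X))
    if inner(C, C) == 0:
        raise ValueError("A must be non-zero")
    for _ in range(iters):
        # With C fixed, each B[i][j] is the projection of its block onto C.
        cc = inner(C, C)
        B = [[inner(blocks[i][j], C) / cc for j in range(n1)] for i in range(m1)]
        # With B fixed, C is the B-weighted average of the blocks.
        bb = inner(B, B)
        C = [[sum(B[i][j] * blocks[i][j][k][l]
                  for i in range(m1) for j in range(n1)) / bb
              for l in range(n2)] for k in range(m2)]
    return B, C


# Hypothetical 4 x 4 "image" built as an exact Kronecker product.
B0 = [[1.0, 2.0], [3.0, 4.0]]
C0 = [[1.0, 0.5], [-1.0, 2.0]]
A = kron(B0, C0)
B, C = nearest_kronecker(A, (2, 2), (2, 2))
approx = kron(B, C)
max_error = max(abs(a - p)
                for a_row, p_row in zip(A, approx)
                for a, p in zip(a_row, p_row))
```

The factors are determined only up to a scalar (scaling B by s and C by 1/s gives the same product), so the sketch compares the reconstruction `kron(B, C)` against A rather than comparing the factors themselves. How such factors would be fused with eye movement data to form features is not described in the abstract and is not attempted here.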

Original language: English
Pages (from-to): 80–93
Publication status: Published - 2017
MoE publication type: A1 Journal article-refereed


  • Eye movements
  • Kronecker decomposition
  • Maximum margin learning
  • Missing data
  • Multi-view learning
  • Tensor algebra

