Towards Gaze-Based Video Annotation

Mohamed Soliman, Hamed Rezazadegan Tavakoli, Jorma Laaksonen

Research output: Chapter in Book/Report/Conference proceeding; Conference article in proceedings; Scientific; peer-review

4 Citations (Scopus)


This paper presents our efforts towards a framework for video annotation using gaze. In computer vision, video annotation (VA) is an essential step in providing ground truth for the evaluation of object detection and tracking techniques. VA is a demanding element in the development of video processing algorithms, since each object of interest must be labelled manually. Although the community has dealt with VA for a long time, the size of new data sets and the complexity of new tasks push us to revisit it. A barrier to automated video annotation is recognizing the object of interest and tracking it over image sequences. To tackle this problem, we employ the concept of visual attention to enhance video annotation. In an image, human attention is naturally drawn to interesting areas, which provide valuable information for extracting the objects of interest and can be exploited to annotate videos. Under task-based gaze recording, we use an observer's gaze to filter seed object detector responses in a video sequence. The filtered boxes are then passed to an appearance-based tracking algorithm. We evaluate the usefulness of gaze by comparing the algorithm with and without it. We show that eye gaze is an influential cue for automated video annotation and improves the annotation significantly.
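The gaze-filtering step described in the abstract can be illustrated with a minimal sketch. The abstract does not state the exact filtering criterion, so the rule used here (keep a detector box if the gaze point falls inside it, optionally enlarged by a pixel margin) is an assumption, and the names `Box`, `filter_by_gaze`, and `margin` are hypothetical rather than taken from the paper.

```python
from typing import List, NamedTuple, Tuple


class Box(NamedTuple):
    """Axis-aligned detector response in pixel coordinates (hypothetical type)."""
    x: float      # left edge
    y: float      # top edge
    w: float      # width
    h: float      # height
    score: float  # detector confidence


def filter_by_gaze(boxes: List[Box], gaze: Tuple[float, float],
                   margin: float = 0.0) -> List[Box]:
    """Keep only the boxes that contain the gaze point.

    Assumed criterion, not the paper's: a box is kept when the gaze point
    lies inside it after enlarging the box by `margin` pixels on every side.
    """
    gx, gy = gaze
    return [
        b for b in boxes
        if b.x - margin <= gx <= b.x + b.w + margin
        and b.y - margin <= gy <= b.y + b.h + margin
    ]


# Two detector responses in one frame; the observer looks at the first one.
detections = [Box(10, 10, 50, 50, 0.9), Box(200, 40, 60, 60, 0.8)]
seeds = filter_by_gaze(detections, gaze=(30.0, 35.0))
```

In a pipeline of the kind the abstract outlines, the surviving `seeds` would then be used to initialise an appearance-based tracker; the tracker itself is outside the scope of this sketch.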
Original language: English
Title of host publication: 2016 Sixth International Conference on Image Processing Theory, Tools and Applications (IPTA)
Number of pages: 5
ISBN (Electronic): 978-1-4673-8910-5
Publication status: Published - 2017
MoE publication type: A4 Conference publication
Event: International Conference on Image Processing Theory, Tools and Applications - Oulu, Finland
Duration: 12 Dec 2016 to 15 Dec 2016
Conference number: 6

Publication series

Name: International Conference on Image Processing Theory Tools and Applications
ISSN (Print): 2154-512X


Conference: International Conference on Image Processing Theory, Tools and Applications
Abbreviated title: IPTA


  • Detectors
  • Observers
  • Face
  • Object detection
  • Manuals
  • Motion pictures
  • Pipelines


