Combining Textual and Visual Modeling for Predicting Media Memorability

Alison Reboud, Ismail Harrando, Jorma Laaksonen, Danny Francis, Raphaël Troncy, Hector Laria Mantecon

Research output: Chapter in Book/Report/Conference proceeding › Chapter › Scientific

Abstract

This paper describes a multimodal approach proposed by the MeMAD team for the MediaEval 2019 “Predicting Media Memorability” task. Our best approach is a weighted average method combining predictions made separately from visual and textual representations of videos. In particular, we augmented the provided textual descriptions with automatically generated deep captions. For long-term memorability, we obtained better scores using the short-term predictions rather than the long-term ones. Our best model achieves Spearman scores of 0.522 and 0.277 for the short-term and long-term prediction tasks, respectively.
Original language: English
Title of host publication: Working Notes Proceedings of the MediaEval 2019 Workshop, Sophia Antipolis, France, 27-30 October 2019
Publisher: CEUR
Publication status: Published - 27 Oct 2019
MoE publication type: B2 Book section
Event: Multimedia Benchmark Workshop - Sophia Antipolis, France
Duration: 27 Oct 2019 to 30 Oct 2019

Publication series

Name: CEUR Workshop Proceedings
Publisher: CEUR
Volume: 2670
ISSN (Electronic): 1613-0073

Workshop

Workshop: Multimedia Benchmark Workshop
Abbreviated title: MediaEval
Country/Territory: France
City: Sophia Antipolis
Period: 27/10/2019 to 30/10/2019

  • MeMAD Laaksonen

    Laaksonen, J., Sjöberg, M., Pehlivan Tort, S. & Laria Mantecon, H.

    01/01/2018 to 31/03/2021

    Project: EU: Framework programmes funding
