Abstract
Parallel audio-visual-text data contains a vast amount of information, so it is essential to develop machine learning algorithms that can utilise it efficiently. In this work, we investigated unimodal and multimodal solutions for the MuSe Humor and Perception challenges. Our main goal was to explicitly show the contribution of each modality in the multimodal systems. In addition, for the Humor challenge, we examined the effect of extending the input context and smoothing the framewise predictions. For the Perception challenge, we trained an attention-encoder-decoder model to predict all perceived labels with a single model. During the challenge, the best results were achieved by a fusion of unimodal and multimodal systems: AUC = 0.8645 for Humor and a mean Pearson’s correlation of ρ = 0.3550 for Perception. By investigating the multimodal systems, we found that using only part of the video for model training can be beneficial, suggesting that valuable information is concentrated in certain parts of the video. The implementation of our models and experiments can be found at https://github.com/aalto-speech/MuSe-2024.
Original language | English |
---|---|
Title of host publication | MuSe 2024 - Proceedings of the 5th Multimodal Sentiment Analysis Challenge and Workshop |
Subtitle | Social Perception and Humor, Co-Located with: MM 2024 |
Publisher | ACM |
Pages | 60-64 |
Number of pages | 5 |
ISBN (electronic) | 9798400711992 |
DOI - permanent links | |
Status | Published - 28 Oct 2024 |
Ministry of Education and Culture (OKM) publication type | A4 Article in conference proceedings |
Event | Multimodal Sentiment Analysis Challenge and Workshop: Social Perception and Humor - Melbourne, Australia. Duration: 28 Oct 2024 → 1 Nov 2024 |
Publication series
Name | MuSe 2024 - Proceedings of the 5th Multimodal Sentiment Analysis Challenge and Workshop: Social Perception and Humor, Co-Located with: MM 2024 |
---|
Workshop
Workshop | Multimodal Sentiment Analysis Challenge and Workshop |
---|---|
Abbreviated title | MuSe |
Country/Territory | Australia |
City | Melbourne |
Period | 28/10/2024 → 01/11/2024 |
Fingerprint
Dive into the research topics of 'Multimodal Humor Detection and Social Perception Prediction'. Together they form a unique fingerprint.
-
LAREINA: LAREINA - Language Resource Infrastructure for AI
Kurimo, M. (Principal investigator)
01/01/2023 → 31/12/2025
Project: BF Co-Innovation
-
USSEE: Understanding Speech and Scene with Ears and Eyes
Kurimo, M. (Principal investigator)
01/01/2022 → 31/12/2024
Project: RCF Academy Project
-
-: Finnish Center for Artificial Intelligence
Kaski, S. (Principal investigator)
01/01/2019 → 31/12/2022
Project: Academy of Finland: Other research funding