Abstract
Test data is said to be out-of-distribution (OOD) when it unexpectedly differs from the training data, a common challenge in real-world use cases of machine learning. Although OOD generalisation has gained interest in recent years, few works have focused on OOD generalisation in spoken language understanding (SLU) tasks. To facilitate research on this topic, we introduce a modified version of the popular SLU dataset SLURP, featuring data splits for testing OOD generalisation in the SLU task. We call our modified dataset SLURP For OOD generalisation, or SLURPFOOD. Utilising our OOD data splits, we find end-to-end SLU models to have limited capacity for generalisation. Furthermore, by employing model interpretability techniques, we shed light on the factors contributing to the generalisation difficulties of the models. To improve the generalisation, we experiment with two techniques, which improve the results on some, but not all the splits, emphasising the need for new techniques.
Original language | English |
---|---|
Title of host publication | Interspeech 2024 |
Publisher | International Speech Communication Association (ISCA) |
Number of pages | 5 |
Publication status | Published - 5 Sept 2024 |
MoE publication type | A4 Conference publication |
Event | Interspeech - Kos Island, Greece Duration: 1 Sept 2024 → 5 Sept 2024 |
Publication series
Name | Interspeech |
---|---|
Publisher | International Speech Communication Association |
ISSN (Electronic) | 2958-1796 |
Conference
Conference | Interspeech |
---|---|
Country/Territory | Greece |
City | Kos Island |
Period | 01/09/2024 → 05/09/2024 |
Fingerprint
Dive into the research topics of 'Out-of-distribution generalisation in spoken language understanding'. Together they form a unique fingerprint.
Projects
- 1 Active
- LAREINA: LAREINA - Language Resource Infrastructure for AI
Kurimo, M. (Principal investigator), Moisio, A. (Project Member), Getman, Y. (Project Member), Porjazovski, D. (Project Member), Rouhe, A. (Project Member) & Virkkunen, A. (Project Member)
01/01/2023 → 31/12/2025
Project: Business Finland: Strategic centres for science, technology and innovation (SHOK)