Abstract
Test data is said to be out-of-distribution (OOD) when it unexpectedly differs from the training data, a common challenge in real-world use cases of machine learning. Although OOD generalisation has gained interest in recent years, few works have focused on OOD generalisation in spoken language understanding (SLU) tasks. To facilitate research on this topic, we introduce a modified version of the popular SLU dataset SLURP, featuring data splits for testing OOD generalisation in the SLU task. We call our modified dataset SLURP For OOD generalisation, or SLURPFOOD. Utilising our OOD data splits, we find end-to-end SLU models to have limited capacity for generalisation. Furthermore, by employing model interpretability techniques, we shed light on the factors contributing to the generalisation difficulties of the models. To improve the generalisation, we experiment with two techniques, which improve the results on some, but not all, of the splits, emphasising the need for new techniques.
Original language | English |
---|---|
Title of host publication | Interspeech 2024 |
Publisher | International Speech Communication Association (ISCA) |
Number of pages | 5 |
DOIs | |
Publication status | Published - 5 Sept 2024 |
MoE publication type | A4 Article in conference proceedings |
Event | Interspeech - Kos Island, Greece. Duration: 1 Sept 2024 → 5 Sept 2024 |
Publication series
Name | Interspeech |
---|---|
Publisher | International Speech Communication Association |
ISSN (electronic) | 2958-1796 |
Conference
Conference | Interspeech |
---|---|
Country/Territory | Greece |
City | Kos Island |
Period | 01/09/2024 → 05/09/2024 |
Fingerprint
Dive into the research topics of 'Out-of-distribution generalisation in spoken language understanding'. Together they form a unique fingerprint.
Projects
- 1 Active
- LAREINA: LAREINA - Language Resource Infrastructure for AI
Kurimo, M. (Principal investigator), Moisio, A. (Project member), Getman, Y. (Project member), Porjazovski, D. (Project member), Rouhe, A. (Project member) & Virkkunen, A. (Project member)
01/01/2023 → 31/12/2025
Project: Business Finland: Strategic centres for science, technology and innovation (SHOK)