Out-of-distribution generalisation in spoken language understanding

Test data is said to be out-of-distribution (OOD) when it unexpectedly
differs from the training data, a common challenge in real-world use
cases of machine learning. Although OOD generalisation has gained
interest in recent years, few works have focused on OOD generalisation
in spoken language understanding (SLU) tasks. To facilitate research on
this topic, we introduce a modified version of the popular SLU dataset
SLURP, featuring data splits for testing OOD generalisation in the SLU
task. We call our modified dataset SLURP For OOD generalisation, or
SLURPFOOD. Utilising our OOD data splits, we
find end-to-end SLU models to have limited capacity for generalisation.
Furthermore, by employing model interpretability techniques, we shed
light on the factors contributing to the generalisation difficulties of
the models. To improve generalisation, we experiment with two
techniques, which improve the results on some, but not all, of the
splits, emphasising the need for new techniques.
