Abstract
Automatic speech assessment (ASA) supports learning but often requires extensive data, which is scarce for languages with fewer learners. Recent research shows that Large Language Models (LLMs) can generalize to new tasks with minimal training data using in-context learning (ICL). We find LLMs effective in estimating the proficiency of individuals learning Finnish as a second language (L2) when given a few examples of human expert grading. The proficiency grades produced by the model, when evaluating verbatim transcripts from an automatic speech recognition (ASR) system, agree with human ratings at a level comparable to the agreement between the human raters. Our experiments reveal that adding more grading demonstrations in ICL improves the model’s accuracy but, counterintuitively, increases its uncertainty when selecting an appropriate proficiency level. We show that this uncertainty can be leveraged further by creating soft labels: instead of assigning the most probable level (hard label), we aggregate the model’s confidence across all possible levels, resulting in noticeable performance improvements. Further analysis reveals that the sources of model uncertainty differ across ICL settings. In zero-shot, uncertainty stems from intrinsic response properties, such as proficiency level. In few-shot, it is driven by the relationship between the sample and the demonstrations.
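The difference between hard and soft labels described in the abstract can be illustrated with a minimal sketch. It assumes that the model's confidence is available as a probability for each proficiency level, and that "aggregating confidence across all levels" means taking the probability-weighted average of the levels. The abstract does not spell out the exact aggregation rule, so this is one plausible reading and not necessarily the authors' method. The four-point scale, the probability values, and the function names `softmax`, `hard_label`, and `soft_label` are hypothetical and chosen only for illustration.

```python
import math


def softmax(logits):
    """Turn per-level scores (e.g. token log-probabilities) into probabilities."""
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def hard_label(levels, probs):
    """Hard label: the single most probable proficiency level."""
    return max(zip(levels, probs), key=lambda pair: pair[1])[0]


def soft_label(levels, probs):
    """Soft label (assumed form): probability-weighted average of all levels."""
    return sum(level * p for level, p in zip(levels, probs))


# Hypothetical four-level scale and made-up confidences for one response.
levels = [1, 2, 3, 4]
probs = [0.10, 0.40, 0.35, 0.15]

hard = hard_label(levels, probs)  # picks the level with the largest probability
soft = soft_label(levels, probs)  # also uses the mass placed on the other levels
```

In this made-up case the hard label commits to level 2 even though level 3 is almost as likely, while the soft label falls between the two levels. This shows how a model's uncertainty can carry information that a hard label discards.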
Original language | English |
---|---|
Title of host publication | The Workshop on Automatic Assessment of Atypical Speech (AAAS-2025). Proceedings of the Workshop |
Publisher | University of Tartu Library |
Number of pages | 9 |
ISBN (Electronic) | 978-9908-53-115-1 |
Publication status | Published - 2025 |
MoE publication type | A4 Conference publication |
Event | Workshop on Automatic Assessment of Atypical Speech, Tallinn, Estonia; duration: 5 Mar 2025 → 5 Mar 2025 |
Workshop
Workshop | Workshop on Automatic Assessment of Atypical Speech |
---|---|
Abbreviated title | AAAS |
Country/Territory | Estonia |
City | Tallinn |
Period | 05/03/2025 → 05/03/2025 |
Keywords
- LLM
Fingerprint
Dive into the research topics of 'Leveraging Uncertainty for Finnish L2 Speech Scoring with LLMs'. Together they form a unique fingerprint.
-
AASIS: Automatic assessment of spoken interaction in second language
Kurimo, M. (Principal investigator)
01/09/2023 → 31/08/2027
Project: RCF Academy Project
-
USSEE: Understanding Speech and Scene with Ears and Eyes
Kurimo, M. (Principal investigator)
01/01/2022 → 31/12/2024
Project: RCF Academy Project
-
DigiTala: Aka-Digi Tala
Kurimo, M. (Principal investigator)
01/01/2020 → 31/08/2023
Project: RCF Academy Project targeted call