Leveraging Uncertainty for Finnish L2 Speech Scoring with LLMs

Ekaterina Voskoboinik, Nhan Phan, Tamás Grósz, Mikko Kurimo

Research output: Chapter in Book/Report/Conference proceeding › Conference article in proceedings › Scientific › peer-review


Abstract

Automatic speech assessment (ASA) supports learning but often requires extensive data, which is scarce for languages with fewer learners. Recent research shows that Large Language Models (LLMs) can generalize to new tasks with minimal training data using in-context learning (ICL). We find LLMs effective in estimating the proficiency of individuals learning Finnish as a second language (L2) when given a few examples of human expert grading. The proficiency grades produced by the model, when evaluating verbatim transcripts from an automatic speech recognition (ASR) system, agree with human ratings at a level comparable to the agreement between the human raters. Our experiments reveal that adding more grading demonstrations in ICL improves the model’s accuracy but, counterintuitively, increases its uncertainty when selecting an appropriate proficiency level. We show that this uncertainty can be leveraged further by creating soft labels: instead of assigning the most probable level (hard label), we aggregate the model’s confidence across all possible levels, resulting in noticeable performance improvements. Further analysis reveals that the sources of model uncertainty differ across ICL settings. In zero-shot, uncertainty stems from intrinsic response properties, such as proficiency level. In few-shot, it is driven by the relationship between the sample and the demonstrations.
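The hard-label versus soft-label distinction described in the abstract can be illustrated with a minimal sketch. It assumes that the soft label is the probability-weighted average over levels, which is one plausible reading of "aggregate the model's confidence across all possible levels" and is not confirmed by the abstract. The four-level scale, the example log-probabilities, and all function names are hypothetical.

```python
import math

def normalize(logprobs):
    """Turn per-level log-probabilities into a distribution that sums to 1."""
    peak = max(logprobs.values())  # subtract the max for numerical stability
    weights = {level: math.exp(lp - peak) for level, lp in logprobs.items()}
    total = sum(weights.values())
    return {level: w / total for level, w in weights.items()}

def hard_label(probs):
    """Hard label: the single most probable proficiency level."""
    return max(probs, key=probs.get)

def soft_label(probs):
    """Soft label (assumed form): probability-weighted average of the levels."""
    return sum(level * p for level, p in probs.items())

def entropy(probs):
    """Shannon entropy in nats, one common measure of model uncertainty."""
    return -sum(p * math.log(p) for p in probs.values() if p > 0)

# Hypothetical log-probabilities an LLM might assign to the level tokens 1..4.
example_logprobs = {
    1: math.log(0.05),
    2: math.log(0.40),
    3: math.log(0.45),
    4: math.log(0.10),
}

probs = normalize(example_logprobs)
print("hard label:", hard_label(probs))
print("soft label:", round(soft_label(probs), 3))
print("entropy:", round(entropy(probs), 3))
```

In this made-up case the hard label discards the substantial probability mass on level 2, while the soft label falls between levels 2 and 3, which is how a soft score can retain information that the model's uncertainty carries.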
Original language: English
Title of host publication: The Workshop on Automatic Assessment of Atypical Speech (AAAS-2025). Proceedings of the Workshop
Publisher: University of Tartu Library
Number of pages: 9
ISBN (Electronic): 978-9908-53-115-1
Publication status: Published - 2025
MoE publication type: A4 Conference publication
Event: Workshop on Automatic Assessment of Atypical Speech - Tallinn, Estonia
Duration: 5 Mar 2025 → 5 Mar 2025

Workshop

Workshop: Workshop on Automatic Assessment of Atypical Speech
Abbreviated title: AAAS
Country/Territory: Estonia
City: Tallinn
Period: 05/03/2025 → 05/03/2025

Keywords

  • LLM

Fingerprint

  • Science-IT

    Hakala, M. (Manager)

    School of Science

    Facility/equipment: Facility
