Abstract
Speech foundation models such as wav2vec 2.0 have made it possible to develop highly accurate models for low-resourced languages using a limited amount of speech data. For optimal results, the pre-training should already include data from the target language, but unfortunately, none of the available foundation models include Northern Sámi. In this work, we explore various ways of preparing the foundation model for Northern Sámi, including continued pre-training with a small untranscribed corpus and our new extended fine-tuning method. The extended fine-tuning starts from an already fine-tuned ASR model and augments it with new output units for the unique Sámi characters before new fine-tuning with transcribed Sámi data. Our results demonstrate the benefits of these advanced adaptation techniques, as both approaches lead to better performance than the direct fine-tuning-based adaptation.
Original language | English |
---|---|
Title of host publication | Interspeech 2024 |
Publisher | International Speech Communication Association (ISCA) |
Pages | 2539-2543 |
Number of pages | 5 |
Publication status | Published - 2024 |
MoE publication type | A4 Conference publication |
Event | Interspeech - Kos Island, Greece. Duration: 1 Sept 2024 → 5 Sept 2024 |
Publication series
Name | Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH |
---|---|
Publisher | International Speech Communication Association (ISCA) |
ISSN (Print) | 2308-457X |
Conference
Conference | Interspeech |
---|---|
Country/Territory | Greece |
City | Kos Island |
Period | 01/09/2024 → 05/09/2024 |
Keywords
- ASR
- low-resource
- model adaptation
- Northern Sámi
- wav2vec2
Fingerprint
Dive into the research topics of 'Exploring adaptation techniques of large speech foundation models for low-resource ASR: a case study on Northern Sámi'. Together they form a unique fingerprint.
- LAREINA: LAREINA - Language Resource Infrastructure for AI
  Kurimo, M. (Principal investigator)
  01/01/2023 → 31/12/2025
  Project: BF Co-Innovation
- USSEE: Understanding Speech and Scene with Ears and Eyes
  Kurimo, M. (Principal investigator)
  01/01/2022 → 31/12/2024
  Project: RCF Academy Project