Collecting Linguistic Resources for Assessing Children's Pronunciation of Nordic Languages

Anne Marte Haug Olstad, Anna Smolander, Sofia Strömbergsson, Sari Ylinen, Minna Lehtonen, Mikko Kurimo, Yaroslav Getman, Tamás Grósz, Xinwei Cao, Torbjørn Svendsen, Giampiero Salvi

Research output: Chapter in Book/Report/Conference proceeding › Conference article in proceedings › Scientific › peer-review

Abstract

This paper reports on the experience of collecting several corpora of Nordic languages spoken by children. The aim of the data collection is to provide annotated data for developing and evaluating computer-assisted pronunciation assessment systems, both for non-native children learning a Nordic language (L2) and for L1 children with speech sound disorder (SSD). The paper presents the challenges encountered in recording and annotating data for Finnish, Swedish and Norwegian, as well as the ethical considerations related to making this data publicly available. We hope that sharing this experience will encourage others to collect similar data for other languages. Of the different data collections, we were able to make the Norwegian corpus publicly available, in the hope that it will serve as a reference in pronunciation assessment research.

Original language: English
Title of host publication: 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC-COLING 2024 - Main Conference Proceedings
Editors: Nicoletta Calzolari, Min-Yen Kan, Veronique Hoste, Alessandro Lenci, Sakriani Sakti, Nianwen Xue
Publisher: European language resources distribution agency
Pages: 3529-3537
Number of pages: 9
ISBN (Electronic): 978-2-493814-10-4
Publication status: Published - 2024
MoE publication type: A4 Conference publication
Event: Joint International Conference on Computational Linguistics, Language Resources and Evaluation - Torino, Italy
Duration: 20 May 2024 to 25 May 2024
https://lrec-coling-2024.org/conference-program/
https://aclanthology.org/2024.lrec-main

Publication series

Name: 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC-COLING 2024 - Main Conference Proceedings

Conference

Conference: Joint International Conference on Computational Linguistics, Language Resources and Evaluation
Abbreviated title: LREC-COLING
Country/Territory: Italy
City: Torino
Period: 20/05/2024 to 25/05/2024

Keywords

  • CAPT
  • child speech
  • Nordic languages
  • pronunciation assessment
  • second language acquisition
  • speech sound disorder
