TY - JOUR
T1 - Developing an AI-assisted Low-resource Spoken Language Learning App for Children
AU - Getman, Yaroslav
AU - Phan, Nhan
AU - Al-Ghezi, Ragheb
AU - Voskoboinik, Ekaterina
AU - Singh, Mittul
AU - Grosz, Tamas
AU - Kurimo, Mikko
AU - Salvi, Giampiero
AU - Svendsen, Torbjorn
AU - Strombergsson, Sofia
AU - Smolander, Anna
AU - Ylinen, Sari
N1 - Publisher Copyright: Author
PY - 2023
Y1 - 2023
AB - Computer-assisted Language Learning (CALL) is a rapidly developing area accelerated by advancements in the field of AI. A well-designed and reliable CALL system allows students to practice language skills, like pronunciation, any time outside of the classroom. Furthermore, gamification via mobile applications has shown encouraging results on learning outcomes and motivates young users to practice more and perceive language learning as a positive experience. In this work, we adapt the latest speech recognition technology to be a part of an online pronunciation training system for small children. As part of our gamified mobile application, our models will assess the pronunciation quality of young Swedish children diagnosed with Speech Sound Disorder, and participating in speech therapy. Additionally, the models provide feedback to young non-native children learning to pronounce Swedish and Finnish words. Our experiments revealed that these new models fit into an online game as they function as speech recognizers and pronunciation evaluators simultaneously. To make our systems more trustworthy and explainable, we investigated whether the combination of modern input attribution algorithms and time-aligned transcripts can explain the decisions made by the models, give us insights into how the models work and provide a tool to develop more reliable solutions.
KW - Artificial intelligence
KW - ASR
KW - children’s speech
KW - Computer aided diagnosis
KW - Hidden Markov models
KW - L2 speech
KW - Pediatrics
KW - Recording
KW - speech rating
KW - Speech recognition
KW - SSD
KW - Task analysis
KW - Toy manufacturing industry
KW - Training
KW - Vocabulary
KW - wav2vec2
UR - http://www.scopus.com/inward/record.url?scp=85167833032&partnerID=8YFLogxK
U2 - 10.1109/ACCESS.2023.3304274
DO - 10.1109/ACCESS.2023.3304274
M3 - Article
AN - SCOPUS:85167833032
SN - 2169-3536
VL - 11
SP - 86025
EP - 86037
JO - IEEE Access
JF - IEEE Access
ER -