Grapheme-Based Cross-Language Forced Alignment: Results with Uralic Languages

Juho Leinonen, Sami Virpioja, Mikko Kurimo

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; Scientific; peer-review

Abstract

Forced alignment is an effective process to speed up linguistic research. However, most forced aligners are language-dependent, and under-resourced languages rarely have enough resources to train an acoustic model for an aligner. We present a new Finnish grapheme-based forced aligner and demonstrate its performance by aligning multiple Uralic languages and English as an unrelated language. We show that even a simple, non-expert-created grapheme-to-phoneme mapping can result in useful word alignments.
Original language: English
Title of host publication: Proceedings of the 23rd Nordic Conference on Computational Linguistics (NoDaLiDa)
Publisher: Linköping University Electronic Press
Pages: 345-350
Number of pages: 6
ISBN (Electronic): 978-91-7929-614-8
Publication status: Published - 1 May 2021
MoE publication type: A4 Article in a conference publication
Event: Nordic Conference on Computational Linguistics - Reykjavik, Iceland
Duration: 31 May 2021 to 2 Jun 2021

Publication series

Name: Linköping Electronic Conference Proceedings
Publisher: Linköping University Electronic Press
Number: 178
ISSN (Print): 1650-3740
ISSN (Electronic): 1650-3686

Name: NEALT Proceedings Series
Publisher: University of Tartu
Volume: 45
ISSN (Print): 1736-8197
ISSN (Electronic): 1736-6305

Conference

Conference: Nordic Conference on Computational Linguistics
Abbreviated title: NoDaLiDa
Country/Territory: Iceland
City: Reykjavik
Period: 31/05/2021 to 02/06/2021
