Grapheme-Based Cross-Language Forced Alignment: Results with Uralic Languages

Juho Leinonen, Sami Virpioja, Mikko Kurimo

Research output: Chapter in Book/Report/Conference proceeding; Conference article in proceedings; Scientific; peer-review


Forced alignment is an effective way to speed up linguistic research. However, most forced aligners are language-dependent, and under-resourced languages rarely have enough data to train an acoustic model for an aligner. We present a new Finnish grapheme-based forced aligner and demonstrate its performance by aligning multiple Uralic languages, as well as English as an unrelated language. We show that even a simple grapheme-to-phoneme mapping created by a non-expert can result in useful word alignments.
Original language: English
Title of host publication: Proceedings of the 23rd Nordic Conference on Computational Linguistics (NoDaLiDa)
Publisher: Linköping University Electronic Press
Number of pages: 6
ISBN (Electronic): 978-91-7929-614-8
Publication status: Published - 1 May 2021
MoE publication type: A4 Conference publication
Event: Nordic Conference on Computational Linguistics - Reykjavik, Iceland
Duration: 31 May 2021 to 2 Jun 2021

Publication series

Name: Linköping Electronic Conference Proceedings
Publisher: Linköping University Electronic Press
ISSN (Print): 1650-3740
ISSN (Electronic): 1650-3686

Name: NEALT Proceedings Series
Publisher: University of Tartu
ISSN (Print): 1736-8197
ISSN (Electronic): 1736-6305


Conference: Nordic Conference on Computational Linguistics
Abbreviated title: NoDaLiDa

