First-Pass Techniques for Very Large Vocabulary Speech Recognition of Morphologically Rich Languages

Research output: Article in a book/conference proceedings, peer-reviewed

Researchers

Organisations

  • Utopia Analytics Oy

Abstract

In speech recognition of morphologically rich languages, very large vocabularies are required to achieve good error rates. Traditional n-gram language models trained over word sequences suffer especially from data sparsity. Language modelling can often be improved by segmenting the words into sequences of more frequent subword units. Another solution is to cluster the words into classes and apply a class-based language model. We show that linearly interpolating n-gram models trained over words, subwords, and word classes improves the first-pass speech recognition accuracy in very large vocabulary speech recognition tasks for two morphologically rich and agglutinative languages, Finnish and Estonian. To overcome performance issues, we also introduce a novel language model look-ahead method utilizing a class bigram model. The method improves the results over a unigram look-ahead model at the same recognition speed, and the difference grows for small real-time factors. The improved model combination and look-ahead model are useful when real-time recognition is required or when the improved hypotheses help further recognition passes. For instance, neural network language models are mostly applied by rescoring the generated hypotheses because of their higher computational cost.
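As a rough illustration of the linear interpolation the abstract refers to, the sketch below combines per-word probability estimates from word-, subword-, and class-based models with fixed interpolation weights. The model names, weights, and probability values are placeholder assumptions for demonstration only, not the paper's actual models or numbers.

```python
# Minimal sketch: linear interpolation of three language model estimates
# for a single word given its history. All components here are toy
# stand-ins; a real system would query trained n-gram models over words,
# subword units, and word classes.

import math
from typing import Callable, Dict, List

def interpolate_log_prob(
    word: str,
    history: List[str],
    models: Dict[str, Callable[[str, List[str]], float]],
    weights: Dict[str, float],
) -> float:
    """Return log P(word | history) under a linear interpolation of models.

    Each model maps (word, history) -> probability; the weights must sum to 1.
    """
    assert abs(sum(weights.values()) - 1.0) < 1e-9
    prob = sum(weights[name] * models[name](word, history) for name in models)
    return math.log(prob)

# Toy component models returning constant probabilities (illustrative only).
models = {
    "word":    lambda w, h: 1e-4,   # word n-gram estimate
    "subword": lambda w, h: 2e-4,   # subword model marginalised to word level
    "class":   lambda w, h: 5e-4,   # P(class | history) * P(word | class)
}
weights = {"word": 0.5, "subword": 0.3, "class": 0.2}

score = interpolate_log_prob("talossa", ["hän", "asuu"], models, weights)
print(f"interpolated log-probability: {score:.3f}")
```

In practice the interpolation weights would be tuned on held-out data, and the class-based component would also drive the bigram look-ahead described in the abstract; neither step is shown here.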

Details

Original language: English
Title: 2018 IEEE Spoken Language Technology Workshop, December 18-21, 2018, Athens, Greece
Status: Published - 2018
OKM publication type: A4 Article in conference proceedings
Event: IEEE Spoken Language Technology Workshop - Athens, Greece
Duration: 18 December 2018 - 21 December 2018

Workshop

Workshop: IEEE Spoken Language Technology Workshop
Abbreviated title: SLT
Country: Greece
City: Athens
Period: 18/12/2018 - 21/12/2018
