Abstract
In speech recognition of morphologically rich languages, very large vocabularies are required to achieve good error rates. Traditional n-gram language models trained over word sequences suffer especially from data sparsity. Language modelling can often be improved by segmenting the words into sequences of more frequent subword units. Another solution is to cluster the words into classes and apply a class-based language model. We show that linearly interpolating n-gram models trained over words, subwords, and word classes improves the first-pass speech recognition accuracy in very large vocabulary speech recognition tasks for two morphologically rich and agglutinative languages, Finnish and Estonian. To overcome performance issues, we also introduce a novel language model look-ahead method utilizing a class bigram model. The method improves the results over a unigram look-ahead model at the same recognition speed, with the difference increasing for small real-time factors. The improved model combination and look-ahead model are useful in cases where real-time recognition is required or when the improved hypotheses help with further recognition passes. For instance, neural network language models are mostly applied by rescoring the generated hypotheses because of their higher computational cost.
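The abstract names two building blocks without giving formulas: linear interpolation of several language models, and a class bigram model. The sketch below shows the textbook form of both, assuming each component model already yields a probability for the same word in the same context. The function names, the toy vocabulary, and every number are hypothetical and are not taken from the paper. In particular, how the subword model's probability for a whole word is obtained is left outside the sketch.

```python
import math


def interpolate(probs, weights):
    """Linearly interpolate component probabilities: sum_i w_i * p_i.

    The weights are assumed to be non-negative and to sum to 1.
    """
    if len(probs) != len(weights):
        raise ValueError("need exactly one weight per component model")
    if any(w < 0 for w in weights) or not math.isclose(sum(weights), 1.0):
        raise ValueError("weights must be non-negative and sum to 1")
    return sum(w * p for w, p in zip(weights, probs))


def class_bigram_prob(word, prev_word, word_to_class, class_bigram, membership):
    """Standard class bigram factorization:

    P(word | prev_word) = P(class(word) | class(prev_word)) * P(word | class(word))
    """
    prev_class = word_to_class[prev_word]
    cur_class = word_to_class[word]
    return class_bigram[(prev_class, cur_class)] * membership[word]


# Hypothetical toy model: two classes, made-up probabilities.
word_to_class = {"iso": "ADJ", "talo": "NOUN"}
class_bigram = {("ADJ", "NOUN"): 0.5}   # P(NOUN | ADJ)
membership = {"talo": 0.2}              # P(talo | NOUN)

p_class = class_bigram_prob("talo", "iso", word_to_class, class_bigram, membership)

# Hypothetical word and subword model probabilities for the same event.
p_word, p_subword = 0.05, 0.08
p_mix = interpolate([p_word, p_subword, p_class], [0.5, 0.3, 0.2])
```

In practice the interpolation weights are usually tuned on held-out data; the values above are placeholders only.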
| Original language | English |
|---|---|
| Title | 2018 IEEE Spoken Language Technology Workshop, December 18-21, 2018, Athens, Greece |
| Publisher | IEEE |
| Pages | 227-234 |
| Number of pages | 6 |
| ISBN (electronic) | 978-1-5386-4333-4 |
| DOI - permanent links | |
| Status | Published - 2018 |
| OKM publication type | A4 Article in conference proceedings |
| Event | IEEE Spoken Language Technology Workshop - Athens, Greece. Duration: 18 Dec 2018 → 21 Dec 2018 |
Workshop
| Workshop | IEEE Spoken Language Technology Workshop |
|---|---|
| Abbreviated title | SLT |
| Country/Territory | Greece |
| City | Athens |
| Period | 18/12/2018 → 21/12/2018 |