Finnish ASR with deep transformer models

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Scientific › peer-review

Abstract

Recently, BERT and Transformer-XL based architectures have achieved strong results in a range of NLP applications. In this paper, we explore two Transformer architectures, BERT and Transformer-XL, as language models for a Finnish ASR task with different rescoring schemes. We achieve strong results in both an intrinsic and an extrinsic task with Transformer-XL: 29% better perplexity and 3% better WER than our previous best LSTM-based approach. We also introduce a novel three-pass decoding scheme which improves the ASR performance by 8%. To the best of our knowledge, this is also the first work (i) to formulate an alpha smoothing framework for using the non-autoregressive BERT language model in an ASR task, and (ii) to explore sub-word units with Transformer-XL for an agglutinative language like Finnish.
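
The paper's exact alpha smoothing formulation is given in the full text, not here. As a rough, hypothetical illustration of the general idea, the sketch below scores each n-best hypothesis with a masked-LM pseudo-log-likelihood (mask each token in turn and sum the log-probabilities), scales it by a smoothing factor alpha, and interpolates it with the first-pass score. All names and parameters (mlm_token_logprob, lm_weight) are assumptions for this sketch, not the paper's released code.

    from typing import Callable, List, Tuple

    # Hypothetical scorer interface: mlm_token_logprob(tokens, i) returns the
    # masked-LM log-probability of tokens[i] when position i is replaced by a
    # [MASK] token, as any BERT implementation can provide.

    def bert_pseudo_logprob(tokens: List[str],
                            mlm_token_logprob: Callable[[List[str], int], float]) -> float:
        """Pseudo-log-likelihood under a non-autoregressive masked LM:
        mask each position in turn and sum the token log-probabilities."""
        return sum(mlm_token_logprob(tokens, i) for i in range(len(tokens)))

    def rescore_nbest(hypotheses: List[Tuple[List[str], float]],
                      mlm_token_logprob: Callable[[List[str], int], float],
                      lm_weight: float = 0.5,
                      alpha: float = 1.0) -> Tuple[List[str], float]:
        """Pick the best hypothesis from an n-best list.

        Each hypothesis is (tokens, first_pass_score). Here alpha scales the
        pseudo-log-likelihood before interpolation; it stands in for the
        paper's alpha smoothing, whose exact form may differ.
        """
        def combined(hyp: Tuple[List[str], float]) -> float:
            tokens, first_pass_score = hyp
            lm_score = alpha * bert_pseudo_logprob(tokens, mlm_token_logprob)
            return (1.0 - lm_weight) * first_pass_score + lm_weight * lm_score
        return max(hypotheses, key=combined)

Because the masked-LM score is not a proper autoregressive probability, it cannot be used directly in left-to-right decoding, which is why rescoring a fixed n-best list (as sketched above) is the natural place to apply it.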

Original language: English
Title of host publication: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Publisher: International Speech Communication Association
Pages: 3630-3634
Number of pages: 5
Volume: 2020-October
DOIs
Publication status: Published - 2020
MoE publication type: A4 Article in a conference publication
Event: Interspeech - Shanghai, China
Duration: 25 Oct 2020 – 29 Oct 2020
Conference number: 21
http://www.interspeech2020.org/

Publication series

Name: Interspeech
Publisher: International Speech Communication Association
ISSN (Print): 2308-457X

Conference

Conference: Interspeech
Abbreviated title: INTERSPEECH
Country: China
City: Shanghai
Period: 25/10/2020 – 29/10/2020
Internet address: http://www.interspeech2020.org/

Keywords

  • BERT
  • Language modeling
  • Speech recognition
  • Transformer-XL
  • Transformers
