Comparison and Analysis of New Curriculum Criteria for End-to-End ASR

Georgios Karakasidis, Tamás Grósz, Mikko Kurimo

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Scientific › peer-review


Abstract

It is common knowledge that the quantity and quality of the training data play a significant role in the creation of a good machine learning model. In this paper, we take it one step further and demonstrate that the way the training examples are arranged is also of crucial importance. Curriculum Learning is built on the observation that organized and structured assimilation of knowledge has the ability to enable faster training and better comprehension. When humans learn to speak, they first try to utter basic phones and then gradually move towards more complex structures such as words and sentences. This methodology is known as Curriculum Learning, and we employ it in the context of Automatic Speech Recognition. We hypothesize that end-to-end models can achieve better performance when provided with an organized training set consisting of examples that exhibit an increasing level of difficulty (i.e. a curriculum). To impose structure on the training set and to define the notion of an easy example, we explored multiple scoring functions that either use feedback from an external neural network or incorporate feedback from the model itself. Empirical results show that with different curriculums we can balance the training times and the network’s performance.
Original language: English
Title of host publication: Proceedings of Interspeech'22
Publisher: International Speech Communication Association (ISCA)
Pages: 66-70
Number of pages: 5
Volume: 2022-September
DOIs
Publication status: Published - 2022
MoE publication type: A4 Article in a conference publication
Event: Interspeech - Incheon, Korea, Republic of
Duration: 18 Sept 2022 - 22 Sept 2022

Publication series

Name: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Publisher: International Speech Communication Association
ISSN (Print): 2308-457X
ISSN (Electronic): 1990-9772

Conference

Conference: Interspeech
Country/Territory: Korea, Republic of
City: Incheon
Period: 18/09/2022 - 22/09/2022

Keywords

  • Curriculum Learning
  • Automatic Speech Recognition
  • End-to-End
