Recurrent neural network language model with incremental updated context information generated using bag-of-words representation

Md Akmal Haidar, Mikko Kurimo

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Scientific › peer-review

Abstract

The recurrent neural network language model (RNNLM) is becoming popular in state-of-the-art speech recognition systems. However, it cannot remember long-term patterns well due to the so-called vanishing gradient problem. Recently, the bag-of-words (BOW) representation of a word sequence has frequently been used as a context feature to improve the performance of a standard feedforward NNLM. However, BOW features have not been shown to benefit RNNLMs. In this paper, we introduce a technique that uses BOW features to remember long-term dependencies in the RNNLM by creating a context feature vector in a separate non-linear context layer during training. The context information is incrementally updated based on the BOW features and processed further in a non-linear context layer. The output of this layer is used as a context feature vector and fed into the hidden and output layers of the RNNLM. Experiments with the Penn Treebank corpus indicate that our approach can provide lower perplexity with fewer parameters and faster training compared to the conventional RNNLM. Moreover, we carried out speech recognition experiments with the Wall Street Journal corpus and achieved a lower word error rate than the conventional RNNLM.
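To make the architecture in the abstract concrete, the following is a minimal NumPy sketch of one forward step of an RNNLM with an incrementally updated BOW context vector. All dimensions, weight names, and the exponential-decay form of the incremental update are illustrative assumptions, not the authors' exact formulation; only the overall data flow (BOW features → non-linear context layer → hidden and output layers) follows the abstract.

```python
import numpy as np

# Illustrative sizes (assumed, not from the paper).
rng = np.random.default_rng(0)
V, H, C = 100, 32, 16  # vocabulary, hidden, context-layer sizes

W_ih = rng.normal(0, 0.1, (H, V))  # input  -> hidden
W_hh = rng.normal(0, 0.1, (H, H))  # hidden -> hidden (recurrence)
W_bc = rng.normal(0, 0.1, (C, V))  # BOW    -> context layer
W_ch = rng.normal(0, 0.1, (H, C))  # context -> hidden
W_ho = rng.normal(0, 0.1, (V, H))  # hidden  -> output
W_co = rng.normal(0, 0.1, (V, C))  # context -> output

def step(word_id, h, bow, decay=0.9):
    """One forward step; the decayed-average BOW update is an assumed form."""
    x = np.zeros(V)
    x[word_id] = 1.0
    bow = decay * bow + (1 - decay) * x            # incremental BOW update
    c = np.tanh(W_bc @ bow)                        # non-linear context layer
    h = np.tanh(W_ih @ x + W_hh @ h + W_ch @ c)    # context fed into hidden layer
    logits = W_ho @ h + W_co @ c                   # ...and into the output layer
    p = np.exp(logits - logits.max())
    p /= p.sum()                                   # softmax over next-word probs
    return p, h, bow

h, bow = np.zeros(H), np.zeros(V)
for w in [3, 17, 42]:                              # toy word-id sequence
    p, h, bow = step(w, h, bow)
```

Because the BOW vector is a running summary of all past words rather than a single hidden state, it can carry long-range context past the point where recurrent gradients vanish, which is the intuition the abstract appeals to.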

Original language: English
Title of host publication: Proceedings of the Annual Conference of the International Speech Communication Association
Subtitle of host publication: Interspeech'16, San Francisco, USA, Sept. 8-12, 2016
Publisher: International Speech Communication Association
Pages: 3504-3508
Number of pages: 5
Volume: 08-12-September-2016
ISBN (Electronic): 978-1-5108-3313-5
DOIs: 10.21437/Interspeech.2016-375
Publication status: Published - 2016
MoE publication type: A4 Article in a conference publication
Event: Interspeech - San Francisco, United States
Duration: 8 Sep 2016 – 12 Sep 2016
Conference number: 17

Publication series

Name: Proceedings of the Annual Conference of the International Speech Communication Association
Publisher: International Speech Communication Association
ISSN (Print): 1990-9770
ISSN (Electronic): 2308-457X

Conference

Conference: Interspeech
Country: United States
City: San Francisco
Period: 08/09/2016 – 12/09/2016

Keywords

  • Bag-of-words
  • Language modeling
  • Recurrent neural networks
  • Speech recognition



  • Cite this

Haidar, M. A., & Kurimo, M. (2016). Recurrent neural network language model with incremental updated context information generated using bag-of-words representation. In Proceedings of the Annual Conference of the International Speech Communication Association: Interspeech'16, San Francisco, USA, Sept. 8-12, 2016 (Vol. 08-12-September-2016, pp. 3504-3508). (Proceedings of the Annual Conference of the International Speech Communication Association). International Speech Communication Association. https://doi.org/10.21437/Interspeech.2016-375