Recurrent neural network language model with incremental updated context information generated using bag-of-words representation

Md Akmal Haidar, Mikko Kurimo

Research output: Chapter in Book/Report/Conference proceeding, Conference article in proceedings, Scientific, peer-review

1 Citation (Scopus)

Abstract

Recurrent neural network language models (RNNLMs) are becoming popular in state-of-the-art speech recognition systems. However, they cannot remember long-term patterns well because of the so-called vanishing gradient problem. Recently, the bag-of-words (BOW) representation of a word sequence has frequently been used as a context feature to improve the performance of a standard feedforward NNLM. However, BOW features have not been shown to benefit RNNLMs. In this paper, we introduce a technique that uses BOW features to remember long-term dependencies in an RNNLM by creating a context feature vector in a separate non-linear context layer during RNNLM training. The context information is incrementally updated based on the BOW features and processed further in the non-linear context layer. The output of this layer is used as a context feature vector and fed into the hidden and output layers of the RNNLM. Experiments with the Penn Treebank corpus indicate that our approach can provide lower perplexity with fewer parameters and faster training than the conventional RNNLM. Moreover, we carried out speech recognition experiments with the Wall Street Journal corpus and achieved a lower word error rate than the RNNLM.
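The data flow described in the abstract can be illustrated with a small forward-pass sketch. This is not the authors' implementation: the abstract does not give the update rule, so the exponentially decayed BOW update, the `tanh` context layer, the sigmoid hidden layer, and all names here (`BowContextRNNLM`, `decay`, the weight matrices `U`, `W`, `V`, `C`, `F`, `G`) are assumptions chosen only to show a context vector being built from BOW features and fed into both the hidden and output layers.

```python
import numpy as np


def softmax(z):
    # Numerically stable softmax over a 1-D vector.
    e = np.exp(z - z.max())
    return e / e.sum()


class BowContextRNNLM:
    """Forward pass only; all layer shapes and the decay rule are assumed."""

    def __init__(self, vocab_size, hidden_size, context_size, decay=0.9, seed=0):
        rng = np.random.default_rng(seed)

        def init(rows, cols):
            return rng.normal(0.0, 0.1, size=(rows, cols))

        self.vocab_size = vocab_size
        self.hidden_size = hidden_size
        self.decay = decay                        # assumed forgetting factor
        self.U = init(hidden_size, vocab_size)    # current word -> hidden
        self.W = init(hidden_size, hidden_size)   # previous hidden -> hidden
        self.V = init(vocab_size, hidden_size)    # hidden -> output
        self.C = init(context_size, vocab_size)   # BOW -> context layer
        self.F = init(hidden_size, context_size)  # context -> hidden
        self.G = init(vocab_size, context_size)   # context -> output

    def initial_state(self):
        # (hidden state, running BOW vector), both zero at sequence start.
        return np.zeros(self.hidden_size), np.zeros(self.vocab_size)

    def step(self, word_id, state):
        h_prev, bow_prev = state
        # Incremental BOW update (assumed rule): decay old counts, add the new word.
        bow = self.decay * bow_prev
        bow[word_id] += 1.0
        # Separate non-linear context layer produces the context feature vector.
        c = np.tanh(self.C @ bow)
        # The context vector feeds the hidden layer alongside the usual recurrence.
        pre_h = self.U[:, word_id] + self.W @ h_prev + self.F @ c
        h = 1.0 / (1.0 + np.exp(-pre_h))
        # The context vector also feeds the output layer directly.
        probs = softmax(self.V @ h + self.G @ c)
        return probs, (h, bow)


model = BowContextRNNLM(vocab_size=5, hidden_size=4, context_size=3, decay=0.5)
state = model.initial_state()
for w in [1, 2]:
    probs, state = model.step(w, state)
```

In this sketch the BOW vector is carried as part of the recurrent state, so long-range word occurrences reach the prediction through `c` without passing through the hidden-to-hidden recurrence. Training (backpropagation through time) is omitted.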

Original language: English
Title of host publication: Proceedings of the Annual Conference of the International Speech Communication Association
Subtitle of host publication: Interspeech'16, San Francisco, USA, Sept. 8-12, 2016
Publisher: International Speech Communication Association (ISCA)
Pages: 3504-3508
Number of pages: 5
Volume: 08-12-September-2016
ISBN (Electronic): 978-1-5108-3313-5
Publication status: Published - 2016
MoE publication type: A4 Conference publication
Event: Interspeech - San Francisco, United States
Duration: 8 Sept 2016 to 12 Sept 2016
Conference number: 17

Publication series

Name: Proceedings of the Annual Conference of the International Speech Communication Association
Publisher: International Speech Communication Association
ISSN (Print): 1990-9770
ISSN (Electronic): 2308-457X

Conference

Conference: Interspeech
Country/Territory: United States
City: San Francisco
Period: 08/09/2016 to 12/09/2016

Keywords

  • Bag-of-words
  • Language modeling
  • Recurrent neural networks
  • Speech recognition
