Recurrent neural network language model with incremental updated context information generated using bag-of-words representation

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > Scientific > peer-review

Abstract

Recurrent neural network language models (RNNLMs) are becoming popular in state-of-the-art speech recognition systems. However, they cannot remember long-term patterns well because of the so-called vanishing gradient problem. Recently, the bag-of-words (BOW) representation of a word sequence has frequently been used as a context feature to improve the performance of a standard feedforward NNLM. However, BOW features have not been shown to benefit RNNLMs. In this paper, we introduce a technique that uses BOW features to remember long-term dependencies in an RNNLM by creating a context feature vector in a separate non-linear context layer during RNNLM training. The context information is incrementally updated based on the BOW features and processed further in the non-linear context layer. The output of this layer is used as a context feature vector and fed into the hidden and output layers of the RNNLM. Experiments with the Penn Treebank corpus indicate that our approach can provide lower perplexity with fewer parameters and faster training than the conventional RNNLM. Moreover, we carried out speech recognition experiments with the Wall Street Journal corpus and achieved a lower word error rate than the RNNLM.
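The data flow described in the abstract can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the abstract gives no equations, so the exponential-decay update of the BOW context, the tanh activations, the weight names (`W_context`, `W_fh`, `W_fo`, and so on), and all layer sizes are assumptions made here for illustration.

```python
import math
import random


def update_context(context, word_id, decay=0.9):
    """Incrementally update a running bag-of-words context vector.

    Assumption: old counts are scaled by `decay` and the current word's
    count is incremented. The paper's actual update rule may differ.
    """
    updated = [decay * c for c in context]
    updated[word_id] += 1.0
    return updated


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def context_layer(context, weights):
    """Separate non-linear context layer (tanh is an assumed activation)."""
    return [math.tanh(a) for a in matvec(weights, context)]


def softmax(logits):
    """Numerically stable softmax."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def init_params(vocab_size, hidden_size, context_size, seed=0):
    """Small random weights; all names and shapes are hypothetical."""
    rng = random.Random(seed)

    def mat(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

    return {
        "W_in": mat(hidden_size, vocab_size),        # input word -> hidden
        "W_rec": mat(hidden_size, hidden_size),      # previous hidden -> hidden
        "W_context": mat(context_size, vocab_size),  # BOW context -> context layer
        "W_fh": mat(hidden_size, context_size),      # context feature -> hidden
        "W_out": mat(vocab_size, hidden_size),       # hidden -> output
        "W_fo": mat(vocab_size, context_size),       # context feature -> output
    }


def rnn_step(word_id, hidden, context, params, decay=0.9):
    """One time step: update the context, then feed its feature vector
    into both the hidden layer and the output layer."""
    context = update_context(context, word_id, decay)
    feature = context_layer(context, params["W_context"])
    recurrent = matvec(params["W_rec"], hidden)
    from_feature = matvec(params["W_fh"], feature)
    hidden = [
        math.tanh(params["W_in"][j][word_id] + recurrent[j] + from_feature[j])
        for j in range(len(hidden))
    ]
    out_h = matvec(params["W_out"], hidden)
    out_f = matvec(params["W_fo"], feature)
    probs = softmax([a + b for a, b in zip(out_h, out_f)])
    return probs, hidden, context


# Usage: run a toy word-id sequence through the untrained model.
params = init_params(vocab_size=5, hidden_size=4, context_size=3)
hidden, context = [0.0] * 4, [0.0] * 5
for word_id in [0, 2, 2, 1]:
    probs, hidden, context = rnn_step(word_id, hidden, context, params)
```

With untrained random weights the predicted distribution is close to uniform. The sketch only shows where the context feature vector enters the network, not how the weights are trained.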

Details

Original language: English
Title of host publication: Proceedings of the Annual Conference of the International Speech Communication Association
Subtitle of host publication: Interspeech'16, San Francisco, USA, Sept. 8-12, 2016
Publication status: Published - 2016
MoE publication type: A4 Article in a conference publication
Event: Interspeech - San Francisco, United States
Duration: 8 Sep 2016 to 12 Sep 2016
Conference number: 17

Publication series

Name: Proceedings of the Annual Conference of the International Speech Communication Association
Publisher: International Speech Communication Association
ISSN (Print): 1990-9770
ISSN (Electronic): 2308-457X

Conference

Conference: Interspeech
Country: United States
City: San Francisco
Period: 08/09/2016 to 12/09/2016

Research areas

Bag-of-words, Language modeling, Recurrent neural networks, Speech recognition

ID: 9715443