LSTM-XL: Attention Enhanced Long-Term Memory for LSTM Cells

Tamás Grósz*, Mikko Kurimo

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Scientific › peer-review

1 Citation (Scopus)

Abstract

Long Short-Term Memory (LSTM) cells, frequently used in state-of-the-art language models, struggle with long sequences of inputs. One major problem in their design is that they try to summarize long-term information into a single vector, which is difficult. The attention mechanism aims to alleviate this problem by accumulating the relevant outputs more efficiently. One very successful attention-based model is the Transformer, but it also has issues with long sentences. As a solution, newer versions of the Transformer incorporate recurrence into the model. The success of these recurrent attention-based models inspired us to revise LSTM cells by incorporating the attention mechanism. Our goal is to improve their long-term memory by attending to past outputs. The main advantage of our proposed approach is that it directly accesses the stored preceding vectors, making it more effective for long sentences. This method also lets us avoid the undesired resetting of the long-term vector by the forget gate. We evaluated our new cells on two speech recognition tasks and found that it is more beneficial to use attention inside the cells than after them.
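The core idea in the abstract can be made concrete with a short sketch. The PyTorch snippet below is an illustrative implementation under stated assumptions, not the paper's actual LSTM-XL formulation: the class name AttentiveLSTMCell, the fixed-length sliding memory buffer, and the scaled dot-product read are hypothetical choices made for this example.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentiveLSTMCell(nn.Module):
    """Hypothetical sketch: an LSTM-style cell whose long-term context is
    read by attending over a buffer of stored past outputs, rather than
    summarized into a single forget-gated cell vector."""

    def __init__(self, input_size, hidden_size, memory_len=64):
        super().__init__()
        self.hidden_size = hidden_size
        self.memory_len = memory_len          # how many past outputs to store
        # input, output, and candidate projections; the forget gate is dropped
        # here, since attention replaces the forget-gated long-term path
        self.gates = nn.Linear(input_size + hidden_size, 3 * hidden_size)
        # query projection for the attention read over stored outputs
        self.query = nn.Linear(input_size + hidden_size, hidden_size)

    def forward(self, x, h_prev, memory):
        # x: (batch, input), h_prev: (batch, hidden),
        # memory: (batch, memory_len, hidden) buffer of preceding outputs
        z = torch.cat([x, h_prev], dim=-1)
        i, o, g = self.gates(z).chunk(3, dim=-1)
        i, o, g = torch.sigmoid(i), torch.sigmoid(o), torch.tanh(g)

        # scaled dot-product attention over the stored preceding vectors
        q = self.query(z).unsqueeze(1)                         # (B, 1, H)
        scores = q @ memory.transpose(1, 2) / self.hidden_size ** 0.5
        ctx = (F.softmax(scores, dim=-1) @ memory).squeeze(1)  # (B, H)

        c = i * g + ctx                # blend new content with attended past
        h = o * torch.tanh(c)

        # slide the window: drop the oldest output, append the newest
        memory = torch.cat([memory[:, 1:], h.unsqueeze(1)], dim=1)
        return h, memory

# Toy usage: 2 sequences, 16-dim inputs, 32-dim state, 8-slot memory.
cell = AttentiveLSTMCell(16, 32, memory_len=8)
h = torch.zeros(2, 32)
mem = torch.zeros(2, 8, 32)
for x in torch.randn(10, 2, 16):       # 10 time steps
    h, mem = cell(x, h, mem)
```

The point of the sketch is the design difference the abstract describes: the long-term read is a direct, softmax-weighted lookup into the stored past outputs, so no forget gate can reset that information.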

Original language: English
Title of host publication: Text, Speech, and Dialogue - 24th International Conference, TSD 2021, Proceedings
Editors: Kamil Ekštein, František Pártl, Miloslav Konopík
Pages: 382-393
Number of pages: 12
ISBN (Electronic): 978-3-030-83527-9
DOIs
Publication status: Published - 2021
MoE publication type: A4 Article in a conference publication
Event: International Conference on Text, Speech, and Dialogue - Olomouc, Czech Republic
Duration: 6 Sep 2021 - 9 Sep 2021
Conference number: 24

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 12848 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: International Conference on Text, Speech, and Dialogue
Abbreviated title: TSD
Country/Territory: Czech Republic
City: Olomouc
Period: 06/09/2021 - 09/09/2021

Keywords

  • Attention
  • LSTM
  • RNNLM
  • Speech recognition
