LSTM-XL: Attention Enhanced Long-Term Memory for LSTM Cells

Tamás Grósz*, Mikko Kurimo

*Corresponding author for this work

Research output: Conference article in proceedings (Chapter in Book/Report/Conference proceeding), Scientific, peer-reviewed

1 Citation (Scopus)


Long Short-Term Memory (LSTM) cells, frequently used in state-of-the-art language models, struggle with long sequences of inputs. One major problem in their design is that they try to summarize long-term information into a single vector, which is difficult. The attention mechanism aims to alleviate this problem by accumulating the relevant outputs more efficiently. One very successful attention-based model is the Transformer, but it also has issues with long sentences. As a solution, recent Transformer variants incorporate recurrence into the model. The success of these recurrent attention-based models inspired us to revise the LSTM cell by incorporating the attention mechanism. Our goal is to improve its long-term memory by attending to past outputs. The main advantage of our proposed approach is that it directly accesses the stored preceding vectors, making it more effective for long sentences. Using this method, we can also avoid the undesired resetting of the long-term vector by the forget gate. We evaluated our new cells on two speech recognition tasks and found that it is more beneficial to use attention inside the cells than after them.
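The idea sketched in the abstract — replacing the single forget-gated cell state with attention over stored past hidden states — can be illustrated with a toy cell. This is a minimal NumPy sketch under our own assumptions, not the authors' implementation; the class name, gating details, and scaled dot-product scoring are all illustrative choices.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - np.max(x))
    return e / e.sum()

class AttentionLSTMCell:
    """Toy LSTM-like cell whose long-term context is computed by
    attending over all stored past hidden states, instead of
    compressing history into one forget-gated cell vector.
    Illustrative sketch only, not the LSTM-XL paper's exact cell."""

    def __init__(self, input_size, hidden_size, seed=0):
        rng = np.random.default_rng(seed)
        z = input_size + hidden_size
        # weights for input, forget, output, and candidate transforms
        self.W = rng.normal(0.0, 0.1, (4 * hidden_size, z))
        self.b = np.zeros(4 * hidden_size)
        self.hidden_size = hidden_size
        self.history = []  # all past hidden states, kept for attention

    def step(self, x, h_prev):
        H = self.hidden_size
        gates = self.W @ np.concatenate([x, h_prev]) + self.b
        i = 1.0 / (1.0 + np.exp(-gates[:H]))        # input gate
        o = 1.0 / (1.0 + np.exp(-gates[2*H:3*H]))   # output gate
        g = np.tanh(gates[3*H:])                    # candidate vector
        if self.history:
            # long-term context via scaled dot-product attention over
            # the stored history: direct access to preceding vectors,
            # so no forget gate can reset the long-term information
            keys = np.stack(self.history)           # shape (T, H)
            scores = keys @ h_prev / np.sqrt(H)
            c = softmax(scores) @ keys              # attended context
        else:
            c = np.zeros(H)
        h = o * np.tanh(c + i * g)
        self.history.append(h)
        return h
```

In this sketch the per-step cost grows with the length of the stored history, which is the price paid for direct access to past outputs; a practical variant would cap the history window.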

Original language: English
Title of host publication: Text, Speech, and Dialogue - 24th International Conference, TSD 2021, Proceedings
Editors: Kamil Ekštein, František Pártl, Miloslav Konopík
Number of pages: 12
ISBN (Electronic): 978-3-030-83527-9
ISBN (Print): 978-3-030-83526-2
Publication status: Published - 2021
MoE publication type: A4 Conference publication
Event: International Conference on Text, Speech, and Dialogue - Olomouc, Czech Republic
Duration: 6 Sept 2021 – 9 Sept 2021
Conference number: 24

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 12848 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349


Conference: International Conference on Text, Speech, and Dialogue
Abbreviated title: TSD
Country/Territory: Czech Republic


Keywords:

  • Attention
  • LSTM
  • Speech recognition


