LSTM-XL: Attention Enhanced Long-Term Memory for LSTM Cells

Tamás Grósz*, Mikko Kurimo

*Corresponding author for this work

Research output: Article in book/conference proceedings, Conference contribution, Scientific, peer-reviewed

1 citation (Scopus)

Abstract

Long Short-Term Memory (LSTM) cells, frequently used in state-of-the-art language models, struggle with long sequences of inputs. One major problem in their design is that they try to summarize long-term information into a single vector, which is difficult. The attention mechanism aims to alleviate this problem by accumulating the relevant outputs more efficiently. One very successful attention-based model is the Transformer, but it also has issues with long sentences. As a solution, the latest version of Transformers incorporates recurrence into the model. The success of these recurrent attention-based models inspired us to revise the LSTM cells by incorporating the attention mechanism. Our goal is to improve their long-term memory by attending to past outputs. The main advantage of our proposed approach is that it directly accesses the stored preceding vectors, making it more effective for long sentences. Using this method, we can also avoid the undesired resetting of the long-term vector by the forget gate. We evaluated our new cells on two speech recognition tasks and found that it is more beneficial to use attention inside the cells than after them.
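
The idea of attending to stored past outputs, instead of compressing all history into a single vector, can be illustrated with generic scaled dot-product attention. This is a minimal sketch and not the LSTM-XL cell itself: the abstract does not give the cell's equations, so the function name `attend`, the square-root scaling, and the toy vectors are all assumptions made for illustration.

```python
import math


def attend(query, memory):
    """Scaled dot-product attention of one query over stored past vectors.

    query  : list of floats of length d (hypothetical: derived from the current step)
    memory : non-empty list of stored past output vectors, each of length d
    Returns (context, weights).
    """
    d = len(query)
    # Similarity of the query to every stored vector, scaled by sqrt(d).
    scores = [
        sum(q * m for q, m in zip(query, vec)) / math.sqrt(d)
        for vec in memory
    ]
    # Numerically stable softmax over the stored positions.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # The context vector is the weighted sum of the stored vectors.
    context = [
        sum(w * vec[i] for w, vec in zip(weights, memory))
        for i in range(d)
    ]
    return context, weights


# Toy usage: three stored past outputs and a query that resembles the first one.
past_outputs = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
context, weights = attend([2.0, 0.0], past_outputs)
```

Because every stored vector stays directly addressable, a relevant early output can still receive a large weight however many steps have passed. How the paper combines such a context vector with the cell's gates is not stated in the abstract and is left open here.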

Original language: English
Title of host publication: Text, Speech, and Dialogue - 24th International Conference, TSD 2021, Proceedings
Editors: Kamil Ekštein, František Pártl, Miloslav Konopík
Publisher: SPRINGER
Pages: 382-393
Number of pages: 12
ISBN (electronic): 978-3-030-83527-9
ISBN (print): 9783030835262
DOI - permanent links
Status: Published - 2021
OKM publication type: A4 Article in conference proceedings
Event: International Conference on Text, Speech, and Dialogue - Olomouc, Czech Republic
Duration: 6 Sep 2021 to 9 Sep 2021
Conference number: 24

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 12848 LNAI
ISSN (print): 0302-9743
ISSN (electronic): 1611-3349

Conference

Conference: International Conference on Text, Speech, and Dialogue
Abbreviated title: TSD
Country/Territory: Czech Republic
City: Olomouc
Period: 06/09/2021 to 09/09/2021
