Understanding the Mechanics of SPIGOT: Surrogate Gradients for Latent Structure Learning

Tsvetomila Mihaylova, Vlad Niculae, Andre F.T. Martins

Research output: Article in book/conference proceedings, Conference article in proceedings, Scientific, peer-reviewed

Abstract

Latent structure models are a powerful tool for modeling language data: they can mitigate the error propagation and annotation bottleneck in pipeline systems, while simultaneously uncovering linguistic insights about the data. One challenge with end-to-end training of these models is the argmax operation, which has null gradient. In this paper, we focus on surrogate gradients, a popular strategy to deal with this problem. We explore latent structure learning through the angle of pulling back the downstream learning objective. In this paradigm, we discover a principled motivation for both the straight-through estimator (STE) as well as the recently-proposed SPIGOT – a variant of STE for structured models. Our perspective leads to new algorithms in the same family. We empirically compare the known and the novel pulled-back estimators against the popular alternatives, yielding new insight for practitioners and revealing intriguing failure cases.
Original language: English
Title of host publication: Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, EMNLP 2020, Online, November 16-20, 2020
Publisher: Association for Computational Linguistics
Pages: 2186-2202
ISBN (electronic): 978-1-952148-90-3
Status: Published - 2020
MoE publication type: A4 Article in a conference publication
Event: Conference on Empirical Methods in Natural Language Processing - Virtual, Online
Duration: 16 Nov 2020 to 20 Nov 2020

Conference

Conference: Conference on Empirical Methods in Natural Language Processing
Abbreviated title: EMNLP
City: Virtual, Online
Period: 16/11/2020 to 20/11/2020
