Word embedding based on low-rank doubly stochastic matrix decomposition

Research output: Chapter in book/conference proceedings › Conference contribution › Scientific › peer-reviewed

Abstract

Word embedding, which encodes words into vectors, is an important starting point in natural language processing and is commonly used in many text-based machine learning tasks. However, most current word embedding approaches do not optimize the similarity in the embedding space during learning. In this paper we propose a novel neighbor embedding method which directly learns an embedding simplex where the similarities between the mapped words are optimal in terms of minimal discrepancy to the input neighborhoods. Our method is built upon two-step random walks between words via topics and is thus better able to reveal the topics among the words. Experimental results indicate that our method, compared with another existing word embedding approach, is more favorable for various queries.
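
The abstract gives no formulas, so the following is only a minimal sketch of what "two-step random walks between words via topics" could look like. It assumes a construction often used in low-rank doubly stochastic decompositions, which may differ from the paper's exact formulation. The matrix `W`, the function name `two_step_similarity`, and the toy sizes are hypothetical. Each row of `W` is a point on the probability simplex, read here as P(topic | word). The similarity between words i and j is the probability of walking from word i to a topic and then from that topic to word j.

```python
import numpy as np

def two_step_similarity(W):
    """Similarity from two-step random walks: word -> topic -> word.

    W is an (n_words, n_topics) nonnegative array whose rows sum to 1
    (assumption: W[i, k] is read as P(topic k | word i)).
    """
    W = np.asarray(W, dtype=float)
    topic_mass = W.sum(axis=0)  # total weight assigned to each topic
    # A[i, j] = sum_k W[i, k] * W[j, k] / topic_mass[k]
    return (W / topic_mass) @ W.T

# Toy data (illustrative only): 6 words embedded in a 3-topic simplex.
rng = np.random.default_rng(0)
W = rng.random((6, 3))
W /= W.sum(axis=1, keepdims=True)  # project each row onto the simplex

A = two_step_similarity(W)
```

Under these assumptions `A` is symmetric, its rows and columns each sum to 1 (doubly stochastic), and its rank is at most the number of topics. A learning procedure would then adjust `W` so that `A` matches the input word neighborhoods as closely as possible.
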
Original language: English
Title of host publication: Neural Information Processing
Subtitle: 25th International Conference, ICONIP 2018, Siem Reap, Cambodia, December 13–16, 2018, Proceedings, Part III
Pages: 90-100
Number of pages: 10
Volume: 3
ISBN (electronic): 978-3-030-04182-3
Status: Published - 2018
MoE publication type: A4 Article in conference proceedings
Event: International Conference on Neural Information Processing - Siem Reap, Cambodia
Duration: 13 December 2018 to 16 December 2018
Conference number: 25

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Publisher: Springer
Volume: 11303 LNCS
ISSN (print): 0302-9743
ISSN (electronic): 1611-3349

Conference

Conference: International Conference on Neural Information Processing
Abbreviated title: ICONIP
Country: Cambodia
City: Siem Reap
Period: 13/12/2018 to 16/12/2018

  • Cite this

    Sedov, D., & Yang, Z. (2018). Word embedding based on low-rank doubly stochastic matrix decomposition. In Neural Information Processing: 25th International Conference, ICONIP 2018, Siem Reap, Cambodia, December 13–16, 2018, Proceedings, Part III (Vol. 3, pp. 90-100). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 11303 LNCS).