Word embedding based on low-rank doubly stochastic matrix decomposition

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Scientific › peer-review

Standard

Word embedding based on low-rank doubly stochastic matrix decomposition. / Sedov, Denis; Yang, Zhirong.

Neural Information Processing: 25th International Conference, ICONIP 2018, Siem Reap, Cambodia, December 13–16, 2018, Proceedings, Part III. Vol. 3. 2018. p. 90-100 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 11303 LNCS).


Harvard

Sedov, D & Yang, Z 2018, Word embedding based on low-rank doubly stochastic matrix decomposition. in Neural Information Processing: 25th International Conference, ICONIP 2018, Siem Reap, Cambodia, December 13–16, 2018, Proceedings, Part III. vol. 3, Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 11303 LNCS, pp. 90-100, International Conference on Neural Information Processing, Siem Reap, Cambodia, 13/12/2018.

APA

Sedov, D., & Yang, Z. (2018). Word embedding based on low-rank doubly stochastic matrix decomposition. In Neural Information Processing: 25th International Conference, ICONIP 2018, Siem Reap, Cambodia, December 13–16, 2018, Proceedings, Part III (Vol. 3, pp. 90-100). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 11303 LNCS).

Vancouver

Sedov D, Yang Z. Word embedding based on low-rank doubly stochastic matrix decomposition. In Neural Information Processing: 25th International Conference, ICONIP 2018, Siem Reap, Cambodia, December 13–16, 2018, Proceedings, Part III. Vol. 3. 2018. p. 90-100. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)).

Author

Sedov, Denis ; Yang, Zhirong. / Word embedding based on low-rank doubly stochastic matrix decomposition. Neural Information Processing: 25th International Conference, ICONIP 2018, Siem Reap, Cambodia, December 13–16, 2018, Proceedings, Part III. Vol. 3. 2018. pp. 90-100 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)).

BibTeX

@inproceedings{b778e60584ee4a029e7d2c3cb56a196c,
title = "Word embedding based on low-rank doubly stochastic matrix decomposition",
abstract = "Word embedding, which encodes words into vectors, is an important starting point in natural language processing and is commonly used in many text-based machine learning tasks. However, most current word embedding approaches do not optimize the similarity in the embedding space during learning. In this paper we propose a novel neighbor embedding method that directly learns an embedding simplex in which the similarities between the mapped words are optimal in terms of minimal discrepancy to the input neighborhoods. Our method is built upon two-step random walks between words via topics and is thus able to better reveal the topics among the words. Experimental results indicate that our method is more favorable than an existing word embedding approach for various queries.",
keywords = "Nonnegative matrix factorization, Word embedding, Cluster analysis, Doubly stochastic",
author = "Denis Sedov and Zhirong Yang",
year = "2018",
language = "English",
isbn = "978-3-030-04181-6",
volume = "3",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
publisher = "Springer",
pages = "90--100",
booktitle = "Neural Information Processing",

}

RIS

TY - GEN

T1 - Word embedding based on low-rank doubly stochastic matrix decomposition

AU - Sedov, Denis

AU - Yang, Zhirong

PY - 2018

Y1 - 2018

N2 - Word embedding, which encodes words into vectors, is an important starting point in natural language processing and is commonly used in many text-based machine learning tasks. However, most current word embedding approaches do not optimize the similarity in the embedding space during learning. In this paper we propose a novel neighbor embedding method that directly learns an embedding simplex in which the similarities between the mapped words are optimal in terms of minimal discrepancy to the input neighborhoods. Our method is built upon two-step random walks between words via topics and is thus able to better reveal the topics among the words. Experimental results indicate that our method is more favorable than an existing word embedding approach for various queries.

AB - Word embedding, which encodes words into vectors, is an important starting point in natural language processing and is commonly used in many text-based machine learning tasks. However, most current word embedding approaches do not optimize the similarity in the embedding space during learning. In this paper we propose a novel neighbor embedding method that directly learns an embedding simplex in which the similarities between the mapped words are optimal in terms of minimal discrepancy to the input neighborhoods. Our method is built upon two-step random walks between words via topics and is thus able to better reveal the topics among the words. Experimental results indicate that our method is more favorable than an existing word embedding approach for various queries.

KW - Nonnegative matrix factorization

KW - Word embedding

KW - Cluster analysis

KW - Doubly stochastic

M3 - Conference contribution

SN - 978-3-030-04181-6

VL - 3

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 90

EP - 100

BT - Neural Information Processing

ER -

ID: 30347849
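The "two-step random walks between words via topics" mentioned in the abstract can be illustrated numerically. The sketch below is only an assumption-laden illustration, not the paper's method: it assumes each word i carries a topic distribution W[i, k] (a row on the simplex) and that word-to-word transitions go word → topic → word; the names W, B, n_words, and n_topics are hypothetical, and the paper learns W by minimizing the discrepancy to input neighborhoods rather than sampling it at random.

```python
import numpy as np

rng = np.random.default_rng(0)
n_words, n_topics = 6, 3  # illustrative sizes; real vocabularies are much larger

# W[i, k] ~ P(topic k | word i): nonnegative rows summing to 1 (points on the simplex).
W = rng.random((n_words, n_topics))
W /= W.sum(axis=1, keepdims=True)

# Two-step random walk word -> topic -> word:
#   B[i, j] = sum_k W[i, k] * W[j, k] / sum_v W[v, k]
col_sums = W.sum(axis=0)          # total probability mass assigned to each topic
B = (W / col_sums) @ W.T          # low-rank: rank(B) <= n_topics

# B is symmetric and doubly stochastic: every row and every column sums to 1.
assert np.allclose(B.sum(axis=0), 1.0)
assert np.allclose(B.sum(axis=1), 1.0)
assert np.allclose(B, B.T)
```

A doubly stochastic B of this form can be read as a symmetric word-to-word similarity matrix whose rows and columns are both probability distributions, which is what makes it a natural target for matching input neighborhoods.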