Abstract
Large 0--1 datasets arise in various applications, such as market basket analysis and information retrieval. We concentrate on the study of topic models, aiming at results which indicate why certain methods succeed or fail. We describe simple algorithms for finding topic models from 0--1 data. We give theoretical results showing that the algorithms can discover the epsilon-separable topic models of Papadimitriou et al. We present empirical results showing that the algorithms find natural topics in real-world data sets. We also briefly discuss the connections to matrix approaches, including nonnegative matrix factorization and independent component analysis.
Original language | English |
---|---|
Title of host publication | KDD '02: Proceedings of the eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining |
Publisher | ACM |
Pages | 450-455 |
ISBN (electronic) | 978-1-58113-567-1 |
DOI - permanent links | |
Status | Published - 2002 |
MoE publication type | A4 Article in conference proceedings |
Event | ACM SIGKDD International Conference on Knowledge Discovery and Data Mining - Edmonton, Canada. Duration: 23 Jun 2002 → 26 Jun 2002. Conference number: 8 |
Conference
Conference | ACM SIGKDD International Conference on Knowledge Discovery and Data Mining |
---|---|
Abbreviated title | KDD |
Country/Territory | Canada |
City | Edmonton |
Period | 23/06/2002 → 26/06/2002 |
Research areas
- 0-1 data
- data mining
- latent variable model
- topic models