Topics in 0–1 data

Ella Bingham, H. Mannila, J.K. Seppänen

    Research output: Chapter in Book/Report/Conference proceeding › Conference article in proceedings › Scientific › peer-review

    3 Citations (Scopus)

    Abstract

    Large 0–1 datasets arise in various applications, such as market basket analysis and information retrieval. We concentrate on the study of topic models, aiming at results which indicate why certain methods succeed or fail. We describe simple algorithms for finding topic models from 0–1 data. We give theoretical results showing that the algorithms can discover the epsilon-separable topic models of Papadimitriou et al. We present empirical results showing that the algorithms find natural topics in real-world datasets. We also briefly discuss the connections to matrix approaches, including nonnegative matrix factorization and independent component analysis.
    Original language: English
    Title of host publication: KDD '02: Proceedings of the eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
    Publisher: ACM
    Pages: 450-455
    ISBN (Electronic): 978-1-58113-567-1
    DOIs
    Publication status: Published - 2002
    MoE publication type: A4 Conference publication
    Event: ACM SIGKDD International Conference on Knowledge Discovery and Data Mining - Edmonton, Canada
    Duration: 23 Jun 2002 to 26 Jun 2002
    Conference number: 8

    Conference

    Conference: ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
    Abbreviated title: KDD
    Country/Territory: Canada
    City: Edmonton
    Period: 23/06/2002 to 26/06/2002

    Keywords

    • 0-1 data
    • data mining
    • latent variable model
    • topic models
