Topic Identification in Dynamical Text by Complexity Pursuit

Ella Bingham, Ata Kaban, M. Girolami

    Research output: Journal article, scientific, peer-reviewed

    34 Citations (Scopus)

    Abstract

    The problem of analysing dynamically evolving textual data has arisen within the last few years. An example of such data is the discussion appearing in Internet chat lines. In this Letter, a recently introduced source separation method, termed complexity pursuit, is applied to the problem of finding topics in dynamical text and is compared against several blind separation algorithms on the same problem. Complexity pursuit is a generalisation of projection pursuit to time series, and it is able to use both higher-order statistical measures and temporal dependency information in separating the topics. Experimental results on chat line and newsgroup data demonstrate that the minimum complexity time series do indeed correspond to meaningful topics inherent in the dynamical text data, and also suggest the applicability of the method to query-based retrieval from a temporally changing text stream.
    Original language: English
    Pages: 69-83
    Journal: Neural Processing Letters
    Volume: 17
    Issue: 1
    Publication status: Published - 2003
    Ministry of Education and Culture (OKM) publication type: A1 Original article in a scientific journal

    Research areas

    • chat line discussion
    • complexity pursuit
    • dynamical text
    • independent component analysis
    • time series
