Author Tree-Structured Hierarchical Dirichlet Process

Md Hijbul Alam*, Jaakko Peltonen, Jyrki Nummenmaa, Kalervo Järvelin

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference article in proceedings › Scientific › peer-review

1 Citation (Scopus)

Abstract

Three key aspects of online discussion venues are the multitude of participants, the underlying trends of content, and the structure of the venue. However, most models cannot account for all three. In hierarchically organized message forums, authors may participate differently at multiple levels of sections, with different interests and contributions across the hierarchy. Well-designed probabilistic models of online discussion are applicable to many tasks, such as prediction of future content or authorship attribution. However, traditional models such as Hierarchical Dirichlet Processes (HDPs) do not fully account for authors, nor for deep hierarchical venues where documents can arise at all tree nodes. We introduce the Author Tree-structured Hierarchical Dirichlet Process (ATHDP), which allows Dirichlet process based topic modeling of both text content and authors over a given tree structure of arbitrary size and height. Experiments on six hierarchical discussion data sets demonstrate that ATHDP outperforms traditional HDP-based alternatives in terms of perplexity and authorship attribution accuracy.

Original language: English
Title of host publication: Discovery Science - 21st International Conference, DS 2018, Proceedings
Editors: Michelangelo Ceci, Larisa Soldatova, Joaquin Vanschoren, George Papadopoulos
Publisher: Springer
Pages: 311-327
Number of pages: 17
ISBN (Print): 9783030017705
DOIs
Publication status: Published - 1 Jan 2018
MoE publication type: A4 Conference publication
Event: International Conference on Discovery Science - Limassol, Cyprus
Duration: 29 Oct 2018 to 31 Oct 2018
Conference number: 21

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 11198 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: International Conference on Discovery Science
Abbreviated title: DS
Country/Territory: Cyprus
City: Limassol
Period: 29/10/2018 to 31/10/2018

Keywords

  • Hierarchical Dirichlet Processes
  • Message Forum
  • Topic Modeling
