Abstract
Three key aspects of online discussion venues are the multitude of participants, the underlying content trends, and the structure of the venue. However, most models cannot take all three into account. In hierarchically organized message forums, authors may participate differently at multiple levels of sections, with different interests and contributions across the hierarchy. Well-designed probabilistic models of online discussion are applicable to many tasks, such as prediction of future content or authorship attribution. However, traditional models such as Hierarchical Dirichlet Processes (HDPs) do not fully account for authors, nor for deep hierarchical venues where documents can arise at all tree nodes. We introduce the Author Tree-structured Hierarchical Dirichlet Process (ATHDP), which allows Dirichlet-process-based topic modeling of both text content and authors over a given tree structure of arbitrary size and height. Experiments on six hierarchical discussion data sets demonstrate that ATHDP outperforms traditional HDP-based alternatives in terms of perplexity and authorship attribution accuracy.
Original language | English |
---|---|
Title of host publication | Discovery Science - 21st International Conference, DS 2018, Proceedings |
Editors | Michelangelo Ceci, Larisa Soldatova, Joaquin Vanschoren, George Papadopoulos |
Publisher | Springer |
Pages | 311-327 |
Number of pages | 17 |
ISBN (Print) | 9783030017705 |
Publication status | Published - 1 Jan 2018 |
MoE publication type | A4 Conference publication |
Event | International Conference on Discovery Science, Limassol, Cyprus. Duration: 29 Oct 2018 → 31 Oct 2018. Conference number: 21 |
Publication series
Name | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
---|---|
Volume | 11198 LNAI |
ISSN (Print) | 0302-9743 |
ISSN (Electronic) | 1611-3349 |
Conference
Conference | International Conference on Discovery Science |
---|---|
Abbreviated title | DS |
Country/Territory | Cyprus |
City | Limassol |
Period | 29/10/2018 → 31/10/2018 |
Keywords
- Hierarchical Dirichlet Processes
- Message Forum
- Topic Modeling