DOLDA: a regularized supervised topic model for high-dimensional multi-class regression

Måns Magnusson*, Leif Jonsson, Mattias Villani

*Corresponding author for this work

Research output: Journal article, Article, Scientific, peer-reviewed

5 Citations (Scopus)
118 Downloads (Pure)

Abstract

Generating user-interpretable multi-class predictions in data-rich environments with many classes and explanatory covariates is a daunting task. We introduce Diagonal Orthant Latent Dirichlet Allocation (DOLDA), a supervised topic model for multi-class classification that can handle many classes as well as many covariates. To handle many classes, we use the recently proposed Diagonal Orthant probit model (Johndrow et al., in: Proceedings of the sixteenth international conference on artificial intelligence and statistics, 2013) together with an efficient Horseshoe prior for variable selection/shrinkage (Carvalho et al. in Biometrika 97:465–480, 2010). We propose a computationally efficient parallel Gibbs sampler for the new model. An important advantage of DOLDA is that learned topics are directly connected to individual classes without the need for a reference class. We evaluate the model’s predictive accuracy and scalability, and demonstrate DOLDA’s advantage in interpreting the generated predictions.
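
As a rough illustration of why no reference class is needed, the sketch below computes class probabilities in the general form of a diagonal orthant probit model: each class gets its own linear predictor, and the probabilities are normalized over the "exactly one latent variable is positive" events. The function names (`normal_cdf`, `do_probit_probs`) and the example predictor values are hypothetical, and in DOLDA the linear predictors would be built from topic proportions and covariates, which is not modeled here. This is a minimal sketch under those assumptions, not the authors' implementation.

```python
import math


def normal_cdf(x: float) -> float:
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def do_probit_probs(etas):
    """Class probabilities for a diagonal orthant probit (illustrative).

    etas[k] is an assumed linear predictor for class k. Class k is
    selected when its latent variable is positive and all others are
    negative, so each class has its own coefficients and no class
    serves as a reference.
    """
    cdfs = [normal_cdf(e) for e in etas]
    weights = []
    for k, p_k in enumerate(cdfs):
        w = p_k  # latent variable for class k is positive
        for j, p_j in enumerate(cdfs):
            if j != k:
                w *= 1.0 - p_j  # every other latent variable is negative
        weights.append(w)
    total = sum(weights)  # normalizing constant over all classes
    return [w / total for w in weights]


# Hypothetical linear predictors for three classes.
probs = do_probit_probs([0.5, -1.0, 0.0])
```

Equal predictors give equal probabilities, and raising one class's predictor raises that class's probability relative to the others.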

Original language: English
Pages: 175-201
Number of pages: 27
Journal: Computational Statistics
Volume: 35
Issue number: 1
Publication status: Published - 1 Mar 2020
MoE publication type: A1 Journal article, refereed
