Scalable Gaussian Process for Extreme classification

Akash Dhaka, Michael Andersen, Pablo Moreno, Aki Vehtari

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Scientific › peer-review



We address the limitations of Gaussian processes for multiclass classification in the setting where both the number of classes and the number of observations are very large. We propose a scalable approximate inference framework that combines the inducing points method with recently proposed variational approximations of the likelihood. This leads to a tractable lower bound on the marginal likelihood that decomposes into a sum over both data points and class labels, and is hence amenable to doubly stochastic optimization. To overcome memory issues when dealing with large datasets, we resort to amortized inference, which, coupled with subsampling over classes, reduces the computational and memory footprint without a significant loss in performance. We demonstrate empirically that the proposed algorithm leads to superior performance in terms of test accuracy and improved detection of tail labels.
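Because the bound decomposes into a sum over both data points and class labels, an unbiased estimate of it can be formed by subsampling along both axes and rescaling, which is what makes doubly stochastic optimization possible. The sketch below illustrates only that estimator, on a toy matrix of per-(point, class) terms; the sizes, names, and random values are illustrative and are not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup (hypothetical sizes): N data points, C classes.
N, C = 10_000, 500
# per_term[n, c] stands in for the (n, c) term of a lower bound that
# decomposes as a double sum over data points and class labels.
per_term = rng.normal(loc=1.0, size=(N, C))

def full_bound():
    """Exact value of the decomposable bound: sum over all points and classes."""
    return per_term.sum()

def doubly_stochastic_estimate(batch_n=100, batch_c=50):
    """Unbiased estimate of the bound from a minibatch of data points AND a
    subsample of classes, rescaled by the inverse sampling fractions."""
    idx_n = rng.choice(N, size=batch_n, replace=False)
    idx_c = rng.choice(C, size=batch_c, replace=False)
    scale = (N / batch_n) * (C / batch_c)
    return scale * per_term[np.ix_(idx_n, idx_c)].sum()

# Averaging many independent estimates recovers the full bound,
# while each single estimate touches only batch_n * batch_c terms.
est = np.mean([doubly_stochastic_estimate() for _ in range(2000)])
```

In the paper's setting each term would be the per-point, per-class contribution to the variational lower bound rather than a stored matrix entry; the rescaling by `(N / batch_n) * (C / batch_c)` is what keeps the stochastic gradient unbiased.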
Original language: English
Title of host publication: Proceedings of the 2020 IEEE 30th International Workshop on Machine Learning for Signal Processing, MLSP 2020
Number of pages: 6
ISBN (Electronic): 978-1-7281-6662-9
Publication status: Published - 1 Oct 2020
MoE publication type: A4 Article in a conference publication
Event: IEEE International Workshop on Machine Learning for Signal Processing - Aalto University, Espoo, Finland
Duration: 21 Sep 2020 - 24 Sep 2020
Conference number: 30

Publication series

Name: Machine Learning for Signal Processing
ISSN (Print): 2161-0363
ISSN (Electronic): 2161-0371


Workshop: IEEE International Workshop on Machine Learning for Signal Processing
Abbreviated title: MLSP


Keywords

  • Augmented model
  • Gaussian process classification
  • Variational inference


