Improving Hyperparameter Learning under Approximate Inference in Gaussian Process Models

Research output: Chapter in Book/Report/Conference proceeding › Conference article in proceedings › Scientific › peer-review


Abstract

Approximate inference in Gaussian process (GP) models with non-conjugate likelihoods becomes entangled with the learning of the model hyperparameters. We improve hyperparameter learning in GP models, focusing on the interplay between variational inference (VI) and the learning target. While VI's lower bound to the marginal likelihood is a suitable objective for inferring the approximate posterior, we show that a direct approximation of the marginal likelihood, as in Expectation Propagation (EP), is a better learning objective for hyperparameter optimization. We design a hybrid training procedure that brings together the best of both worlds: it leverages conjugate-computation VI for inference and an EP-like marginal likelihood approximation for hyperparameter learning. We compare VI, EP, the Laplace approximation, and our proposed training procedure, and empirically demonstrate the effectiveness of our proposal across a wide range of data sets.
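The structure described in the abstract, fitting the approximate posterior q(f) under the ELBO while taking hyperparameter gradients from a separate EP-style marginal-likelihood approximation, can be illustrated with a minimal JAX sketch. This is a hypothetical toy illustration, not the authors' implementation: plain gradient ascent on Gaussian "site" parameters stands in for conjugate-computation VI's natural-gradient updates, the hyperparameter objective is a simplified Gaussian-evidence stand-in for the full EP energy (site-normalizer and cavity terms are omitted), and the probit toy model, data, and all names are assumptions.

```python
# Hypothetical sketch of a hybrid GP training loop: VI-style inference for
# the sites, EP-flavoured marginal-likelihood surrogate for hyperparameters.
import numpy as np
import jax
import jax.numpy as jnp
from jax.scipy.stats import norm

jax.config.update("jax_enable_x64", True)

def rbf(x, log_ls, log_var):
    # Squared-exponential kernel on 1-D inputs.
    d = (x[:, None] - x[None, :]) / jnp.exp(log_ls)
    return jnp.exp(log_var) * jnp.exp(-0.5 * d ** 2)

# Gauss-Hermite nodes/weights for N(0,1) expectations (weights sum to sqrt(2*pi)).
GH_T, GH_W = np.polynomial.hermite_e.hermegauss(20)

def posterior(theta, lam1, log_lam2, x):
    # q(f) = N(m, S) induced by diagonal Gaussian sites (CVI parametrisation):
    # S^-1 = K^-1 + diag(lam2), S^-1 m = lam1.
    K = rbf(x, *theta) + 1e-6 * jnp.eye(x.shape[0])
    S = jnp.linalg.inv(jnp.linalg.inv(K) + jnp.diag(jnp.exp(log_lam2)))
    return K, S @ lam1, S

def elbo(lam1, log_lam2, theta, x, y):
    # Inference objective: ELBO with a probit likelihood; E_q[log p(y|f)]
    # is computed per data point by Gauss-Hermite quadrature.
    K, m, S = posterior(theta, lam1, log_lam2, x)
    f = m[:, None] + jnp.sqrt(jnp.diag(S))[:, None] * GH_T[None, :]
    ell = (norm.logcdf(y[:, None] * f) @ GH_W) / jnp.sqrt(2 * jnp.pi)
    Kinv = jnp.linalg.inv(K)
    kl = 0.5 * (jnp.trace(Kinv @ S) + m @ Kinv @ m - y.shape[0]
                + jnp.linalg.slogdet(K)[1] - jnp.linalg.slogdet(S)[1])
    return jnp.sum(ell) - kl

def surrogate_lml(theta, lam1, log_lam2, x):
    # Hyperparameter objective: Gaussian evidence of the pseudo-observations
    # mu = lam1/lam2 with noise diag(1/lam2), a simplified stand-in for the
    # EP energy evaluated at the sites found by variational inference.
    lam2 = jnp.exp(log_lam2)
    B = rbf(x, *theta) + jnp.diag(1.0 / lam2) + 1e-6 * jnp.eye(x.shape[0])
    mu = lam1 / lam2
    return -0.5 * (mu @ jnp.linalg.solve(B, mu) + jnp.linalg.slogdet(B)[1]
                   + mu.shape[0] * jnp.log(2 * jnp.pi))

# Toy 1-D classification data and the alternating (hybrid) training loop.
x = jnp.linspace(-3.0, 3.0, 40)
y = jnp.sign(jnp.sin(x) + 0.1)
theta = [jnp.array(0.0), jnp.array(0.0)]   # log lengthscale, log signal variance
lam1, log_lam2 = jnp.zeros(40), jnp.zeros(40)

grad_inf = jax.jit(jax.grad(elbo, argnums=(0, 1)))
grad_hyp = jax.jit(jax.grad(surrogate_lml, argnums=0))

for step in range(100):
    for _ in range(5):                        # inference: ELBO ascent on the sites
        d1, d2 = grad_inf(lam1, log_lam2, theta, x, y)
        lam1, log_lam2 = lam1 + 0.05 * d1, log_lam2 + 0.05 * d2
    dth = grad_hyp(theta, lam1, log_lam2, x)  # hyperparameters: surrogate ascent
    theta = [t + 0.01 * g for t, g in zip(theta, dth)]

print("learned log lengthscale / log variance:", float(theta[0]), float(theta[1]))
```

Despite the simplifications, the key structural point of the paper survives: the sites are fit by one objective (the ELBO) while the hyperparameters receive gradients from a different, marginal-likelihood-flavoured objective evaluated at those same sites.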
Original language: English
Title of host publication: Proceedings of the 40th International Conference on Machine Learning
Editors: Andreas Krause, Emma Brunskill, Kyunghyun Cho, Barbara Engelhardt, Sivan Sabato, Jonathan Scarlett
Publisher: JMLR
Pages: 19595–19615
Number of pages: 21
Publication status: Published - Jul 2023
MoE publication type: A4 Conference publication
Event: International Conference on Machine Learning - Honolulu, United States
Duration: 23 Jul 2023 – 29 Jul 2023
Conference number: 40

Publication series

Name: Proceedings of Machine Learning Research
Publisher: PMLR
Volume: 202
ISSN (Electronic): 2640-3498

Conference

Conference: International Conference on Machine Learning
Abbreviated title: ICML
Country/Territory: United States
City: Honolulu
Period: 23/07/2023 – 29/07/2023
