Gaussian Process Surrogate Methods for Sample-Efficient Approximate Bayesian Computation

Marko Järvenpää

Research output: Thesis › Doctoral Thesis › Collection of Articles

Abstract

In many application fields, such as ecology, epidemiology and astronomy, simulation models are used to study complex phenomena that occur in nature. Often the analytical form of the likelihood function of these models is either unavailable or too costly to evaluate, which complicates statistical inference. Likelihood-free inference (LFI) methods such as approximate Bayesian computation (ABC), which replace evaluations of the intractable likelihood with forward simulations of the model, have become a popular approach to inference for simulation models. Nevertheless, current LFI methods face several computational and statistical challenges. In particular, standard ABC algorithms require a huge number of simulations, which makes them infeasible when the forward simulations are expensive. This thesis deals with likelihood-free inference for computationally costly models. The main contribution is a coherent framework for LFI based on Gaussian process (GP) surrogate models. GP models make it possible to encode smoothness assumptions about the simulation model output, which reduces the number of simulations needed. Additionally, the uncertainty in the resulting model-based posterior approximations due to the limited simulation budget can be quantified. We develop Bayesian experimental design strategies that select the evaluation locations so as to minimise the computational cost. Both sequential designs, where simulations are chosen one at a time, and batch strategies, which can take advantage of parallel computing, are derived. In addition to the LFI scenario, the proposed methods also apply when the likelihood can be evaluated but is expensive. In essence, the proposed framework can be viewed as an LFI counterpart of probabilistic numerical methods such as Bayesian optimisation, developed for optimising expensive objective functions, and Bayesian quadrature, developed for computing integrals of expensive functions.
We demonstrate the advantages of the proposed LFI methods using extensive empirical simulations. Some theoretical analysis of the proposed algorithms is also provided, and their relation to some other GP surrogate methods is discussed. In addition to the contributions to statistical methodology, applications in population genomics are also considered. In particular, we use the GP-based ABC methodology to obtain an approximate posterior of a simulation model describing horizontal gene transfer in bacteria. We also develop a probabilistic model and an inference algorithm using a novel combination of ABC and Metropolis-within-Gibbs sampling to facilitate a better understanding of bacterial colonisation.
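The general idea of a GP surrogate for ABC can be illustrated with a minimal sketch. This is a generic toy example under stated assumptions, not the algorithms developed in the thesis: the simulator, the discrepancy, the kernel settings, the threshold `eps` and all function names are hypothetical, and the design points are placed on a fixed grid rather than chosen by Bayesian experimental design. A GP is fitted to the discrepancy between simulated and observed data at only 15 parameter values, and the GP-implied probability that the discrepancy falls below `eps` serves as an approximate likelihood.

```python
import numpy as np
from math import erf, sqrt

def simulator(theta, rng, n=50):
    # Hypothetical toy simulator: n Gaussian draws with unknown mean theta.
    return rng.normal(theta, 1.0, size=n)

def discrepancy(theta, y_obs_mean, rng):
    # Absolute difference between simulated and observed sample means.
    return abs(simulator(theta, rng).mean() - y_obs_mean)

def rbf(a, b, length=1.0, var=1.0):
    # Squared-exponential kernel between 1-D input arrays a and b.
    d = a[:, None] - b[None, :]
    return var * np.exp(-0.5 * (d / length) ** 2)

def gp_posterior(x_train, y_train, x_test, noise):
    # Standard zero-mean GP regression with Gaussian observation noise.
    K = rbf(x_train, x_train) + noise * np.eye(len(x_train))
    Ks = rbf(x_train, x_test)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = Ks.T @ alpha
    v = np.linalg.solve(L, Ks)
    var = np.diag(rbf(x_test, x_test)) - np.sum(v ** 2, axis=0)
    return mean, np.maximum(var, 1e-12)

rng = np.random.default_rng(0)
y_obs_mean = 2.0   # assumed observed summary statistic
noise = 0.05       # assumed noise variance of the discrepancy
eps = 0.3          # assumed ABC tolerance

# A small simulation budget: 15 evaluations on a fixed grid.
thetas = np.linspace(-2.0, 6.0, 15)
d = np.array([discrepancy(t, y_obs_mean, rng) for t in thetas])

# Fit the GP to the centred discrepancies and predict on a dense grid.
grid = np.linspace(-2.0, 6.0, 201)
mean, var = gp_posterior(thetas, d - d.mean(), grid, noise)
mean = mean + d.mean()

# Approximate likelihood: P(discrepancy < eps) under the GP predictive.
z = (eps - mean) / np.sqrt(var + noise)
lik = 0.5 * (1.0 + np.vectorize(erf)(z / sqrt(2.0)))

# Normalise on the grid (flat prior assumed) to get a posterior density.
step = grid[1] - grid[0]
post = lik / (lik.sum() * step)
theta_map = grid[np.argmax(post)]
print("approximate posterior mode:", theta_map)
```

Plain rejection ABC on the same toy problem would typically need many more simulator calls to reach a comparable approximation, which is the motivation for the surrogate when each simulation is expensive.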
Translated title of the contribution: Gaussin prosessi -surrogaattimenetelmiä likimääräiseen Bayesilaiseen päättelyyn
Original language: English
Qualification: Doctor's degree
Awarding Institution
  • Aalto University
Supervisors/Advisors
  • Marttinen, Pekka, Supervising Professor
  • Marttinen, Pekka, Thesis Advisor
  • Vehtari, Aki, Thesis Advisor
Publisher
Print ISBNs: 978-952-60-3996-1
Electronic ISBNs: 978-952-60-3997-8
Publication status: Published - 2020
MoE publication type: G5 Doctoral dissertation (article)

Keywords

  • approximate Bayesian computation
  • simulator-based models
  • Gaussian processes
  • Bayesian experimental design
  • uncertainty quantification
