Gaussian process modeling in approximate Bayesian computation to estimate horizontal gene transfer in bacteria

Research output: Contribution to journal (Article)

Details

Original language: English
Number of pages: 23
Journal: ANNALS OF APPLIED STATISTICS
State: Accepted/In press - Feb 2018
MoE publication type: A1 Journal article-refereed

Researchers

Research units

  • University of Edinburgh

Abstract

Approximate Bayesian computation (ABC) can be used for model fitting when the likelihood function is intractable but simulating from the model is feasible. However, even a single evaluation of a complex model may take several hours, limiting the number of model evaluations available. Modeling the discrepancy between the simulated and observed data using a Gaussian process (GP) can be used to reduce the number of model evaluations required by ABC, but the sensitivity of this approach to a specific GP formulation has not been thoroughly investigated. We begin with a comprehensive empirical evaluation of using GPs in ABC, including various transformations of the discrepancies and two novel GP formulations. Our results indicate the choice of GP may significantly affect the accuracy of the estimated posterior distribution. Selection of an appropriate GP model is thus important. We define expected utility to measure the accuracy of classifying discrepancies below or above the ABC threshold, and show that by using this utility, the GP model selection step can be made automatic. Finally, based on the understanding gained with toy examples, we fit a population genetic model for bacteria, providing insight into horizontal gene transfer events within the population and from external origins.
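The core idea in the abstract, replacing repeated simulation with a GP model of the discrepancy and then asking how likely the discrepancy is to fall below the ABC threshold, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the article: the toy Gaussian simulator, the absolute-difference discrepancy, the fixed squared-exponential kernel hyperparameters, the constant-noise GP (the article also studies transformed discrepancies and input-dependent noise), the uniform prior, and the threshold `eps`.

```python
import math
import numpy as np

def simulate(theta, rng, n=50):
    # Toy simulator (assumption): Gaussian data with unknown mean theta.
    return rng.normal(theta, 1.0, size=n)

def discrepancy(theta, observed_mean, rng):
    # Absolute difference between simulated and observed sample means.
    return abs(simulate(theta, rng).mean() - observed_mean)

def rbf(a, b, length=1.0, signal=1.0):
    # Squared-exponential kernel for 1-D inputs (hyperparameters fixed by hand).
    d = a[:, None] - b[None, :]
    return signal**2 * np.exp(-0.5 * (d / length) ** 2)

def gp_predict(x_train, y_train, x_test, noise=0.1):
    # GP regression with a constant mean and constant observation noise.
    mean = y_train.mean()
    K = rbf(x_train, x_train) + noise**2 * np.eye(len(x_train))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train - mean))
    Ks = rbf(x_train, x_test)
    mu = mean + Ks.T @ alpha
    v = np.linalg.solve(L, Ks)
    # Predictive variance of a new noisy discrepancy at each test point.
    var = rbf(x_test, x_test).diagonal() - np.sum(v**2, axis=0) + noise**2
    return mu, np.maximum(var, 1e-12)

def prob_below(mu, var, eps):
    # P(discrepancy < eps) under the Gaussian predictive distribution.
    z = (eps - mu) / np.sqrt(var)
    return 0.5 * (1.0 + np.vectorize(math.erf)(z / math.sqrt(2.0)))

rng = np.random.default_rng(0)
observed_mean = 2.0                       # stands in for the observed data
thetas = rng.uniform(-5, 5, size=40)      # only 40 simulator runs
discs = np.array([discrepancy(t, observed_mean, rng) for t in thetas])

grid = np.linspace(-5, 5, 201)
mu, var = gp_predict(thetas, discs, grid)

eps = 0.5                                 # ABC threshold (assumption)
weights = prob_below(mu, var, eps)        # unnormalised posterior, uniform prior
posterior = weights / (weights.sum() * (grid[1] - grid[0]))
theta_hat = grid[np.argmax(weights)]
print(f"approximate posterior mode: {theta_hat:.2f}")
```

In this sketch the posterior is evaluated on a grid using the GP alone, so the cost is 40 simulator calls rather than the many thousands a plain rejection sampler would typically need. How well the result matches the true posterior depends on the GP formulation, which is the sensitivity the abstract describes.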

Research areas

  • Approximate Bayesian computation, intractable likelihood, Gaussian process, input-dependent noise, model selection

ID: 10399374