In recent decades, much research has been devoted to expectation propagation (EP), a very general framework for approximate Bayesian inference that has proved fast and very accurate in many experimental comparisons. A challenge in applying EP in practice is that a numerically robust and computationally efficient implementation is not straightforward for many model specifications, and that the standard EP algorithm is not guaranteed to converge. This thesis considers the robust and efficient application of EP with Gaussian approximating families in three challenging inference problems. In addition, various experimental results are presented that compare the accuracy of EP with several alternative methods for approximate Bayesian inference.
The first inference problem considers Gaussian process (GP) regression with the Student-t observation model, where standard EP may run into convergence problems because the posterior distribution may contain multiple modes. This thesis illustrates the situations in which standard EP fails to converge, reviews modifications and alternative algorithms for improving convergence, and presents a robust EP implementation that relies primarily on parallel EP updates and, in difficult cases, uses a provably convergent double-loop algorithm with an adaptively selected step size.
The second inference problem considers multi-class GP classification with the multinomial probit model, where a straightforward EP implementation requires either multi-dimensional numerical integration or a factored posterior approximation for the latent values related to the different classes. This thesis describes a novel nested EP approach that does not require numerical integration and accurately approximates all between-class posterior dependencies of the latent values, yet still scales linearly in the number of classes.
The third inference problem considers nonlinear regression using two-layer neural networks (NNs) with sparsity-promoting hierarchical priors on the inputs, where the challenge is to construct sufficiently accurate and computationally efficient approximations for the likelihood terms, which depend nonlinearly on the network weights. This thesis describes a novel, computationally efficient EP approach for simultaneous approximate integration over the posterior distribution of the weights, the hierarchical scale parameters of the priors, and the residual scale. The approach enables flexible definition of weight priors with different sparsity properties, and it can be extended beyond standard activation functions and NN model structures to form flexible nonlinear predictors from multiple sparse linear models.
|Translated title of the contribution|Approksimatiivisia bayesilaisia päättelymenetelmiä regressioon ja luokitteluun gaussisilla prosesseilla ja neuroverkoilla|
|Publication status|Published - 2013|
|MoE publication type|G5 Doctoral dissertation (article)|
Keywords:
- approximate Bayesian inference
- expectation propagation
- Gaussian processes
- neural networks