Uncertainty Quantification in Deep Learning

Lassi Meronen

Research output: Thesis › Doctoral Thesis › Collection of Articles

Abstract

Deep learning has recently become a popular method for solving problems involving large data sets, and in many applications human performance has been exceeded. However, deep learning models tend to be overconfident in their predictions, especially when encountering new input samples that differ from anything the model has learned during training. This thesis aims to address this problem by developing uncertainty quantification techniques that allow deep learning models to better recognise the limits of their capabilities and when they should be uncertain in their predictions. Improved uncertainty quantification would enable deep learning models to be used in safety-critical applications that require reliable uncertainty estimates. Uncertainty quantification is improved through a Bayesian perspective, and making connections between neural networks and Gaussian processes is at the core of this research. Gaussian processes are principled Bayesian models known to provide reliable uncertainty estimates for their predictions, and the aim is to bring these desirable properties to deep learning models. Another key benefit of Gaussian processes in terms of uncertainty quantification is the possibility of including prior assumptions in the model through a covariance function. The results in this thesis show that similar prior assumptions can be induced in deep learning models through activation functions, allowing neural networks to replicate stationary Gaussian process behaviour with a Matérn covariance. This result fills a gap in the research connecting Gaussian processes and neural networks that has existed for over twenty years. The Matérn covariance is arguably the most used covariance function in Gaussian processes, which makes this result impactful.

This thesis considers two distinct parts contributing to uncertainty quantification: (1) encoding meaningful priors and (2) approximate inference. The main focus is on meaningful priors, but approximate inference is also addressed, as it is required to use Bayesian deep learning models in practice. The publications in this thesis present theoretical results that advance uncertainty quantification through model design, which allows conservative behaviour to be encoded into the model.

In addition, this thesis tackles the problem of the increasing size and computational requirements of modern deep learning models. This is also done with uncertainty quantification methods, by applying them to dynamic neural networks that aim to achieve improved performance within a limited computational budget. Computationally efficient uncertainty quantification methods that fit the computationally restricted regime of dynamic neural networks are introduced. The results show that uncertainty quantification improves decision-making in dynamic neural networks, which leads to better predictive performance. This means that high performance is achieved at a lower computational cost, making high-end deep learning models available on hardware with limited computational capacity, such as mobile devices. Improving dynamic neural network performance also helps decrease the energy consumption of large deep learning models.
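As background for the covariance function mentioned in the abstract, the following is a minimal NumPy sketch of the Matérn-5/2 covariance and of drawing functions from the corresponding Gaussian process prior. The function name and hyperparameter values are illustrative assumptions, and the sketch does not reproduce the activation-function construction developed in the thesis.

    import numpy as np

    def matern52(x1, x2, lengthscale=1.0, variance=1.0):
        # Matern-5/2 covariance for 1-D inputs:
        # k(r) = variance * (1 + sqrt(5) r / l + 5 r^2 / (3 l^2)) * exp(-sqrt(5) r / l)
        r = np.abs(x1[:, None] - x2[None, :])
        s = np.sqrt(5.0) * r / lengthscale
        return variance * (1.0 + s + s**2 / 3.0) * np.exp(-s)

    # The lengthscale and variance encode the prior assumptions on smoothness and scale.
    x = np.linspace(-3.0, 3.0, 200)
    K = matern52(x, x, lengthscale=0.7, variance=1.0)  # illustrative hyperparameter values
    samples = np.random.multivariate_normal(np.zeros_like(x), K + 1e-9 * np.eye(len(x)), size=3)
    print(samples.shape)  # (3, 200): three draws from the zero-mean GP prior

Changing the lengthscale or variance changes how smooth and how large the sampled functions are, which is the sense in which a covariance function encodes prior assumptions.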
Translated title of the contribution: Epävarmuuden arviointi syväoppimisessa
Original language: English
Qualification: Doctor's degree
Awarding Institution
  • Aalto University
Supervisors/Advisors
  • Solin, Arno, Supervising Professor
  • Solin, Arno, Thesis Advisor
Publisher
Print ISBNs: 978-952-64-1531-4
Electronic ISBNs: 978-952-64-1532-1
Publication status: Published - 2023
MoE publication type: G5 Doctoral dissertation (article)

Keywords

  • uncertainty quantification
  • Bayesian deep learning
  • Gaussian processes
