Interaction Detection with Probabilistic Deep Learning for Genetics

Translated title of the contribution: Interaction Detection with Probabilistic Deep Learning for Genetics

Research output: Doctoral Thesis, Collection of Articles

Abstract

Deep learning is an important machine learning tool in genetics because it can model nonlinear relations between genotypes and phenotypes, such as genetic interactions, without assuming the form of those relations. However, current deep learning approaches are restricted in genetics applications by (i) the lack of well-calibrated uncertainty estimates for the model and (ii) the limited availability of individual-level data for model training. This thesis aims to design principled approaches that address these shortcomings in two relevant statistical genetics applications: gene-gene interaction detection and genotype-phenotype prediction. First, we focus on interaction detection with deep learning. Using Bayesian principles, we provide calibrated uncertainty estimates for interaction detection in deep learning, which are used to control the statistical errors of detected interactions, e.g., the false positive rate and the false negative rate. For genetic interaction detection, we design a novel neural network architecture that increases the power to detect complex gene-gene interactions. It learns gene representations that aggregate information from all SNPs (single-nucleotide polymorphisms) of the genes being analyzed, and it considers complex forms of interaction between them beyond the multiplicative interactions considered so far. Moreover, we propose a new permutation procedure that gives calibrated null distributions of genetic interactions from the neural network. Second, we study deep learning models in the low-data regime. We improve deep learning prediction by incorporating domain knowledge through informative priors.
Specifically, we design informative Gaussian scale mixture priors that explicitly encode prior beliefs about feature sparsity and the signal-to-noise ratio of the data into deep learning models. These priors improve accuracy on regression tasks, such as genotype-phenotype prediction, especially when only a small training set is available. Moreover, we use representation similarity to better understand the working mechanisms of low-data deep learning approaches that share knowledge across multiple similar domains, such as transfer learning. We find that current representation similarity measures for deep learning models on multiple domains give counter-intuitive conclusions about the models' functional similarities because of the confounding effect of the input data structure. Therefore, we introduce a deconfounding step to adjust for the confounder, which improves the consistency of representation similarities with the functional similarities of models.
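As a rough illustration of how a Gaussian scale mixture prior can encode a belief about feature sparsity, the sketch below draws weights from a generic two-component mixture. It is not the prior designed in the thesis. The function name `sample_gsm_weights` and the parameters `p_active`, `slab_scale`, and `spike_scale` are hypothetical names chosen for this example.

```python
import numpy as np


def sample_gsm_weights(n_features, p_active=0.1, slab_scale=1.0,
                       spike_scale=0.01, rng=None):
    """Draw weights from a two-component Gaussian scale mixture prior.

    Each weight is zero-mean Gaussian, but its standard deviation is itself
    random: `slab_scale` with probability `p_active` (a relevant feature),
    otherwise `spike_scale` (a feature shrunk towards zero).
    All names and default values are illustrative assumptions.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Prior belief about sparsity: the expected fraction of active features.
    active = rng.random(n_features) < p_active
    # Mixing over scales is what makes this a Gaussian *scale mixture*.
    scales = np.where(active, slab_scale, spike_scale)
    weights = scales * rng.standard_normal(n_features)
    return weights, active


if __name__ == "__main__":
    weights, active = sample_gsm_weights(1000, p_active=0.05,
                                         rng=np.random.default_rng(0))
    print("fraction of active features:", active.mean())
    print("largest inactive |weight|:", np.abs(weights[~active]).max())
```

Lowering `p_active` expresses a stronger belief that few features matter, while `slab_scale` controls how large the weights of relevant features are allowed to be.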
Original language: English
Qualification: Doctor's degree
Awarding Institution
  • Aalto University
Supervisors/Advisors
  • Kaski, Samuel, Supervising Professor
  • Marttinen, Pekka, Thesis Advisor
Publisher
Print ISBN: 978-952-64-1218-4
Electronic ISBN: 978-952-64-1219-1
Status: Published - 2023
MoE publication type: G5 Doctoral dissertation (articles)
