Statistical and computational analysis of high-throughput ‘omics’ datasets for understanding the etiology and pathogenesis of autoimmune diseases

Research output: Thesis › Doctoral Thesis › Collection of Articles


The primary function of the human immune system is to maintain our wellbeing by protecting us from the harmful substances and microbes (i.e. pathogens) that we continuously encounter in our surroundings. However, a variety of factors can lead to immune system dysfunction, which in turn can give rise to various incurable diseases, including autoimmune diseases (ADs) such as type 1 diabetes (T1D), immunoglobulin G4-related disease (IgG4-RD) and systemic sclerosis (SSc). In ADs, the immune system fails to distinguish between pathogens and the body's own cells, and mistakenly attacks the body's healthy tissues. Unfortunately, the factors that trigger ADs (i.e. etiology) and the molecular mechanisms by which ADs develop (i.e. pathogenesis) remain poorly understood. Genetic and environmental factors, such as the gut microbiome, have been implicated in triggering or influencing the development of ADs, but the underlying mechanisms remain largely elusive. Therefore, the aim of this thesis is to further our understanding of the etiology and pathogenesis of ADs by performing robust statistical and computational analyses on high-throughput 'omics' datasets. More specifically, one aim was to study transcriptomics data from immune cells of T1D-susceptible infants in order to identify gene expression markers that can aid in predicting the onset of autoimmunity and/or characterizing disease progression. We found several genes to be associated with the pathogenesis of T1D, including IL32, which had not previously been associated with T1D. Another aim was to develop a personalised method that can robustly model longitudinal transcriptomics data from heterogeneous diseases and identify pathways associated with disease pathogenesis. When applied to T1D data, this method associated several key pathways with T1D pathogenesis that were missed by other methods.
Additionally, this thesis aimed to study the gut microbial architecture of IgG4-RD and SSc patients (metagenomics data) and identify potential sources of microbial signals that may contribute to the etiology of the two diseases. Among other results, we found a specific strain of Eggerthella lenta, which contains genes with the potential to influence the immune system, to be significantly overabundant in both diseases. Finally, this thesis aimed to identify the environmental and host-related factors that may influence the development of the highly dynamic early gut microbiome of T1D-susceptible infants. We linked several new factors to the development of the early gut microbiome, including household location at birth, maternal antibiotic treatments and average yearly increase in height and weight.
Translated title of the contribution: Statistical and computational analysis of high-throughput ‘omics’ datasets for understanding the etiology and pathogenesis of autoimmune diseases
Original language: English
Qualification: Doctor's degree
Awarding Institution
  • Aalto University
  • Lähdesmäki, Harri, Supervising Professor
  • Lähdesmäki, Harri, Thesis Advisor
Print ISBN: 978-952-64-0435-6
Electronic ISBN: 978-952-64-0436-3
Publication status: Published - 2021
MoE publication type: G5 Doctoral dissertation (article)


  • high-throughput data
  • statistical modelling
  • computational analysis
  • immune system
  • gut microbiome
  • genes
  • pathways
  • autoimmune diseases

