Enhancer prediction in the human genome by probabilistic modelling of the chromatin feature patterns

Research output: Journal article, Article, Scientific, peer-reviewed


Abstract

BACKGROUND: The binding sites of transcription factors (TFs) and the localisation of histone modifications in the human genome can be quantified by the chromatin immunoprecipitation assay coupled with next-generation sequencing (ChIP-seq). The resulting chromatin feature data has been successfully adopted for genome-wide enhancer identification by several unsupervised and supervised machine learning methods. However, the current methods predict different numbers and different sets of enhancers for the same cell type and do not utilise the pattern of the ChIP-seq coverage profiles efficiently.

RESULTS: In this work, we propose a PRobabilistic Enhancer PRedictIoN Tool (PREPRINT) that assumes characteristic coverage patterns of chromatin features at enhancers and employs a statistical model to account for their variability. PREPRINT defines probabilistic distance measures to quantify the similarity of the genomic query regions and the characteristic coverage patterns. The probabilistic scores of the enhancer and non-enhancer samples are utilised to train a kernel-based classifier. The performance of the method is demonstrated on ENCODE data for two cell lines. The predicted enhancers are computationally validated based on the transcriptional regulatory protein binding sites and compared to the predictions obtained by state-of-the-art methods.

CONCLUSION: PREPRINT compares favourably with state-of-the-art methods, especially when the methods are required to predict a larger set of enhancers. PREPRINT generalises successfully to data from a cell type not utilised for training, and often performs better than the previous methods. The enhancers predicted by PREPRINT are less sensitive to the choice of prediction threshold. PREPRINT identifies biologically validated enhancers not predicted by the competing methods. The enhancers predicted by PREPRINT can aid genome interpretation in functional genomics and clinical studies.
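
The pipeline outlined in the abstract (characteristic coverage patterns, probabilistic scores, kernel-based classifier) can be illustrated with a toy sketch. This is not the PREPRINT implementation. The abstract does not specify the statistical model or the classifier, so the sketch assumes a Poisson model per coverage bin and substitutes a simple kernel perceptron with an RBF kernel. All function names, parameters and the toy read counts are hypothetical.

```python
import math

def mean_profile(regions):
    """Average binned coverage over regions: a stand-in 'characteristic pattern'."""
    n = len(regions)
    return [sum(r[i] for r in regions) / n for i in range(len(regions[0]))]

def poisson_loglik(counts, pattern, eps=1e-9):
    """Log-likelihood of binned read counts under an ASSUMED Poisson model
    whose per-bin rate is given by the pattern."""
    return sum(c * math.log(m + eps) - (m + eps) - math.lgamma(c + 1)
               for c, m in zip(counts, pattern))

def scores(counts, enh_pattern, bg_pattern):
    """Feature vector: one probabilistic score per reference pattern."""
    return [poisson_loglik(counts, enh_pattern),
            poisson_loglik(counts, bg_pattern)]

def rbf(x, y, gamma=0.01):
    """Radial basis function kernel between two score vectors."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

def decision(x, X, y, alpha):
    """Kernel expansion over the training score vectors."""
    return sum(a * yj * rbf(xj, x) for a, yj, xj in zip(alpha, y, X))

def train_kernel_perceptron(X, y, epochs=20):
    """Kernel perceptron (a simple stand-in for a kernel-based classifier):
    increment alpha[i] whenever training sample i is misclassified."""
    alpha = [0] * len(X)
    for _ in range(epochs):
        for i, xi in enumerate(X):
            if y[i] * decision(xi, X, y, alpha) <= 0:
                alpha[i] += 1
    return alpha

def predict(x, X, y, alpha):
    """Return +1 (enhancer-like) or -1 (non-enhancer-like)."""
    return 1 if decision(x, X, y, alpha) > 0 else -1

# Hypothetical toy data: read counts in 5 bins around each region centre.
enhancers = [[1, 4, 9, 4, 1], [2, 5, 8, 5, 2], [1, 5, 10, 5, 1]]
background = [[2, 2, 2, 2, 2], [3, 2, 3, 2, 3], [2, 3, 2, 3, 2]]

enh_pattern = mean_profile(enhancers)
bg_pattern = mean_profile(background)

X = [scores(r, enh_pattern, bg_pattern) for r in enhancers + background]
y = [1] * len(enhancers) + [-1] * len(background)
alpha = train_kernel_perceptron(X, y)

def classify(counts):
    """Score a query region against both patterns, then classify it."""
    return predict(scores(counts, enh_pattern, bg_pattern), X, y, alpha)
```

Calling `classify([1, 5, 9, 5, 1])` scores a peaked query region against both patterns and passes the two scores to the trained classifier. A real application would combine many chromatin features and a properly regularised kernel classifier.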

Original language: English
Number of pages: 37
Journal: BMC Bioinformatics
Volume: 21
Issue number: 1
DOI (permanent links)
Status: Published - 20 July 2020
MoE publication type: A1 Journal article, refereed


  • Projects

    Regulation of the immune response and targeting of treatment in rheumatic diseases (Immuunivasteen säätely ja hoidon kohdennus reumataudeissa)

    Jokinen, E., Lähdesmäki, H., Rehn, E., Dumitrescu, A. & Osmala, M.

    01/01/2018 → 31/12/2020

    Project: Academy of Finland: Other research funding

    Personalised medicine in the prediction and prevention of type 1 diabetes (Yksilöllistetty lääketiede tyypin 1 diabeteksen ennustamisessa ja estossa)

    Lähdesmäki, H., Somani, J., Cheng, L., Tikhonov, G., Jokinen, E., Halla-aho, V. & Papatheodorou, D.

    01/09/2015 → 24/09/2019

    Project: Academy of Finland: Other research funding

    Equipment

    Science-IT

    Mikko Hakala (Manager)

    School of Science

    Equipment/facility: Facility
