Abstract
Motivation
Identification of small molecules in a biological sample remains a major bottleneck in molecular biology, despite a decade of rapid development of computational approaches for predicting molecular structures using mass spectrometry (MS) data. Recently, there has been increasing interest in utilizing other information sources, such as liquid chromatography (LC) retention time (RT), to improve identifications solely based on MS information, such as precursor mass-per-charge and tandem mass spectra (MS2).
Results
We put forward a probabilistic modelling framework to integrate MS and RT data of multiple features in an LC-MS experiment. We model the MS measurements and all pairwise retention order information as a Markov random field and use efficient approximate inference for scoring and ranking potential molecular structures. Our experiments show improved identification accuracy by combining MS2 data and retention orders using our approach, thereby outperforming state-of-the-art methods. Furthermore, we demonstrate the benefit of our model when only a subset of LC-MS features have MS2 measurements available besides MS1.
Availability and implementation
Software and data are freely available at https://github.com/aalto-ics-kepaco/msms_rt_score_integration.
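To make the modelling idea in the Results paragraph concrete, here is a minimal sketch of a pairwise Markov random field that combines MS-based candidate scores with retention order consistency. It is not the authors' implementation: the function names, the sigmoid edge potential, and the parameters `weight` and `k` are assumptions chosen for illustration, and the approximate inference mentioned in the abstract is replaced by exhaustive enumeration, which is only feasible for toy inputs.

```python
import itertools
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def max_marginals(ms_scores, rt_pred, rts, weight=0.5, k=1.0):
    """Exact max-marginals of a toy pairwise MRF, by brute force.

    ms_scores[i][r]: MS-based score in (0, 1] of candidate r for feature i
    rt_pred[i][r]:   hypothetical predicted retention score of that candidate
    rts[i]:          observed retention time of feature i
    weight:          assumed trade-off between MS and retention order terms
    k:               assumed steepness of the sigmoid edge potential
    """
    n = len(ms_scores)
    best = [[-math.inf] * len(s) for s in ms_scores]
    candidate_ranges = [range(len(s)) for s in ms_scores]
    for assignment in itertools.product(*candidate_ranges):
        # Node potentials: log MS score of each chosen candidate.
        node = sum(math.log(ms_scores[i][r]) for i, r in enumerate(assignment))
        # Edge potentials: reward candidate pairs whose predicted retention
        # order agrees with the observed order of the two features.
        edge = 0.0
        for i, j in itertools.combinations(range(n), 2):
            sign = 1.0 if rts[i] <= rts[j] else -1.0
            diff = rt_pred[j][assignment[j]] - rt_pred[i][assignment[i]]
            edge += math.log(sigmoid(k * sign * diff))
        score = (1.0 - weight) * node + weight * edge
        # Max-marginal: best joint score among assignments containing (i, r).
        for i, r in enumerate(assignment):
            best[i][r] = max(best[i][r], score)
    return best


def rank(marginals):
    """Candidate indices per feature, best first."""
    return [sorted(range(len(m)), key=lambda r: -m[r]) for m in marginals]


# Toy data: feature 0 elutes before feature 1. Candidate 1 of feature 0 has
# the better MS score, but its predicted retention contradicts that order.
ms_scores = [[0.4, 0.6], [1.0]]
rt_pred = [[1.0, 5.0], [3.0]]
rts = [2.0, 4.0]

ms_only = rank(max_marginals(ms_scores, rt_pred, rts, weight=0.0))
combined = rank(max_marginals(ms_scores, rt_pred, rts, weight=0.5))
```

In this toy case the MS-only ranking for feature 0 puts candidate 1 first, while the combined ranking puts candidate 0 first, because its predicted retention agrees with the observed elution order. Enumeration grows exponentially with the number of features, which is why a real data set needs approximate inference instead.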
Original language | English |
---|---|
Pages | 1724-1731 |
Number of pages | 8 |
Journal | Bioinformatics |
Volume | 37 |
Issue number | 12 |
DOIs | |
Publication status | Published - 27 Nov 2020 |
MoE publication type | A1 Journal article, refereed |
Fingerprint
Dive into the research topics of 'Probabilistic Framework for Integration of Mass Spectrum and Retention Time Information in Small Molecule Identification'. Together they form a unique fingerprint.
Datasets
-
Dataset: "Probabilistic Framework for Integration of Mass Spectrum and Retention Time Information in Small Molecule Identification"
Bach, E. (Creator), 1 Apr 2020
DOI: 10.5281/zenodo.4305918
Type: Dataset
-
aalto-ics-kepaco/msms_rt_score_integration: Relase for the Bioinformatics publication
Bach, E. (Creator), 2020
DOI: 10.5281/zenodo.4306269, https://doi.org/10.5281/zenodo.4306270
Type: Software or code