Data Used In "Fast Metabolite Identification With Input Output Kernel Regression"

  • Celine Brouard (Contributor)
  • Huibin Shen (Contributor)
  • Kai Dührkop (Contributor)
  • Florence d'Alché-Buc, Telecom Paris Tech, ComUE Paris-Saclay (Contributor)
  • Sebastian Böcker (Contributor)
  • Juho Rousu (Contributor)

Dataset

Description

This repository contains the data used in [1] to evaluate metabolite identification from tandem mass spectra. These data were extracted and processed in [2]. We used a subset of 4138 MS/MS spectra extracted from the GNPS public spectral library (https://gnps.ucsd.edu/ProteoSAFe/libraries.jsp) for training and evaluation. For the search step, we used molecular structures from PubChem as candidate sets.

Please mention and cite GNPS when using these data.

The implementation of the method proposed in [1] is available at: https://version.aalto.fi/gitlab/kepaco/Fast-metabolite-identification-with-IOKR

File descriptions:
• spectra.txt: information about the MS/MS spectra (GNPS identifier, compound name and InChI identifier)
• data_GNPS.mat: contains the molecular fingerprints, molecular formulas and InChI identifiers corresponding to the MS/MS spectra
• cv_ind.txt: indices of the cross-validation folds
• ind_eval.txt: indices of the examples used for evaluation
• candidates: fingerprints and InChI identifiers for the different candidate sets
• input_kernels: contains 24 input kernel matrices
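
As an illustration of how the fold indices might be used, the following Python sketch groups examples by cross-validation fold and builds one train/test split. The layout of cv_ind.txt is not documented here, so the sketch assumes one fold label per line, with line i giving the fold of the i-th spectrum. That layout is an assumption to check against the actual file. The names parse_fold_indices and split_by_fold are hypothetical helpers and are not part of the published implementation.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def parse_fold_indices(text: str) -> Dict[int, List[int]]:
    """Group example positions by fold label.

    ASSUMPTION: one fold label per non-empty line, where line i
    holds the fold of the i-th spectrum. Verify against cv_ind.txt.
    """
    folds: Dict[int, List[int]] = defaultdict(list)
    rows = [line.strip() for line in text.splitlines() if line.strip()]
    for example_idx, row in enumerate(rows):
        # int(float(...)) tolerates labels written as "1" or "1.0"
        folds[int(float(row))].append(example_idx)
    return dict(folds)


def split_by_fold(
    folds: Dict[int, List[int]], held_out: int
) -> Tuple[List[int], List[int]]:
    """Return (train, test) example positions for one held-out fold."""
    test = list(folds[held_out])
    train = sorted(
        idx
        for fold, members in folds.items()
        if fold != held_out
        for idx in members
    )
    return train, test


# Small in-memory sample standing in for the contents of cv_ind.txt
sample = "1\n2\n1\n3\n2\n"
folds = parse_fold_indices(sample)
train, test = split_by_fold(folds, held_out=2)
```

With the real file, the string would come from reading cv_ind.txt, for example `open("cv_ind.txt").read()`. The resulting positions are 0-based, so they may need shifting by one if they are used alongside 1-based indices from MATLAB-oriented files.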

References:

[1] Brouard, C., Shen, H., Dührkop, K., d'Alché-Buc, F., Böcker, S. and Rousu, J.: Fast metabolite identification with Input Output Kernel Regression. In the proceedings of ISMB 2016, Bioinformatics 32(12): i28-i36, 2016. DOI: https://doi.org/10.1093/bioinformatics/btw246

[2] Dührkop, K., Shen, H., Meusel, M., Rousu, J. and Böcker, S.: Searching molecular structure databases with tandem mass spectra using CSI:FingerID. PNAS, 112(41), 12580-12585, 2015. doi:10.1073/pnas.1509788112

Date made available: 1 Jan 2017

Research Output

Fast metabolite identification with Input Output Kernel Regression

Brouard, C., Shen, H., Dührkop, K., d'Alché-Buc, F., Böcker, S. & Rousu, J., 15 Jun 2016, In: Bioinformatics. 32, 12, p. i28-i36

Research output: Contribution to journal > Article > Scientific > peer-review

  • 28 Citations (Scopus)

    Cite this

    Brouard, C. (Contributor), Shen, H. (Contributor), Dührkop, K. (Contributor), d'Alché-Buc, F. (Contributor), Böcker, S. (Contributor), Rousu, J. (Contributor) (1 Jan 2017). Data Used In "Fast Metabolite Identification With Input Output Kernel Regression". DOI: 10.5281/zenodo.804240