Abstract
We report the results of a DREAM challenge designed to predict relative genetic essentialities based on a novel dataset testing 98,000 shRNAs against 149 molecularly characterized cancer cell lines. We analyzed the results of over 3,000 submissions over a period of 4 months. We found that algorithms combining essentiality data across multiple genes demonstrated increased accuracy; gene expression was the most informative molecular data type; the identity of the gene being predicted was far more important than the modeling strategy; well-predicted genes and selected molecular features showed enrichment in functional categories; and frequently selected expression features correlated with survival in primary tumors. This study establishes benchmarks for gene essentiality prediction, presents a community resource for future comparison with this benchmark, and provides insights into factors influencing the ability to predict gene essentiality from functional genetic screens. This study also demonstrates the value of releasing pre-publication data publicly to engage the community in an open research collaboration. Gönen et al. report the results of an open-participation DREAM challenge to critically assess the ability to predict gene essentiality on a novel functional screening dataset of 149 cancer cell lines.
Original language | English |
---|---|
Pages (from-to) | 485-497 |
Journal | Cell Systems |
Volume | 5 |
Issue number | 5 |
Publication status | Published - 2017 |
MoE publication type | A1 Journal article-refereed |
Keywords
- Cancer genomics
- Community challenge
- Crowdsourcing
- Functional screen
- Machine learning
- Oncogene