Knowledge elicitation via sequential probabilistic inference for high-dimensional prediction

Pedram Daee, Tomi Peltola, Marta Soare, Samuel Kaski

Research output: Contribution to journal › Article › Scientific › peer-review

Abstract

Prediction in a small-sized sample with a large number of covariates, the “small n, large p” problem, is challenging. This setting is encountered in multiple applications, such as in precision medicine, where obtaining additional data can be extremely costly or even impossible, and extensive research effort has recently been dedicated to finding principled solutions for accurate prediction. However, a valuable source of additional information, domain experts, has not yet been efficiently exploited. We formulate knowledge elicitation generally as a probabilistic inference process, where expert knowledge is sequentially queried to improve predictions. In the specific case of sparse linear regression, where we assume the expert has knowledge about the relevance of the covariates, or of values of the regression coefficients, we propose an algorithm and computational approximation for fast and efficient interaction, which sequentially identifies the most informative features on which to query expert knowledge. Evaluations of the proposed method in experiments with simulated and real users show improved prediction accuracy already with a small effort from the expert.
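
The sequential querying idea in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm. As simplifying assumptions, a plain Gaussian prior stands in for the sparse prior, the noise variance is treated as known, and expert feedback is modelled as a noisy observation of a single regression coefficient. The function names (`posterior`, `update_with_feedback`, `elicit`) and the `expert` callback are hypothetical. Under these assumptions the expected information gain of a query on coefficient j is 0.5 * log(1 + S_jj / fb_var), so each round queries the not-yet-queried coefficient with the largest posterior variance.

```python
import numpy as np


def posterior(X, y, noise_var=1.0, prior_var=1.0):
    """Gaussian posterior over weights for Bayesian linear regression.

    Assumes w ~ N(0, prior_var * I) and y ~ N(X w, noise_var * I).
    """
    p = X.shape[1]
    precision = X.T @ X / noise_var + np.eye(p) / prior_var
    cov = np.linalg.inv(precision)
    mean = cov @ X.T @ y / noise_var
    return mean, cov


def update_with_feedback(mean, cov, j, feedback, fb_var):
    """Condition on expert feedback f ~ N(w_j, fb_var) (rank-one update)."""
    gain_vec = cov[:, j] / (cov[j, j] + fb_var)
    new_mean = mean + gain_vec * (feedback - mean[j])
    new_cov = cov - np.outer(gain_vec, cov[j, :])
    return new_mean, new_cov


def elicit(X, y, expert, n_queries, noise_var=1.0, prior_var=1.0, fb_var=0.01):
    """Sequentially query the expert on the most informative coefficient.

    `expert` is a hypothetical callback: expert(j) returns the expert's
    value for coefficient j. Assumes n_queries <= number of features.
    """
    mean, cov = posterior(X, y, noise_var, prior_var)
    asked = np.zeros(X.shape[1], dtype=bool)
    order = []
    for _ in range(n_queries):
        # Expected information gain of observing w_j with noise fb_var.
        info_gain = 0.5 * np.log1p(np.diag(cov) / fb_var)
        info_gain = np.where(asked, -np.inf, info_gain)
        j = int(np.argmax(info_gain))
        mean, cov = update_with_feedback(mean, cov, j, expert(j), fb_var)
        asked[j] = True
        order.append(j)
    return mean, cov, order
```

In a simulated "small n, large p" setting, `expert` could simply return the true coefficient, for example `elicit(X, y, lambda j: w_true[j], n_queries=5)`. A real user study would replace it with an interactive prompt.
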
Original language: English
Pages (from-to): 1599-1620
Number of pages: 22
Journal: Machine Learning
Volume: 106
Issue number: 9
Publication status: Published - 12 Jul 2017
MoE publication type: A1 Journal article-refereed

Keywords

  • Bayesian methods
  • Experimental design
  • Human-to-machine transfer learning
  • Interactive machine learning
  • Statistics in high dimensions

Projects

  • Data-Driven Decision Support for Digital Health

    Kaski, S. (Principal investigator), Vuollekoski, H. (Project Member), Strahl, J. (Project Member), Niinimäki, T. (Project Member), Sundin, I. (Project Member), Blomstedt, P. (Project Member), Hegde, P. (Project Member), Daee, P. (Project Member) & Eranti, P. (Project Member)

    01/01/2016 → 30/06/2018

    Project: Academy of Finland: Other research funding

  • Interactive machine learning from multiple biodata sources

    Kaski, S. (Principal investigator) & Filstroff, L. (Project Member)

    01/01/2016 → 31/08/2021

    Project: Academy of Finland: Other research funding

  • Interactive machine learning from multiple biodata sources

    Kaski, S. (Principal investigator), Reinvall, J. (Project Member), Chen, Y. (Project Member), Daee, P. (Project Member), Qin, X. (Project Member), Jälkö, J. (Project Member), Pesonen, H. (Project Member), Blomstedt, P. (Project Member), Eranti, P. (Project Member), Hegde, P. (Project Member), Siren, J. (Project Member), Peltola, T. (Project Member), Celikok, M. M. (Project Member), Sundin, I. (Project Member), Kangas, J.-K. (Project Member), Afrabandpey, H. (Project Member), Honkamaa, J. (Project Member), Shen, Z. (Project Member) & Aushev, A. (Project Member)

    01/01/2016 → 31/12/2018

    Project: Academy of Finland: Other research funding
