Knowledge elicitation via sequential probabilistic inference for high-dimensional prediction

Research output: Journal article, peer-reviewed

Standard

Knowledge elicitation via sequential probabilistic inference for high-dimensional prediction. / Daee, Pedram; Peltola, Tomi; Soare, Marta; Kaski, Samuel.

In: Machine Learning, Vol. 106, No. 9, 12.07.2017, p. 1599-1620.

BibTeX - Download

@article{928fb288edf44b37a5508578891eeb88,
title = "Knowledge elicitation via sequential probabilistic inference for high-dimensional prediction",
abstract = "Prediction in a small-sized sample with a large number of covariates, the “small n, large p” problem, is challenging. This setting is encountered in multiple applications, such as in precision medicine, where obtaining additional data can be extremely costly or even impossible, and extensive research effort has recently been dedicated to finding principled solutions for accurate prediction. However, a valuable source of additional information, domain experts, has not yet been efficiently exploited. We formulate knowledge elicitation generally as a probabilistic inference process, where expert knowledge is sequentially queried to improve predictions. In the specific case of sparse linear regression, where we assume the expert has knowledge about the relevance of the covariates, or of values of the regression coefficients, we propose an algorithm and computational approximation for fast and efficient interaction, which sequentially identifies the most informative features on which to query expert knowledge. Evaluations of the proposed method in experiments with simulated and real users show improved prediction accuracy already with a small effort from the expert.",
keywords = "Bayesian methods, Experimental design, Human-to-machine transfer learning, Interactive machine learning, Statistics in high dimensions",
author = "Pedram Daee and Tomi Peltola and Marta Soare and Samuel Kaski",
year = "2017",
month = "7",
day = "12",
doi = "10.1007/s10994-017-5651-7",
language = "English",
volume = "106",
pages = "1599--1620",
journal = "Machine Learning",
issn = "0885-6125",
publisher = "Springer Netherlands",
number = "9",

}

RIS - Download

TY - JOUR

T1 - Knowledge elicitation via sequential probabilistic inference for high-dimensional prediction

AU - Daee, Pedram

AU - Peltola, Tomi

AU - Soare, Marta

AU - Kaski, Samuel

PY - 2017/7/12

Y1 - 2017/7/12

N2 - Prediction in a small-sized sample with a large number of covariates, the “small n, large p” problem, is challenging. This setting is encountered in multiple applications, such as in precision medicine, where obtaining additional data can be extremely costly or even impossible, and extensive research effort has recently been dedicated to finding principled solutions for accurate prediction. However, a valuable source of additional information, domain experts, has not yet been efficiently exploited. We formulate knowledge elicitation generally as a probabilistic inference process, where expert knowledge is sequentially queried to improve predictions. In the specific case of sparse linear regression, where we assume the expert has knowledge about the relevance of the covariates, or of values of the regression coefficients, we propose an algorithm and computational approximation for fast and efficient interaction, which sequentially identifies the most informative features on which to query expert knowledge. Evaluations of the proposed method in experiments with simulated and real users show improved prediction accuracy already with a small effort from the expert.

KW - Bayesian methods

KW - Experimental design

KW - Human-to-machine transfer learning

KW - Interactive machine learning

KW - Statistics in high dimensions

UR - https://github.com/HIIT/knowledge-elicitation-for-linear-regression

U2 - 10.1007/s10994-017-5651-7

DO - 10.1007/s10994-017-5651-7

M3 - Article

VL - 106

SP - 1599

EP - 1620

JO - Machine Learning

JF - Machine Learning

SN - 0885-6125

IS - 9

ER -

ID: 14181192