Projective inference in high-dimensional problems: Prediction and feature selection

Juho Piironen, Markus Paasiniemi, Aki Vehtari

Research output: Contribution to journal › Article › Scientific › peer-review

68 Citations (Scopus)
169 Downloads (Pure)

Abstract

This paper reviews predictive inference and feature selection for generalized linear models with scarce but high-dimensional data. We demonstrate that in many cases one can benefit from a decision-theoretically justified two-stage approach: first, construct a possibly non-sparse model that predicts well, and then find a minimal subset of features that characterize the predictions. The model built in the first step is referred to as the reference model, and the operation performed in the second step as predictive projection. The key characteristic of this approach is that it finds an excellent tradeoff between sparsity and predictive accuracy, and the gain comes from utilizing all available information, including the prior and the information coming from the left-out features. We review several methods that follow this principle and provide novel methodological contributions. We present a new projection technique that unifies two existing techniques and is both accurate and fast to compute. We also propose a way of evaluating the feature selection process using fast leave-one-out cross-validation that allows for easy and intuitive model size selection. Furthermore, we prove a theorem that helps in understanding the conditions under which the projective approach could be beneficial. The key ideas are illustrated via several experiments using simulated and real-world data.
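
The two-stage idea in the abstract can be illustrated with a deliberately simplified sketch. It is not the projection technique proposed in the paper: it assumes a Gaussian observation model, uses a ridge point estimate as a stand-in for a Bayesian reference model, and relies on the fact that, in the Gaussian case, projecting onto a feature subset amounts to a least-squares fit of the reference model's fitted values. The function names (`fit_reference`, `project`, `forward_search`), the penalty `alpha`, and the simulated data are all hypothetical choices made for illustration.

```python
import numpy as np

def fit_reference(X, y, alpha=1.0):
    """Stage 1 (assumption: ridge regression stands in for the reference model).
    Returns the reference model's fitted values on the training inputs."""
    p = X.shape[1]
    beta = np.linalg.solve(X.T @ X + alpha * np.eye(p), X.T @ y)
    return X @ beta

def project(X, mu_ref, subset):
    """Stage 2: fit the submodel to the reference fit (not to y) by least squares.
    Returns the projected coefficients and the mean squared discrepancy."""
    Xs = X[:, subset]
    coef, *_ = np.linalg.lstsq(Xs, mu_ref, rcond=None)
    discrepancy = np.mean((mu_ref - Xs @ coef) ** 2)
    return coef, discrepancy

def forward_search(X, mu_ref, max_size):
    """Greedy search: at each step add the feature whose inclusion brings the
    projected submodel closest to the reference fit."""
    selected, path = [], []
    remaining = list(range(X.shape[1]))
    for _ in range(max_size):
        best = min(remaining,
                   key=lambda j: project(X, mu_ref, selected + [j])[1])
        selected.append(best)
        remaining.remove(best)
        path.append(project(X, mu_ref, selected)[1])
    return selected, path

# Simulated data (hypothetical): few observations, many features,
# only the first two features carry signal.
rng = np.random.default_rng(0)
n, p = 50, 200
X = rng.standard_normal((n, p))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 0.5 * rng.standard_normal(n)

mu_ref = fit_reference(X, y, alpha=1.0)
selected, path = forward_search(X, mu_ref, max_size=5)
print("selected features:", selected)
print("discrepancy to reference fit by model size:", path)
```

Because the submodels along the search path are nested, the discrepancy to the reference fit cannot increase as features are added; choosing a model size then comes down to deciding when the remaining discrepancy is small enough, which the paper addresses with cross-validation rather than with the in-sample quantity printed here.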

Original language: English
Pages (from-to): 2155-2197
Number of pages: 43
Journal: Electronic Journal of Statistics
Volume: 14
Issue number: 1
DOIs
Publication status: Published - 1 Jan 2020
MoE publication type: A1 Journal article-refereed

Funding

We thank anonymous reviewers for useful comments, Michael Riis Andersen for helpful discussions, and the Academy of Finland (grants 298742 and 313122) for partial funding. We also acknowledge the computational resources provided by the Aalto Science-IT project and support from the Academy of Finland Flagship programme: Finnish Center for Artificial Intelligence, FCAI.

Keywords

  • Feature selection
  • Post-selection inference
  • Prediction
  • Projection
  • Sparsity

Projects and facilities

  • Finnish Center for Artificial Intelligence

    Kaski, S. (Principal investigator)

    01/01/2019 to 31/12/2022

    Project: Academy of Finland: Other research funding

  • Reliable Automated Bayesian Machine Learning

    Vehtari, A. (Principal investigator), Pavone, F. (Project Member), Koistinen, O.-P. (Project Member), Magnusson, M. (Project Member), Ghosh, K. (Project Member) & Dhaka, A. (Project Member)

    01/01/2018 to 31/12/2019

    Project: Academy of Finland: Other research funding

  • Computational methods for survival analysis

    Vehtari, A. (Principal investigator), Dhaka, A. (Project Member), Siivola, E. (Project Member), Paananen, T. (Project Member), Andersen, M. (Project Member), Säilynoja, T. (Project Member), Magnusson, M. (Project Member) & Sivula, T. (Project Member)

    01/09/2016 to 31/08/2020

    Project: Academy of Finland: Other research funding

  • Science-IT

    Hakala, M. (Manager)

    School of Science

    Facility/equipment: Facility
