Projects per year
Abstract
Bayesian cross-validation (CV) is a popular method for predictive model assessment that is simple to implement and broadly applicable. A wide range of CV schemes is available for time series applications, including generic leave-one-out (LOO) and K-fold methods, as well as specialized approaches intended to deal with serial dependence such as leave-future-out (LFO), h-block, and hv-block.
Existing large-sample results show that both specialized and generic methods are applicable to models of serially-dependent data. However, large sample consistency results overlook the impact of sampling variability on accuracy in finite samples. Moreover, the accuracy of a CV scheme depends on many aspects of the procedure. We show that poor design choices can lead to elevated rates of adverse selection.
In this paper, we consider the problem of identifying the regression component of an important class of models of data with serial dependence, autoregressions of order p with q exogenous regressors (ARX(p,q)
), under the logarithmic scoring rule. We show that when serial dependence is present, scores computed using the joint (multivariate) density have lower variance and better model selection accuracy than the popular pointwise estimator. In addition, we present a detailed case study of the special case of ARX models with fixed autoregressive structure and variance. For this class, we derive the finite-sample distribution of the CV estimators and the model selection statistic. We conclude with recommendations for practitioners.
Existing large-sample results show that both specialized and generic methods are applicable to models of serially-dependent data. However, large sample consistency results overlook the impact of sampling variability on accuracy in finite samples. Moreover, the accuracy of a CV scheme depends on many aspects of the procedure. We show that poor design choices can lead to elevated rates of adverse selection.
In this paper, we consider the problem of identifying the regression component of an important class of models of data with serial dependence, autoregressions of order p with q exogenous regressors (ARX(p,q)
), under the logarithmic scoring rule. We show that when serial dependence is present, scores computed using the joint (multivariate) density have lower variance and better model selection accuracy than the popular pointwise estimator. In addition, we present a detailed case study of the special case of ARX models with fixed autoregressive structure and variance. For this class, we derive the finite-sample distribution of the CV estimators and the model selection statistic. We conclude with recommendations for practitioners.
Original language | English |
---|---|
Pages (from-to) | 1-25 |
Journal | Bayesian Analysis |
DOIs | |
Publication status | E-pub ahead of print - 2024 |
MoE publication type | A1 Journal article-refereed |
Fingerprint
Dive into the research topics of 'Cross-Validatory Model Selection for Bayesian Autoregressions with Exogenous Regressors'. Together they form a unique fingerprint.-
Iterative Bayes Vehtari: Safe iterative Bayesian model building
Vehtari, A. (Principal investigator), Lindgren, L. (Project Member), Johnson, A. (Project Member), Säilynoja, T. (Project Member) & Siccha, N. (Project Member)
01/09/2021 → 31/08/2025
Project: Academy of Finland: Other research funding
-
-: Finnish Center for Artificial Intelligence
Kaski, S. (Principal investigator)
01/01/2019 → 31/12/2022
Project: Academy of Finland: Other research funding