Abstract
In this paper we address the problem of performing statistical inference for large-scale data sets whose volume and dimensionality may be so high that they cannot be processed or stored in a single unit. In particular, we focus on bootstrapping-based methods that can provide quantitative information on the accuracy of the inference, such as confidence intervals, without explicit assumptions on the probability models. We propose a scalable distributed bootstrap method that uses iterative estimation equations favoring sparse solutions. Scalability is achieved by applying bootstrapping to multiple smaller distinct subsets generated by resampling the full data without replacement, similarly to the BLB method [1]. Iteratively reweighted ℓ1-norm minimizing estimation equations are applied to each bootstrap sample. Such estimators allow for parameter estimation and inference even for moderately underdetermined systems, and they perform variable selection by promoting a sparse parameter vector. Estimation problems may become underdetermined for the distinct subsets of data even if the full large-scale problem is overdetermined. The performance of the presented approach is studied in extensive simulations. It is demonstrated that the method gives smaller root MSE and significantly lower bias than a bootstrap employing the widely used sparse estimator BPDN. Moreover, better performance is obtained in variable selection in terms of classification error rate (CER) and recovery rate (RER) in identifying sparse parameters. Estimated confidence intervals are also highly concentrated about the true parameter values.
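The pipeline described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not state the estimating equations, so the code below substitutes a standard reweighted ℓ1 scheme (penalty weights 1/(|β| + ε)) solved by proximal gradient descent (ISTA), and it combines subset results in the BLB (Bag of Little Bootstraps) manner by averaging percentile interval endpoints. All function names, the subset size n^0.7, the penalty level `lam`, and the synthetic data are assumptions chosen for illustration.

```python
import numpy as np

def soft_threshold(z, t):
    # Proximal operator of the (weighted) l1 norm.
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def weighted_lasso_ista(X, y, obs_w, pen_w, lam, n_iter=300):
    # Minimizes 0.5 * sum_i obs_w[i] * (y_i - x_i^T beta)^2
    #           + lam * sum_j pen_w[j] * |beta_j|   by ISTA.
    Xw = X * obs_w[:, None]
    step_inv = np.linalg.norm(X.T @ Xw, 2)  # Lipschitz constant of the gradient
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        grad = Xw.T @ (X @ beta - y)
        beta = soft_threshold(beta - grad / step_inv, lam * pen_w / step_inv)
    return beta

def reweighted_l1(X, y, obs_w, lam, n_reweight=4, eps=1e-3):
    # Assumed stand-in for the paper's estimator: each pass penalizes
    # coefficients that were small in the previous pass more heavily.
    pen_w = np.ones(X.shape[1])
    for _ in range(n_reweight):
        beta = weighted_lasso_ista(X, y, obs_w, pen_w, lam)
        pen_w = 1.0 / (np.abs(beta) + eps)
    return beta

def distributed_bootstrap_ci(X, y, lam, n_subsets=3, n_boot=20,
                             alpha=0.05, seed=0):
    rng = np.random.default_rng(seed)
    n, p = X.shape
    b = int(n ** 0.7)  # subset size (assumption)
    lowers, uppers = [], []
    for _ in range(n_subsets):
        # Distinct subset: sampled WITHOUT replacement from the full data.
        idx = rng.choice(n, size=b, replace=False)
        Xs, ys = X[idx], y[idx]
        est = np.empty((n_boot, p))
        for r in range(n_boot):
            # A bootstrap sample of size n is represented by multinomial
            # counts over the b subset points instead of n stored rows.
            counts = rng.multinomial(n, np.full(b, 1.0 / b))
            est[r] = reweighted_l1(Xs, ys, counts.astype(float), lam)
        lowers.append(np.quantile(est, alpha / 2, axis=0))
        uppers.append(np.quantile(est, 1 - alpha / 2, axis=0))
    # Average the per-subset interval endpoints.
    return np.mean(lowers, axis=0), np.mean(uppers, axis=0)

# Synthetic example (hypothetical data, not from the paper).
rng = np.random.default_rng(1)
n, p = 2000, 20
beta_true = np.zeros(p)
beta_true[[0, 5, 12]] = [2.0, -1.5, 1.0]
X = rng.standard_normal((n, p))
y = X @ beta_true + 0.1 * rng.standard_normal(n)
lo, hi = distributed_bootstrap_ci(X, y, lam=20.0)
```

Each subset could be processed on a separate node, since only the interval endpoints need to be communicated back. The multinomial counts keep the per-node cost tied to the subset size b while still mimicking a bootstrap sample of the full size n.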
Original language | English |
---|---|
Title of host publication | Conference Record of the 52nd Asilomar Conference on Signals, Systems and Computers, ACSSC 2018 |
Editors | Michael B. Matthews |
Publisher | IEEE |
Pages | 769-773 |
Number of pages | 5 |
Volume | 2018-October |
ISBN (Electronic) | 9781538692189 |
DOIs | |
Publication status | Published - 2018 |
MoE publication type | A4 Article in a conference publication |
Event | Asilomar Conference on Signals, Systems & Computers, Pacific Grove, United States. Duration: 28 Oct 2018 → 31 Oct 2018. Conference number: 52 |
Publication series
Name | Conference Record of the Asilomar Conference on Signals Systems and Computers |
---|---|
ISSN (Print) | 1058-6393 |
Conference
Conference | Asilomar Conference on Signals, Systems & Computers |
---|---|
Abbreviated title | ACSSC |
Country/Territory | United States |
City | Pacific Grove |
Period | 28/10/2018 → 31/10/2018 |
Keywords
- bootstrap
- parameter estimation
- scalable inference
- sparse methods
- underdetermined systems
Fingerprint
Research topics of 'Scalable Statistical Inference Using Distributed Bootstrapping and Iterative ℓ1-Norm Minimization'.
Projects
- 1 Finished
Statistical Signal Processing Theory and Computational Methods for Large Scale Data Analysis
Rajamäki, R., Koivunen, V., Mozafari Majd, M., Basiri, S., Oksanen, J., Pölönen, K., Chis, A. & Halme, T.
01/09/2015 → 31/08/2019
Project: Academy of Finland: Other research funding