Abstract
In this paper we address the problem of performing statistical inference for large-scale data sets whose volume and dimensionality may be so high that the data cannot be processed or stored in a single unit. In particular, we focus on bootstrap-based methods that can provide quantitative information on the accuracy of the inference, such as confidence intervals, without explicit assumptions on the probability models. We propose a scalable distributed bootstrap method that uses iterative estimation equations favoring sparse solutions. Scalability is achieved by applying bootstrapping to multiple smaller distinct subsets generated by resampling the full data without replacement, similarly to the BLB method [1]. Iteratively reweighted ℓ1-norm minimizing estimation equations are applied to each bootstrap sample. Such estimators allow parameter estimation and inference even for moderately underdetermined systems, and they perform variable selection by promoting a sparse parameter vector. Estimation problems may become underdetermined for the distinct subsets of data even if the full large-scale problem is overdetermined. The performance of the presented approach is studied in extensive simulations. It is demonstrated that the method gives a smaller root MSE and significantly lower bias than a bootstrap employing the widely used sparse estimator BPDN. Moreover, better performance is obtained in variable selection in terms of classification error rate (CER) and recovery rate (RER) in identifying sparse parameters. The estimated confidence intervals are also highly concentrated about the true parameter values.
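The abstract does not spell out the estimating equations, so the following is only a minimal sketch of the general idea under stated assumptions: disjoint subsets drawn without replacement, BLB-style multinomial resampling weights of total size n on each subset, and a weighted least-squares fit with an iteratively reweighted ℓ1 penalty. The solver (plain ISTA), the percentile intervals, the averaging across subsets, and all names and defaults (`reweighted_l1`, `distributed_bootstrap_ci`, `lam`, `eps`, the subset size n^0.7) are illustrative choices, not details taken from the paper.

```python
import numpy as np


def soft_threshold(z, t):
    """Elementwise soft-thresholding, the proximal operator of the l1 norm."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)


def reweighted_l1(X, y, obs_w, lam=0.1, n_reweight=4, n_ista=200, eps=1e-3):
    """Weighted least squares with an iteratively reweighted l1 penalty.

    Hypothetical helper: each inner problem is solved with plain ISTA,
    which is an assumption of this sketch, not taken from the paper.
    """
    total = obs_w.sum()
    Xw = X * obs_w[:, None]
    G = X.T @ Xw / total                  # weighted Gram matrix
    c = Xw.T @ y / total                  # weighted correlation with y
    step = 1.0 / max(np.linalg.eigvalsh(G)[-1], 1e-12)  # 1 / Lipschitz constant
    beta = np.zeros(X.shape[1])
    pen = np.ones(X.shape[1])             # per-coefficient penalty weights
    for _ in range(n_reweight):
        for _ in range(n_ista):
            beta = soft_threshold(beta - step * (G @ beta - c), step * lam * pen)
        pen = 1.0 / (np.abs(beta) + eps)  # small coefficients are penalized harder
    return beta


def distributed_bootstrap_ci(X, y, n_subsets=4, subset_size=None,
                             n_boot=30, alpha=0.05, seed=0):
    """Percentile confidence intervals averaged over disjoint data subsets."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    b = subset_size or int(n ** 0.7)      # assumed subset size, as in BLB practice
    if n_subsets * b > n:
        raise ValueError("n_subsets * subset_size must not exceed n")
    perm = rng.permutation(n)             # sampling without replacement
    lower = np.empty((n_subsets, p))
    upper = np.empty((n_subsets, p))
    for k in range(n_subsets):            # each subset could run on its own node
        idx = perm[k * b:(k + 1) * b]
        Xs, ys = X[idx], y[idx]
        est = np.empty((n_boot, p))
        for r in range(n_boot):
            # Multinomial counts emulate a size-n resample of the b stored points.
            w = rng.multinomial(n, np.full(b, 1.0 / b)).astype(float)
            est[r] = reweighted_l1(Xs, ys, w)
        lower[k], upper[k] = np.quantile(est, [alpha / 2, 1 - alpha / 2], axis=0)
    return lower.mean(axis=0), upper.mean(axis=0)
```

Calling `distributed_bootstrap_ci(X, y)` on an n-by-p design matrix `X` and response vector `y` returns two length-p arrays with the averaged lower and upper interval endpoints. Because each subset only touches its own b rows, the outer loop is the part that would be distributed across storage or compute units.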
Original language | English |
---|---|
Title | Conference Record of the 52nd Asilomar Conference on Signals, Systems and Computers, ACSSC 2018 |
Editors | Michael B. Matthews |
Publisher | IEEE |
Pages | 769-773 |
Number of pages | 5 |
Volume | 2018-October |
ISBN (electronic) | 9781538692189 |
DOI (permanent links) | |
Status | Published - 2018 |
OKM publication type | A4 Article in conference proceedings |
Event | Asilomar Conference on Signals, Systems & Computers - Pacific Grove, United States. Duration: 28 Oct 2018 → 31 Oct 2018. Conference number: 52 |
Publication series
Name | Conference Record of the Asilomar Conference on Signals Systems and Computers |
---|---|
ISSN (print) | 1058-6393 |
Conference
Conference | Asilomar Conference on Signals, Systems & Computers |
---|---|
Abbreviation | ACSSC |
Country/Territory | United States |
City | Pacific Grove |
Period | 28/10/2018 → 31/10/2018 |
Fingerprint
Dive into the research topics of 'Scalable Statistical Inference Using Distributed Bootstrapping and Iterative ℓ1-Norm Minimization'. Together they form a unique fingerprint.
Projects
- 1 Finished
- Statistical signal processing theory and computational methods for large-scale data analysis
Koivunen, V. (Principal investigator), Basiri, S. (Project member), Mozafari Majd, M. (Project member), Rajamäki, R. (Project member), Chis, A. (Project member), Oksanen, J. (Project member), Pölönen, K. (Project member) & Halme, T. (Project member)
01/09/2015 → 31/08/2019
Project: Academy of Finland: Other research funding