Abstract
Training SVMs on high-dimensional feature vectors in one shot incurs a high computational cost. A low-dimensional representation reduces computational overhead and improves classification speed. Low dimensionality also reduces the risk of over-fitting and tends to improve the generalisation ability of classification algorithms. For many important applications, however, the dimensionality may remain prohibitively high even after feature selection. In this paper, we address these issues primarily in the context of handwritten digit data. In particular, we make the following contributions: (1) we introduce the α-minimum feature cover (α-MFC) problem and prove it to be NP-hard; (2) we investigate the efficacy of a divide-and-conquer ensemble method for SVMs based on segmentation of the feature space (FS-SVMs); and (3) we propose a greedy algorithm for finding an approximate α-MFC using FS-SVMs.
Original language | English |
---|---|
Pages (from-to) | 411-436 |
Number of pages | 26 |
Journal | International Journal of Data Mining, Modelling and Management |
Volume | 1 |
Issue number | 4 |
Publication status | Published - 2009 |
MoE publication type | A1 Journal article-refereed |
Keywords
- Approximation algorithms
- Classification
- Dimensionality reduction
- Feature selection
- Greedy algorithms
- Support vector machines
- SVMs