Abstract
For large-scale multi-class classification problems consisting of tens of thousands of target categories, recent works have emphasized the need to store billions of parameters. For instance, the classical l2-norm regularization employed by a state-of-the-art method results in a model size of 17GB for a training set whose size is only 129MB. In contrast, by using a mixed-norm regularization approach, we show that around 99.5% of the stored parameters are dispensable noise. Using this strategy, we can extract the information relevant for classification, which is contained in the remaining 0.5% of the parameters, and hence demonstrate a drastic reduction in model size. Furthermore, the proposed method improves generalization performance compared to state-of-the-art methods, especially for under-represented categories. Lastly, our method is easy to parallelize and scales well to tens of thousands of target categories.
| Original language | English |
|---|---|
| Title of host publication | Proceedings of the 2016 SIAM International Conference on Data Mining (SDM) |
| Publisher | Society for Industrial and Applied Mathematics |
| Pages | 234-242 |
| Number of pages | 9 |
| ISBN (electronic) | 978-1-61197-434-8 |
| DOI - permanent links | |
| Status | Published - 2016 |
| OKM publication type | A4 Article in conference proceedings |
| Event | SIAM INTERNATIONAL CONFERENCE ON DATA MINING - Miami, United States. Duration: 5 May 2016 → 7 May 2016. https://archive.siam.org/meetings/sdm16/ |
Conference
| Conference | SIAM INTERNATIONAL CONFERENCE ON DATA MINING |
|---|---|
| Abbreviated title | SDM |
| Country/Territory | United States |
| City | Miami |
| Period | 05/05/2016 → 07/05/2016 |
| Website | |
Fingerprint
Dive into the research topics of 'TerseSVM: A scalable approach for learning compact models in Large-scale classification'. Together they form a unique fingerprint.