Towards Memory-Efficient Training for Extremely Large Output Spaces : Learning with 670k Labels on a Single Commodity GPU

Erik Schultheis*, Rohit Babbar

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference article in proceedings › Scientific › peer-review

Abstract

In classification problems with large output spaces (up to millions of labels), the last layer can require an enormous amount of memory. Using sparse connectivity would drastically reduce the memory requirements, but as we show below, applied naïvely it can result in much diminished predictive performance. Fortunately, we found that this can be mitigated by introducing an intermediate layer of intermediate size. We further demonstrate that one can constrain the connectivity of the sparse layer to be of constant fan-in, in the sense that each output neuron will have the exact same number of incoming connections, which allows for more efficient implementations, especially on GPU hardware. The CUDA implementation of our approach is provided at https://github.com/xmc-aalto/ecml23-sparse.
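To make the constant fan-in constraint concrete, the following is a minimal pure-Python sketch and not the authors' CUDA implementation. It assumes uniformly random connectivity, and the names `make_fixed_fan_in_layer`, `forward`, and `parameter_counts` are hypothetical helpers introduced only for illustration. Because every output neuron has exactly the same number of incoming connections, the layer can be stored as two rectangular arrays (indices and weights) with no ragged rows.

```python
import random


def make_fixed_fan_in_layer(num_inputs, num_outputs, fan_in, seed=0):
    """Hypothetical helper: give every output exactly `fan_in` distinct inputs.

    Connectivity is drawn uniformly at random, which is an assumption of this
    sketch and not necessarily how the paper selects connections.
    """
    rng = random.Random(seed)
    # indices[j] lists the input positions feeding output neuron j.
    indices = [rng.sample(range(num_inputs), fan_in) for _ in range(num_outputs)]
    # weights[j][k] is the weight on the connection from indices[j][k] to j.
    scale = 1.0 / fan_in ** 0.5
    weights = [[rng.gauss(0.0, scale) for _ in range(fan_in)]
               for _ in range(num_outputs)]
    return indices, weights


def forward(x, indices, weights):
    """Compute y[j] = sum_k weights[j][k] * x[indices[j][k]] for every output j."""
    return [sum(w * x[i] for w, i in zip(w_row, i_row))
            for w_row, i_row in zip(weights, indices)]


def parameter_counts(num_inputs, num_outputs, fan_in):
    """Return (dense weight count, values stored by the fixed fan-in layer)."""
    dense = num_inputs * num_outputs
    # One weight plus one input index per connection.
    sparse = 2 * num_outputs * fan_in
    return dense, sparse


# Illustrative sizes only: 670,000 labels as in the title; the input width of
# 768 and the fan-in of 32 are assumed values, not taken from the paper.
dense, sparse = parameter_counts(768, 670_000, 32)
print(f"dense weights: {dense:,}  fixed fan-in stored values: {sparse:,}")
```

The rectangular layout is what makes this form of sparsity GPU-friendly: each output row does the same amount of work, so no per-row offset array or load balancing is needed. The sketch covers only the sparse layer itself; the intermediate layer that the abstract credits with recovering predictive performance is not modelled here.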

Original language: English
Title of host publication: Machine Learning and Knowledge Discovery in Databases
Subtitle of host publication: Research Track - European Conference, ECML PKDD 2023, Proceedings
Editors: Danai Koutra, Claudia Plant, Manuel Gomez Rodriguez, Elena Baralis, Francesco Bonchi
Publisher: Springer
Pages: 689-704
Number of pages: 16
ISBN (Print): 978-3-031-43417-4
DOIs
Publication status: Published - 2023
MoE publication type: A4 Conference publication
Event: European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases - Turin, Italy
Duration: 18 Sept 2023 to 22 Sept 2023

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Publisher: Springer
Volume: 14171 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases
Abbreviated title: ECML PKDD
Country/Territory: Italy
City: Turin
Period: 18/09/2023 to 22/09/2023
