Meta-classifier free negative sampling for extreme multilabel classification

Mohammadreza Mohammadnia Qaraei*, Rohit Babbar

*Corresponding author for this work

Research output: Contribution to journal > Article > Scientific > peer-review


Abstract

Negative sampling is a common approach for making the training of deep models tractable in classification problems with very large output spaces, known as extreme multilabel classification (XMC) problems. Negative sampling methods aim to find, for each instance, the negative labels with higher scores, known as hard negatives, and limit the computation of the negative part of the loss to these labels. Two well-known methods for negative sampling in XMC models are meta-classifier-based methods and Maximum Inner Product Search (MIPS)-based adaptive methods. Owing to their good prediction performance, methods that employ a meta-classifier are more common in contemporary XMC research. On the flip side, they need to train and store the meta-classifier (in addition to the extreme classifier), which can involve millions of additional parameters. In this paper, we focus on MIPS-based methods for negative sampling. We highlight two issues that may prevent deep models trained by these methods from training stably. First, we argue that using hard negatives excessively from the beginning of training leads to unstable gradients. Second, we show that when all the negative labels in a MIPS-based method are restricted to only those determined by MIPS, training is sensitive to the length of the intervals for pre-processing the weights in the MIPS method. To mitigate these issues, we propose to limit the labels selected by MIPS to only a few and to sample the rest of the needed labels from a uniform distribution. We show that our proposed MIPS-based negative sampling can reach the performance of LightXML, a transformer-based model trained with a meta-classifier, without the need to train and store any additional classifier. The code for our experiments is available at https://github.com/xmc-aalto/mips-negative-sampling.
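
The mixed sampling idea described in the abstract (a few hard negatives from MIPS, the rest drawn uniformly) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `sample_negatives`, its parameters, and the sizes used below are hypothetical, and an exact brute-force inner-product search stands in for the approximate MIPS index a real system would use.

```python
import numpy as np

def sample_negatives(x, W, positives, n_hard, n_total, rng):
    """Pick n_total negative label ids for one instance (illustrative only).

    x: (d,) instance embedding; W: (L, d) label weight matrix;
    positives: list of positive label ids for this instance.
    Assumes 1 <= n_hard <= n_total and enough non-positive labels.
    """
    scores = W @ x                       # exact inner products, a stand-in for MIPS
    scores[positives] = -np.inf          # positives must never be returned as negatives
    # Hard negatives: the n_hard highest-scoring non-positive labels.
    hard = np.argpartition(-scores, n_hard - 1)[:n_hard]
    # Remaining negatives: uniform over labels that are neither positive nor hard.
    mask = np.ones(W.shape[0], dtype=bool)
    mask[positives] = False
    mask[hard] = False
    pool = np.flatnonzero(mask)
    uniform = rng.choice(pool, size=n_total - n_hard, replace=False)
    return hard, uniform

# Toy usage with made-up sizes: 1000 labels, 16-dimensional embeddings.
rng = np.random.default_rng(0)
W = rng.standard_normal((1000, 16))
x = rng.standard_normal(16)
positives = [3, 7]
hard, uniform = sample_negatives(x, W, positives, n_hard=5, n_total=50, rng=rng)
```

Keeping `n_hard` small relative to `n_total` mirrors the abstract's proposal to restrict MIPS-selected labels to only a few; the specific ratio here is an arbitrary choice for the example.
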
Original language: English
Pages (from-to): 675-697
Number of pages: 23
Journal: Machine Learning
Volume: 113
Issue number: 2
Early online date: 20 Nov 2023
Publication status: Published - Feb 2024
MoE publication type: A1 Journal article-refereed
