Generalized test utilities for long-tail performance in extreme multi-label classification

Erik Schultheis, Marek Wydmuch, Wojciech Kotlowski, Rohit Babbar, Krzysztof Dembczynski

Research output: Chapter in Book/Report/Conference proceeding; Conference article in proceedings; Scientific; peer-review


Extreme multi-label classification (XMLC) is the task of selecting a small subset of relevant labels from a very large set of possible labels. As such, it is characterized by long-tail labels, i.e., most labels have very few positive instances. With standard performance measures such as precision@k, a classifier can ignore tail labels and still report good performance. However, it is often argued that correct predictions in the tail are more "interesting" or "rewarding," but the community has not yet settled on a metric capturing this intuitive concept. The existing propensity-scored metrics fall short of this goal because they confound the problems of long-tail and missing labels. In this paper, we analyze generalized metrics budgeted "at k" as an alternative solution. To tackle the challenging problem of optimizing these metrics, we formulate it in the expected test utility (ETU) framework, which aims at optimizing the expected performance on a given test set. We derive optimal prediction rules and construct computationally efficient approximations of them that come with provable regret guarantees and are robust against model misspecification. Our algorithm, based on block coordinate descent, scales effortlessly to XMLC problems and obtains promising results in terms of long-tail performance.
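The claim that precision@k can look good while tail labels are ignored can be illustrated with a small sketch. This is not the paper's algorithm or evaluation code: the toy data, the function names, and the choice of macro-averaged recall as the label-wise metric are all assumptions made for illustration, since the abstract does not name a specific generalized metric.

```python
import numpy as np

def precision_at_k(y_true, y_pred):
    """Instance-wise precision@k: share of the k predicted labels that are
    relevant, averaged over instances. Each row of y_pred holds exactly k ones."""
    k = y_pred.sum(axis=1)
    return float(np.mean((y_true * y_pred).sum(axis=1) / k))

def macro_recall(y_true, y_pred):
    """Label-wise recall averaged over labels with at least one positive.
    (Assumed example of a macro-averaged metric; not taken from the abstract.)"""
    true_pos = (y_true * y_pred).sum(axis=0)
    positives = y_true.sum(axis=0)
    has_pos = positives > 0
    return float(np.mean(true_pos[has_pos] / positives[has_pos]))

# Hypothetical toy data: label 0 is a head label relevant to every instance,
# labels 1..4 are tail labels with a single positive instance each.
y_true = np.array([
    [1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0],
    [1, 0, 0, 1, 0],
    [1, 0, 0, 0, 1],
])

# Two predictors with a budget of k = 1 label per instance.
head_only = np.array([[1, 0, 0, 0, 0]] * 4)   # always predicts the head label
tail_aware = np.array([                        # predicts each instance's tail label
    [0, 1, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 1, 0],
    [0, 0, 0, 0, 1],
])

for name, pred in [("head_only", head_only), ("tail_aware", tail_aware)]:
    print(name, precision_at_k(y_true, pred), macro_recall(y_true, pred))
```

On this toy data both predictors reach a precision@1 of 1.0, yet the head-only predictor covers one of five labels while the tail-aware one covers four, so only the label-wise average separates them.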
Original language: English
Title of host publication: Thirty-seventh Conference on Neural Information Processing Systems
Publication status: Accepted/In press - Dec 2023
MoE publication type: A4 Conference publication
Event: Conference on Neural Information Processing Systems - Ernest N. Morial Convention Center, New Orleans, United States
Duration: 10 Dec 2023 - 16 Dec 2023
Conference number: 37


Conference: Conference on Neural Information Processing Systems
Abbreviated title: NeurIPS
Country/Territory: United States
City: New Orleans

