Generalized test utilities for long-tail performance in extreme multi-label classification

Erik Schultheis, Marek Wydmuch, Wojciech Kotlowski, Rohit Babbar, Krzysztof Dembczynski

Research output: Chapter in book/conference proceedings › Conference article in proceedings › Scientific › Peer-reviewed

3 Citations (Scopus)

Abstract

Extreme multi-label classification (XMLC) is the task of selecting a small subset of relevant labels from a very large set of possible labels. As such, it is characterized by long-tail labels, i.e., most labels have very few positive instances. With standard performance measures such as precision@k, a classifier can ignore tail labels and still report good performance. However, it is often argued that correct predictions in the tail are more "interesting" or "rewarding," but the community has not yet settled on a metric capturing this intuitive concept. The existing propensity-scored metrics fall short of this goal by confounding the problems of long-tail and missing labels. In this paper, we analyze generalized metrics budgeted "at k" as an alternative solution. To tackle the challenging problem of optimizing these metrics, we formulate it in the expected test utility (ETU) framework, which aims at optimizing the expected performance on a given test set. We derive optimal prediction rules and construct computationally efficient approximations of them that come with provable regret guarantees and are robust against model misspecification. Our algorithm, based on block coordinate descent, scales effortlessly to XMLC problems and obtains promising results in terms of long-tail performance.
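
The claim that precision@k can look good while tail labels are ignored can be illustrated with a small sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the toy data, the function names `precision_at_k` and `macro_recall_at_k`, and the choice of macro-averaged recall@k as one possible label-wise metric computed under a budget of k predictions per instance.

```python
# Hypothetical toy example: a classifier that always predicts the head label.

def precision_at_k(y_true, y_pred):
    """Instance-wise precision@k: average share of predicted labels that are relevant."""
    total = 0.0
    for relevant, predicted in zip(y_true, y_pred):
        total += len(relevant & predicted) / len(predicted)
    return total / len(y_true)


def macro_recall_at_k(y_true, y_pred, num_labels):
    """Label-wise recall, averaged with equal weight over labels that have positives."""
    per_label = []
    for label in range(num_labels):
        positives = sum(1 for relevant in y_true if label in relevant)
        hits = sum(
            1
            for relevant, predicted in zip(y_true, y_pred)
            if label in relevant and label in predicted
        )
        if positives > 0:
            per_label.append(hits / positives)
    return sum(per_label) / len(per_label)


# Label 0 is a head label (relevant for 9 of 10 instances);
# labels 1, 2 and 3 are tail labels with one positive instance each.
y_true = [{0} for _ in range(7)] + [{0, 1}, {0, 2}, {3}]

# With a budget of k = 1, the classifier predicts the head label everywhere.
y_pred = [{0} for _ in y_true]

p_at_1 = precision_at_k(y_true, y_pred)
macro_r_at_1 = macro_recall_at_k(y_true, y_pred, num_labels=4)
print(f"precision@1 = {p_at_1:.2f}, macro recall@1 = {macro_r_at_1:.2f}")
```

In this toy setting the instance-wise score is dominated by the head label, while the label-wise average gives each of the three missed tail labels the same weight as the head label, so the two numbers diverge sharply.
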
Original language: English
Title: Advances in Neural Information Processing Systems 36 - 37th Conference on Neural Information Processing Systems, NeurIPS 2023
Publisher: Curran Associates Inc.
Number of pages: 35
ISBN (electronic): 978-1-7138-9992-1
Publication status: Published - 2024
MoE (OKM) publication type: A4 Article in conference proceedings
Event: Conference on Neural Information Processing Systems - Ernest N. Morial Convention Center, New Orleans, United States
Duration: 10 Dec 2023 to 16 Dec 2023
Conference number: 37
https://nips.cc/

Publication series

Name: Advances in Neural Information Processing Systems
Publisher: Morgan Kaufmann Publishers
Volume: 36
ISSN (print): 1049-5258

Conference

Conference: Conference on Neural Information Processing Systems
Abbreviated title: NeurIPS
Country/Territory: United States
City: New Orleans
Period: 10/12/2023 to 16/12/2023
