On Missing Labels, Long-tails and Propensities in Extreme Multi-label Classification

Erik Schultheis, Rohit Babbar, Marek Wydmuch, Krzysztof Dembczynski

Research output: Chapter in Book/Report/Conference proceeding; Conference article in proceedings; Scientific; peer-review

Abstract

The propensity model introduced by Jain et al. has become a standard approach for dealing with missing and long-tail labels in extreme multi-label classification (XMLC). In this paper, we critically re-examine this approach, showing that, despite its theoretical soundness, its application in contemporary XMLC works is debatable. We discuss the flaws of the propensity-based approach in detail and present several recipes, some of them related to solutions used in search engines and recommender systems, that we believe constitute promising alternatives to be followed in XMLC.

Original language: English
Title of host publication: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining
Publisher: ACM
Pages: 1547–1557
Number of pages: 11
ISBN (Electronic): 978-1-4503-9385-0
Publication status: Published - Aug 2022
MoE publication type: A4 Conference publication
Event: ACM SIGKDD International Conference on Knowledge Discovery and Data Mining - Washington, United States
Duration: 14 Aug 2022 – 18 Aug 2022
Conference number: 28
https://kdd.org/kdd2022/

Conference

Conference: ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
Abbreviated title: KDD
Country/Territory: United States
City: Washington
Period: 14/08/2022 – 18/08/2022
