Diversity-Aware k-median: Clustering with Fair Center Representation

Suhas Thejaswi*, Bruno Ordozgoiti, Aristides Gionis

*Tämän työn vastaava kirjoittaja

Tutkimustuotos: Artikkeli kirjassa/konferenssijulkaisussaConference article in proceedingsScientificvertaisarvioitu

8 Sitaatiot (Scopus)

Abstrakti

We introduce a novel problem for diversity-aware clustering. We assume that the potential cluster centers belong to a set of groups defined by protected attributes, such as ethnicity, gender, etc. We then ask to find a minimum-cost clustering of the data into k clusters so that a specified minimum number of cluster centers are chosen from each group. We thus require that all groups are represented in the clustering solution as cluster centers, according to specified requirements. More precisely, we are given a set of clients C, a set of facilities, a collection F= { F1, ⋯, Ft} of facility groups, a budget k, and a set of lower-bound thresholds R= { r1, ⋯, rt}, one for each group in F. The diversity-aware k-median problem asks to find a set S of k facilities in such that | S∩ Fi| ≥ ri, that is, at least ri centers in S are from group Fi, and the k-median cost ∑ cCmin sSd(c, s) is minimized. We show that in the general case where the facility groups may overlap, the diversity-aware k-median problem is NP -hard, fixed-parameter intractable with respect to parameter k, and inapproximable to any multiplicative factor. On the other hand, when the facility groups are disjoint, approximation algorithms can be obtained by reduction to the matroid median and red-blue median problems. Experimentally, we evaluate our approximation methods for the tractable cases, and present a relaxation-based heuristic for the theoretically intractable case, which can provide high-quality and efficient solutions for real-world datasets.

AlkuperäiskieliEnglanti
OtsikkoMachine Learning and Knowledge Discovery in Databases. Research Track - European Conference, ECML PKDD 2021, Proceedings
ToimittajatNuria Oliver, Fernando Pérez-Cruz, Stefan Kramer, Jesse Read, Jose A. Lozano
KustantajaSpringer
Sivut765-780
Sivumäärä16
ISBN (painettu)9783030865191
DOI - pysyväislinkit
TilaJulkaistu - 2021
OKM-julkaisutyyppiA4 Artikkeli konferenssijulkaisussa
TapahtumaEuropean Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases - Virtual, Online
Kesto: 13 syysk. 202117 syysk. 2021

Julkaisusarja

NimiLecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
KustantajaSpringer
Vuosikerta12976 LNAI
ISSN (painettu)0302-9743
ISSN (elektroninen)1611-3349

Conference

ConferenceEuropean Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases
LyhennettäECML PKDD
KaupunkiVirtual, Online
Ajanjakso13/09/202117/09/2021

Sormenjälki

Sukella tutkimusaiheisiin 'Diversity-Aware k-median: Clustering with Fair Center Representation'. Ne muodostavat yhdessä ainutlaatuisen sormenjäljen.

Siteeraa tätä