Diversity-Aware k-median: Clustering with Fair Center Representation

Suhas Thejaswi*, Bruno Ordozgoiti, Aristides Gionis

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceedingConference article in proceedingsScientificpeer-review

7 Citations (Scopus)

Abstract

We introduce a novel problem for diversity-aware clustering. We assume that the potential cluster centers belong to a set of groups defined by protected attributes, such as ethnicity, gender, etc. We then ask to find a minimum-cost clustering of the data into k clusters so that a specified minimum number of cluster centers are chosen from each group. We thus require that all groups are represented in the clustering solution as cluster centers, according to specified requirements. More precisely, we are given a set of clients C, a set of facilities, a collection F= { F1, ⋯, Ft} of facility groups, a budget k, and a set of lower-bound thresholds R= { r1, ⋯, rt}, one for each group in F. The diversity-aware k-median problem asks to find a set S of k facilities in such that | S∩ Fi| ≥ ri, that is, at least ri centers in S are from group Fi, and the k-median cost ∑ cCmin sSd(c, s) is minimized. We show that in the general case where the facility groups may overlap, the diversity-aware k-median problem is NP -hard, fixed-parameter intractable with respect to parameter k, and inapproximable to any multiplicative factor. On the other hand, when the facility groups are disjoint, approximation algorithms can be obtained by reduction to the matroid median and red-blue median problems. Experimentally, we evaluate our approximation methods for the tractable cases, and present a relaxation-based heuristic for the theoretically intractable case, which can provide high-quality and efficient solutions for real-world datasets.

Original languageEnglish
Title of host publicationMachine Learning and Knowledge Discovery in Databases. Research Track - European Conference, ECML PKDD 2021, Proceedings
EditorsNuria Oliver, Fernando Pérez-Cruz, Stefan Kramer, Jesse Read, Jose A. Lozano
PublisherSpringer
Pages765-780
Number of pages16
ISBN (Print)9783030865191
DOIs
Publication statusPublished - 2021
MoE publication typeA4 Conference publication
EventEuropean Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases - Virtual, Online
Duration: 13 Sept 202117 Sept 2021

Publication series

NameLecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
PublisherSpringer
Volume12976 LNAI
ISSN (Print)0302-9743
ISSN (Electronic)1611-3349

Conference

ConferenceEuropean Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases
Abbreviated titleECML PKDD
CityVirtual, Online
Period13/09/202117/09/2021

Keywords

  • Algorithmic bias
  • Algorithmic fairness
  • Diversity-aware clustering
  • Fair clustering

Fingerprint

Dive into the research topics of 'Diversity-Aware k-median: Clustering with Fair Center Representation'. Together they form a unique fingerprint.

Cite this