The Price of Explainability for Clustering

Anupam Gupta, Madhusudhan Reddy Pittu, Ola Svensson, Rachel Yuan

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Abstract

Given a set of points in d-dimensional space, an explainable clustering is one where the clusters are specified by a tree of axis-aligned threshold cuts. Dasgupta et al. (ICML 2020) posed the question of the price of explainability: the worst-case ratio between the cost of the best explainable clusterings to that of the best clusterings.We show that the price of explainability for k medians is at most 1+H_k-1; in fact, we show that the popular Random Thresholds algorithm has exactly this price of explainability, matching the known lower bound constructions. We complement our tight analysis of this particular algorithm by constructing instances where the price of explainability (using any algorithm) is at least (1-o(1)) ln k, showing that our result is best possible, up to lower-order terms. We also improve the price of explainability for the k-means problem to O(k ln k) from the previous O(k ln k), considerably closing the gap to the lower bounds of Ω(k). Finally, we study the algorithmic question of finding the best explainable clustering: We show that explainable k medians and k-means cannot be approximated better than O(ln k), under standard complexity-theoretic conjectures. This essentially settles the approximability of explainable k-medians and leaves open the intriguing possibility to get significantly better approximation algorithms for k-means than its price of explainability.

Original languageEnglish (US)
Title of host publicationProceedings - 2023 IEEE 64th Annual Symposium on Foundations of Computer Science, FOCS 2023
PublisherIEEE Computer Society
Pages1131-1148
Number of pages18
ISBN (Electronic)9798350318944
DOIs
StatePublished - 2023
Event64th IEEE Annual Symposium on Foundations of Computer Science, FOCS 2023 - Santa Cruz, United States
Duration: Nov 6 2023Nov 9 2023

Publication series

NameProceedings - Annual IEEE Symposium on Foundations of Computer Science, FOCS
ISSN (Print)0272-5428

Conference

Conference64th IEEE Annual Symposium on Foundations of Computer Science, FOCS 2023
Country/TerritoryUnited States
CitySanta Cruz
Period11/6/2311/9/23

Keywords

  • approximation algorithms
  • explainable clustering
  • k-means
  • k-medians
  • randomized algorithms

ASJC Scopus subject areas

  • General Computer Science

Fingerprint

Dive into the research topics of 'The Price of Explainability for Clustering'. Together they form a unique fingerprint.

Cite this