Approximate kernel clustering

Subhash Khot, Assaf Naor

Research output: Contribution to journalArticlepeer-review

Abstract

In the kernel clustering problem we are given a large n × n positive-semidefinite matrix A = (aij) with Σ i, j=1n aij = 0 and a small k × k positive-semidefinite matrix B = (bij). The goal is to find a partition S1, ⋯, Sk of {1, ⋯ n} which maximizes the quantity Σi, j=1k(p, q)εSi×Sj apq)b ij. We study the computational complexity of this generic clustering problem which originates in the theory of machine learning. We design a constant factor polynomial time approximation algorithm for this problem, answering a question posed by Song et al. In some cases we manage to compute the sharp approximation threshold for this problem assuming the unique games conjecture (UGC). In particular, when B is the 3 × 3 identity matrix the UGC hardness threshold of this problem is exactly 16π/27. We present and study a geometric conjecture of independent interest which we show would imply that the UGC threshold when B is the k × k identity matrix is (8π/9)(1 - 1/k) for every k ≥ 3.

Original languageEnglish (US)
Pages (from-to)129-165
Number of pages37
JournalMathematika
Volume55
Issue number1-2
DOIs
StatePublished - Jan 2009

ASJC Scopus subject areas

  • Mathematics(all)

Fingerprint Dive into the research topics of 'Approximate kernel clustering'. Together they form a unique fingerprint.

Cite this