TY - GEN

T1 - Finding an approximate mode of a Kernel density estimate

AU - Lee, Jasper C.H.

AU - Li, Jerry

AU - Musco, Christopher

AU - Phillips, Jeff M.

AU - Tai, Wai Ming

N1 - Publisher Copyright:
© Jasper C.H. Lee, Jerry Li, Christopher Musco, Jeff M. Phillips, and Wai Ming Tai; licensed under Creative Commons License CC-BY 4.0

PY - 2021/9/1

Y1 - 2021/9/1

N2 - Given points P = {p1,..., pn} subset of Rd, how do we find a point x which approximately maximizes the function n1 Ppi∈P e−∥pi−x∥2 ? In other words, how do we find an approximate mode of a Gaussian kernel density estimate (KDE) of P? Given the power of KDEs in representing probability distributions and other continuous functions, the basic mode finding problem is widely applicable. However, it is poorly understood algorithmically. We provide fast and provably accurate approximation algorithms for mode finding in both the low and high dimensional settings. For low (constant) dimension, our main contribution is a reduction to solving systems of polynomial inequalities. For high dimension, we prove the first dimensionality reduction result for KDE mode finding. The latter result leverages Johnson-Lindenstrauss projection, Kirszbraun's classic extension theorem, and perhaps surprisingly, the mean-shift heuristic for mode finding. For constant approximation factor these algorithms run in O(n(log n)O(d)) and O(nd + (log n)O(log3 n)), respectively; these are proven more precisely as a (1 + ϵ)-approximation guarantee. Furthermore, for the special case of d = 2, we give a combinatorial algorithm running in O(n log2 n) time. We empirically demonstrate that the random projection approach and the 2-dimensional algorithm improves over the state-of-the-art mode-finding heuristics.

AB - Given points P = {p1,..., pn} subset of Rd, how do we find a point x which approximately maximizes the function n1 Ppi∈P e−∥pi−x∥2 ? In other words, how do we find an approximate mode of a Gaussian kernel density estimate (KDE) of P? Given the power of KDEs in representing probability distributions and other continuous functions, the basic mode finding problem is widely applicable. However, it is poorly understood algorithmically. We provide fast and provably accurate approximation algorithms for mode finding in both the low and high dimensional settings. For low (constant) dimension, our main contribution is a reduction to solving systems of polynomial inequalities. For high dimension, we prove the first dimensionality reduction result for KDE mode finding. The latter result leverages Johnson-Lindenstrauss projection, Kirszbraun's classic extension theorem, and perhaps surprisingly, the mean-shift heuristic for mode finding. For constant approximation factor these algorithms run in O(n(log n)O(d)) and O(nd + (log n)O(log3 n)), respectively; these are proven more precisely as a (1 + ϵ)-approximation guarantee. Furthermore, for the special case of d = 2, we give a combinatorial algorithm running in O(n log2 n) time. We empirically demonstrate that the random projection approach and the 2-dimensional algorithm improves over the state-of-the-art mode-finding heuristics.

KW - Coresets

KW - Dimensionality reduction

KW - Kernel density estimation

KW - Means-shift

UR - http://www.scopus.com/inward/record.url?scp=85115112271&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85115112271&partnerID=8YFLogxK

U2 - 10.4230/LIPIcs.ESA.2021.61

DO - 10.4230/LIPIcs.ESA.2021.61

M3 - Conference contribution

AN - SCOPUS:85115112271

T3 - Leibniz International Proceedings in Informatics, LIPIcs

BT - 29th Annual European Symposium on Algorithms, ESA 2021

A2 - Mutzel, Petra

A2 - Pagh, Rasmus

A2 - Herman, Grzegorz

PB - Schloss Dagstuhl- Leibniz-Zentrum fur Informatik GmbH, Dagstuhl Publishing

T2 - 29th Annual European Symposium on Algorithms, ESA 2021

Y2 - 6 September 2021 through 8 September 2021

ER -