TY - GEN
T1 - Finding an approximate mode of a Kernel density estimate
AU - Lee, Jasper C.H.
AU - Li, Jerry
AU - Musco, Christopher
AU - Phillips, Jeff M.
AU - Tai, Wai Ming
N1 - Publisher Copyright:
© Jasper C.H. Lee, Jerry Li, Christopher Musco, Jeff M. Phillips, and Wai Ming Tai; licensed under Creative Commons License CC-BY 4.0
PY - 2021/9/1
Y1 - 2021/9/1
N2 - Given points P = {p1,..., pn} subset of Rd, how do we find a point x which approximately maximizes the function n1 Ppi∈P e−∥pi−x∥2 ? In other words, how do we find an approximate mode of a Gaussian kernel density estimate (KDE) of P? Given the power of KDEs in representing probability distributions and other continuous functions, the basic mode finding problem is widely applicable. However, it is poorly understood algorithmically. We provide fast and provably accurate approximation algorithms for mode finding in both the low and high dimensional settings. For low (constant) dimension, our main contribution is a reduction to solving systems of polynomial inequalities. For high dimension, we prove the first dimensionality reduction result for KDE mode finding. The latter result leverages Johnson-Lindenstrauss projection, Kirszbraun's classic extension theorem, and perhaps surprisingly, the mean-shift heuristic for mode finding. For constant approximation factor these algorithms run in O(n(log n)O(d)) and O(nd + (log n)O(log3 n)), respectively; these are proven more precisely as a (1 + ϵ)-approximation guarantee. Furthermore, for the special case of d = 2, we give a combinatorial algorithm running in O(n log2 n) time. We empirically demonstrate that the random projection approach and the 2-dimensional algorithm improves over the state-of-the-art mode-finding heuristics.
AB - Given points P = {p1,..., pn} subset of Rd, how do we find a point x which approximately maximizes the function n1 Ppi∈P e−∥pi−x∥2 ? In other words, how do we find an approximate mode of a Gaussian kernel density estimate (KDE) of P? Given the power of KDEs in representing probability distributions and other continuous functions, the basic mode finding problem is widely applicable. However, it is poorly understood algorithmically. We provide fast and provably accurate approximation algorithms for mode finding in both the low and high dimensional settings. For low (constant) dimension, our main contribution is a reduction to solving systems of polynomial inequalities. For high dimension, we prove the first dimensionality reduction result for KDE mode finding. The latter result leverages Johnson-Lindenstrauss projection, Kirszbraun's classic extension theorem, and perhaps surprisingly, the mean-shift heuristic for mode finding. For constant approximation factor these algorithms run in O(n(log n)O(d)) and O(nd + (log n)O(log3 n)), respectively; these are proven more precisely as a (1 + ϵ)-approximation guarantee. Furthermore, for the special case of d = 2, we give a combinatorial algorithm running in O(n log2 n) time. We empirically demonstrate that the random projection approach and the 2-dimensional algorithm improves over the state-of-the-art mode-finding heuristics.
KW - Coresets
KW - Dimensionality reduction
KW - Kernel density estimation
KW - Means-shift
UR - http://www.scopus.com/inward/record.url?scp=85115112271&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85115112271&partnerID=8YFLogxK
U2 - 10.4230/LIPIcs.ESA.2021.61
DO - 10.4230/LIPIcs.ESA.2021.61
M3 - Conference contribution
AN - SCOPUS:85115112271
T3 - Leibniz International Proceedings in Informatics, LIPIcs
BT - 29th Annual European Symposium on Algorithms, ESA 2021
A2 - Mutzel, Petra
A2 - Pagh, Rasmus
A2 - Herman, Grzegorz
PB - Schloss Dagstuhl- Leibniz-Zentrum fur Informatik GmbH, Dagstuhl Publishing
T2 - 29th Annual European Symposium on Algorithms, ESA 2021
Y2 - 6 September 2021 through 8 September 2021
ER -