TY - GEN
T1 - Density estimation with adaptive sparse grids for large data sets
AU - Peherstorfer, Benjamin
AU - Pflüger, Dirk
AU - Bungartz, Hans-Joachim
N1 - Publisher Copyright: © SIAM. Copyright: Copyright 2017 Elsevier B.V., All rights reserved.
PY - 2014
Y1 - 2014
N2 - Nonparametric density estimation is a fundamental problem of statistics and data mining. Even though kernel density estimation is the most widely used method, its performance highly depends on the choice of the kernel bandwidth, and it can become computationally expensive for large data sets. We present an adaptive sparse-grid-based density estimation method which discretizes the estimated density function on basis functions centered at grid points rather than on kernels centered at the data points. Thus, the costs of evaluating the estimated density function are independent of the number of data points. We give details on how to estimate density functions on sparse grids and develop a cross-validation technique for the parameter selection. We show numerical results to confirm that our sparse-grid-based method is well-suited for large data sets, and, finally, employ our method for the classification of astronomical objects to demonstrate that it is competitive with current kernel-based density estimation approaches with respect to classification accuracy and runtime.
AB - Nonparametric density estimation is a fundamental problem of statistics and data mining. Even though kernel density estimation is the most widely used method, its performance highly depends on the choice of the kernel bandwidth, and it can become computationally expensive for large data sets. We present an adaptive sparse-grid-based density estimation method which discretizes the estimated density function on basis functions centered at grid points rather than on kernels centered at the data points. Thus, the costs of evaluating the estimated density function are independent of the number of data points. We give details on how to estimate density functions on sparse grids and develop a cross-validation technique for the parameter selection. We show numerical results to confirm that our sparse-grid-based method is well-suited for large data sets, and, finally, employ our method for the classification of astronomical objects to demonstrate that it is competitive with current kernel-based density estimation approaches with respect to classification accuracy and runtime.
UR - http://www.scopus.com/inward/record.url?scp=84921664859&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84921664859&partnerID=8YFLogxK
U2 - 10.1137/1.9781611973440.51
DO - 10.1137/1.9781611973440.51
M3 - Conference contribution
AN - SCOPUS:84921664859
T3 - SIAM International Conference on Data Mining 2014, SDM 2014
SP - 443
EP - 451
BT - SIAM International Conference on Data Mining 2014, SDM 2014
A2 - Zaki, Mohammed J.
A2 - Banerjee, Arindam
A2 - Parthasarathy, Srinivasan
A2 - Tan, Pang-Ning
A2 - Obradovic, Zoran
A2 - Kamath, Chandrika
PB - Society for Industrial and Applied Mathematics Publications
T2 - 14th SIAM International Conference on Data Mining, SDM 2014
Y2 - 24 April 2014 through 26 April 2014
ER -