Abstract
Fast estimates for aggregate queries are useful in database query optimization, approximate query answering and online query processing. Hence, there has been a lot of focus on "selectivity estimation", that is, computing summary statistics on the underlying data and using that to answer aggregate queries fast and to a reasonable approximation. We present two sets of results for range aggregate queries, which are amongst the most common queries. First, we focus on a histogram as summary statistics and present algorithms for constructing histograms that are provably optimal (or provably approximate) for range queries; these algorithms take (pseudo-) polynomial time. These are the first known optimality or approximation results for arbitrary range queries; previously known results were optimal only for restricted range queries (such as equality queries, hierarchical or prefix range queries). Second, we focus on wavelet-based representations as summary statistics and present fast algorithms for pi cking wavelet statistics that are provably optimal for range queries. No previously-known wavelet-based methods have this property. We perform an experimental study of the various summary representations show the benefits of our algorithms over the known methods.
Original language | English (US) |
---|---|
Pages | 227-236 |
Number of pages | 10 |
DOIs | |
State | Published - 2001 |
Event | 20th ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems - Santa Barbara, CA, United States Duration: May 21 2001 → May 23 2001 |
Conference
Conference | 20th ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems |
---|---|
Country/Territory | United States |
City | Santa Barbara, CA |
Period | 5/21/01 → 5/23/01 |
ASJC Scopus subject areas
- Software
- Information Systems
- Hardware and Architecture