Optimal and approximate computation of summary statistics for range aggregates

A. C. Gilbert, Y. Kotidis, S. Muthukrishnan, M. J. Strauss

Research output: Contribution to conferencePaper

Abstract

Fast estimates for aggregate queries are useful in database query optimization, approximate query answering and online query processing. Hence, there has been a lot of focus on "selectivity estimation", that is, computing summary statistics on the underlying data and using that to answer aggregate queries fast and to a reasonable approximation. We present two sets of results for range aggregate queries, which are amongst the most common queries. First, we focus on a histogram as summary statistics and present algorithms for constructing histograms that are provably optimal (or provably approximate) for range queries; these algorithms take (pseudo-) polynomial time. These are the first known optimality or approximation results for arbitrary range queries; previously known results were optimal only for restricted range queries (such as equality queries, hierarchical or prefix range queries). Second, we focus on wavelet-based representations as summary statistics and present fast algorithms for pi cking wavelet statistics that are provably optimal for range queries. No previously-known wavelet-based methods have this property. We perform an experimental study of the various summary representations show the benefits of our algorithms over the known methods.

Original languageEnglish (US)
Pages227-236
Number of pages10
DOIs
StatePublished - 2001
Externally publishedYes
Event20th ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems - Santa Barbara, CA, United States
Duration: May 21 2001May 23 2001

Conference

Conference20th ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems
CountryUnited States
CitySanta Barbara, CA
Period5/21/015/23/01

ASJC Scopus subject areas

  • Software
  • Information Systems
  • Hardware and Architecture

Fingerprint Dive into the research topics of 'Optimal and approximate computation of summary statistics for range aggregates'. Together they form a unique fingerprint.

  • Cite this

    Gilbert, A. C., Kotidis, Y., Muthukrishnan, S., & Strauss, M. J. (2001). Optimal and approximate computation of summary statistics for range aggregates. 227-236. Paper presented at 20th ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, Santa Barbara, CA, United States. https://doi.org/10.1145/375551.375598