Optimal and approximate computation of summary statistics for range aggregates

A. C. Gilbert, Y. Kotidis, S. Muthukrishnan, M. J. Strauss

    Research output: Contribution to conferencePaperpeer-review

    Abstract

    Fast estimates for aggregate queries are useful in database query optimization, approximate query answering and online query processing. Hence, there has been a lot of focus on "selectivity estimation", that is, computing summary statistics on the underlying data and using that to answer aggregate queries fast and to a reasonable approximation. We present two sets of results for range aggregate queries, which are amongst the most common queries. First, we focus on a histogram as summary statistics and present algorithms for constructing histograms that are provably optimal (or provably approximate) for range queries; these algorithms take (pseudo-) polynomial time. These are the first known optimality or approximation results for arbitrary range queries; previously known results were optimal only for restricted range queries (such as equality queries, hierarchical or prefix range queries). Second, we focus on wavelet-based representations as summary statistics and present fast algorithms for pi cking wavelet statistics that are provably optimal for range queries. No previously-known wavelet-based methods have this property. We perform an experimental study of the various summary representations show the benefits of our algorithms over the known methods.

    Original languageEnglish (US)
    Pages227-236
    Number of pages10
    DOIs
    StatePublished - 2001
    Event20th ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems - Santa Barbara, CA, United States
    Duration: May 21 2001May 23 2001

    Conference

    Conference20th ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems
    CountryUnited States
    CitySanta Barbara, CA
    Period5/21/015/23/01

    ASJC Scopus subject areas

    • Software
    • Information Systems
    • Hardware and Architecture

    Fingerprint Dive into the research topics of 'Optimal and approximate computation of summary statistics for range aggregates'. Together they form a unique fingerprint.

    Cite this