Domain-driven data synopses for dynamic quantiles

Anna C. Gilbert, Yannis Kotidis, S. Muthukrishnan, Martin J. Strauss

    Research output: Contribution to journalArticlepeer-review

    Abstract

    In this paper, we present new algorithms for dynamically computing quantiles of a relation subject to insert as well as delete operations. At the core of our algorithms lies a small-space multiresolution representation of the underlying data distribution based on random subset sums or RSSs. These RSSs are updated with every insert and delete operation. When quantiles are demanded, we use these RSSs to estimate quickly, without having to access the data, all the quantiles, each guaranteed to be accurate to within user-specified precision. While quantiles have found many uses in databases, in this paper, our focus is primarily on network management applications that monitor the distribution of active sessions in the network. Our examples are drawn both from the telephony and the IP network, where the goal is to monitor the distribution of the length of active calls and IP flows, respectively, over time. For such applications, we propose a new type of histogram that uses RSSs for summarizing the dynamic parts of the distributions while other parts with small volume of sessions are approximated using simple counters.

    Original languageEnglish (US)
    Pages (from-to)927-937
    Number of pages11
    JournalIEEE Transactions on Knowledge and Data Engineering
    Volume17
    Issue number7
    DOIs
    StatePublished - Jul 2005

    Keywords

    • Data streams
    • Database statistics
    • Quantiles

    ASJC Scopus subject areas

    • Information Systems
    • Computer Science Applications
    • Computational Theory and Mathematics

    Fingerprint

    Dive into the research topics of 'Domain-driven data synopses for dynamic quantiles'. Together they form a unique fingerprint.

    Cite this