TY - GEN

T1 - Space- and time-efficient deterministic algorithms for biased quantiles over data streams

AU - Cormode, Graham

AU - Korn, Flip

AU - Muthukrishnan, S.

AU - Srivastava, Divesh

PY - 2006

Y1 - 2006

N2 - Skew is prevalent in data streams, and should be taken into account by algorithms that analyze the data. The problem of finding "biased quantiles"-that is, approximate quantiles which must be more accurate for more extreme values-is a framework for summarizing such skewed data on data streams. We present the first deterministic algorithms for answering biased quantiles queries accurately with small-sublinear in the input size-space and time bounds in one pass. The space bound is near-optimal, and the amortized update cost is close to constant, making it practical for handling high speed network data streams. We not only demonstrate theoretical properties of the algorithm, but also show it uses less space than existing methods in many practical settings, and is fast to maintain.

AB - Skew is prevalent in data streams, and should be taken into account by algorithms that analyze the data. The problem of finding "biased quantiles"-that is, approximate quantiles which must be more accurate for more extreme values-is a framework for summarizing such skewed data on data streams. We present the first deterministic algorithms for answering biased quantiles queries accurately with small-sublinear in the input size-space and time bounds in one pass. The space bound is near-optimal, and the amortized update cost is close to constant, making it practical for handling high speed network data streams. We not only demonstrate theoretical properties of the algorithm, but also show it uses less space than existing methods in many practical settings, and is fast to maintain.

KW - Biased quantiles

KW - Data stream algorithms

UR - http://www.scopus.com/inward/record.url?scp=33846219851&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=33846219851&partnerID=8YFLogxK

U2 - 10.1145/1142351.1142389

DO - 10.1145/1142351.1142389

M3 - Conference contribution

AN - SCOPUS:33846219851

SN - 1595933182

SN - 9781595933188

T3 - Proceedings of the ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems

SP - 263

EP - 272

BT - Proceedings of the Twenty-Fifth ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, PODS 2006

T2 - 25th ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, PODS 2006

Y2 - 26 June 2006 through 28 June 2006

ER -