TY - GEN
T1 - On the generation of 2-dimensional index workloads
AU - Hellerstein, Joseph M.
AU - Hellerstein, Lisa
AU - Kollios, George
N1 - Publisher Copyright:
© Springer-Verlag Berlin Heidelberg 1998.
PY - 1998
Y1 - 1998
N2 - A large number of database index structures have been proposed over the last two decades, and little consensus has emerged regarding their relative e_ectiveness. In order to empirically evaluate these indexes, it is helpful to have methodologies for generating random queries for performance testing. In this paper we propose a domain-independent approach to the generation of random queries: choose randomly among all logically distinct queries. We investigate this idea in the context of range queries over 2-dimensional points. We present an algorithm that chooses randomly among logically distinct 2-d range queries. It has constant-time expected performance over uniformly distributed data, and exhibited good performance in experiments over a variety of real and synthetic data sets. We observe nonuniformities in the way randomly chosen logical 2-d range queries are distributed over a variety of spatial properties. This raises questions about the quality of the workloads generated from such queries. We contrast our approach with previous work that generates workloads of random spatial ranges, and we sketch directions for future work on the robust generation of workloads for studying index performance.
AB - A large number of database index structures have been proposed over the last two decades, and little consensus has emerged regarding their relative e_ectiveness. In order to empirically evaluate these indexes, it is helpful to have methodologies for generating random queries for performance testing. In this paper we propose a domain-independent approach to the generation of random queries: choose randomly among all logically distinct queries. We investigate this idea in the context of range queries over 2-dimensional points. We present an algorithm that chooses randomly among logically distinct 2-d range queries. It has constant-time expected performance over uniformly distributed data, and exhibited good performance in experiments over a variety of real and synthetic data sets. We observe nonuniformities in the way randomly chosen logical 2-d range queries are distributed over a variety of spatial properties. This raises questions about the quality of the workloads generated from such queries. We contrast our approach with previous work that generates workloads of random spatial ranges, and we sketch directions for future work on the robust generation of workloads for studying index performance.
UR - http://www.scopus.com/inward/record.url?scp=84947205723&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84947205723&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:84947205723
SN - 3540654526
SN - 9783540654520
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 113
EP - 130
BT - Database Theory - ICDT 1999 - 7th International Conference, Proceedings
A2 - Buneman, Peter
A2 - Beeri, Catriel
PB - Springer Verlag
T2 - 7th International Conference on Database Theory, ICDT 1999
Y2 - 10 January 1999 through 12 January 1999
ER -