TY - JOUR
T1 - Self-tuning management of update-intensive multidimensional data in clusters of workstations
AU - Kriakov, Vassil
AU - Kollios, George
AU - Delis, Alex
N1 - Copyright 2012 Elsevier B.V., All rights reserved.
PY - 2009/6
Y1 - 2009/6
N2 - Contemporary applications continuously modify large volumes of multidimensional data that must be accessed efficiently and, more importantly, must be updated in a timely manner. Single-server storage approaches are insufficient when managing such volumes of data, while the high frequency of data modification renders classical indexing methods inefficient. To address these two problems we introduce a distributed storage manager for multidimensional data based on a Cluster-of-Workstations. The manager addresses the above challenges through a set of mechanisms that, through selective on-line data reorganization, collectively maintain a balanced load across a cluster of workstations. With the help of both a highly efficient and speedy self-tuning mechanism, based on a new data structure called stat-index, as well as a query aggregation and clustering algorithm, our storage manager attains short query response times even in the presence of massive modifications and highly skewed access patterns. Furthermore, we provide a data migration cost model used to determine the best data redistribution strategy. Through extensive experimentation with our prototype, we establish that our storage manager can sustain significant update rates with minimal overhead.
AB - Contemporary applications continuously modify large volumes of multidimensional data that must be accessed efficiently and, more importantly, must be updated in a timely manner. Single-server storage approaches are insufficient when managing such volumes of data, while the high frequency of data modification renders classical indexing methods inefficient. To address these two problems we introduce a distributed storage manager for multidimensional data based on a Cluster-of-Workstations. The manager addresses the above challenges through a set of mechanisms that, through selective on-line data reorganization, collectively maintain a balanced load across a cluster of workstations. With the help of both a highly efficient and speedy self-tuning mechanism, based on a new data structure called stat-index, as well as a query aggregation and clustering algorithm, our storage manager attains short query response times even in the presence of massive modifications and highly skewed access patterns. Furthermore, we provide a data migration cost model used to determine the best data redistribution strategy. Through extensive experimentation with our prototype, we establish that our storage manager can sustain significant update rates with minimal overhead.
KW - Cluster of workstations
KW - Multi-dimensional data
KW - Self-tuning storage
UR - http://www.scopus.com/inward/record.url?scp=67649529351&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=67649529351&partnerID=8YFLogxK
U2 - 10.1007/s00778-008-0121-2
DO - 10.1007/s00778-008-0121-2
M3 - Article
AN - SCOPUS:67649529351
SN - 1066-8888
VL - 18
SP - 739
EP - 764
JO - VLDB Journal
JF - VLDB Journal
IS - 3
ER -