Self-tuning management of update-intensive multidimensional data in clusters of workstations

Vassil Kriakov, George Kollios, Alex Delis

Research output: Contribution to journal › Article › peer-review

Abstract

Contemporary applications continuously modify large volumes of multidimensional data that must be accessed efficiently and, more importantly, must be updated in a timely manner. Single-server storage approaches are insufficient for managing such volumes of data, while the high frequency of data modification renders classical indexing methods inefficient. To address these two problems, we introduce a distributed storage manager for multidimensional data based on a cluster of workstations. The manager addresses these challenges through a set of mechanisms that use selective on-line data reorganization to collectively maintain a balanced load across the cluster. With the help of an efficient self-tuning mechanism based on a new data structure called stat-index, together with a query aggregation and clustering algorithm, our storage manager attains short query response times even in the presence of massive modifications and highly skewed access patterns. Furthermore, we provide a data migration cost model used to determine the best data redistribution strategy. Through extensive experimentation with our prototype, we establish that our storage manager can sustain significant update rates with minimal overhead.

Original language: English (US)
Pages (from-to): 739-764
Number of pages: 26
Journal: VLDB Journal
Volume: 18
Issue number: 3
State: Published - Jun 2009

Keywords

  • Cluster of workstations
  • Multi-dimensional data
  • Self-tuning storage

ASJC Scopus subject areas

  • Information Systems
  • Hardware and Architecture
