From XML Schema to relations: A cost-based approach to XML storage

Philip Bohannon, Juliana Freire, Prasan Roy, Jérôme Siméon

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

As Web applications manipulate an increasing amount of XML, there is a growing interest in storing XML data in relational databases. Due to the mismatch between the complexity of XML's tree structure and the simplicity of flat relational tables, there are many ways to store the same document in an RDBMS, and a number of heuristic techniques have been proposed. These techniques typically define fixed mappings and do not take application characteristics into account. However, a fixed mapping is unlikely to work well for all possible applications. In contrast, LegoDB is a cost-based XML storage mapping engine that explores a space of possible XML-to-relational mappings and selects the best mapping for a given application. LegoDB leverages current XML and relational technologies: 1) it models the target application with an XML Schema, XML data statistics, and an XQuery workload; 2) the space of configurations is generated through XML-Schema rewritings; and 3) the best among the derived configurations is selected using cost estimates obtained through a standard relational optimizer. In this paper, we describe the LegoDB storage engine and provide experimental results that demonstrate the effectiveness of this approach.
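The selection step described in the abstract can be illustrated with a minimal sketch of cost-based search over storage configurations. Everything here is an assumption made for illustration: the encoding of a configuration as the set of elements stored in their own table, the names `neighbors`, `toy_cost`, and `greedy_search`, the sample workload, and the greedy strategy itself. In the approach the abstract describes, candidate configurations come from XML-Schema rewritings and costs come from a standard relational optimizer; neither is modeled here.

```python
from typing import Callable, FrozenSet, Iterable, List, Set, Tuple

# Hypothetical encoding: a configuration is the set of elements that get
# their own table ("outlined"); all other elements are inlined in the parent.
Config = FrozenSet[str]

ELEMENTS = ["title", "review", "actor"]          # illustrative element names
WORKLOAD: List[Tuple[Set[str], int]] = [         # (elements accessed, query weight)
    ({"title"}, 10),
    ({"title", "review"}, 1),
]
JOIN_COST = 3   # toy cost of joining one outlined table into a query
SCAN_COST = 1   # toy cost per inlined element widening the parent row


def toy_cost(config: Config) -> int:
    """Stand-in for optimizer estimates: weighted join and row-width costs."""
    inlined = len(ELEMENTS) - len(config)
    return sum(
        weight * (JOIN_COST * len(accessed & config) + SCAN_COST * inlined)
        for accessed, weight in WORKLOAD
    )


def neighbors(config: Config, elements: Iterable[str]) -> Iterable[Config]:
    """Yield configurations one rewriting away: toggle one element's storage."""
    for element in elements:
        yield config ^ frozenset([element])


def greedy_search(
    start: Config, elements: List[str], cost: Callable[[Config], int]
) -> Tuple[Config, int]:
    """Move to the cheapest neighbor until no neighbor lowers the cost."""
    best, best_cost = start, cost(start)
    while True:
        candidate = min(neighbors(best, elements), key=cost, default=None)
        if candidate is None or cost(candidate) >= best_cost:
            return best, best_cost
        best, best_cost = candidate, cost(candidate)


# Start with every element inlined and let the workload drive the choice.
best, best_cost = greedy_search(frozenset(), ELEMENTS, toy_cost)
print(sorted(best), best_cost)
```

With this toy workload the search outlines the elements that the frequent query never touches, which is the kind of application-specific decision a fixed mapping cannot make. A different workload or cost model would yield a different configuration.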

Original language: English (US)
Title of host publication: Proceedings - International Conference on Data Engineering
Editors: R Agrawal, K Dittrich, A Ngu
Pages: 64-75
Number of pages: 12
State: Published - 2002
Event: 18th International Conference on Data Engineering - San Jose, CA, United States
Duration: Feb 26 2002 - Mar 1 2002

ASJC Scopus subject areas

  • Software
  • Engineering (all)
  • Engineering (miscellaneous)
