Cache-oblivious string B-trees

Michael A. Bender, Martin Farach-Colton, Bradley C. Kuszmaul

    Research output: Chapter in Book/Report/Conference proceedingConference contribution

    Abstract

    B-trees are the data structure of choice for maintaining searchable data on disk. However, B-trees perform suboptimally when keys are long or of variable length,when keys are compressed, even when using front compression, the standard B-tree compression scheme,for range queries, andwith respect to memory effects such as disk prefetching.This paper presents a cache-oblivious string B-tree (COSB-tree) data structure that is efficient in all these ways: The COSB-tree searches asymptotically optimally and inserts and deletes nearly optimally.It maintains an index whose size is proportional to the front-compressed size of the dictionary. Furthermore, unlike standard front-compressed strings, keys can be decompressed in a memory-efficient manner.It performs range queries with no extra disk seeks; in contrast, B-trees incur disk seeks when skipping from leaf block to leaf block.It utilizes all levels of a memory hierarchy efficiently and makes good use of disk locality by using cache-oblivious layout strategies.

    Original languageEnglish (US)
    Title of host publicationProceedings of the Twenty-Fifth ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, PODS 2006
    Pages233-242
    Number of pages10
    DOIs
    StatePublished - 2006
    Event25th ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, PODS 2006 - Chicago, IL, United States
    Duration: Jun 26 2006Jun 28 2006

    Publication series

    NameProceedings of the ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems

    Other

    Other25th ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, PODS 2006
    Country/TerritoryUnited States
    CityChicago, IL
    Period6/26/066/28/06

    Keywords

    • Cache oblivious string B-tree
    • Locality preserving front compression
    • Packed-memory array
    • Range query
    • Rebalance

    ASJC Scopus subject areas

    • Software
    • Information Systems
    • Hardware and Architecture

    Fingerprint

    Dive into the research topics of 'Cache-oblivious string B-trees'. Together they form a unique fingerprint.

    Cite this