B-trees with inserts and deletes: Why free-at-empty is better than merge-at-half

Theodore Johnson, Dennis Shasha

Research output: Contribution to journalArticlepeer-review

Abstract

The space utilization of B-tree nodes determines the number of levels in the B-tree and hence its performance. Until now, the only analytical aid to the determination of a B-tree′s utilization has been the analysis by Yao and related work. Yao showed that the utilization of B-tree nodes under pure inserts is 69%. We derive analytically and verify by simulation the utilization of B-tree nodes constructed from a mixture of insert and delete operations. Assuming that nodes only merge (i.e., are freed) when they are empty we show that the utilization is 39% when the number of inserts is the same as the number of deletes. However, it there are just 5% more inserts than deletes, then the utilization is over 62%. We also calculate the probability of splitting and merging. We derive a simple rule-of-thumb that accurately calculates the probability of splitting. We also model B-trees that merge half-empty nodes. The utilization of merge-at-half B-trees is slightly larger than the utilization of free-at-empty B-trees, but the restructuring rate is much higher. For most purposes, this implies that free-at-empty B-trees are a better implementation choice than merge-at-half B-trees. We present two models for computing B-tree utilization, the more accurate of which remembers items inserted and then deleted in a node.

Original languageEnglish (US)
Pages (from-to)45-76
Number of pages32
JournalJournal of Computer and System Sciences
Volume47
Issue number1
DOIs
StatePublished - 1993

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Networks and Communications
  • Computational Theory and Mathematics
  • Applied Mathematics

Fingerprint

Dive into the research topics of 'B-trees with inserts and deletes: Why free-at-empty is better than merge-at-half'. Together they form a unique fingerprint.

Cite this