TY - GEN
T1 - Cache-oblivious streaming B-trees
AU - Bender, Michael A.
AU - Farach-Colton, Martin
AU - Fineman, Jeremy T.
AU - Fogel, Yonatan R.
AU - Kuszmaul, Bradley C.
AU - Nelson, Jelani
PY - 2007
Y1 - 2007
N2 - A streaming B-tree is a dictionary that efficiently implements insertions and range queries. We present two cache-oblivious streaming B-trees, the shuttle tree, and the cache-oblivious lookahead array (COLA). For block-transfer size B and on N elements, the shuttle tree implements searches in optimal O(log_{B+1} N) transfers, range queries of L successive elements in optimal O(log_{B+1} N + L/B) transfers, and insertions in O((log_{B+1} N)/B^{Θ(1/(log log B)^2)} + (log^2 N)/B) transfers, which is an asymptotic speedup over traditional B-trees if B ≥ (log N)^{1+c log log log^2 N} for any constant c > 1. A COLA implements searches in O(log N) transfers, range queries in O(log N + L/B) transfers, and insertions in amortized O((log N)/B) transfers, matching the bounds for a (cache-aware) buffered repository tree. A partially deamortized COLA matches these bounds but reduces the worst-case insertion cost to O(log N) if memory size M = Ω(log N). We also present a cache-aware version of the COLA, the lookahead array, which achieves the same bounds as Brodal and Fagerberg's (cache-aware) B^ε-tree. We compare our COLA implementation to a traditional B-tree. Our COLA implementation runs 790 times faster for random insertions, 3.1 times slower for insertions of sorted data, and 3.5 times slower for searches.
AB - A streaming B-tree is a dictionary that efficiently implements insertions and range queries. We present two cache-oblivious streaming B-trees, the shuttle tree, and the cache-oblivious lookahead array (COLA). For block-transfer size B and on N elements, the shuttle tree implements searches in optimal O(log_{B+1} N) transfers, range queries of L successive elements in optimal O(log_{B+1} N + L/B) transfers, and insertions in O((log_{B+1} N)/B^{Θ(1/(log log B)^2)} + (log^2 N)/B) transfers, which is an asymptotic speedup over traditional B-trees if B ≥ (log N)^{1+c log log log^2 N} for any constant c > 1. A COLA implements searches in O(log N) transfers, range queries in O(log N + L/B) transfers, and insertions in amortized O((log N)/B) transfers, matching the bounds for a (cache-aware) buffered repository tree. A partially deamortized COLA matches these bounds but reduces the worst-case insertion cost to O(log N) if memory size M = Ω(log N). We also present a cache-aware version of the COLA, the lookahead array, which achieves the same bounds as Brodal and Fagerberg's (cache-aware) B^ε-tree. We compare our COLA implementation to a traditional B-tree. Our COLA implementation runs 790 times faster for random insertions, 3.1 times slower for insertions of sorted data, and 3.5 times slower for searches.
KW - Buffered repository tree
KW - Cache-oblivious B-tree
KW - Cascading array
KW - Deamortized
KW - Lookahead array
KW - Shuttle tree
UR - http://www.scopus.com/inward/record.url?scp=35248857461&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=35248857461&partnerID=8YFLogxK
U2 - 10.1145/1248377.1248393
DO - 10.1145/1248377.1248393
M3 - Conference contribution
AN - SCOPUS:35248857461
SN - 159593667X
SN - 9781595936677
T3 - Annual ACM Symposium on Parallelism in Algorithms and Architectures
SP - 81
EP - 92
BT - SPAA'07
T2 - SPAA'07: 19th Annual Symposium on Parallelism in Algorithms and Architectures
Y2 - 9 June 2007 through 11 June 2007
ER -