TY - GEN
T1 - Flushing without cascades
AU - Bender, Michael A.
AU - Das, Rathish
AU - Farach-Colton, Martín
AU - Johnson, Rob
AU - Kuszmaul, William
N1 - Publisher Copyright: Copyright © 2020 by SIAM
PY - 2020
Y1 - 2020
AB - Buffer-and-flush is a technique for transforming standard external-memory search trees into write-optimized search trees. In exchange for faster amortized insertions, buffer-and-flush can sometimes significantly increase the latency of operations by causing cascades of flushes. In this paper, we show that flushing cascades are not a fundamental consequence of the buffer-flushing technique, and can be removed entirely using randomization techniques. The underlying implementation of buffer flushing relies on a buffer-eviction strategy at each node in the tree. The ability for the user to select the buffer eviction strategy based on the workload has been shown to be important for performance, both in theory and in practice. In order to support arbitrary buffer-eviction strategies, we introduce the notion of a universal flush, which uses a universal eviction policy that can simulate any other eviction policy. This abstracts away the underlying eviction strategy, even allowing for workload-specific strategies that change dynamically. Our deamortization preserves the amortized throughput of the underlying flushing strategy on all workloads. In particular, with our deamortization and a node cache of size poly-logarithmic in the number of insertions performed on the tree, the amortized insertion cost matches the lower bound of Brodal and Fagerberg. For typical parameters, the lower bound is less than 1 I/O per insertion. For such parameters, our worst-case insertion cost is O(1) I/Os.
UR - http://www.scopus.com/inward/record.url?scp=85084077932&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85084077932&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85084077932
T3 - Proceedings of the Annual ACM-SIAM Symposium on Discrete Algorithms
SP - 650
EP - 669
BT - 31st Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2020
A2 - Chawla, Shuchi
PB - Association for Computing Machinery
T2 - 31st Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2020
Y2 - 5 January 2020 through 8 January 2020
ER -