TY - GEN
T1 - On the optimal time/space tradeoff for hash tables
AU - Bender, Michael A.
AU - Farach-Colton, Martín
AU - Kuszmaul, John
AU - Kuszmaul, William
AU - Liu, Mingmou
N1 - Publisher Copyright:
© 2022 ACM.
PY - 2022/9/6
Y1 - 2022/9/6
N2 - For nearly six decades, the central open question in the study of hash tables has been to determine the optimal achievable tradeoff curve between time and space. State-of-the-art hash tables offer the following guarantee: If keys/values are (logn) bits each, then it is possible to achieve constant-time insertions/deletions/queries while wasting only O(loglogn) bits of space per key when compared to the information-theoretic optimum-this bound has been proven to be optimal for a number of closely related problems (e.g., stable hashing, dynamic retrieval, and dynamically-resized filters). This paper shows that O(loglogn) wasted bits per key is not the end of the line for hashing. In fact, for any k [log∗ n], it is possible to achieve O(k)-time insertions/deletions, O(1)-time queries, and the k-th iterated logarithm O(log(k) n) wasted bits per key (all with high probability in n), while also supporting dynamic resizing as the size of the table changes. We further show that this tradeoff curve is the best achievable by any of a large class of hash tables, including any hash table designed using the current framework for making constant-time hash tables succinct. Our result holds for arbitrarily large keys/values, and in the case where keys/values are very small, we can tighten our bounds to o(1) wasted bits per key. Building on this, we obtain a constant-time dynamic filter that uses n glog"-1 g‰+ n loge + o(n) bits of space for a wide choice of false-positive rates ", resolving a long-standing open problem for the design of dynamic filters.
AB - For nearly six decades, the central open question in the study of hash tables has been to determine the optimal achievable tradeoff curve between time and space. State-of-the-art hash tables offer the following guarantee: If keys/values are (logn) bits each, then it is possible to achieve constant-time insertions/deletions/queries while wasting only O(loglogn) bits of space per key when compared to the information-theoretic optimum-this bound has been proven to be optimal for a number of closely related problems (e.g., stable hashing, dynamic retrieval, and dynamically-resized filters). This paper shows that O(loglogn) wasted bits per key is not the end of the line for hashing. In fact, for any k [log∗ n], it is possible to achieve O(k)-time insertions/deletions, O(1)-time queries, and the k-th iterated logarithm O(log(k) n) wasted bits per key (all with high probability in n), while also supporting dynamic resizing as the size of the table changes. We further show that this tradeoff curve is the best achievable by any of a large class of hash tables, including any hash table designed using the current framework for making constant-time hash tables succinct. Our result holds for arbitrarily large keys/values, and in the case where keys/values are very small, we can tighten our bounds to o(1) wasted bits per key. Building on this, we obtain a constant-time dynamic filter that uses n glog"-1 g‰+ n loge + o(n) bits of space for a wide choice of false-positive rates ", resolving a long-standing open problem for the design of dynamic filters.
KW - balls and bins
KW - data structures
KW - hash tables
KW - succinct
UR - http://www.scopus.com/inward/record.url?scp=85130832160&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85130832160&partnerID=8YFLogxK
U2 - 10.1145/3519935.3519969
DO - 10.1145/3519935.3519969
M3 - Conference contribution
AN - SCOPUS:85130832160
T3 - Proceedings of the Annual ACM Symposium on Theory of Computing
SP - 1284
EP - 1297
BT - STOC 2022 - Proceedings of the 54th Annual ACM SIGACT Symposium on Theory of Computing
A2 - Leonardi, Stefano
A2 - Gupta, Anupam
PB - Association for Computing Machinery
T2 - 54th Annual ACM SIGACT Symposium on Theory of Computing, STOC 2022
Y2 - 20 June 2022 through 24 June 2022
ER -