TY - GEN

T1 - On the optimal time/space tradeoff for hash tables

AU - Bender, Michael A.

AU - Farach-Colton, Martín

AU - Kuszmaul, John

AU - Kuszmaul, William

AU - Liu, Mingmou

N1 - Publisher Copyright:
© 2022 ACM.

PY - 2022/9/6

Y1 - 2022/9/6

N2 - For nearly six decades, the central open question in the study of hash tables has been to determine the optimal achievable tradeoff curve between time and space. State-of-the-art hash tables offer the following guarantee: If keys/values are (logn) bits each, then it is possible to achieve constant-time insertions/deletions/queries while wasting only O(loglogn) bits of space per key when compared to the information-theoretic optimum-this bound has been proven to be optimal for a number of closely related problems (e.g., stable hashing, dynamic retrieval, and dynamically-resized filters). This paper shows that O(loglogn) wasted bits per key is not the end of the line for hashing. In fact, for any k [log∗ n], it is possible to achieve O(k)-time insertions/deletions, O(1)-time queries, and the k-th iterated logarithm O(log(k) n) wasted bits per key (all with high probability in n), while also supporting dynamic resizing as the size of the table changes. We further show that this tradeoff curve is the best achievable by any of a large class of hash tables, including any hash table designed using the current framework for making constant-time hash tables succinct. Our result holds for arbitrarily large keys/values, and in the case where keys/values are very small, we can tighten our bounds to o(1) wasted bits per key. Building on this, we obtain a constant-time dynamic filter that uses n glog"-1 g‰+ n loge + o(n) bits of space for a wide choice of false-positive rates ", resolving a long-standing open problem for the design of dynamic filters.

AB - For nearly six decades, the central open question in the study of hash tables has been to determine the optimal achievable tradeoff curve between time and space. State-of-the-art hash tables offer the following guarantee: If keys/values are (logn) bits each, then it is possible to achieve constant-time insertions/deletions/queries while wasting only O(loglogn) bits of space per key when compared to the information-theoretic optimum-this bound has been proven to be optimal for a number of closely related problems (e.g., stable hashing, dynamic retrieval, and dynamically-resized filters). This paper shows that O(loglogn) wasted bits per key is not the end of the line for hashing. In fact, for any k [log∗ n], it is possible to achieve O(k)-time insertions/deletions, O(1)-time queries, and the k-th iterated logarithm O(log(k) n) wasted bits per key (all with high probability in n), while also supporting dynamic resizing as the size of the table changes. We further show that this tradeoff curve is the best achievable by any of a large class of hash tables, including any hash table designed using the current framework for making constant-time hash tables succinct. Our result holds for arbitrarily large keys/values, and in the case where keys/values are very small, we can tighten our bounds to o(1) wasted bits per key. Building on this, we obtain a constant-time dynamic filter that uses n glog"-1 g‰+ n loge + o(n) bits of space for a wide choice of false-positive rates ", resolving a long-standing open problem for the design of dynamic filters.

KW - balls and bins

KW - data structures

KW - hash tables

KW - succinct

UR - http://www.scopus.com/inward/record.url?scp=85130832160&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85130832160&partnerID=8YFLogxK

U2 - 10.1145/3519935.3519969

DO - 10.1145/3519935.3519969

M3 - Conference contribution

AN - SCOPUS:85130832160

T3 - Proceedings of the Annual ACM Symposium on Theory of Computing

SP - 1284

EP - 1297

BT - STOC 2022 - Proceedings of the 54th Annual ACM SIGACT Symposium on Theory of Computing

A2 - Leonardi, Stefano

A2 - Gupta, Anupam

PB - Association for Computing Machinery

T2 - 54th Annual ACM SIGACT Symposium on Theory of Computing, STOC 2022

Y2 - 20 June 2022 through 24 June 2022

ER -