TY - JOUR
T1 - Don't thrash
T2 - How to cache your hash on flash
AU - Bender, Michael A.
AU - Farach-Colton, Martin
AU - Johnson, Rob
AU - Kraner, Russell
AU - Kuszmaul, Bradley C.
AU - Medjedovic, Dzejla
AU - Montes, Pablo
AU - Shetty, Pradeep
AU - Spillane, Richard P.
AU - Zadok, Erez
PY - 2012/7
Y1 - 2012/7
N2 - This paper presents new alternatives to the well-known Bloom filter data structure. The Bloom filter, a compact data structure supporting set insertion and membership queries, has found wide application in databases, storage systems, and networks. Because the Bloom filter performs frequent random reads and writes, it is used almost exclu-sively in RAM, limiting the size of the sets it can represent. This paper first describes the quotient filter, which sup-ports the basic operations of the Bloom filter, achieving roughly comparable performance in terms of space and time, but with better data locality. Operations on the quotient fil-ter require only a small number of contiguous accesses. The quotient filter has other advantages over the Bloom filter: it supports deletions, it can be dynamically resized, and two quotient filters can be efficiently merged. The paper then gives two data structures, the buffered quotient filter and the cascade filter, which exploit the quo-tient filter advantages and thus serve as SSD-optimized al-ternatives to the Bloom filter. The cascade filter has better asymptotic I/O performance than the buffered quotient fil-ter, but the buffered quotient filter outperforms the cascade filter on small to medium data sets. Both data structures significantly outperform recently-proposed SSD-optimized Bloom filter variants, such as the elevator Bloom filter, buffered Bloom filter, and forest-structured Bloom filter. In experiments, the cascade filter and buffered quotient fil-ter performed insertions 8.6-11 times faster than the fastest Bloom filter variant and performed lookups 0.94-2.56 times faster.
AB - This paper presents new alternatives to the well-known Bloom filter data structure. The Bloom filter, a compact data structure supporting set insertion and membership queries, has found wide application in databases, storage systems, and networks. Because the Bloom filter performs frequent random reads and writes, it is used almost exclu-sively in RAM, limiting the size of the sets it can represent. This paper first describes the quotient filter, which sup-ports the basic operations of the Bloom filter, achieving roughly comparable performance in terms of space and time, but with better data locality. Operations on the quotient fil-ter require only a small number of contiguous accesses. The quotient filter has other advantages over the Bloom filter: it supports deletions, it can be dynamically resized, and two quotient filters can be efficiently merged. The paper then gives two data structures, the buffered quotient filter and the cascade filter, which exploit the quo-tient filter advantages and thus serve as SSD-optimized al-ternatives to the Bloom filter. The cascade filter has better asymptotic I/O performance than the buffered quotient fil-ter, but the buffered quotient filter outperforms the cascade filter on small to medium data sets. Both data structures significantly outperform recently-proposed SSD-optimized Bloom filter variants, such as the elevator Bloom filter, buffered Bloom filter, and forest-structured Bloom filter. In experiments, the cascade filter and buffered quotient fil-ter performed insertions 8.6-11 times faster than the fastest Bloom filter variant and performed lookups 0.94-2.56 times faster.
UR - http://www.scopus.com/inward/record.url?scp=84873108209&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84873108209&partnerID=8YFLogxK
U2 - 10.14778/2350229.2350275
DO - 10.14778/2350229.2350275
M3 - Article
AN - SCOPUS:84873108209
SN - 2150-8097
VL - 5
SP - 1627
EP - 1637
JO - Proceedings of the VLDB Endowment
JF - Proceedings of the VLDB Endowment
IS - 11
ER -