Bloom filters, adaptivity, and the dictionary problem

Michael A. Bender, Martín Farach-Colton, Mayank Goswami, Rob Johnson, Samuel McCauley, Shikha Singh

    Research output: Chapter in Book/Report/Conference proceedingConference contribution

    Abstract

    An approximate membership query data structure (AMQ)-such as a Bloom, quotient, or cuckoo filter-maintains a compact, probabilistic representation of a set S of keys from a universe U. It supports lookups and inserts. Some AMQs also support deletes. A query for x ∈ S returns PRESENT. A query for x ∉ S returns PRESENT with a tunable false-positive probability ϵ, and otherwise returns ABSENT. AMQs are widely used to speed up dictionaries that are stored remotely (e.g., on disk or across a network). The AMQ is stored locally (e.g., in memory). The remote dictionary is only accessed when the AMQ returns PRESENT. Thus, the primary performance metric of an AMQ is how often it returns ABSENT for negative queries. Existing AMQs offer weak guarantees on the number of false positives in a sequence of queries. The false-positive probability ϵ holds only for a single query. It is easy for an adversary to drive an AMQ's false-positive rate towards 1 by simply repeating false positives. This paper shows what it takes to get strong guarantees on the number of false positives. We say that an AMQ is adaptive if it guarantees a false-positive probability of ϵ for every query, regardless of answers to previous queries. We establish upper and lower bounds for adaptive AMQs. Our lower bound shows that it is impossible to build a small adaptive AMQ, even when the AMQ is immediately told whenever a query is a false positive. On the other hand, we show that it is possible to maintain an AMQ that uses the same amount of local space as a non-adaptive AMQ (up to lower order terms), performs all queries and updates in constant time, and guarantees that each negative query to the dictionary accesses remote storage with probability ϵ, independent of the results of past queries. Thus, we show that adaptivity can be achieved effectively for free.

    Original languageEnglish (US)
    Title of host publicationProceedings - 59th Annual IEEE Symposium on Foundations of Computer Science, FOCS 2018
    EditorsMikkel Thorup
    PublisherIEEE Computer Society
    Pages182-193
    Number of pages12
    ISBN (Electronic)9781538642306
    DOIs
    StatePublished - Nov 30 2018
    Event59th Annual IEEE Symposium on Foundations of Computer Science, FOCS 2018 - Paris, France
    Duration: Oct 7 2018Oct 9 2018

    Publication series

    NameProceedings - Annual IEEE Symposium on Foundations of Computer Science, FOCS
    Volume2018-October
    ISSN (Print)0272-5428

    Other

    Other59th Annual IEEE Symposium on Foundations of Computer Science, FOCS 2018
    Country/TerritoryFrance
    CityParis
    Period10/7/1810/9/18

    Keywords

    • Adaptive data structures
    • Approximate membership query data structures
    • Bloom filters
    • Dictionary data structures

    ASJC Scopus subject areas

    • General Computer Science

    Fingerprint

    Dive into the research topics of 'Bloom filters, adaptivity, and the dictionary problem'. Together they form a unique fingerprint.

    Cite this