Optimal hashing in external memory

Alex Conway, Martín Farach-Colton, Philip Shilane

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

    Abstract

    Hash tables are a ubiquitous class of dictionary data structures. However, standard hash table implementations do not translate well into the external memory model, because they do not incorporate locality for insertions. Iacono and Pătraşcu established an update/query tradeoff curve for external-hash tables: a hash table that performs insertions in O(λ/B) amortized IOs requires Ω(log_λ N) expected IOs for queries, where N is the number of items that can be stored in the data structure, B is the size of a memory transfer, M is the size of memory, and λ is a tuning parameter. They provide a complicated hashing data structure, which we call the IP hash table, that meets this curve for λ that is Ω(log log M + log_M N). In this paper, we present a simpler external-memory hash table, the Bundle of Arrays Hash Table (BOA), that is optimal for a narrower range of λ. The simplicity of BOAs allows them to be readily modified to achieve the following results:

    • A new external-memory data structure, the Bundle of Trees Hash Table (BOT), that matches the performance of the IP hash table, while retaining some of the simplicity of the BOAs.
    • The Cache-Oblivious Bundle of Trees Hash Table (COBOT), the first cache-oblivious hash table. This data structure matches the optimality of BOTs and IP hash tables over the same range of λ.
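
    To make the update/query tradeoff in the abstract concrete, the sketch below evaluates the two quantities it names, λ/B for insertions and log_λ N for queries, at a few values of λ. It is only an illustration of the curve's shape: the asymptotic constants are dropped, the function name `tradeoff_point` and the sample values of N and B are assumptions made here rather than anything taken from the paper, and the code models no actual hash table.

```python
import math


def tradeoff_point(lam: float, n_items: int, block_size: int) -> tuple[float, float]:
    """Return (insert_ios, query_ios) for tuning parameter lam.

    Both values drop the hidden constants of the asymptotic bounds:
    insert_ios stands in for the lam / B amortized insertion cost and
    query_ios for the log_lam(N) expected query cost.
    """
    if lam <= 1:
        raise ValueError("lam must exceed 1 so that log base lam is defined")
    insert_ios = lam / block_size
    query_ios = math.log(n_items) / math.log(lam)
    return insert_ios, query_ios


if __name__ == "__main__":
    N = 2**30  # hypothetical number of stored items
    B = 2**10  # hypothetical block (memory transfer) size, in items
    for lam in (2, 32, 1024):
        ins, qry = tradeoff_point(lam, N, B)
        print(f"lambda={lam:5d}  insert ~ {ins:.4f} IOs  query ~ {qry:.1f} IOs")
```

    Raising λ makes each insertion more expensive and each query cheaper, which is the direction of the tradeoff the abstract describes. The range of λ over which a given structure actually meets the curve is a separate condition that this sketch does not check.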

    Original language: English (US)
    Title of host publication: 45th International Colloquium on Automata, Languages, and Programming, ICALP 2018
    Editors: Christos Kaklamanis, Daniel Marx, Ioannis Chatzigiannakis, Donald Sannella
    Publisher: Schloss Dagstuhl - Leibniz-Zentrum für Informatik GmbH, Dagstuhl Publishing
    ISBN (Electronic): 9783959770767
    DOIs
    State: Published - Jul 1 2018
    Event: 45th International Colloquium on Automata, Languages, and Programming, ICALP 2018 - Prague, Czech Republic
    Duration: Jul 9 2018 to Jul 13 2018

    Publication series

    Name: Leibniz International Proceedings in Informatics, LIPIcs
    Volume: 107
    ISSN (Print): 1868-8969

    Other

    Other: 45th International Colloquium on Automata, Languages, and Programming, ICALP 2018
    Country/Territory: Czech Republic
    City: Prague
    Period: 7/9/18 to 7/13/18

    Keywords

    • Asymmetric data structures
    • Cache-oblivious algorithms
    • External memory algorithms
    • Hash tables

    ASJC Scopus subject areas

    • Software
