Perfect hashing for strings: Formalization and algorithms

Martin Farach, S. Muthukrishnan

    Research output: Chapter in Book/Report/Conference proceedingConference contribution

    Abstract

    Numbers and strings are two objects manipulated by most programs. Hashing has been well-studied for numbers and it has been effective in practice. In contrast, basic hashing issues for strings remain largely unexplored. In this paper, we identify and formulate the core hashing problem for strings that we call substring hashing. Our main technical results are highly efficient sequential/parallel (CRCW PRAM) Las Vegas type algorithms that determine a perfect hash function for substring hashing. For example, given a binary string of length n, one of our algorithms finds a perfect hash function in 0(log n) time, O(n) work, and O(n) space; the hash value for any substring can then be computed in O(loglogn) time using a single processor. Our approach relies on a novel use of the suffix tree of a string. In implementing our approach, we design optimal parallel algorithms for the problem of determining weighted ancestors on a edge-weighted tree that may be of independent interest.

    Original languageEnglish (US)
    Title of host publicationCombinatorial Pattern Matching - 7th Annual Symposium, CPM 1996, Proceedings
    EditorsGene Myers, Dan Hirschberg
    PublisherSpringer Verlag
    Pages130-140
    Number of pages11
    ISBN (Print)3540612580, 9783540612582
    DOIs
    StatePublished - 1996
    Event7th Annual Symposium on Combinatorial Pattern Matching, CPM 1996 - Laguna Beach, United States
    Duration: Jun 10 1996Jun 12 1996

    Publication series

    NameLecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
    Volume1075
    ISSN (Print)0302-9743
    ISSN (Electronic)1611-3349

    Conference

    Conference7th Annual Symposium on Combinatorial Pattern Matching, CPM 1996
    Country/TerritoryUnited States
    CityLaguna Beach
    Period6/10/966/12/96

    ASJC Scopus subject areas

    • Theoretical Computer Science
    • General Computer Science

    Fingerprint

    Dive into the research topics of 'Perfect hashing for strings: Formalization and algorithms'. Together they form a unique fingerprint.

    Cite this