TY - GEN
T1 - Perfect hashing for strings
T2 - 7th Annual Symposium on Combinatorial Pattern Matching, CPM 1996
AU - Farach, Martin
AU - Muthukrishnan, S.
N1 - Publisher Copyright:
© Springer-Verlag Berlin Heidelberg 1996.
PY - 1996
Y1 - 1996
N2 - Numbers and strings are two objects manipulated by most programs. Hashing has been well-studied for numbers and it has been effective in practice. In contrast, basic hashing issues for strings remain largely unexplored. In this paper, we identify and formulate the core hashing problem for strings that we call substring hashing. Our main technical results are highly efficient sequential/parallel (CRCW PRAM) Las Vegas type algorithms that determine a perfect hash function for substring hashing. For example, given a binary string of length n, one of our algorithms finds a perfect hash function in 0(log n) time, O(n) work, and O(n) space; the hash value for any substring can then be computed in O(loglogn) time using a single processor. Our approach relies on a novel use of the suffix tree of a string. In implementing our approach, we design optimal parallel algorithms for the problem of determining weighted ancestors on a edge-weighted tree that may be of independent interest.
AB - Numbers and strings are two objects manipulated by most programs. Hashing has been well-studied for numbers and it has been effective in practice. In contrast, basic hashing issues for strings remain largely unexplored. In this paper, we identify and formulate the core hashing problem for strings that we call substring hashing. Our main technical results are highly efficient sequential/parallel (CRCW PRAM) Las Vegas type algorithms that determine a perfect hash function for substring hashing. For example, given a binary string of length n, one of our algorithms finds a perfect hash function in 0(log n) time, O(n) work, and O(n) space; the hash value for any substring can then be computed in O(loglogn) time using a single processor. Our approach relies on a novel use of the suffix tree of a string. In implementing our approach, we design optimal parallel algorithms for the problem of determining weighted ancestors on a edge-weighted tree that may be of independent interest.
UR - http://www.scopus.com/inward/record.url?scp=84957685983&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84957685983&partnerID=8YFLogxK
U2 - 10.1007/3-540-61258-0_11
DO - 10.1007/3-540-61258-0_11
M3 - Conference contribution
AN - SCOPUS:84957685983
SN - 3540612580
SN - 9783540612582
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 130
EP - 140
BT - Combinatorial Pattern Matching - 7th Annual Symposium, CPM 1996, Proceedings
A2 - Myers, Gene
A2 - Hirschberg, Dan
PB - Springer Verlag
Y2 - 10 June 1996 through 12 June 1996
ER -