TY - GEN
T1 - A multi-dimensional progressive perfect hashing for high-speed string matching
AU - Xu, Yang
AU - Ma, Lei
AU - Liu, Zhaobo
AU - Chao, H. Jonathan
PY - 2011
Y1 - 2011
N2 - Aho-Corasick (AC) automaton is widely used for multi-string matching in today's Network Intrusion Detection System (NIDS). With fast-growing rule sets, implementing AC automaton with a small memory without sacrificing its performance has remained challenging in NIDS design. In this paper, we propose a multi-dimensional progressive perfect hashing algorithm named P 2-Hashing, which allows transitions of an AC automaton to be placed in a compact hash table without any collision. P2-Hashing is based on the observation that a hash key of each transition consists of two dimensions, namely a source state ID and an input character. When placing a transition in a hash table and causing a collision, we can change the value of a dimension of the hash key to rehash the transition to a new location of the hash table. For a given AC automaton, P2-Hashing first divides all the transitions into many small sets based on the two-dimensional values of the hash keys, and then places the sets of transitions progressively into the hash table until all are placed. Hash collisions that occurred during the insertion of a transition will only affect the transitions in the same set. The proposed P 2-Hashing has many unique properties, including fast hash index generation and zero memory overhead, which are very suitable for the AC automaton operation. The feasibility and performance of P2-Hashing are investigated through simulations on the full Snort (6.4k rules) and Clam AV (54k rules) rule sets, each of which is first converted to a single AC automaton. Simulation results show that P2-Hashing can successfully construct the perfect hash table even when the load factor of the hash table is as high as 0.91.
AB - Aho-Corasick (AC) automaton is widely used for multi-string matching in today's Network Intrusion Detection System (NIDS). With fast-growing rule sets, implementing AC automaton with a small memory without sacrificing its performance has remained challenging in NIDS design. In this paper, we propose a multi-dimensional progressive perfect hashing algorithm named P 2-Hashing, which allows transitions of an AC automaton to be placed in a compact hash table without any collision. P2-Hashing is based on the observation that a hash key of each transition consists of two dimensions, namely a source state ID and an input character. When placing a transition in a hash table and causing a collision, we can change the value of a dimension of the hash key to rehash the transition to a new location of the hash table. For a given AC automaton, P2-Hashing first divides all the transitions into many small sets based on the two-dimensional values of the hash keys, and then places the sets of transitions progressively into the hash table until all are placed. Hash collisions that occurred during the insertion of a transition will only affect the transitions in the same set. The proposed P 2-Hashing has many unique properties, including fast hash index generation and zero memory overhead, which are very suitable for the AC automaton operation. The feasibility and performance of P2-Hashing are investigated through simulations on the full Snort (6.4k rules) and Clam AV (54k rules) rule sets, each of which is first converted to a single AC automaton. Simulation results show that P2-Hashing can successfully construct the perfect hash table even when the load factor of the hash table is as high as 0.91.
KW - Aho-Corasick Automaton
KW - Hash Collision
KW - Multi-string Matching
KW - Perfect Hash Table
UR - http://www.scopus.com/inward/record.url?scp=81255143348&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=81255143348&partnerID=8YFLogxK
U2 - 10.1109/ANCS.2011.33
DO - 10.1109/ANCS.2011.33
M3 - Conference contribution
AN - SCOPUS:81255143348
SN - 9780769545219
T3 - Proceedings - 2011 7th ACM/IEEE Symposium on Architectures for Networking and Communications Systems, ANCS 2011
SP - 167
EP - 177
BT - Proceedings - 2011 7th ACM/IEEE Symposium on Architectures for Networking and Communications Systems, ANCS 2011
T2 - 2011 7th ACM/IEEE Symposium on Architectures for Networking and Communications Systems, ANCS 2011
Y2 - 3 October 2011 through 4 October 2011
ER -