TY - GEN
T1 - Hardware implementation for scalable lookahead regular expressions detection
AU - Bando, Masanori
AU - Artan, N. Sertac
AU - Mehta, Nishit
AU - Guan, Yi
AU - Chao, H. Jonathan
PY - 2010
Y1 - 2010
N2 - Regular Expressions (RegExes) are widely used in various applications to identify strings of text. Their flexibility, however, increases the complexity of the detection system and often limits the detection speed as well as the total number of RegExes that can be detected using limited resources. The two classical detection methods, Deterministic Finite Automaton (DFA) and Non-Deterministic Finite Automaton (NFA), have the potential problems of prohibitively large memory requirements and a large number of concurrent operations, respectively. Although recent schemes addressing these problems to improve DFA and NFA are promising, they are inherently limited by their scalability, since they follow the state transition model in DFA and NFA, where the state transitions occur per each character of the input. We recently proposed a scalable RegEx detection system called Lookahead Finite Automata (LaFA) to solve these problems with three novel ideas: 1. Provide specialized and optimized detection modules to increase resource utilizations. 2. Systematically reordering the RegEx detection sequence to reduce number of concurrent operations. 3. Sharing states among automata for different RegExes to reduce resource requirements. In this paper, we propose an efficient hardware architecture and prototype design implementation based on LaFA. Our proof-of-concept prototype design is built on a fraction of a single commodity Field Programmable Gate Array (FPGA) chip and can accommodate up to twenty-five thousand (25k) RegExes. Using only 7% of the logic area and 25% of the memory on a Xilinx Virtex-4 FX100, the prototype design can achieve 2-Gbps (gigabits-per-second) detection throughput with only one detection engine. We estimate that 34-Gbps detection throughput can be achieved if the entire resources of a state-of-the-art FPGA chip are used to implement multiple detection engines.
AB - Regular Expressions (RegExes) are widely used in various applications to identify strings of text. Their flexibility, however, increases the complexity of the detection system and often limits the detection speed as well as the total number of RegExes that can be detected using limited resources. The two classical detection methods, Deterministic Finite Automaton (DFA) and Non-Deterministic Finite Automaton (NFA), have the potential problems of prohibitively large memory requirements and a large number of concurrent operations, respectively. Although recent schemes addressing these problems to improve DFA and NFA are promising, they are inherently limited by their scalability, since they follow the state transition model in DFA and NFA, where the state transitions occur per each character of the input. We recently proposed a scalable RegEx detection system called Lookahead Finite Automata (LaFA) to solve these problems with three novel ideas: 1. Provide specialized and optimized detection modules to increase resource utilizations. 2. Systematically reordering the RegEx detection sequence to reduce number of concurrent operations. 3. Sharing states among automata for different RegExes to reduce resource requirements. In this paper, we propose an efficient hardware architecture and prototype design implementation based on LaFA. Our proof-of-concept prototype design is built on a fraction of a single commodity Field Programmable Gate Array (FPGA) chip and can accommodate up to twenty-five thousand (25k) RegExes. Using only 7% of the logic area and 25% of the memory on a Xilinx Virtex-4 FX100, the prototype design can achieve 2-Gbps (gigabits-per-second) detection throughput with only one detection engine. We estimate that 34-Gbps detection throughput can be achieved if the entire resources of a state-of-the-art FPGA chip are used to implement multiple detection engines.
UR - http://www.scopus.com/inward/record.url?scp=77954054248&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=77954054248&partnerID=8YFLogxK
U2 - 10.1109/IPDPSW.2010.5470750
DO - 10.1109/IPDPSW.2010.5470750
M3 - Conference contribution
AN - SCOPUS:77954054248
SN - 9781424465347
T3 - Proceedings of the 2010 IEEE International Symposium on Parallel and Distributed Processing, Workshops and Phd Forum, IPDPSW 2010
BT - Proceedings of the 2010 IEEE International Symposium on Parallel and Distributed Processing, Workshops and Phd Forum, IPDPSW 2010
T2 - 2010 IEEE International Symposium on Parallel and Distributed Processing, Workshops and Phd Forum, IPDPSW 2010
Y2 - 19 April 2010 through 23 April 2010
ER -