TY - GEN
T1 - Better extensibility through modular syntax
AU - Grimm, Robert
PY - 2006
Y1 - 2006
N2 - We explore how to make the benefits of modularity available for syntactic specifications and present Rats!, a parser generator for Java that supports easily extensible syntax. Our parser generator builds on recent research on parsing expression grammars (PEGs), which, by being closed under composition, prioritizing choices, supporting unlimited lookahead, and integrating lexing and parsing, offer an attractive alternative to context-free grammars. PEGs are implemented by so-called packrat parsers, which are recursive descent parsers that memoize all intermediate results (hence their name). Memoization ensures linear-time performance in the presence of unlimited lookahead, but also results in an essentially lazy, functional parsing technique. In this paper, we explore how to leverage PEGs and packrat parsers as the foundation for extensible syntax. In particular, we show how to make packrat parsing more widely applicable by implementing this lazy, functional technique in a strict, imperative language, while also generating better performing parsers through aggressive optimizations. Next, we develop a module system for organizing, modifying, and composing large-scale syntactic specifications. Finally, we describe a new technique for managing (global) parsing state in functional parsers. Our experimental evaluation demonstrates that the resulting parser generator succeeds at providing extensible syntax. In particular, Rats! enables other grammar writers to realize real-world language extensions in little time and code, and it generates parsers that consistently outperform parsers created by two GLR parser generators.
AB - We explore how to make the benefits of modularity available for syntactic specifications and present Rats!, a parser generator for Java that supports easily extensible syntax. Our parser generator builds on recent research on parsing expression grammars (PEGs), which, by being closed under composition, prioritizing choices, supporting unlimited lookahead, and integrating lexing and parsing, offer an attractive alternative to context-free grammars. PEGs are implemented by so-called packrat parsers, which are recursive descent parsers that memoize all intermediate results (hence their name). Memoization ensures linear-time performance in the presence of unlimited lookahead, but also results in an essentially lazy, functional parsing technique. In this paper, we explore how to leverage PEGs and packrat parsers as the foundation for extensible syntax. In particular, we show how to make packrat parsing more widely applicable by implementing this lazy, functional technique in a strict, imperative language, while also generating better performing parsers through aggressive optimizations. Next, we develop a module system for organizing, modifying, and composing large-scale syntactic specifications. Finally, we describe a new technique for managing (global) parsing state in functional parsers. Our experimental evaluation demonstrates that the resulting parser generator succeeds at providing extensible syntax. In particular, Rats! enables other grammar writers to realize real-world language extensions in little time and code, and it generates parsers that consistently outperform parsers created by two GLR parser generators.
KW - Extensible syntax
KW - Module system
KW - Packrat parsing
KW - Parser generator
KW - Parsing expression grammar
KW - Rats!
UR - http://www.scopus.com/inward/record.url?scp=33746046773&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=33746046773&partnerID=8YFLogxK
U2 - 10.1145/1133255.1133987
DO - 10.1145/1133255.1133987
M3 - Conference contribution
AN - SCOPUS:33746046773
SN - 1595933204
SN - 9781595933201
VL - 2006
T3 - Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI)
SP - 38
EP - 51
BT - ACM SIGPLAN Notices. Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI)
T2 - PLDI 2006 - 2006 ACM SIGPLAN Conference on Programming Language Design and Implementation
Y2 - 10 June 2006 through 16 June 2006
ER -