Abstract
C tools, such as source browsers, bug finders, and automated refactorings, need to process two languages: C itself and the preprocessor. The latter improves expressivity through file includes, macros, and static conditionals. But it operates only on tokens, making it hard to even parse both languages. This paper presents a complete, performant solution to this problem. First, a configuration-preserving preprocessor resolves includes and macros yet leaves static conditionals intact, thus preserving a program's variability. To ensure completeness, we analyze all interactions between preprocessor features and identify techniques for correctly handling them. Second, a configuration-preserving parser generates a well-formed AST with static choice nodes for conditionals. It forks new subparsers when encountering static conditionals and merges them again after the conditionals. To ensure performance, we present a simple algorithm for table-driven Fork-Merge LR parsing and four novel optimizations. We demonstrate the effectiveness of our approach on the x86 Linux kernel.
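The idea of keeping static conditionals as choice nodes, rather than resolving them for one configuration, can be illustrated with a minimal sketch. This is not the paper's implementation and not Fork-Merge LR parsing: it works on whole lines instead of tokens, handles only `#ifdef`/`#else`/`#endif`, and does not parse C at all. The names `Choice`, `preserve`, and `project` are hypothetical and chosen only for this illustration.

```python
from dataclasses import dataclass
from typing import List, Set, Union


@dataclass
class Choice:
    """Hypothetical static choice node: both branches of a conditional are kept."""
    condition: str                 # macro name tested by #ifdef
    then_branch: List["Node"]      # content when the macro is defined
    else_branch: List["Node"]      # content when it is not defined


Node = Union[str, Choice]


def preserve(lines: List[str]) -> List[Node]:
    """Build a tree that keeps every conditional instead of picking a branch."""
    root: List[Node] = []
    current = root                 # list that receives nodes right now
    stack = []                     # (enclosing list, open Choice) pairs
    for line in lines:
        text = line.strip()
        if text.startswith("#ifdef "):
            choice = Choice(text.split()[1], [], [])
            current.append(choice)
            stack.append((current, choice))
            current = choice.then_branch   # "fork": enter the first branch
        elif text == "#else":
            current = stack[-1][1].else_branch
        elif text == "#endif":
            current, _ = stack.pop()       # "merge": resume the enclosing list
        elif text:
            current.append(text)
    return root


def project(nodes: List[Node], defined: Set[str]) -> List[str]:
    """Resolve every choice node for one concrete configuration."""
    out: List[str] = []
    for node in nodes:
        if isinstance(node, Choice):
            taken = node.then_branch if node.condition in defined else node.else_branch
            out.extend(project(taken, defined))
        else:
            out.append(node)
    return out
```

A tool working on the output of `preserve` sees all configurations at once, while `project` recovers what an ordinary preprocessor run would produce for a single set of defined macros.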
Original language | English (US) |
---|---|
Pages (from-to) | 323-334 |
Number of pages | 12 |
Journal | ACM SIGPLAN Notices |
Volume | 47 |
Issue number | 6 |
State | Published - Aug 2012 |
Keywords
- C
- Fork-Merge LR parsing
- LR parsing
- Preprocessor
- SuperC
ASJC Scopus subject areas
- General Computer Science