Abstract
We present a method for inducing concise and accurate probabilistic context-free grammars for efficient use in the early stages of a multi-stage parsing technique. The method uses statistical tests to determine whether a non-terminal combination is unobserved because of sparse data or because of hard syntactic constraints. Experimental results show that, using this method, high accuracies can be achieved with a non-terminal set that is orders of magnitude smaller than in typically induced probabilistic context-free grammars, leading to substantial speed-ups in parsing. The approach is further used in combination with an existing reranker to provide competitive WSJ parsing results.
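The core idea in the abstract, deciding whether an unseen non-terminal combination reflects sparse data or a hard constraint, can be illustrated with a small sketch. The abstract does not specify which statistical test the authors use, so the test below is an assumption chosen for illustration: under a null hypothesis that the left and right labels of an adjacent pair are independent, it computes the probability of seeing the pair zero times in `n` observations and treats a very small probability as evidence of a constraint. The function name `classify_unobserved_pairs`, the threshold `alpha`, and the labels `"constraint"` and `"sparse"` are all hypothetical.

```python
from collections import Counter


def classify_unobserved_pairs(pairs, alpha=0.01):
    """Label each unseen (left, right) pair as 'constraint' or 'sparse'.

    Illustrative assumption, not the paper's test: under independence,
    a pair occurs with probability p = P(left) * P(right), so the chance
    of never seeing it in n observations is (1 - p) ** n.
    """
    pairs = list(pairs)
    n = len(pairs)
    left_counts = Counter(left for left, _ in pairs)
    right_counts = Counter(right for _, right in pairs)
    seen = set(pairs)

    labels = {}
    for left in left_counts:
        for right in right_counts:
            if (left, right) in seen:
                continue  # only unobserved combinations are classified
            p = (left_counts[left] / n) * (right_counts[right] / n)
            p_zero = (1.0 - p) ** n
            # A tiny p_zero means sparse data is an unlikely explanation.
            labels[(left, right)] = "constraint" if p_zero < alpha else "sparse"
    return labels
```

With frequent labels, an unseen pair is flagged as a likely constraint because chance alone rarely explains its absence; with rare labels, the absence is attributed to sparse data. A real grammar induction pipeline would apply such a decision to restrict which non-terminal combinations the grammar must distinguish, but that step is outside what the abstract describes.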
Original language | English (US) |
---|---|
Pages | 312-319 |
Number of pages | 8 |
State | Published - 2006 |
Event | 2006 Human Language Technology Conference - North American Chapter of the Association for Computational Linguistics Annual Meeting, HLT-NAACL 2006 - New York, NY, United States; Duration: Jun 4 2006 → Jun 9 2006 |
ASJC Scopus subject areas
- Language and Linguistics
- Linguistics and Language