Abstract
This paper describes Lexogen, a system for generating natural-language sentences from Lexical Conceptual Structure, an interlingual representation. The system has been developed as part of a Chinese-English Machine Translation (MT) system; however, it is designed to be used for many other MT language pairs and natural language applications. The contributions of this work include: (1) development of a large-scale Hybrid Natural Language Generation system with language-independent components; (2) enhancements to an interlingual representation and associated algorithm for generation from ambiguous input; (3) development of an efficient, reusable, language-independent linearization module with a grammar description language that can be used with other systems; (4) improvements to an earlier algorithm for hierarchically mapping thematic roles to surface positions; and (5) development of a diagnostic tool for lexicon coverage and correctness, and use of the tool to verify English, Spanish, and Chinese lexicons. An evaluation of Chinese-English translation quality shows performance comparable to that of a commercial translation system. The generation system can also be extended to other languages; this is demonstrated and evaluated for Spanish.
Original language | English (US) |
---|---|
Pages (from-to) | 81-128 |
Number of pages | 48 |
Journal | Machine Translation |
Volume | 18 |
Issue number | 2 |
State | Published - 2003 |
Keywords
- Hybrid Natural Language Generation
- Interlingua
- Lexical Conceptual Structure
- Multilingual Natural Language Generation
ASJC Scopus subject areas
- Software
- Language and Linguistics
- Linguistics and Language
- Artificial Intelligence