TY - CHAP
T1 - Interlingual annotation for MT development
AU - Reeder, Florence
AU - Dorr, Bonnie
AU - Farwell, David
AU - Habash, Nizar
AU - Helmreich, Stephen
AU - Hovy, Eduard
AU - Levin, Lori
AU - Mitamura, Teruko
AU - Miller, Keith
AU - Rambow, Owen
AU - Siddharthan, Advaith
PY - 2004
Y1 - 2004
N2 - MT systems that use only superficial representations, including the current generation of statistical MT systems, have been successful and useful. However, they will experience a plateau in quality, much like other "silver bullet" approaches to MT. We pursue work on the development of interlingual representations for use in symbolic or hybrid MT systems. In this paper, we describe the creation of an interlingua and the development of a corpus of semantically annotated text, to be validated in six languages and evaluated in several ways. We have established a distributed, well-functioning research methodology, designed a preliminary interlingua notation, created annotation manuals and tools, developed a test collection in six languages with associated English translations, annotated some 150 translations, and designed and applied various annotation metrics. We describe the data sets being annotated and the interlingual (IL) representation language which uses two ontologies and a systematic theta-role list. We present the annotation tools built and outline the annotation process. Following this, we describe our evaluation methodology and conclude with a summary of issues that have arisen.
UR - http://www.scopus.com/inward/record.url?scp=35048825346&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=35048825346&partnerID=8YFLogxK
U2 - 10.1007/978-3-540-30194-3_26
DO - 10.1007/978-3-540-30194-3_26
M3 - Chapter
AN - SCOPUS:35048825346
SN - 3540233008
SN - 9783540233008
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 236
EP - 245
BT - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
A2 - Frederking, Robert E.
A2 - Taylor, Kathryn B.
PB - Springer Verlag
ER -