TY - GEN
T1 - Processing spontaneous orthography
AU - Eskander, Ramy
AU - Habash, Nizar
AU - Rambow, Owen
AU - Tomeh, Nadi
N1 - Publisher Copyright: © 2013 Association for Computational Linguistics.
PY - 2013
Y1 - 2013
N2 - In cases in which there is no standard orthography for a language or language variant, written texts will display a variety of orthographic choices. This is problematic for natural language processing (NLP) because it creates spurious data sparseness. We study the transformation of spontaneously spelled Egyptian Arabic into a conventionalized orthography which we have previously proposed for NLP purposes. We show that a two-stage process can reduce divergences from this standard by 69%, making subsequent processing of Egyptian Arabic easier.
UR - http://www.scopus.com/inward/record.url?scp=84926175165&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84926175165&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:84926175165
T3 - NAACL HLT 2013 - 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Proceedings of the Main Conference
SP - 585
EP - 595
BT - Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics
PB - Association for Computational Linguistics (ACL)
T2 - 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL HLT 2013
Y2 - 9 June 2013 through 14 June 2013
ER -