TY - GEN
T1 - Foreign Words and the Automatic Processing of Arabic Social Media Text Written in Roman Script
AU - Eskander, Ramy
AU - Al-Badrashiny, Mohamed
AU - Habash, Nizar
AU - Rambow, Owen
N1 - Publisher Copyright: © 2014 Association for Computational Linguistics
PY - 2014
Y1 - 2014
AB - Arabic on social media has all the properties of any language on social media that make it tough for natural language processing, plus some specific problems. These include diglossia, the use of an alternative alphabet (Roman), and code switching with foreign languages. In this paper, we present a system which can process Arabic written in the Roman alphabet (“Arabizi”). It identifies whether each word is a foreign word or belongs to one of four other categories (Arabic, name, punctuation, sound), and transliterates Arabic words and names into the Arabic alphabet. We obtain an overall system performance of 83.8% on an unseen test set.
UR - http://www.scopus.com/inward/record.url?scp=85082806742&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85082806742&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85082806742
T3 - 1st Workshop on Computational Approaches to Code Switching, Switching 2014 at the 2014 Conference on Empirical Methods in Natural Language Processing, EMNLP 2014 - Proceedings
SP - 1
EP - 12
BT - 1st Workshop on Computational Approaches to Code Switching, Switching 2014 at the 2014 Conference on Empirical Methods in Natural Language Processing, EMNLP 2014 - Proceedings
A2 - Diab, Mona
A2 - Hirschberg, Julia
A2 - Fung, Pascale
A2 - Solorio, Thamar
PB - Association for Computational Linguistics (ACL)
T2 - 1st Workshop on Computational Approaches to Code Switching, Switching 2014 at the 2014 Conference on Empirical Methods in Natural Language Processing, EMNLP 2014
Y2 - 25 October 2014
ER -