TY - GEN
T1 - From Language to Family and Back
T2 - 2013 Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT 2013
AU - Stolerman, Ariel
AU - Islam, Aylin Caliskan
AU - Greenstadt, Rachel
N1 - Publisher Copyright: © 2013 Association for Computational Linguistics.
PY - 2013
Y1 - 2013
N2 - Revealing an anonymous author’s traits from text is a well-researched area. In this paper we aim to identify the native language and language family of a non-native English author, given his/her English writings. We extract features from the text based on prior work, and extend or modify it to construct different feature sets, and use support vector machines for classification. We show that native language identification accuracy can be improved by up to 6.43% for a 9-class task, depending on the feature set, by introducing a novel method to incorporate language family information. In addition we show that introducing grammar-based features improves accuracy of both native language and language family identification.
UR - http://www.scopus.com/inward/record.url?scp=84958036970&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84958036970&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:84958036970
T3 - Proceedings of the 2013 Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT 2013 - Student Research Workshop
SP - 32
EP - 39
BT - Proceedings of the 2013 Annual Conference of the North American Chapter of the Association for Computational Linguistics
A2 - Louis, Annie
A2 - Socher, Richard
A2 - Hockenmaier, Julia
A2 - Ringger, Eric K.
PB - Association for Computational Linguistics (ACL)
Y2 - 9 June 2013 through 14 June 2013
ER -