TY - GEN
T1 - Robust Dictionary Lookup in Multiple Noisy Orthographies
AU - Zhang, Lingliang
AU - Habash, Nizar
AU - Toussaint, Godfried
N1 - Publisher Copyright: © 2017 Association for Computational Linguistics
PY - 2017
Y1 - 2017
AB - We present the MultiScript Phonetic Search algorithm to address the problem of language learners looking up unfamiliar words that they heard. We apply it to Arabic dictionary lookup with noisy queries done using both the Arabic and Roman scripts. Our algorithm is based on a computational phonetic distance metric that can be optionally machine learned. To benchmark our performance, we created the ArabScribe dataset, containing 10,000 noisy transcriptions of random Arabic dictionary words. Our algorithm outperforms Google Translate’s “did you mean” feature, as well as the Yamli smart Arabic keyboard.
UR - http://www.scopus.com/inward/record.url?scp=85122950314&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85122950314&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85122950314
T3 - WANLP 2017, co-located with EACL 2017 - 3rd Arabic Natural Language Processing Workshop, Proceedings of the Workshop
SP - 119
EP - 129
BT - WANLP 2017, co-located with EACL 2017 - 3rd Arabic Natural Language Processing Workshop, Proceedings of the Workshop
PB - Association for Computational Linguistics (ACL)
T2 - 3rd Arabic Natural Language Processing Workshop, WANLP 2017 held at EACL 2017
Y2 - 3 April 2017
ER -