TY - GEN
T1 - Automatic topic identification and classification of text messages in the SMSAll system
AU - Pervaiz, Fahad
AU - Subramanian, Lakshmi
AU - Saif, Umar
PY - 2012
Y1 - 2012
AB - This paper presents a method for identifying topics and classifying text messages in the SMSAll system, which is the Twitter of Pakistan (except that it runs over SMS). One of the main challenges is developing an unsupervised algorithm for text messages that mix Urdu and English words written in Roman letters. Even so, with 1-grams we achieve 72%, 53% and 58% true positives for popular, medium and rare topics respectively, and 48% and 40% true positives with 2-grams and 3-grams respectively.
KW - SMS messages
KW - SMSAll
KW - classification
KW - topic identification
UR - http://www.scopus.com/inward/record.url?scp=84889717123&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84889717123&partnerID=8YFLogxK
U2 - 10.1145/2160601.2160626
DO - 10.1145/2160601.2160626
M3 - Conference contribution
AN - SCOPUS:84889717123
SN - 9781450312622
T3 - Proceedings of the 2nd ACM Symposium on Computing for Development, DEV 2012
BT - Proceedings of the 2nd ACM Symposium on Computing for Development, DEV 2012
T2 - 2nd ACM Symposium on Computing for Development, DEV 2012
Y2 - 11 March 2012 through 12 March 2012
ER -