Automatic topic identification and classification of text messages in the SMSAll system

Fahad Pervaiz, Lakshmi Subramanian, Umar Saif

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

This paper presents a method for identifying topics and classifying text messages in the SMSAll system, which is the Twitter of Pakistan (except that it runs over SMS). One of many challenges is developing an unsupervised algorithm for text messages that mix Urdu and English words written in Roman letters. Even so, with 1-grams we achieve 72%, 53%, and 58% true positives for popular, medium, and rare topics respectively, and 48% and 40% true positives with 2-grams and 3-grams respectively.

Original language: English (US)
Title of host publication: Proceedings of the 2nd ACM Symposium on Computing for Development, DEV 2012
State: Published - 2012
Event: 2nd ACM Symposium on Computing for Development, DEV 2012 - Atlanta, GA, United States
Duration: Mar 11 2012 → Mar 12 2012

Publication series

Name: Proceedings of the 2nd ACM Symposium on Computing for Development, DEV 2012

Other

Other: 2nd ACM Symposium on Computing for Development, DEV 2012
Country/Territory: United States
City: Atlanta, GA
Period: 3/11/12 → 3/12/12

Keywords

  • SMS messages
  • SMSAll
  • classification
  • topic identification

ASJC Scopus subject areas

  • Computational Theory and Mathematics
  • Applied Mathematics
