TY - GEN
T1 - Sentiment analysis of mixed language employing Hindi-English code switching
AU - Sitaram, Dinkar
AU - Murthy, Savitha
AU - Ray, Debraj
AU - Sharma, Devansh
AU - Dhar, Kashyap
N1 - Publisher Copyright: © 2015 IEEE.
PY - 2015/11/30
Y1 - 2015/11/30
AB - Sentiment analysis has emerged as a prominent research area because of its many applications. Monitoring social media, forums, blogs and other online resources for customer reviews, product competition and survey responses to understand customer insight is of significant importance in business analytics. With the proliferation of informal user-generated data online, the use of mixed language has become a common phenomenon. Mixed language arises through linguistic code switching (LCS), the practice of using more than one language in a single sentence. Such mixed language has rarely been a subject of sentiment analysis before. The lack of a clear grammatical structure renders previous approaches to sentiment analysis ineffective for such text. In this paper, we propose a strategy to determine the sentiment of sentences written in a mixed language comprising Hindi and English lexicons. Our technique can be used to analyze the sentiment of data belonging to either of the source languages as well as mixed language data. Grammatical transitions, which are very common in mixed language, have been taken into account during the sentiment analysis. We demonstrate the effectiveness of the proposed approach via case studies on social media data sets.
KW - Grammar
KW - Linguistic code switching
KW - Mixed language
KW - Sentiment analysis
KW - Social media
UR - http://www.scopus.com/inward/record.url?scp=85014812341&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85014812341&partnerID=8YFLogxK
U2 - 10.1109/ICMLC.2015.7340934
DO - 10.1109/ICMLC.2015.7340934
M3 - Conference contribution
AN - SCOPUS:85014812341
T3 - Proceedings - International Conference on Machine Learning and Cybernetics
SP - 271
EP - 276
BT - Proceedings of 2015 International Conference on Machine Learning and Cybernetics, ICMLC 2015
PB - IEEE Computer Society
T2 - 14th International Conference on Machine Learning and Cybernetics, ICMLC 2015
Y2 - 12 July 2015 through 15 July 2015
ER -