Sentiment analysis of mixed language employing Hindi-English code switching

Dinkar Sitaram, Savitha Murthy, Debraj Ray, Devansh Sharma, Kashyap Dhar

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

    Abstract

    Sentiment analysis has emerged as one of the prominent research branches because of its wide range of uses and applications. Monitoring social media, forums, blogs and other online resources for customer reviews, product competition and survey responses to understand customer insight is of significant importance in business analytics. With the proliferation of informal user-generated data online, the use of mixed language has become a common phenomenon. Mixed language arises through linguistic code switching (LCS), the practice of using more than one language in a single sentence. Such mixed language has rarely been a subject of sentiment analysis before. The lack of a clear grammatical structure renders previous approaches to sentiment analysis ineffective for such text. In this paper, we propose a strategy to determine the sentiment of sentences written in a mixed language comprising Hindi and English lexicons. Our technique can be used to analyze the sentiment of data belonging to either of the source languages as well as mixed language data. Grammatical transitions, which are very common in mixed language, are taken into account during the sentiment analysis. We demonstrate the effectiveness of the proposed approach via case studies on social media data sets.

    Original language: English (US)
    Title of host publication: Proceedings of 2015 International Conference on Machine Learning and Cybernetics, ICMLC 2015
    Publisher: IEEE Computer Society
    Pages: 271-276
    Number of pages: 6
    ISBN (Electronic): 9781467372213
    State: Published - Nov 30 2015
    Event: 14th International Conference on Machine Learning and Cybernetics, ICMLC 2015 - Guangzhou, China
    Duration: Jul 12 2015 to Jul 15 2015

    Publication series

    Name: Proceedings - International Conference on Machine Learning and Cybernetics
    Volume: 1
    ISSN (Print): 2160-133X
    ISSN (Electronic): 2160-1348

    Conference

    Conference: 14th International Conference on Machine Learning and Cybernetics, ICMLC 2015
    Country/Territory: China
    City: Guangzhou
    Period: 7/12/15 to 7/15/15

    Keywords

    • Grammar
    • Linguistic code switching
    • Mixed language
    • Sentiment analysis
    • Social media

    ASJC Scopus subject areas

    • Artificial Intelligence
    • Computational Theory and Mathematics
    • Computer Networks and Communications
    • Human-Computer Interaction
