A Sentiment Analysis Service Platform for Streamed Multilingual Tweets

Ioanna Karageorgou, Panagiotis Liakos, Alex Delis

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Micro-blogging and social-media platforms are now prominent forums for disseminating information, opinions and commentaries. Among these, Twitter enjoys a user base in excess of 330M, whose members continually produce and consume information snippets. Users collectively create a voluminous and multilingual corpus on a very broad range of topics on a daily basis. The discourse generated in the blogosphere is often of prime interest and importance to individuals, organizations, and companies. These actors would like to periodically receive an overall assessment of the sentiments expressed on specific issues, obtained by automatically classifying tweets written in different languages in conjunction with big-data analytics. In this paper, we propose a scalable service platform that employs multilingual sentiment analysis to classify streamed tweets and yields analytics for selected topics in real time. We discuss the main components of our Spark-enabled platform as we seek to offer an effective big-data service that can: 1) dynamically handle voluminous as well as high-rate tweet traffic through a multi-component application exploiting the latest software developments, 2) accurately identify messages originated by non-genuine user accounts, and 3) utilize the Spark machine-learning library (MLlib) to successfully classify streamed multilingual messages in real time, using multiple, potentially distributed executors. To empower our service platform, we have adopted training sets and developed sentiment analysis (SA) models for English, French, and Greek that help classify streamed tweets with high accuracy. While experimenting with our distributed analytical platform, we establish both accurate and real-time classification for tweets expressed in the above European languages.

Original language: English (US)
Title of host publication: Proceedings - 2020 IEEE International Conference on Big Data, Big Data 2020
Editors: Xintao Wu, Chris Jermaine, Li Xiong, Xiaohua Tony Hu, Olivera Kotevska, Siyuan Lu, Weijia Xu, Srinivas Aluru, Chengxiang Zhai, Eyhab Al-Masri, Zhiyuan Chen, Jeff Saltz
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 3262-3271
Number of pages: 10
ISBN (Electronic): 9781728162515
State: Published - Dec 10 2020
Event: 8th IEEE International Conference on Big Data, Big Data 2020 - Virtual, Atlanta, United States
Duration: Dec 10 2020 → Dec 13 2020

Publication series

Name: Proceedings - 2020 IEEE International Conference on Big Data, Big Data 2020

Conference

Conference: 8th IEEE International Conference on Big Data, Big Data 2020
Country: United States
City: Virtual, Atlanta
Period: 12/10/20 → 12/13/20

Keywords

  • Classification of Multilingual tweets
  • Real-time Big-data Analysis for Streamed tweets
  • Service Platform for Language Analytics
  • Spark-enabled Big-data Architecture

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Information Systems
  • Information Systems and Management
  • Safety, Risk, Reliability and Quality
