Extracting signals from news streams for disease outbreak prediction

Sunandan Chakraborty, Lakshminarayanan Subramanian

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

The emergence of digital news provides new opportunities for information extraction. Proper characterization of unstructured news can help identify signals that may drive variations in many observable phenomena, such as disease outbreaks. In this paper, we propose a method to extract such signals from a large corpus of news events and to identify the subset of signals that are closely related to the observed phenomenon. We show how words appearing in a large news corpus can be represented and how latent features can be extracted from them to build predictive models. We build and evaluate such a system specifically for characterizing and predicting disease outbreaks in India. We focused on 5 different diseases prevalent in India, and experiments showed that our model can predict disease outbreaks 2 to 4 weeks in advance, with an average precision of around 0.80 and recall of around 0.65. We also compared our model with an LDA-based baseline model, over which our model demonstrated around 5-14% improvement across different diseases.

Original language: English (US)
Title of host publication: 2016 IEEE Global Conference on Signal and Information Processing, GlobalSIP 2016 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1300-1304
Number of pages: 5
ISBN (Electronic): 9781509045457
State: Published - Apr 19 2017
Event: 2016 IEEE Global Conference on Signal and Information Processing, GlobalSIP 2016 - Washington, United States
Duration: Dec 7 2016 – Dec 9 2016

Publication series

Name: 2016 IEEE Global Conference on Signal and Information Processing, GlobalSIP 2016 - Proceedings

Other

Other: 2016 IEEE Global Conference on Signal and Information Processing, GlobalSIP 2016
Country: United States
City: Washington
Period: 12/7/16 – 12/9/16

ASJC Scopus subject areas

  • Signal Processing
  • Computer Networks and Communications
