A generalized fast Subset Sums framework for Bayesian event detection

Kan Shao, Yandong Liu, Daniel B. Neill

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

We present Generalized Fast Subset Sums (GFSS), a new Bayesian framework for scalable and accurate detection of irregularly shaped spatial clusters using multiple data streams. GFSS extends the previously proposed Multivariate Bayesian Scan Statistic (MBSS) and Fast Subset Sums (FSS) approaches for detection of emerging events. The detection power of MBSS is primarily limited by computational considerations, which limit it to searching over circular spatial regions. GFSS enables more accurate and timely detection by defining a hierarchical prior over all subsets of the N locations, first selecting a local neighborhood consisting of a center location and its neighbors, and introducing a sparsity parameter P to describe how likely each location in the neighborhood is to be affected. This approach allows us to consider all possible subsets of locations (including irregularly shaped regions) but also puts higher weight on more compact regions. We demonstrate that MBSS and FSS are both special cases of this general framework (assuming P = 1 and P = 0.5 respectively), but substantially higher detection power can be achieved by choosing an appropriate value of P. Thus we show that the distribution of the sparsity parameter P can be accurately learned from a small number of labeled events. Our evaluation results (on synthetic disease outbreaks injected into real-world hospital data) show that the GFSS method with learned sparsity parameter has higher detection power and spatial accuracy than MBSS and FSS, particularly when the affected region is irregular or elongated. We also show that the learned models can be used for event characterization, accurately distinguishing between two otherwise identical event types based on the sparsity of the affected spatial region.
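The hierarchical prior described in the abstract can be illustrated with a small sampling sketch. This is not the authors' implementation: the abstract does not say how the center or the neighborhood size is distributed, so the uniform choices below are assumptions, and the names `sample_region`, `coords`, and `k_max` are hypothetical. Only the role of the sparsity parameter (here `p`) is taken from the abstract.

```python
import random
from math import dist


def sample_region(coords, p, k_max, rng):
    """Draw one subset of locations from a GFSS-style hierarchical prior.

    coords: list of (x, y) location coordinates (hypothetical input format)
    p:      sparsity parameter, the probability that each location in the
            chosen neighborhood is affected
    k_max:  largest neighborhood size considered (assumed parameter)
    rng:    a random.Random instance
    """
    n = len(coords)
    center = rng.randrange(n)   # assumption: center chosen uniformly at random
    k = rng.randint(1, k_max)   # assumption: neighborhood size uniform on 1..k_max
    # Neighborhood: the k locations closest to the center (the center itself
    # is at distance 0, so it is included).
    by_distance = sorted(range(n), key=lambda i: dist(coords[center], coords[i]))
    neighborhood = by_distance[:k]
    # Each location in the neighborhood is affected independently with probability p.
    return {i for i in neighborhood if rng.random() < p}
```

Under these assumptions, `p = 1.0` always returns the full neighborhood, which corresponds to the compact, roughly circular regions the abstract associates with MBSS, while `p = 0.5` makes every subset of the neighborhood equally likely, which corresponds to the FSS special case. Intermediate or learned values of `p` shift the prior weight between sparse and compact regions.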

Original language: English (US)
Title of host publication: Proceedings - 11th IEEE International Conference on Data Mining, ICDM 2011
Pages: 617-625
Number of pages: 9
State: Published - 2011
Event: 11th IEEE International Conference on Data Mining, ICDM 2011 - Vancouver, BC, Canada
Duration: Dec 11 2011 to Dec 14 2011

Publication series

Name: Proceedings - IEEE International Conference on Data Mining, ICDM
ISSN (Print): 1550-4786

Other

Other: 11th IEEE International Conference on Data Mining, ICDM 2011
Country/Territory: Canada
City: Vancouver, BC
Period: 12/11/11 to 12/14/11

Keywords

  • Biosurveillance
  • Event detection
  • Scan statistics

ASJC Scopus subject areas

  • General Engineering
