Abstract
We present new subset scan methods for multivariate event detection in massive space-time datasets. We extend the recently proposed 'fast subset scan' framework from univariate to multivariate data, enabling computationally efficient detection of irregular space-time clusters even when the numbers of spatial locations and data streams are large. For two variants of the multivariate subset scan, we demonstrate that the scan statistic can be efficiently optimized over proximity-constrained subsets of locations and over all subsets of the monitored data streams, enabling timely detection of emerging events and accurate characterization of the affected locations and streams. Using our new fast search algorithms, we perform an empirical comparison of the Subset Aggregation and Kulldorff multivariate subset scans on synthetic data and real-world disease surveillance tasks, demonstrating tradeoffs between the detection and characterization performance of the two methods.
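For readers unfamiliar with the univariate 'fast subset scan' that this work extends, the following is a minimal sketch of its core idea, not the paper's multivariate algorithm. It assumes an expectation-based Poisson scan statistic and a single data stream. For score functions with the linear-time subset scanning property, the highest-scoring subset of N locations is one of the N "top-k" subsets obtained by sorting locations by the ratio of count to baseline, so only N subsets need to be scored rather than 2^N. The function names (`ebp_score`, `fast_subset_scan`, `brute_force_scan`) and the example data are hypothetical.

```python
import math
from itertools import combinations


def ebp_score(c, b):
    """Expectation-based Poisson log-likelihood ratio for an aggregate
    count c and aggregate baseline (expected count) b. Assumed statistic."""
    if b <= 0 or c <= b:
        return 0.0
    return c * math.log(c / b) + b - c


def fast_subset_scan(counts, baselines):
    """Score only the top-k subsets, with locations ordered by count/baseline.

    Returns (best_score, sorted list of location indices in the best subset).
    """
    order = sorted(range(len(counts)),
                   key=lambda i: counts[i] / baselines[i],
                   reverse=True)
    best_score, best_subset = 0.0, []
    c_sum = b_sum = 0.0
    for k, i in enumerate(order, start=1):
        # Grow the subset by one location at a time, in priority order.
        c_sum += counts[i]
        b_sum += baselines[i]
        score = ebp_score(c_sum, b_sum)
        if score > best_score:
            best_score, best_subset = score, sorted(order[:k])
    return best_score, best_subset


def brute_force_scan(counts, baselines):
    """Exhaustive search over all non-empty subsets, for checking only."""
    n = len(counts)
    best_score, best_subset = 0.0, []
    for size in range(1, n + 1):
        for subset in combinations(range(n), size):
            score = ebp_score(sum(counts[i] for i in subset),
                              sum(baselines[i] for i in subset))
            if score > best_score:
                best_score, best_subset = score, list(subset)
    return best_score, best_subset


# Hypothetical data: five locations, one data stream.
counts = [12, 3, 20, 5, 9]
baselines = [5.0, 4.0, 8.0, 6.0, 9.5]
fast_score, fast_subset = fast_subset_scan(counts, baselines)
brute_score, brute_subset = brute_force_scan(counts, baselines)
```

On this toy input the sorted scan and the exhaustive search should agree on the best score. The sketch leaves out the proximity constraints and the search over subsets of data streams that the abstract describes.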
Original language | English (US) |
---|---|
Pages (from-to) | 2185-2208 |
Number of pages | 24 |
Journal | Statistics in Medicine |
Volume | 32 |
Issue number | 13 |
State | Published - Jun 15 2013 |
Keywords
- Algorithms
- Disease surveillance
- Event detection
- Scan statistics
- Spatial scan
ASJC Scopus subject areas
- Epidemiology
- Statistics and Probability