Abstract
We explore scalable and accurate methods for detecting dynamic patterns in graph-based data sets. We apply our proposed Dynamic Subset Scan method to the task of detecting, tracking, and source-tracing contaminant plumes spreading through a water distribution system equipped with noisy, binary sensors. While static patterns affect the same subset of the data over a period of time, dynamic patterns may affect a different subset of the data at each time step. These dynamic patterns require a new approach to defining and optimizing penalized likelihood ratio statistics in the subset scan framework, as well as new computational techniques that scale to large, real-world networks. To address the first requirement, we develop new subset scan methods that allow the detected subset of nodes to change over time, while incorporating temporal consistency constraints to reward patterns that do not change dramatically between adjacent time steps. To address the second, our Additive Graph Scan algorithm allows our novel scan statistic to process small graphs (500 nodes) in 4.1 seconds on average while maintaining an approximation ratio over 99% compared to an exact optimization method, and to scale to large graphs with over 12,000 nodes in 30 minutes on average. Evaluation results across multiple detection, tracking, and source-tracing tasks demonstrate substantial performance gains achieved by the Dynamic Subset Scan approach.
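The idea of letting the detected subset change over time while rewarding temporal consistency can be illustrated with a small sketch. This is not the paper's Dynamic Subset Scan or Additive Graph Scan algorithm. It is a simplified toy that assumes an additive per-node log-likelihood ratio for binary sensors, a fixed penalty for each change in a node's membership between adjacent time steps, and no graph connectivity constraint. Under those assumptions the optimization decomposes into an independent two-state dynamic program per node. The names `node_llr`, `best_membership`, `dynamic_subsets`, and the parameters `tpr`, `fpr`, and `switch_penalty` are all hypothetical.

```python
import math


def node_llr(x, tpr=0.9, fpr=0.1):
    """Log-likelihood ratio of one binary reading: affected vs. unaffected.

    tpr and fpr are assumed sensor true/false positive rates (hypothetical).
    """
    if x:
        return math.log(tpr / fpr)
    return math.log((1 - tpr) / (1 - fpr))


def best_membership(readings, switch_penalty=1.0, tpr=0.9, fpr=0.1):
    """Best in/out sequence for one node via a two-state dynamic program.

    The score is the sum of LLRs over time steps where the node is in the
    subset, minus switch_penalty for every change of membership between
    adjacent time steps. Returns (best_score, path) with path[t] in {0, 1}.
    """
    # score[s] = best score of any sequence ending in state s (0 = out, 1 = in)
    score = [0.0, node_llr(readings[0], tpr, fpr)]
    back = []  # back[t][s] = best previous state when in state s at step t + 1
    for x in readings[1:]:
        gain = node_llr(x, tpr, fpr)
        new_score, choice = [], []
        for s in (0, 1):
            stay = score[s]
            switch = score[1 - s] - switch_penalty
            choice.append(s if stay >= switch else 1 - s)
            new_score.append(max(stay, switch) + (gain if s == 1 else 0.0))
        back.append(choice)
        score = new_score

    # Trace the best final state back to the first time step.
    state = 1 if score[1] > score[0] else 0
    best = score[state]
    path = [state]
    for choice in reversed(back):
        state = choice[state]
        path.append(state)
    path.reverse()
    return best, path


def dynamic_subsets(sensor_readings, switch_penalty=1.0):
    """Map {node: [0/1 readings]} to (total_score, [subset per time step])."""
    n_steps = len(next(iter(sensor_readings.values())))
    subsets = [set() for _ in range(n_steps)]
    total = 0.0
    for node, readings in sensor_readings.items():
        best, path = best_membership(readings, switch_penalty)
        total += best
        for t, inside in enumerate(path):
            if inside:
                subsets[t].add(node)
    return total, subsets


if __name__ == "__main__":
    readings = {"a": [0, 1, 1, 1], "b": [0, 0, 0, 0], "c": [1, 0, 1, 1]}
    total, subsets = dynamic_subsets(readings, switch_penalty=1.0)
    print(total, subsets)
```

Raising `switch_penalty` makes the selected subsets more stable across time, so an isolated reading that disagrees with its neighbors in time is more likely to be treated as sensor noise. A realistic version would also need the connectivity constraints of the water network, which this sketch omits.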
Original language | English (US) |
---|---|
Article number | 6729554 |
Pages (from-to) | 697-706 |
Number of pages | 10 |
Journal | Proceedings - IEEE International Conference on Data Mining, ICDM |
State | Published - 2013 |
Event | 13th IEEE International Conference on Data Mining, ICDM 2013, Dallas, TX, United States. Duration: Dec 7 2013 → Dec 10 2013 |
Keywords
- likelihood ratio statistics
- sensor fusion
- spatial and subset scan statistics
- water distribution systems
ASJC Scopus subject areas
- General Engineering