Rapid detection of significant spatial clusters

Daniel B. Neill, Andrew W. Moore

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Abstract

Given an N × N grid of squares, where each square has a count C ij and an underlying population pij, our goal is to find the rectangular region with the highest density, and to calculate its significance by randomization. An arbitrary density function D, dependent on a region's total count C and total population P, can be used. For example, if each count represents the number of disease cases occurring in that square, we can use Kulldorff's spatial scan statistic DK to find the most significant spatial disease cluster. A naive approach to finding the maximum density region requires O(N4) time, and is generally computationally infeasible. We present a multiresolution algorithm which partitions the grid into overlapping regions using a novel overlap-kd tree data structure, bounds the maximum score of subrogions contained in each region, and prunes regions which cannot contain the maximum density region. For sufficiently dense regions, this method finds the maximum density region in O((N log N)2) time, in practice resulting in significant (20-2000x) speedups on both real and simulated datasets.

Original languageEnglish (US)
Title of host publicationKDD-2004 - Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
PublisherAssociation for Computing Machinery
Pages256-265
Number of pages10
ISBN (Print)1581138881, 9781581138887
DOIs
StatePublished - 2004
EventKDD-2004 - Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining - Seattle, WA, United States
Duration: Aug 22 2004Aug 25 2004

Publication series

NameKDD-2004 - Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining

Other

OtherKDD-2004 - Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
CountryUnited States
CitySeattle, WA
Period8/22/048/25/04

Keywords

  • Biosurveillance
  • Cluster detection
  • Spatial scan statistics

ASJC Scopus subject areas

  • Engineering(all)

Fingerprint Dive into the research topics of 'Rapid detection of significant spatial clusters'. Together they form a unique fingerprint.

  • Cite this

    Neill, D. B., & Moore, A. W. (2004). Rapid detection of significant spatial clusters. In KDD-2004 - Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 256-265). (KDD-2004 - Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining). Association for Computing Machinery. https://doi.org/10.1145/1014052.1014082