TY - GEN

T1 - Point pattern search in big data

AU - Porto, Fabio

AU - Krone-Martins, Alberto

AU - Rittmeyer, João N.

AU - Valduriez, Patrick

AU - Ogasawara, Eduardo

AU - Shasha, Dennis

N1 - Funding Information:
This research is partially funded by EU H2020 Program and MCTI/RNP-Brazil(HPC4e Project - grant agreement number 689772),CNPq (PQ 310109/2015-9),CAPES, INRIA(SciDISC associated team) and the Computational Biology Institute (www.ibc-montpellier.fr), INRIA international chair, U.S. National Science Foundation MCB-1158273, IOS-1139362 and MCB-1412232. This support is greatly appreciated.

PY - 2018/7/9

Y1 - 2018/7/9

N2 - Consider a set of points P in space with at least some of the pairwise distances specified. Given this set P, consider the following three kinds of queries against a database D of points: (i) pure constellation query: find all sets S in D of size |P| that exactly match the pairwise distances within P up to an additive error ϵ; (ii) isotropic constellation queries: find all sets S in D of size |P| such that there exists some scale factor f for which the distances between pairs in S exactly match f times the distances between corresponding pairs of P up to an additive ϵ; (iii) non-isotropic constellation queries: find all sets S in D of size |P| such that there exists some scale factor f and for at least some pairs of points, a maximum stretch factor mi , j > 1 such that (f × mi , j ×dist(pi, pj))+ϵ > dist(si,sj) >(f × dist(pi, pj)) - ϵ. Finding matches to such queries has applications to spatial data in astronomical, seismic, and any domain in which (approximate, scale-independent) geometrical matching is required. Answering the isotropic and non-isotropic queries is challenging because scale factors and stretch factors may take any of an infinite number of values. This paper proposes practically efficient sequential and distributed algorithms for pure, isotropic, and non-isotropic constellation queries. As far as we know, this is the first work to address isotropic and non-isotropic queries.

AB - Consider a set of points P in space with at least some of the pairwise distances specified. Given this set P, consider the following three kinds of queries against a database D of points: (i) pure constellation query: find all sets S in D of size |P| that exactly match the pairwise distances within P up to an additive error ϵ; (ii) isotropic constellation queries: find all sets S in D of size |P| such that there exists some scale factor f for which the distances between pairs in S exactly match f times the distances between corresponding pairs of P up to an additive ϵ; (iii) non-isotropic constellation queries: find all sets S in D of size |P| such that there exists some scale factor f and for at least some pairs of points, a maximum stretch factor mi , j > 1 such that (f × mi , j ×dist(pi, pj))+ϵ > dist(si,sj) >(f × dist(pi, pj)) - ϵ. Finding matches to such queries has applications to spatial data in astronomical, seismic, and any domain in which (approximate, scale-independent) geometrical matching is required. Answering the isotropic and non-isotropic queries is challenging because scale factors and stretch factors may take any of an infinite number of values. This paper proposes practically efficient sequential and distributed algorithms for pure, isotropic, and non-isotropic constellation queries. As far as we know, this is the first work to address isotropic and non-isotropic queries.

KW - Big Data

KW - Distance Matching

KW - Geometrical Patterns

KW - Isotropic

KW - Pattern Search

KW - Point set Registration

KW - Spatial Patterns

UR - http://www.scopus.com/inward/record.url?scp=85054936142&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85054936142&partnerID=8YFLogxK

U2 - 10.1145/3221269.3221294

DO - 10.1145/3221269.3221294

M3 - Conference contribution

AN - SCOPUS:85054936142

T3 - ACM International Conference Proceeding Series

BT - Scientific and Statistical Database Management - 30th International Conference, SSDBM 2018, Proceedings

A2 - Bohlen, Michael

A2 - Gamper, Johann

A2 - Kroger, Peer

A2 - Sacharidis, Dimitris

PB - Association for Computing Machinery

T2 - 30th International Conference on Scientific and Statistical Database Management, SSDBM 2018

Y2 - 9 July 2018 through 11 July 2018

ER -