TY - JOUR
T1 - SING
T2 - Subgraph search In Non-homogeneous Graphs
AU - Di Natale, Raffaele
AU - Ferro, Alfredo
AU - Giugno, Rosalba
AU - Mongiovì, Misael
AU - Pulvirenti, Alfredo
AU - Shasha, Dennis
N1 - Funding Information: We would like to thank the authors of gIndex, CTree and GCoding for having kindly provided their tools for comparison purposes. Authors were in part supported by PROGETTO FIRB ITALY-ISRAEL grant n. RBIN04BYZ7: “Algorithms for Patterns Discovery and Retrieval in discrete structures with applications to Bioinformatics” and by the Sicily Region grants PROGETTO POR 3.14: “Ricerca e Sviluppo suite di programmi per l’analisi biologica, denominata: BIOWARE”. D. Shasha’s work was partly supported by the US National Science Foundation grants GM 32877-21/22, IIS-0414763, DBI-0445666, IOB-0519985, DBI-0519984, DBI-0421604, and N2010-0115586.
PY - 2010/2/19
Y1 - 2010/2/19
N2 - Background: Finding the subgraphs of a graph database that are isomorphic to a given query graph has practical applications in several fields, from cheminformatics to image understanding. Since subgraph isomorphism is a computationally hard problem, indexing techniques have been intensively exploited to speed up the process. Such systems filter out those graphs which cannot contain the query, and apply a subgraph isomorphism algorithm to each residual candidate graph. The applicability of such systems is limited to databases of small graphs, because their filtering power degrades on large graphs. Results: In this paper, SING (Subgraph search In Non-homogeneous Graphs), a novel indexing system able to cope with large graphs, is presented. The method uses the notion of feature, which can be a small subgraph, subtree or path. Each graph in the database is annotated with the set of all its features. The key point is to make use of feature locality information. This idea is used to both improve the filtering performance and speed up the subgraph isomorphism task. Conclusions: Extensive tests on chemical compounds, biological networks and synthetic graphs show that the proposed system outperforms the most popular systems in query time over databases of medium and large graphs. Other specific tests show that the proposed system is effective for single large graphs.
AB - Background: Finding the subgraphs of a graph database that are isomorphic to a given query graph has practical applications in several fields, from cheminformatics to image understanding. Since subgraph isomorphism is a computationally hard problem, indexing techniques have been intensively exploited to speed up the process. Such systems filter out those graphs which cannot contain the query, and apply a subgraph isomorphism algorithm to each residual candidate graph. The applicability of such systems is limited to databases of small graphs, because their filtering power degrades on large graphs. Results: In this paper, SING (Subgraph search In Non-homogeneous Graphs), a novel indexing system able to cope with large graphs, is presented. The method uses the notion of feature, which can be a small subgraph, subtree or path. Each graph in the database is annotated with the set of all its features. The key point is to make use of feature locality information. This idea is used to both improve the filtering performance and speed up the subgraph isomorphism task. Conclusions: Extensive tests on chemical compounds, biological networks and synthetic graphs show that the proposed system outperforms the most popular systems in query time over databases of medium and large graphs. Other specific tests show that the proposed system is effective for single large graphs.
UR - http://www.scopus.com/inward/record.url?scp=77952310982&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=77952310982&partnerID=8YFLogxK
U2 - 10.1186/1471-2105-11-96
DO - 10.1186/1471-2105-11-96
M3 - Article
C2 - 20170516
AN - SCOPUS:77952310982
SN - 1471-2105
VL - 11
JO - BMC Bioinformatics
JF - BMC Bioinformatics
M1 - 96
ER -