Fast analytical methods for finding significant labeled graph motifs

Giovanni Micale, Rosalba Giugno, Alfredo Ferro, Misael Mongiovì, Dennis Shasha, Alfredo Pulvirenti

Research output: Contribution to journalArticlepeer-review

Abstract

Network motif discovery is the problem of finding subgraphs of a network that occur more frequently than expected, according to some reasonable null hypothesis. Such subgraphs may indicate small scale interaction features in genomic interaction networks or intriguing relationships involving actors or a relationship among airlines. When nodes are labeled, they can carry information such as the genomic entity under study or the dominant genre of an actor. For that reason, labeled subgraphs convey information beyond structure and could therefore enjoy more applications. To identify statistically significant motifs in a given network, we propose an analytical method (i.e. simulation-free) that extends the works of Picard et al. (J Comput Biol 15(1):1–20, 2008) and Schbath et al. (J Bioinform Syst Biol 2009(1):616234, 2009) to label-dependent scale-free graph models. We provide an analytical expression of the mean and variance of the count under the Expected Degree Distribution random graph model. Our model deals with both induced and non-induced motifs. We have tested our methodology on a wide set of graphs ranging from protein–protein interaction networks to movie networks. The analytical model is a fast (usually faster by orders of magnitude) alternative to simulation. This advantage increases as graphs grow in size.

Original languageEnglish (US)
Pages (from-to)504-531
Number of pages28
JournalData Mining and Knowledge Discovery
Volume32
Issue number2
DOIs
StatePublished - Mar 1 2018

Keywords

  • Graph algorithms
  • Labeled graph motifs
  • Network mining
  • Random network models

ASJC Scopus subject areas

  • Information Systems
  • Computer Science Applications
  • Computer Networks and Communications

Fingerprint Dive into the research topics of 'Fast analytical methods for finding significant labeled graph motifs'. Together they form a unique fingerprint.

Cite this