Abstract
We examine the problem of uncovering communities in complex real-world networks whose elements and their respective associations manifest as streams of data. Community detection is applied in emerging computational environments and concerns critical applications in diverse areas including social computing, web analysis, IoT and biology. Despite the already expended related research efforts, the task of revealing the community structure of massive and rapidly-evolving networks remains very challenging. More specifically, there is an emerging need for online approaches that ingest graph data as a stream. In this paper, we propose a streaming-graph community-detection algorithm that expands seed-sets of nodes to communities. We consider an online setting and process a stream of edges while aiming to form communities on-the-fly using partial knowledge of the graph structure. We use space-efficient structures to maintain very limited information regarding the nodes of the graph and the sought communities, so as to effectively process large scale networks. In addition to our novel streaming approach, we develop a technique that increases the accuracy of our algorithm considerably and additionally propose a new clustering algorithm that allows for automatically deriving the size of the communities we seek to detect. Using ground-truth communities for a wide range of large real-word and synthetic networks, our experimental evaluation shows that our approach does achieve accuracy comparable, and oftentimes better, to the state-of-the-art non-streaming community detection algorithms. More importantly, we attain significant improvements in both execution time and memory requirements.
Original language | English (US) |
---|---|
Pages (from-to) | 2375-2386 |
Number of pages | 12 |
Journal | IEEE Transactions on Knowledge and Data Engineering |
Volume | 34 |
Issue number | 5 |
DOIs | |
State | Published - May 1 2022 |
Keywords
- Local community detection
- graph streams
- seed-set expansion
- social networks
ASJC Scopus subject areas
- Information Systems
- Computer Science Applications
- Computational Theory and Mathematics