Abstract
Stochastic blockmodels and variants thereof are among the most widely used approaches to community detection for social networks and relational data. A stochastic blockmodel partitions the nodes of a network into disjoint sets, called communities. The approach is inherently related to clustering with mixture models, and it raises a similar model selection problem for the number of communities. The Bayesian information criterion (BIC) is a popular solution; however, for stochastic blockmodels, the assumption that different edges are conditionally independent given the communities of their endpoints is usually violated in practice. In this regard, we propose composite likelihood BIC (CL-BIC) to select the number of communities, and we show that it is robust against possible misspecifications in the underlying stochastic blockmodel assumptions. We derive the requisite methodology and illustrate the approach using both simulated and real data. Supplementary materials containing the relevant computer code are available online.
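To make the model selection problem concrete, the following is a minimal sketch of the standard BIC under the edge-independence assumption that the abstract describes as usually violated. It is not the CL-BIC proposed in the paper, whose penalty is not specified here. The function names `sample_sbm` and `sbm_bic`, the use of the number of dyads as the sample size, and the treatment of the community labels as known are all assumptions made for illustration; in practice the labels would be estimated, for example by spectral clustering.

```python
import numpy as np


def sample_sbm(labels, theta, rng):
    """Draw a symmetric, loop-free adjacency matrix from a stochastic blockmodel.

    labels: integer community label per node (hypothetical input format).
    theta:  symmetric matrix of between-community edge probabilities.
    """
    n = len(labels)
    probs = theta[np.ix_(labels, labels)]              # n x n edge probabilities
    upper = np.triu(rng.random((n, n)) < probs, k=1)   # keep one draw per dyad
    return (upper | upper.T).astype(int)


def sbm_bic(adj, labels, n_comm):
    """Standard BIC for a blockmodel with a given partition (assumed known).

    Assumes edges are independent Bernoulli draws given the endpoint
    communities, and uses the number of dyads as the sample size.
    """
    n = adj.shape[0]
    rows, cols = np.triu_indices(n, k=1)
    edges = adj[rows, cols]
    # Map each dyad to an unordered pair of communities.
    lo = np.minimum(labels[rows], labels[cols])
    hi = np.maximum(labels[rows], labels[cols])
    block = lo * n_comm + hi
    size = n_comm * n_comm
    n_edges = np.bincount(block, weights=edges, minlength=size)
    n_dyads = np.bincount(block, minlength=size)
    # Maximum likelihood estimate per block pair, clipped to avoid log(0).
    p_hat = np.clip(n_edges / np.maximum(n_dyads, 1), 1e-12, 1 - 1e-12)
    loglik = np.sum(n_edges * np.log(p_hat)
                    + (n_dyads - n_edges) * np.log(1 - p_hat))
    n_params = n_comm * (n_comm + 1) // 2              # free block probabilities
    return -2.0 * loglik + n_params * np.log(len(edges))


# Usage: two well-separated communities of 60 nodes each (illustrative values).
rng = np.random.default_rng(0)
true_labels = np.repeat([0, 1], 60)
theta = np.array([[0.50, 0.05],
                  [0.05, 0.50]])
adj = sample_sbm(true_labels, theta, rng)

bic_one = sbm_bic(adj, np.zeros(120, dtype=int), 1)    # no community structure
bic_two = sbm_bic(adj, true_labels, 2)                 # true partition
```

Under this sketch, the candidate number of communities with the smaller criterion value is preferred. Because the penalty counts parameters as if all dyads were independent, it may behave poorly when that assumption fails, which is the situation the abstract says motivates CL-BIC.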
Original language | English (US) |
---|---|
Pages (from-to) | 171-181 |
Number of pages | 11 |
Journal | Journal of Computational and Graphical Statistics |
Volume | 26 |
Issue number | 1 |
State | Published - Jan 2 2017 |
Keywords
- Community detection
- Composite likelihood
- Degree-corrected stochastic blockmodel
- Model selection
- Spectral clustering
- Stochastic blockmodel
ASJC Scopus subject areas
- Statistics and Probability
- Discrete Mathematics and Combinatorics
- Statistics, Probability and Uncertainty