Community detection is one of the most fundamental problems in network analysis. The stochastic block model (SBM) is widely used for this task, and various estimation methods have been developed for it, together with results establishing their community detection consistency. However, the SBM rests on the strong assumption that all nodes in the same community are stochastically equivalent, which may not hold in practical applications. We introduce the pairwise covariates-adjusted stochastic block model (PCABM), a generalization of the SBM that incorporates pairwise covariate information. We study the maximum likelihood estimators of the covariate coefficients and of the community assignments, and show that they are consistent under suitable sparsity conditions. Spectral clustering with adjustment (SCWA) is introduced to fit PCABM efficiently. Under certain conditions, we derive an error bound on community detection for SCWA and show that it is community detection consistent. In addition, we investigate model selection for the number of communities and feature selection for the pairwise covariates, and propose two corresponding algorithms. PCABM compares favorably with the SBM and the degree-corrected stochastic block model (DCBM) on a wide range of simulated and real networks when covariate information is available. Supplementary materials for this article are available online.
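The abstract does not spell out the model or the adjustment step, so the following is only an illustrative sketch under stated assumptions, not the article's algorithm. It assumes that edge counts are Poisson with mean B[c_i, c_j] * exp(gamma * z_ij), that the covariate coefficient gamma is already known rather than estimated, and that there are two balanced communities, so the sign of one eigenvector can stand in for k-means. All function and variable names are hypothetical.

```python
import numpy as np

def simulate_pcabm(labels, B, Z, gamma, rng):
    """Draw a symmetric count-valued adjacency matrix with
    E[A_ij] = B[c_i, c_j] * exp(gamma * Z_ij) (assumed Poisson form)."""
    rate = B[np.ix_(labels, labels)] * np.exp(gamma * Z)
    upper = np.triu(rng.poisson(rate), k=1)   # keep i < j, no self-loops
    return upper + upper.T

def scwa_two_communities(A, Z, gamma):
    """Divide out the assumed covariate effect, then split nodes by the sign
    of the eigenvector for the second-largest eigenvalue (a K = 2 shortcut
    used here in place of k-means on the leading eigenvectors)."""
    A_adj = A / np.exp(gamma * Z)
    _, vecs = np.linalg.eigh(A_adj)           # eigenvalues in ascending order
    return (vecs[:, -2] > 0).astype(int)

rng = np.random.default_rng(0)
n = 300
labels = np.repeat([0, 1], n // 2)            # two balanced communities
B = np.array([[0.40, 0.05], [0.05, 0.40]])    # illustrative block intensities
attr = rng.integers(0, 2, size=n)             # binary node attribute
Z = (attr[:, None] == attr[None, :]).astype(float)  # 1 if attribute is shared
gamma = 1.0                                   # treated as known in this sketch

A = simulate_pcabm(labels, B, Z, gamma, rng)
est = scwa_two_communities(A, Z, gamma)
# Community labels are identifiable only up to permutation.
agreement = max(np.mean(est == labels), np.mean(est != labels))
print(f"agreement with true communities: {agreement:.2f}")
```

In this toy setting the pairwise covariate is independent of community membership, so dividing it out leaves a matrix whose expectation depends only on the block structure. In practice gamma would have to be estimated first, for example by the maximum likelihood approach the abstract mentions.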
Keywords
- Community detection
- Model selection
- Spectral clustering with adjustment
ASJC Scopus subject areas
- Statistics and Probability
- Statistics, Probability and Uncertainty