Abstract
We consider the problem of clustering a graph G into two communities by observing a subset of the vertex correlations. Specifically, we consider the inverse problem with observed variables Y=B-G x \oplus Z , where B-G is the incidence matrix of a graph G , x is the vector of unknown vertex variables (with a uniform prior), and Z is a noise vector with Bernoulli(\varepsilon ) i.i.d. entries. All variables and operations are Boolean. This model is motivated by coding, synchronization, and community detection problems. In particular, it corresponds to a stochastic block model or a correlation clustering problem with two communities and censored edges. Without noise, exact recovery (up to global flip) of x is possible if and only the graph G is connected, with a sharp threshold at the edge probability \log (n)/n for Erds-Rényi random graphs. The first goal of this paper is to determine how the edge probability p needs to scale to allow exact recovery in the presence of noise. Defining the degree rate of the graph by \alpha =np/\log (n) , it is shown that exact recovery is possible if and only if \alpha >2/(1-2\varepsilon )2+ o(1/(1-2\varepsilon )2). In other words, 2/(1-2\varepsilon )2 is the information theoretic threshold for exact recovery at low-SNR. In addition, an efficient recovery algorithm based on semidefinite programming is proposed and shown to succeed in the threshold regime up to twice the optimal rate. For a deterministic graph G , defining the degree rate as \alpha =d/\log (n) , where d is the minimum degree of the graph, it is shown that the proposed method achieves the rate \alpha > 4((1+\lambda )/(1-\lambda )2)/(1-2\varepsilon )2+ o(1/(1-2\varepsilon )2) , where 1-\lambda is the spectral gap of the graph G.
Original language | English (US) |
---|---|
Article number | 6949658 |
Pages (from-to) | 10-22 |
Number of pages | 13 |
Journal | IEEE Transactions on Network Science and Engineering |
Volume | 1 |
Issue number | 1 |
DOIs | |
State | Published - Jan 1 2014 |
Keywords
- Erdos-Renyi graphs
- Information theoretic bounds
- Semidefinite relaxations
- Synchronization problem
- graphbased codes
ASJC Scopus subject areas
- Control and Systems Engineering
- Computer Science Applications
- Computer Networks and Communications