# Algorithm independent bounds on community detection problems and associated transitions in stochastic block model graphs

Research output: Contribution to journal › Article › Scientific › peer-review

### Standard

**Algorithm independent bounds on community detection problems and associated transitions in stochastic block model graphs.** / Darst, Richard K.; Reichman, David R.; Ronhovde, Peter; Nussinov, Zohar.

### Harvard

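Darst, RK, Reichman, DR, Ronhovde, P & Nussinov, Z 2015, 'Algorithm independent bounds on community detection problems and associated transitions in stochastic block model graphs', *Journal of Complex Networks*, vol. 3, no. 3, pp. 333-360. https://doi.org/10.1093/comnet/cnu042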

### APA

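Darst, R. K., Reichman, D. R., Ronhovde, P., & Nussinov, Z. (2015). Algorithm independent bounds on community detection problems and associated transitions in stochastic block model graphs. *Journal of Complex Networks*, *3*(3), 333-360. https://doi.org/10.1093/comnet/cnu042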

### RIS

TY - JOUR

T1 - Algorithm independent bounds on community detection problems and associated transitions in stochastic block model graphs

AU - Darst, Richard K.

AU - Reichman, David R.

AU - Ronhovde, Peter

AU - Nussinov, Zohar

PY - 2015/1/1

Y1 - 2015/1/1

AB - We derive rigorous bounds for well-defined community structure in complex networks for a stochastic block model (SBM) benchmark. In particular, we analyse the effect of inter-community 'noise' (inter-community edges) on any 'community detection' algorithm's ability to correctly group nodes assigned to a planted partition, a problem which has been proved to be NP-complete in a standard rendition. Our result does not rely on the use of any one particular algorithm nor on the analysis of the limitations of inference. Rather, we turn the problem on its head and work backwards to examine when, in the first place, well-defined structure may exist in SBMs. The method that we introduce here could potentially be applied to other computational problems. The objective of community detection algorithms is to partition a given network into optimally disjoint subgraphs (or communities). Similar to k-SAT and other combinatorial optimization problems, 'community detection' exhibits different phases. Networks that lie in the 'unsolvable phase' lack well-defined structure and thus have no partition that is meaningful. Solvable systems splinter into two disparate phases: those in the 'hard' phase and those in the 'easy' phase. As befits its name, within the easy phase, a partition is easy to achieve by known algorithms. When a network lies in the hard phase, it still has an underlying structure, yet finding a meaningful partition that can be checked in polynomial time requires an exhaustive computational effort that rapidly increases with the size of the graph. When taken together, (i) the rigorous results that we report here on when graphs have an underlying structure and (ii) recent results concerning the limits of rather general algorithms suggest bounds on the hard phase.

KW - Clustering

KW - Community detection

KW - Edge density

KW - NP hard

KW - Solvability

KW - Stochastic block model

UR - http://www.scopus.com/inward/record.url?scp=84954202909&partnerID=8YFLogxK

U2 - 10.1093/comnet/cnu042

DO - 10.1093/comnet/cnu042

M3 - Article

VL - 3

SP - 333

EP - 360

JO - Journal of Complex Networks

JF - Journal of Complex Networks

SN - 2051-1310

IS - 3

ER -

ID: 28010400
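
The abstract refers to a planted partition in an SBM benchmark and to inter-community edges acting as 'noise'. As a rough illustration only, the sketch below samples a simple equal-size planted-partition graph and measures the fraction of edges that cross planted communities. The function names and the parameters `p_in` and `p_out` (intra- and inter-community edge probabilities) are assumptions made for this example; they are not taken from the paper, and the sketch does not reproduce the paper's bounds.

```python
import random
from itertools import combinations


def sample_planted_partition(n_communities, community_size, p_in, p_out, seed=0):
    """Sample an undirected planted-partition graph (hypothetical helper).

    Returns (labels, edges): labels[i] is the planted community of node i,
    and edges is a list of (u, v) pairs with u < v.
    """
    rng = random.Random(seed)
    n = n_communities * community_size
    # Nodes are assigned to equal-size communities in contiguous blocks.
    labels = [i // community_size for i in range(n)]
    edges = []
    for u, v in combinations(range(n), 2):
        # Assumed model: each pair is linked independently, with probability
        # p_in inside a community and p_out between communities.
        p = p_in if labels[u] == labels[v] else p_out
        if rng.random() < p:
            edges.append((u, v))
    return labels, edges


def inter_community_fraction(labels, edges):
    """Fraction of edges joining nodes in different planted communities."""
    if not edges:
        return 0.0
    inter = sum(1 for u, v in edges if labels[u] != labels[v])
    return inter / len(edges)
```

Calling `sample_planted_partition(4, 16, 0.5, 0.05)` and passing the result to `inter_community_fraction` gives one crude measure of noise for a single sampled graph. With `p_out = 0` the fraction is exactly zero, and raising `p_out` toward `p_in` blurs the planted structure.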