Abstract
The analysis of complex networks is a rapidly growing topic with many applications in different domains. The analysis of large graphs is often made via unsupervised classification of vertices of the graph. Community detection is the main way to divide a large graph into smaller ones that can be studied separately. However another definition of a cluster is possible, which is based on the structural distance between vertices. This definition includes the case of community clusters but is more general in the sense that two vertices may be in the same group even if they are not connected. Methods for detecting communities in undirected graphs have been recently reviewed by Fortunato. In this paper we expand Fortunato’s work and make a review of methods and algorithms for detecting essentially structurally homogeneous subsets of vertices in binary or weighted and directed and undirected graphs.
Similar content being viewed by others
Explore related subjects
Discover the latest articles, news and stories from top researchers in related subjects.References
Airoldi, E., Blei, D., Fienberg, S., Xing, E.: Mixed membership stochastic blockmodels. J. Mach. Learn. Res. 9, 1981–2014 (2008)
Ambroise, C., Matias, C.: New consistent and asymptotically normal parameter estimates for random-graph mixture models. J. R. Stat. Soc., Ser. B, Stat. Methodol. 74, 3–35 (2011)
Arabie, P., Boorman, S., Levitt, P.: Constructing blockmodels: how and why. J. Math. Psychol. 17, 21–63 (1978). doi:10.1073/pnas.0907096106
Benzecri, J.: L Analyse des Donnees. Volume II. L Analyse des Correspondances. Dunod, Paris (1973)
Bickel, P., Chen, A.: A nonparametric view of network models and Newman-Girvan and other modularities. Proc. Natl. Acad. Sci. USA, 1–6 (2010)
Brohee, S., Van Helden, J.: Evaluation of clustering algorithms for protein-protein interaction networks. BMC Bioinform. 7(1), 488 (2006)
Burt, R.: Cohesion versus structural equivalence as a basis for network subgroups. Sociol. Methods Res. 7(2), 189–212 (1978)
Celisse, A., Daudin, J., Pierre, L.: Consistency of maximum likelihood and variational estimators in mixture models for random graphs. Electron. J. Stat. 6, 1847–1899 (2012)
Choi, D., Wolfe, P., Airoldi, E.: Stochastic blockmodels with growing number of classes. Biometrika 99(2), 273–284 (2012)
Clauset, A., Newman, M., Moore, C.: Finding community structure in very large networks. Phys. Rev. E 70(6), 066111 (2004)
Csardi, G., Nepusz, T.: The igraph software package for complex network research. InterJournal, Complex Syst. 1695, 38 (2006). http://igraph.sf.net
Daudin, J.: A review of statistical models for clustering networks with an application to a PPI network. J. Soc. Fr. Stat. 152(2), 111–125 (2011)
Daudin, J., Picard, F., Robin, S.: A mixture model for random graphs. Stat. Comput. 18(2), 173–183 (2008)
Daudin, J.J., Pierre, L., Vacher, C.: Model for heterogeneous random networks using continuous latent variables and an application to a tree-fungus network. Biometrics 66(4), 1043–1051 (2010)
Decelle, A., Krzakala, F., Moore, C., Zdeborová, L.: Asymptotic analysis of the stochastic block model for modular networks and its algorithmic applications. Phys. Rev. E 84(6), 066106 (2011)
Donath, W.E., Hoffman, A.J.: Lower bounds for the partitioning of graphs. IBM J. Res. Dev. 17(5), 420–425 (1973)
Erosheva, E.: Comparing latent structures of the grade of membership, Rasch and latent class model. Psychometrika 70(4), 619–628 (2005)
Fortunato, S.: Community detection in graphs. Phys. Rep. 486(3–5), 75–174 (2010). http://www.sciencedirect.com/science/article/pii/S0370157309002841. doi:10.1016/j.physrep.2009.11.002
Girvan, M., Newman, M.: Community structure in social and biological networks. Proc. Natl. Acad. Sci. USA 99(12), 7821 (2002)
Guimera, R., Stouffer, D., Sales-Pardo, M., Leicht, E., Newman, M., Nunes Amaral, L.: Origin of compartmentalization in food webs. Ecology (2010). http://www.esajournals.org/doi/abs/10.1890/09-1175.1. doi:10.1890/09-1175.1
Handcock, M.S., Raftery, A.E., Tantrum, J.: Model-based clustering for social networks. J. R. Stat. Soc. A 170(2), 301–354 (2007)
Harshman, R.: Models for analysis of asymmetrical relationships among N objects or stimuli. In: First Joint Meeting of the Psychometric Society and the Society for Mathematical Psychology. McMaster University, Hamilton, Ontario, August (1978)
Hartigan, J.: Clustering Algorithms. Wiley, New York (1975)
Hirschfeld, H.: A connection between correlation and contingency. Proc. Camb. Philos. Soc. 31, 520–524 (1935)
Hofman, J.M., Wiggins, C.H.: Bayesian approach to network modularity. Phys. Rev. Lett. 100, 258701 (2008). http://link.aps.org/doi/10.1103/PhysRevLett.100.258701. doi:10.1103/PhysRevLett.100.258701
Holland, P., Laskey, K., Leinhardt, K.: Stochastic blockmodels: some first steps. Soc. Netw. 5, 109–137 (1983)
Kiers, H., ten Berge, J., Takane, Y., de Leeuw, J.: A generalization of Takane’s algorithm for DEDICOM. Psychometrika 55(1), 151–158 (1990)
Latouche, P., Birmelé, E., Ambroise, C.: Overlapping stochastic block models with application to the French political blogosphere. Ann. Appl. Stat. 5(1), 309–336 (2011)
Lorrain, F., White, H.: Structural equivalence of individuals in social networks. J. Math. Sociol. 1, 49–80 (1971)
Manton, K., Woodbury, M., Tolley, H.: In: Statistical Applications Using Fuzzy Sets (1994)
Marchette, D., Priebe, C.: Predicting unobserved links in incompletely observed networks. Comput. Stat. Data Anal. 52(3), 1373–1386 (2008)
Mariadassou, M., Robin, S., Vacher, C.: Uncovering latent structure in valued graphs: a variational approach. Ann. Appl. Stat. 4, 715–742 (2010)
Newman, M., Girvan, M.: Finding and evaluating community structure in networks. Phys. Rev. E 69(2), 026113 (2004)
Picard, F., Miele, V., Daudin, J.J., Cottret, L., Robin, S.: Deciphering the connectivity structure of biological networks using MixNet. BMC Bioinform. 10, S7 (2009)
Pons, P., Latapy, M.: Computing communities in large networks using random walks. J. Graph Algorithms Appl. 10(2), 191–218 (2006)
Raj, A., Wiggins, C.H.: An information-theoretic derivation of min-cut based clustering. IEEE Trans. Pattern Anal. Mach. Intell. 32, 988–995 (2010). doi:10.1109/TPAMI.2009.124
Rohe, K., Chatterjee, S., Yu, B.: Spectral clustering and the high-dimensional stochastic block model. Ann. Stat. 39(4), 1878–1915 (2011)
Sinkkonen, J., Aukia, J., Kaski, S.: Component models for large networks (2008a). arXiv:0803.1628
Sinkkonen, J., Aukia, J., Kaski, S.: Inferring vertex properties from topology in large networks (2008b). arXiv:0803.1628v1 [stat.ML]
Snijders, T., Nowicki, K.: Estimation and prediction for stochastic blockmodels for graphs with latent block structure. J. Classif. 14(1), 75–100 (1997)
Trendafilov, N.: GIPSCAL revisited. A projected gradient approach. Stat. Comput. 12(2), 135–145 (2002)
Van Dongen, S.: Graph clustering by flow simulation. University of Utrecht 275 (2000)
Von Luxburg, U.: A tutorial on spectral clustering. Stat. Comput. 17(4), 395–416 (2007)
White, H.C., Boorman, S.A., Breiger, R.L.: Social structure from multiple networks. Am. J. Sociol. 81, 730–780 (1976)
Winship, C., Mandel, M.: Roles and positions: a critique and extension of the blockmodeling approach. In: Sociological Methodology (1983)
Zachary, W.: An information flow model for conflict and fission in small groups. J. Anthropol. Res. 33(4), 452–473 (1977)
Author information
Authors and Affiliations
Corresponding author
Rights and permissions
About this article
Cite this article
Leger, JB., Vacher, C. & Daudin, JJ. Detection of structurally homogeneous subsets in graphs. Stat Comput 24, 675–692 (2014). https://doi.org/10.1007/s11222-013-9395-3
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s11222-013-9395-3