Abstract
Determining if a graph displays a clustered structure prior to subjecting it to any cluster detection technique has recently gained attention in the literature. Attempts to group graph vertices into clusters when a graph does not have a clustered structure is not only a waste of time; it will also lead to misleading conclusions. To address this problem, we introduce a novel statistical test, the \(\delta \)-test, which is based on comparisons of local and global densities. Our goal is to assess whether a given graph meets the necessary conditions to be meaningfully summarized by clusters of vertices. We empirically explore our test’s behavior under a number of graph structures. We also compare it to other recently published tests. From a theoretical standpoint, our test is more general, versatile and transparent than recently published competing techniques. It is based on the examination of intuitive quantities, applies equally to weighted and unweighted graphs and allows comparisons across graphs. More importantly, it does not rely on any distributional assumptions, other than the universally accepted definition of a clustered graph. Empirically, our test is shown to be more responsive to graph structure than other competing tests.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Similar content being viewed by others
Notes
- 1.
Also, note that in this article we assume undirected graphs with no self-loops.
References
Albert, R., BarabĂ¡si, A.L.: Statistical mechanics of complex networks. Rev. Mod. Phys. 74, 47–97 (2002)
Aleskerov, F., Goldengorin, B., Pardalos, P.: Clusters, Orders, and Trees: Methods and Applications. Springer, Heidelberg (2014). https://doi.org/10.1007/978-1-4939-0742-7. Incorporated
Alon, N., Shapira, A.: A characterization of the (natural) graph properties testable with one-sided error. SIAM J. Comput. 37(6), 1703–1727 (2008)
Arias-Castro, E., Verzelen, N.: Community detection in dense random networks. Ann. Statist. 42(3), 940–969 (2014). https://doi.org/10.1214/14-AOS1208
BarabĂ¡si, A.L., Albert, R.: Emergence of scaling in random networks. Science 286, 509–512 (1999)
Barrat, A., Barthélemy, M., Pastor-Satorras, R., Vespignani, A.: The architecture of complex weighted networks. Proc. Natl. Acad. Sci. 101, 3747–3752 (2004)
Bickel, P.J., Sarkar, P.: Hypothesis testing for automated community detection in networks. J. R. Stat. Soc.: Ser. B (Stat. Methodol.) 78(1), 253–273 (2016). https://doi.org/10.1111/rssb.12117
Butenko, S., Chaovalitwongse, W.A., Pardalos, P.M.: Clustering Challenges in Biological Networks. World Scientific, Singapore (2009). https://doi.org/10.1142/6602
Chiplunkar, A., Kapralov, M., Khanna, S., Mousavifar, A., Peres, Y.: Testing graph clusterability: algorithms and lower bounds. ArXiv e-prints, August 2018
Czumaj, A., Peng, P., Sohler, C.: Testing cluster structure of graphs. ArXiv e-prints, April 2015
Eden, T., Ron, D., Seshadhri, C.: On Approximating the number of \(k\)-cliques in sublinear time. ArXiv e-prints, March 2018
Elenberg, E.R., Shanmugam, K., Borokhovich, M., Dimakis, A.G.: Beyond triangles: a distributed framework for estimating 3-profiles of large graphs. ArXiv e-prints, June 2015
Erdös, P., Rényi, A.: On random graphs I. Publ. Math. Debr. 6, 290 (1959)
Fortunato, S.: Community detection in graphs. Phys. Rep. 486, 75–174 (2010)
Fortunato, S., Hric, D.: Community detection in networks: a user guide. ArXiv e-prints, November 2016
Fronczak, A., HoÅ‚yst, J.A., Jedynak, M., Sienkiewicz, J.: Higher order clustering coefficients in BarabĂ¡si-Albert networks. Phys. Stat. Mech. Its Appl. 316, 688–694 (2002)
Gao, C., Lafferty, J.: Testing for global network structure using small subgraph statistics. ArXiv e-prints (Oct 2017)
Gao, C., Lafferty, J.: Testing network structure using relations between small subgraph probabilities. ArXiv e-prints, April 2017
Gishboliner, L., Shapira, A.: Deterministic vs non-deterministic graph property testing. ArXiv e-prints, April 2013
Goldreich, O., Ron, D.: Algorithmic aspects of property testing in the dense graphs model. SIAM J. Comput. 40(2), 376–445 (2011)
Hagberg, A., Schult, D., Swart, P.: Exploring network structure, dynamics, and function using network. In: Varoquaux, G., Vaught, T., Millman, J. (eds.) Proceedings of the 7th Python in Science Conference, Pasadena, CA USA, pp. 11–15 (2008)
He, Z., Liang, H., Chen, Z., Zhao, C.: Detecting statistically significant communities. CoRR abs/1806.05602 (2018). http://arxiv.org/abs/1806.05602
Jin, J., Ke, Z.T., Luo, S.: Network global testing by counting graphlets. ArXiv e-prints, July 2018
Leskovec, J., Kleinberg, J., Faloutsos, C.: Graph evolution: Densification and shrinking diameters. ACM Trans. Knowl. Discov. Data 1(1), 2 (2007)
LovĂ¡sz, L., Vesztergombi, K.: Nondeterministic graph property testing. Comb. Probab. Comput. 22, 749–762 (2013)
Miasnikof, P., Shestopaloff, A.Y., Bonner, A.J., Lawryshyn, Y.: A statistical performance analysis of graph clustering algorithms. In: Bonato, A., Prałat, P., Raigorodskii, A. (eds.) WAW 2018. LNCS, vol. 10836, pp. 170–184. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-92871-5_11
Ostroumova Prokhorenkova, L., Prałat, P., Raigorodskii, A.: Modularity of complex networks models. In: Bonato, A., Graham, F.C., Prałat, P. (eds.) WAW 2016. LNCS, vol. 10088, pp. 115–126. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-49787-7_10
Prokhorenkova, L.O., Prałat, P., Raigorodskii, A.: Modularity in several random graph models. Electron. Notes Discret. Math. 61, 947–953 (2017). http://www.sciencedirect.com/science/article/pii/S1571065317302238. The European Conference on Combinatorics, Graph Theory and Applications (EUROCOMB 2017)
Prokhorenkova, L., Tikhonov, A.: Community detection through likelihood optimization: in search of a sound model. In: Proceedings of the 2019 World Wide Web Conference (WWW 2019) (2019)
Schaeffer, S.E.: Survey: graph clustering. Comput. Sci. Rev. 1(1), 27–64 (2007). https://doi.org/10.1016/j.cosrev.2007.05.001
Ugander, J., Backstrom, L., Kleinberg, J.: Subgraph frequencies: mapping the empirical and extremal geography of large graph collections. ArXiv e-prints, April 2013
Verzelen, N., Arias-Castro, E.: Community detection in sparse random networks. Ann. Appl. Probab. 25(6), 3465–3510 (2015). https://doi.org/10.1214/14-AAP1080
Yang, J., Leskovec, J.: Defining and evaluating network communities based on Ground-truth. CoRR abs/1205.6233 (2012). http://arxiv.org/abs/1205.6233
Yin, H., Benson, A.R., Leskovec, J.: Higher-order clustering in networks. ArXiv e-prints (2018)
Yin, H., Benson, A., Leskovec, J.: Local higher-order graph clustering. In: Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining 2017 (2017)
Acknowledgments
Pierre Miasnikof was supported by a Mitacs-Accelerate PhD award IT05806. He also wishes to thank Lasse Leskelä of Aalto University, for the introduction to the work of Gao and Lafferty. Liudmila Prokhorenkova and Andrei Raigorodskii were supported by The Russian Science Foundation (grant number 16-11-10014).
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Miasnikof, P., Prokhorenkova, L., Shestopaloff, A.Y., Raigorodskii, A. (2020). A Statistical Test of Heterogeneous Subgraph Densities to Assess Clusterability. In: Matsatsinis, N., Marinakis, Y., Pardalos, P. (eds) Learning and Intelligent Optimization. LION 2019. Lecture Notes in Computer Science(), vol 11968. Springer, Cham. https://doi.org/10.1007/978-3-030-38629-0_2
Download citation
DOI: https://doi.org/10.1007/978-3-030-38629-0_2
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-38628-3
Online ISBN: 978-3-030-38629-0
eBook Packages: Mathematics and StatisticsMathematics and Statistics (R0)