Abstract
Relations among data entities in most big data sets can be modeled by a big graph. Implementation and execution of algorithms related to the structure of big graphs is very important in different fields. Because of the inherently high volume of big graphs, their calculations should be performed in a distributed manner. Some distributed systems based on vertex-centric model have been introduced for big graph calculations in recent years. The performance of these systems in terms of run time depends on the partitioning and distribution of the graph. Therefore, the graph partitioning is a major concern in this field. This paper concentrates on big graph partitioning approaches for distribution of graphs in vertex-centric systems. This briefly discusses vertex-centric systems and formulates different models of graph partitioning problem. Then, a review of recent methods of big graph partitioning for these systems is shown. Most recent methods of big graph partitioning for vertex centric systems can be categorized into three classes: (i) stream-based methods that see vertices or edges of the graph in a stream and partition them, (ii) distributed methods that partition vertices or edges in a distributed manner, and (iii) dynamic methods that change partitions during the execution of algorithms to obtain better performance. This study compares the properties of different approaches in each class and briefly reviews methods that are not in these categories. This comparison indicates that The streaming methods are good choices for initial load of the graph in Vertex-centric systems. The distributed and dynamic methods are appropriate for long-running applications.
Similar content being viewed by others
Notes
\(n=|V|\) & \(m=|E|\) & \(r= \textit{number of iterations}\).
References
Abbas, Z., Kalavri, V., Carbone, P., Vlassov, V.: Streaming graph partitioning: an experimental study. Proc. VLDB Endow. 11(11), 1590–1603 (2018). https://doi.org/10.14778/3236187.3236208
Abou-Rjeili, A., Karypis, G.: Multilevel algorithms for partitioning power-law graphs. In: Proceedings 20th IEEE International Parallel Distributed Processing Symposium (2006). https://doi.org/10.1109/IPDPS.2006.1639360
Akhremtsev, Y., Sanders, P., Schulz, C.: High-quality shared-memory graph partitioning (2017). arXiv:1710.08231
Aslam, S.: Twitter by the numbers: stats, demographics and fun facts (2018). https://www.omnicoreagency.com/twitter-statistics/
Aydin, K., Bateni, M., Mirrokni, V.: Distributed balanced partitioning via linear embedding. In: Proceedings of the Ninth ACM International Conference on Web Search and Data Mining, pp. 387–396. ACM, New York (2016)
Balakrishnan, H., Kaashoek, M.F., Karger, D., Morris, R., Stoica, I.: Looking up data in p2p systems. Commun. ACM 46(2), 43–48 (2003). https://doi.org/10.1145/606272.606299
Bao, N.T., Suzumura, T.: Towards highly scalable pregel-based graph processing platform with x10. In: Proceedings of the 22Nd International Conference on World Wide Web (WWW ’13) Companion, pp. 501–508. ACM, New York (2013). https://doi.org/10.1145/2487788.2487984
Bollobs, B.: Random Graphs. Springer, New York (1998)
Borthakur, D.: HDFS Architecture Guide. Apache Hadoop Project (2008)
Bourse, F., Lelarge, M., Vojnovic, M.: Balanced graph edge partition. In: Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD ’14), pp. 1456–1465. ACM, New York (2014). https://doi.org/10.1145/2623330.2623660
Bu, Y., Borkar, V., Jia, J., Carey, M.J., Condie, T.: Pregelix: big(ger) graph analytics on a dataflow engine. Proc. VLDB Endow. 8(2), 161–172 (2014). https://doi.org/10.14778/2735471.2735477
Bui, T.N., Jones, C.: Finding good approximate vertex and edge partitions is np-hard. Inf. Process. Lett. 42(3), 153–159 (1992). https://doi.org/10.1016/0020-0190(92)90140-Q
Buluç, A., Meyerhenke, H., Safro, I., Sanders, P., Schulz, C.: Recent advances in graph partitioning. In: Kliemann, L., Sanders, P. (eds.) Algorithm Engineering. LNCS, pp. 117–158. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-49487-6-4
Cao, Y., Rao, R.: A streaming graph partitioning approach on imbalance cluster. In: 2016 18th International Conference on Advanced Communication Technology (ICACT), pp. 360–364 (2016). https://doi.org/10.1109/ICACT.2016.7423392
Chen, R., Shi, J., Chen, Y., Chen, H.: Powerlyra: differentiated graph computation and partitioning on skewed graphs. In: Proceedings of the Tenth European Conference on Computer Systems (EuroSys ’15), pp. 1:1–1:15. ACM, New York (2015). https://doi.org/10.1145/2741948.2741970
Chen, T., Li, B.: A distributed graph partitioning algorithm for processing large graphs. In: 2016 IEEE Symposium on Service-Oriented System Engineering (SOSE), pp. 53–59 (2016)
Ching, A.: Giraph: production-grade graph processing infrastructure for trillion edge graphs. In: ATPESC, vol. 14 (2014)
Ching, A., Edunov, S., Kabiljo, M., Logothetis, D., Muthukrishnan, S.: One trillion edges: graph processing at Facebook-scale. Proc. VLDB Endow. 8(12), 1804–1815 (2015). https://doi.org/10.14778/2824032.2824077
Condon, A., Karp, R.M.: Algorithms for graph partitioning on the planted partition model. Random Struct. Algorithms 18(2), 116–140 (2001)
Cormen, T.H.: Introduction to algorithms. MIT, Cambridge (2009)
Cui, H., Cipar, J., Ho, Q., Kim, J.K., Lee, S., Kumar, A., Wei, J., Dai, W., Ganger, G.R., Gibbons, P.B., et al.: Exploiting bounded staleness to speed up big data analytics. In: USENIX Annual Technical Conference, pp. 37–48 (2014)
Devine, K., Boman, E., Heaphy, R., Hendrickson, B., Vaughan, C.: Zoltan data management services for parallel dynamic applications. Comput. Sci. Eng. 4(2), 90–97 (2002)
Dindokar, R., Simmhan, Y.: Elastic partition placement for non-stationary graph algorithms. In: 2016 16th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid), pp. 90–93 (2016). https://doi.org/10.1109/CCGrid.2016.97
Dong, F., Zhang, J., Luo, J., Shen, D., Jin, J.: Enabling application-aware flexible graph partition mechanism for parallel graph processing systems. Concurr. Comput. Pract. Exp. 29(6), e3849 (2017). https://doi.org/10.1002/cpe.3849, e3849 cpe.3849
Donnelly, G.: 75 super-useful Facebook statistics for 2018 (2018). https://www.wordstream.com/blog/ws/2017/11/07/facebook-statistics
Echbarthi, G., Kheddouci, H.: Fractional greedy and partial restreaming partitioning: New methods for massive graph partitioning. In: 2014 IEEE International Conference on Big Data (Big Data), pp. 25–32 (2014). https://doi.org/10.1109/BigData.2014.7004368
Echbarthi, G., Kheddouci, H.: Streaming metis partitioning. In: 2016 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM), pp. 17–24 (2016). https://doi.org/10.1109/ASONAM.2016.7752208
Elsner, U.: Graph partitioning—a survey. Technical Report SFB393/97-27 (1997)
Fjllstrm, P.O.: Algorithms for Graph Partitioning: A Survey, vol. 3. Linkping University Electronic Press, Linkping (1998)
Gonzalez, J.E., Low, Y., Gu, H., Bickson, D., Guestrin, C.: Powergraph: distributed graph-parallel computation on natural graphs. In: Proceedings of 10th USENIX Symposium on Operating Systems Design and Implementation (OSDI), vol. 12, pp. 17–30 (2012)
Gregory, S.: Finding overlapping communities in networks by label propagation. New J. Phys. 12(10), 103,018 (2010)
Guerrieri, A., Montresor, A.: Distributed edge partitioning for graph processing (2014). arXiv:1403.6270
Guerrieri, A., Montresor, A.: DFEP: Distributed Funding-Based Edge Partitioning, Springer, Berlin, pp. 346–358 (2015)
Guo, Y., Hong, S., Chafi, H., Iosup, A., Epema, D.: Modeling, analysis, and experimental comparison of streaming graph-partitioning policies. J. Parallel Distrib. Comput. 108, 106–121. https://doi.org/10.1016/j.jpdc.2016.02.003. Special Issue on Scalable Computing Systems for Big Data Applications (2017)
Han, M., Daudjee, K.: Giraph unchained: barrierless asynchronous parallel execution in pregel-like graph processing systems. Proc. VLDB Endow. 8(9), 950–961 (2015). https://doi.org/10.14778/2777598.2777604
Han, W., Miao, Y., Li, K., Wu, M., Yang, F., Zhou, L., Prabhakaran, V., Chen, W., Chen, E.: Chronos: A graph engine for temporal graph analysis. In: Proceedings of the Ninth European Conference on Computer Systems (EuroSys ’14), pp. 1:1–1:14. ACM, New York (2014). https://doi.org/10.1145/2592798.2592799
Hendawi, A.M., Bao, J., Mokbel, M.F.: iRoad: a framework for scalable predictive query processing on road networks. Proc. VLDB Endow. 6(12), 1262–1265 (2013). https://doi.org/10.14778/2536274.2536291
Hoque, I., Gupta, I.: Lfgraph: Simple and fast distributed graph analytics. In: Proceedings of the First ACM SIGOPS Conference on Timely Results in Operating Systems (TRIOS ’13), pp. 9:1–9:17. ACM, New York (2013). https://doi.org/10.1145/2524211.2524218
Hu, K., Zeng, H.J.W.W.G.: Partitioning big graph with respect to arbitrary proportions in a streaming manner. Future Gen. Comput. Syst. 80, 1–11 (2018). https://doi.org/10.1016/j.future.2017.06.027
Hwang, C.R.: Simulated annealing: theory and applications. Acta Appl. Math. 12(1), 108–111 (1988). https://doi.org/10.1007/BF00047572
Jain, N., Liao, G., Willke, T.L.: Graphbuilder: Scalable graph etl framework. In: First International Workshop on Graph Data Management Experiences and Systems (GRADES ’13), pp. 4:1–4:6. ACM, New York (2013). https://doi.org/10.1145/2484425.2484429
Junghanns, M., Kiessling, M., Teichmann, N., Gomez, K., Petermann, A., Rahm, E.: Declarative and distributed graph analytics with gradoop. Proc. VLDB Endow. 11(12), 2006–2009 (2018). https://doi.org/10.14778/3229863.3236246
Khayyat, Z., Awara, K., Alonazi, A., Jamjoom, H., Williams, D., Kalnis, P.: Mizan: A system for dynamic load balancing in large-scale graph processing. In: Proceedings of the 8th ACM European Conference on Computer Systems (EuroSys ’13), pp. 169–182. ACM, New York (2013). https://doi.org/10.1145/2465351.2465369
Kim, G.H., Trimi, S., Chung, J.H.: Big-data applications in the government sector. Commun. ACM 57(3), 78–85 (2014). https://doi.org/10.1145/2500873
Lang, K.: Finding good nearly balanced cuts in power law graphs. Technical Report YRL-2004-036, Yahoo! Research Labs (2004)
Lee, K.H., Lee, Y.J., Choi, H., Chung, Y.D., Moon, B.: Parallel data processing with MapReduce: a survey. SIGMOD Rec. 40(4), 11–20 (2012). https://doi.org/10.1145/2094114.2094118
Leskovec, J., Kleinberg, J., Faloutsos, C.: Graph evolution: densification and shrinking diameters. ACM Trans. Knowl. Discov. Data 1(1), 2 (2007). https://doi.org/10.1145/1217299.1217301
Leskovec, J., Lang, K.J., Dasgupta, A., Mahoney, M.W.: Community structure in large networks: natural cluster sizes and the absence of large well-defined clusters. Internet Math. 6(1), 29–123 (2009)
Lim, Y., Lee, W.J., Choi, H.J., Kang, U.: MTP: discovering high quality partitions in real world graphs. World Wide Web, pp. 1–24 (2016)
Liu, X., Murata, T.: Advanced modularity-specialized label propagation algorithm for detecting communities in networks. Physica A 389(7), 1493–1500 (2010). https://doi.org/10.1016/j.physa.2009.12.019
Low, Y., Bickson, D., Gonzalez, J., Guestrin, C., Kyrola, A., Hellerstein, J.M.: Distributed GraphLab: a framework for machine learning and data mining in the cloud. Proc. VLDB Endow. 5(8), 716–727 (2012). https://doi.org/10.14778/2212351.2212354
Malewicz, G., Austern, M.H., Bik, A.J., Dehnert, J.C., Horn, I., Leiser, N., Czajkowski, G.: Pregel: A system for large-scale graph processing. In: Proceedings of the 2010 ACM SIGMOD International Conference on Management of Data (SIGMOD ’10), pp. 135–146. ACM, New York (2010). https://doi.org/10.1145/1807167.1807184
Manyika, J., Chui, M., Brown, B., Bughin, J., Dobbs, R., Roxburgh, C., Byers, A.H.: Big data: the next frontier for innovation, competition, and productivity 2011, vol. 5(33), p. 222 (2015). http://www.mckinsey.com/Insights/MGI/Research/Technology_and_Innovation/Big_data_The_next_frontier_for_innovation
Margo, D., Seltzer, M.: A scalable distributed graph partitioner. Proc. VLDB Endow. 8(12), 1478–1489 (2015). https://doi.org/10.14778/2824032.2824046
Martella, C., Logothetis, D., Loukas, A., Siganos, G.: Spinner: Scalable graph partitioning in the cloud (2014). arXiv:1404.3861
Mayer, C., Tariq, M.A., Li, C., Rothermel, K.: Graph: heterogeneity-aware graph computation with adaptive partitioning. In: 2016 IEEE 36th International Conference on Distributed Computing Systems (ICDCS), pp. 118–128 (2016). https://doi.org/10.1109/ICDCS.2016.92
Mayer, C., Mayer, R., Tariq, M.A., Geppert, H., Laich, L., Rieger, L., Rothermel, K.: Adwise: adaptive window-based streaming edge partitioning for high-speed graph processing (2017). arXiv preprint. arXiv:1712.08367
McAfee, A., Brynjolfsson, E., Davenport, T.H., Patil, D., Barton, D.: Big data’s: the management revolution. Harv. Bus. Rev. 90(10), 61–67 (2012)
McSherry, F.: Spectral partitioning of random graphs. In: Proceedings of 42nd IEEE Symposium on Foundations of Computer Science, pp. 529–537 (2001)
Meyerhenke, H., Sanders, P., Schulz, C.: Parallel graph partitioning for complex networks. IEEE Trans. Parallel Distrib. Syst. PP(99), 1–1 (2017)
Mofrad, M.H., Melhem, R., Hammoud, M.: Revolver: vertex-centric graph partitioning using reinforcement learning. In: 2018 IEEE 11th International Conference on Cloud Computing (CLOUD), pp. 818–821 (2018). https://doi.org/10.1109/CLOUD.2018.00111
Moreira, O., Popp, M., Schulz, C.: Graph partitioning with acyclicity constraints (2017). arXiv:1704.00705
Nirmala, G., Dinakaran, K.: Analysis of protein database for semantic similarity using map reduce; a survey. In: Proceedings of IEEE International Conference on Computer Communication and Systems (ICCCS14), pp. 046–050 (2014). https://doi.org/10.1109/ICCCS.2014.7068166
Nishimura, J., Ugander, J.: Restreaming graph partitioning: simple versatile algorithms for advanced balancing. In: Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD ’13), pp. 1106–1114. ACM, New York (2013). https://doi.org/10.1145/2487575.2487696
Onizuka, M., Fujimori, T., Shiokawa, H.: Graph partitioning for distributed graph processing. Data Sci. Eng. 2(1), 94–105 (2017). https://doi.org/10.1007/s41019-017-0034-4
Petroni, F., Querzoni, L., Daudjee, K., Kamali, S., Iacoboni, G.: HDRF: Stream-based partitioning for power-law graphs. In: Proceedings of the 24th ACM International on Conference on Information and Knowledge Management (CIKM ’15), pp. 243–252. ACM, New York (2015). https://doi.org/10.1145/2806416.2806424
Rahimian, F., Payberah, A.H., Girdzijauskas, S., Jelasity, M., Haridi, S.: JA-BE-JA: A distributed algorithm for balanced graph partitioning. In: 2013 IEEE 7th International Conference on Self-Adaptive and Self-Organizing Systems, pp. 51–60 (2013)
Rahimian, F., Payberah, A.H., Girdzijauskas, S., Haridi S.: Distributed vertex-cut partitioning. In: Distributed Applications and Interoperable Systems. Springer, Heidelberg, pp. 186–200 (2014)
Roy, A., Bindschaedler, L., Malicevic, J., Zwaenepoel, W.: Chaos: Scale-out graph processing from secondary storage. In: Proceedings of the 25th Symposium on Operating Systems Principles (SOSP ’15), pp. 410–424. ACM, New York (2015). https://doi.org/10.1145/2815400.2815408
Sajjad, H.P., Payberah, A.H., Rahimian, F., Vlassov, V., Haridi, S.: Boosting vertex-cut partitioning for streaming graphs. In: 2016 IEEE International Congress on Big Data (BigData Congress), pp. 1–8 (2016). https://doi.org/10.1109/BigDataCongress.2016.10
Sala, A., Cao, L., Wilson, C., Zablit, R., Zheng, H., Zhao, B.Y.: Measurement-calibrated graph models for social network experiments. In: Proceedings of the 19th International Conference on World Wide Web (WWW ’10), pp. 861–870. ACM, New York (2010). https://doi.org/10.1145/1772690.1772778
Sala, A., Zhao, X., Wilson, C., Zheng, H., Zhao, B.Y.: Sharing graphs using differentially private graph models. In: Proceedings of the 2011 ACM SIGCOMM Conference on Internet Measurement Conference (IMC ’11), pp. 81–98. ACM, New York (2011).https://doi.org/10.1145/2068816.2068825
Salihoglu, S., Widom, J.: GPS: a graph processing system. In: Proceedings of the 25th International Conference on Scientific and Statistical Database Management (SSDBM), pp. 22:1–22:12. ACM, New York (2013). https://doi.org/10.1145/2484838.2484843
Salihoglu, S., Widom, J.: Optimizing graph algorithms on pregel-like systems. Proc. VLDB Endow. 7(7), 577–588 (2014). https://doi.org/10.14778/2732286.2732294
Sanders, P., Schulz, C.: Engineering multilevel graph partitioning algorithms. In: Algorithms—ESA 2011. Springer, Berlin, pp. 469–480 (2011)
Shang, Z., Yu, J.X.: Catch the wind: Graph workload balancing on cloud. In: 2013 IEEE 29th International Conference on Data Engineering (ICDE), pp. 553–564 (2013). https://doi.org/10.1109/ICDE.2013.6544855
Shi, Z., Li, J., Guo, P., Li, S., Feng, D., Su, Y.: Partitioning dynamic graph asynchronously with distributed fennel. Future Gen. Comput. Syst. 71, 32–42 (2017). https://doi.org/10.1016/j.future.2017.01.014
Slota, G.M., Madduri, K., Rajamanickam, S.: PuLP: scalable multi-objective multi-constraint partitioning for small-world networks. In: 2014 IEEE International Conference on Big Data (Big Data), pp. 481–490 (2014). https://doi.org/10.1109/BigData.2014.7004265
Stanton, I.: Streaming balanced graph partitioning algorithms for random graphs. In: Proceedings of the Twenty-Fifth Annual ACM-SIAM Symposium on Discrete Algorithms (SODA ’14), pp. 1287–1301 (2014). https://doi.org/10.1137/1.9781611973402.95
Stanton, I., Kliot, G.: Streaming graph partitioning for large distributed graphs. In: Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD ’12), pp. 1222–1230. ACM, New York (2012). https://doi.org/10.1145/2339530.2339722
Sundaram, N., Satish, N., Patwary, M.M.A., Dulloor, S.R., Anderson, M.J., Vadlamudi, S.G., Das, D., Dubey, P.: GraphMat: high performance graph analytics made productive. Proc. VLDB Endow. 8(11), 1214–1225 (2015). https://doi.org/10.14778/2809974.2809983
Suri, S., Vassilvitskii, S.: Counting triangles and the curse of the last reducer. In: Proceedings of the 20th International Conference on World Wide Web (WWW ’11), pp. 607–614. ACM, New York (2011). https://doi.org/10.1145/1963405.1963491
Tatarowicz, A.L., Curino, C., Jones, E.P.C., Madden, S.: Lookup tables: fine-grained partitioning for distributed databases. In: 2012 IEEE 28th International Conference on Data Engineering, pp. 102–113. (2012). https://doi.org/10.1109/ICDE.2012.26
Tsourakakis, C.: Streaming graph partitioning in the planted partition model. In: Proceedings of the 2015 ACM on Conference on Online Social Networks (COSN ’15), pp. 27–35. ACM, New York (2015). https://doi.org/10.1145/2817946.2817950
Tsourakakis, C., Gkantsidis, C., Radunovic, B., Vojnovic, M.: Fennel: streaming graph partitioning for massive scale graphs. In: Proceedings of the 7th ACM International Conference on Web Search and Data Mining (WSDM ’14), pp. 333–342. ACM, New York (2014). https://doi.org/10.1145/2556195.2556213
Ugander, J., Backstrom, L.: Balanced label propagation for partitioning massive graphs. In: Proceedings of the Sixth ACM International Conference on Web Search and Data Mining (WSDM ’13), pp. 507–516. ACM, New York (2013). https://doi.org/10.1145/2433396.2433461
Valiant, L.G.: A bridging model for parallel computation. Commun. ACM 33(8), 103–111 (1990). https://doi.org/10.1145/79173.79181
Vaquero, L.M., Cuadrado, F., Logothetis, D., Martella, C.: xDGP: a dynamic graph processing system with adaptive partitioning (2013). arXiv:1309.1049
Verma, S., Leslie, L.M., Shin, Y., Gupta, I.: An experimental comparison of partitioning strategies in distributed graph processing. Proc. VLDB Endow. 10(5), 493–504 (2017). https://doi.org/10.14778/3055540.3055543
Wang, L., Xiao, Y., Shao, B., Wang, H.: How to partition a billion-node graph. In: 2014 IEEE 30th International Conference on Data Engineering, pp. 568–579 (2014). https://doi.org/10.1109/ICDE.2014.6816682
Watts, D.J., Strogatz, S.H.: Collective dynamics of ‘small-world’ networks. Nature 393(6684), 440–442 (1998). https://doi.org/10.1038/30918
White, T.: Hadoop: The Definitive Guide. O’Reilly Media, Boston (2012)
Wilson, C., Boe, B., Sala, A., Puttaswamy, K.P., Zhao, B.Y.: User interactions in social networks and their implications. In: Proceedings of the 4th ACM European Conference on Computer Systems (EuroSys ’09), pp. 205–218. ACM, New York (2009). https://doi.org/10.1145/1519065.1519089
Wu, M., Yang, F., Xue, J., Xiao, W., Miao, Y., Wei, L., Lin, H., Dai, Y., Zhou, L.: GraM: Scaling graph computation to the trillions. In: Proceedings of the Sixth ACM Symposium on Cloud Computing (SoCC ’15), pp. 408–421. ACM, New York (2015). https://doi.org/10.1145/2806777.2806849
Xiao, W., Xue, J., Miao, Y., Li, Z., Chen, C., Wu, M., Li, W., Zhou, L.: Tux2: Distributed graph computation for machine learning. In: Proceedings of the 14th USENIX Symposium on Networked Systems Design and Implementation (NSDI 17), pp. 669–682 (2017)
Xiao-Shu, W., Yao, X., Huan, L.: Cloud computing oriented retrieval technology based on big data. In: 2015 IEEE Seventh International Conference on Measuring Technology and Mechatronics Automation, pp. 275–278 (2015). https://doi.org/10.1109/ICMTMA.2015.73
Xie, C., Li, W.J., Zhang, Z.: S-PowerGraph: streaming graph partitioning for natural graphs by vertex-cut (2015). arXiv:1511.02586
Xin, R.S., Gonzalez, J.E., Franklin, M.J., Stoica, I.: GraphX: a resilient distributed graph system on spark. In: First International Workshop on Graph Data Management Experiences and Systems (GRADES ’13), pp. 2:1–2:6. ACM, New York (2013). https://doi.org/10.1145/2484425.2484427
Xu, N., Chen, L., Cui, B.: LogGP: a log-based dynamic graph partitioning method. Proc. VLDB Endow. 7(14), 1917–1928 (2014)
Xu, N., Cui, B., Chen, L., Huang, Z., Shao, Y.: Heterogeneous environment aware streaming graph partitioning. IEEE Trans. Knowl. Data Eng. 27(6), 1560–1572 (2015)
Yan, D., Cheng, J., Lu, Y., Ng, W.: Effective techniques for message reduction and load balancing in distributed graph computation. In: Proceedings of the 24th International Conference on World Wide Web (WWW ’15), pp. 1307–1317. International World Wide Web Conferences Steering Committee, Republic and Canton of Geneva, Switzerland (2015). https://doi.org/10.1145/2736277.2741096
Yang, S., Yan, X., Zong, B., Khan, A.: Towards effective partition management for large graphs. In: Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data (SIGMOD ’12), pp. 517–528. ACM, New York (2012). https://doi.org/10.1145/2213836.2213895
Zaharia, M., Chowdhury, M., Das, T., Dave, A., Ma, J., McCauley, M., Franklin, M.J., Shenker, S., Stoica, I.: Resilient distributed datasets: a fault-tolerant abstraction for in-memory cluster computing. In: Proceedings of the 9th USENIX Conference on Networked Systems Design and Implementation (NSDI’12), pp. 2–2. USENIX Association, Berkeley (2012)
Zeng, Z., Wu, B., Wang, H.: A parallel graph partitioning algorithm to speed up the large-scale distributed graph mining. In: Proceedings of the 1st International Workshop on Big Data, Streams and Heterogeneous Source Mining: Algorithms, Systems, Programming Models and Applications (BigMine ’12), pp. 61–68. ACM, New York (2012). https://doi.org/10.1145/2351316.2351325
Zhao, Y., Yoshigoe, K., Xie, M., Zhou, S., Seker, R., Bian, J.: LightGraph: lighten communication in distributed graph-parallel processing. In: 2014 IEEE International Congress on Big Data, pp. 717–724 (2014). https://doi.org/10.1109/BigData.Congress.2014.106
Zheng, A., Labrinidis, A., Chrysanthis, P.K.: Architecture-aware graph repartitioning for data-intensive scientific computing. In: 2014 IEEE International Conference on Big Data (Big Data), pp. 78–85 (2014). https://doi.org/10.1109/BigData.2014.7004375
Zheng, A., Labrinidis, A., Chrysanthis, P.K.: Planar: Parallel lightweight architecture-aware adaptive graph repartitioning. In: 2016 IEEE 32nd International Conference on Data Engineering (ICDE), pp. 121–132 (2016a). https://doi.org/10.1109/ICDE.2016.7498234
Zheng, A., Labrinidis, A., Pisciuneri, P.H., Chrysanthis, P.K., Givi, P.: Paragon: parallel architecture-aware graph partition refinement algorithm. In: EDBT, pp. 365–376 (2016b)
Zheng, A., Labrinidis, A., Faloutsos, C.: Skew-resistant graph partitioning. In: 2017 IEEE 33rd International Conference on Data Engineering (ICDE), pp. 151–154 (2017). https://doi.org/10.1109/ICDE.2017.62
Zhu, X., Ghahramani, Z.: Learning from labeled and unlabeled data with label propagation. Tech. Rep., Citeseer (2002)
Zhu, X., Chen, W., Zheng, W., Ma, X.: Gemini: A computation-centric distributed graph processing system. In: Proceedings of the 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16), pp. 301–316 (2016)
Author information
Authors and Affiliations
Corresponding author
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
About this article
Cite this article
Mazaheri Soudani, N., Fatemi, A. & Nematbakhsh, M. An investigation of big graph partitioning methods for distribution of graphs in vertex-centric systems. Distrib Parallel Databases 38, 1–29 (2020). https://doi.org/10.1007/s10619-019-07256-z
Published:
Issue Date:
DOI: https://doi.org/10.1007/s10619-019-07256-z