Abstract
Distributed learning has given rise to both federated learning and cluster computing, yet the two approaches differ substantially. This study therefore uses a deep learning workload, the LeNet convolutional neural network, to investigate the distinction between federated learning and cluster computing. Three frameworks were tested: Spark on Hadoop with four nodes, PySyft with four nodes, and native PyTorch on a single node. The results show that Spark on Hadoop accelerates training and supports applications with large memory requirements, while PySyft protects data privacy but is slower than both Spark on Hadoop and native PyTorch. All three frameworks achieved comparable accuracy on IID data distributions, whereas PySyft yielded the lowest accuracy on non-IID data. Accordingly, when excluding sensitive data does not significantly affect training results, cluster computing with Spark on Hadoop is recommended; when sensitive data is required for training or improves the trained model, and training time is not a constraint, federated learning with PySyft is recommended.
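For concreteness, the following is a minimal PyTorch sketch of a LeNet-5-style model of the kind used in these experiments, together with an illustrative shard-based non-IID client partition in the spirit of McMahan et al. (2017, cited below). This is a sketch under stated assumptions, not the authors' implementation: the exact layer sizes, the number of shards per client, and the helper names LeNet and non_iid_shards are assumptions made for illustration.

import torch
import torch.nn as nn
import torch.nn.functional as F

class LeNet(nn.Module):
    # LeNet-5-style CNN for 28x28 grayscale inputs such as MNIST.
    # Layer sizes follow the classic LeNet-5, not necessarily the paper's exact variant.
    def __init__(self, num_classes=10):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 6, kernel_size=5, padding=2)  # 28x28 -> 28x28
        self.conv2 = nn.Conv2d(6, 16, kernel_size=5)            # 14x14 -> 10x10
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, num_classes)

    def forward(self, x):
        x = F.max_pool2d(F.relu(self.conv1(x)), 2)  # -> 6 x 14 x 14
        x = F.max_pool2d(F.relu(self.conv2(x)), 2)  # -> 16 x 5 x 5
        x = torch.flatten(x, 1)                     # -> 400 features
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        return self.fc3(x)

def non_iid_shards(labels, num_clients, shards_per_client=2):
    # Sort examples by label, cut them into contiguous shards, and deal a few
    # shards to each client so that every client holds only a few classes.
    order = torch.argsort(labels)
    num_shards = num_clients * shards_per_client
    shard_size = len(labels) // num_shards
    shards = [order[i * shard_size:(i + 1) * shard_size] for i in range(num_shards)]
    deal = torch.randperm(num_shards).tolist()
    return [torch.cat([shards[deal[c * shards_per_client + k]]
                       for k in range(shards_per_client)])
            for c in range(num_clients)]

Under such a split, each client holds examples from only a few classes; this is the kind of statistical heterogeneity on which the abstract reports PySyft performing worst.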
Data availability
The data used in this paper are publicly accessible after agreement, and details are available in the references section.
References
Gupta O, Raskar R (2018) Distributed learning of deep neural network over multiple agents. J Netw Comput Appl 116:1–8
Liu B, Ding Z (2022) A distributed deep reinforcement learning method for traffic light control. Neurocomputing 490:390–399
Duan Y, Wang N, Wu J (2021) Minimizing training time of distributed machine learning by reducing data communication. IEEE Trans Netw Sci Eng 8(2):1802–1814
Gosselin R, Vieu L, Loukil F, Benoit A (2022) Privacy and security in federated learning: a survey. Appl Sci 12(19):9901
Antunes RS, André da Costa C, Küderle A, Yari IA, Eskofier B (2022) Federated learning for healthcare: systematic review and architecture proposal. ACM Trans Intell Syst Technol (TIST) 13(4):1–23
Imteaj A, Amini MH (2022) Leveraging asynchronous federated learning to predict customers financial distress. Intell Syst Appl 14:200064
Xiong K (2009) Multiple priority customer service guarantees in cluster computing. In: 2009 IEEE International symposium on parallel and distributed processing (pp. 1–12). IEEE
Zaharia M, Xin RS, Wendell P, Das T, Armbrust M, Dave A, Meng X, Rosen J, Venkataraman S, Franklin MJ, Ghodsi A, Gonzalez J, Shenker S, Stoica I (2016) Apache spark: a unified engine for big data processing. Commun ACM 59(11):56–65
Ohno Y, Morishima S, Matsutani H (2016) Accelerating spark RDD operations with local and remote GPU devices. In: 2016 IEEE 22nd international conference on parallel and distributed systems (ICPADS) (pp. 791–799). IEEE
Mothukuri V, Parizi RM, Pouriyeh S, Huang Y, Dehghantanha A, Srivastava G (2021) A survey on security and privacy of federated learning. Futur Gener Comput Syst 115:619–640
Sattler F, Wiedemann S, Müller KR, Samek W (2019) Robust and communication-efficient federated learning from non-IID data. IEEE Trans Neural Netw Learn Syst 31(9):3400–3413
Xu G, Shen C, Liu M, Zhang F, Shen W (2017) A user behavior prediction model based on parallel neural network and k-nearest neighbor algorithms. Clust Comput 20:1703–1715
Ziller A, Trask A, Lopardo A, Szymkow B, Wagner B, Bluemke E, Nounahon J-M, Passerat-Palmbach J, Prakash K, Rose N, Ryffel T, Reza ZN, Kaissis G (2021) PySyft: a library for easy federated learning. In: Federated learning systems: towards next-generation AI. Springer, Cham, pp 111–139
LeCun Y, Cortes C, Burges CJ (1998) The MNIST database of handwritten digits. http://yann.lecun.com/exdb/mnist/
Krizhevsky A, Hinton G (2009) Learning multiple layers of features from tiny images. Technical report, University of Toronto
LeCun Y, Bottou L, Bengio Y, Haffner P (1998) Gradient-based learning applied to document recognition. Proc IEEE 86(11):2278–2324
Jia Z, Zaharia M, Aiken A (2019) Beyond data and model parallelism for deep neural networks. Proc Mach Learn Syst 1:1–13
Zhang H, Li Y, Deng Z, Liang X, Carin L, Xing E (2020) Autosync: learning to synchronize for data-parallel distributed deep learning. Adv Neural Inf Process Syst 33:906–917
Lin X, Wang P, Wu B (2013) Log analysis in cloud computing environment with Hadoop and Spark. In: 2013 5th IEEE International conference on broadband network & multimedia technology (pp. 273–276). IEEE
Salloum S, Dautov R, Chen X, Peng PX, Huang JZ (2016) Big data analytics on Apache Spark. Int J Data Sci Analyt 1:145–164
Lal DK, Suman U (2019) Towards comparison of real time stream processing engines. In: 2019 IEEE conference on information and communication technology (pp. 1–5). IEEE
Shvachko K, Kuang H, Radia S, Chansler R (2010) The Hadoop distributed file system. In: 2010 IEEE 26th symposium on mass storage systems and technologies (MSST) (pp. 1–10). IEEE
Zaharia M, Chowdhury M, Das T, Dave A, Ma J, McCauley M, Franklin MJ, Shenker S, Stoica I (2012) Resilient distributed datasets: a fault-tolerant abstraction for in-memory cluster computing. In: 9th USENIX symposium on networked systems design and implementation (NSDI 12) (pp. 15–28)
Karau H, Konwinski A, Wendell P, Zaharia M (2015) Learning spark: lightning-fast big data analysis. O’Reilly Media Inc, Sebastopol
Zaharia M, Chowdhury M, Franklin MJ, Shenker S, Stoica I (2010) Spark: cluster computing with working sets. In: 2nd USENIX workshop on hot topics in cloud computing (HotCloud 10)
Rathore MM, Son H, Ahmad A, Paul A, Jeon G (2018) Real-time big data stream processing using GPU with Spark over Hadoop ecosystem. Int J Parallel Prog 46:630–646
Yang Q, Liu Y, Chen T, Tong Y (2019) Federated machine learning: concept and applications. ACM Trans Intell Syst Technol (TIST) 10(2):1–19
Lindell Y (2020) Secure multiparty computation. Commun ACM 64(1):86–96
Yi X, Paulet R, Bertino E (2014) Homomorphic encryption. Springer International Publishing, Cham, pp 27–46
Abadi M, Chu A, Goodfellow I, McMahan HB, Mironov I, Talwar K, Zhang L (2016) Deep learning with differential privacy. In: Proceedings of the 2016 ACM SIGSAC conference on computer and communications security (pp. 308–318)
Google (2019) TensorFlow Federated. [Online]. Available: https://www.tensorflow.org/federated
WeBank FinTech (2019) FATE (Federated AI Technology Enabler). [Online]. Available: https://github.com/WeBankFinTech/FATE
Ramaswamy S, Mathews R, Rao K, Beaufays F (2019) Federated learning for emoji prediction in a mobile keyboard. arXiv preprint arXiv:1906.04329
Webank (2019) FedAI ecosystem. https://cn.fedai.org/cases/. Accessed 2019
Zhu X, Wang J, Hong Z, Xia T, Xiao J (2019) Federated learning of unsegmented Chinese text recognition model. In: 2019 IEEE 31st international conference on tools with artificial intelligence (ICTAI) (pp. 1341–1345). IEEE
Konečný J, McMahan HB, Yu FX, Richtárik P, Suresh AT, Bacon D (2016) Federated learning: strategies for improving communication efficiency. arXiv preprint arXiv:1610.05492
McMahan B, Moore E, Ramage D, Hampson S, y Arcas BA (2017) Communication-efficient learning of deep networks from decentralized data. Artif Intell Stat 54:1273–1282
Li T, Sahu AK, Zaheer M, Sanjabi M, Talwalkar A, Smith V (2020) Federated optimization in heterogeneous networks. Proc Mach Learn Syst 2:429–450
Mahmud MS, Huang JZ, Salloum S, Emara TZ, Sadatdiynov K (2020) A survey of data partitioning and sampling methods to support big data analysis. Big Data Min Analyt 3(2):85–101
Guo X, Pimentel AD, Stefanov T (2023) Automated exploration and implementation of distributed CNN inference at the edge. IEEE Internet Things J 10(7):5843–5858
Azab M, Samir M, Samir E (2022) “MystifY”: a proactive moving-target defense for a resilient SDN controller in software defined CPS. Comput Commun 189:205–220
Acknowledgements
This work was supported by the National Science and Technology Council, Taiwan, R.O.C. [grant number NSTC 111-2221-E-025-008].
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Chang, J.W., Hung, J.C. & Chu, T.H. Exploring the distributed learning on federated learning and cluster computing via convolutional neural networks. Neural Comput & Applic 36, 2141–2153 (2024). https://doi.org/10.1007/s00521-023-09160-1