Abstract
Messages exchanged between the aggregator and the parties in a federated learning system can be corrupted, whether by machine glitches or by malicious intent; such corruption is known as a Byzantine failure or Byzantine attack. In many federated learning settings, therefore, replies sent by participants cannot be fully trusted. Consider, for example, a set of competitors collaborating to detect fraud via federated learning, where each party provides local gradients that an aggregator uses to update a global model: the global model can be corrupted when one or more parties send malicious gradients. This necessitates robust gradient aggregation methods that mitigate the adverse effects of Byzantine replies. In this chapter, we focus on mitigating the Byzantine effect when training neural networks in a federated learning setting, with particular attention to parties holding highly disparate training datasets. Disparate, or non-IID, training datasets may take the form of parties with imbalanced proportions of the training labels or with different ranges of feature values. We introduce several state-of-the-art robust gradient aggregation algorithms and examine their performance as defenses against various attack settings. We empirically show the limitations of some existing robust aggregation algorithms, especially under certain Byzantine attacks and when parties exhibit non-IID data distributions. Moreover, we show that LayerwisE Gradient AggregaTiOn (LEGATO) is more computationally efficient than many existing robust aggregation algorithms and more generally robust across a variety of attack settings.
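To make the aggregation step concrete, the minimal sketch below (our own illustration, not code from this chapter) contrasts plain gradient averaging with coordinate-wise median aggregation, one of the robust aggregators from the literature surveyed here (Yin et al. 2018). All parties, values, and dimensions in the sketch are hypothetical.

```python
# Illustrative sketch: how one Byzantine party corrupts naive averaging,
# and how a robust aggregator (coordinate-wise median) limits the damage.
import numpy as np

rng = np.random.default_rng(0)

# Four honest parties report gradients near the true gradient (here, all ones).
honest = [np.ones(4) + 0.01 * rng.standard_normal(4) for _ in range(4)]

# One Byzantine party sends an arbitrarily large, oppositely signed gradient.
byzantine = -100.0 * np.ones(4)
reports = honest + [byzantine]

mean_agg = np.mean(reports, axis=0)      # pulled far from the honest gradient
median_agg = np.median(reports, axis=0)  # stays close to the honest gradient

print("mean:  ", np.round(mean_agg, 2))    # roughly -19.2 in each coordinate
print("median:", np.round(median_agg, 2))  # roughly 1.0 in each coordinate
```

Note that this kind of protection weakens in exactly the regime the chapter studies: under non-IID data, honest gradients are themselves spread out, so a robust aggregator can no longer rely on honest replies clustering tightly together.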
References
Blanchard P, El Mhamdi EM, Guerraoui R, Stainer J (2017) Machine learning with adversaries: byzantine tolerant gradient descent. In: Advances in neural information processing systems, vol 30
Charikar M, Steinhardt J, Valiant G (2017, June) Learning from untrusted data. In: Proceedings of the 49th annual ACM SIGACT symposium on theory of computing, pp 47–60
Chen Y, Su L, Xu J (2017) Distributed statistical machine learning in adversarial settings: byzantine gradient descent. Proc ACM Meas Anal Comput Syst 1(2):1–25
El-Mhamdi EM, Guerraoui R, Rouault S (2020) Distributed momentum for byzantine-resilient learning. Preprint. arXiv:2003.00010
Fung C, Yoon CJ, Beschastnikh I (2018) Mitigating sybils in federated learning poisoning. Preprint. arXiv:1808.04866
Gao D, Liu Y, Huang A, Ju C, Yu H, Yang Q (2019) Privacy-preserving heterogeneous federated transfer learning. In: 2019 IEEE international conference on big data (Big Data). IEEE, pp 2552–2559
Krishnaswamy R, Li S, Sandeep S (2018, June) Constant approximation for k-median and k-means with outliers via iterative rounding. In: Proceedings of the 50th annual ACM SIGACT symposium on theory of computing, pp 646–659
Lamport L, Shostak R, Pease M (1982) The byzantine generals problem. ACM Trans Program Lang Syst 4(3):382–401. https://doi.org/10.1145/357172.357176
Lecun Y, Bottou L, Bengio Y, Haffner P (1998) Gradient-based learning applied to document recognition. Proc IEEE 86:2278–2324
Li T, Sahu AK, Zaheer M, Sanjabi M, Talwalkar A, Smith V (2018) Federated optimization in heterogeneous networks. Preprint. arXiv:1812.06127
Ludwig H, Baracaldo N, Thomas G, Zhou Y, Anwar A, Rajamoni S, Ong Y, Radhakrishnan J, Verma A, Sinn M et al (2020) IBM federated learning: an enterprise framework white paper v0.1. Preprint. arXiv:2007.10987
McMahan B, Moore E, Ramage D, Hampson S, y Arcas BA (2017, April) Communication-efficient learning of deep networks from decentralized data. In: Artificial intelligence and statistics. PMLR, pp 1273–1282
El Mhamdi EM, Guerraoui R, Rouault S (2018, July) The hidden vulnerability of distributed learning in byzantium. In: International conference on machine learning. PMLR, pp 3521–3530
Muñoz-González L, Co KT, Lupu EC (2019) Byzantine-robust federated machine learning through adaptive model averaging. Preprint. arXiv:1909.05125
Pillutla K, Kakade SM, Harchaoui Z (2019) Robust aggregation for federated learning. Preprint. arXiv:1912.13445
Rajput S, Wang H, Charles ZB, Papailiopoulos DS (2019) DETOX: A redundancy-based framework for faster and more robust gradient aggregation. CoRR abs/1907.12205. http://arxiv.org/abs/1907.12205
Varma K, Zhou Y, Baracaldo N, Anwar A (2021) LEGATO: a layerwise gradient aggregation algorithm for mitigating byzantine attacks in federated learning
Wang H, Yurochkin M, Sun Y, Papailiopoulos D, Khazaeni Y (2020) Federated learning with matched averaging. In: International conference on learning representations
Xia Q, Tao Z, Hao Z, Li Q (2019) FABA: an algorithm for fast aggregation against byzantine attacks in distributed neural networks. In: Proceedings of the twenty-eighth international joint conference on artificial intelligence, IJCAI-19. International joint conferences on artificial intelligence organization, pp 4824–4830. https://doi.org/10.24963/ijcai.2019/670
Xie C, Koyejo O, Gupta I (2018) Generalized byzantine-tolerant sgd. Preprint. arXiv:1802.10116
Xie C, Koyejo S, Gupta I (2019, May) Zeno: distributed stochastic gradient descent with suspicion-based fault-tolerance. In: International conference on machine learning. PMLR, pp 6893–6901
Xie C, Koyejo O, Gupta I (2020, August) Fall of empires: breaking byzantine-tolerant sgd by inner product manipulation. In: Uncertainty in artificial intelligence. PMLR, pp 261–270
Yin D, Chen Y, Kannan R, Bartlett P (2018, July) Byzantine-robust distributed learning: towards optimal statistical rates. In: International conference on machine learning. PMLR, pp 5650–5659
Zhang C, Bengio S, Singer Y (2019) Are all layers created equal? Preprint. arXiv:1902.01996
Zhao Y, Li M, Lai L, Suda N, Civin D, Chandra V (2018) Federated learning with non-IID data. Preprint. arXiv:1806.00582
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this chapter
Cite this chapter
Zhou, Y., Baracaldo, N., Anwar, A., Varma, K. (2022). Dealing with Byzantine Threats to Neural Networks. In: Ludwig, H., Baracaldo, N. (eds) Federated Learning. Springer, Cham. https://doi.org/10.1007/978-3-030-96896-0_17
DOI: https://doi.org/10.1007/978-3-030-96896-0_17
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-96895-3
Online ISBN: 978-3-030-96896-0
eBook Packages: Computer Science, Computer Science (R0)