Abstract
Federated learning (FL) aims to learn a model with privacy protection through a distributed scheme over many clients. An important problem in FL is reducing the amount of data transmitted between clients and the parameter server during gradient uploading. Because the FL environment is unstable and requires enough client responses to be collected within a certain period of time, traditional model compression practices are not entirely suitable for the FL setting. For instance, both the design of low-rank filters and the algorithms used to pursue sparse neural networks generally require additional local training rounds to ensure that model accuracy is not excessively degraded. To break through the transmission bottleneck, we propose FedLR, a low-rank communication scheme that compresses the whole neural network in the client reporting phase. Our key innovation is the concept of an optimal compression rate. In addition, two measures are introduced to compensate for the accuracy loss caused by truncation: training a low-rank parameter matrix and using iterative averaging. The algorithm is verified by experimental evaluation on public datasets. In particular, the parameters of a CNN model trained on the MNIST dataset can be compressed 32 times with only a 2% loss of accuracy.
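The compression step described above can be pictured as truncated SVD applied to a client's parameter matrices before uploading. The following is a minimal NumPy sketch of that idea only; the function names, layer shape, and chosen rank are illustrative assumptions and do not reproduce the paper's FedLR algorithm or its optimal compression rate selection.

```python
import numpy as np

def truncated_svd_compress(W: np.ndarray, rank: int):
    """Keep only the top-`rank` singular components of a 2-D parameter matrix.

    The client would upload (U_r, s_r, Vt_r) instead of the full matrix W.
    """
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    return U[:, :rank], s[:rank], Vt[:rank, :]

def reconstruct(U_r: np.ndarray, s_r: np.ndarray, Vt_r: np.ndarray) -> np.ndarray:
    """Server-side reconstruction of the rank-`r` approximation."""
    return (U_r * s_r) @ Vt_r  # equivalent to U_r @ diag(s_r) @ Vt_r

# Hypothetical fully connected layer weight; shape and rank are illustrative only.
W = np.random.randn(512, 256).astype(np.float32)
U_r, s_r, Vt_r = truncated_svd_compress(W, rank=8)

original_size = W.size
compressed_size = U_r.size + s_r.size + Vt_r.size
print("compression ratio ~", original_size / compressed_size)
print("relative reconstruction error:",
      np.linalg.norm(W - reconstruct(U_r, s_r, Vt_r)) / np.linalg.norm(W))
```

In this sketch the upload cost drops from m*n floats to r*(m + n + 1); the accuracy impact of the truncation is what the paper's low-rank training and iterative averaging measures are intended to offset.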
Acknowledgement
This work was supported by National Key R&D Program of China (No. 2017YFC0803700), NSFC grants (No. 61532021 and 61972155), Shanghai Knowledge Service Platform Project (No. ZF1213) and Zhejiang Lab (No. 2019KB0AB04).
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Zhou, H., Cheng, J., Wang, X., Jin, B. (2020). Low Rank Communication for Federated Learning. In: Nah, Y., Kim, C., Kim, SY., Moon, YS., Whang, S.E. (eds) Database Systems for Advanced Applications. DASFAA 2020 International Workshops. Lecture Notes in Computer Science, vol 12115. Springer, Cham. https://doi.org/10.1007/978-3-030-59413-8_1
DOI: https://doi.org/10.1007/978-3-030-59413-8_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-59412-1
Online ISBN: 978-3-030-59413-8