Abstract
We focus on the problem of data mining over large-scale fully distributed databases, where each node stores only one data record. We assume that a data record is never allowed to leave the node where it is stored. Possible motivations for this assumption include privacy concerns or the lack of a centralized infrastructure. To tackle this problem, we earlier proposed the generic gossip learning framework (GoLF), but so far we have studied only basic linear algorithms. In this paper we implement the well-known boosting technique in GoLF. Boosting techniques have attracted growing attention in machine learning due to their outstanding performance in many practical applications. Here, we present an implementation of a boosting algorithm that is based on FilterBoost. Our main algorithmic contribution is the derivation of a purely online multi-class version of FilterBoost, so that it can be employed in GoLF. We also propose improvements to GoLF with the aim of maximizing the diversity of the evolving models gossiped in the network, a feature that we show to be important. We evaluate the robustness and the convergence speed of the algorithm empirically over three benchmark databases. We compare the algorithm with the sequential AdaBoost algorithm, and we test its performance in a failure scenario involving message drop, message delay, and node churn.
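To make the gossip learning setting concrete, the following is a minimal, self-contained sketch of the idea the abstract describes: each node holds exactly one labeled record that never leaves it, and a model takes a random walk over the nodes, receiving one online update per hop. The class names, the synthetic data, and the use of a plain logistic-loss SGD step (rather than the paper's online multi-class FilterBoost) are all illustrative assumptions, not the authors' actual protocol.

```python
import math
import random

random.seed(0)


def make_data(n):
    """Synthetic fully distributed data: one (x, y) record per node,
    two well-separated clusters with labels in {-1, +1}."""
    data = []
    for _ in range(n):
        y = random.choice([-1, 1])
        x = [y + random.gauss(0, 0.5), y + random.gauss(0, 0.5)]
        data.append((x, y))
    return data


class Model:
    """A linear model updated online, one record at a time."""

    def __init__(self, dim):
        self.w = [0.0] * dim

    def predict(self, x):
        s = sum(wi * xi for wi, xi in zip(self.w, x))
        return 1 if s >= 0 else -1

    def update(self, x, y, eta=0.1):
        # One SGD step on the logistic loss log(1 + exp(-y * <w, x>)).
        s = sum(wi * xi for wi, xi in zip(self.w, x))
        g = -y / (1.0 + math.exp(y * s))  # d/ds of the loss
        for i in range(len(self.w)):
            self.w[i] -= eta * g * x[i]


def gossip(data, hops):
    """The model random-walks over the nodes; data never moves.
    Picking a uniformly random node stands in for one hop of the walk."""
    model = Model(2)
    for _ in range(hops):
        x, y = random.choice(data)
        model.update(x, y)
    return model


data = make_data(200)
model = gossip(data, 2000)
acc = sum(model.predict(x) == y for x, y in data) / len(data)
```

In the actual framework, many such models circulate concurrently and nodes may combine the models passing through them; this sketch follows a single walk only, to isolate the core privacy property: training touches every record without any record leaving its node.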
M. Jelasity was supported by the Bolyai Scholarship of the Hungarian Academy of Sciences. This work was partially supported by the FET programme FP7-COSI-ICT of the European Commission through project QLectives (grant no.: 231200) and by the ANR-2010-COSI-002 grant of the French National Research Agency.
References
Filelist (2005), http://www.filelist.org
Asuncion, A.U., Smyth, P., Welling, M.: Asynchronous distributed estimation of topic models for document analysis. Statistical Methodology 8(1), 3–17 (2011)
Babenko, B., Yang, M., Belongie, S.: A family of online boosting algorithms. In: Computer Vision Workshops (ICCV Workshops), pp. 1346–1353 (2009)
Bottou, L.: Large-scale machine learning with stochastic gradient descent. In: Intl. Conf. on Computational Statistics, vol. 19, pp. 177–187 (2010)
Bradley, J., Schapire, R.: FilterBoost: Regression and classification on large datasets. In: Advances in Neural Information Processing Systems, vol. 20. The MIT Press (2008)
Collins, M., Schapire, R., Singer, Y.: Logistic regression, AdaBoost and Bregman distances. Machine Learning 48, 253–285 (2002)
Datta, S., Bhaduri, K., Giannella, C., Wolff, R., Kargupta, H.: Distributed data mining in peer-to-peer networks. IEEE Internet Comp. 10(4), 18–26 (2006)
Fan, W., Stolfo, S.J., Zhang, J.: The application of AdaBoost for distributed, scalable and on-line learning. In: Proc. 5th ACM SIGKDD Intl. Conf. on Knowledge Discovery and Data Mining, pp. 362–366 (1999)
Frank, A., Asuncion, A.: UCI machine learning repository (2010)
Freund, Y., Schapire, R.E.: Experiments with a new boosting algorithm. In: Machine Learning: Proc. Thirteenth Intl. Conf., pp. 148–156 (1996)
Freund, Y., Schapire, R.E.: A decision-theoretic generalization of on-line learning and an application to boosting. J. of Comp. and Syst. Sci. 55, 119–139 (1997)
Friedman, J.: Stochastic gradient boosting. Computational Statistics and Data Analysis 38(4), 367–378 (2002)
Jelasity, M., Canright, G., Engø-Monsen, K.: Asynchronous Distributed Power Iteration with Gossip-Based Normalization. In: Kermarrec, A.-M., Bougé, L., Priol, T. (eds.) Euro-Par 2007. LNCS, vol. 4641, pp. 514–525. Springer, Heidelberg (2007)
Jelasity, M., Montresor, A., Babaoglu, O.: Gossip-based aggregation in large dynamic networks. ACM Trans. on Computer Systems 23(3), 219–252 (2005)
Kégl, B., Busa-Fekete, R.: Boosting products of base classifiers. In: Intl. Conf. on Machine Learning, Montreal, Canada, vol. 26, pp. 497–504 (2009)
Kempe, D., Dobra, A., Gehrke, J.: Gossip-based computation of aggregate information. In: Proc. 44th Annual IEEE Symposium on Foundations of Computer Science (FOCS 2003), pp. 482–491. IEEE Computer Society (2003)
Kowalczyk, W., Vlassis, N.: Newscast EM. In: 17th Advances in Neural Information Processing Systems (NIPS), pp. 713–720. MIT Press, Cambridge (2005)
Luo, P., Xiong, H., Lü, K., Shi, Z.: Distributed classification in peer-to-peer networks. In: Proc. 13th ACM SIGKDD Intl. Conf. on Knowledge Discovery and Data Mining (KDD 2007), pp. 968–976. ACM, New York (2007)
Ormándi, R., Hegedűs, I., Jelasity, M.: Asynchronous Peer-to-Peer Data Mining with Stochastic Gradient Descent. In: Jeannot, E., Namyst, R., Roman, J. (eds.) Euro-Par 2011, Part I. LNCS, vol. 6852, pp. 528–540. Springer, Heidelberg (2011)
Ormándi, R., Hegedűs, I., Jelasity, M.: Efficient P2P ensemble learning with linear models on fully distributed data. CoRR abs/1109.1396 (2011)
Oza, N., Russell, S.: Online bagging and boosting. In: Proc. Eighth Intl. Workshop on Artificial Intelligence and Statistics (2001)
Park, B.H., Kargupta, H.: Distributed data mining: Algorithms, systems, and applications. In: Ye, N. (ed.) The Handbook of Data Mining. CRC Press (2003)
PeerSim, http://peersim.sourceforge.net/
Schapire, R.E., Singer, Y.: Improved boosting algorithms using confidence-rated predictions. Machine Learning 37(3), 297–336 (1999)
Siersdorfer, S., Sizov, S.: Automatic Document Organization in a P2P Environment. In: Lalmas, M., MacFarlane, A., Rüger, S.M., Tombros, A., Tsikrika, T., Yavlinsky, A. (eds.) ECIR 2006. LNCS, vol. 3936, pp. 265–276. Springer, Heidelberg (2006)
Tölgyesi, N., Jelasity, M.: Adaptive Peer Sampling with Newscast. In: Sips, H., Epema, D., Lin, H.-X. (eds.) Euro-Par 2009. LNCS, vol. 5704, pp. 523–534. Springer, Heidelberg (2009)
Widrow, B., Hoff, M.E.: Adaptive Switching Circuits. In: 1960 IRE WESCON Convention Record, vol. 4, pp. 96–104 (1960)
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
Cite this paper
Hegedűs, I., Busa-Fekete, R., Ormándi, R., Jelasity, M., Kégl, B. (2012). Peer-to-Peer Multi-class Boosting. In: Kaklamanis, C., Papatheodorou, T., Spirakis, P.G. (eds) Euro-Par 2012 Parallel Processing. Euro-Par 2012. Lecture Notes in Computer Science, vol 7484. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-32820-6_39
DOI: https://doi.org/10.1007/978-3-642-32820-6_39
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-32819-0
Online ISBN: 978-3-642-32820-6