Abstract
BlueGene/L currently holds the top position on the Top500 list [4]. In its full configuration the system will comprise 65,536 compute nodes. Application scalability is a crucial issue for a system of this size. On BlueGene/L, scalability is made possible through the efficient exploitation of special communication networks. The BlueGene/L system software provides its own optimized versions of collective communication routines in addition to the general-purpose MPICH2 implementation. The collective network is a natural platform for reduction operations because of its built-in arithmetic units. Unfortunately, the ALUs of the collective network can handle only fixed-point operands, so exploiting that network efficiently for floating-point reductions is a challenging task. In this paper we present our experiences with implementing an efficient collective network algorithm for Allreduce sums of floating-point numbers.
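To make the difficulty concrete, the following is a minimal sketch of one general way to compute a floating-point sum when the reduction hardware offers only integer operations. It is an illustration under stated assumptions, not the algorithm of this paper, which the abstract does not describe. The function name `allreduce_sum_fixed_point` and the parameter `mantissa_bits` are hypothetical, and Python's built-in `max` and `sum` stand in for integer reductions performed by a network.

```python
import math

def allreduce_sum_fixed_point(values, mantissa_bits=52):
    """Sum floats using only integer max and integer add as reduction steps.

    Hypothetical illustration: each entry of `values` plays the role of
    one node's contribution to the reduction.
    """
    # Phase 1: every node extracts its binary exponent; an integer
    # max-reduction finds the largest exponent among all contributions.
    exponents = [math.frexp(v)[1] for v in values]
    max_exp = max(exponents)  # stands in for an integer max reduction

    # Phase 2: every node rescales its value to a shared fixed-point
    # format aligned to max_exp, so the operands become plain integers.
    # Contributions with small exponents lose low-order bits here.
    scale = mantissa_bits - max_exp
    fixed = [round(math.ldexp(v, scale)) for v in values]
    total = sum(fixed)  # stands in for an integer add reduction

    # Every node converts the integer result back to floating point.
    return math.ldexp(total, -scale)


print(allreduce_sum_fixed_point([1.5, 2.25, -0.75, 1024.0]))  # 1027.0
```

The sketch needs two reduction passes and rounds small contributions relative to the largest exponent. Trade-offs of this kind are part of what makes an efficient implementation challenging.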
References
Adiga, N.R., et al.: An overview of the BlueGene/L supercomputer. In: SC 2002 – High Performance Networking and Computing, Baltimore, MD (November 2002)
Almási, G., Bellofatto, R., Brunheroto, J., Caşcaval, C., Castaños, J.G., Ceze, L., Crumley, P., Erway, C., Gagliano, J., Lieber, D., Martorell, X., Moreira, J.E., Sanomiya, A., Strauss, K.: An overview of the BlueGene/L system software organization. In: Proceedings of Euro-Par 2003 Conference, Klagenfurt, Austria, August 2003. LNCS. Springer, Heidelberg (2003)
Almasi, G., et al.: Cellular supercomputing with system-on-a-chip. In: IEEE International Solid-state Circuits Conference ISSCC (2001)
Dongarra, J., Meuer, H.-W., Strohmaier, E.: TOP500 Supercomputer Sites. Available at: http://www.netlib.org/benchmark/top500.html
Shuler, L., Riesen, R., Jong, C., van Dresser, D., Maccabe, A.B., Fisk, L.A., Stallcup, T.M.: The PUMA operating system for massively parallel computers. In: Proceedings of the Intel Supercomputer Users’ Group. 1995 Annual North America Users’ Conference (June 1995)
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Almási, G., Dózsa, G., Erway, C.C., Steinmacher-Burow, B. (2005). Efficient Implementation of Allreduce on BlueGene/L Collective Network. In: Di Martino, B., Kranzlmüller, D., Dongarra, J. (eds) Recent Advances in Parallel Virtual Machine and Message Passing Interface. EuroPVM/MPI 2005. Lecture Notes in Computer Science, vol 3666. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11557265_12
DOI: https://doi.org/10.1007/11557265_12
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-29009-4
Online ISBN: 978-3-540-31943-6
eBook Packages: Computer Science, Computer Science (R0)