User Co-scheduling for MPI+OpenMP Applications Using OpenMP Semantics

Antoine Capra,
Patrick Carribault,
Jean-Baptiste Besnard,
Allen D. Malony,
Marc Pérache &
Julien Jaeger

Part of the book series: Lecture Notes in Computer Science (LNPSE, volume 10468)

Included in the following conference series:

International Workshop on OpenMP

Abstract

The evolution of parallel architectures towards machines with many-core processors and high node-level concurrency is putting an end to the pure-MPI programming model. Simulation codes must expose multiple levels of parallelism inside and between nodes, combining different programming models (e.g., MPI+X), to productively use current and future supercomputers. MPI+OpenMP is a common hybridization approach. However, recent evolutions in the OpenMP standard present options for how OpenMP tasking constructs might be used when mixing fine-grained computation and communications. Various approaches are discussed and compared in this context. Advantages and limitations of the approaches are detailed, including potential improvements to OpenMP in order to ease both the integration and progress of MPI calls. These methods are applied to a representative stencil code and demonstrate improvements in the overall execution time as a result of more efficient mixing of MPI and OpenMP.

Author information

Authors and Affiliations

ParaTools SAS, Bruyeres-le-Chatel, France
Antoine Capra & Jean-Baptiste Besnard
ParaTools Inc., Eugene, USA
Allen D. Malony
CEA, DAM, DIF, 91297, Arpajon, France
Patrick Carribault, Marc Pérache & Julien Jaeger

Corresponding author

Correspondence to Antoine Capra.

Editor information

Editors and Affiliations

Lawrence Livermore National Laboratory, Livermore, California, USA
Bronis R. de Supinski
Sandia National Laboratories, Albuquerque, New Mexico, USA
Stephen L. Olivier
RWTH Aachen University, Aachen, Germany
Christian Terboven
Stony Brook University, Stony Brook, New York, USA
Barbara M. Chapman
RWTH Aachen University, Aachen, Germany
Matthias S. Müller

About this paper

Cite this paper

Capra, A., Carribault, P., Besnard, JB., Malony, A.D., Pérache, M., Jaeger, J. (2017). User Co-scheduling for MPI+OpenMP Applications Using OpenMP Semantics. In: de Supinski, B., Olivier, S., Terboven, C., Chapman, B., Müller, M. (eds) Scaling OpenMP for Exascale Performance and Portability. IWOMP 2017. Lecture Notes in Computer Science, vol 10468. Springer, Cham. https://doi.org/10.1007/978-3-319-65578-9_14

DOI: https://doi.org/10.1007/978-3-319-65578-9_14
Published: 17 August 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-65577-2
Online ISBN: 978-3-319-65578-9
eBook Packages: Computer Science, Computer Science (R0)
