Dynamic Kernel/Device Mapping Strategies for GPU-Assisted HPC Systems

Jiadong Wu, Weiming Shi & Bo Hong

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 7698)

Included in the following conference series:

Workshop on Job Scheduling Strategies for Parallel Processing

Abstract

With their high computation throughput and outstanding performance-per-watt figures, graphics processing units (GPUs) are becoming increasingly important for high-performance computing (HPC) systems. Existing GPU execution environments restrict GPU usage to the local host node. This is suitable for standalone compute nodes, but becomes inefficient for HPC systems that consist of a large number of GPU-assisted nodes. In this paper, a novel framework is proposed to support dynamic GPU kernel/device mapping strategies for HPC systems. Adaptive mapping policies are designed to mitigate the impact of network transfer overhead. The performance of the framework is studied through extensive simulations. The results show that, compared with the existing local-only static mapping method, the proposed framework improves system-wide GPU utilization and computation throughput, especially when the concurrent workloads exhibit different GPU usage intensities.
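
To make the contrast in the abstract concrete, the following toy model compares a local-only static mapping with one possible dynamic mapping. It is a minimal sketch and not the framework proposed in the paper: the greedy earliest-finish policy, the `Kernel` fields, the function names, and the flat per-kernel transfer penalty are all assumptions introduced here for illustration, and times are in arbitrary units.

```python
from dataclasses import dataclass


@dataclass
class Kernel:
    home: int        # index of the node (and its GPU) that issued the kernel
    run_time: float  # GPU execution time, arbitrary units
    transfer: float  # assumed extra cost when the kernel runs on a remote GPU


def local_only(kernels, num_gpus):
    """Static mapping: every kernel runs on the GPU of its own node."""
    busy = [0.0] * num_gpus
    for k in kernels:
        busy[k.home] += k.run_time
    return busy


def dynamic(kernels, num_gpus):
    """Hypothetical greedy policy: place each kernel on the GPU where it
    would finish earliest, charging the transfer cost for remote GPUs."""
    busy = [0.0] * num_gpus
    for k in kernels:
        def finish(g):
            penalty = 0.0 if g == k.home else k.transfer
            return busy[g] + k.run_time + penalty

        best = min(range(num_gpus), key=finish)
        busy[best] = finish(best)
    return busy


# Unbalanced workload: node 0 issues four kernels, node 1 issues one.
workload = [Kernel(0, 10.0, 2.0)] * 4 + [Kernel(1, 10.0, 2.0)]
static_makespan = max(local_only(workload, 2))
dynamic_makespan = max(dynamic(workload, 2))
```

With this workload the local-only mapping finishes at time 40.0, because GPU 0 does almost all the work, while the greedy dynamic mapping finishes at 32.0 by offloading kernels to the idle GPU despite the transfer penalty. If the assumed transfer cost is raised far above the kernel run time, the same policy keeps every kernel local, which is the kind of trade-off an adaptive policy has to weigh.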

Author information

Authors and Affiliations

School of Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, GA, 30332, USA
Jiadong Wu, Weiming Shi & Bo Hong

Editor information

Editors and Affiliations

Google, 1600 Amphitheater Parkway, 94043, Mountain View, CA, USA
Walfredo Cirne
Mathematics and Computer Science Division, Argonne National Laboratory, Bldg 240, 60439, Argonne, IL, USA
Narayan Desai
Facebook Inc., 1601 Willow Road, 94025, Menlo Park, CA, USA
Eitan Frachtenberg
Robotics Research Institute, TU Dortmund, Otto-Hahn-Str. 8, 44227, Dortmund, Germany
Uwe Schwiegelshohn

About this paper

Cite this paper

Wu, J., Shi, W., Hong, B. (2013). Dynamic Kernel/Device Mapping Strategies for GPU-Assisted HPC Systems. In: Cirne, W., Desai, N., Frachtenberg, E., Schwiegelshohn, U. (eds) Job Scheduling Strategies for Parallel Processing. JSSPP 2012. Lecture Notes in Computer Science, vol 7698. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35867-8_6

DOI: https://doi.org/10.1007/978-3-642-35867-8_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-35866-1
Online ISBN: 978-3-642-35867-8
eBook Packages: Computer Science, Computer Science (R0)
