Abstract
In recent years, the computer servers and data center facilities that provide high performance computing (HPC) for scientific applications have grown substantially in number and have become major consumers of electrical power. Supercomputers often run at peak performance to execute scientific applications efficiently, and therefore consume an enormous amount of power, which increases operational cost. Furthermore, higher power consumption raises the temperature of the physical HPC systems, which in turn translates into increased failure rates and decreased reliability. Slowing down these HPC systems by reducing the speed of individual processors results in a loss of execution performance for the scientific application, due to the variation in processing speed. Another cause of degraded execution performance is variation in the availability of computational resources when other applications execute on the same computing node in a space-shared manner. Variations in processor availability can lead to severe performance degradation through load imbalance and to violations of performance objectives, such as meeting a deadline, and may therefore result in high penalties in terms of revenue loss to service providers. This chapter presents a utility-based, power-aware approach that uses a model-based control-theoretic framework for executing scientific applications. The approach and related simulations indicate that the performance and power requirements of the system can be adjusted dynamically while maintaining predefined quality of service (QoS) goals, in terms of execution deadline and power consumption of the HPC system, even in the presence of perturbations in computational resources.
This approach is autonomic, performance directed, dynamically controlled, and independent of (does not interfere with) the execution of the application.
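As a rough illustration of what a utility-based, power-aware control decision might look like, the following sketch picks a processor frequency that trades power draw against a predicted deadline overrun, given the observed processor availability. It is not the chapter's controller: the cubic power model, the cost weights, the frequency set, and all names (`State`, `power`, `cost`, `choose_frequency`) are assumptions introduced here for illustration only.

```python
from dataclasses import dataclass


@dataclass
class State:
    remaining_work: float  # remaining work in giga-cycles (hypothetical unit)
    time_left: float       # seconds left until the deadline
    availability: float    # fraction of the processor available to the application, in (0, 1]


def power(freq_ghz, static_w=20.0, coeff=10.0):
    """Assumed power model: static power plus a term cubic in frequency."""
    return static_w + coeff * freq_ghz ** 3


def predicted_finish_time(state, freq_ghz):
    """Seconds needed to finish the remaining work at this frequency and availability."""
    return state.remaining_work / (freq_ghz * state.availability)


def cost(state, freq_ghz, deadline_weight=1000.0):
    """Assumed utility cost: power drawn plus a weighted predicted deadline overrun."""
    overrun = max(0.0, predicted_finish_time(state, freq_ghz) - state.time_left)
    return power(freq_ghz) + deadline_weight * overrun


def choose_frequency(state, frequencies=(1.0, 1.5, 2.0, 2.5)):
    """Return the candidate frequency (GHz) with the lowest cost for the current state."""
    return min(frequencies, key=lambda f: cost(state, f))


if __name__ == "__main__":
    # Full availability: the lowest frequency already meets the deadline.
    print(choose_frequency(State(remaining_work=100.0, time_left=100.0, availability=1.0)))
    # Half the processor is taken by another application: a higher frequency is selected.
    print(choose_frequency(State(remaining_work=100.0, time_left=100.0, availability=0.5)))
```

In a running system such a decision would be re-evaluated at every control interval with fresh measurements, so that a drop in availability is compensated by a higher frequency only for as long as the deadline requires it.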
Acknowledgment
The authors would like to thank the National Science Foundation (NSF) for its support of this work through the grant NSF IIP-1034897.
Copyright information
© 2015 Springer Science+Business Media New York
About this chapter
Cite this chapter
Mehrotra, R., Banicescu, I., Srivastava, S., Abdelwahed, S. (2015). A Power-Aware Autonomic Approach for Performance Management of Scientific Applications in a Data Center Environment. In: Khan, S., Zomaya, A. (eds) Handbook on Data Centers. Springer, New York, NY. https://doi.org/10.1007/978-1-4939-2092-1_5
DOI: https://doi.org/10.1007/978-1-4939-2092-1_5
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4939-2091-4
Online ISBN: 978-1-4939-2092-1
eBook Packages: Computer Science, Computer Science (R0)