Thermal-aware Adaptive Platform Management for Heterogeneous Embedded Systems

Published: 22 September 2021

Abstract

Recent trends in real-time applications have raised the demand for high-throughput embedded platforms with integrated CPU-GPU Systems-on-Chip (SoCs). The enhanced performance of such SoCs, however, comes at the cost of increased power consumption, resulting in significant heat dissipation and high on-chip temperatures. Prolonged periods of high on-chip temperature can accelerate in-circuit ageing, which severely degrades the long-term performance and reliability of the chip. A violation of thermal constraints triggers on-board dynamic thermal management, which may cause timing unpredictability for real-time tasks due to transient performance degradation. Recent work in adaptive software design has explored this issue from a control-theoretic standpoint, striving for smooth thermal envelopes by tuning the core frequency.
Existing techniques do not handle thermal violations for periodic real-time task sets in the presence of dynamic events such as changes in task periodicity, and even less so on heterogeneous SoCs with integrated CPU-GPUs. This work presents an OpenCL runtime extension for thermal-aware scheduling of periodic real-time tasks on heterogeneous multi-core platforms. Our framework mitigates dynamic thermal violations by adaptively tuning task mapping parameters, with the eventual control objective of satisfying both platform-level thermal constraints and task-level deadline constraints. We consider multiple platform-level control actions, namely task migration, frequency tuning, and idle slot insertion, as the task mapping parameters. To the best of our knowledge, this is the first work that considers such a variety of task mapping control actions in the context of heterogeneous embedded platforms. We evaluate the proposed framework on an Odroid-XU4 board using OpenCL benchmarks and demonstrate its effectiveness in reducing thermal violations.
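
To make the three control actions named in the abstract concrete, the following is a minimal illustrative sketch and not the framework's actual algorithm. The abstract does not say how actions are selected, so the priority order (frequency tuning, then migration, then idle slot insertion), the slack test, and all names (`TaskState`, `choose_action`, the device labels, the frequency-level index) are assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class TaskState:
    """Hypothetical mapping parameters for one periodic task."""
    device: str      # "cpu" or "gpu" (illustrative labels)
    freq_level: int  # index into an assumed frequency table, 0 = lowest
    idle_slots: int  # idle slots inserted per period


def choose_action(state, temp_c, limit_c, slack_ms, other_device_temp_c):
    """Return the next mapping for a task, given the current temperature.

    The action ordering below is an assumption, not taken from the paper.
    """
    if temp_c <= limit_c:
        # No thermal violation: keep the current mapping.
        return state
    if state.freq_level > 0 and slack_ms > 0:
        # Frequency tuning: step down one level while deadline slack remains.
        return TaskState(state.device, state.freq_level - 1, state.idle_slots)
    if other_device_temp_c < limit_c:
        # Task migration: move to the other device if it is below the limit.
        other = "gpu" if state.device == "cpu" else "cpu"
        return TaskState(other, state.freq_level, state.idle_slots)
    # Idle slot insertion as the fallback. A real controller would also
    # re-check the deadline here; this sketch omits that check.
    return TaskState(state.device, state.freq_level, state.idle_slots + 1)


if __name__ == "__main__":
    hot = TaskState("cpu", 2, 0)
    print(choose_action(hot, temp_c=85.0, limit_c=80.0,
                        slack_ms=5.0, other_device_temp_c=60.0))
```

A control-theoretic design of the kind the abstract alludes to would replace these fixed rules with a feedback law; the sketch only shows how the three actions differ in what they change.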

        Published In

        ACM Transactions on Embedded Computing Systems  Volume 20, Issue 5s
        Special Issue ESWEEK 2021, CASES 2021, CODES+ISSS 2021 and EMSOFT 2021
        October 2021
        1367 pages
        ISSN: 1539-9087
        EISSN: 1558-3465
        DOI: 10.1145/3481713
        Editor: Tulika Mitra
        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        Published: 22 September 2021
        Accepted: 01 July 2021
        Revised: 01 June 2021
        Received: 01 April 2021
        Published in TECS Volume 20, Issue 5s

        Author Tags

        1. Heterogeneous computing
        2. thermal violation
        3. adaptive thermal management

        Qualifiers

        • Research-article
        • Refereed
