
CN106874158A - A whole-program power consumption metering method for heterogeneous systems

Info

Publication number
CN106874158A
Authority
CN
China
Prior art keywords
power consumption
heterogeneous
program
segment
processor
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201710020074.5A
Other languages
Chinese (zh)
Inventor
王卓薇
程良伦
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong University of Technology
Original Assignee
Guangdong University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong University of Technology
Priority to CN201710020074.5A
Publication of CN106874158A

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00: Error detection; Error correction; Monitoring
    • G06F11/30: Monitoring
    • G06F11/3003: Monitoring arrangements specially adapted to the computing system or computing system component being monitored
    • G06F11/3006: Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system is distributed, e.g. networked systems, clusters, multiprocessor systems
    • G06F11/3058: Monitoring arrangements for monitoring environmental properties or parameters of the computing system or of the computing system component, e.g. monitoring of power, currents, temperature, humidity, position, vibrations
    • G06F11/3062: Monitoring arrangements for monitoring environmental properties or parameters of the computing system or of the computing system component, e.g. monitoring of power, currents, temperature, humidity, position, vibrations where the monitored property is the power consumption
    • G06F11/34: Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
    • G06F11/3447: Performance evaluation by modeling
    • G06F11/3452: Performance evaluation by statistical analysis
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Quality & Reliability (AREA)
  • Computing Systems (AREA)
  • Computer Hardware Design (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Mathematical Physics (AREA)
  • Power Sources (AREA)

Abstract

The invention discloses a whole-program power consumption metering method for heterogeneous systems, comprising the steps of: establishing the relationship between program execution time and dynamic power consumption when the program is partitioned into multiple computing segments across multiple processors; obtaining a formal description of the communication power consumption of data transfer and dynamic multi-task allocation; and analyzing the interaction between real-time chip temperature management and static power consumption under a thermal analysis model. Compared with the prior art, the invention analyzes the execution of a parallel program on a heterogeneous system and models the power consumption of multiple parallel segments. It also accounts for the communication overhead of task communication between the main processor and the accelerator processor and for the leakage current caused by rising chip temperature, so that the power consumption of a heterogeneous parallel system is measured accurately from a whole-program perspective.

Description

A whole-program power consumption metering method for heterogeneous systems

Technical Field

The present invention relates to the field of heterogeneous systems, and in particular to a whole-program power consumption metering method for heterogeneous systems.

Background

Accurate power metering is the basis for power optimization targeted at a specific architecture. Research on power metering methods for heterogeneous systems is still limited, and most existing methods are adapted from methods designed for homogeneous systems. A heterogeneous system, however, integrates several different types of processors (mainly main processors and accelerator processors). First, these processors have different architectures. Second, the main processor and the accelerator processors are mostly connected through the system bus, so scheduling an accelerator to perform accelerated computation necessarily introduces additional communication operations. Third, the dense processing units of an accelerator processor make its chip temperature higher than that of a general-purpose processor, and temperature affects static power consumption, so the share of static power consumption gradually increases. The objects of power metering in a heterogeneous system are therefore more complex than in a homogeneous system.

Traditional power metering essentially models individual processor components or the whole processor; the system power consumption it considers is independent of how the application executes and is determined by the processor alone. In a heterogeneous system, however, because of limitations of the programming model or the architecture, most parallel applications complete the whole application by having the general-purpose microprocessor and the accelerator execute different computing segments in turn. As heterogeneous parallel processing technology and its supporting environment mature, more and more parallel programs will have heterogeneous multiprocessors jointly process a single parallel computing segment in order to fully exploit the system's parallel processing capability. Meanwhile, the main processor and the accelerators in a heterogeneous system mostly exchange data through the PCI interface, whose one-way peak bandwidth is only 8 GB/s. In particular, the on-board memory capacity of accelerator processors such as GPUs can hardly meet the needs of scientific computing applications, which further increases the pressure on data communication bandwidth. For the many data-intensive applications, the data communication overhead between processors contributes considerably to the high power consumption of heterogeneous systems. In addition, as integrated circuits enter nanometer-scale processes, static power consumption from leakage current has exceeded dynamic power consumption and has become the main source of chip power consumption.

Summary of the Invention

To overcome the deficiencies of the prior art, to establish a power metering method for heterogeneous systems from a whole-program perspective, to effectively reduce system energy consumption, and to exploit the efficiency advantages of heterogeneous systems more fully, the present invention proposes a whole-program power consumption metering method for heterogeneous systems.

The technical solution of the present invention is realized as follows:

A whole-program power consumption metering method for heterogeneous systems, comprising the steps of:

S1: For the case where heterogeneous multiprocessors process a single parallel computing segment in parallel, and according to whether the segment is completed by processors of one type or by processors of several different types, analyze the influence of the execution time of a homogeneous computing segment on the dynamic power consumption of that segment, establish the relationship between the power consumption and the execution time of the homogeneous computing segment, and obtain a dynamic power consumption representation based on homogeneous program partitioning;

S2: Analyze the conditions under which a single computing segment achieves optimal power consumption under a time constraint, establish the relationship between the power consumption and the execution time of a heterogeneous computing segment, and obtain a dynamic power consumption representation based on heterogeneous program partitioning;

S3: In a homogeneous computing segment program, taking the parallel data size as the object, analyze the influence of data transfer between the main processor and the accelerator processor on communication energy consumption, and obtain a communication energy consumption representation for homogeneous computing segments;

S4: In a heterogeneous computing segment program, taking the parallel tasks as the object and using the direct relationship between the actual performance of heterogeneous processors and task characteristics, analyze the influence of the partitioning of multiple data-dependent parallel tasks within a single computing segment on communication energy consumption, and obtain a communication energy consumption representation for heterogeneous computing segments;

S5: Taking a multi-core processor chip as the object and using the heat conduction characteristics of the processor cores, build a real-time system thermal analysis model with the equivalent RC circuit method and solve for the chip operating temperature;

S6: Analyze the relationship between chip leakage current and static power consumption, perform curve fitting, and obtain the functional relationship between leakage current and chip temperature and voltage;

S7: Introduce two operating reference temperatures, establish a quadratic function of leakage current with respect to temperature, obtain the functional relationship between static power consumption and chip temperature, and establish a static power consumption metering representation based on real-time temperature management.

Further, the curve fitting in step S6 is performed using HSPICE software.

The beneficial effect of the present invention is that, compared with the prior art, it analyzes the execution of a parallel program on a heterogeneous system and models the power consumption of multiple parallel segments, while also accounting for the communication overhead of task communication between the main processor and the accelerator processor and for the leakage current caused by rising chip temperature. The power consumption of a heterogeneous parallel system is thereby measured accurately from a whole-program perspective.

Brief Description of the Drawings

Fig. 1 is a flowchart of the whole-program power consumption metering method for heterogeneous systems according to the present invention;

Fig. 2 is a schematic diagram of the overall framework of the whole-program power consumption metering method for heterogeneous systems according to the present invention;

Fig. 3 is a classification diagram of heterogeneous parallel programs for the whole-program power consumption metering method for heterogeneous systems according to the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments without creative effort fall within the protection scope of the present invention.

Referring to Fig. 1 and Fig. 2, the whole-program power consumption metering method for heterogeneous systems of the present invention comprises three parts:

(1) Establishing the relationship between program execution time and dynamic power consumption under multi-processor, multi-segment partitioning, comprising the steps of:

S1: For the case where heterogeneous multiprocessors process a single parallel computing segment in parallel, and according to whether the segment is completed by processors of one type or by processors of several different types, analyze the influence of the execution time of a homogeneous computing segment on the dynamic power consumption of that segment, establish the relationship between the power consumption and the execution time of the homogeneous computing segment, and obtain a dynamic power consumption representation based on homogeneous program partitioning;

S2: Analyze the conditions under which a single computing segment achieves optimal power consumption under a time constraint, establish the relationship between the power consumption and the execution time of a heterogeneous computing segment, and obtain a dynamic power consumption representation based on heterogeneous program partitioning;

(2) Obtaining a formal description of the communication power consumption of data transfer and dynamic multi-task allocation, comprising the steps of:

S3: In a homogeneous computing segment program, taking the parallel data size as the object, analyze the influence of data transfer between the main processor and the accelerator processor on communication energy consumption, and obtain a communication energy consumption representation for homogeneous computing segments;

S4: In a heterogeneous computing segment program, taking the parallel tasks as the object and using the direct relationship between the actual performance of heterogeneous processors and task characteristics, analyze the influence of the partitioning of multiple data-dependent parallel tasks within a single computing segment on communication energy consumption, and obtain a communication energy consumption representation for heterogeneous computing segments;

(3) Analyzing the interaction between real-time chip temperature management and static power consumption under a thermal analysis model, comprising the steps of:

S5: Taking a multi-core processor chip as the object and using the heat conduction characteristics of the processor cores, build a real-time system thermal analysis model with the equivalent RC circuit method and solve for the chip operating temperature;

S6: Analyze the relationship between chip leakage current and static power consumption, perform curve fitting, and obtain the functional relationship between leakage current and chip temperature and voltage;

S7: Introduce two operating reference temperatures, establish a quadratic function of leakage current with respect to temperature, obtain the functional relationship between static power consumption and chip temperature, and establish a static power consumption metering representation based on real-time temperature management.

The invention first abstracts the execution of a parallel program on a heterogeneous system. $S$ denotes the serial computing segments, $S = \{s_0, \ldots, s_{n-1}\}$: the program is divided into $n$ segments according to the parallelism of the computing segments, and $s_i$ denotes the workload of the $i$-th computing segment. $C$ denotes a communication segment. $R = \{r_0, \ldots, r_{m-1}\}$ indicates that the heterogeneous parallel system consists of $m$ types of processors; $N_j$ denotes the number of processors of type $r_j$ ($0 \le j \le m-1$); and $v_j$ denotes their speed at the highest frequency (the amount of work a processor completes per unit time). $P$ denotes the parallel computing segments (the first parallel computing segment is completed by the main processor, the second by the main processor and the accelerator in parallel, and the third by the accelerator alone). A parallel computing segment completed by the main processor or the accelerator alone is called a homogeneous computing segment program; one completed jointly by the main processor and the accelerator is called a heterogeneous computing segment program. The execution characteristics of the parallel program are then given symbolic definitions, as shown in Fig. 3.

(1) Dynamic power consumption metering for heterogeneous systems

In a homogeneous computing segment program, if $s_i$ is a serial segment, it is completed by a single processor of type $r_i$; if $s_i$ is a parallel segment, it is completed by all processors of type $r_i$. The relationship between dynamic voltage and processor frequency can be approximated as $f = K V^{\gamma-1}$, where $K$ and $\gamma$ are process-dependent parameters. With the exponent $\alpha$ defined in terms of $\gamma$ (the defining expression is not reproduced in the source text), the dynamic power consumption $P_d$ can be regarded as proportional to the $\alpha$-th power of the frequency $f$, i.e. $P_d = K f^{\alpha}$. Let $t_i$ be the execution time of the $i$-th computing segment, $N_i$ the number of type-$r_i$ processors in the $i$-th computing segment, and $f_i$ the operating frequency of the $r_i$ processors during the $i$-th computing segment. The total power consumption of a homogeneous-segment program can then be expressed as

[equation not reproduced in the source text]
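The relationship $P_d = K f^{\alpha}$ can be illustrated with a minimal sketch. The closed form for `alpha_from_gamma` is an assumption based on the common CMOS approximation $P_d \propto V^2 f$ and is not taken from the source; the function names and the per-segment energy expression (processor count times power times time) are likewise hypothetical.

```python
def alpha_from_gamma(gamma: float) -> float:
    """Exponent alpha under the assumed CMOS relation P_d ~ V^2 * f.

    With f = K * V**(gamma - 1), V scales as f**(1 / (gamma - 1)), so
    P_d ~ f**((gamma + 1) / (gamma - 1)). This closed form is an assumption;
    the source does not reproduce its own definition of alpha.
    """
    if gamma <= 1.0:
        raise ValueError("gamma must exceed 1 for f = K * V**(gamma - 1)")
    return (gamma + 1.0) / (gamma - 1.0)


def dynamic_power(k: float, freq: float, alpha: float) -> float:
    """Dynamic power of one processor, P_d = K * f**alpha."""
    return k * freq ** alpha


def segment_energy(n_procs: int, k: float, freq: float, alpha: float, t: float) -> float:
    """Energy of one homogeneous segment: n_procs processors at freq for time t."""
    return n_procs * dynamic_power(k, freq, alpha) * t
```

Summing `segment_energy` over all segments gives one plausible reading of the whole-program dynamic term; the exact expression used in the patent may differ.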

For a program model composed of multiple computing segments, the goal is to minimize the total power consumption of the whole program under a given execution time constraint $T$. The time constraint $t_i$ of any computing segment $s_i$ is analyzed as follows. If the $i$-th computing segment $s_i$ is a serial segment, it is completed by a single processor, and its execution time satisfies [inequality not reproduced in the source text]. If the $i$-th computing segment $s_i$ is a parallel segment, it is completed in parallel by all processors of type $r_i$, and its execution time $t_i$ satisfies [inequality not reproduced in the source text]. The computing power consumption of a homogeneous program can therefore be expressed as:

[equation not reproduced in the source text]
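The time-constrained minimization described above can be sketched as follows. This is an illustration under stated assumptions, not the patent's formula: throughput is assumed to scale linearly with frequency, all segments share the same constant $K$ and the same per-processor top speed `speed`, and the lower bound on $t_i$ is taken to be the time needed at the highest frequency. All function names are hypothetical.

```python
from typing import List


def min_times(work: List[float], n_procs: List[int], speed: float) -> List[float]:
    """Shortest possible time per segment, reached at the highest frequency."""
    return [s / (n * speed) for s, n in zip(work, n_procs)]


def optimal_times(work: List[float], n_procs: List[int], speed: float,
                  deadline: float, alpha: float) -> List[float]:
    """Split the deadline across segments to minimize total dynamic energy.

    Running segment i for time t_i needs the frequency ratio
    f_i / f_max = t_min_i / t_i, so its energy is proportional to
    n_i * t_min_i**alpha * t_i**(1 - alpha). Setting the derivative of the
    Lagrangian to zero gives t_i proportional to n_i**(1/alpha) * t_min_i.
    """
    t_min = min_times(work, n_procs, speed)
    if sum(t_min) > deadline:
        raise ValueError("deadline cannot be met even at the highest frequency")
    weights = [n ** (1.0 / alpha) * tm for n, tm in zip(n_procs, t_min)]
    total = sum(weights)
    times = [deadline * w / total for w in weights]
    if any(t < tm for t, tm in zip(times, t_min)):
        raise ValueError("optimum would exceed f_max; clipping is not handled here")
    return times


def total_energy(times: List[float], work: List[float], n_procs: List[int],
                 speed: float, k: float, f_max: float, alpha: float) -> float:
    """Sum of n_i * K * f_i**alpha * t_i with f_i = f_max * t_min_i / t_i."""
    t_min = min_times(work, n_procs, speed)
    return sum(n * k * (f_max * tm / t) ** alpha * t
               for n, tm, t in zip(n_procs, t_min, times))
```

With equal processor counts the result reduces to running every segment at the same frequency, which is why the time shares are proportional to the workloads in that case.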

In a heterogeneous computing program, if $s_i$ is a serial segment, it is completed by a single processor of type $r_i$; if $s_i$ is a parallel segment, it is completed jointly by processors of all types in the system. This work focuses on CPU-GPU heterogeneous parallel systems, so only two processor types, CPU and GPU, are considered (all CPUs are assumed to be the same model, and all GPUs the same model).

Let $t_i$ be the execution time of the $i$-th computing segment, $N_C$ the number of CPU processors in the $i$-th computing segment, and $N_G$ the number of GPU processors in the $i$-th computing segment. $k_C$ and $k_G$ are processor-related constants of the CPU and GPU, and $f_C$ and $f_G$ are the operating frequencies of the CPU and GPU processors in the $i$-th computing segment. A further quantity (its symbol is not reproduced in the source text) denotes the amount of work that a type-$j$ processor completes per unit time in the $i$-th computing segment. The total power consumption of a heterogeneous-segment program can then be expressed as

[equation not reproduced in the source text]

The power optimization problem for heterogeneous computing segment programs can in principle be divided into two sub-problems: optimal local power consumption within a computing segment, and optimal overall power consumption of the whole program. The key to the first sub-problem is establishing the relationship between the optimal processor power consumption of a computing segment and its execution time. The second sub-problem is allocating execution time among the computing segments, given optimal power consumption within each segment. The power optimization problem for heterogeneous computing segment programs therefore reduces to a general multivariate extremum problem, and the computing power consumption of a heterogeneous program can be expressed as:

[equation not reproduced in the source text]
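The first sub-problem (optimal power within one segment) can be sketched for the CPU-GPU case. The model below is hypothetical: each processor of type $j$ is assumed to complete `u_j * f_j` units of work per unit time and to draw `k_j * f_j**alpha`, and both processor groups are assumed to run for the whole segment time. The parameters `u_c` and `u_g` and the function names are illustrative and do not come from the source.

```python
def cpu_share(work: float, n_c: int, n_g: int, k_c: float, k_g: float,
              u_c: float, u_g: float, alpha: float) -> float:
    """Workload assigned to the CPUs that minimizes the segment's energy.

    Energy is proportional to b_c * x**alpha + b_g * (work - x)**alpha with
    b_j = k_j * n_j**(1 - alpha) * u_j**(-alpha); the segment time cancels
    out of the optimal split. Requires alpha > 1 (convex energy).
    """
    if alpha <= 1.0:
        raise ValueError("alpha must exceed 1")
    b_c = k_c * n_c ** (1.0 - alpha) * u_c ** (-alpha)
    b_g = k_g * n_g ** (1.0 - alpha) * u_g ** (-alpha)
    ratio = (b_g / b_c) ** (1.0 / (alpha - 1.0))  # equals x / (work - x)
    return work * ratio / (1.0 + ratio)


def hetero_energy(x: float, work: float, t: float, n_c: int, n_g: int,
                  k_c: float, k_g: float, u_c: float, u_g: float,
                  alpha: float) -> float:
    """Energy of one segment when the CPUs take x units of work in time t."""
    f_c = x / (n_c * u_c * t)            # frequency the CPUs need
    f_g = (work - x) / (n_g * u_g * t)   # frequency the GPUs need
    return t * (n_c * k_c * f_c ** alpha + n_g * k_g * f_g ** alpha)
```

Once the split is fixed, the segment's energy depends only on its time $t_i$, and the second sub-problem (dividing the deadline among segments) has the same structure as in the homogeneous case.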

(2) Communication power consumption metering for heterogeneous systems

In a heterogeneous parallel system, the CPU and GPU are connected through the PCI-E bus. The PCI-E bus does not support dynamic voltage/frequency scaling, so the execution speed and power overhead of data communication operations are fixed. The PCI-E bus is treated as a special type of functional unit whose power overhead is $p_{m,0}$ while operating and $p_{m,1}$ while idle. Communication operations are further assumed to be non-interruptible, i.e. multiple data communication operations must execute sequentially. Because the system bus is used exclusively by a single communication operation at a time, the communication overhead is proportional to the data size, and the data size in turn depends on the partitioning strategy of the two data-dependent parallel tasks.

① Communication power consumption metering for homogeneous program segments

In a homogeneous computing segment program, communication power consumption mainly comes from the communication overhead of transferring input data from the CPU to the GPU memory space and storing output data from the GPU back to the CPU memory space. Let the execution time of the communication operations (its symbol is not reproduced in the source text) denote the data communication overhead between the CPU and GPU, and let $t_{m,0}$ denote the time the PCI-E bus spends in the idle state. The communication power consumption of a homogeneous program can then be expressed as

[equation not reproduced in the source text]
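A minimal sketch of this bookkeeping follows, assuming the communication energy is the bus's operating power times the transfer time plus its idle power times the idle time. The function names and the descriptive parameter names (`p_active`, `p_idle`) are hypothetical stand-ins for the bus power symbols, and the bandwidth value is supplied by the caller.

```python
def transfer_time(n_bytes: float, bandwidth: float) -> float:
    """Time to move n_bytes over a bus with a fixed bandwidth (bytes/s)."""
    return n_bytes / bandwidth


def homogeneous_comm_energy(bytes_in: float, bytes_out: float, bandwidth: float,
                            p_active: float, p_idle: float, t_idle: float) -> float:
    """Bus energy for one homogeneous segment.

    Input data goes CPU -> GPU and output data goes GPU -> CPU; the two
    transfers run back to back because the bus is used exclusively.
    """
    t_busy = transfer_time(bytes_in, bandwidth) + transfer_time(bytes_out, bandwidth)
    return p_active * t_busy + p_idle * t_idle
```

Because the bus speed is fixed, the energy grows linearly with the amount of data moved, which matches the proportionality stated above.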

② Communication power consumption metering for heterogeneous program segments

In a heterogeneous computing segment program, communication power consumption mainly comes from the communication overhead generated by the partitioning of multiple data-dependent parallel tasks within a single computing segment. Because the actual performance of heterogeneous processors is directly related to task characteristics, different tasks tend to receive different partitioning strategies, which introduces considerable communication overhead. Let a pairwise term (its symbol is not reproduced in the source text) denote the communication overhead between task $v$ under partitioning scheme $z$ and task $v'$ under partitioning scheme $z'$. The communication power consumption of a heterogeneous program can then be expressed as

[equation not reproduced in the source text]
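The dependence of communication overhead on the partitioning of dependent tasks can be illustrated with a deliberately simple, hypothetical model. Each task's partitioning scheme is reduced to the fraction of its data placed on the GPU, and the data crossing the bus between a producer and a consumer is assumed to be proportional to the difference between their fractions. Neither this model nor the names below come from the source.

```python
from typing import Dict, List, Tuple


def crossing_bytes(size: float, z_producer: float, z_consumer: float) -> float:
    """Bytes that must cross the bus when the GPU fraction changes.

    Assumes the producer's output of `size` bytes is laid out so that only
    the mismatched fraction has to move between CPU and GPU memory.
    """
    return size * abs(z_producer - z_consumer)


def heterogeneous_comm_energy(edges: List[Tuple[str, str, float]],
                              gpu_fraction: Dict[str, float],
                              bandwidth: float, p_active: float) -> float:
    """Bus energy for all dependent task pairs, transferred one after another.

    edges holds (producer, consumer, output_size_in_bytes) triples.
    """
    total = sum(crossing_bytes(size, gpu_fraction[v], gpu_fraction[w])
                for v, w, size in edges)
    return p_active * total / bandwidth
```

Under this model, two dependent tasks that use the same split exchange no data over the bus, while a change of split between them incurs a cost proportional to the mismatch.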

(3) Static power consumption metering for heterogeneous systems

To study the heat conduction characteristics of the processor cores, thermal analysis modeling is performed using the equivalent RC circuit method. The model solves for the operating temperature with the following formula:

[equation not reproduced in the source text]

$T$ and $T_{amb}$ are the chip temperature and the ambient temperature, respectively; $P$ is the chip power consumption at time $t$; and $R_{th}$ and $C_{th}$ are the equivalent thermal resistance and the equivalent thermal capacitance. The system state of the processor is either the active state or the sleep state. The processor executes tasks only in the active state; otherwise it enters the sleep state to reduce power consumption and lower its temperature. The static power consumption in the active state can be expressed as

$P_{static} = N_{gate} I_{leakage} V_{dd}$ (10)
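The equivalent RC thermal model can be sketched with the standard lumped form $C_{th}\,dT/dt = P - (T - T_{amb})/R_{th}$. This is the textbook single-node RC equation and is assumed here, since the patent's own formula is not reproduced in the source text. For constant power over a time step the equation has an exact exponential solution, which the hypothetical helper below uses.

```python
import math


def steady_state_temperature(t_amb: float, r_th: float, power: float) -> float:
    """Temperature reached if the power stays constant forever."""
    return t_amb + r_th * power


def step_temperature(temp: float, t_amb: float, r_th: float, c_th: float,
                     power: float, dt: float) -> float:
    """Advance the chip temperature by dt under constant power.

    Exact solution of C_th * dT/dt = P - (T - T_amb) / R_th: the temperature
    decays toward the steady state with time constant R_th * C_th.
    """
    t_ss = steady_state_temperature(t_amb, r_th, power)
    return t_ss + (temp - t_ss) * math.exp(-dt / (r_th * c_th))
```

Calling `step_temperature` once per scheduling interval, with the power of the active or sleep state, yields a piecewise temperature trace of the kind a real-time thermal model would track.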

Using curve fitting with HSPICE software, the leakage current as a function of temperature and voltage can be written as

[equation not reproduced in the source text]

where $A$, $B$, $\alpha$, $\beta$, $\gamma$, $\delta$, $\mu$, $\eta$ are empirical parameters determined by the manufacturing process. When the operating temperature $T$ varies within the normal range of 300 K to 380 K, one term of this expression (not reproduced in the source text) fluctuates very little. Once $V_{dd}$ is given, the leakage current is further simplified to a quadratic function of temperature by introducing two reference temperatures $T_H$ and $T_L$. The static power consumption associated with the leakage current can then be formally expressed as

[equation not reproduced in the source text]

where

[coefficient definitions not reproduced in the source text]
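The mutual influence between temperature and static power can be illustrated with a small fixed-point sketch. It combines equation (10) with a quadratic leakage model and the steady state of a lumped RC thermal model, $T = T_{amb} + R_{th} P$. The quadratic coefficients `a`, `b`, `c` are hypothetical placeholders for the fitted coefficients, which are not reproduced in the source text, and the steady-state thermal relation is an assumption.

```python
from typing import Tuple


def leakage_current(temp: float, a: float, b: float, c: float) -> float:
    """Per-gate leakage current modeled as a quadratic function of temperature."""
    return a * temp ** 2 + b * temp + c


def static_power(temp: float, n_gate: float, v_dd: float,
                 a: float, b: float, c: float) -> float:
    """Equation (10): P_static = N_gate * I_leakage * V_dd."""
    return n_gate * leakage_current(temp, a, b, c) * v_dd


def solve_operating_point(p_dyn: float, t_amb: float, r_th: float,
                          n_gate: float, v_dd: float,
                          a: float, b: float, c: float,
                          tol: float = 1e-9, max_iter: int = 1000,
                          t_max: float = 500.0) -> Tuple[float, float]:
    """Iterate T = T_amb + R_th * (P_dyn + P_static(T)) to a fixed point.

    Returns (temperature, static power). Raises RuntimeError if the
    temperature passes t_max or the iteration does not settle.
    """
    temp = t_amb
    for _ in range(max_iter):
        new_temp = t_amb + r_th * (p_dyn + static_power(temp, n_gate, v_dd, a, b, c))
        if new_temp > t_max:
            raise RuntimeError("temperature exceeds t_max; possible thermal runaway")
        if abs(new_temp - temp) < tol:
            return new_temp, static_power(new_temp, n_gate, v_dd, a, b, c)
        temp = new_temp
    raise RuntimeError("fixed-point iteration did not converge")
```

The iteration makes the feedback loop explicit: higher temperature raises leakage, leakage raises static power, and static power raises temperature until the two balance.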

Specific embodiments of the present invention have been described above. It should be understood that the present invention is not limited to these specific embodiments; those skilled in the art may make various variations or modifications within the scope of the claims without affecting the substance of the present invention.

The above is a preferred embodiment of the present invention. It should be noted that a person of ordinary skill in the art may make further improvements and refinements without departing from the principle of the present invention, and such improvements and refinements are also regarded as falling within the protection scope of the present invention.

Claims (2)

1. A method for metering the whole-program power consumption of a heterogeneous system, characterized by comprising the following steps:
S1: for a single parallel computing segment processed in parallel by heterogeneous multiprocessors, and according to whether the segment is completed by processors of the same type or by several processors of different types, analyzing the influence of program execution time within a homogeneous computing segment on the dynamic power consumption of that segment, establishing the relation between power consumption and execution time for the homogeneous computing segment, and obtaining a dynamic power consumption representation based on homogeneous program partitioning;
S2: analyzing the condition under which the power consumption of a single computing segment is optimal under a time constraint, establishing the relation between power consumption and execution time for the heterogeneous computing segment, and obtaining a dynamic power consumption representation based on heterogeneous program partitioning;
S3: in a homogeneous computing segment program, taking the parallel data size as the object, analyzing the influence of data transfer between the main processor and the accelerator processor on communication energy consumption, and obtaining a communication energy consumption representation for the homogeneous computing segment;
S4: in a heterogeneous computing segment program, taking the tasks executed in parallel as the object and using the direct relation between the actual efficiency of a heterogeneous processor and the task characteristics, analyzing the influence of partitioning multiple data-dependent parallel tasks within a single computing segment on communication energy consumption, and obtaining a communication energy consumption representation for the heterogeneous computing segment;
S5: taking a multi-core processor chip as the object, using the heat-conduction characteristics of the processor cores and the equivalent RC circuit method to establish a real-time system thermal analysis model, and solving for the operating temperature of the chip;
S6: analyzing the correlation between chip leakage current and static power consumption, and performing curve fitting to obtain the functional relation between leakage current and chip temperature and voltage;
S7: introducing two operating reference temperatures, establishing a quadratic function relating leakage current to temperature, obtaining the functional relation between static power consumption and chip temperature, and establishing a static power consumption metering representation based on real-time temperature management.
2. The method for metering the whole-program power consumption of a heterogeneous system according to claim 1, characterized in that the curve fitting in step S6 is performed using HSPICE software.
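Steps S5 to S7 chain three computations: a thermal model yields the chip temperature, a fitted curve yields leakage current as a function of temperature, and static power follows from that current. The following is a minimal sketch of that chain, not the patented formulas. It assumes a single-node lumped RC thermal model, an ordinary least-squares quadratic fit in plain Python standing in for a circuit-simulation tool, and static power computed as Vdd times leakage current. All function names, parameters (`r_th`, `c_th`, `t_amb`) and sample values are hypothetical, and the quadratic is centered on the mean sample temperature rather than on the two reference temperatures TH and TL, whose exact role is not recoverable from this text.

```python
"""Sketch of steps S5-S7. All names and numbers are illustrative assumptions."""


def simulate_rc_temperature(power_w, r_th, c_th, t_amb, t_start, dt, steps):
    """Single-node equivalent RC thermal model, integrated with forward Euler.

    C_th * dT/dt = P - (T - T_amb) / R_th
    """
    temp = t_start
    for _ in range(steps):
        temp += dt * (power_w - (temp - t_amb) / r_th) / c_th
    return temp


def _det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def fit_quadratic_leakage(temps_k, currents_a):
    """Least-squares fit of I(T) = a*x**2 + b*x + c with x = T - t_ref.

    Centering on the mean temperature t_ref keeps the normal equations
    well conditioned. Returns (a, b, c, t_ref).
    """
    t_ref = sum(temps_k) / len(temps_k)
    xs = [t - t_ref for t in temps_k]
    s = [sum(x ** k for x in xs) for k in range(5)]  # sums of x^k, k = 0..4
    r = [sum(i * x ** k for x, i in zip(xs, currents_a)) for k in range(3)]
    normal = [[s[4], s[3], s[2]],
              [s[3], s[2], s[1]],
              [s[2], s[1], s[0]]]
    rhs = [r[2], r[1], r[0]]
    det = _det3(normal)
    if det == 0:
        raise ValueError("need at least three distinct temperatures")
    coeffs = []
    for col in range(3):  # Cramer's rule: swap one column for the right-hand side
        m = [row[:] for row in normal]
        for row in range(3):
            m[row][col] = rhs[row]
        coeffs.append(_det3(m) / det)
    a, b, c = coeffs
    return a, b, c, t_ref


def static_power(v_dd, temp_k, fit):
    """Static power P = V_dd * I_leak(T) using the fitted quadratic."""
    a, b, c, t_ref = fit
    x = temp_k - t_ref
    return v_dd * (a * x * x + b * x + c)


if __name__ == "__main__":
    # Synthetic leakage samples (made up for illustration, not measured data).
    temps = [300.0, 320.0, 340.0, 360.0, 380.0]
    currents = [2e-6 * (t - 340.0) ** 2 + 1e-4 * (t - 340.0) + 5e-3 for t in temps]
    fit = fit_quadratic_leakage(temps, currents)
    chip_temp = simulate_rc_temperature(40.0, 0.5, 10.0, 300.0, 300.0, 0.01, 10000)
    print(chip_temp, static_power(1.0, chip_temp, fit))
```

In this sketch the thermal model settles at the ambient temperature plus power times thermal resistance, so the static power estimate tracks the workload's dynamic power through the temperature. A multi-core chip would need one RC node per core plus coupling resistances, which this single-node version deliberately leaves out.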
CN201710020074.5A 2017-01-11 2017-01-11 A kind of heterogeneous system Whole Process power consumption metering method Pending CN106874158A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710020074.5A CN106874158A (en) 2017-01-11 2017-01-11 A kind of heterogeneous system Whole Process power consumption metering method

Publications (1)

Publication Number Publication Date
CN106874158A (en) 2017-06-20

Family

ID=59159228

Country Status (1)

Country Link
CN (1) CN106874158A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20170620
