GB2540804A - Hardware Power Management Apparatus and Methods - Google Patents
Hardware Power Management Apparatus and Methods
- Publication number
- GB2540804A GB2540804A GB1513348.1A GB201513348A GB2540804A GB 2540804 A GB2540804 A GB 2540804A GB 201513348 A GB201513348 A GB 201513348A GB 2540804 A GB2540804 A GB 2540804A
- Authority
- GB
- United Kingdom
- Prior art keywords
- component
- power management
- computer device
- algorithm
- load
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F1/00—Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
- G06F1/26—Power supply means, e.g. regulation thereof
- G06F1/32—Means for saving power
- G06F1/3203—Power management, i.e. event-based initiation of a power-saving mode
- G06F1/3206—Monitoring of events, devices or parameters that trigger a change in power modality
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F1/00—Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
- G06F1/26—Power supply means, e.g. regulation thereof
- G06F1/32—Means for saving power
- G06F1/3203—Power management, i.e. event-based initiation of a power-saving mode
- G06F1/3234—Power saving characterised by the action undertaken
- G06F1/3296—Power saving characterised by the action undertaken by lowering the supply or operating voltage
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F1/00—Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
- G06F1/16—Constructional details or arrangements
- G06F1/20—Cooling means
- G06F1/206—Cooling means comprising thermal management
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F1/00—Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
- G06F1/26—Power supply means, e.g. regulation thereof
- G06F1/32—Means for saving power
- G06F1/3203—Power management, i.e. event-based initiation of a power-saving mode
- G06F1/3234—Power saving characterised by the action undertaken
- G06F1/324—Power saving characterised by the action undertaken by lowering clock frequency
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Human Computer Interaction (AREA)
- Power Engineering (AREA)
- Computing Systems (AREA)
- Power Sources (AREA)
- Quality & Reliability (AREA)
Abstract
Power management apparatus 200, 204 and method comprising a processor obtaining data representing perceived performance of a computer device; a processor executing a power management algorithm that uses at least the data representing perceived performance and data representing current load of the computer device as inputs, and a communication unit configured to transmit a hardware configuration signal, configuring a hardware component of the computer device based on an output of the power management algorithm. The power management algorithm may comprise a dynamic voltage and frequency scaling algorithm. A further input may be used by the power management algorithm, which may be power usage, processor load, processor frequency, bus load, bus frequency, storage device activity, or temperature. The data representing perceived performance may be the frames per second generated by the computer device.
Description
Hardware Power Management Apparatus and Methods
The present invention relates to power management of hardware.
Hardware power management is performed by a known component, or group of components, that takes care of tasks including enabling and disabling parts of the hardware, changing the hardware operating properties (e.g. frequency/voltage) and swapping between different hardware modules in order to increase or decrease the performance as needed. Power management is usually implemented in software due to the need to tweak and change it according to the different scenarios of hardware usage. Specific examples of power management include Advanced Power Management (APM), developed by Intel™ and Microsoft™ and supported and used by Linux™ kernels.
Dynamic Voltage and Frequency Scaling (DVFS) is one of the key components of power management. DVFS can increase or decrease the voltage and frequency provided to a component depending upon usage. Typically, the voltage and frequency are decreased in order to reduce power usage, which is particularly useful in battery-powered devices, such as laptops, tablets or mobile telephones, to conserve power. DVFS can typically increase the voltage and frequency provided to the component in order to increase performance, such as when processing load has increased.
Figure 1A schematically illustrates a typical conventional DVFS arrangement 100. Hardware component 102, which may comprise a Central Processing Unit, Graphics Processing Unit, or any other component suitable for power management, can be configured by a DVFS unit 104. Typically, the DVFS unit will send signals to the hardware component in order to increase, decrease or maintain its processing frequency and voltage. The DVFS unit determines this configuration based on input received from a unit 106, which provides information regarding the load of the hardware component based on conventional measurements, such as the percentage of the component’s processing power that is being used or the number of tasks being run.
The load in the example is produced by one or more applications 110 that can be under the command of a user 112. The perceived processing power available to the user can be based on the results 114 produced, which may be expressed in terms of current Frames Per Second (FPS) for graphics operations, or the current speed of processing/installing/downloading in the case of processing operations. The performance of these activities is directly influenced by the hardware component, as configured by the DVFS unit 104. The skilled person will understand that the units described herein can comprise at least one hardware, firmware and/or software module.
Specific examples of known DVFS methods include “interactive” or “on demand” DVFS control, which are the default methods currently included in any Linux™ kernel device and in the Operating Systems (OS) of mobile devices (for example, tablets, smartphones and laptops). These DVFS methods are load based, and increase the frequency and voltage if the load increases and vice-versa. Other examples at a lower hardware and driver level include Intelligent Power Allocation (IPA) from ARM™. This can perform more complex hardware configurations, including reacting to operating temperature and power consumption, typically reducing frequency when the device hits a thermal limit or is likely to hit it in the near future.
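By way of illustration only, the purely load-based behaviour of such "on demand" governors can be sketched as follows (this is not the actual Linux™ kernel code; the thresholds, step policy and frequency table are assumptions chosen for clarity):

```python
# Minimal sketch of a purely load-based ("on demand"-style) DVFS decision.
# Thresholds and the frequency table are illustrative assumptions, not kernel values.

FREQ_TABLE_MHZ = [266, 350, 420, 500, 550, 600, 700]  # example operating points

def ondemand_decision(load_percent: float, current_freq_mhz: int) -> int:
    """Return the next operating frequency based only on the reported load."""
    idx = FREQ_TABLE_MHZ.index(current_freq_mhz)       # current step in the table
    if load_percent > 90 and idx < len(FREQ_TABLE_MHZ) - 1:
        return FREQ_TABLE_MHZ[idx + 1]                 # high load -> raise frequency
    if load_percent < 60 and idx > 0:
        return FREQ_TABLE_MHZ[idx - 1]                 # low load -> lower frequency
    return current_freq_mhz                            # otherwise keep the current point
```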
However, there are limitations to existing DVFS methods. For example, they are unable to identify and optimize trade-offs. Typically, an increase in frequency will create an increase in power consumption and an increase in the perceived performance generated by the hardware unit. While performance increases linearly with frequency/voltage, power increases as a squared function. This leads to a problem that conventional DVFS techniques do not even try to solve, namely finding a balanced point between perceived performance and performance per Watt. Under some circumstances the perceived performance can be high enough to allow some degradation, in order to get closer to the point where the maximum performance per Watt is achieved, without compromising the user experience. Further, conventional DVFS cannot change behaviour based on application domains, or on application outputs. Linux kernel DVFS, IPA and other DVFS or Hardware Power Management (HPM) schemes are also decoupled and do not work together. Under some circumstances they may even make contradictory decisions, leading to fluctuations, overheating or low performance.
Embodiments of the present invention aim to address at least one of the above problems. Embodiments can automatically find trade-offs between power, performance and temperature, as well as controlling temperature, performance and power under a single framework. Embodiments can also provide application domain based power management decision making.
The present inventors have appreciated that the perceived performance or number of frames per second (FPS) generated by a GPU or other processor, particularly when used to generate display output, can advantageously be used to improve the determinations made by a power management/DVFS algorithm. Conventional power management/DVFS for GPUs bases its determination merely on the processing load, and perceived performance factors, such as FPS, are not part of conventional DVFS algorithms, i.e. they are not used in/by conventional DVFS algorithms (as an input/value/variable) to determine the desired hardware component configuration. The present inventors have appreciated that power usage is not equal to perceived performance for the purposes of effective power management and have created improved power management/DVFS techniques that can take into account power, performance and temperature values and which can also balance these to find a better operation point. In conventional DVFS, power usage will increase as long as it provides better performance to the demanding application, leading to very high power usage in some cases, which is neither beneficial to the user nor optimal.
Figures 1B and 1C are graphs showing the results of experiments performed on a Samsung Note 4 device (A53/57 architecture; 8 cores; Exynos 5433 (1.3/1.8 GHz); Mali T760, 600-700 MHz) running the Manhattan benchmark by GFXBench. The conventional DVFS operates at 7.76 Watt -> 10.02 FPS -> 0.77 Watt/FPS. A much better point of operation would be a 600 MHz frequency: 6.11 Watt -> 9.55 FPS -> 0.63 Watt/FPS, that is a 22% power saving with just a 4% performance degradation. However, conventional DVFS methods are unable to make that type of decision, because they do not know about the performance. In contrast, as discussed above, conventional DVFS simply processes a value based on load, and the load in both cases is 100% (because the hardcoded maximum performance value is not achieved in any configuration), so it will continue to push for higher frequencies. This situation worsens after some time, since the temperature will hit the maximum limit and the frequency will be limited to 550 MHz, which leads to a drastic FPS drop (4.7 FPS) and a user-evident feeling that the performance has worsened.
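The trade-off in this experiment can be verified with simple arithmetic using the measured values quoted above; a minimal calculation is given below (labelling the conventional operating point as 700 MHz is an assumption based on the stated 600-700 MHz GPU range):

```python
# Power-per-frame comparison for the two operating points quoted in the text.
conventional = {"watt": 7.76, "fps": 10.02}  # conventional DVFS result (assumed 700 MHz)
alternative  = {"watt": 6.11, "fps": 9.55}   # 600 MHz operating point

wpf_conv = conventional["watt"] / conventional["fps"]          # ~0.77 Watt/FPS
wpf_alt  = alternative["watt"] / alternative["fps"]            # ~0.64 Watt/FPS
power_saving = 1 - alternative["watt"] / conventional["watt"]  # ~21-22 % power saved
fps_loss     = 1 - alternative["fps"] / conventional["fps"]    # ~4-5 % fewer frames

print(f"{wpf_conv:.2f} vs {wpf_alt:.2f} Watt/FPS: "
      f"{power_saving:.0%} power saved for {fps_loss:.0%} performance loss")
```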
According to a first aspect of the present invention there is provided a power management method comprising: obtaining data representing perceived performance of a computer device; executing a power management algorithm that uses at least the data representing perceived performance and data representing current load of the computer device (or a component of the computer device) as inputs, and configuring a hardware component of the computer device based on an output of the power management algorithm.
The data representing perceived performance may comprise an estimate (or measurement) of performance at an end of a processing pipe of the computer device, e.g. an estimate of the performance actually provided to/experienced by a user running at least one application that is responsible for the load that is processed by the component.
The power management algorithm will typically determine a desired operating configuration (typically frequency and/or voltage) of the hardware component. The power management algorithm may comprise a Dynamic Voltage and Frequency Scaling (DVFS) algorithm and the output of the algorithm can be used to configure frequency and/or voltage of the hardware component.
The power management algorithm can include comparing one or more of the inputs with one or more predetermined values, and producing the output based on a result of the comparison(s). The power management algorithm may be a deterministic algorithm or may be implemented by means of a Machine Learning process.
The computing of the estimate of perceived performance for a display application executed by the computer device may comprise estimating an amount of frames generated/rendered by the computer device, typically by the component, e.g. GPU, operating in conjunction with a display driver or display controller. The amount of frames may be converted to FPS, e.g. by dividing the amount of frames rendered during a known time period by the time. The FPS may be used as the data representing perceived performance or may be scaled and used as an input to a neural network/Machine Learning implementation of the power management algorithm.
At least one further input used by the power management algorithm may be selected from a set including: power usage; processor load; processor frequency; bus load; bus frequency; storage device, e.g. disk, input/output activity; temperature (of the computer device or component). The at least one further input may be obtained by the computer device/kernel and/or a sensor, e.g. a temperature sensor.
In embodiments where a power metric is not given by the computer device/component, power usage of the computer device may be estimated using a fixed algorithm or formula that receives inputs based on hardware counters or parameters, or estimated using a neural network trained with data representing power outputs corresponding to a set of metrics of the computer device/component. The set of metrics in a case where the component comprises a GPU can include at least one of: GPU frequency, GPU load and GPU temperature. The set of metrics may include any of internal metrics of the computer device (for example, memory transfers, used clocks, used modules, type of load, rendering size, etc). The skilled person will understand that the metrics described in the specific embodiments below are for a GPU component, and that different hardware may contain different metrics.
The power management algorithm may use a further input representing an estimate of a type of activity being performed by the computer device/component in order to increase the quality of the decisions made by the power management algorithm. The type of activity for a GPU case may be selected from a set of activities, such as a set including a game (2D or 3D), a user interface (UI) and a benchmark operation, among others. The type of activity may be estimated using an algorithm or formula, or using a neural network, which may receive as inputs estimated performance (e.g. FPS for a GPU) of the component and a value representing utility of the component. The value representing utility of the component may be computed using an equation: GPU operating frequency * GPU load percentage. At least one further hardware parameter may be used as an additional input for the estimation, e.g. similar to the parameters listed in relation to the power estimation above. The neural network may process a set of historical said neural network inputs.
The power estimation algorithm may include a Machine Learning process. The Machine Learning process may include: reading a current state of the computer device/component; choosing and performing an action based on the current state; checking a result of the action (typically by reading an updated current state of the computer device/component), and a feedback or reward parameter; and calculating and updating the state to reflect a new learned relationship between the action and the feedback or reward parameter.
The current state of the computer device/component read by the Machine Learning process can include a state from amongst a set of states comprising FPS, GPU frequency, GPU load, thermal, battery level and/or power. The actions chosen by the Machine Learning process can be selected from a set of actions that include: increasing frequency, decreasing frequency, maintaining operating frequency of the component or changing frequency of the component to a specified value. Goals that may be targeted by the Machine Learning process can include a specific thermal point, a highest possible performance and/or a high device load. The reward parameters calculated by the Machine Learning process may include a positive said reward parameter for approaching a said goal and a negative said reward parameter for departing from a said goal. The Machine Learning process may use a neural network, a Q-table representing the set of states and the set of actions, or any other iterative Machine Learning algorithm that eventually learns the best actions from the feedback provided. The Machine Learning can be fine-tuned by means of changing the reward functions or weights of each input, output, or target.
The power management algorithm may be configured to determine a specific thermal point, a highest possible performance and/or a high load for the component. The power management algorithm may be configured to determine an optimal point where maximum performance per Watt of the component is achieved with some/predetermined constraints on minimum performance values of the component. The output of the power management algorithm may be used to configure an operating frequency of the component. The frequency may be selected from amongst a set or range of frequencies.
The component may comprise a Graphics Processing Unit, a Central Processing Unit or a Bus.
According to another aspect of the present invention there is provided apparatus configured to provide a power management function to a computing device, the apparatus including: a processor configured to obtain data representing perceived performance of a computer device; a processor configured to execute a power management algorithm that uses at least the data representing perceived performance and data representing current load of the computer device as inputs, and a communication unit configured to transmit a hardware configuration signal to a component of the computer device based on an output of the power management algorithm.
The apparatus may include, or be in communication with, at least one external device/sensor that, in use, provides at least one input to the power management algorithm.
According to another aspect of the present invention there is provided a computing device including, or in communication with, a power management apparatus substantially as described herein.
According to another aspect of the present invention there is provided computer readable medium (or circuitry) storing a computer program to operate a power management method substantially as described herein.
According to yet another aspect of the present invention there is provided a power management method comprising: executing a Machine Learning implementation of a power management algorithm, and configuring a hardware component of a computer device based on an output of the power management algorithm.
According to yet another aspect of the present invention there is provided a method comprising: executing a Machine Learning process to estimate power usage of a component of a computer device; and/or executing a Machine Learning process to estimate a type of activity performed by a component of a computer device, and performing power management of a hardware component of a computer device based on an output of the Machine Learning process(es).
According to the present invention, there is provided a method, an apparatus and a system as set forth in the appended claims. Other features of the invention will be apparent from the dependent claims, and the description which follows.
For a better understanding of the invention, and to show how embodiments of the same may be carried into effect, reference will now be made, by way of example, to the accompanying diagrammatic drawings in which:
Figure 1A is an overview of a computer device being configured by a known example DVFS;
Figures 1B and 1C are graphs showing the results of experiments relating to DVFS;
Figure 2 is an overview of a computer device being configured by an example DVFS according to a general embodiment of the present invention;
Figure 3 is a block diagram of a DVFS unit configured according to a first specific embodiment;
Figure 4 is a block diagram of a power estimator unit of the DVFS unit;
Figure 5 is a flowchart illustrating steps performed by an example DVFS algorithm of the DVFS unit;
Figure 6 is a block diagram of a DVFS unit configured according to a second specific embodiment;
Figure 7 is a block diagram of a load type estimator unit of the DVFS unit;
Figure 8 is a block diagram of a DVFS unit configured according to a third specific embodiment;
Figure 9 is a block diagram of neural network-related units of the DVFS unit;
Figure 10 is a block diagram of Q-learning-related units of the DVFS unit, and
Figures 11A and 11B are examples of graphs useable by embodiments to select convergence points based on multiple goals.
Figure 2 schematically illustrates how a computer device can be configured by an example power management arrangement 200 according to a general embodiment of the present invention. A hardware component 202 of the computer device can be configured by a power management unit 204. The power management unit will typically be a component of the computer device, although it could be a separate/remote device. In the example, the component comprises a Graphics Processing Unit (GPU) of a computer device; however, the skilled person will appreciate that it could comprise any other component suitable for power management configuration, such as a CPU. For brevity, other common components of the computer device, e.g. communications interface, user input/output unit, etc, are not shown and need not be described herein. Typically, embodiments of the invention may be used in conjunction with battery-powered devices, such as laptops, tablets or mobile telephones, although embodiments can also be beneficial for at least partially mains-powered devices, such as desktop computers.
The power management unit 204 determines a configuration of the component 202 based on the output of a power management algorithm that it executes, examples of which are detailed herein. In specific embodiments, the output of the power management unit may comprise DVFS control signals (of any suitable format, transmitted by any suitable medium, e.g. wires/circuitry) that can increase, decrease or maintain the current operating frequency of the GPU. However, in alternative embodiments, the output of the power management unit may differ, e.g. it can comprise an output representing a specific frequency value; outputs designed to configure frequencies and/or voltages of at least one CPU and/or a bus of the computer device; outputs for configuring active cores of the GPU/processor; outputs for configuring the hardware component by means of enabling/disabling hardware parts, selecting between different hardware parts, and so on.
In general, the power management algorithm executed by the power management unit 204 will receive a set of inputs that it uses as values/variables in the algorithm/set of processing steps in order to determine the frequency value (or any other suitable power management value) to be provided to the component 202, thereby setting the operating characteristics, e.g. voltage and frequency, of the component. Typical processing steps performed by the power management algorithm can include comparing one or more of the inputs with one or more stored values and producing an output based on a result of the comparison(s). Such values can be computed/obtained by the skilled person based on experiments performed for particular hardware components. The processing steps may be based on a deterministic algorithm and/or may be implemented by means of a Machine Learning process.
In the example of Figure 2, one of the inputs comprises a value that is computed based on the output/operating results 214 of the GPU 202 and represents the perceived performance of the computer device. The perceived performance will normally be an estimate of performance at the end of the processing pipe of the computer device, i.e. an estimate of the perceived performance actually provided to the user 212, who may be running at least one application 210 that is responsible for the load 206 processed by the GPU. This is in contrast to a conventional DVFS unit, which uses a measurement (e.g. percentage) of the load of the hardware component as an input value for the DVFS algorithm (as illustrated in Figure 1A), rather than an estimate of perceived performance provided to/experienced by the user.
In a specific embodiment, the perceived performance input 214 to the power management unit 204 represents the FPS generated by the GPU 202, as measured by a display driver or display controller of the computer device. In some embodiments, the computer device comprises an Android™ based mobile device, such as the Note 4 device produced by Samsung, and the FPS is measured at the SurfaceFlinger level, which is a service in the Android™ operating system that takes care of refreshing the display. However, the skilled person will appreciate that this type of data may be obtained in different ways, depending on the computer system/device. It will also be understood that embodiments of the invention can operate with other types of Operating Systems, including, but not limited to, Tizen™, Unix™, etc. The data collected in the specific example corresponds to the amount of frames rendered by the GPU between power management decisions, and may be converted to FPS, e.g. based on knowledge of the time between DVFS decisions. (For example: 5 frames rendered and the time passed between decisions is 100 ms -> FPS: 50 fps -> scaled to [-1,1] -> 0.66666). The skilled person will also understand that other indicators of perceived performance could be used instead of (or in addition to) FPS, such as UI response time (e.g. the time between the press of a button and when a corresponding action is completed, whether a sound, vibration, image, UI change, or other feedback to the user); application completion time (e.g. collected from the information the application provides to the UI through loading indicators in "%", "bars" or loading animations) or the current speed of processing/installing/downloading in the case of processing operations.
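A minimal sketch of this frame-count-to-FPS conversion and scaling is given below; the 60 FPS ceiling used for the normalisation is an assumption that is consistent with the worked example above (50 FPS -> 0.66666):

```python
def frames_to_scaled_fps(frames_rendered: int, interval_s: float,
                         max_fps: float = 60.0) -> float:
    """Convert a frame count between DVFS decisions to FPS, then scale to [-1, 1].

    Reproduces the worked example above: 5 frames in 100 ms -> 50 FPS -> ~0.6667.
    The 60 FPS ceiling used for the scaling is an assumption, not stated in the text.
    """
    fps = frames_rendered / interval_s      # e.g. 5 / 0.1 = 50 FPS
    return (fps / max_fps) * 2.0 - 1.0      # e.g. 50/60 * 2 - 1 = 0.6667

print(frames_to_scaled_fps(5, 0.100))       # -> 0.666...
```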
The power management unit 204 will also receive an input representing information regarding the current load 206 of the GPU 202 (and/or the computer device/other components). The format of this information can vary, but will typically comprise an indication (e.g. percentage) of the maximum load of the component 202 currently being utilised.
The skilled person will understand that the inputs may be used directly (unchanged) as values/variables in the power management algorithm. Alternatively, the inputs may be processed further before being used as values/variables in the power management algorithm, e.g. converted from a value representing a number of frames to FPS as mentioned above.
Optionally, the power management unit 204 may receive at least one additional input from at least one external device, such as sensor(s) 208 that can also be used as value(s)/variable(s) by the power management algorithm. Examples of such sensors include temperature sensors that measure the temperature of the component 202 and/or other components of the computer device.
In some embodiments, the power management unit 204 may use at least one additional value as input(s) to its DVFS algorithm, which may be provided by at least one other component(s) of the computer device. A non-exhaustive list of examples includes: power usage (if available from a hardware-based measuring system); CPU loads; CPU frequency; bus frequency; bus load (if available), and/or disk Input/Output activity.
Figure 3 is a block diagram of a first example embodiment of the power management unit 204. The power management unit comprises a power estimator unit 302 as well as a unit 304 configured to execute a (single) power management algorithm. Each unit can be hardcoded as a full/partial circuit in the computer device, but could be implemented in alternative ways. It will be appreciated that the units described herein are exemplary only and many variations are possible. For instance, any particular function/step may be performed by a single processor/circuit, or may be distributed over several. Also, the methods described herein can be implemented using any suitable programming language/means and/or data structures.
Figure 4 is a block diagram of an example embodiment of the power estimator unit 302. An aim of this unit is to provide an estimate of the power currently used by the GPU 202. Most conventional computer devices, such as mobile phones, do not directly provide such power information. In the example embodiment, the power estimator unit is implemented by means of neural networks; however, the skilled person will appreciate that in alternative embodiments, other techniques, which may or may not be based on machine learning (e.g. supervised, unsupervised or reinforcement learning), can be used. In the illustrated example, the neural networks are trained with captured data, where the input and outputs are known. The inputs and outputs may be measured using external equipment/sensors during a learning phase. The networks can be trained by automatically “tuning” the weights of the connections between neurons in order to achieve the desired output for the given input.
The power estimator unit 302 receives inputs 402 corresponding to hardware metrics, which may be provided by the computer device hardware, kernel, driver, etc. A non-exhaustive exemplary list of inputs that can be used in the specific example comprises: GPU frequency; GPU load (expressed as a percentage); GPU temperature; the frequencies and loads (expressed as percentages) of a set of associated CPUs 0-7, and the frequencies of a set of associated buses 0-3. The power estimator unit 302 uses a scaler unit 404 to scale the inputs to the [-1,1] range. The resulting values are inputted into a trained neural network 406 (one value to each input neuron), which processes them in a known manner and outputs a result value. This value is scaled back from [0,1] to the real value (e.g. expressed in Watts) using a rescaler unit 408. In one embodiment the scale factor is x15, but it can be any suitable value, depending on the training scenarios and device power usage ranges.
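A sketch of this scale -> neural network -> rescale chain is given below; the trained network 406 is represented by a generic callable, and the per-metric scaling bounds are assumptions (only the x15 rescale factor comes from the text):

```python
# Sketch of the power estimator of Figure 4: scale inputs to [-1, 1], run the trained
# network, rescale the [0, 1] output back to Watts. Bounds and network are assumptions.

METRIC_BOUNDS = {                  # (min, max) per metric, illustrative values only
    "gpu_freq_mhz": (100, 700),
    "gpu_load_pct": (0, 100),
    "gpu_temp_c":   (20, 95),
}

def scale(value: float, lo: float, hi: float) -> float:
    """Map value from [lo, hi] to [-1, 1], as done by the scaler unit 404."""
    return 2.0 * (value - lo) / (hi - lo) - 1.0

def estimate_power_watt(metrics: dict, trained_network, rescale_factor: float = 15.0) -> float:
    """Scaler 404 -> trained neural network 406 -> rescaler 408."""
    x = [scale(metrics[name], lo, hi) for name, (lo, hi) in METRIC_BOUNDS.items()]
    y = trained_network(x)             # network output assumed to lie in [0, 1]
    return y * rescale_factor          # back to an estimated power value in Watts
```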
As shown in Figure 3, the output of the power estimator unit 302 is used as an input to the power management algorithm unit 304, which also receives as inputs a set of hardware metrics and inputs comprising an estimate of perceived performance, e.g. FPS as discussed above, and current load of the GPU 202. The hardware metric inputs may be the same as (at least some of) the inputs 402 of the power estimator unit discussed above, or may differ. The power management algorithm unit also receives inputs 306 representing settings of the computer device (including user settings or modes).
Figure 5 is a flowchart illustrating steps performed by an example of the power management algorithm unit 304. It will be appreciated that in alternative embodiments, at least some of the steps may be re-ordered or omitted. Also, additional steps may be performed. Further, although the steps are shown as being performed in sequence in the Figure, in alternative embodiments at least some of them may be performed concurrently. In the illustrated embodiment, the steps comprise a set of if-else conditions. Embodiments may allow the user to "tune" these hardcoded functions by changing the mode or the values of some parameters. However, the skilled person will understand that the power management algorithm steps shown are purely exemplary and many variations are possible. In general, the power management algorithm will receive input values and process those values in order to determine a desired/efficiency-improving operating configuration (typically frequency) of the hardware component 202. In a very simple embodiment, the perceived performance estimate and current load value may be processed by being compared to one or more threshold values, with the result of that determination(s) being used to select from amongst a range/set of frequencies (e.g. low or high).
Step 502 represents the start of a sequence of steps that are performed for each decision to be output by the power management algorithm unit 304. At step 504 the mode of the computer device is checked. In the illustrated embodiment, the check relates to a power saving mode as can be set by a user, but it will be appreciated that this is merely an example. If the mode is "normal" then control passes to step 506, where a question is asked as to whether an input representing the temperature is greater than a certain threshold, e.g. 70° C. If the answer to this question is affirmative then control passes to step 508. At step 508 the DVFS algorithm outputs a signal that indicates that the frequency of the GPU 202 should be reduced (e.g. set to a predetermined reduced level or decremented by a certain value) and then the algorithm ends at step 509.
If the answer to the question of step 506 is negative then control passes to step 510. At step 510 a question is asked as to whether the value of the input representing the FPS is greater than the sum of 50 and the F parameter (one of the user setting inputs 306 relating to FPS). If the answer to this question is affirmative then control passes to step 508. If the answer to the question of step 510 is negative then control passes to step 512. At step 512 a question is asked as to whether the value of the input representing the FPS is less than the sum of 40 and the F parameter. If the answer to this question is affirmative then control passes to step 514. At step 514 the DVFS algorithm outputs a signal that indicates that the frequency of the GPU 202 should be increased (e.g. set to a predetermined increased level or incremented by a certain value) and the algorithm then ends at step 509. If the answer to the question of step 512 is negative then control passes to step 516.
At step 516 a question is asked as to whether the value of the input representing power is greater than X plus parameter P, wherein X is a predefined (not user-modifiable) value, and the P parameter is one of the user setting inputs 306 that relates to power. If the answer to this question is affirmative then control passes to step 508; otherwise, control passes to step 509, meaning that the frequency of the GPU remains unchanged.
Referring back to step 504, if the answer to that question is that the mode of the computer device is “power saving” then control passes to step 518. At step 518 a question is asked as to whether the value of the input representing the temperature is greater than a certain threshold, e.g. 70° C. If the answer to this question is affirmative then control passes to step 508. If the answer to the question of step 518 is negative then control passes to step 520.
At step 520 a question is asked as to whether the value of the input representing power is greater than X plus parameter P. If the answer to this question is affirmative then control passes to step 508; otherwise, control passes to step 522. At step 522 a question is asked as to whether the value of the input representing the FPS is greater than the sum of 40 plus the F parameter. If the answer to this question is affirmative then control passes to step 508; otherwise control passes to step 524. At step 524 a question is asked as to whether the value of the input representing the FPS is less than the sum of 30 and the F parameter. If the answer to this question is affirmative then control passes to step 514; otherwise, control passes to step 509.
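The condition chains of Figure 5 can be written out directly as follows; the constants (70° C, the 30/40/50 FPS baselines, and the F, P and X parameters) are those named in the description above:

```python
# Decision logic of Figure 5 (steps 502-524), expressed as two if-else chains.
REDUCE, INCREASE, KEEP = -1, +1, 0   # frequency actions output by the algorithm

def dvfs_decision(mode: str, temp_c: float, fps: float, power_w: float,
                  F: float, P: float, X: float) -> int:
    """F and P are user setting inputs 306; X is a predefined, non-user-modifiable value."""
    if mode == "normal":
        if temp_c > 70:      return REDUCE     # step 506 -> 508
        if fps > 50 + F:     return REDUCE     # step 510 -> 508
        if fps < 40 + F:     return INCREASE   # step 512 -> 514
        if power_w > X + P:  return REDUCE     # step 516 -> 508
        return KEEP                            # step 509: frequency unchanged
    else:  # "power saving" mode
        if temp_c > 70:      return REDUCE     # step 518 -> 508
        if power_w > X + P:  return REDUCE     # step 520 -> 508
        if fps > 40 + F:     return REDUCE     # step 522 -> 508
        if fps < 30 + F:     return INCREASE   # step 524 -> 514
        return KEEP                            # step 509: frequency unchanged
```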
In an alternative embodiment, the power management unit 204 may execute a power management algorithm based on the below formula:
where P represents power, T represents temperature, F represents FPS, Gload represents load, and w represents configurable "weight" parameters to give more or less importance to the parameters (these can be set by the user).
If the result value of the formula is greater than 1 then the power management unit 204 outputs a signal to increase the operating frequency of the component 202. If the result value is less than 1 then the frequency will decrease. All intermediate result values do not change the frequency, but the algorithm saves the result value(s) for influencing later decisions. In some embodiments the weight parameters can be tuned by a user, with the hardcoded values being just a guide.
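Since the formula itself is not reproduced in the text above, the sketch below only illustrates the decision structure described (a result above the threshold raises the frequency, a result below it lowers the frequency, and results are saved to influence later decisions); the weighted-sum form and the threshold handling are assumptions, not the claimed formula:

```python
# Illustrative sketch only: the exact weighted formula is not reproduced here, so a
# generic weighted sum of the named inputs (P, T, F, Gload) stands in for it.

def weighted_decision(P, T, F, Gload, w, history, upper=1.0, lower=1.0):
    """Return +1 (raise frequency), -1 (lower frequency) or 0 (keep), as described."""
    result = w["p"] * P + w["t"] * T + w["f"] * F + w["g"] * Gload  # assumed form
    history.append(result)       # saved so the result can influence later decisions
    if result > upper:
        return +1
    if result < lower:
        return -1
    return 0
```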
Figure 6 is a block diagram of a second example embodiment of the power management unit 204. Compared to the embodiment of Figure 3, this second embodiment additionally comprises a load type detector unit 601. This unit receives inputs that may be the same as (at least some of) the inputs of a power estimator unit 602, or which may be different. The outputs of the load type detector unit and the power estimator unit are used as inputs to a power management algorithm unit 604. In contrast to the embodiment of Figure 3, the power management algorithm unit 604 can be programmed with a set of different power management algorithms, typically one algorithm for each type of determined load. One of these power management algorithms will be selected for execution based on the output of the load type detector unit, thereby allowing the power management unit to implement domain-based decision making. For example, the power management decisions regarding FPS values can be completely different under a game or a benchmark scenario: a game can be rendered at 60 FPS if the frequencies are pushed upwards, while a benchmark would not normally reach 60 FPS. Therefore, for example, under a benchmark scenario the FPS threshold value used by the power management algorithm might be reduced. Another example can be to ignore the power and temperature conditions under UI, since UI operations do not tend to heat the computer device.
Figure 7 is a block diagram of an example load type detector unit 601. An aim of this unit is to determine/estimate the current type of task (e.g. onscreen application) being performed by the computer device. In the example embodiment, there is a set of four possible types of applications (3D game, 2D game, benchmark operation, or user interface), but it will be appreciated that these are exemplary only and in other embodiments the number and type of applications can differ.
In the example embodiment, the load type detector unit 601 is implemented by means of a neural network; however, the skilled person will appreciate that in alternative embodiments, other techniques, which may or may not be based on machine learning, can be used. The example neural network uses two inputs: FPS and GPU Util (where Util = Freq * Load%). These inputs may be processed by a scaler 701 to scale them to the [-1,1] range. The unit 601 also requires some history of previous values and so it saves old values (e.g. the last 30 values) using a delayer 702 or moving window, and then uses them as inputs to the trained neural network 704. A selector unit 706 generates an output representing the type of load that has been determined based on the outputs of the neural network.
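An illustrative sketch of this detector is given below, assuming the 30-sample history window mentioned above; the trained network 704 is represented by a generic callable returning one score per load type, and the scaling bounds are assumptions:

```python
from collections import deque

LOAD_TYPES = ["3D game", "2D game", "benchmark", "UI"]   # the four example types

def _scale(value, lo, hi):
    """Map value from [lo, hi] to the [-1, 1] range used by the network inputs."""
    return 2.0 * (value - lo) / (hi - lo) - 1.0

class LoadTypeDetector:
    """Scaler 701 -> delayer 702 (history window) -> trained network 704 -> selector 706."""

    def __init__(self, trained_network, history_len=30):
        self.net = trained_network                  # assumed to return a list of scores
        self.history = deque(maxlen=history_len)    # moving window of past samples

    def classify(self, fps, gpu_freq_mhz, gpu_load_pct):
        util = gpu_freq_mhz * (gpu_load_pct / 100.0)           # Util = Freq * Load%
        sample = (_scale(fps, 0, 60), _scale(util, 0, 700))    # bounds are assumptions
        self.history.append(sample)
        scores = self.net(list(self.history))                  # one score per load type
        return LOAD_TYPES[scores.index(max(scores))]           # selector: highest score wins
```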
Figure 8 is a block diagram of a third example embodiment of the power management unit 204. Compared to the second embodiment of Figure 6, this third embodiment uses a machine-learning based power management algorithm unit 804. This power management algorithm unit receives input from a reward calculation unit 805, as well as inputs from a load type detector 601 and other inputs corresponding to hardware/sensor metrics. The reward calculation unit can receive inputs from a power estimator unit 602, inputs corresponding to hardware metrics, input(s) representing an estimate of perceived performance of a computer device (e.g. FPS) and/or user/device settings 306.
The machine learning based power management algorithm unit 804 is based on the known Q-learning approach, which is a type of reinforcement learning. Reinforcement learning is typically based on a computation of the form y = f(x): given heuristic examples of x and y, the process should find f() with the use of an Environment (Env) and rewards (R). In specific embodiments, the variables used for Q-learning can include the current state of the computer device, e.g. FPS, GPU frequency, GPU load, thermal, battery level and/or power, etc. The actions determined by the process can include increasing frequency, decreasing frequency or remaining at the same frequency. Alternatively, the process could decide on one from a range/set of frequencies. Examples of convergence points/goals that may be targeted by the process include a specific thermal point, the highest possible FPS and/or a high GPU load. The rewards used by the process may include a positive reward for approaching the goals and a negative reward for departing from the goals.
An outline of an example of a model reinforcement learning heuristic to control GPU power management is given below:
Step 1: Read current state (St)
Step 2: Choose action based on current state
Step 3: Calculate reward of current state (St)
Step 4: Update (learn) the last action taken (Sn, An)
Step 5: Goto Step 1
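These steps map onto a simple control loop, sketched below; `read_state`, `apply_action` and `reward` stand in for the hardware/sensor metric inputs, the frequency configuration signal and the reward calculation unit described in relation to Figures 8 to 10:

```python
def reinforcement_control_loop(agent, read_state, apply_action, reward):
    """Run Steps 1-5 of the heuristic above as an endless control loop."""
    state = read_state()                              # Step 1: read current state St
    while True:
        action = agent.choose(state)                  # Step 2: choose action from St
        apply_action(action)                          # e.g. raise/lower GPU frequency
        next_state = read_state()                     # Step 3: check the result of the
        r = reward(next_state)                        #         action and its reward
        agent.update(state, action, r, next_state)    # Step 4: learn the last (Sn, An)
        state = next_state                            # Step 5: go to Step 1
```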
Figure 9 is a block diagram of an exemplary implementation of the power management algorithm 804 based on machine learning with Q-learning neural networks. A neural network 904 receives inputs corresponding to hardware metrics and/or an estimate of perceived performance (e.g. FPS), as well as from the reward calculation unit 805. At every step of the process, the neural network 904 is positively or negatively reconfigured 902 based on whether the previous (n-1) decision resulted in a positive or negative reward.
Figure 10 is a block diagram of an alternative implementation of the power management algorithm 804 that is based on Q-learning with a Q table. The format of an example suitable Q table is shown below:
States | Actions |
---|---|
FPS, Load, ... | Action 1, Action 2, Action 3... |
FPS, Load, ... | Action 1, Action 2, Action 3... |
FPS, Load, ... | Action 1, Action 2, Action 3... |
An example function for updating the table is: Update_Table(PrevS, Action, CurrS, Reward), which may be expressed as Q(st, at) = Q(st, at) + α[r + β * max Q(st+1, at+1) − Q(st, at)], where α represents the learning factor or learning rate; r represents the reward value; β represents the predictive value (also called the gamma factor) and the max term obtains the maximum value of the next expected state in order to give an extra reward to those decisions that lead to better decisions in future.
The skilled person will appreciate that the above formula is indicative only and that alternative Q-Learning implementations can have multiple different variations of the formula. Provided that the Q value of the action gets updated by a suitable formula that uses the reward value obtained, it can still be a Q-Learning implementation.
In Figure 10, the Q table 1004 receives inputs from the reward calculation unit 805 and a state generator 1006. At every step of the process, the previous state is updated 1008 with the reward value obtained and decisions are made based on the states. For actions with equal Q values, a random one is selected. The action to be selected for a given state may be determined in various ways, e.g. using the known Algorithmic Temperature Function or Boltzmann Probability.
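An illustrative tabular agent implementing the update rule above with Boltzmann (softmax) action selection is sketched below; the state discretisation, the action set and the α, β and temperature values are assumptions:

```python
import math
import random
from collections import defaultdict

ACTIONS = ["freq_down", "freq_keep", "freq_up"]   # example action set

class QTableAgent:
    """Tabular Q-learning with Boltzmann (softmax) action selection."""

    def __init__(self, alpha=0.1, beta=0.9, temperature=0.5):
        # Q table: a hashable, discretised state tuple (FPS, Load, ...) -> action values
        self.q = defaultdict(lambda: {a: 0.0 for a in ACTIONS})
        self.alpha, self.beta, self.temperature = alpha, beta, temperature

    def choose(self, state):
        """Boltzmann probability: higher-Q actions are chosen more often."""
        qs = self.q[state]
        weights = [math.exp(qs[a] / self.temperature) for a in ACTIONS]
        return random.choices(ACTIONS, weights=weights)[0]

    def update(self, prev_state, action, reward, curr_state):
        """Q(st,at) <- Q(st,at) + alpha * (r + beta * max_a Q(st+1,a) - Q(st,at))."""
        best_next = max(self.q[curr_state].values())
        td_error = reward + self.beta * best_next - self.q[prev_state][action]
        self.q[prev_state][action] += self.alpha * td_error
```

This agent can be used directly with the control loop sketched after the reinforcement learning steps above.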
When there are multiple goals, such as target power, temperature and/or FPS then convergence points may be chosen based on overlapping curves and rules, such as the examples given in Figures 11A and 11B. Examples of reward formulas that can be used by the reward calculation unit 805 are given below:
where t represents time; f represents frequency; T represents temperature; n represents the current reward value; n-1 refers to the last reward value; L represents the load value of the GPU.
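The concrete reward formulas are not reproduced in this text; the sketch below therefore only illustrates the stated principle of a positive reward for approaching a goal (such as a target FPS, full load or a thermal point) and a negative reward for departing from it, with the goal values and weights as assumptions:

```python
def reward(fps, temp_c, gpu_load_pct, target_fps=60.0, thermal_limit_c=70.0):
    """Illustrative reward only: favour high FPS and full GPU utilisation,
    penalise operation above the thermal limit. All constants are assumptions."""
    r = 0.0
    r += fps / target_fps                        # approaching the FPS goal
    r += gpu_load_pct / 100.0                    # high GPU load is preferred
    if temp_c > thermal_limit_c:
        r -= (temp_c - thermal_limit_c) / 10.0   # departing from the thermal goal
    return r
```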
Embodiments of the power management unit 204 may also be provided that aim to operate the component 202 at an optimal FPS/Watt. The optimal value is known (e.g. from experimental data for a particular GPU), and is the lowest possible frequency (because the power scales with the square of the frequency, i.e. x^2). Such embodiments can aim to have the component operate at a "better" or closer-to-optimal FPS/Watt for a predefined "good enough" FPS value, instead of using too much power, which is the case with conventional DVFS. Such embodiments can execute a power management algorithm that uses inputs representing perceived performance and current load and processes these. This behaviour can be implicitly modelled by the reward functions, where a GPU load of 100% is preferred and the highest FPS is also preferred. This will lead the device/component towards the optimal point (typically close to 60 FPS). Other rewards can further tune this down due to temperature, power, etc. In a simple embodiment, an approach to achieve this can reduce the frequency while the load is less than 100%, and when the load is at 100% the FPS is not allowed to drop below 50 FPS, as sketched below. For example, that will make the component operate at a quasi-optimal FPS/Watt without degrading the perceived performance.
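The simple approach described in the preceding paragraph can be sketched as follows; the frequency table is an assumption, while the 50 FPS floor is the value given above:

```python
FREQS_MHZ = [266, 350, 420, 500, 550, 600, 700]   # example GPU frequency steps

def quasi_optimal_step(current_freq_mhz, gpu_load_pct, fps, min_fps=50.0):
    """Move toward the lowest frequency that still keeps the FPS acceptable."""
    idx = FREQS_MHZ.index(current_freq_mhz)
    if gpu_load_pct < 100 and idx > 0:
        return FREQS_MHZ[idx - 1]              # headroom left: step the frequency down
    if gpu_load_pct >= 100 and fps < min_fps and idx < len(FREQS_MHZ) - 1:
        return FREQS_MHZ[idx + 1]              # saturated and FPS too low: step back up
    return current_freq_mhz                    # otherwise hold the operating point
```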
It is understood that according to an exemplary embodiment, a computer readable medium storing a computer program to operate a method according to the foregoing embodiments is provided.
Attention is directed to all papers and documents which are filed concurrently with or previous to this specification in connection with this application and which are open to public inspection with this specification, and the contents of all such papers and documents are incorporated herein by reference.
All of the features disclosed in this specification (including any accompanying claims, abstract and drawings), and/or all of the steps of any method or process so disclosed, may be combined in any combination, except combinations where at least some of such features and/or steps are mutually exclusive.
Each feature disclosed in this specification (including any accompanying claims, abstract and drawings) may be replaced by alternative features serving the same, equivalent or similar purpose, unless expressly stated otherwise. Thus, unless expressly stated otherwise, each feature disclosed is one example only of a generic series of equivalent or similar features.
The invention is not restricted to the details of the foregoing embodiment(s). The invention extends to any novel one, or any novel combination, of the features disclosed in this specification (including any accompanying claims, abstract and drawings), or to any novel one, or any novel combination, of the steps of any method or process so disclosed.
Claims (28)
1. A power management method comprising: obtaining data representing perceived performance of a computer device; executing a power management algorithm that uses at least the data representing perceived performance and data representing current load of the computer device as inputs, and configuring a hardware component of the computer device based on an output of the power management algorithm.
2. A method according to claim 1, wherein the data representing perceived performance represents performance at an end of a processing pipe of the computer device, e.g. an estimate of the performance provided to/experienced by a user running at least one application that is responsible for the load that is processed by the component.
3. A method according to claim 1 or 2, wherein the power management algorithm determines an operating frequency and/or voltage of the hardware component.
4. A method according to claim 3, wherein the power management algorithm comprises a Dynamic Voltage and Frequency Scaling, DVFS, algorithm and the output of the algorithm is used to configure the frequency and/or the voltage of the hardware component.
5. A method according to any preceding claim, wherein the power management algorithm includes comparing one or more of the inputs with one or more predetermined values, and produces the output based on a result of the comparison(s).
6. A method according to any preceding claim, including computing the data representing perceived performance by estimating an amount of frames generated/rendered by the computer device when executing a display application.
7. A method according to claim 6, wherein the amount of frames is converted to Frames Per Second, FPS, for use by the power management algorithm.
8. A method according to any preceding claim, wherein at least one further input used by the power management algorithm is selected from a set including: power usage of the computer device/component; processor load of the computer device/component; processor frequency of the computer device/component; bus load; bus frequency; storage device activity; temperature of the computer device/component.
9. A method according to any preceding claim, including estimating power usage of the computer device using a fixed algorithm or formula that receives inputs based on hardware counters or parameters of the computer device/component.
10. A method according to any of claims 1 to 8, including estimating power usage of the computer device using a neural network trained with data representing power outputs corresponding to a set of metrics of the computer device/component.
11. A method according to claim 10, wherein the component comprises a Graphics Processing Unit, GPU, and the set of metrics comprises at least one of: GPU frequency, GPU load and GPU temperature.
12. A method according to any preceding claim, wherein the power management algorithm uses a further input representing an estimate of a type of activity being performed by the computer device/component.
13. A method according to claim 12, wherein the component comprises a GPU and the type of activity is selected from a set of activities including: a 2D or 3D game, a user interface and a benchmark operation.
14. A method according to claim 12 or 13, wherein the type of activity is estimated using an algorithm or formula, or using a neural network that receives as inputs estimated performance, e.g. FPS for a GPU, of the component and a value representing utility of the component.
15. A method according to claim 14, wherein the value representing utility of the component is computed using an equation: GPU operating frequency * GPU load percentage.
16. A method according to any preceding claim, wherein the power estimation algorithm includes a Machine Learning process comprising: reading a current state of the computer device/component; choosing and performing an action based on the current state; checking a result of the action and a feedback or reward parameter; and calculating and updating the state to reflect a new learned relationship between the action and the feedback or reward parameter.
17. A method according to claim 16, wherein the current state of the computer device/component includes a state from amongst a set of states comprising FPS, GPU frequency, GPU load, thermal, battery level and/or power usage.
18. A method according to claim 16 or 17, wherein the actions are selected from a set of actions that include: increasing operating frequency of the component, decreasing operating frequency of the component, maintaining operating frequency of the component or changing operating frequency of the component to a specified value.
19. A method according to any of claims 16 to 18, wherein goals targeted by the Machine Learning process include a specific thermal point of the component, a highest possible performance of the component and/or a high load of the component.
20. A method according to any of claims 16 to 19, wherein the reward parameters include a positive said reward parameter for approaching a said goal and a negative said reward parameter for departing from a said goal.
21. A method according to any of claims 16 to 20, wherein the Machine Learning process uses a neural network or a Q-table representing the set of states and the set of actions.
22. A method according to any preceding claim, wherein the power management algorithm is configured to determine a specific thermal point for the component, a highest possible performance for the component and/or a high load for the component.
23. A method according to any preceding claim, wherein the power management algorithm is configured to determine an optimal point where maximum performance per Watt power of the component is achieved.
24. Apparatus configured to provide a power management function to a computing device, the apparatus including: a processor (204) configured to obtain data representing perceived performance of a computer device; a processor (204) configured to execute a power management algorithm that uses at least the data representing perceived performance and data representing current load of the computer device as inputs, and a communication unit (204) configured to transmit a hardware configuration signal to a component of the computer device based on an output of the power management algorithm.
25. Apparatus according to claim 24, including, or in communication with, at least one external device/sensor (208) that, in use, provides at least one input to the power management algorithm.
26. A computer device including, or in communication with, power management apparatus according to claim 24 or 25.
27. A computer readable medium or circuitry storing a computer program to operate a method according to any one of claims 1 to 23.
28. A method or apparatus substantially as described herein and/or with reference to Figures 2 - 11B of the accompanying drawings.
Priority Applications (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
GB1513348.1A GB2540804B (en) | 2015-07-29 | 2015-07-29 | Hardware power management apparatus and methods |
KR1020160025762A KR102651874B1 (en) | 2015-07-29 | 2016-03-03 | Method and apparatus for power management |
EP16181942.0A EP3125072B8 (en) | 2015-07-29 | 2016-07-29 | Method of managing power and electronic device |
US15/223,396 US10101800B2 (en) | 2015-07-29 | 2016-07-29 | Method of managing power and electronic device |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
GB1513348.1A GB2540804B (en) | 2015-07-29 | 2015-07-29 | Hardware power management apparatus and methods |
Publications (3)
Publication Number | Publication Date |
---|---|
GB201513348D0 GB201513348D0 (en) | 2015-09-09 |
GB2540804A true GB2540804A (en) | 2017-02-01 |
GB2540804B GB2540804B (en) | 2018-03-07 |
Family
ID=54106787
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
GB1513348.1A Expired - Fee Related GB2540804B (en) | 2015-07-29 | 2015-07-29 | Hardware power management apparatus and methods |
Country Status (2)
Country | Link |
---|---|
KR (1) | KR102651874B1 (en) |
GB (1) | GB2540804B (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111524058A (en) * | 2019-02-01 | 2020-08-11 | 纬创资通股份有限公司 | Hardware acceleration method and hardware acceleration system |
WO2021050275A1 (en) * | 2019-09-13 | 2021-03-18 | Nvidia Corporation | Device link management |
CN113366409A (en) * | 2019-01-08 | 2021-09-07 | 惠普发展公司,有限责任合伙企业 | Stable processing equipment performance |
EP3848776A4 (en) * | 2018-10-15 | 2021-11-10 | Huawei Technologies Co., Ltd. | RESOURCE PLANNING PROCEDURES AND COMPUTER DEVICE |
Families Citing this family (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR102568686B1 (en) * | 2018-02-09 | 2023-08-23 | 삼성전자주식회사 | Mobile device including context hub and operation method thereof |
KR102684855B1 (en) | 2018-08-08 | 2024-07-15 | 삼성전자 주식회사 | Method for executing application using clock speed of processor selected by external temperature and electronic device performing the method |
WO2020231014A1 (en) * | 2019-05-16 | 2020-11-19 | Samsung Electronics Co., Ltd. | Electronic device for performing power management and method for operating the same |
KR20200132629A (en) | 2019-05-16 | 2020-11-25 | 삼성전자주식회사 | Electronic device for performing power management and method for operating thereof |
KR102766383B1 (en) | 2019-08-06 | 2025-02-12 | 삼성전자주식회사 | Multi-core system and controlling operation of the same |
EP3987770A4 (en) | 2019-08-20 | 2022-08-17 | Samsung Electronics Co., Ltd. | ELECTRONIC DEVICE FOR IMPROVING THE GRAPHIC PERFORMANCE OF AN APPLICATION PROGRAM AND OPERATING METHOD THEREOF |
CN118051267A (en) * | 2024-01-12 | 2024-05-17 | 飞腾信息技术有限公司 | ACPI-based IPA implementation method, device, equipment, and storage medium |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8872812B2 (en) * | 2009-11-12 | 2014-10-28 | Marvell World Trade Ltd. | Power saving in mobile devices by optimizing frame rate output |
KR101991682B1 (en) * | 2012-08-29 | 2019-06-21 | 삼성전자 주식회사 | A DVFS controlling method and A System-on Chip using thereof |
US9158358B2 (en) * | 2013-06-04 | 2015-10-13 | Qualcomm Incorporated | System and method for intelligent multimedia-based thermal power management in a portable computing device |
- 2015-07-29 GB GB1513348.1A patent/GB2540804B/en not_active Expired - Fee Related
- 2016-03-03 KR KR1020160025762A patent/KR102651874B1/en active Active
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20090309885A1 (en) * | 2008-06-11 | 2009-12-17 | Eric Samson | Performance allocation method and apparatus |
US20140164757A1 (en) * | 2012-12-11 | 2014-06-12 | Apple Inc. | Closed loop cpu performance control |
US20140184619A1 (en) * | 2013-01-03 | 2014-07-03 | Samsung Electronics Co., Ltd. | System-on-chip performing dynamic voltage and frequency scaling |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP3848776A4 (en) * | 2018-10-15 | 2021-11-10 | Huawei Technologies Co., Ltd. | RESOURCE PLANNING PROCEDURES AND COMPUTER DEVICE |
CN113366409A (en) * | 2019-01-08 | 2021-09-07 | 惠普发展公司,有限责任合伙企业 | Stable processing equipment performance |
US12141624B2 (en) | 2019-01-08 | 2024-11-12 | Hewlett-Packard Development Company, L.P. | Stabilizing performance of processing devices |
CN111524058A (en) * | 2019-02-01 | 2020-08-11 | 纬创资通股份有限公司 | Hardware acceleration method and hardware acceleration system |
CN111524058B (en) * | 2019-02-01 | 2023-08-22 | 纬创资通股份有限公司 | Hardware acceleration method and hardware acceleration system |
WO2021050275A1 (en) * | 2019-09-13 | 2021-03-18 | Nvidia Corporation | Device link management |
GB2600870A (en) * | 2019-09-13 | 2022-05-11 | Nvidia Corp | Device link management |
US11822926B2 (en) | 2019-09-13 | 2023-11-21 | Nvidia Corporation | Device link management |
GB2600870B (en) * | 2019-09-13 | 2024-10-30 | Nvidia Corp | Device link management |
Also Published As
Publication number | Publication date |
---|---|
KR20170015097A (en) | 2017-02-08 |
KR102651874B1 (en) | 2024-03-27 |
GB2540804B (en) | 2018-03-07 |
GB201513348D0 (en) | 2015-09-09 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
GB2540804A (en) | Hardware Power Management Apparatus and Methods | |
US10101800B2 (en) | Method of managing power and electronic device | |
CN109960395B (en) | Resource scheduling method and computer equipment | |
US8826048B2 (en) | Regulating power within a shared budget | |
US9329663B2 (en) | Processor power and performance manager | |
JP6632770B1 (en) | Learning device, learning inference device, method, and program | |
Dey et al. | User interaction aware reinforcement learning for power and thermal efficiency of CPU-GPU mobile MPSoCs | |
US11042410B2 (en) | Resource management of resource-controlled system | |
US20110055597A1 (en) | Regulating power using a fuzzy logic control system | |
EP3036598B1 (en) | Power signal interface | |
KR20140010930A (en) | Method and apparatus for providing efficient context classification | |
WO2018193934A1 (en) | Evaluation apparatus, evaluation method, and program therefor | |
CN113419825B (en) | Resource performance prediction method, device and system and electronic equipment | |
CN110781969A (en) | Air conditioning air volume control method, device and medium based on deep reinforcement learning | |
CN106462456B (en) | Processor state control based on detection of producer/consumer workload serialization | |
US11669762B2 (en) | Apparatus and method for forecasted performance level adjustment and modification | |
CN113934590A (en) | Dynamic threshold processing method and device, medium and electronic device for monitoring indicators | |
Vinay et al. | Light weight rl based run time power management methodology for edge devices | |
US20230171340A1 (en) | Mobile device and method for providing personalized management system | |
CN111684485B (en) | Video playback energy consumption control | |
CN113946428A (en) | Processor dynamic control method, electronic equipment and storage medium | |
US12132667B2 (en) | Adapting software code to device resource availability | |
JP7515063B2 (en) | Presentation method and presentation system | |
JP2014146152A (en) | Information processing device and control method of information processing device | |
US20230088429A1 (en) | Processing device, processing method, and program |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PCNP | Patent ceased through non-payment of renewal fee |
Effective date: 20220729 |