Detail Power Calculation

Uploaded by

古永上
Chapter 4

Power Analysis in ASICs

This chapter describes various aspects of power analysis in a digital CMOS design.
The power dissipation in an ASIC is comprised of power in the digital core logic,
memories, analog macros, and other IO interfaces. The power dissipation in the
digital logic and memory macros can be due to switching activity, called active
power, and the leakage power which is present even with zero switching activity in
the design. This chapter describes each of these contributions in detail—specifically
the factors affecting the power calculation from various contributions in the design.
Switching activity formats are also described in this chapter.

4.1 What Is Switching Activity?

As described in Chaps. 2 and 3, the power computation is generally obtained from
the power models included in the library descriptions of the standard cells, memory
macros and the IO libraries. This computation of power using library power models
relies upon the transition activity and state of each pin of standard cells, memory
macros and the IOs.
The key to power computation is the switching activity of each net. What is
switching activity? The switching activity is comprised of the following two
parameters:
(a) Static probability
(b) Transition rate

4.1.1 Static Probability

For a given net, the static probability refers to the expected state of the signal. For
example, a static probability value of 0.2 implies that the signal is at logic-1 for 20%
of the time (and at logic-0 for 80% of the time; see footnote 1). A 50% duty cycle for
a clock signal implies that the static probability of the clock signal is 0.5 (that is,
the clock is at logic-0 for 50% of the time and at logic-1 for 50% of the time).

R. Chadha and J. Bhasker, An ASIC Low Power Primer: Analysis, Techniques and
Specification, DOI 10.1007/978-1-4614-4271-4_4,
© Springer Science+Business Media New York 2013

4.1.2 Transition Rate

The transition rate is the number of transitions per unit time. The transition rate is
also referred to as toggle rate. For periodic signals such as clocks where the
frequency of the signal is specified, the transition rate is twice the frequency of the
signal (since there are two transitions, rising and falling, within each cycle).
The power analysis utilizes the switching activity (static probability and
transition rate) for each signal in the design.

4.1.3 Examples

In Fig. 4.1, the probability that pins CK and Q are at 1 is 50%. However, the toggle
rate for pin CK is 8 toggles in 40 ns, or 200 million transitions per second. The
toggle rate for pin Q is 4 toggles in 40 ns, or 100 million transitions per second.
A net that has a probability of 1 or 0 is a constant net. A net with a probability of
0.5 is at logic-1 50% of time. This effectively describes the duty cycle of the net.
A net with a probability of 0.25 is at logic-1 for 25% of time. An example of two
waveforms with same toggle rate but different static probability values (different
duty cycles) is shown in Fig. 4.2.
Consider the example shown in Fig. 4.3. Net CK has a probability of 0.5 and a
toggle rate of 100 million transitions per second, and net CKE has a probability of
0.5 and a toggle rate of 2,000 transitions per second. In this case, the toggle rate for
CKG is nearly 50 million transitions per second. This is because CKE has a
probability of 0.5 (it is on for half the time) and the CK toggle rate is much larger
than the CKE toggle rate. Thus, CKE can be treated as an almost steady signal in
comparison, and one-half of the CK transitions would propagate to the CKG net.
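
The arithmetic in these examples can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the gated-clock estimate assumes, as the text does, that the enable changes far more slowly than the clock.

```python
def toggle_rate(num_toggles: int, window_s: float) -> float:
    """Transitions per second observed over a window of window_s seconds."""
    return num_toggles / window_s


def gated_clock_toggle_rate(clk_rate: float, enable_prob: float) -> float:
    """Approximate toggle rate at the output of an AND gate.

    Assumption: the enable toggles far more slowly than the clock, so the
    clock passes through only while the enable is at logic-1.
    """
    return clk_rate * enable_prob


ck_rate = toggle_rate(8, 40e-9)                  # CK in Fig. 4.1
q_rate = toggle_rate(4, 40e-9)                   # Q in Fig. 4.1
ckg_rate = gated_clock_toggle_rate(100e6, 0.5)   # CKG in Fig. 4.3

print(f"CK : {ck_rate / 1e6:.0f} million transitions/sec")
print(f"Q  : {q_rate / 1e6:.0f} million transitions/sec")
print(f"CKG: {ckg_rate / 1e6:.0f} million transitions/sec")
```

The estimate for CKG ignores the 2,000 transitions per second of CKE itself, which is why the text says "nearly" 50 million.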

4.2 Power Computation for Basic Cells and Macros

This section illustrates the detailed power computation of sample standard cells and
memory macros using the library descriptions and the switching activity values.

1. For simplicity, we assume that the signal is never in the unknown (X) state.

Fig. 4.1 Example waveforms with same static probability but different transition rates

Fig. 4.2 Example waveforms with same transition rates but different static probability

Fig. 4.3 Example of reduced toggle rate at the output of an and gate

Fig. 4.4 NAND power computation using switching activity at the pins

4.2.1 Power Computation for a 2-Input NAND Cell

This section describes the power computation for a 2-input nand cell with input pins
A1 and A2 and output pin ZN. Assume that the switching activity for the pins of the
cell is available and is as shown in Fig. 4.4.
The power computation uses the switching activity along with the library
description. A fragment of the power specification within the library for this cell is
shown next.

leakage_power () {
  value : 42.2;
}
leakage_power () {
  value : 26.1;
  when : "!A1 !A2";
}
leakage_power () {
  value : 33.0;
  when : "!A1 A2";
}
leakage_power () {
  value : 27.0;
  when : "A1 !A2";
}
leakage_power () {
  value : 82.7;
  when : "A1 A2";
}
pin(A1) {
  direction : input;
  internal_power () {
    when : "!A2&ZN";  /* Transition at A1 does not */
                      /* cause an output transition. */
    rise_power (scalar) {
      values ("0.004");
    }
    fall_power (scalar) {
      values ("0.006");
    }
  }
}
pin(A2) {
  direction : input;
  internal_power () {
    when : "!A1&ZN";  /* Transition at A2 does not */
                      /* cause an output transition. */
    rise_power (scalar) {
      values ("0.006");
    }
    fall_power (scalar) {
      values ("0.008");
    }
  }
}
pin(ZN) {
  direction : output;
  internal_power () {  /* A1 causes output transition */
    related_pin : "A1";
    rise_power (scalar) {
      values ("0.043");
    }
    fall_power (scalar) {
      values ("0.016");
    }
  }
  internal_power () {  /* A2 causes output transition */
    related_pin : "A2";
    rise_power (scalar) {
      values ("0.036");
    }
    fall_power (scalar) {
      values ("0.021");
    }
  }
}
The switching activity for the pins of the nand cell (with the library description for
power described above) is shown in Fig. 4.4. The values are normally obtained through
simulation and the information is extracted in SAIF format. The switching activity
values at the pins of the nand cell are:
Static probability (pin A1) = 0.6
Static probability (pin A2) = 0.55
Toggle rate (pin A1) = 5 million transitions/sec
Toggle rate (pin A2) = 6 million transitions/sec
Static probability (pin ZN) = 0.67
Toggle rate (pin ZN) = 7.7 million transitions/sec
Note that the static probability at the output ZN directly follows from the static
probability values at the inputs A1 and A2 (see footnote 2).
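
As a quick check of the ZN value, the following sketch derives the output static probability of a 2-input nand from its input probabilities. It assumes the two inputs are statistically independent, which the text does not state explicitly; the function name is hypothetical.

```python
def nand2_output_probability(p_a1: float, p_a2: float) -> float:
    """Probability that ZN = !(A1 & A2) is at logic-1.

    Assumption: A1 and A2 are statistically independent, so
    Prob(A1 & A2) = Prob(A1) * Prob(A2).
    """
    return 1.0 - p_a1 * p_a2


p_zn = nand2_output_probability(0.6, 0.55)
print(f"Static probability (pin ZN) = {p_zn:.2f}")
```

The output is low only when both inputs are high, which happens with probability 0.6 * 0.55 = 0.33, leaving 0.67 for logic-1.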

4.2.1.1 Leakage Power Computation

Leakage power is computed by combining the leakage power values for various
conditions of A1 and A2 pins specified in the library. This computation is based
upon the static probability values at the A1 and A2 pins. The computation is
illustrated below.
Leakage power:
= 26.1 * Prob(!A1 !A2) +
  33.0 * Prob(!A1 A2) +
  27.0 * Prob(A1 !A2) +
  82.7 * Prob(A1 A2)
= 26.1 * (1 - 0.6) * (1 - 0.55) +
  33.0 * (1 - 0.6) * 0.55 +
  27.0 * 0.6 * (1 - 0.55) +
  82.7 * 0.6 * 0.55
= 46.539nW

2. The static probability values at the output of combinational gates are shown in Fig. 4.8.
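
The same state-weighted sum can be written as a short Python sketch. The dictionary layout and function names are illustrative choices, not part of any library format, and the product form of the state probability assumes independent inputs, as the computation above does.

```python
# State-dependent leakage values (nW) from the library fragment,
# keyed by the logic values of (A1, A2).
LEAKAGE_NW = {
    (0, 0): 26.1,
    (0, 1): 33.0,
    (1, 0): 27.0,
    (1, 1): 82.7,
}


def state_probability(a1: int, a2: int, p_a1: float, p_a2: float) -> float:
    """Probability of one (A1, A2) state, assuming independent inputs."""
    return (p_a1 if a1 else 1.0 - p_a1) * (p_a2 if a2 else 1.0 - p_a2)


def average_leakage_nw(p_a1: float, p_a2: float) -> float:
    """Leakage averaged over all input states, weighted by state probability."""
    return sum(
        value * state_probability(a1, a2, p_a1, p_a2)
        for (a1, a2), value in LEAKAGE_NW.items()
    )


leakage_nw = average_leakage_nw(0.6, 0.55)
print(f"Leakage power = {leakage_nw:.3f} nW")
```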

4.2.1.2 Active Power Computation

The active power is computed based upon the 7.7 million transitions per second
toggle rate at ZN and the 5 and 6 million transitions per second toggle rates on A1
and A2 respectively.

Internal Power

For the internal power due to switching activity on ZN, the appropriate
path-dependent internal power table has to be used. In particular, the 7.7 million
transitions per second toggle rate at ZN is mapped to path-specific (A1->ZN) or (A2->ZN)
based upon the toggle rates of A1 and A2. The distribution of the ZN toggles into
path-specific toggles uses the same ratio as the toggle rates of inputs A1 and A2.
A1->ZN toggle rate:
= ZN toggle rate * A1 toggle rate /
(A1 toggle rate + A2 toggle rate)
= 7.7 * 5 / (5 + 6) million transitions/sec
= 3.5 million transitions/sec
A2->ZN toggle rate:
= ZN toggle rate * A2 toggle rate /
(A1 toggle rate + A2 toggle rate)
= 7.7 * 6 / (5 + 6) million transitions/sec
= 4.2 million transitions/sec
A1 toggle rate not causing output transition:
= A1 Toggle rate - A1->ZN toggle rate
= (5 – 3.5) million transitions/sec
= 1.5 million transitions/sec
A2 toggle rate not causing output transition:
= A2 toggle rate - A2->ZN toggle rate
= (6 – 4.2) million transitions/sec
= 1.8 million transitions/sec
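
A minimal sketch of this proportional split is given below. The function name and the dictionary representation are assumptions made for illustration; rates are kept in million transitions per second to match the text.

```python
def split_output_toggles(output_rate: float, input_rates: dict) -> dict:
    """Distribute the output toggle rate over input-to-output paths
    in proportion to each input's toggle rate."""
    total = sum(input_rates.values())
    return {pin: output_rate * rate / total for pin, rate in input_rates.items()}


input_rates = {"A1": 5.0, "A2": 6.0}   # million transitions/sec
zn_rate = 7.7                          # million transitions/sec

path_rates = split_output_toggles(zn_rate, input_rates)

# Input toggles that do not propagate to the output.
non_propagating = {pin: input_rates[pin] - path_rates[pin] for pin in input_rates}

for pin in input_rates:
    print(f"{pin}->ZN: {path_rates[pin]:.1f}, "
          f"{pin} without output transition: {non_propagating[pin]:.1f}")
```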

For the A1->ZN toggle rate, we use the A1->ZN internal power table and for the A2->ZN
toggle rate, we use the A2->ZN internal power table. As described in Chap. 2, the
internal power tables can be nonlinear tables defined in terms of input slew and output
capacitance. However, to simplify the explanation, the power values for this example
are depicted as scalar values independent of input slew or the output capacitance.
The library description for the cell specifies internal power from each pin for the
two scenarios:
1. When the input pin switching causes an output transition
2. When the input pin switching does not result in an output transition.
The latter corresponds to the condition "!A2&ZN" for a transition at input A1 and
the condition "!A1&ZN" for a transition at input A2.
From the library description:
Internal power (see footnote 3) for transitions at A1 which do not result
in output pin transition:
= 0.004pJ (for rise transitions) and
0.006pJ (for fall transitions).
Total internal power for transitions at A1 which do not
result in output pin transitions:
= (1.5 million / 2) * 0.004
+ (1.5 million / 2) * 0.006
= 7.5nW
Above, the transition rate is divided by 2 to obtain the rise transition rate and the
fall transition rate. Again from the library description:
Internal power for transitions at A1 resulting in output
transition:
= 0.043pJ (for output rise) and
0.016pJ (for output fall).
Total internal power for transitions at A1 resulting in
output pin transition:
= 3.5 million * (0.043 + 0.016) / 2
= 103.25nW
Total internal power due to transitions at A1:
= 7.5 + 103.25
= 110.75nW
A similar computation for input pin A2 follows.

3. As described in Chap. 2, the library power models actually represent the energy
dissipated per transition.

Internal power for transitions at A2 which do not result
in output pin transition:
= 0.006pJ (for rise transitions) and
0.008pJ (for fall transitions).
Total internal power for transitions at A2 which do not
result in output pin transitions:
= 1.8 million * (0.006 + 0.008) / 2
= 12.6nW
Internal power for transitions at A2 resulting in
output transition:
= 0.036pJ (for output rise) and
0.021pJ (for output fall).
Total internal power for transitions at A2 resulting in
output pin transition:
= 4.2 million * (0.036 + 0.021) / 2
= 119.7nW
Total internal power due to transitions at A2:
= 12.6 + 119.7
= 132.3nW
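
The per-pin internal power figures can be reproduced with the sketch below. It follows the text in treating half of the toggles as rises and half as falls; the helper name and the unit constant are illustrative.

```python
PJ = 1e-12  # joules per picojoule


def internal_power_w(toggle_rate: float, rise_pj: float, fall_pj: float) -> float:
    """Average internal power in watts.

    toggle_rate is in transitions/sec; rise_pj and fall_pj are the library
    energies per transition in pJ. Half of the toggles are assumed to be
    rises and half falls.
    """
    return toggle_rate * (rise_pj + fall_pj) / 2.0 * PJ


# A1: 1.5M toggles/sec without output transition, 3.5M/sec through A1->ZN.
a1_w = internal_power_w(1.5e6, 0.004, 0.006) + internal_power_w(3.5e6, 0.043, 0.016)
# A2: 1.8M toggles/sec without output transition, 4.2M/sec through A2->ZN.
a2_w = internal_power_w(1.8e6, 0.006, 0.008) + internal_power_w(4.2e6, 0.036, 0.021)

a1_nw = a1_w * 1e9
a2_nw = a2_w * 1e9
print(f"Internal power due to A1 = {a1_nw:.2f} nW")
print(f"Internal power due to A2 = {a2_nw:.2f} nW")
```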

Output Charging Power

Now we illustrate the computation of the output charging power. Assume that the
power supply Vdd is 1.0 V and the output capacitance driven by ZN is 20fF.
Toggle rate at pin ZN
= 7.7 million transitions/sec
Total output charging power:
= 0.5 * C * Vdd * Vdd * Toggle rate
= 0.5 * 20fF * 1 * 1 * 7.7 million
= 77nW

4.2.1.3 Total Power

The total power dissipation is the sum of the leakage power and active power. Using
the values computed above:
TOTAL POWER DISSIPATION in nand cell:
= Leakage power + Internal power +
Output charging power
= 46.539 + (110.75 + 132.3) + 77
= 366.589nW
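
For completeness, a small sketch of the output charging term and the final sum is shown here. The leakage and internal power inputs are taken as given from the computations above; the function name is hypothetical.

```python
def output_charging_power_w(cap_f: float, vdd_v: float, toggle_rate: float) -> float:
    """Average power spent charging the output load: 0.5 * C * Vdd^2 * toggle rate."""
    return 0.5 * cap_f * vdd_v * vdd_v * toggle_rate


# 20 fF load, 1.0 V supply, 7.7 million transitions/sec at ZN.
charging_nw = output_charging_power_w(20e-15, 1.0, 7.7e6) * 1e9

leakage_nw = 46.539            # from the leakage computation
internal_nw = 110.75 + 132.3   # A1 and A2 contributions

total_nw = leakage_nw + internal_nw + charging_nw
print(f"Output charging power = {charging_nw:.1f} nW")
print(f"Total power           = {total_nw:.3f} nW")
```

The factor of 0.5 reflects that the toggle rate counts both rising and falling transitions, while energy is drawn from the supply to charge the load only on the rising ones.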

Fig. 4.5 Power computation using clock transition times and pin activity information

In each of the above cases, the rise and fall toggles are assumed to be equal. Thus in
each computation, 50% of the toggles correspond to rise power models and 50% of
the toggles correspond to the fall power models.

4.2.2 Power Computation for a Flip-Flop Cell

This section illustrates the power calculation of a D-type flip-flop cell using the
switching activity information for various pins of the cell. The example illustrates
the internal power calculation; the leakage and the output charging power components
can be added in the same way as for the nand cell example in the previous
subsection. The switching activity values at the pins of the flip-flop are shown in
Fig. 4.5.
The flip-flop is clocked by a 250 MHz input clock with input transition times of
0.25 ns for rise and 0.1 ns for fall. An example fragment of the power specification
for a D-type flip-flop cell is given below.
pin (CLK) {
  internal_power () {
    when : "(D&Q) | (!D&!Q)";  /* No transition on Q */
    rise_power (template_2x1) {
      index_1 ("0.1, 0.4");  /* Input transition */
      values ( /* 0.1    0.4 */ \
        "0.050, 0.090");
    }
    fall_power (template_2x1) {
      index_1 ("0.1, 0.4");
      values ( \
        "0.070, 0.100");
    }
  }
  internal_power () {
    when : "(D&!Q) | (!D&Q)";  /* Has transition on Q */
    rise_power (scalar) {
      values ("0");
    }
    fall_power (template_2x1) {
      index_1 ("0.1, 0.4");
      values ( \
        "0.070, 0.110");
    }
  }
}
pin (D) {
  direction : input;
  internal_power () {  /* Input pin power */
    rise_power (scalar) {
      values ("0.026");
    }
    fall_power (scalar) {
      values ("0.011");
    }
  }
}
pin (Q) {
  direction : output;
  internal_power () {  /* When output switching */
    related_pin : "CLK";
    rise_power (scalar) {
      values ("0.09");
    }
    fall_power (scalar) {
      values ("0.11");
    }
  }
}
The switching activity information for the signals at the D, CLK, and Q pins of the
flip-flop is described as follows.
Static probability (pin D) = 0.6
Static probability (pin CLK) = 0.5
Toggle rate (pin D) = 25 million transitions/sec
Toggle rate (pin CLK) = 500 million transitions/sec
Static probability (pin Q) = 0.61
Toggle rate (pin Q) = 25 million transitions/sec
This corresponds to a flip-flop clocked with a 250 MHz clock with the input data
and flip-flop output having 10% activity (that is, the flip-flop toggles in 10% of the
clock cycles). The active power computation for the above scenario is described as
follows.
Internal power due to transitions at input pin D
= 25 million * (0.026 + 0.011) / 2
= 0.4625μW

Fig. 4.6 Activity information at the pins of the single port memory macro

The internal power dissipation due to CLK pin transitions requires breaking down
the CLK pin transitions into those which cause a transition at the output pin Q and
those which do not. Based upon the activity at CLK and at Q, we can determine that,
of the 500 million transitions per second at the CLK pin, 25 million transitions per
second (rise transitions) create a transition at the output pin Q, and the remaining
475 million transitions per second (225 million rise transitions per second and
250 million fall transitions per second) do not cause a transition at the output pin Q.
Internal power due to output pin Q transitions:
= 25 million * (0.09 + 0.11)/2
= 2.5μW
Internal power due to CLK pin rise transitions:
= 25 million * 0.0 + 225 million * 0.07
= 15.75μW
Internal power due to CLK pin fall transitions:
= 250 million * 0.07
= 17.5μW
Total internal power:
= (0.4625 + 2.5 + 15.75 + 17.5)μW
= 36.2125μW
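
The flip-flop numbers can be re-derived with the sketch below. One point is an interpretation rather than something the text states: the 0.07 used for CLK rise transitions matches a linear interpolation of the (0.1, 0.4) table at the 0.25 ns rise transition time, so the sketch computes it that way. Rates are in million transitions per second and energies in pJ, so each product is directly in μW.

```python
def interpolate(x: float, xs: tuple, ys: tuple) -> float:
    """Linear interpolation in a two-point, one-dimensional table."""
    (x0, x1), (y0, y1) = xs, ys
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


INDEX = (0.1, 0.4)  # input transition times (ns) from the library fragment

# CLK rise with no Q transition, at the 0.25 ns rise transition time.
clk_rise_no_q = interpolate(0.25, INDEX, (0.050, 0.090))
# CLK rise that toggles Q: the library gives 0 for this case.
clk_rise_q = 0.0
# CLK fall at the 0.1 ns fall transition time; both 'when' cases give 0.070 here.
clk_fall = interpolate(0.1, INDEX, (0.070, 0.100))

d_uw = 25 * (0.026 + 0.011) / 2            # pin D toggles
q_uw = 25 * (0.09 + 0.11) / 2              # pin Q toggles
clk_rise_uw = 25 * clk_rise_q + 225 * clk_rise_no_q
clk_fall_uw = 250 * clk_fall

total_uw = d_uw + q_uw + clk_rise_uw + clk_fall_uw
print(f"Total internal power = {total_uw:.4f} uW")
```

Even at 10% data activity, the clock pin accounts for most of the internal power in this example.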

4.2.3 Power Computation for a Memory Macro

This section describes the power computation for an SRAM macro. We use the
SRAM instance with the library as described in Sect. 3.1.1 to illustrate the power
computation. The SRAM macro and the activity information for the signals at the
pins of the SRAM macro are depicted in Fig. 4.6.
The activity values are:
CLK pin (100 MHz):
200 million transitions/sec (for rise and fall)
Address pins:
30 million transitions/sec (for rise and fall)
Static probability: 0.5
Data pins:
15 million transitions/sec (for rise and fall)
Static probability: 0.5
Memory enable (ME):
1 million transitions/sec (for rise and fall)
Static probability: 0.7
Write enable (WE):
6 million transitions/sec (for rise and fall)
Static probability: 0.4
Output bus Q pins:
24 million transitions/sec (for rise and fall)
Internal power due to activity at one address pin:
= 30 million * (0.124 + 0.124) / 2
= 3.72μW
Internal power due to activity at all 10 address
pins:
= 3.72 * 10
= 37.2μW
Internal power due to activity at one data input pin:
= 15 million * (0.153 + 0.153) / 2
= 2.295μW
Internal power due to activity at 32 data input pins:
= 32 * 2.295
= 73.44μW
Internal power due to activity at ME pin:
= 1 million * 0.048
= 0.048μW
Internal power due to activity at WE pin:
= 6 million * 17.08
= 102.48μW
For power due to clock pin, we need to compute the toggles:
WRITE (Rising CLK with both ME and WE high):
= CLK_rise_toggles * static_probability_of_ME *
static_probability_of_WE
= 100 million * 0.7 * 0.4
= 28 million transitions/sec
READ (Rising CLK with ME high; WE low):
= 100 million * 0.7 * 0.6
= 42 million transitions/sec
INACTIVE (Rising CLK with ME low):
= 100 million * 0.3
= 30 million transitions/sec
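
The breakdown of rising clock edges can be sketched as follows. It assumes, as the products above imply, that ME and WE are statistically independent; the function name and dictionary keys are illustrative.

```python
def classify_clock_edges(rise_rate: float, p_me: float, p_we: float) -> dict:
    """Split rising clock edges into write, read and inactive cycles.

    Assumption: ME and WE are statistically independent, so the
    probability of a combined state is the product of the individual
    static probabilities.
    """
    return {
        "write": rise_rate * p_me * p_we,             # ME high, WE high
        "read": rise_rate * p_me * (1.0 - p_we),      # ME high, WE low
        "inactive": rise_rate * (1.0 - p_me),         # ME low
    }


# 100 MHz clock: 100 million rising edges/sec; rates in million edges/sec.
edges = classify_clock_edges(100.0, 0.7, 0.4)
for kind, rate in edges.items():
    print(f"{kind:8s}: {rate:.0f} million transitions/sec")
```

The three categories are mutually exclusive and cover every rising edge, so their rates add up to the rising-edge rate of the clock.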
Based on the above, the clock power for each case is computed. Note that the falling
clock transitions have negligible power dissipation and thus the library description
for the memory macro in Sect. 3.1 shows zero power for falling transitions on the
CLK pin. Thus, the power for various cases is computed as:
WRITE case:
= 28 million * 42.4
= 1187.2μW
READ case:
= 42 million * 43.9
= 1843.8μW
INACTIVE:
= 30 million * 0.93
= 27.9μW
Total clock power:
= 1187.2 + 1843.8 + 27.9
= 3058.9μW
For internal power due to output switching:
Internal power for each output pin switching:
= 24 million * (0.022 + 0.022) / 2
= 0.528μW
Internal power for all 32 output pins:
= 32 * 0.528
= 16.896μW
The output charging power corresponds to switching the output load capacitance at
the output pins.
For each output pin, the output charging power:
= 0.5 * C * Vdd * Vdd * Toggle_rate
Assume each output drives a 20fF capacitance load and the power supply is 1.0 V.
Total output charging power for all 32 output pins:
= 32 * 0.5 * 20fF * 1 * 1 * 24 million
= 7.68μW
Total active power in the memory macro:
= 37.2 + 73.44 + 0.048 + 102.48 + 3058.9 + 16.896
+ 7.68
= 3296.644μW
This illustrates that, in a practical scenario, the dynamic or active power of a
memory macro is largely governed by the read and write power.
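
The whole memory macro computation can be collected into one short sketch. The energy values are the ones used in the worked example above; the helper name and variable names are illustrative. Rates are in million transitions per second and energies in pJ, so products are in μW.

```python
def toggle_power_uw(rate_m: float, rise_pj: float, fall_pj: float) -> float:
    """Power in uW for rate_m million toggles/sec, split evenly into rise and fall."""
    return rate_m * (rise_pj + fall_pj) / 2.0


address_uw = 10 * toggle_power_uw(30, 0.124, 0.124)   # 10 address pins
data_uw = 32 * toggle_power_uw(15, 0.153, 0.153)      # 32 data input pins
me_uw = 1 * 0.048
we_uw = 6 * 17.08

# Rising clock edges: write, read and inactive cycles (million/sec * pJ).
clock_uw = 28 * 42.4 + 42 * 43.9 + 30 * 0.93

output_uw = 32 * toggle_power_uw(24, 0.022, 0.022)    # 32 output pins

# Output charging: 0.5 * C * Vdd^2 * toggle rate per pin, in watts, then to uW.
charging_uw = 32 * 0.5 * 20e-15 * 1.0 * 1.0 * 24e6 * 1e6

total_uw = (address_uw + data_uw + me_uw + we_uw
            + clock_uw + output_uw + charging_uw)
print(f"Total active power = {total_uw:.3f} uW")
print(f"Clock (read/write/inactive) share = {clock_uw / total_uw:.1%}")
```

The clock term, which is dominated by the read and write cycles, makes up more than 90% of the total in this example.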

4.3 Specifying Activity at the Block or Chip Level

This section describes various alternatives for specifying the switching activity
information at the block or at the full-chip level.

4.3.1 Default Global Activity or Vectorless

This method is typically utilized for blocks in the initial design phase or where the
designer does not have any detailed information. In this method, the designer
provides an estimate of the activity ratio for all nets. Based upon the clock frequency
and an estimated activity factor (for example 20% or 30%), the transition rate is
obtained for all signals. The static probability can be specified or, in most cases, is
set to a default value of 0.5. The static probability and the transition rate together
constitute the switching activity information used for power analysis.
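
A minimal sketch of such a default assignment is shown below. The mapping from activity factor to transition rate (activity factor times clock frequency) is an assumption chosen to match the flip-flop example earlier in the chapter, where 10% activity at 250 MHz corresponds to 25 million transitions per second; a particular power analysis tool may define the ratio differently. The function name is hypothetical.

```python
def default_activity(clock_hz: float, activity_factor: float,
                     static_probability: float = 0.5) -> dict:
    """Vectorless default switching activity for a net.

    Assumption: a net toggles in activity_factor of the clock cycles,
    so toggle rate = activity_factor * clock frequency.
    """
    return {
        "static_probability": static_probability,
        "toggle_rate": activity_factor * clock_hz,
    }


activity = default_activity(250e6, 0.10)
print(f"Static probability = {activity['static_probability']}")
print(f"Toggle rate = {activity['toggle_rate'] / 1e6:.0f} million transitions/sec")
```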

4.3.2 Propagating Activity from Inputs

This is typically the default method and it can be pessimistic in some cases. In this
method, the clock nets toggle at the frequency specified in the SDC. When the clock
hits a sequential element, the output of the sequential element gets a toggle rate
depending upon the activity ratio specified for that clock. The flip-flops which get
multiple clocks get the toggle rate of the fastest clock.
Nets with no clock phase (see footnote 4) on them are set to zero activity, unless a
default activity has been specified for all nets. Typically, a power analysis tool
provides a way to annotate a net with a probability and a toggle rate.
When propagating the activity through combinational logic cells, the functionality
of the cell is utilized to obtain switching activity at the cell output based upon the
switching activity at the inputs of the cell. This can be accomplished by using
heuristics such as cycle-based random simulation or it can be as simple as using the
sum of the input transition rates as the transition rate for the output.
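
Both heuristics can be illustrated for a 2-input nand. This is a deliberately simplified sketch, not a description of how any particular tool works: the random simulation draws each input independently in every cycle from its static probability, which ignores the input toggle rates, and all names are hypothetical.

```python
import random


def sum_of_inputs_rate(input_rates: list) -> float:
    """Simplest heuristic: output toggle rate = sum of input toggle rates."""
    return sum(input_rates)


def simulate_nand2(p_a1: float, p_a2: float,
                   cycles: int = 100_000, seed: int = 0) -> tuple:
    """Cycle-based random simulation of ZN = !(A1 & A2).

    Simplifying assumption: inputs are redrawn independently every cycle.
    Returns (estimated static probability of ZN, ZN toggles per cycle).
    """
    rng = random.Random(seed)
    ones = 0
    toggles = 0
    prev = None
    for _ in range(cycles):
        a1 = rng.random() < p_a1
        a2 = rng.random() < p_a2
        zn = not (a1 and a2)
        ones += zn
        if prev is not None and zn != prev:
            toggles += 1
        prev = zn
    return ones / cycles, toggles / cycles


upper_rate = sum_of_inputs_rate([5e6, 6e6])
p_zn, toggles_per_cycle = simulate_nand2(0.6, 0.55)
print(f"Sum-of-inputs rate: {upper_rate / 1e6:.0f} million transitions/sec")
print(f"Simulated static probability of ZN: {p_zn:.3f}")
```

For the nand cell of Sect. 4.2.1, the sum-of-inputs heuristic gives 11 million transitions per second against the 7.7 million used in that example, which shows how this simple rule can be pessimistic.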

4. Nets that have no transitions.
