[go: up one dir, main page]

0% found this document useful (0 votes)
14 views22 pages

05 Partition

Unit 5A covers circuit partitioning techniques including the Kernighan-Lin and Fiduccia-Mattheyses heuristics, along with simulated annealing algorithms. It defines key concepts such as cells, pins, nets, and the objectives of partitioning, emphasizing the importance of minimizing connections while adhering to size constraints. The document also discusses the complexities and limitations of these algorithms, particularly in handling unequal sizes and hypergraphs.

Uploaded by

Shawn Wang
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
14 views22 pages

05 Partition

Unit 5A covers circuit partitioning techniques including the Kernighan-Lin and Fiduccia-Mattheyses heuristics, along with simulated annealing algorithms. It defines key concepts such as cells, pins, nets, and the objectives of partitioning, emphasizing the importance of minimizing connections while adhering to size constraints. The document also discusses the complexities and limitations of these algorithms, particularly in handling unequal sizes and hypergraphs.

Uploaded by

Shawn Wang
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 22

Unit 5A: Circuit Partitioning

․Course contents:
 Kernighang-Lin partitioning heuristic
 Fiduccia-Mattheyses heuristic
 Simulated annealing based partitioning algorithm
․Readings
 Chapter 7.5

Unit 5A 1
Chang, Huang, Li, Lin, Liu

Basic Definitions
․Cell: a logic block used to build larger circuits.
․Pin: a wire (metal or polysilicon) to which another
external wire can be connected.
․Nets: a collection of pins which must be electronically
connected.
․Netlist: a list of all nets in a circuit.

Unit 5A 2
Chang, Huang, Li, Lin, Liu
Basic Definitions (cont’d)
․Manhattan distance: If two points (pins) are located at
coordinates (x1, y1) and (x2, y2), the Manhattan distance
between them is given by d12 = |x1-x2| + |y1-y2|.
․Rectilinear spanning tree: a spanning tree that
connects its pins using Manhattan paths.
․Steiner tree: a tree that connects its pins, and
additional points (Steiner points) are permitted to used
for the connections.

Unit 5A 3
Chang, Huang, Li, Lin, Liu

Partitioning

Unit 5A 4
Chang, Huang, Li, Lin, Liu
Levels of Partitioning

․The levels of partitioning: system, board, chip.


․Hierarchical partitioning: higher costs for higher levels.

Unit 5A 5
Chang, Huang, Li, Lin, Liu

Circuit Partitioning
․Objective: Partition a circuit into parts such that every
component is within a prescribed range and the # of
connections among the components is minimized.
 More constraints are possible for some applications.
․Cutset? Cut size? Size of a component?

Unit 5A 6
Chang, Huang, Li, Lin, Liu
Problem Definition: Partitioning
․k-way partitioning: Given a graph G(V, E), where each
vertex v  V has a size s(v) and each edge e  E has a
weight w(e), the problem is to divide the set V into k disjoint
subsets V1, V2, …, Vk, such that an objective function is
optimized, subject to certain constraints.
․Bounded size constraint: The size of the i-th subset is
bounded by Bi ( ).
 Is the partition balanced?
․Min-cut cost between two subsets:
Minimize  e( u ,v ) p ( u ) p ( v ) w(e) , where p(u) is the partition # of
node u.
․The 2-way, balanced partitioning problem is NP-complete,
even in its simple form with identical vertex sizes and unit
edge weights.

Unit 5A 7
Chang, Huang, Li, Lin, Liu

Kernighan-Lin Algorithm
․Kernighan and Lin, “An efficient heuristic procedure for
partitioning graphs,” The Bell System Technical Journal,
vol. 49, no. 2, Feb. 1970.
․An iterative, 2-way, balanced partitioning (bi-sectioning)
heuristic.
․Till the cut size keeps decreasing
 Vertex pairs which give the largest decrease or the
smallest increase in cut size are exchanged.
 These vertices are then locked (and thus are prohibited
from participating in any further exchanges).
 This process continues until all the vertices are locked.
 Find the set with the largest partial sum for swapping.
 Unlock all vertices.

Unit 5A 8
Chang, Huang, Li, Lin, Liu
Kernighan-Lin Algorithm: A Simple Example
․Each edge has a unit weight.

․Questions: How to compute cost reduction? What pairs


to be swapped?
 Consider the change of internal & external connections.

Unit 5A 9
Chang, Huang, Li, Lin, Liu

Properties
․ Two sets A and B such that |A| = n = |B| and A  B = .
․ External cost of a  A: Ea =  vB cav.
․ Internal cost of a  A: Ia =  cav. vA

․ D-value of a vertex a: Da = Ea - Ia (cost reduction for moving a).


․ Cost reduction (gain) for swapping a and b: gab = Da + Db - 2cab.
․ If a  A and b  B are interchanged, then the new D-values, D’,
are given by

Unit 5A 10
Chang, Huang, Li, Lin, Liu
Kernighan-Lin Algorithm: A Weighted Example

․Iteration 1:

Unit 5A 11
Chang, Huang, Li, Lin, Liu

Weighted Example (cont'd)


․Iteration 1:

․gxy = Dx + Dy - 2cxy.

․Swap b and f!
Unit 5A 12
Chang, Huang, Li, Lin, Liu
Weighted Example (cont'd)

․ D’x = Dx + 2 cxp - 2 cxq,  x  A – {p} (swap p and q, p  A, q  B)

․ gxy = D’x + D’y - 2cxy.

․ Swap c and e!
Unit 5A 13
Chang, Huang, Li, Lin, Liu

Weighted Example (cont'd)

․ D’’x = D’x + 2 cxp - 2 cxq,  x  A – {p}

․ gxy = D’’x + D’’y - 2cxy.

․ Note that this step is redundant


․ Summary:
․ Largest partial sum (k = 1)  Swap b and f.

Unit 5A 14
Chang, Huang, Li, Lin, Liu
Weighted Example (cont'd)

․Iteration 2: Repeat what we did at Iteration 1 (Initial cost


= 22-4 =18).
․Summary:
․Largest partial sum = (k=3)  Stop!

Unit 5A 15
Chang, Huang, Li, Lin, Liu

Kernighan-Lin Algorithm
Algorithm: Kernighan-Lin(G)
Input: G = (V, E), |V| = 2n.
Output: Balanced bi-partition A and B with “small” cut cost.
1 begin
2 Bipartition G into A and B such that |VA| = |VB|, VA  VB = ,
and VA  VB = V.
3 repeat
4 Compute Dv,  v  V.
5 for i =1 to n do
6 Find a pair of unlocked vertices vai  VA and vbi  VB whose
exchange makes the largest decrease or smallest increase in cut
cost; )
7 Mark vai and vbi as locked, store the gain g , and compute the new
i

Dv, for all unlocked v  V;)


 g
k

8 Find k, such that Gk = i 1


is maximized;
i

9 if Gk > 0 then
10 Move va1, …, vak from VA to VB and vb1, …, vbk from VB to VA;
11 Unlock v,  v  V.
12 until Gk  0;
13 end
Unit 5A 16
Chang, Huang, Li, Lin, Liu
Time Complexity
․Line 4: Initial computation of D: O(n2)
․Line 5: The for-loop: O(n)
․The body of the loop: O(n2).
 Lines 6--7: Step i takes (n-i+1)2 time.
․Lines 4--11: Each pass of the repeat loop: O(n3).
․Suppose the repeat loop terminates after r passes.
․The total running time: O(rn3).
 Polynomial-time algorithm?

Unit 5A 17
Chang, Huang, Li, Lin, Liu

Extensions of K-L Algorithm


1. Unequal sized subsets (assume n1 < n2)
1. Partition: |A| = n1 and |B| = n2.
2. Add n2-n1 dummy vertices to set A. Dummy vertices have no
connections to the original graph.
3. Apply the Kernighan-Lin algorithm.
4. Remove all dummy vertices.
․ Unequal sized “vertices”
1. Assume that the smallest “vertex” has unit size.
2. Replace each vertex of size s with s vertices which are fully
connected with edges of infinite weight.
3. Apply the Kernighan-Lin algorithm.
․ k-way partition
1. Partition the graph into k equal-sized sets.
2. Apply the Kernighan-Lin algorithm for each pair of subsets.
3. Time complexity? Can be reduced by recursive bi-partition.

Unit 5A 18
Chang, Huang, Li, Lin, Liu
Drawbacks of the Kernighan-Lin Heuristic
․The K-L heuristic handles only unit vertex weights.
 Vertex weights might represent block sizes, different from
blocks to blocks.
 Reducing a vertex with weight w(v) into a clique with w(v)
vertices and edges with a high cost increases the size of
the graph substantially.
․The K-L heuristic handles only exact bisections.
 Need dummy vertices to handle the unbalanced problem.
․The K-L heuristic cannot handle hypergraphs.
 Need to handle multi-terminal nets directly.
․The time complexity of a pass is high, O(n3).

Unit 5A 19
Chang, Huang, Li, Lin, Liu

Coping with Hypergraph


․ A hypergraph H=(N, L) consists of a set N of vertices and a set L of
hyperedges, where each hyperedge corresponds to a subset Ni of
distinct vertices with |Ni|  2.

․ Schweikert and Kernighan, “A proper model for the partitioning of


electrical circuits,” 9th Design Automation Workshop, 1972.
․ For multi-terminal nets, net cut is a more accurate measurement
for cut cost (i.e., deal with hyperedges).
 {A, B, E}, {C, D, F} is a good partition.

 Should not assign the same weight for all edges.

Unit 5A 20
Chang, Huang, Li, Lin, Liu
Net-Cut Model
․Let n(i) = # of cells associated with Net i.
․Edge weight wxy = n2(i ) for an edge connecting cells x
and y.

․Easy modification of the K-L heuristic.

Unit 5A 21
Chang, Huang, Li, Lin, Liu

Fiduccia-Mattheyses Heuristic
․Fiduccia and Mattheyses, “A linear time heuristic for
improving network partitions,” DAC-82.
․New features to the K-L heuristic:
 Aims at reducing net-cut costs; the concept of cutsize
is extended to hypergraphs.
 Only a single vertex is moved across the cut in a single
move.
 Vertices are weighted.
 Can handle “unbalanced” partitions; a balance factor is
introduced.
 A special data structure is used to select vertices to be
moved across the cut to improve running time.
 Time complexity O(P), where P is the total # of
terminals.

Unit 5A 22
Chang, Huang, Li, Lin, Liu
F-M Heuristic: Notation
․n(i): # of cells in Net i; e.g., n(1) = 4.
․s(i): size of Cell i.
․p(i): # of pin terminals in Cell i; e.g., p(6)=3.
․C: total # of cells; e.g., C=6.
․N: total # of nets; e.g., N=6.
․P: total # of pins; P = p(1) + … + p(C) = n(1) + … + n(N).

Unit 5A 23
Chang, Huang, Li, Lin, Liu

Cut

․Cutstate of a net:
 Net 1 and Net 3 are cut by the partition.
 Net 2, Net 4, Net 5, and Net 6 are uncut.
․Cutset = {Net 1, Net 3}.
․|A| = size of A = s(1)+s(5); |B| = s(2)+s(3)+s(4)+s(6).
․Balanced 2-way partition: Given a fraction r, 0 < r < 1,
partition a graph into two sets A and B such that
| A|
r .

| A| | B |
Unit 5A
 Size of the cutset is minimized. 24
Chang, Huang, Li, Lin, Liu
Input Data Structures

․Size of the network:


․Construction of the two arrays takes O(P) time.
Unit 5A 25
Chang, Huang, Li, Lin, Liu

Basic Ideas: Balance and Movement


․ Only move a cell at a time, preserving “balance.”

where W=|A|+|B|; Smax=maxis(i).


․ g(i): gain in moving cell i to the other set, i.e., size of old cutset -
size of new cutset.

․ Suppose g(b), g(e), g(d), g(a), g(f), g(c) and the largest partial
sum is g(b)+g(e)+g(d). Then we should move b, e, d  resulting
two sets: {a, c, e, d}, {b, f}.

Unit 5A 26
Chang, Huang, Li, Lin, Liu
Cell Gains and Data Structure Manipulation
․ -p(i)  g(i)  p(i)

․ Two “bucket list” structures, one for set A and one for set B (Pmax =
maxi p(i)).

․ O(1)-time operations: find a cell with Max Gain, remove Cell i from
the structure, insert Cell i into the structure, update g(i) to g(i)+ ,
update the Max Gain pointer.

Unit 5A 27
Chang, Huang, Li, Lin, Liu

Net Distribution and Critical Nets


․ Distribution of Net i: (A(i), B(i)) = (2, 3).
 (A(i), B(i)) for all i can be computed in O(P) time.

․ Critical Nets: A net is critical if it has a cell which if moved will


change its cutstate.
 4 cases: A(i) = 0 or 1, B(i) = 0 or 1.

․ Gain of a cell depends only on its critical nets.

Unit 5A 28
Chang, Huang, Li, Lin, Liu
Computing Cell Gains
․Initialization of all cell gains requires O(P) time:
g(i)  0;
F  the “from block” of Cell i;
T  the “to block” of Cell i;
for each net n on Cell i do
if F(n)=1 then g(i)  g(i)+1;
if T(n)=0 then g(i)  g(i)-1;

․Will show: Only need O(P) time to maintain all cell


gains in one pass.

Unit 5A 29
Chang, Huang, Li, Lin, Liu

Updating Cell Gains


․ To update the gains, we only need to look at those nets, connected
to the base cell, which are critical before or after the move.
․ Base cell: The cell selected for movement from one set to the
other.

․ Consider only the case where the base cell is in the left partition.
The other case is similar.

Unit 5A 30
Chang, Huang, Li, Lin, Liu
Updating Cell Gains (cont'd)

Unit 5A 31
Chang, Huang, Li, Lin, Liu

Updating Cell Gains (cont'd)

Unit 5A 32
Chang, Huang, Li, Lin, Liu
Algorithm for Updating Cell Gains
Algorithm: Update_Gain
1 begin /* move base cell and update neighbors' gains */
2 F  the Front Block of the base cell;
3 T  the To Block of the base cell;
4 Lock the base cell and complement its block;
5 for each net n on the base cell do
/* check critical nets before the move */
6 if T(n) = 0 then increment gains of all free cells on n
else if T(n)=1 then decrement gain of the only T cell on n,
if it is free
/* change F(n) and T(n) to reflect the move */
7 F(n)  F(n) - 1; T(n)  T(n)+1;
/* check for critical nets after the move */
8 if F(n)=0 then decrement gains of all free cells on n
else if F(n) = 1 then increment gain of the only F cell on n,
if it is free
9 end

Unit 5A 33
Chang, Huang, Li, Lin, Liu

Complexity of Updating Cell Gains


․Once a net has “locked’ cells at both sides, the net will
remain cut from now on.
․Suppose we move a1, a2, …, ak from left to right, and
then move b from right to left  At most only moving a1,
a2, …, ak and b need updating!

․To update the cell gains, it takes O(n(i)) work for Net i.
․Total time = n(1)+n(2)+…+n(N) = O(P).

Unit 5A 34
Chang, Huang, Li, Lin, Liu
F-M Heuristic: An Example

․ Computing cell gains: F(n) = 1  g(i) + 1; T(n)=0  g(i) – 1

․ Balanced criterion: r|V| - Smax  |A|  r|V| + Smax. Let r = 0.4  |A| = 9, |V|=
18, Smax = 5, r|V|=7.2  Balanced: 2.2  9  12.2!
․ maximum gain: c2 and balanced: 2.2  9-2  12.2  Move c2 from A to B
(use size criterion if there is a tie).

Unit 5A 35
Chang, Huang, Li, Lin, Liu

F-M Heuristic: An Example (cont'd)

․ Changes in net distribution:

․ Updating cell gains on critical nets (run Algorithm Update_Gain):

․ Maximum gain: c3 and balanced! (2.2  7-4  12.2)  Move c3 from


A to B (use size criterion if there is a tie).

Unit 5A 36
Chang, Huang, Li, Lin, Liu
Summary of the Example

․  Maximum
partial sum Gk = +2, k = 2 or 4.
․Since k=4 results in a better balanced  Move c1, c2,
c3, c6  A={6}, B={1, 2, 3, 4, 5}.
․Repeat the whole process until new Gk  0.

Unit 5A 37
Chang, Huang, Li, Lin, Liu

Simulated Annealing
․Kirkpatrick, Gelatt, and Vecchi, “Optimization by
simulated annealing,” Science, May 1983.
․Greene and Supowit, “Simulated annealing without
rejected moves,” ICCD-84.

Unit 5A 38
Chang, Huang, Li, Lin, Liu
Simulated Annealing Basics
․ Non-zero probability for “up-hill” moves.
․ Probability depends on
1. magnitude of the “up-hill” movement
2. total search time

․ C = cost(S') - Cost(S)
․ T: Control parameter (temperature)
․ Annealing schedule: T=T0, T1, T2, …, where Ti = ri T0, r
< 1.

Unit 5A 39
Chang, Huang, Li, Lin, Liu

Generic Simulated Annealing Algorithm


1 begin
2 Get an initial solution S;
3 Get an initial temperature T > 0;
4 while not yet “frozen” do
5 for 1  i  P do
6 Pick a random neighbor S' of S;
7   cost(S') - cost(S);
/* downhill move */
8 if   0 then S  S'
/* uphill move */
9 if  > 0 then S  S' with probability ;
10 T  rT; /* reduce temperature */
11 return S
12 end

Unit 5A 40
Chang, Huang, Li, Lin, Liu
Basic Ingredients for Simulated Annealing

․Analogy:

․Basic Ingredients for Simulated Annealing:


 Solution space
 Neighborhood structure
 Cost function
 Annealing schedule

Unit 5A 41
Chang, Huang, Li, Lin, Liu

Partition by Simulated Annealing


․ Kirkpatrick, Gelatt, and Vecchi, “Optimization by simulated
annealing,” Science, May 1983.
․ Solution space: set of all partitions

․ Neighborhood structure:

Unit 5A 42
Chang, Huang, Li, Lin, Liu
Partition by Simulated Annealing (cont'd)
․ Cost function: f = C +  B
 C: the partition cost as used before.
 B: a measure of how balance the partition is
 : a constant

․ Annealing schedule:
 Tn = r n T0, r = 0.9.
 At each temperature, either
1. there are 10 accepted moves/cell on the average, or
2. # of attempts  100  total # of cells.
 The system is “frozen” if very low acceptances at 3 consecutive
temperatures.

Unit 5A 43
Chang, Huang, Li, Lin, Liu

You might also like