Novel Wire Density Driven Full-Chip Routing for
CMP Variation Control
Huang-Yu Chen†, Szu-Jui Chou†, Sheng-Lung Wang§, and Yao-Wen Chang†‡
† Graduate Institute of Electronics Engineering, National Taiwan University, Taipei, Taiwan
‡ Department of Electrical Engineering, National Taiwan University, Taipei, Taiwan
§ Synopsys, Inc., Taipei, Taiwan
Abstract— As nanometer technology advances, post-CMP dielectric thickness variation control becomes crucial for manufacturing closure. To improve CMP quality, dummy feature filling is typically performed by foundries after the routing stage. However, filling dummy features may greatly degrade the interconnect performance and lead to an explosion of mask data. It is thus desirable to consider wire-density uniformity during routing to minimize the side effects of aggressive post-layout dummy filling. In this paper, we present a new full-chip grid-based routing system considering wire density for reticle planarization enhancement. To fully consider wire distribution, the router applies a novel two-pass, top-down planarity-driven routing framework, which employs a new density critical area analysis based on Voronoi diagrams and incorporates an intermediate stage of density-driven layer/track assignment based on incremental Delaunay triangulation. Experimental results show that our methods can achieve more balanced wire distribution than state-of-the-art works.
This work was supported in part by UMC and NSC of Taiwan under Grant Nos. NSC 96-2752-E-002-008-PAE, NSC 96-2628-E-002-248-MY3, NSC 96-2628-E-002-249-MY3, and NSC 96-2221-E-002-245.
I. INTRODUCTION
As IC process geometries shrink to 65nm and below, one important yield loss of interconnects comes from the chemical-mechanical polishing (CMP) step in the copper metallization (Damascene) process. Because of the difference in hardness between copper and dielectric materials, the CMP planarizing process might generate topography irregularities. A non-uniform feature density distribution on each layer causes CMP to over-polish or under-polish, generating metal dishing and dielectric erosion [22]. These thickness variations have to be carefully controlled, since the variation in one interconnect level is progressively transferred to subsequent levels during manufacturing, and the compounded variation can become significant on an upper level, which is often called the multi-layer accumulative effect [23].
Two key problems arise from the post-CMP thickness variation: (1) the layout surface may fluctuate beyond the depth of focus (DOF) of the photolithography system, such that the exposed patterns do not appear acceptably sharp and open/short defects may even occur, and (2) these irregular variations greatly change the electrical characteristics of interconnects, especially resistance and capacitance, degrading the accuracy of timing analysis and worsening electromigration. As a result, in order to improve chip thickness uniformity, TSMC recommends performing virtual CMP (VCMP) analysis to identify the metal and dielectric thickness variation hotspots before chip fabrication for 65nm manufacturing processes (see TSMC Reference Flows 7.0) [24].
In order to improve the CMP quality, modern foundries often impose recommended layout density rules and fill dummy features into layouts to restrict the variations on each layer. Dummy features may either be connected to power/ground (tied fills) or left floating (floating fills) [19]. The tied fill has predictable but higher capacitance, while the floating fill has lower but unpredictable capacitance due to its floating nature. Traditionally, the electrical impact of dummy fills was considered negligible, and dummy features are inserted during the post-routing stage. Filling algorithms have been proposed to satisfy density bounds and reduce the density variation [16], [25]. However, as reported in [26], these filled dummy features may incur troublesome problems at 65nm and successive technology nodes. The tied fill may induce crosstalk because of its high coupling capacitance to nearby interconnects and would place a heavy burden on the P/G (power/ground) network. On the other hand, the capacitance of floating fills is usually uncertain, and thus the induced coupling capacitance might unpredictably harm the timing-optimized results of the previous design stages. Moreover, dummy fills also sharply increase the mask data volume, lengthening mask-making processes such as mask synthesis, writing, and inspection verification. In particular, these filled features would significantly increase the input data of the following time-consuming reticle enhancement techniques, such as OPC (optical proximity correction) and PSM (phase shift mask). Therefore, much research focuses on impact-limited dummy feature filling algorithms [7], [18].
In nanometer technology, routing has become a decisive factor for determining chip manufacturability, since it presides over most of the layout geometries in the back-end design process. In order to tackle these manufacturing challenges, routing techniques must handle the increasing complexity. The routing approaches applying bottom-up coarsening and top-down uncoarsening techniques have demonstrated a superior capability of handling large-scale routing problems, such as the Λ-shaped multilevel [3], [4], [12], V-shaped multilevel [5], and the two-pass bottom-up [6] routing frameworks.
Recently, routing considering wire distribution has attracted much attention in the literature. Earlier studies of CMP processes have indicated that the post-CMP dielectric thickness is highly correlated to the layout pattern density, because during the polishing step, interlevel dielectric (ILD) removal rates vary with the pattern density [23]. Further, the layout pattern (consisting of wires and dummy features) density can be systematically determined by the wire density distribution, as reported in [9]. Therefore, managing wire density at the routing stage has great potential for alleviating the problems induced by aggressive dummy feature filling.
Li et al. [20] presented the first routing system in the literature addressing the CMP induced variation. By setting the desired density in the cost function of global routing, the routing results have more balanced interconnect distribution. Cho et al. [9] proposed a pioneering work to consider CMP variation during global routing.
They empirically developed a predictive CMP density model and showed that the number of inserted dummy features can be predicted by the wire density. Therefore, they proposed a minimum-pin density global routing algorithm to reduce the maximum wire density in each global tile. However, both approaches only consider the wire density inside a routing tile. Since the topographic variation is a long-range effect, focusing on the density value inside each routing tile may incur a larger inter-tile density difference and result in more irregular post-CMP thickness. (See Fig. 1 (a).) Therefore, optimizing wire-density uniformity inside a routing tile is not the right metric and is a common pitfall for CMP control. For better CMP control, it is more desirable to minimize the global variation of wire density, i.e., the density gradient. In the example shown in Fig. 1, if the density lower and upper bounds are 20% and 80% respectively, then the three adjacent routing tiles in Fig. 1 (b) all satisfy these rules. However, Fig. 1 (c) is a better choice for CMP control because it has the minimum wire-density gradient.
[Fig. 1 panels: (a) tile1 and tile2, each at 30% density; (b) tile1, tile2, tile3 at 20%, 50%, 30%; (c) tile1, tile2, tile3 at 40%, 50%, 40%. Each panel is drawn with its post-CMP thickness profile.]
Fig. 1. Density variation among neighboring subregions impacts topography. (a) Different wire distribution in a subregion exists even under the same density. Large density variation among neighboring subregions leads to post-CMP thickness irregularities. (b) Three adjacent routing tiles satisfy density rules but result in unbalanced wire distribution. (c) A better result for minimizing the density gradient among tiles.
In this paper, we present a new full-chip grid-based routing system, named TTR (Two-pass Top-down grid-based Router), considering wire-distribution uniformity for density variation minimization. To fully consider wire distribution, the router is based on a novel two-pass, top-down planarization-driven routing framework. (See Fig. 2 for an illustration.) Different from the aforementioned works, TTR has the following distinguished features:
• A new routing framework of performing density prediction in the prerouting stage, followed by planarization-aware global routing at the first uncoarsening stage, an intermediate stage of density-driven layer/track assignment, and then detailed routing at the second uncoarsening stage.
• An efficient density critical area analysis (CAA) algorithm based on Voronoi diagrams is performed off-line in the prerouting stage, which considers both topological information of pins and wire connection to complement the density analysis. As shown in Section IV, the Voronoi-diagram based CAA algorithm leads to 3–5% faster overall routing process due to easier density control for later detailed routing. Further, it can substantially improve the resulting wire-density uniformity.
• A planarization-aware global router is employed to consider the density lower and upper bounds while minimizing the density gradient among global tiles.
• A layer assigner for panel-density minimization and a density-driven track assignment algorithm based on the incremental Delaunay triangulation are performed before detailed routing to preserve more flexibility for wire density arrangement.
Compared with the density-driven routing system [20], experimental results show that TTR can achieve a 43% reduction in the maximum number of nets crossing in tiles and obtain at least 35% smaller standard deviations of wire distribution.
The rest of this paper is organized as follows. Section II describes the routing model and the routing framework. Section III presents our density-driven routing algorithms. Experimental results are reported in Section IV, and conclusions are given in Section V.
II. ROUTING MODEL
We first explain the routing model. As illustrated in Fig. 2, $G_k$ corresponds to the routing graph of level k. Each level contains a number of global cells (GCs), and the GCs belonging to different levels have different sizes. We denote by $GC_k$ the GC of level k.
The first top-down routing pass is for global routing, which starts uncoarsening from the coarsest level to the finest level (level 0). At each level k, our global router finds routing paths for the local nets (those nets that entirely sit inside $GC_k$ but not inside $GC_{k-1}$). After all the global routings of level k are performed, we divide one $GC_k$ into four smaller $GC_{k-1}$ and at the same time perform resource estimation for use at level $k-1$. Uncoarsening continues until the size of $GC_k$ at a level is below a threshold.
The second top-down routing pass is for detailed routing. Like the first pass, it uncoarsens from the coarsest level to the finest level. At each level, a detailed router is performed and rip-up/re-route procedures are applied for failed nets. The process continues until we reach level 0, when the final routing solution is obtained.
III. DENSITY-DRIVEN ROUTING
To deal with wire density optimization, we develop a Two-pass Top-down full-chip grid-based Routing system, named TTR (see Fig. 2). The rationale for top-down routing lies in the fact that it tends to route longer nets first level by level, which directly contributes to better wire planning since longer nets have greater impacts on planarization than shorter ones. We detail the three distinguished stages of TTR in the following subsections.
A. Density Critical Area Analysis (CAA)
In order to guide the following routing toward better decisions, TTR features a density critical area analysis in the prerouting stage that identifies the potential over-dense hotspots. Recently, Cho et al. [9] performed minimum-pin density routing to prevent global-routing paths from crossing through over-dense areas. The reason is that a path with higher pin density tends to pass through more wire-dense areas, since the existence of a pin means that eventually there is at least one wire connecting to other pins. This approach can help reduce the wire density in each global tile. However, there are some limitations. In the global routing instance shown in Fig. 3 (a), although the routing path $n_1$ passes fewer pins, it may exacerbate the over-dense areas in its adjacent regions. In contrast, the routing path $n_2$ contains more pins but results in a better balanced wire distribution. Moreover, the pin density is not directly proportional to the wire density. As shown in Fig. 3 (b), a small pin count in the global tile may still contribute to large wire density. Therefore, it is necessary to consider both topological information and wire connections of each pin to complement the density analysis.
To remedy the deficiencies, we develop a new enhanced analysis model based on Voronoi diagrams. The Voronoi diagram of a point set P partitions the plane into regions, called Voronoi cells, each of which is associated with a point of P. If a point in the plane is closer to the point $p_t \in P$ than to any other point of P, then this point will be in the interior of the Voronoi cell associated with $p_t$. The boundary
segments of a Voronoi cell are called the Voronoi edges. A Voronoi diagram can efficiently compute the physical proximity and has been well studied in computational geometry [13]. Papadopoulou and Lee [21] used Voronoi diagrams of rectilinear polygons to compute the critical areas for short defects in a circuit layout.
The motivation for the Voronoi diagram approach lies in the following observation.
[Fig. 2 stages, drawn over routing graphs $G_2$, $G_1$, $G_0$ with to-be-routed and already-routed nets. Prerouting Stage (critical area analysis): identify the potential density hot spots based on the pin distribution and wire pattern connection to guide the following global routing. First Pass Stage: apply prerouting-guided planarization-aware global routing for local nets and iteratively refine the solution. Intermediate Stage (layer/track assignment): perform density-driven layer/track assignment for long segments panel by panel. Second Pass Stage: use segment-to-segment detailed maze routing to route short segments and reroute failed nets level by level.]
Fig. 2. The new two-pass, top-down routing framework.
Fig. 3. Limitations of minimum-pin density routing [9]. (a) Path $n_1$ passes fewer pins but tends to exacerbate the over-dense areas in its adjacent regions, whereas path $n_2$ passes more pins but leads to better balanced wire density. (b) Pin count cannot reflect the wire density in the global tile well.
Fig. 4. Voronoi diagram for points with (a) non-uniform distribution and (b) uniform distribution.
Fig. 5. Voronoi-diagram-based pin density analysis. (a) Proximity relation induced by the Voronoi diagram reflects the dense quantity well. (b) Density cost is measured by the topological proximity and the number of wire connections.
Observation 1: Given the Voronoi diagram of points, the standard
deviation for the size of Voronoi cells strongly depends on the
distribution of these points.
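Observation 1 can be checked numerically without constructing an exact Voronoi diagram. The following is a minimal sketch for illustration only, not the authors' implementation: it approximates each cell's area by crediting every sample of a regular grid to its nearest site, which is the membership rule of a Voronoi cell, and then compares the standard deviation of the approximate cell areas for an evenly spaced site set and a clustered one. The function names, the unit-square domain, the grid resolution, and both site sets are assumptions made for this example.

```python
import random
import statistics


def approx_cell_areas(sites, grid=100):
    """Approximate Voronoi cell areas inside the unit square.

    Every sample of a grid x grid lattice is credited to its nearest site,
    which is the defining property of a Voronoi cell.
    """
    counts = [0] * len(sites)
    for i in range(grid):
        for j in range(grid):
            x = (i + 0.5) / grid
            y = (j + 0.5) / grid
            nearest = min(
                range(len(sites)),
                key=lambda k: (sites[k][0] - x) ** 2 + (sites[k][1] - y) ** 2,
            )
            counts[nearest] += 1
    sample_area = 1.0 / (grid * grid)
    return [c * sample_area for c in counts]


def area_std(sites, grid=100):
    """Population standard deviation of the approximate cell areas."""
    return statistics.pstdev(approx_cell_areas(sites, grid))


# Hypothetical uniform case: a 4 x 4 lattice of sites in the unit square.
uniform_sites = [((i + 0.5) / 4, (j + 0.5) / 4) for i in range(4) for j in range(4)]

# Hypothetical non-uniform case: 12 of the 16 sites packed into one corner.
rng = random.Random(0)
clustered_sites = [(rng.uniform(0.0, 0.2), rng.uniform(0.0, 0.2)) for _ in range(12)]
clustered_sites += [(0.25, 0.75), (0.75, 0.25), (0.75, 0.75), (0.5, 0.5)]

uniform_std = area_std(uniform_sites)
clustered_std = area_std(clustered_sites)
print("uniform:", uniform_std, "clustered:", clustered_std)
```

In this sketch the lattice sites yield equal cell areas, while the clustered sites yield a few large cells and many small ones, so the spread of cell sizes separates the two distributions in the way the observation describes. A production flow would compute exact cells with a computational geometry library instead of grid sampling.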
As illustrated in Fig. 4 (a), the Voronoi cells for points with non-uniform distribution have large variation in sizes; in contrast, as shown in Fig. 4 (b), for points with uniform distribution, the sizes of Voronoi cells are almost the same.
Another observation can quantify the proximity relation to indicate whether a point lies in the dense area.
Observation 2: For a point, the number of adjacent Voronoi cells which entirely sit within a specified distance from this point reflects the dense quantity of the region where this point lies.
As shown in Fig. 5 (a), the point in the dense area has more Voronoi cells around it within a given circle with its center at this point. Based on these observations, we specify a range r and associate each pin p with a density cost $d_p$, which is defined as
$$d_p = \alpha \nu_p + (1 - \alpha)\,\omega_p, \tag{1}$$
where $\nu_p$ is the number of Voronoi cells around p (excluding the
Voronoi cell associated with p itself) which entirely sit inside the circle with a center at p and radius r, $\omega_p$ is the number of wires connected to p, and $\alpha$, $0 \le \alpha \le 1$, is a user-defined parameter. For the example shown in Fig. 5 (b), there are three Voronoi cells around p which entirely sit inside the circle, and four wires are connected to p. Therefore $\nu_p$ and $\omega_p$ equal 3 and 4, respectively.
In the current implementation, we set the radius r as the average distance among pins of adjacent Voronoi cells. In this way, the expected value for $\nu_p$ would be zero if p lies in a uniformly distributed region; otherwise, $\nu_p$ would increase as a penalty to reflect the density hotspot where p lies. Additionally, since two-pin nets practically dominate the netlist in most designs, the expected value of $\omega_p$ would equal one. Therefore, the ranges of $\nu_p$ and $\omega_p$ in Eq. (1) are similar and can be reasonably combined together through the $\alpha$ parameter.
After all density costs of pins have been computed, we transform these costs into the cost of global tiles. For each global tile t, we set its predicted density cost $\hat{d}_t = \max\{d_p \mid p \text{ is inside } t\}$ in the prerouting stage. Then TTR feeds the pre-estimated density information to the following routing stages. The density critical area analysis can be efficiently performed. We have the following theorem.
Theorem 1: The Voronoi-diagram based density CAA runs in $O(|P| \lg |P|)$ time, where $|P|$ is the number of pins.
Note that the Voronoi-diagram based CAA algorithm is performed only once, and its running time overhead is very small (about 3% of the total running time in our experiment). Further, it even leads to 3–5% faster overall routing process due to easier density control for later detailed routing, and it can substantially improve the resulting wire-density uniformity.
B. Planarization-Aware Global Routing
The global routing plans tile-to-tile routing paths for all nets and is thereby an important step to decide the wire distribution and maintain a uniform metal density across the chip. As mentioned in the introduction, both previous works [9], [20] consider only the wire density inside each global tile, which might incur a larger inter-tile density gradient and thus more irregular post-CMP thickness. As a result, for better CMP control, a global router has to consider the density variation (gradient) among global tiles in addition to the wire density inside each tile.
In our TTR, the global routing performed in the first top-down uncoarsening pass is based on pattern routing [17]. Pattern routing uses an L-shaped (1-bend) or Z-shaped (2-bend) route to make the connection, which gives the shortest path length between two points while reducing the routing bends. Therefore, the obtained routing path is the shortest, and we thus can focus on the objectives of most concern.
We define the planarization-aware cost $\Phi_t$ for each global tile t as follows:
$$
\Phi_t = \hat{d}_t +
\begin{cases}
\kappa_p, & \text{if } d_t \ge B_u \\
\beta\,(2^{d_t} - 1) + (1 - \beta)\,(d_t - \bar{d}_t)^2, & \text{if } B_l \le d_t < B_u \\
\kappa_n, & \text{if } d_t < B_l
\end{cases}
\tag{2}
$$
where $d_t$ is the wire density of t, $\hat{d}_t$ is the predicted hotspot cost calculated in the prerouting stage, $\bar{d}_t$ is the average wire density of tiles adjacent to t, $B_l$ and $B_u$ are respectively the density lower and upper bounds specified in foundry density rules, and $\beta$, $0 \le \beta \le 1$, is a user-defined parameter. (Note that both the values of $2^{d_t} - 1$ and $(d_t - \bar{d}_t)^2$ are between 0 and 1.) $\kappa_p$ and $\kappa_n$ are constants, where $\kappa_p$ is a positive penalty that hinders over-denseness in the global tile, and $\kappa_n$ is a negative reward that encourages paths to go through sparse tiles. The second case of Eq. (2) simultaneously considers the local density and minimizes the density difference among adjacent regions.
For more balanced wire distribution, the cost function $\Phi_p$ of the global routing path $g_p$ is defined as follows:
$$\Phi_p = \mathrm{avg}\{\Phi_t \mid \text{tile } t \text{ is on the path } g_p\}, \tag{3}$$
in which averaging over the tiles on the path reflects the goal of an even wire distribution.
C. Density-Driven Layer/Track Assignment
Recently, Cong et al. [11] proposed the first wire-planning scheme between global and detailed routers to reduce congestion. Batterywala et al. [2] also suggested adding a track assignment stage between global and detailed routing to improve the routing quality. Ho et al. [14] developed a layer/track assignment heuristic in the intermediate stage for crosstalk optimization. Later in [15], Ho et al. further extended their track assigner for wirelength reduction in X-architecture routing. However, wire density is not addressed in these works.
1) Density-Driven Layer Assignment: In this paper, we propose a new layer/track assignment algorithm for wire-density optimization. To the best of our knowledge, this is the first wire-planning work in the literature that addresses wire-density optimization. We handle long horizontal (vertical) segments which span more than one complete global tile in a row (column) in the middle layer/track assignment stage and delegate short segments to the detailed router. The full row (or column) of a global tile array is called a row (column) panel. We will refer to a row panel as a panel throughout the paper for brevity, unless specified otherwise.
In a panel, the local density of a column is defined as the total number of segments and obstacles at that column, and the panel density is the maximum local density among all columns. For example, Fig. 6 (a) gives a row panel with 11 columns, $c_1$ to $c_{11}$. There are six segments $s_1$ to $s_6$ in the panel and two obstacles $o_1$ and $o_2$ in layers, and its panel density is equal to 4. We intend to evenly arrange these segments to two horizontal layers (say layers 1 and 3) while minimizing the panel density at each layer. The density-driven layer assignment problem is defined as follows.
• The Density-driven Layer Assignment (DLA) Problem: Given a set L of layers, a set S of disjoint segments in a panel, and a set O of fixed obstacles in layers, assign each segment of S to a layer, such that for each layer the local density is balanced, and the panel density is minimized.
To solve the DLA problem, we partition the segments and obstacles in each panel into $|L|$ layer groups such that the main objective of DLA is achieved.
First, we build the horizontal constraint graph HCG(V, E) for S and O in the panel. Each vertex $v \in V$ corresponds to a segment or an obstacle, and two vertices $v_i$ and $v_j$ are connected by an edge $e \in E$ if their spans overlap. The cost of edge $e(v_i, v_j)$ is defined as the maximal local density among the overlapping columns between $v_i$ and $v_j$. With this weighting policy, if two vertices are connected by an edge with a high cost, they should be separated into different layers. Fig. 6 (b) shows the HCG of the panel in Fig. 6 (a). Here, the obstacle $o_2$ and segment $s_3$ overlap in columns $c_3$ and $c_4$, and the maximal local density of $c_3$ and $c_4$ is 3. So the cost of the edge $(o_2, s_3)$ equals 3.
Consequently, we can formulate the DLA problem as a max-cut, k-coloring problem (MCP) [10] on the HCG graph, where k equals $|L|$. In this way, we can guarantee that the partitioning result can evenly
distribute the segments of the maximal local density to different layer groups. However, the MCP is NP-complete [10]. Thus, we resort to a simple, yet efficient heuristic by constructing a maximum spanning tree on the HCG and applying a k-coloring algorithm on this tree. Note that a k-coloring of a tree can be computed in linear time. Fig. 6 (c) shows a layer-partitioning result of Fig. 6 (a), where $s_1$, $s_2$, $s_3$ and $s_6$ are partitioned as one layer group, and $o_2$, $s_4$, $s_5$ and $o_1$ are partitioned as another one. Note that the objects $o_1$, $s_3$, $s_5$, and $s_6$ at columns $c_9$ and $c_{10}$ that induce the maximum local density are separated into two different layer groups.
At the last step, since obstacles are already in fixed layers, we apply a minimum-impact repair procedure for obstacles. If an obstacle is not placed in the right layer (e.g., $o_1$ in Fig. 6 (c)), the layer of a vertex $v_o$ of an obstacle is exchanged with that of a vertex $v_s$ of a segment such that the edge cost $(v_o, v_s)$ is the maximum among the edges connected with $v_o$ in the maximum spanning tree. If there does not exist such a vertex $v_s$, we can just assign $v_o$ to the correct layer since there is no segment there (otherwise, there must be an edge connected with $v_o$). The final assignment result after the repair procedure for exchanging the layer of vertex $o_1$ with that of vertex $s_6$ is shown in Fig. 6 (d). As a result, the final assignment has a very balanced density distribution: the average local density of layer 1 is 1.18 and that of layer 3 is 1.27, while the panel densities in both layers equal 2. See Figs. 6 (e) and (f) for the resulting segment assignments for layers 1 and 3, respectively.
[Fig. 6 data, columns c1 to c11]
Column:                                      c1 c2 c3 c4 c5 c6 c7 c8 c9 c10 c11
(a) Local density, full panel:                1  2  3  2  2  2  3  2  4  4   2
(e) Local density, layer 1 (s1, s2, s3, o1):  1  1  2  1  1  1  1  1  2  2   0
(f) Local density, layer 3 (s4, s5, s6, o2):  0  1  1  1  1  1  2  1  2  2   2
Fig. 6. A density-driven layer assignment example. (a) A row panel consists of six segments and two obstacles. We intend to evenly assign these segments to two horizontal layers (layers 1 and 3). (b) The horizontal constraint graph. (c) The layer-partitioning result for two layer groups by applying the maximum spanning tree and k-coloring algorithms. (d) The final layer assignment result by applying a minimum-impact repair procedure to exchange the layers of $s_6$ and $o_1$. (e) and (f) The final local densities of layers 1 and 3, respectively.
Note that for practical concerns, in addition to the objectives of DLA, a good layer assigner shall also place layers that share more segments of the same nets closer to each other to minimize the stacked-via usage. We can model the connectivity among layers as a connection graph C(V, E) whose nodes represent layers and whose edges denote the corresponding connectivity. Then, the problem can be solved by first computing the Maximum-Weighted Hamiltonian Path (MWHP) on C(V, E) and then assigning layers with the largest connectivity closer to each other. Since the MWHP problem is NP-hard, we apply a greedy algorithm similar to Kruskal's minimum spanning tree algorithm to handle the MWHP problem. We first sort edges by their weights, and then add edges in non-increasing weight order if they form a path.
2) Density-Driven Track Assignment: After the layer assignment, we intend to uniformly spread the segments in each layer of panels and balance the segment distribution among neighboring panels. For convenience, we hereafter refer to a layer of a panel as a panel since the layer assignment has already been performed. Let T be the set of tracks inside a panel. Each track $\tau \in T$ can be represented by the set of its constituent contiguous intervals. Denote these intervals by $x_i$. A segment $s \in S$ is said to be assignable to $\tau \in T$, $\tau \equiv \{x_i\}$, if either $x_i$ is a free interval or is an interval occupied by a segment of the same net. The density-driven track assignment problem is defined as follows:
• The Density-driven Track Assignment (DTA) Problem: Given a panel A and its two neighboring panels $A_u$ and $A_b$, a set of tracks $T \in A$, a set of segments $S \in A$, and a set of fixed obstacles $O \in A$, for a given cost function $\Psi : S \times T \to \mathbb{R}$ which represents the density cost of assigning a segment to a track, find a feasible assignment of S to T that minimizes $\Psi$.
To solve this problem, we propose an Incremental Delaunay-triangulation-based Track Assignment (IDTA) algorithm. In Observation 1, we have discovered the relation between density uniformity and the Voronoi diagram. Instead of using the Voronoi diagram, we can leverage the good properties of its dual graph, called Delaunay Triangulation (DT), to evaluate the segment distribution. The DT for a point set is a triangulation that minimizes the standard deviations of angles among all triangles, and the circumscribed circle of every triangle will not contain any other point in its interior [13]. Similar to the Voronoi diagram, the standard deviation for the size of triangles in DT can reflect the distribution of these points. Thus, we can represent each segment by three points, two end points and one center point, and analyze the corresponding DT of these points.
Before performing the IDTA algorithm, we first model the distribution of segments and obstacles in each neighboring panel into an artificial segment lying on the boundary of A. In order to reflect the distribution of objects in a neighboring panel $A_n$ of A, we set the length of an artificial segment as the average occupied length per track in $A_n$, and the center of this artificial segment is determined by the center of gravity of all segments and obstacles in $A_n$.
Fig. 7 shows the IDTA algorithm. Without loss of generality, we discuss the track assignment at a row panel, and the case for a column panel is similar. For the track assignment problem, the x-coordinates
of segments are fixed (i.e., the segments in row panels can only move in the vertical direction), so we can focus on the y direction. At the beginning, we define the flexibility of a segment $s_i$ as
$$\xi(s_i) = t_i + \frac{1}{\ell_i},$$
where $t_i$ is the number of assignable tracks of $s_i$, and $\ell_i$ is the length of $s_i$. Since the x-coordinate of $s_i$ is fixed, $t_i$ can easily be computed. A smaller flexibility means that $s_i$ is longer or has less space in which to be inserted, so $s_i$ should be assigned first.
Algorithm: IDTA
Input:  A           /* The panel */
        S           /* A set of segments */
        O           /* A set of fixed obstacles */
        s_u, s_b    /* The artificial segments */
Output: T           /* The assignment configuration */
for each segment s_i ∈ S
    Compute the flexibility of s_i, ξ(s_i);
T ← ∅;
Construct an initial point set P based on O ∪ {s_u, s_b};
Construct an initial DT of P;
while S is not empty
    Choose the segment s_j with the smallest flexibility;
    Determine track(s_j) such that the maximum area difference among the introduced triangles is minimum;
    T ← T ∪ {s_j, track(s_j)};
    Add the points introduced by s_j into P;
    Update DT incrementally;
    S ← S − {s_j};
    for each s_k ∈ S overlapping s_j
        Update ξ(s_k);
Return T;
Fig. 7. The Incremental Delaunay-triangulation-based Track Assignment (IDTA) algorithm.
After the flexibility computation, we construct an initial DT that includes only the obstacles and two artificial segments. Each segment or obstacle is represented as three points, its left-end, center, and right-end points. Fig. 8 (a) shows the initial DT. The construction of DT takes $O(|P| \lg |P|)$ time, where $|P|$ is the number of points. Note that a DT can be updated incrementally; if a new point is added into an existing DT, we only need to update the triangles introduced by this new point. Therefore, the process can be performed very efficiently. The update will be frequently used in the following steps.
Lemma 1: Adding a new point into an existing Delaunay triangulation of $|P|$ points takes $O(\lg |P|)$ time.
Segments are assigned sequentially in the non-decreasing order of their flexibilities. Suppose segment $s_j$ has the smallest flexibility among all unassigned segments; then we assign $s_j$ to a proper track. In order to minimize the area difference among all triangles, the track which results in a DT with smaller area difference is preferred.
After assigning $s_j$ to the track track($s_j$), we need to update the DT and the flexibility of segments. Since we can incrementally update the DT, only the new triangles introduced by $s_j$ need to be re-generated. Only the segments that overlap $s_j$ and are originally assignable to track($s_j$) need to update their values of flexibility. For those segments, the new flexibility would be the original flexibility minus 1. The number of segments overlapping with $s_j$ is bounded by $\ell_j \times t_j$, which is bounded by the constant size of the panel;
IDTA algorithm.
Theorem 2: The IDTA algorithm runs in $O(|S| \lg |S|)$ time, where $|S|$ is the number of segments in a panel.
Fig. 8 shows a track assignment example. Fig. 8 (a) is the initial DT including only obstacles and artificial segments, and Figs. 8 (b), (c), (d) are the assignment results of $s_3$, $s_2$, and $s_1$, respectively. The flexibilities of unassigned segments are listed on the right side of the figures. Note that each time a segment is assigned, the flexibilities of unassigned segments are incrementally updated.
After the track assignment, the actual track position of a segment is known. Thus, we can perform classical segment-to-segment maze routing in the detailed routing stage to connect shorter nets which span at most two routing tiles, and the whole routing process is finished.
[Fig. 8 annotations: a panel with tracks 1 to 4, artificial segments $s_u$ and $s_b$, and layer 1 obstacle $o_1$. Flexibilities of unassigned segments: (a) ξ(s1) = 4.5, ξ(s2) = 5, ξ(s3) = 3.125; (b) ξ(s1) = 4.5, ξ(s2) = 4; (c) ξ(s1) = 4.5.]
Fig. 8. A density-driven track assignment example. (a) The initial Delaunay triangulation. (b) Track assignment for segment $s_3$. (c) Track assignment for segment $s_2$. (d) Track assignment for segment $s_1$.
IV. EXPERIMENTAL RESULTS
The TTR routing system was implemented in the C++ programming language on a 1.2 GHz SUN Blade-2000 workstation with 8 GB memory. We used the LEDA packages to compute the Voronoi
here, j is a value, and tj is bounded by the number of tracks in a diagrams and Delaunay triangulation. We conducted the experiments
panel, which is predetermined before the routing and is around 10– based on the 11 MCNC routing benchmarks [3] (these designs have
20 in our implementation. Therefore, the total time complexity of 3–4 routing layers and contain up to 28K connections) and 5 real
updating DT and the flexibilities of segments is O(lg |S|), and we industrial Faraday benchmarks introduced in [1]. (See Table I for
have the following theorem for the overall time complexity of the the statistics of the Faraday benchmarks.) In our implementation, the
parameter α in Eq. (1) was set to 0.5, and the parameters β, κp, κn, Bl, and Bu in Eq. (2) for all benchmarks were given as 0.5, 2, -2, 10%, and 40%, respectively.

TABLE I
THE FARADAY BENCHMARK CIRCUITS.
Circuit  Size (µm²)     #Layers  #Nets  #Connections  #Pins
DMA      408.4×408.4    6        13256  36162         73982
DSP1     706×706        6        28447  63495         144872
DSP2     642.8×642.8    6        28431  36686         144703
RISC1    1003.6×1003.6  6        34034  95106         196677
RISC2    959.6×959.6    6        34034  95099         196670

We compared the proposed two-pass, top-down routing framework of TTR with the grid-based full-chip multilevel router considering balanced routing density in [20] (named MROR). The MROR program was provided by the authors of [20] and was run on the same machine. For fair comparison, TTR used the same setting for the size of routing tiles in all benchmarks as MROR. Note that as reported in [20], MROR achieves better solutions than the previous work [3], and thus we shall directly compare TTR with MROR.

In addition, we also examined the effects of the Voronoi-diagram-based density critical area analysis (CAA) in TTR by comparing with the minimum-pin density routing algorithm presented in [9]. Note that in [9], the authors applied their algorithm in an ILP-based global router called BoxRouter [8]. Therefore, to focus on the comparison of the two CAA algorithms, we integrated the minimum-pin density routing algorithm into TTR. In other words, we removed the prerouting of TTR and replaced the cost function of the global router in Eq. (2) by the minimum-pin density routing algorithm.

Tables II and III show the comparison results on the MCNC and Faraday benchmarks, respectively. Note that since the MROR program can only handle the designs with all pins lying in layer 1 (as in the MCNC benchmarks), we did not conduct the experiments on the Faraday benchmarks (where pins are distributed between layers 1 and 3) for MROR. In the tables, we used the same metrics as those in [20] which can evaluate the uniformity of wire distribution in the routing stage, where “Rout.” stands for routability, “#Netmax” denotes the maximum number of nets crossing a level-0 tile, “#Netavg_h” represents the average number of nets horizontally crossing a tile and “σh” gives its standard deviation, and “#Netavg_v” gives the average number of nets vertically crossing a tile and “σv” gives its standard deviation. For the TTR routing systems, “#LG” denotes the total number of layer groups for the layer assignment, and “#Seg” shows the total number of segments.

As shown in the tables, all routers obtain 100% routing completion on the MCNC benchmarks, and both routers applying the new framework of TTR outperform the multilevel router MROR in wire uniformity. Compared with MROR, TTR incorporated with the minimum-pin density global routing algorithm reduces #Netmax, #Netavg_v, and #Netavg_h by 32%, 28%, and 26%, respectively, and TTR with Voronoi-diagram-based CAA can achieve 43%, 34%, and 36% reductions on #Netmax, #Netavg_v, and #Netavg_h, respectively. Moreover, the routers using the TTR framework also result in at least 35% smaller standard deviations of wire distribution in both directions (which implies better density smoothness) than MROR. The results on the Faraday benchmarks also show that the global routing guided by the Voronoi-diagram-based CAA can achieve better wire uniformity than the minimum-pin density global router. Fig. 9 shows the routing layouts of “S13207” and the corresponding wire-crossing maps in the vertical direction for the aforementioned three routers, and Fig. 10 shows the results for the Faraday circuit “RISC1” and the horizontal wire-crossing maps. The experimental results consistently show the superior effectiveness and efficiency of our routing algorithm and framework in wire density control.

[Fig. 9: panels (a)–(f); the wire-crossing maps are titled “Vertical Wire Crossing” with a scale from 0 to 30.]

Fig. 9. The routing result and the vertical wire-crossing map in tiles for “S13207.” (The red, green, and blue lines represent metals 1, 2, and 3, respectively.) (a) and (b) The routing layout and its vertical wire crossing of MROR [20]. The maximum vertical wire crossing is 27. (c) and (d) The routing layout and its vertical wire crossing obtained from the minimum-pin density global routing [9] + TTR’s routing framework. The maximum vertical wire crossing is 13. (e) and (f) The routing layout and its vertical wire crossing of TTR (Ours). The maximum vertical wire crossing is only 11.

V. CONCLUSIONS

We have presented a new two-pass, top-down full-chip grid-based router, named TTR, considering wire density for CMP variation control. TTR features a new Voronoi-diagram-based density critical area analyzer, a planarization-aware global router, a layer assigner for panel-density minimization, and an effective track assigner based on the incremental Delaunay triangulation. Experimental results have shown the effectiveness and efficiency of the proposed methods.

REFERENCES

[1] S. N. Adya, S. Chaturvedi, J. A. Roy, D. Papa, and I. L. Markov, “Unification of Partitioning, Floorplanning and Placement,” Proc. ICCAD, pp. 550–557, Nov. 2004.
[2] S. H. Batterywala, N. Shenoy, W. Nicholls, and H. Zhou, “Track Assignment: A Desirable Intermediate Step Between Global Routing and Detailed Routing,” Proc. ICCAD, pp. 59–66, Nov. 2002.
[3] Y.-W. Chang and S.-P. Lin, “MR: A New Framework for Multilevel Full-Chip Routing,” IEEE TCAD, vol. 23, no. 5, pp. 793–800, May 2004.
[4] T.-C. Chen and Y.-W. Chang, “Multilevel Gridless Routing Considering Optical Proximity Correction,” Proc. ASP-DAC, pp. 1160–1163, Jan. 2005.
[5] T.-C. Chen, Y.-W. Chang, and S.-C. Lin, “A Novel Framework for Multilevel Full-Chip Gridless Routing,” Proc. ASP-DAC, pp. 636–641, Jan. 2006.
[6] H.-Y. Chen, M.-F. Chiang, Y.-W. Chang, L. Chen, and B. Han, “Novel Full-Chip Gridless Routing Considering Double-Via Insertion,” Proc. DAC, pp. 755–760, Jul. 2006.
[7] Y. Chen, P. Gupta, and A. B. Kahng, “Performance-Impact Limited Area Fill Synthesis,” Proc. DAC, pp. 22–27, Jun. 2003.
TABLE II
COMPARISON FOR THE WIRE DENSITY CONTROL ON THE MCNC BENCHMARKS.

MROR [20]
Circuit   #Netmax  #Netavg_v  #Netavg_h  σv    σh    CPU (sec)
Mcc1      45       9.9        11.3       7.6   7.3   77.4
Mcc2      96       18.7       20.9       17.3  18.5  2714.9
Struct    7        1.4        1.4        1.1   1.6   61.4
Primary1  15       0.7        0.6        1.2   1.8   69.1
Primary2  25       2.1        1.9        1.6   4.5   322.2
S5378     15       4.4        3.5        3.4   2.1   4.5
S9234     14       4.0        2.6        3.2   1.6   3.2
S13207    27       9.3        5.9        5.2   2.8   15.8
S15850    26       10.3       7.4        5.4   2.9   23.8
S38417    23       7.3        4.3        4.4   2.2   54.2
S38584    29       9.1        5.8        5.4   2.9   137.7
Comp.     1.00     1.00       1.00       1.00  1.00  1.00

Minimum pin density global routing [9] + TTR's routing framework
Circuit   #LG  #Seg   #Netmax  #Netavg_v  #Netavg_h  σv    σh    CPU (sec)
Mcc1      124  2600   41       10.3       11.1       5.1   7.6   36.1
Mcc2      256  15814  119      20.6       22.2       14.4  19.6  798.0
Struct    193  2128   5        1.2        0.8        0.9   0.8   66.8
Primary1  328  2423   12       0.8        0.7        0.9   1.4   27.0
Primary2  387  8338   22       2.5        1.9        1.3   2.8   144.0
S5378     87   1091   8        2.5        2.4        1.6   1.5   8.1
S9234     95   912    7        1.7        1.6        1.4   1.3   5.2
S13207    97   1727   13       3.4        3.0        2.1   1.8   24.8
S15850    97   1834   12       4.0        3.8        2.3   1.9   34.2
S38417    188  5043   10       3.0        2.4        1.8   1.4   62.5
S38584    189  6004   16       3.3        3.1        2.3   1.6   112.0
Comp.     -    -      0.68     0.72       0.74       0.59  0.65  1.01

TTR (Ours)
Circuit   #LG  #Seg   #Netmax  #Netavg_v  #Netavg_h  σv    σh    CPU (sec)
Mcc1      124  2639   30       10.3       11.0       5.9   6.4   33.4
Mcc2      256  16644  87       20.5       22.2       13.9  16.0  645.0
Struct    167  2124   6        1.1        0.8        1.1   1.0   58.2
Primary1  215  2207   6        0.7        0.3        0.9   0.8   24.3
Primary2  303  7693   8        1.8        0.9        1.3   1.6   131.0
S5378     91   1193   9        2.5        2.4        1.8   1.5   8.2
S9234     95   1003   9        1.7        1.6        1.6   1.2   5.4
S13207    97   1821   11       3.3        3.0        2.3   1.7   24.2
S15850    97   1915   13       3.9        3.8        2.4   1.9   33.5
S38417    188  5462   11       2.9        2.4        2.0   1.4   62.4
S38584    189  6328   15       3.3        3.1        2.3   1.6   112.0
Comp.     -    -      0.57     0.66       0.64       0.64  0.65  0.98
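The “Comp.” row of Table II is not defined explicitly. A minimal sketch, assuming it is the mean of the per-circuit ratios against MROR, reproduces the reported #Netmax entries; the function name `comp_ratio` is hypothetical, and the lists are transcribed from the #Netmax columns of Table II.

```python
# #Netmax columns transcribed from Table II (11 MCNC circuits, in table order).
NETMAX_MROR = [45, 96, 7, 15, 25, 15, 14, 27, 26, 23, 29]
NETMAX_MINPIN = [41, 119, 5, 12, 22, 8, 7, 13, 12, 10, 16]
NETMAX_TTR = [30, 87, 6, 6, 8, 9, 9, 11, 13, 11, 15]


def comp_ratio(values, baseline):
    """Mean of per-circuit ratios value/baseline.

    Assumption: this is how the "Comp." row is normalized; the paper
    does not spell the definition out.
    """
    ratios = [v / b for v, b in zip(values, baseline)]
    return sum(ratios) / len(ratios)


minpin = comp_ratio(NETMAX_MINPIN, NETMAX_MROR)
ttr = comp_ratio(NETMAX_TTR, NETMAX_MROR)
print(f"min-pin + TTR framework: {minpin:.2f}")  # 0.68
print(f"TTR: {ttr:.2f}")                         # 0.57
```

Under this reading, 0.68 and 0.57 correspond to the 32% and 43% #Netmax reductions quoted in the text.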
TABLE III
COMPARISON FOR THE WIRE DENSITY CONTROL ON THE INDUSTRIAL FARADAY BENCHMARKS.

Minimum pin density global routing [9] + TTR's routing framework
Circuit  Rout.   #LG  #Seg  #Netmax  #Netavg_v  #Netavg_h  σv    σh    CPU (sec)
DMA      99.19%  272  5168  14       3.14       2.77       1.70  1.77  48.8
DSP1     99.11%  264  4241  11       2.91       2.50       1.95  1.89  124.2
DSP2     99.10%  268  4676  14       2.78       2.78       1.71  1.92  87.2
RISC1    99.16%  265  5864  21       3.63       3.79       2.95  3.78  355.3
RISC2    99.23%  260  6141  21       3.64       3.70       2.55  3.08  297.4
Comp.    99.16%  -    -     1.00     1.00       1.00       1.00  1.00  1.00

TTR (Ours)
Circuit  Rout.   #LG  #Seg  #Netmax  #Netavg_v  #Netavg_h  σv    σh    CPU (sec)
DMA      99.29%  272  5325  10       3.08       2.70       1.75  1.64  47.0
DSP1     99.18%  263  4529  10       2.85       2.44       2.24  1.95  117.3
DSP2     99.06%  268  4892  10       2.72       2.70       1.90  1.91  82.3
RISC1    99.16%  265  6226  17       3.59       3.73       3.08  3.29  333.4
RISC2    99.19%  260  6533  13       3.59       3.62       2.77  2.89  280.0
Comp.    99.18%  -    -     0.75     0.98       0.98       1.08  0.95  0.95
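The uniformity metrics reported in Tables II and III can be computed from per-tile crossing counts. The following is a minimal sketch, not the authors' code: the grid layout, the function name `uniformity_metrics`, the use of the population standard deviation, and the reading of #Netmax as the larger of the two directional maxima are all assumptions.

```python
from statistics import mean, pstdev


def uniformity_metrics(h_cross, v_cross):
    """Summarize per-tile net crossings.

    h_cross[r][c] and v_cross[r][c] hold the number of nets crossing
    tile (r, c) horizontally and vertically (hypothetical input layout).
    """
    h = [count for row in h_cross for count in row]
    v = [count for row in v_cross for count in row]
    return {
        # Assumption: #Netmax is the larger of the two directional maxima.
        "net_max": max(max(h), max(v)),
        "net_avg_h": mean(h),
        "sigma_h": pstdev(h),  # population standard deviation (assumed)
        "net_avg_v": mean(v),
        "sigma_v": pstdev(v),
    }


# Toy 2x2 tile grid; the counts are illustrative, not benchmark data.
metrics = uniformity_metrics(
    h_cross=[[1, 2], [3, 2]],
    v_cross=[[0, 4], [2, 2]],
)
print(metrics)
```

In this toy grid both directions have the same average crossing count, but the vertical map has the larger standard deviation, which is the kind of non-uniformity the σ columns are meant to expose.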
[Fig. 10: panels (a)–(d); the wire-crossing maps are titled “Horizontal Wire Crossing.”]

Fig. 10. The routing result and the horizontal wire-crossing map in tiles for “RISC1.” (The red, green, blue, magenta, coffee, and aqua blue lines represent metals 1, 2, 3, 4, 5, and 6, respectively, and the white space is allocated by 7 macros.) (a) and (b) The routing layout and its horizontal wire crossing obtained from the minimum-pin density global routing [9] + TTR’s routing framework. The maximum horizontal wire crossing is 21. (c) and (d) The routing layout and its horizontal wire crossing of TTR (Ours). The maximum horizontal wire crossing is only 17.

[8] M. Cho and D. Z. Pan, “A New Global Router Based on Box Expansion and Progressive ILP,” Proc. DAC, pp. 373–378, Jul. 2006.
[9] M. Cho, D. Z. Pan, H. Xiang, and R. Puri, “Wire Density Driven Global Routing for CMP Variation and Timing,” Proc. ICCAD, pp. 487–492, Nov. 2006.
[10] J. D. Cho, S. Raje, and M. Sarrafzadeh, “Approximation for the Maximum Cut, k-Coloring and Maximum Linear Arrangement Problems,” Manuscript, Dept. of EECS, Northwestern Univ., 1993.
[11] J. Cong, J. Fang, and K. Y. Khoo, “DUNE–A Multilayer Gridless Routing System,” IEEE TCAD, vol. 20, no. 5, pp. 633–647, May 2001.
[12] J. Cong, M. Xie, and Y. Zhang, “An Enhanced Multilevel Routing System,” Proc. ICCAD, pp. 51–58, Nov. 2002.
[13] M. de Berg, M. van Kreveld, M. Overmars, and O. Schwarzkopf, Computational Geometry: Algorithms and Applications, Springer, 1997.
[14] T.-Y. Ho, Y.-W. Chang, S.-J. Chen, and D.-T. Lee, “Crosstalk- and Performance-Driven Multilevel Full-Chip Routing,” IEEE TCAD, vol. 24, no. 6, pp. 869–878, Jun. 2005.
[15] T.-Y. Ho, C.-F. Chang, Y.-W. Chang, and S.-J. Chen, “Multilevel Full-Chip Routing for the X-Based Architecture,” Proc. DAC, pp. 597–602, Jun. 2005.
[16] A. B. Kahng, G. Robins, A. Singh, and A. Zelikovsky, “Filling Algorithms and Analyses for Layout Density Control,” IEEE TCAD, vol. 18, no. 4, pp. 445–462, Apr. 1999.
[17] R. Kastner, E. Bozorgzadeh, and M. Sarrafzadeh, “Pattern Routing: Use and Theory for Increasing Predictability and Avoiding Coupling,” IEEE TCAD, pp. 777–790, Nov. 2002.
[18] A. Kurokawa, T. Kanamoto, T. Ibe, A. Kasebe, W. F. Chang, T. Kage, Y. Inoue, and H. Masuda, “Dummy Filling Methods for Reducing Interconnect Capacitance and Number of Fills,” Proc. ISQED, pp. 586–591, Mar. 2005.
[19] K.-S. Leung, “SPIDER: Simultaneous Post-Layout IR-Drop and Metal Density Enhancement with Redundant Fill,” Proc. ICCAD, pp. 33–38, Nov. 2005.
[20] K. S.-M. Li, Y.-W. Chang, C.-L. Lee, C. Su, and J. E. Chen, “Multilevel Full-Chip Routing with Testability and Yield Enhancement,” IEEE TCAD, Sep. 2007.
[21] E. Papadopoulou and D. T. Lee, “Critical Area Computation via Voronoi Diagrams,” IEEE TCAD, vol. 18, no. 4, pp. 463–474, Apr. 1999.
[22] T. H. Park, “Characterization and Modeling of Pattern Dependencies in Copper Interconnects for Integrated Circuits,” Ph.D. Dissertation, Dept. of EECS, MIT, May 2002.
[23] R. Tian, D. F. Wong, and R. Boone, “Model-Based Dummy Feature Placement for Oxide Chemical-Mechanical Polishing Manufacturability,” Proc. DAC, pp. 667–670, Jun. 2000.
[24] Taiwan Semiconductor Manufacturing Company (TSMC), Reference Flows 7.0.
[25] X. Wang, C. C. Chiang, J. Kawa, and Q. Su, “A Min-Variance Iterative Method for Fast Smart Dummy Feature Density Assignment in Chemical-Mechanical Polishing,” Proc. ISQED, pp. 258–263, Mar. 2005.
[26] D. White and B. Moore, “An ‘Intelligent’ Approach to Dummy Fill,” EE Times, Jan. 3, 2005.