Novel wire density driven full-chip routing for CMP variation control

2007, International Conference on Computer Aided Design

Novel Wire Density Driven Full-Chip Routing for CMP Variation Control

Huang-Yu Chen†, Szu-Jui Chou†, Sheng-Lung Wang§, and Yao-Wen Chang†‡
† Graduate Institute of Electronics Engineering, National Taiwan University, Taipei, Taiwan
‡ Department of Electrical Engineering, National Taiwan University, Taipei, Taiwan
§ Synopsys, Inc., Taipei, Taiwan

Abstract— As nanometer technology advances, post-CMP dielectric thickness variation control becomes crucial for manufacturing closure. To improve CMP quality, dummy feature filling is typically performed by foundries after the routing stage. However, filling dummy features may greatly degrade the interconnect performance and lead to an explosion of mask data. It is thus desirable to consider wire-density uniformity during routing to minimize the side effects of aggressive post-layout dummy filling. In this paper, we present a new full-chip grid-based routing system considering wire density for reticle planarization enhancement. To fully consider wire distribution,

into layouts to restrict the variations on each layer. Dummy features may either be connected to power/ground (tied fills) or left floating (floating fills) [19]. The tied fill has a predictable but higher capacitance, while the floating fill has a lower but unpredictable one due to its floating nature. Traditionally, the electrical impact of dummy fills is treated as negligible, and dummy features are inserted during the post-routing stage. Filling algorithms have been proposed to satisfy density bounds and reduce the density variation [16], [25]. However, as reported in [26], these filled dummy features may incur troublesome problems at 65nm and successive technology nodes.
the router applies a novel two-pass, top-down planarity-driven routing framework, which employs a new density critical area analysis based on Voronoi diagrams and incorporates an intermediate stage of density-driven layer/track assignment based on incremental Delaunay triangulation. Experimental results show that our methods can achieve more balanced wire distribution than state-of-the-art works.

I. INTRODUCTION

As IC process geometries shrink to 65nm and below, one important yield loss of interconnects comes from the chemical-mechanical polishing (CMP) step in the copper metallization (Damascene) process. Because of the difference in hardness between copper and dielectric materials, the CMP planarizing process might generate topography irregularities. A non-uniform feature density distribution on each layer causes CMP to over-polish or under-polish, generating metal dishing and dielectric erosion [22].

The tied fill may induce crosstalk because of its high coupling capacitance to nearby interconnects and would place a heavy burden on the P/G (power/ground) network. On the other hand, the capacitance of floating fills is usually uncertain, and thus the induced coupling capacitance might unpredictably harm the timing-optimized results of the previous design stages. Moreover, dummy fills also greatly increase the mask data volume, lengthening mask-making processes such as mask synthesis, writing, and inspection verification. In particular, these filled features significantly increase the input data for the subsequent time-consuming reticle enhancement techniques, such as OPC (optical proximity correction) and PSM (phase shift mask). Therefore, much research focuses on impact-limited dummy feature filling algorithms [7], [18].

In the nanometer technology, routing has become a decisive factor for determining chip manufacturability, since it presides over most
of the layout geometries in the back-end design process. In order to tackle these manufacturing challenges, routing techniques must handle the increasing complexity. The routing approaches applying the bottom-up coarsening and top-down uncoarsening techniques have demonstrated the superior capability of handling large-scale routing problems, such as the Λ-shaped multilevel [3], [4], [12], V-shaped multilevel [5], and the two-pass bottom-up [6] routing frameworks.

Recently, routing considering wire distribution has attracted much attention in the literature. Earlier studies of CMP processes have indicated that the post-CMP dielectric thickness is highly correlated to the layout pattern density, because during the polishing step, the interlevel dielectric (ILD) removal rates vary with the pattern density [23].

These thickness variations have to be carefully controlled, since the variation in one interconnect level is progressively transferred to subsequent levels during manufacturing, and finally the compounding variation can be significant on an upper level, which is often called the multi-layer accumulative effect [23].

Two key problems arise from the post-CMP thickness variation: (1) the layout surface fluctuates inside or outside the depth of focus (DOF) of the photolithography system, such that the exposed patterns do not appear acceptably sharp and open/short defects may even occur, and (2) these irregular variations greatly change the electrical characteristics of interconnects, especially resistance and capacitance, degrading the accuracy of timing analysis and worsening electromigration. As a result, in order to improve chip thickness uniformity, TSMC recommends performing virtual CMP (VCMP)
analysis to identify the metal and dielectric thickness variation hotspots before chip fabrication for 65nm manufacturing processes (see TSMC Reference Flows 7.0) [24].

In order to improve the CMP quality, modern foundries often impose recommended layout density rules and fill dummy features

This work was supported in part by UMC and NSC of Taiwan under Grant Nos. NSC 96-2752-E-002-008-PAE, NSC 96-2628-E-002-248-MY3, NSC 96-2628-E-002-249-MY3, and NSC 96-2221-E-002-245.

tal results show that TTR can achieve 43% reduction on the maximum number of nets crossing in tiles and obtain at least 35% smaller standard deviations of wire distribution.

The rest of this paper is organized as follows. Section II describes the routing model and the routing framework. Section III presents our density-driven routing algorithms.

Further, the layout pattern (consisting of wires and dummy features) density can be systematically determined by the wire density distribution, as reported in [9]. Therefore, managing wire density at the routing stage has great potential for alleviating the problems induced by aggressive dummy feature filling.

Li et al. [20] presented the first routing system in the literature addressing the CMP induced variation. By setting the desired density in the cost function of global routing, the routing results have more balanced interconnect distribution. Cho et al. [9] proposed a pioneering work to consider CMP variation during global routing. They empirically developed a predictive CMP density model and showed that the number of inserted dummy features can be predicted by the wire density. Therefore, they proposed a minimum-pin density global routing algorithm to reduce the maximum wire density in each global tile. However, both approaches only consider the wire density inside a routing tile. Since the topographic variation is a long-
range effect, focusing on the density value inside each routing tile may incur larger inter-tile density differences and result in more irregular post-CMP thickness. (See Fig. 1 (a).) Therefore, optimizing wire-density uniformity inside a routing tile is not the right metric and is a common pitfall for CMP control. For better CMP control, it is more desirable to minimize the global variation of wire density, i.e., the density gradient. In the example shown in Fig. 1, if the density lower and upper bounds are 20% and 80% respectively, then the three adjacent routing tiles in Fig. 1 (b) all satisfy these rules. However, Fig. 1 (c) is a better choice for CMP control because it has the minimum wire-density gradient.

Experimental results are reported in Section IV, and conclusions are given in Section V.

II. ROUTING MODEL

We first explain the routing model. As illustrated in Fig. 2, G_k corresponds to the routing graph of level k. Each level contains a number of global cells (GCs), and the GCs belonging to different levels have different sizes. We denote GC_k as the GC of level k.

The first top-down routing pass is for global routing, which starts uncoarsening from the coarsest level to the finest level (level 0). At each level k, our global router finds routing paths for the local nets (those nets that entirely sit inside GC_k but not inside GC_{k−1}). After all the global routings of level k are performed, we divide one GC_k into four smaller GC_{k−1} and at the same time perform resource estimation for use at level k−1. Uncoarsening continues until the size of GC_k at a level is below a threshold.

The second top-down routing pass is for detailed routing. As in the first pass, it processes uncoarsening from the coarsest level to the finest level. At each level, a detailed router is performed and rip-up/reroute procedures are applied for failed nets.

[Fig. 1 panels, vertical axis: post-CMP thickness. (a) tile1, tile2: 30%, 30%. (b) tile1 to tile3: 20%, 50%, 30%. (c) tile1 to tile3: 40%, 50%, 40%.]
Fig. 1. Density variation among neighboring subregions impacts topography. (a) Different wire distribution in a subregion exists even under the same density. Large density variation among neighboring subregions leads to post-CMP thickness irregularities. (b) Three adjacent routing tiles satisfy density rules but result in unbalanced wire distribution. (c) A better result for minimizing the density gradient among tiles.

The process continues until we reach level 0 when the final routing solution is obtained.

III. DENSITY-DRIVEN ROUTING

To deal with wire density optimization, we develop a Two-pass Top-down full-chip grid-based Routing system, named TTR (see Fig. 2). The rationale for top-down routing lies in the fact that it tends to route longer nets first level by level, which directly contributes to better wire planning since longer nets have greater impacts on planarization than shorter ones. We detail the three distinguished stages of TTR in the following subsections.

A. Density Critical Area Analysis (CAA)

In order to guide the following routing for making better decisions, TTR features a density critical area analysis in the prerouting stage that identifies the potential over-dense hotspots. Recently, Cho et al. [9] performed minimum-pin density routing to prevent global-routing paths from crossing through over-dense areas.

In this paper, we present a new full-chip grid-based routing system, named TTR (Two-pass Top-down grid-based Router), considering wire-distribution uniformity for density variation minimization. To fully consider wire distribution, the router is based on a novel two-pass, top-down planarization-driven routing framework. (See Fig. 2 for an illustration.) Different from the aforementioned works, TTR has the following distinguished features:
• A new routing framework of performing density prediction in the prerouting stage, followed by planarization-aware global
routing at the first uncoarsening stage, an intermediate stage of density-driven layer/track assignment, and then detailed routing at the second uncoarsening stage.
• An efficient density critical area analysis (CAA) algorithm based on Voronoi diagrams is performed off-line in the prerouting stage, which considers both topological information of pins and wire connection to complement the density analysis. As shown in Section IV, the Voronoi-diagram based CAA algorithm leads to 3–5% faster overall routing process due to easier density control for later detailed routing. Further, it can substantially improve the resulting wire-density uniformity.
• A planarization-aware global router is employed to consider the density lower and upper bounds while minimizing the density gradient among global tiles.

The reason is that a path with higher pin density tends to pass through more wire-dense areas, since the existence of a pin means that eventually there is at least one wire connecting it to other pins. This approach can help reduce the wire density in each global tile. However, there are some limitations. As the global routing instance in Fig. 3 (a) shows, although the routing path n1 passes fewer pins, it may exacerbate the over-dense areas in its adjacent regions. In contrast, the routing path n2 contains more pins but results in a better balanced wire distribution. Moreover, the pin density is not directly proportional to the wire density. As shown in Fig. 3 (b), a small pin count in the global tile may still contribute to large wire density. Therefore, it is necessary to consider both topological information and wire connections of each pin to complement the density analysis.

To remedy the deficiencies, we develop a new enhanced analysis model based on Voronoi diagrams.

• A layer assigner for panel-density minimization and a density-
driven track assignment algorithm based on the incremental Delaunay triangulation are performed before detailed routing to preserve more flexibility for wire density arrangement.

Compared with the density-driven routing system [20], experimen-

Fig. 2. The new two-pass, top-down routing framework. (Figure labels: to-be-routed net, already-routed net, high/low scale, and routing graphs G2, G1, G0 uncoarsened in each pass.) The four stages shown are:
• Prerouting Stage (Critical Area Analysis): Identify the potential density hot spots based on the pin distribution and wire connection to guide the following global routing.
• First Pass Stage: Apply prerouting-guided planarization-aware global pattern routing for local nets and iteratively refine the solution.
• Intermediate Stage (Layer/Track Assignment): Perform density-driven layer/track assignment for long segments panel by panel.
• Second Pass Stage: Use segment-to-segment detailed maze routing to route short segments and reroute failed nets level by level.

Fig. 3. Limitations of minimum-pin density routing [9]. (a) Path n1 passes fewer pins but tends to exacerbate the over-dense areas in its adjacent regions, whereas path n2 passes more pins but leads to better balanced wire density. (b) Pin count cannot reflect the wire density in the global tile well.

Fig. 4. Voronoi diagram for points with (a) non-uniform distribution and (b) uniform distribution.

The Voronoi diagram of a point set P partitions the plane into regions, called Voronoi cells, each of which is associated with a point of P. If a point in the plane is closer to the point p_t ∈ P than to any other point of P, then this point will be in the interior of the Voronoi cell associated with p_t. The boundary segments of a Voronoi cell are called the Voronoi edges. A Voronoi diagram can efficiently compute the physical proximity and has been well studied in computational geometry [13].
Papadopoulou and Lee [21] used Voronoi diagrams of rectilinear polygons to compute the critical areas for short defects in a circuit layout.

The motivation for the Voronoi diagram approach lies in the following observation.

Observation 1: Given the Voronoi diagram of points, the standard deviation for the size of Voronoi cells strongly depends on the distribution of these points.

As illustrated in Fig. 4 (a), the Voronoi cells for points with non-uniform distribution have large variation in sizes; in contrast, as shown in Fig. 4 (b), for points with uniform distribution, the sizes of Voronoi cells are almost the same.

Another observation can quantify the proximity relation to indicate whether a point lies in the dense area.

Observation 2: For a point, the number of adjacent Voronoi cells which entirely sit within a specified distance from this point reflects the dense quantity of the region where this point lies.

As shown in Fig. 5 (a), the point in the dense area has more Voronoi cells around it within a given circle with its center at this point.

Fig. 5. Voronoi-diagram-based pin density analysis. (a) Proximity relation induced by the Voronoi diagram reflects the dense quantity well. (b) Density cost is measured by the topological proximity and the number of wire connections.

Based on these observations, we specify a range r and associate each pin p with a density cost d_p, which is defined as

$$d_p = \alpha \nu_p + (1 - \alpha)\,\omega_p, \qquad (1)$$

where ν_p is the number of Voronoi cells around p (excluding the Voronoi cell associated with p itself) which entirely sit inside the circle with a center at p and radius r, ω_p is the number of wires connecting to p, and α, 0 ≤ α ≤ 1, is a user-defined parameter. For the example shown in Fig.

local density and minimizes the density difference among adjacent regions.

For more balanced wire distribution, the cost function Φ_p of the
global routing path g_p is defined as follows:

$$\Phi_p = \operatorname{avg}\{\Phi_t \mid \text{tile } t \text{ is on the path } g_p\}, \qquad (3)$$

in which averaging reflects the goal of an even wire distribution.

C. Density-Driven Layer/Track Assignment

Recently, Cong et al. [11] proposed the first wire-planning scheme between global and detailed routers to reduce congestion. Batterywala et al. [2] also suggested adding a track assignment stage between global and detailed routing to improve the routing quality. Ho et al. [14] developed a layer/track assignment heuristic in the intermediate stage for crosstalk optimization. Later in [15], Ho et al. further extended their track assigner for wirelength reduction in X-architecture routing. However, wire density is not addressed in these works.

5 (b), there are three Voronoi cells around p which entirely sit inside the circle, and four wires are connected to p. Therefore ν_p and ω_p equal 3 and 4, respectively.

In the current implementation, we set the radius r as the average distance among pins of adjacent Voronoi cells. In this way, the expected value for ν_p would be zero if p lies in a uniformly distributed region; otherwise, ν_p would increase as a penalty to reflect the density hotspot where p lies. Additionally, since two-pin nets practically dominate the netlist in most designs, the expected value of ω_p would equal one. Therefore, the ranges of ν_p and ω_p in Eq. (1) are similar and can be reasonably combined together through the α parameter.

After all density costs of pins have been computed, we transform these costs into the cost of global tiles. For each global tile t, we set its predicted density cost $\hat{d}_t = \max\{d_p \mid p \text{ is inside } t\}$ in the prerouting stage. Then TTR feeds the pre-estimated density information to the following routing stages. The density critical area analysis can be efficiently performed. We have the following theorem.
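The pin density cost of Eq. (1) and the tile-level predicted cost (the maximum pin cost inside a tile) can be illustrated with a minimal sketch. The function names, the pin labels, and the tile coordinates below are hypothetical; ν_p and ω_p are taken as given inputs rather than computed from an actual Voronoi diagram, and α = 0.5 follows the setting reported in the experiments.

```python
def pin_density_cost(nu_p, omega_p, alpha=0.5):
    """Eq. (1): d_p = alpha * nu_p + (1 - alpha) * omega_p."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * nu_p + (1.0 - alpha) * omega_p


def tile_density_costs(pin_costs, pin_to_tile):
    """Predicted cost of a tile = maximum density cost of the pins inside it."""
    tile_cost = {}
    for pin, cost in pin_costs.items():
        tile = pin_to_tile[pin]
        tile_cost[tile] = max(cost, tile_cost.get(tile, float("-inf")))
    return tile_cost


# Fig. 5 (b) example from the text: nu_p = 3 Voronoi cells, omega_p = 4 wires.
d_p = pin_density_cost(3, 4, alpha=0.5)

# Hypothetical pins "q" and "r" and hypothetical tile coordinates.
pin_costs = {
    "p": d_p,
    "q": pin_density_cost(0, 1),  # uniform region, two-pin net
    "r": pin_density_cost(1, 1),
}
pin_to_tile = {"p": (0, 0), "q": (0, 0), "r": (1, 0)}
tiles = tile_density_costs(pin_costs, pin_to_tile)
```

With α = 0.5 the Fig. 5 (b) pin gets a cost of 3.5, and the tile containing it inherits that value because the tile cost is a maximum rather than a sum.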
Theorem 1: The Voronoi-diagram based density CAA runs in O(|P| lg |P|) time, where |P| is the number of pins.

Note that the Voronoi-diagram based CAA algorithm is performed only once, and its running time overhead is very small (about 3% of the total running time in our experiment). Further, it even leads to 3–5% faster overall routing process due to easier density control for later detailed routing, and it can substantially improve the resulting wire-density uniformity.

B. Planarization-Aware Global Routing

The global routing plans tile-to-tile routing paths for all nets and thereby is an important step to decide the wire distribution and maintain a uniform metal density across the chip. As mentioned in the introduction, both previous works [9], [20] consider only the wire density inside each global tile, which might incur larger inter-tile density gradient and thus more irregular post-CMP thickness. As a

1) Density-Driven Layer Assignment: In this paper, we propose a new layer/track assignment algorithm for wire-density optimization. To the best of our knowledge, this is the first work of wire planning that addresses the wire-density optimization in the literature.

We handle long horizontal (vertical) segments which span more than one complete global tile in a row (column) in the middle layer/track assignment stage and delegate short segments to the detailed router. The full row (or column) of a global tile array is called a row (column) panel. We will refer to a row panel as a panel throughout the paper for brevity, unless specified otherwise.

In a panel, the local density of a column is defined as the total number of segments and obstacles at that column, and the panel density is the maximum local density among all columns. For example, Fig. 6 (a) gives a row panel with 11 columns, c1 to c11. There are six segments s1 to s6 in the panel and two obstacles o1 and o2 in layers, and its panel density is equal to 4.
We intend to evenly arrange these segments to two horizontal layers (say layers 1 and 3) while minimizing the panel density at each layer. The density-driven layer assignment problem is defined as follows.
• The Density-driven Layer Assignment (DLA) Problem: Given a set L of layers, a set S of disjoint segments in a panel, and a set O of fixed obstacles in layers, assign each segment of S to a layer, such that for each layer the local density is balanced, and the panel density is minimized.

To solve the DLA problem, we partition the segments and obstacles in each panel into |L| layer groups such that the main objective of DLA is achieved.

First, we build the horizontal constraint graph HCG(V, E) for S and O in the panel. Each vertex v ∈ V corresponds to a segment or an obstacle, and two vertices v_i and v_j are connected by an edge e ∈ E if their spans overlap. The cost of edge e(v_i, v_j) is defined as the maximal local density among the overlapping columns between v_i and v_j.

result, for better CMP control, a global router has to consider the density variation (gradient) among global tiles in addition to wire density inside each tile.

In our TTR, the global routing performed in the first top-down uncoarsening pass is based on pattern routing [17]. Pattern routing uses an L-shaped (1-bend) or Z-shaped (2-bend) route to make the connection, which gives the shortest path length between two points while reducing the routing bends. Therefore, the obtained routing path is the shortest, and we thus can focus on the objectives that concern us most.

We define the planarization-aware cost Φ_t for each global tile t as follows:

$$\Phi_t = \begin{cases} \kappa_p, & \text{if } d_t \ge B_u \\ \hat{d}_t + \beta\,(2^{d_t} - 1) + (1 - \beta)\,(d_t - \bar{d}_t)^2, & \text{if } B_l \le d_t < B_u \\ \kappa_n, & \text{if } d_t < B_l \end{cases} \qquad (2)$$

where d_t is the wire density of t, $\hat{d}_t$ is the predicted hotspot cost
calculated in the prerouting stage, $\bar{d}_t$ is the average wire density of tiles adjacent to t, B_l and B_u are the density lower and upper bounds specified in foundry density rules respectively, and β, 0 ≤ β ≤ 1, is a user-defined parameter. (Note that both the values of $2^{d_t} - 1$ and $(d_t - \bar{d}_t)^2$ are between 0 and 1.) κ_p and κ_n are constants, where κ_p is a positive penalty that hinders over-denseness in the global tile, and κ_n is a negative reward that encourages paths to go through sparse tiles. The second equation simultaneously considers

With this weighting policy, if two vertices are connected by an edge with a high cost, they should be separated into different layers. Fig. 6 (b) shows the HCG of the panel in Fig. 6 (a). Here, the obstacle o2 and segment s3 overlap in columns c3 and c4, and the maximal local density of c3 and c4 is 3. So the cost of the edge (o2, s3) equals 3.

Consequently, we can formulate the DLA problem as a max-cut, k-coloring problem (MCP) [10] on the HCG graph, where k equals |L|. In this way, we can guarantee that the partitioning result can evenly

[Fig. 6 (a): row panel with tracks 1 to 4 and columns c1 to c11, holding segments s1 to s6 and obstacles o1 and o2; legend: segment, layer 1 obstacle, layer 3 obstacle.]

correct layer since there is no segment there (otherwise, there must be an edge connected with v_o). The final assignment result after the repair procedure for exchanging the layer of vertex o1 with that of vertex s6 is shown in Fig. 6 (d). As a result, the final assignment has a very balanced density distribution: the average local density of layer 1 is 1.18 and that of layer 3 is 1.27, while the panel densities in both layers equal 2. See Figs. 6 (e) and (f) for the resulting segment assignments for layers 1 and 3, respectively.
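The planarization-aware tile cost of Eq. (2) and the path cost of Eq. (3) can be sketched as below. This is only an illustration under a stated assumption: the extracted equation is garbled, and the sketch reads the middle case as the predicted hotspot cost plus β(2^{d_t} − 1) plus (1 − β) times the squared difference between d_t and the average density of adjacent tiles. The function and argument names are hypothetical; the default parameter values (β = 0.5, κ_p = 2, κ_n = −2, B_l = 10%, B_u = 40%) are the ones reported in the experimental section.

```python
import math


def tile_cost(d_t, d_pred, d_avg, beta=0.5, b_l=0.10, b_u=0.40,
              kappa_p=2.0, kappa_n=-2.0):
    """Assumed reading of Eq. (2).

    d_t    : wire density of the tile (0..1)
    d_pred : predicted hotspot cost from the prerouting stage
    d_avg  : average wire density of the adjacent tiles (0..1)
    """
    if d_t >= b_u:          # over-dense tile: fixed positive penalty
        return kappa_p
    if d_t < b_l:           # sparse tile: fixed negative reward
        return kappa_n
    local_term = 2.0 ** d_t - 1.0        # grows with the tile's own density
    gradient_term = (d_t - d_avg) ** 2   # penalizes differences to neighbors
    return d_pred + beta * local_term + (1.0 - beta) * gradient_term


def path_cost(tile_costs):
    """Eq. (3): average of the tile costs along a global routing path."""
    return sum(tile_costs) / len(tile_costs)


over_dense = tile_cost(0.50, d_pred=0.0, d_avg=0.30)
sparse = tile_cost(0.05, d_pred=0.0, d_avg=0.30)
balanced = tile_cost(0.25, d_pred=0.0, d_avg=0.25)
uneven = tile_cost(0.25, d_pred=0.0, d_avg=0.05)
example_path = path_cost([over_dense, sparse])
```

Two tiles with the same density of 25% receive different costs depending on their neighbors: the tile whose neighbors are also near 25% is cheaper than the one surrounded by 5% tiles, which is how the gradient term steers paths toward an even distribution.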
[Fig. 6 (a), local density for columns c1 to c11: 1, 2, 3, 2, 2, 2, 3, 2, 4, 4, 2.]
[Fig. 6 (b), (c), (d): graphs over the vertices s1 to s6, o1, and o2 with integer edge costs.]
[Fig. 6 (e), layer 1: track 1 holds s1, track 2 holds s3, track 3 holds o1 (layer 1 obstacle), track 4 holds s2; local density for columns c1 to c11: 1, 1, 2, 1, 1, 1, 1, 1, 2, 2, 0.]
[Fig. 6 (f), layer 3: track 1 holds o2 and s6, track 3 holds s4, track 4 holds s5.]

Note that for practical concerns, in addition to the objectives of DLA, a good, practical layer assigner shall also assign layers with more segments of the same nets closer to each other to minimize the stacked-via usage. We can model the connectivity among layers as a connection graph C(V, E) whose nodes represent layers and edges denote the corresponding connectivity. Then, the problem can be solved by first computing the Maximum-Weighted Hamiltonian Path (MWHP) on C(V, E) and then assigning layers with the largest connectivity closer to each other. Since the MWHP problem is NP-hard, we apply a greedy algorithm similar to Kruskal's minimum spanning tree algorithm to handle the MWHP problem. We first sort edges by their weights, and then add edges in non-increasing weight order if they form a path.

2) Density-Driven Track Assignment: After the layer assignment, we intend to uniformly spread the segments in each layer of panels and balance the segment distribution among neighboring panels. For convenience, we hereafter refer to a layer of a panel as a panel since the layer assignment has already been performed. Let T be the set of tracks inside a panel. Each track τ ∈ T can be represented by the set of its constituent contiguous intervals. Denote these intervals by x_i. A segment s ∈ S is said to be assignable to τ ∈ T, τ ≡ x_i, if either x_i is a free interval or is an interval occupied by a segment of the same net.
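The greedy path construction used for ordering layers (sort the edges of the connection graph by weight, then add them in non-increasing order as long as they still form a path) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name, the layer indices, and the connectivity weights are hypothetical, and the sketch assumes a complete connection graph so that a single path covering all layers results.

```python
def greedy_layer_order(n_layers, weights):
    """Kruskal-like greedy heuristic for a maximum-weighted Hamiltonian path.

    weights maps an unordered layer pair (a, b) to its connectivity.
    An edge is accepted only if both endpoints still have degree < 2
    and the edge does not close a cycle, so accepted edges form paths.
    """
    parent = list(range(n_layers))
    degree = [0] * n_layers

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    adj = {i: [] for i in range(n_layers)}
    ranked = sorted(weights.items(), key=lambda kv: kv[1], reverse=True)
    for (a, b), _w in ranked:
        if degree[a] < 2 and degree[b] < 2 and find(a) != find(b):
            parent[find(a)] = find(b)
            degree[a] += 1
            degree[b] += 1
            adj[a].append(b)
            adj[b].append(a)

    # Walk the path from its lowest-numbered endpoint.
    start = min(i for i in adj if len(adj[i]) <= 1)
    order, prev, cur = [start], None, start
    while True:
        nxt = [n for n in adj[cur] if n != prev]
        if not nxt:
            break
        prev, cur = cur, nxt[0]
        order.append(cur)
    return order


# Hypothetical connectivity between four layer groups.
connectivity = {(0, 1): 5, (0, 2): 9, (0, 3): 1,
                (1, 2): 4, (1, 3): 7, (2, 3): 3}
layer_order = greedy_layer_order(4, connectivity)
```

In this example the edges (0, 2), (1, 3), and (0, 1) are accepted in that order, so the strongly connected groups 0 and 2 end up adjacent, as do groups 1 and 3.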
[Fig. 6 (f), local density for columns c1 to c11: 0, 1, 1, 1, 1, 1, 2, 1, 2, 2, 2; legend: layer 3 obstacle.]

Fig. 6. A density-driven layer assignment example. (a) A row panel A consists of six segments and two obstacles. We intend to evenly assign these segments to two horizontal layers (layers 1 and 3). (b) The horizontal constraint graph. (c) The layer-partitioning result for two layer groups by applying the maximum spanning tree and k-coloring algorithms. (d) The final layer assignment result by applying a minimum-impact repair procedure to exchange the layers of s6 and o1. (e) and (f) The final local densities of layers 1 and 3, respectively.

The density-driven track assignment problem is defined as follows:
• The Density-driven Track Assignment (DTA) Problem: Given a panel A and its two neighboring panels A_u and A_b, a set of tracks T ∈ A, a set of segments S ∈ A, and a set of fixed obstacles O ∈ A, for a given cost function Ψ : S × T → R which represents the density cost of assigning a segment to a track, find a feasible assignment of S to T that minimizes Ψ.

To solve this problem, we propose an Incremental Delaunay-triangulation-based Track Assignment (IDTA) algorithm. In Observation 1, we have discovered the relation between density uniformity and the Voronoi diagram. Instead of using the Voronoi diagram, we can leverage the good properties of its dual graph, called Delaunay Triangulation (DT), to evaluate the segment distribution. The DT for a point set is a triangulation that minimizes the standard deviations of angles among all triangles, and the circumscribed circle of every triangle will not contain any other point in its interior [13]. Similar to

distribute the segments of the maximal local density to different layer groups. However, the MCP is NP-complete [10]. Thus, we resort to a simple, yet efficient heuristic by constructing a maximum spanning tree on the HCG and applying a k-coloring algorithm on this tree.
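The layer-partitioning heuristic (build the horizontal constraint graph, take a maximum spanning tree, then color the tree) can be sketched for the two-layer case. This is a simplified illustration under stated assumptions: the panel below is hypothetical and is not the panel of Fig. 6, spans are given as inclusive 0-based column ranges, k is fixed to 2, and the obstacle repair step is omitted.

```python
from collections import defaultdict, deque


def local_density(spans, n_cols):
    """Number of objects covering each column; spans are (first, last) inclusive."""
    dens = [0] * n_cols
    for lo, hi in spans.values():
        for c in range(lo, hi + 1):
            dens[c] += 1
    return dens


def build_hcg(spans, dens):
    """HCG edges (cost, u, v); cost = max local density over the overlap."""
    names = sorted(spans)
    edges = []
    for i, u in enumerate(names):
        for v in names[i + 1:]:
            lo = max(spans[u][0], spans[v][0])
            hi = min(spans[u][1], spans[v][1])
            if lo <= hi:
                edges.append((max(dens[lo:hi + 1]), u, v))
    return edges


def max_spanning_forest(names, edges):
    """Kruskal's algorithm with edges taken in non-increasing cost order."""
    parent = {n: n for n in names}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = defaultdict(list)
    for _w, u, v in sorted(edges, reverse=True):
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            tree[u].append(v)
            tree[v].append(u)
    return tree


def two_color(names, tree):
    """Breadth-first 2-coloring of a forest (linear time)."""
    color = {}
    for root in sorted(names):
        if root in color:
            continue
        color[root] = 0
        queue = deque([root])
        while queue:
            u = queue.popleft()
            for v in tree[u]:
                if v not in color:
                    color[v] = 1 - color[u]
                    queue.append(v)
    return color


# Hypothetical panel with 8 columns and four segments.
spans = {"s1": (0, 3), "s2": (2, 5), "s3": (4, 7), "s4": (6, 7)}
dens = local_density(spans, 8)
edges = build_hcg(spans, dens)
tree = max_spanning_forest(spans, edges)
color = two_color(spans, tree)
layers = {c: {n: spans[n] for n in spans if color[n] == c} for c in (0, 1)}
panel_density = {c: max(local_density(layers[c], 8)) for c in layers}
```

For this panel the overlapping pairs are split across the two groups, so each layer ends with a panel density of 1 while the unsplit panel has a density of 2. Because only tree edges are guaranteed to be cut, a non-tree HCG edge may still connect two objects of the same color in denser panels.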
Note that the k-coloring algorithm on a tree can be solved in linear time. Fig. 6 (c) shows a layer-partitioning result of Fig. 6 (a), where s1, s2, s3 and s6 are partitioned as one layer group, and o2, s4, s5 and o1 are partitioned as another one. Note that the objects o1, s3, s5, and s6 at columns c9 and c10 that induce the maximum local density are separated into two different layer groups.

At the last step, since obstacles are already in fixed layers, we apply a minimum-impact repair procedure for obstacles. If an obstacle is not placed in the right layer (e.g., o1 in Fig. 6 (c)), the layer of a vertex v_o of an obstacle is exchanged with that of a vertex v_s of a segment such that the edge cost (v_o, v_s) is the maximum among the edges connected with v_o in the maximum spanning tree. If there does not exist such a vertex v_s, we can just assign v_o to the

the Voronoi diagram, the standard deviation for the size of triangles in DT can reflect the distribution of these points. Thus, we can represent each segment by three points, two end points and one center point, and analyze the corresponding DT of these points.

Before performing the IDTA algorithm, we first model the distribution of segments and obstacles in each neighboring panel into an artificial segment lying on the boundary of A. In order to reflect the distribution of objects in a neighboring panel A_n of A, we set the length of an artificial segment as the average occupied length per track in A_n, and the center of this artificial segment is determined by the center of gravity of all segments and obstacles in A_n.

Fig. 7 shows the IDTA algorithm. Without loss of generality, we discuss the track assignment at a row panel, and the case for a column panel is similar. For the track assignment problem, the x-coordinates

IDTA algorithm.

Algorithm: IDTA
Input:  A /* The panel */
        S /* A set of segments */
        O /* A set of fixed obstacles */
        s_u, s_b /* The artificial segments */
Output: T /* The assignment configuration */
for each segment s_i ∈ S
    Compute the flexibility of s_i, ξ(s_i);
T ← ∅;
Construct an initial point set P based on O ∪ {s_u, s_b};
Construct an initial DT of P;
while S is not empty
    Choose the segment s_j with the smallest flexibility;
    Determine track(s_j) such that the maximum area difference among the introduced triangles is minimum;
    T ← T ∪ {s_j, track(s_j)};
    Add the points introduced by s_j into P;
    Update DT incrementally;
    S ← S − {s_j};
    for each s_k ∈ S overlapping s_j
        Update ξ(s_k);
Return T;

Fig. 7. The Incremental Delaunay-triangulation-based Track Assignment (IDTA) algorithm.

Theorem 2: The IDTA algorithm runs in O(|S| lg |S|) time, where |S| is the number of segments in a panel.

Fig. 8 shows a track assignment example. Fig. 8 (a) is the initial DT including only obstacles and artificial segments, and Figs. 8 (b), (c), (d) are the assignment results of s3, s2, and s1, respectively. The flexibilities of unassigned segments are listed on the right side of the figures. Note that each time a segment is assigned, the flexibilities of unassigned segments are incrementally updated.

After the track assignment, the actual track position of a segment is known. Thus, we can perform classical segment-to-segment maze routing in the detailed routing stage to connect shorter nets which span at most two routing tiles, and the whole routing process is finished.

[Fig. 8 (a): segments s1, s2, s3, obstacle o1 on track 3, artificial segments s_u and s_b, tracks 1 to 4; ξ(s1) = 4.5, ξ(s2) = 5, ξ(s3) = 3.125. Fig. 8 (b): s3 assigned; ξ(s1) = 4.5, ξ(s2) = 4.]

of segments are fixed (i.e., the segments in row panels can only move in the vertical direction), so we can focus on the y direction. At the beginning, we define the flexibility of a segment s_i as

$$\xi(s_i) = t_i + \frac{1}{\ell_i},$$

where t_i is the number of assignable tracks of s_i, and ℓ_i is the length of s_i.
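The flexibility-driven ordering can be illustrated with a short sketch, reading the flexibility as ξ(s_i) = t_i + 1/ℓ_i. The track counts and lengths below are hypothetical values chosen only so that the resulting flexibilities match those listed in Fig. 8 (4.5, 5, and 3.125); the overlap relation is likewise assumed, and the Delaunay-triangulation-based track selection is left out entirely. The sketch also simplifies the update rule by decrementing every unassigned segment that overlaps the one just assigned.

```python
def flexibility(n_tracks, length):
    """xi(s) = t + 1 / l: fewer assignable tracks or a longer segment
    gives a smaller flexibility, so the segment is handled earlier."""
    return n_tracks + 1.0 / length


def assignment_order(flex, overlaps):
    """Repeatedly pick the unassigned segment with the smallest flexibility.

    After a segment is picked, each unassigned overlapping segment loses
    one assignable track, i.e. its flexibility drops by 1 (simplified rule).
    """
    flex = dict(flex)  # work on a copy
    order = []
    while flex:
        s = min(flex, key=flex.get)
        order.append(s)
        del flex[s]
        for other in overlaps.get(s, ()):
            if other in flex:
                flex[other] -= 1
    return order


# Hypothetical (t, l) pairs reproducing the flexibilities shown in Fig. 8 (a).
flex = {
    "s1": flexibility(4, 2),  # 4.5
    "s2": flexibility(4, 1),  # 5.0
    "s3": flexibility(3, 8),  # 3.125
}
overlaps = {"s3": ["s2"], "s2": ["s3"]}  # assumed: only s2 and s3 overlap
order = assignment_order(flex, overlaps)
```

Under these assumptions s3 is assigned first, the flexibility of s2 drops from 5 to 4, and s2 is therefore assigned before s1, which is the order s3, s2, s1 shown in Fig. 8.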
Since the x-coordinate of s_i is fixed, t_i can easily be computed. If the flexibility of s_i is smaller, which means that s_i might have longer length or less space to insert, then s_i should be assigned first.

After the flexibility computation, we construct an initial DT that includes only the obstacles and two artificial segments. Each segment or obstacle is represented as three points, its left-end, center, and right-end points. Fig. 8 (a) shows the initial DT. The construction of DT takes O(|P| lg |P|) time, where |P| is the number of points. Note that a DT can be updated incrementally; if a new point is added into an existing DT, we only need to update the triangles introduced by this new point. Therefore, the process can be performed very efficiently. The update will be frequently used in the following steps.

Lemma 1: Adding a new point into an existing Delaunay triangulation of |P| points takes O(lg |P|) time.

Segments are assigned sequentially in the non-decreasing order of their flexibilities. Suppose segment s_j has the smallest flexibility among all unassigned segments, then we assign s_j to a proper track. In order to minimize the area difference among all triangles, the track which results in a DT with smaller area difference is preferred.

After assigning s_j to the track track(s_j), we need to update the DT and the flexibility of segments. Since we can incrementally update the DT, only the new triangles introduced by s_j need to be re-generated. Only the segments that overlap s_j and are originally assignable to track(s_j) need to update their values of flexibility. For

[Fig. 8 (c): s2 assigned; ξ(s1) = 4.5. Fig. 8 (d): s1 assigned. Legend: segment, artificial segment, layer 1 obstacle.]

Fig. 8. A density-driven track assignment example. (a) The initial Delaunay triangulation. (b) Track assignment for segment s3. (c) Track assignment for segment s2. (d) Track assignment for segment s1.

IV.
E XPERIMENTAL R ESULTS those segments, the new flexibility would be the original flexibility The TTR routing system was implemented in the C++ program- minus 1. The number of segments overlapping with sj is bounded ming language on a 1.2 GHz SUN Blade-2000 workstation with 8 by j × tj , which is bounded by the constant size of the panel; GB memory. We used the LEDA packages to compute the Voronoi here, j is a value, and tj is bounded by the number of tracks in a diagrams and Delaunay triangulation. We conducted the experiments panel, which is predetermined before the routing and is around 10– based on the 11 MCNC routing benchmarks [3] (these designs have 20 in our implementation. Therefore, the total time complexity of 3–4 routing layers and contain up to 28K connections) and 5 real updating DT and the flexibilities of segments is O(lg |S|), and we industrial Faraday benchmarks introduced in [1]. (See Table I for have the following theorem for the overall time complexity of the the statistics of the Faraday benchmarks.) In our implementation, the parameter α in Eq. (1) was set to 0.5, and the parameters β, κp , κn , and the horizontal wire-crossing maps. The experimental results Bl , and Bu in Eq. (2) for all benchmarks were given as 0.5, 2, -2, consistently show the superior effectiveness and efficiency of our 10%, and 40%, respectively. routing algorithm and framework in wire density control. TABLE I Vertical Wire Crossing 30 T HE FARADAY BENCHMARK CIRCUITS . 25 Circuit Size (µm2 ) #Layers #Nets #Connections #Pins 20 DMA 408.4×408.4 6 13256 36162 73982 15 DSP1 706×706 6 28447 63495 144872 DSP2 642.8×642.8 6 28431 36686 144703 10 RISC1 1003.6×1003.6 6 34034 95106 196677 5 RISC2 959.6×959.6 6 34034 95099 196670 0 (a) (b) We compared the proposed two-pass, top-down routing framework Vertical Wire Crossing of TTR with the grid-based full-chip multilevel router considering 30 balanced routing density in [20] (named MROR). 
The MROR pro- 25 gram was provided by the authors of [20] and was run on the same 20 machine. For fair comparison, TTR used the same setting for the size 15 of routing tiles in all benchmarks as MROR. Note that as reported 10 in [20], MROR achieves better solutions than the previous work [3], 5 and thus we shall directly compare TTR with MROR. 0 (c) (d) In addition, we also examined the effects of the Voronoi-diagram- based density critical area analysis (CAA) in TTR by comparing Vertical Wire Crossing 30 with the minimum-pin density routing algorithm presented in [9]. 25 Note that in [9], the authors applied their algorithm in an ILP- based global router called BoxRouter [8]. Therefore, to focus on the 20 comparison of the two CAA algorithms, we integrated the minimum- 15 pin density routing algorithm into TTR. In other words, we removed 10 the prerouting of TTR and replaced the cost function of the global 5 router in Eq. (2) by the minimum-pin density routing algorithm. 0 (e) (f) Tables II and III show the comparison results on the MCNC and Faraday benchmarks, respectively. Note that since the MROR Fig. 9. The routing result and the vertical wire-crossing map in tiles for program can only handle the designs with all pins lying in layer 1 “S13207.” (The red, green, and blue lines represent metals 1, 2, and 3, (as in the MCNC benchmarks), we did not conduct the experiments respectively) (a) and (b) The routing layout and its vertical wire crossing on the Faraday benchmarks (where pins are distributed between of MROR [20]. The maximum vertical wire crossing is 27. (c) and (d) The layers 1 and 3) for MROR. In the tables, we used the same routing layout and its vertical wire crossing obtained from the minimum-pin density global routing [9] + TTR’s routing framework. The maximum vertical metrics as those in [20] which can evaluate the uniformity of wire wire crossing is 13. 
(e) and (f) The routing layout and its vertical wire crossing distribution in the routing stage, where “Rout.” stands for routability, of TTR (Ours). The maximum vertical wire crossing is only 11. “#Netmax ” denotes the maximum number of nets crossing a level-0 tile, “#Netavg h ” represents the average number of nets horizontally V. C ONCLUSIONS crossing a tile and “σh ” gives its standard deviation, and “#Netavg v ” We have presented a new two-pass, top-down full-chip grid-based gives the average number of nets vertically crossing a tile and “σv ” router, named TTR, considering wire density for CMP variation gives its standard deviation. For the TTR routing systems, “#LG” control. TTR features a new Voronoi-diagram-based density critical denotes the total number of layer groups for the layer assignment, area analyzer, a planarization-aware global router, a layer assigner and “#Seg” shows the total number of segments. for panel-density minimization, and an effective track assigner based As shown in the tables, all routers obtain 100% routing com- on the incremental Delaunay triangulation. Experimental results have pletion on the MCNC benchmarks, and both routers applying the shown the effectiveness and efficiency of the proposed methods. new framework of TTR outperform the multilevel router MROR in wire uniformity. Compared with MROR, TTR incorporated with R EFERENCES the minimum-pin density global routing algorithm reduces #Netmax , [1] S. N. Adya, S. Chaturvedi, J. A. Roy, D. Papa, and I. L. Markov, “Unification #Netavg v , and #Netavg h by 32%, 28%, 26% respectively, and of Partitioning, Floorplanning and Placement,” Proc. ICCAD, pp. 550–557, Nov. TTR with Voronoi-diagram-based CAA can achieve 43%, 34%, 2004. [2] S. H. Batterywala, N. Shenoy, W. Nicholls, and H. Zhou, “Track Assignment: 36% reductions on #Netmax , #Netavg v , and #Netavg h respectively. 
A Desirable Intermediate Step Between Global Routing and Detailed Routing,” Moreover, the routers using the TTR framework also result in at Proc. ICCAD, pp. 59–66, Nov. 2002. least 35% smaller standard deviations of wire distribution in both [3] Y.-W. Chang and S.-P. Lin, “MR: A New Framework for Multilevel Full-Chip Routing,” IEEE TCAD, vol. 23, no. 5, pp. 793–800, May 2004. directions (which implies better density smoothness) than MROR. [4] T.-C. Chen and Y.-W. Chang, “Multilevel Gridless Routing Considering Optical The results on the Faraday benchmarks also show that the global Proximity Correction,” Proc. ASP-DAC, pp. 1160–1163, Jan. 2005. [5] T.-C. Chen, Y.-W. Chang, and S.-C. Lin, “A Novel Framework for Multilevel routing guided by the Voronoi-diagram-based CAA can achieve better Full-Chip Gridless Routing,” Proc. ASP-DAC, pp. 636–641, Jan. 2006. wire uniformity than the minimum-pin density global router. Fig. 9 [6] H.-Y. Chen, M.-F. Chiang, Y.-W. Chang, L. Chen, and B. Han, “Novel Full-Chip shows the routing layouts of “S13207” and the corresponding wire- Gridless Routing Considering Double-Via Insertion,” Proc. DAC, pp. 755–760, Jul. 2006. crossing maps in the vertical direction for the aforementioned three [7] Y. Chen, P. Gupta, and A. B. Kahng, “Performance-Impact Limited Area Fill routers, and Fig. 10 shows the results for the Faraday circuit “RISC1” Synthesis,” Proc. DAC, pp. 22–27, Jun. 2003. TABLE II C OMPARISON FOR THE WIRE DENSITY CONTROL ON THE MCNC BENCHMARKS . 
MROR [20] Minimum pin density global routing [9] + TTR's routing framework TTR (Ours) Circuit CPU CPU CPU #Netmax #Netavg_v #Netavg_h ıv ıh #LG #Seg #Netmax #Netavg_v #Netavg_h ıv ıh #LG #Seg #Netmax #Netavg_v #Netavg_h ıv ıh (sec) (sec) (sec) Mcc1 45 9.9 11.3 7.6 7.3 77.4 124 2600 41 10.3 11.1 5.1 7.6 36.1 124 2639 30 10.3 11.0 5.9 6.4 33.4 Mcc2 96 18.7 20.9 17.3 18.5 2714.9 256 15814 119 20.6 22.2 14.4 19.6 798.0 256 16644 87 20.5 22.2 13.9 16.0 645.0 Struct 7 1.4 1.4 1.1 1.6 61.4 193 2128 5 1.2 0.8 0.9 0.8 66.8 167 2124 6 1.1 0.8 1.1 1.0 58.2 Primary1 15 0.7 0.6 1.2 1.8 69.1 328 2423 12 0.8 0.7 0.9 1.4 27.0 215 2207 6 0.7 0.3 0.9 0.8 24.3 Primary2 25 2.1 1.9 1.6 4.5 322.2 387 8338 22 2.5 1.9 1.3 2.8 144.0 303 7693 8 1.8 0.9 1.3 1.6 131.0 S5378 15 4.4 3.5 3.4 2.1 4.5 87 1091 8 2.5 2.4 1.6 1.5 8.1 91 1193 9 2.5 2.4 1.8 1.5 8.2 S9234 14 4.0 2.6 3.2 1.6 3.2 95 912 7 1.7 1.6 1.4 1.3 5.2 95 1003 9 1.7 1.6 1.6 1.2 5.4 S13207 27 9.3 5.9 5.2 2.8 15.8 97 1727 13 3.4 3.0 2.1 1.8 24.8 97 1821 11 3.3 3.0 2.3 1.7 24.2 S15850 26 10.3 7.4 5.4 2.9 23.8 97 1834 12 4.0 3.8 2.3 1.9 34.2 97 1915 13 3.9 3.8 2.4 1.9 33.5 S38417 23 7.3 4.3 4.4 2.2 54.2 188 5043 10 3.0 2.4 1.8 1.4 62.5 188 5462 11 2.9 2.4 2.0 1.4 62.4 S38584 29 9.1 5.8 5.4 2.9 137.7 189 6004 16 3.3 3.1 2.3 1.6 112.0 189 6328 15 3.3 3.1 2.3 1.6 112.0 Comp. 1.00 1.00 1.00 1.00 1.00 1.00 - - 0.68 0.72 0.74 0.59 0.65 1.01 - - 0.57 0.66 0.64 0.64 0.65 0.98 TABLE III C OMPARISON FOR THE WIRE DENSITY CONTROL ON THE INDUSTRIAL FARADAY BENCHMARKS . Minimum pin density global routing [9] + TTR's routing framework TTR (Ours) Circuit CPU CPU Rout. #LG #Seg #Netm ax #Netavg_v #Netavg_h ıv ıh Rout. 
#LG #Seg #Netm ax #Netavg_v #Netavg_h ıv ıh (sec) (sec) DMA 99.19% 272 5168 14 3.14 2.77 1.70 1.77 48.8 99.29% 272 5325 10 3.08 2.70 1.75 1.64 47.0 DSP1 99.11% 264 4241 11 2.91 2.50 1.95 1.89 124.2 99.18% 263 4529 10 2.85 2.44 2.24 1.95 117.3 DSP2 99.10% 268 4676 14 2.78 2.78 1.71 1.92 87.2 99.06% 268 4892 10 2.72 2.70 1.90 1.91 82.3 RISC1 99.16% 265 5864 21 3.63 3.79 2.95 3.78 355.3 99.16% 265 6226 17 3.59 3.73 3.08 3.29 333.4 RISC2 99.23% 260 6141 21 3.64 3.70 2.55 3.08 297.4 99.19% 260 6533 13 3.59 3.62 2.77 2.89 280.0 Comp. 99.16% - - 1.00 1.00 1.00 1.00 1.00 1.00 99.18% - - 0.75 0.98 0.98 1.08 0.95 0.95 Horizontal Wire Crossing [8] M. Cho, and D. Z. Pan, “A New Global Router Based on Box Expansion and Progressive ILP,” Proc. DAC, pp. 373–378, Jul. 2006. [9] M. Cho, D. Z. Pan, H. Xiang, and R. Puri, “Wire Density Driven Global Routing for CMP Variation and Timing,” Proc. ICCAD, pp. 487–492, Nov. 2006. [10] J. D. Cho, S. Raje, and M. Sarrafzadeh, “Approximation for the Maximum Cut, k-Coloring and Maximum Linear Arrangement Problems,” Manuscript, Dept. of EECS, Northwestern Univ., 1993. [11] J. Cong, J. Fang, and K. Y. Khoo, “DUNE–A Multilayer Gridless Routing System,” IEEE TCAD, vol. 20, no. 5, pp. 633–647, May 2001. [12] J. Cong, M. Xie, and Y. Zhang, “An Enhanced Multilevel Routing System,” Proc. ICCAD, pp. 51–58, Nov. 2002. [13] M. de Berg, M. van Kreveld, M. Overmars, and O. Schwarzkopf, Computational Geometry: Algorithms and Applications, Springer, 1997. (a) (b) [14] T.-Y. Ho, Y.-W. Chang, S.-J. Chen, and D.-T. Lee, “Crosstalk- and Performance- Driven Multilevel Full-Chip Routing,” IEEE TCAD, vol. 24, no. 6, pp. 869–878, Horizontal Wire Crossing Jun. 2005. [15] T.-Y. Ho, C.-F. Chang, Y.-W. Chang, and S.-J. Chen, “Multilevel Full-Chip Routing for the X-Based Architecture,” Proc. DAC, pp. 597–602, Jun. 2005. [16] A. B. Kahng, G. Robins, A. Singh, and A. Zelikovsky, “Filling Algorithms and Analyses for Layout Density Control,” IEEE TCAD, vol. 18, no. 4, pp. 
445–462, Apr. 1999. [17] R. Kastner, E. Bozorgzadeh, and M. Sarrafzadeh, “Pattern Routing: Use and Theory for Increasing Predictability and Avoiding Coupling,” IEEE TCAD, pp. 777–790, Nov. 2002. [18] A. Kurokawa, T. Kanamoto, T. Ibe, A. Kasebe, W. F. Chang, T. Kage, Y. Inoue, and H. Masuda, “Dummy Filling Methods for Reducing Interconnect Capacitance and Number of Fills,” Proc. ISQED, pp. 586–591, Mar. 2005. [19] K.-S. Leung, “SPIDER: Simultaneous Post-Layout IR-Drop and Metal Density Enhancement with Redundant Fill,” Proc. ICCAD, pp. 33–38, Nov. 2005. [20] K. S.-M. Li, Y.-W. Chang, C.-L. Lee, C. Su, and J. E. Chen, “Multilevel Full-Chip (c) (d) Routing with Testability and Yield Enhancement,” IEEE TCAD, Sep. 2007. [21] E. Papadopoulou and D. T. Lee, “Critical Area Computation via Voronoi Fig. 10. The routing result and the horizontal wire-crossing map in tiles for Diagrams,” IEEE TCAD, vol. 18, no. 4, pp. 463–474, Apr. 1999. “RISC1.” (The red, green, blue, magenta, coffee, and aqua blue lines represent [22] T. H. Park, “Characterization and Modeling of Pattern Dependencies in Copper metals 1, 2, 3, 4, 5, and 6 respectively, and the white space is allocated by Interconnects for Integrated Circuits,” Ph.D. Dissertation, Dept. of EECS, MIT, 7 macros.) (a) and (b) The routing layout and its horizontal wire crossing May 2002. [23] R. Tian, D. F. Wong, and R. Boone, “Model-Based Dummy Feature Placement for obtained from the minimum-pin density global routing [9] + TTR’s routing Oxide Chemical-Mechanical Polishing Manufacturability,” Proc. DAC, pp. 667– framework. The maximum horizontal wire crossing is 21. (c) and (d) The 670, Jun. 2000. routing layout and its horizontal wire crossing of TTR (Ours). The maximum [24] Taiwan Semiconductor Manufacturing Company (TSMC), Reference Flows 7.0. horizontal wire crossing is only 17. [25] X. Wang, C. C. Chiang, J. Kawa, and Q. 
Su, “A Min-Variance Iterative Method for Fast Smart Dummy Feature Density Assignment in Chemical-Mechanical Polishing,” Proc. ISQED, pp. 258–263, Mar. 2005. [26] D. White and B. Moore, “An ‘Intelligent’ Approach to Dummy Fill,” EE Times, Jan. 3, 2005.
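
As a supplementary illustration of the flexibility-ordered loop of the IDTA algorithm (Fig. 7), the following is a minimal Python sketch, not the authors' C++/LEDA implementation. It covers only the flexibility ξ(s_i) = t_i + 1/ℓ_i, the smallest-flexibility-first ordering, and the update of overlapping segments. The Delaunay-triangulation-based track choice is deliberately replaced by a placeholder rule (pick the lowest-numbered assignable track), and the names `Segment`, `flexibility`, `overlaps`, and `assign_tracks`, as well as the segment coordinates, are hypothetical. The coordinates are chosen only so that the initial flexibilities match those listed in Fig. 8 (a).

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Set


@dataclass
class Segment:
    """A horizontal segment in a row panel (hypothetical representation)."""
    name: str
    left: int          # x-coordinate of the left end
    right: int         # x-coordinate of the right end (assumed > left)
    tracks: Set[int]   # indices of the tracks this segment may still use

    @property
    def length(self) -> int:
        return self.right - self.left


def flexibility(seg: Segment) -> float:
    """xi(s_i) = t_i + 1 / l_i: number of assignable tracks plus inverse length."""
    return len(seg.tracks) + 1.0 / seg.length


def overlaps(a: Segment, b: Segment) -> bool:
    """True if the x-spans of a and b share more than a single point."""
    return a.left < b.right and b.left < a.right


def assign_tracks(segments: List[Segment],
                  choose_track: Callable[[Set[int]], int] = min) -> Dict[str, int]:
    """Assign segments in non-decreasing order of flexibility.

    choose_track is a stand-in for the paper's Delaunay-based criterion;
    the default simply takes the lowest-numbered assignable track.
    """
    pending = list(segments)
    result: Dict[str, int] = {}
    while pending:
        sj = min(pending, key=flexibility)   # least flexible segment first
        if not sj.tracks:
            raise ValueError(f"no assignable track left for {sj.name}")
        track = choose_track(sj.tracks)
        result[sj.name] = track
        pending.remove(sj)
        # Only segments that overlap sj and could have used the same track
        # lose that track, i.e., their flexibility drops by exactly 1.
        for sk in pending:
            if overlaps(sk, sj):
                sk.tracks.discard(track)
    return result


if __name__ == "__main__":
    # Hypothetical layout: four tracks, s3 blocked on track 3 by an obstacle.
    segs = [
        Segment("s1", 0, 2, {1, 2, 3, 4}),   # xi = 4 + 1/2 = 4.5
        Segment("s2", 5, 6, {1, 2, 3, 4}),   # xi = 4 + 1/1 = 5
        Segment("s3", 2, 10, {1, 2, 4}),     # xi = 3 + 1/8 = 3.125
    ]
    print({s.name: flexibility(s) for s in segs})
    print(assign_tracks(segs))
```

With these assumed inputs, s3 is assigned first, which lowers the flexibility of the overlapping s2 from 5 to 4 while leaving s1 at 4.5, the same sequence of flexibility values as in Fig. 8. The chosen track indices differ from the figure because the area-balancing criterion is not modeled; passing a different `choose_track` function is the hook where such a criterion would go.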