CN116595938A - Layout method, system and integrated circuit of pipeline register - Google Patents

Info

Publication number
CN116595938A
Authority
CN
China
Prior art keywords
block
path
port
blocks
register
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202310862159.3A
Other languages
Chinese (zh)
Other versions
CN116595938B (en
Inventor
潘新阁
黄现
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Taorun Semiconductor Co ltd
Original Assignee
Shanghai Taorun Semiconductor Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Taorun Semiconductor Co ltd filed Critical Shanghai Taorun Semiconductor Co ltd
Priority to CN202310862159.3A priority Critical patent/CN116595938B/en
Publication of CN116595938A publication Critical patent/CN116595938A/en
Application granted granted Critical
Publication of CN116595938B publication Critical patent/CN116595938B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00Computer-aided design [CAD]
    • G06F30/30Circuit design
    • G06F30/39Circuit design at the physical level
    • G06F30/392Floor-planning or layout, e.g. partitioning or placement
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00Computer-aided design [CAD]
    • G06F30/30Circuit design
    • G06F30/39Circuit design at the physical level
    • G06F30/398Design verification or optimisation, e.g. using design rule check [DRC], layout versus schematics [LVS] or finite element methods [FEM]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2115/00Details relating to the type of the circuit
    • G06F2115/06Structured ASICs
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Geometry (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Architecture (AREA)
  • Semiconductor Integrated Circuits (AREA)
  • Design And Manufacture Of Integrated Circuits (AREA)

Abstract

The application discloses a layout method, a system and an integrated circuit of a pipeline register. The method comprises the following steps: dividing the integrated circuit module into a plurality of blocks, wherein an initial size of each block is constrained by a first distance; determining a communication path between the first port and the second port according to the positions of the blocks, wherein the communication path is the path that passes through the fewest path blocks when the first port and the second port are connected; laying out pipeline registers in the path blocks traversed by the communication path, wherein each path block comprises at least one register; and adjusting the initial size of the path blocks according to the register timing constraint condition, so that the EDA tool automatically optimizes the register positions in the path blocks and the timing of the pipeline register converges. The method reduces unnecessary pipeline detours and register accumulation in pipeline register layout.

Description

Layout method, system and integrated circuit of pipeline register
Technical Field
The application relates to the field of integrated circuits, in particular to a layout method, a system and an integrated circuit of a pipeline register.
Background
As high-performance chips develop, process technology becomes more advanced, operating frequencies rise, integration density increases, chip sizes grow, and communication relationships become more complex. This is especially true of high-performance chips such as communication chips and graphics processing units (GPUs). Because of this very high complexity, the requirements on PPA optimization, that is, performance, power, and area, keep rising, and completing a chip design with tools alone faces more and more challenges.
For example, for chips with large scale, complex port connection, and high operating frequency requirements, how to achieve timing convergence of chip design and achieve better PPA is a significant challenge.
Disclosure of Invention
The application provides a layout method, a system and an integrated circuit of a pipeline register. The integrated circuit module is divided into a plurality of blocks, and the shortest pipeline path is searched for among the blocks, so that registers are placed at suitable positions during pipeline register layout, thereby reducing unnecessary pipeline detours and register accumulation.
Specifically, the technical scheme of the application is as follows:
In a first aspect, the present application discloses a method for laying out pipeline registers, wherein the method is used for laying out pipeline registers of an integrated circuit module, the integrated circuit module comprises a plurality of ports, the plurality of ports comprise a first port and a second port, and the method comprises:
dividing the integrated circuit module into a plurality of blocks, wherein an initial size of each block is constrained by a first distance;
determining a communication path between the first port and the second port according to the positions of the plurality of blocks; the communication path is the path that passes through the fewest path blocks when the first port and the second port are connected, and the path blocks comprise a starting block, an ending block and M intermediate blocks, wherein M >= 0; the starting block is adjacent to the first port, the ending block is adjacent to the second port, and the M intermediate blocks are sequentially connected between the starting block and the ending block;
laying out the pipeline registers in the path blocks through which the communication paths pass, wherein each path block comprises at least one register;
and adjusting the initial size of the path block according to the register timing constraint condition so as to enable the timing sequence of the pipeline register to be converged.
In some implementations, the plurality of blocks are the same size and shape.
In some implementations, the plurality of blocks are rectangles, and a side length of each rectangle is constrained by the first distance.
In some embodiments, determining the communication path between the first port and the second port according to the positions of the plurality of blocks includes:
traversing all paths between the starting block and the ending block according to the positions of the blocks to obtain a path set;
and taking the shortest path in the path set as the communication path.
In some implementations, the adjusting the initial size of the path block according to the register timing constraint such that the timing of the pipeline register converges includes:
under the constraint of the distance range of the register timing convergence, adjusting the initial size of all or part of the path block so that an EDA tool automatically optimizes the register positions in the path block;
and acquiring the size of the path block and the positions of the registers after adjustment, verifying whether the distance between any two registers meets the time sequence constraint, and if not, performing readjustment until the time sequence of the pipeline registers converges.
In a second aspect, the present application discloses a pipeline register layout system, wherein the pipeline register layout system is used for an integrated circuit module, the integrated circuit module includes a plurality of ports, the plurality of ports includes a first port and a second port, and the system includes:
a dividing module for dividing the integrated circuit module into a plurality of blocks, wherein the initial size of each block is limited by a first distance;
a determining module, configured to determine a communication path between the first port and the second port according to the positions of the plurality of blocks; the communication path is the path that passes through the fewest path blocks when the first port and the second port are connected, and the path blocks comprise a starting block, an ending block and M intermediate blocks, wherein M >= 0; the starting block is adjacent to the first port, the ending block is adjacent to the second port, and the M intermediate blocks are sequentially connected between the starting block and the ending block;
a layout module for laying out the pipeline registers in the path blocks through which the communication paths pass, wherein each path block includes at least one register;
and an adjusting module for adjusting the initial size of the path block according to the register timing constraint condition so that the timing of the pipeline register converges.
In some implementations, the plurality of blocks are the same size and shape.
In some embodiments, the determining module is specifically configured to:
traversing all paths between the starting block and the ending block according to the positions of the blocks to obtain a path set;
and taking the shortest path in the path set as the communication path.
In a third aspect, the present invention discloses a pipeline register layout system, which is characterized by comprising a processor and a memory, wherein the memory stores instructions, and the processor calls the instructions to cause the processor to execute the pipeline register layout method according to any one of the embodiments.
In a fourth aspect, the present invention discloses an integrated circuit, comprising an integrated circuit module, the integrated circuit module comprising a plurality of ports, the plurality of ports comprising a first port and a second port, the integrated circuit module further comprising a pipeline register laid out by the layout method described in any of the above embodiments.
Compared with the prior art, the application has at least one of the following beneficial effects:
1. The application is mainly used to lay out a large number of pipeline registers with complex connection relationships in irregularly shaped modules of a high-performance chip, ensuring register timing convergence and improving the performance, power consumption and area of the chip.
2. The application avoids both the failure of automatic EDA tool layout to converge and the need to process a large number of registers one by one in a fully custom manner. It implements an optimized automatic routing algorithm that better guides the tool's layout, selects suitable positions for the registers, and prevents unnecessary pipeline detours and register accumulation.
3. Through the algorithm, a large number of pipeline registers can find reasonable physical positions faster, so that timing converges more easily, machine resources and run time are saved, and layout efficiency is improved.
Drawings
The above features, technical features, advantages and implementation of the present application will be further described in the following description of preferred embodiments with reference to the accompanying drawings in a clear and easily understood manner.
FIG. 1 is a schematic diagram of a pipeline register connecting a first port and a second port in the background art provided by the present application;
FIG. 2 is a schematic diagram illustrating the detour of a pipeline register during routing in the prior art provided by the present application;
FIG. 3 is a flow chart of a method of layout of pipeline registers in an embodiment of the present application;
FIG. 4 is a schematic diagram of an integrated circuit module after block division according to an embodiment of the present application;
FIG. 5 is a schematic diagram showing the effect of adjusting the initial size of the block after the pipeline register layout in the embodiment of the present application;
FIG. 6 is a block diagram of a software architecture of a pipeline register layout system according to an embodiment of the present application;
FIG. 7 is a schematic diagram of a hardware architecture of a pipeline register layout system according to an embodiment of the present application.
Detailed Description
In the following description, for purposes of explanation and not limitation, specific details are set forth, such as particular system architectures, techniques, etc. in order to provide an understanding of the embodiments of the application. It will be apparent, however, to one skilled in the art that the present application may be practiced in other embodiments that depart from these specific details. In other instances, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so as not to obscure the description of the present application with unnecessary detail.
It will be understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
For simplicity, the figures show only schematically the parts relevant to the present application; they do not represent the actual structure of a product. In addition, to simplify the drawings and aid understanding, where several components have the same structure or function, some drawings show only one of them schematically; this does not exclude there being more than one.
It should further be understood that in the description of the application, the term "and/or" refers to any and all possible combinations of one or more of the associated listed items, and includes these combinations; for example, "A and/or B" covers A alone, B alone, or the combination of A and B.
In the description of the present application, unless explicitly stated and limited otherwise, the term "coupled" is to be interpreted broadly: the coupling may be fixed, detachable, or integral; it may be mechanical or electrical; and it may be direct, indirect through an intermediate medium, or an internal communication between two elements. The specific meaning of the above terms in the present application can be understood as appropriate by those of ordinary skill in the art.
In addition, in the description of the present application, the terms "first," "second," and the like are used merely to distinguish between the described objects and should not be construed as indicating or implying relative importance, order, or the like.
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the following description will explain the specific embodiments of the present application with reference to the accompanying drawings. It is evident that the drawings in the following description are only examples of the application, from which other drawings and other embodiments can be obtained by a person skilled in the art without inventive effort.
In chip design, a chip may be divided into a plurality of integrated circuit modules, each of which has a plurality of input/output ports, where "input/output ports" means that the ports are input ports, output ports, or both. In an integrated circuit module, the connection between two input/output ports (hereinafter referred to as ports) may be implemented by a pipeline (pipeline) register, which may be a multi-stage register, i.e., may include a plurality of registers. For example, fig. 1 is a schematic diagram of a pipeline register layout in an integrated circuit module according to an embodiment of the present application. As shown in fig. 1, a 2-stage pipeline register is arranged between the port 1 and the port 2, and an 11-stage pipeline register is arranged between the port 1 and the port 10. The layout of pipeline registers in the figure is merely illustrative, and is not intended to limit the present application, and other layout manners may be adopted in the actual layout, for example, the layout of pipeline registers between more ports, the layout of pipeline registers of other stages, etc., and the number of stages of pipeline registers between different ports is not limited, and may be the same or different. In addition, each port may support signaling of multiple bits (bits), and each bit may correspond to one of the pipeline registers.
The above integrated circuit modules are merely examples. In practical designs, there may be modules of larger scale, with more complex port connection relationships, irregular shapes, and higher operating frequency requirements (e.g., above 1.0 GHz). In pipeline register placement, the registers are often laid out with the default routing functions of electronic design automation (EDA) tools, followed by multiple rounds of iteration and optimization.
It can be seen that EDA tools may need to route modules that are large in scale, large in area and irregular in shape, which results in unnecessary pipeline detours. For example, routing may wind around the irregular parts, so that the pipeline registers cannot complete their path at the specified frequency.
For example, please refer to fig. 2, which is a schematic diagram of an integrated circuit module obtained by performing pipeline register layout according to the prior art. As shown in fig. 2, the path from the stage 3 to the stage 7 pipeline registers does not follow the more intuitive connection shown by the dotted line; instead, the unnecessary detour marked by the circles occurs. This wastes physical distance that the registers could otherwise cover, so that the pipeline registers cannot complete the path from bottom right to top left at the specified frequency.
In view of this, an embodiment of the present application provides a method for pipeline register layout. The integrated circuit module is divided into blocks to place an initial constraint on the possible layout positions of the pipeline registers, and the register positions are then further adjusted by adjusting the blocks until the pipeline register timing converges. This reduces unnecessary pipeline detours and register accumulation, and allows the pipeline registers to complete their paths at the specified frequency.
Reference is now made to FIG. 3, which is a flowchart illustrating a method for pipeline register layout according to an embodiment of the present application. The method is used for layout of pipeline registers of an integrated circuit module, the integrated circuit module comprises a plurality of ports, wherein a first port and a second port are any two ports of the plurality of ports which are connected through registers. As shown in fig. 3, the method comprises the steps of:
S100, dividing the integrated circuit module into a plurality of blocks, wherein the initial size of each block is constrained by a first distance.
S200, determining a communication path between the first port and the second port according to the positions of the blocks. The communication path is the path that passes through the fewest path blocks when the first port and the second port are connected, and the path blocks comprise a starting block, an ending block and M intermediate blocks, wherein M >= 0. The starting block is adjacent to the first port, the ending block is adjacent to the second port, and the M intermediate blocks are sequentially connected between the starting block and the ending block.
S300, the pipeline registers are distributed in the path blocks through which the communication paths pass, wherein each path block comprises at least one register.
S400, according to the time sequence constraint condition of the register, the initial size of the path block is adjusted, so that the time sequence of the pipeline register is converged.
The above method may be applied to the layout of pipeline registers between any two ports of an integrated circuit module, where the integrated circuit module may be an entire chip or integrated circuit, or may be part of a chip or integrated circuit for implementing specific functions of the chip or integrated circuit.
The method divides the integrated circuit module into a plurality of blocks, utilizes the initial layout of the block constraint register, and searches the shortest pipeline path so as to provide proper positions for the register layout in the pipeline register layout, thereby reducing unnecessary detours of the pipeline and register accumulation.
In another embodiment of the layout method of the present invention, in the above step S100, the integrated circuit module may be divided into a plurality of blocks according to the shape of the integrated circuit module. Alternatively, the integrated circuit module may be divided into a plurality of blocks according to a preset shape and initial size. For example, please refer to fig. 4 of the specification, which illustrates the effect of dividing the integrated circuit module into blocks. In this embodiment, the blocks have the same size and shape, which reduces the complexity of dividing the integrated circuit module and improves the dividing efficiency. In other embodiments, the blocks may differ in size and shape. For irregular integrated circuit modules, such blocks can fit the shape of the module better, reduce the leftover space, increase the number of layout positions, and provide more layout possibilities.
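The division in step S100 can be illustrated with a minimal Python sketch. It assumes equal square blocks and keeps a block when its center lies inside the module outline; the names `Block`, `divide_into_blocks` and `l_shape`, and the center-inside rule itself, are assumptions made for illustration and are not taken from the application.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    col: int      # column index in the grid
    row: int      # row index in the grid
    x: float      # lower-left corner, x
    y: float      # lower-left corner, y
    size: float   # side length of the square block


def divide_into_blocks(width, height, block_size, inside):
    """Tile the module's bounding box with equal squares and keep those
    whose center falls inside the (possibly irregular) module outline.
    `inside(x, y)` is a caller-supplied predicate (an assumption here)."""
    cols = math.ceil(width / block_size)
    rows = math.ceil(height / block_size)
    blocks = {}
    for row in range(rows):
        for col in range(cols):
            x, y = col * block_size, row * block_size
            cx, cy = x + block_size / 2, y + block_size / 2
            if inside(cx, cy):
                blocks[(col, row)] = Block(col, row, x, y, block_size)
    return blocks


def l_shape(x, y):
    """Hypothetical L-shaped module: a 4 x 4 box without its upper-right 2 x 2 corner."""
    return not (x > 2 and y > 2)


blocks = divide_into_blocks(4.0, 4.0, 1.0, l_shape)
print(len(blocks))  # 16 grid cells minus the 4 in the missing corner
```

Blocks of different sizes or shapes, which the text also allows, would need a different data structure than this uniform grid.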
In this embodiment, the first distance may be a preset distance obtained from historical empirical data. For example, it may be the average distance between adjacent register stages in previously laid-out pipeline registers; that is, the first distance may be derived from the routing experience of the relevant technician. As another example, the first distance may be obtained from the maximum distance at which timing closes between two adjacent register stages of the pipeline register, e.g. by taking that maximum distance or by leaving a margin below it. The present application does not limit the manner in which the first distance is obtained. The shape of the block is likewise not limited in the embodiments of the present application, and may be, for example, rectangular (including square), another polygon, a circle, or an irregular pattern. In some embodiments, the blocks have a regular shape, such as a rectangle, which simplifies the division into blocks, facilitates the subsequent layout and adjustment of registers, and improves the efficiency of the register layout design. That the initial size of a block is constrained by the first distance means that either the initial size of the block itself or the distance between adjacent blocks is constrained by the first distance. For example, in one embodiment, the distance from the center of a block to the center of its adjacent block does not exceed the first distance, or the product of the first distance and a coefficient. For example, if the block is a square with side length L, the distance from the center of the block to the center of an adjacent block is L when the two blocks share an edge (and √2·L when they touch only at a corner). In that case, the initial size of the block itself may also be considered not to exceed the first distance or the product of the first distance and a coefficient.
In another embodiment, the initial size of the block is constrained by the first distance, meaning that the initial size of the block does not exceed the first distance, e.g., the length and width of a rectangular block do not exceed the first distance, with some margin. For another example, the initial size of the block is constrained by the first distance, which means that the initial size of the block does not exceed the product of the first distance and a coefficient, for example, the diagonal length of a rectangular block does not exceed the product of the first distance and a coefficient, and the coefficient is determined according to the relationship between the diagonal and the length and width of the rectangle. For example, the diameter of the circular block does not exceed the first distance, leaving a margin. Or the radius of the circular block is not more than half of the first distance, leaving a certain margin.
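The size constraints described above reduce to simple geometry. The following Python sketch computes the largest block dimension allowed by a given first distance under three of the readings mentioned in the text (side length, diagonal, circle diameter). The function name `max_block_dimension`, the `shape` labels and the `margin` parameter are hypothetical; the application does not prescribe particular coefficients.

```python
import math


def max_block_dimension(first_distance, shape="square_side", margin=1.0):
    """Largest block dimension allowed by the first distance.

    margin < 1.0 leaves headroom below the first distance (assumed knob).
    """
    budget = first_distance * margin
    if shape == "square_side":
        # Side length itself must not exceed the budget.
        return budget
    if shape == "square_diagonal":
        # Diagonal of a square is sqrt(2) * side, so side <= budget / sqrt(2).
        return budget / math.sqrt(2)
    if shape == "circle_radius":
        # Diameter is 2 * radius, so radius <= budget / 2.
        return budget / 2
    raise ValueError(f"unknown shape: {shape}")


print(max_block_dimension(100.0))                      # side limit
print(max_block_dimension(100.0, "square_diagonal"))   # side limit from diagonal
print(max_block_dimension(100.0, "circle_radius", margin=0.9))
```

The diagonal case shows where the "coefficient" in the text can come from: for a square it is 1/√2 applied to the first distance.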
In the above step S200, the first port may be an input port and the second port an output port. Alternatively, the first port may be an output port and the second port an input port. In addition, a port that serves as an input port with respect to one port may serve as an output port with respect to another port, and vice versa.
In addition, several pipeline registers may extend from one input port and be connected to different output ports, where if there are pipeline registers between two ports, these two ports may be referred to as a group of ports. The optimal communication path can be found through the method when the pipeline registers among the ports in each group are wired.
In another implementation of this embodiment, any two adjacent blocks may be regarded as being connected. In any group of ports, the block closest to the input port is taken as the starting block, and the block closest to the output port is taken as the ending block. The M intermediate blocks are sequentially connected between the starting block and the ending block. Here M >= 0, that is, the number of intermediate blocks is greater than or equal to 0, and the maximum value of M is limited by the total number of blocks into which the integrated circuit module is divided.
All blocks are traversed, and the lengths of all feasible paths between the starting block and the ending block are counted. The path with the fewest path blocks, that is, the shortest path, is used as the communication path of the group of ports. By repeating these steps for every port group, the communication path of each group of ports is found.
Preferably, referring to fig. 5 of the specification, the pipeline register shown in the figure has 11 path blocks on its communication path. The 11 path blocks comprise a starting block, 9 intermediate blocks and an ending block, connected in sequence.
In another implementation of this embodiment, there are only 2 path blocks on the communication path of the pipeline register, namely a starting block and an ending block. The pipeline register has at least one register in each of the starting block and the ending block.
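The path search can be sketched in Python as follows. The text describes traversing all feasible paths and keeping the shortest; this sketch swaps in a breadth-first search, which finds a path with the fewest blocks without enumerating every path. It assumes blocks on a uniform grid identified by (column, row) and treats only edge-sharing blocks as adjacent; both are assumptions, as is the function name `shortest_block_path`.

```python
from collections import deque


def shortest_block_path(blocks, start, end):
    """Return a list of blocks from start to end that passes through the
    fewest blocks, or None if the two are not connected.
    `blocks` is a set of (col, row) grid coordinates."""
    if start not in blocks or end not in blocks:
        return None
    previous = {start: None}          # also serves as the visited set
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if current == end:
            path = []
            while current is not None:   # walk back to the start
                path.append(current)
                current = previous[current]
            return path[::-1]
        col, row = current
        for neighbor in ((col + 1, row), (col - 1, row),
                         (col, row + 1), (col, row - 1)):
            if neighbor in blocks and neighbor not in previous:
                previous[neighbor] = current
                queue.append(neighbor)
    return None


# Hypothetical L-shaped module: 4 x 4 grid without the upper-right 2 x 2 corner.
grid = {(c, r) for c in range(4) for r in range(4) if not (c >= 2 and r >= 2)}
path = shortest_block_path(grid, start=(3, 0), end=(0, 3))
print(len(path))  # starting block + 5 intermediate blocks + ending block
```

In this example the result has 7 path blocks, so M = 5. When the starting block and the ending block coincide, the function returns a single block, which corresponds to the one-block case described in the text.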
In another implementation of this embodiment, the communication path of the pipeline register includes only 1 path block. This path block is both the starting block and the ending block, and the first port is connected to the second port directly through it. In this case, the pipeline register needs at least two registers in the path block. This is because a pipeline register needs at least one register to store the current instruction state and one register to store the previous instruction state to ensure proper operation of the pipeline. In practical processor designs, more pipeline registers are typically included to support deeper pipelines and increase the efficiency of instruction execution.
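The register-count rules for the different path lengths can be collected into one check. This is a minimal Python sketch; the function name `register_counts_valid` and the list-of-counts representation are assumptions made for illustration.

```python
def register_counts_valid(counts):
    """counts[i] is the number of registers placed in the i-th path block,
    ordered from the starting block to the ending block."""
    if not counts:
        return False                 # a communication path has at least one block
    if len(counts) == 1:
        return counts[0] >= 2        # single block: at least two registers
    return all(n >= 1 for n in counts)  # otherwise: at least one per block


print(register_counts_valid([2]))        # single block with two registers
print(register_counts_valid([1]))        # single block with only one register
print(register_counts_valid([1, 1]))     # starting block and ending block
print(register_counts_valid([1, 0, 1]))  # an intermediate block left empty
```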
Thus, in some embodiments provided by the present application, the above S200 may include the steps of: traversing paths between the first port and the second port according to the positions of the blocks to obtain a path set, and taking the shortest path in the path set as the communication path between the first port and the second port.
In the above step S300, the path blocks are the blocks through which the communication path passes. For one pipeline register, the communication path passes through each path block at most once and never passes through the same path block repeatedly. Laying out the pipeline register in the path blocks through which the communication path passes means that the registers of the pipeline register are placed within the range of the path blocks. In one implementation, the registers are arranged uniformly across the path blocks, e.g., one register is placed in each path block, and the differences in separation distance between each register and its neighboring registers stay within a controllable range. That is, the relative position of the register in each path block is about the same, or the separation distance between neighboring registers is about the same. This simplifies the layout, improves efficiency, and facilitates subsequent adjustment of the register positions.
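The uniform placement described above can be sketched in Python. The sketch assumes equal square blocks on a grid, places one register at the center of each path block, and measures spacing as Manhattan distance on the assumption that wires are routed rectilinearly; the function names are hypothetical.

```python
def place_registers(path, block_size):
    """Put one register at the center of each path block."""
    return [((col + 0.5) * block_size, (row + 0.5) * block_size)
            for col, row in path]


def adjacent_spacings(positions):
    """Manhattan distance between each register and the next one on the path."""
    return [abs(x2 - x1) + abs(y2 - y1)
            for (x1, y1), (x2, y2) in zip(positions, positions[1:])]


path = [(0, 0), (1, 0), (1, 1)]
positions = place_registers(path, block_size=100.0)
spacings = adjacent_spacings(positions)
print(positions)
print(spacings)
```

Because every register sits at the same relative position in its block, all spacings equal the block side length, which is the "about the same separation distance" property the text aims for before fine adjustment.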
In another implementation of this embodiment, one or more path blocks may hold more than one register. For example, when the distance between two registers on a path is greater than the distance at which timing closes, and timing closure cannot be achieved by fine adjustment of the block length and width, an additional register may be placed. The additional register still complies with the routing constraint rule, that is, it is located on the communication path and within the range of the corresponding path block.
Each stage of the pipeline register is limited by timing, and the timing is related to factors such as noise interference between routed wires and the quality of the clock tree, so fine-tuning the initial sizes of blocks at different positions makes it easier for timing to converge. Therefore, in the above step S400, the initial size of the path block is adjusted according to the register timing constraint condition so that the timing of the pipeline register converges.
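How many additional registers a too-long gap needs is a small piece of arithmetic, sketched below in Python. It assumes a single maximum distance at which timing closes (`max_distance`) and spaces the extra registers evenly on the straight segment between the two existing ones; both the function names and the even spacing are assumptions, and a real flow would also have to keep each extra register on the communication path and inside its path block.

```python
import math


def extra_registers_needed(gap, max_distance):
    """Fewest extra registers so that every resulting segment is
    no longer than max_distance."""
    if max_distance <= 0:
        raise ValueError("max_distance must be positive")
    return max(0, math.ceil(gap / max_distance) - 1)


def insert_between(p, q, n_extra):
    """Evenly spaced positions for n_extra registers between p and q."""
    (x1, y1), (x2, y2) = p, q
    return [(x1 + (x2 - x1) * k / (n_extra + 1),
             y1 + (y2 - y1) * k / (n_extra + 1))
            for k in range(1, n_extra + 1)]


n = extra_registers_needed(gap=250.0, max_distance=100.0)
print(n)
print(insert_between((0.0, 0.0), (300.0, 0.0), 2))
```

A gap of 250 with a limit of 100 needs two extra registers, giving three segments of about 83 each.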
Specifically, when the initial size of the blocks is adjusted, the timing of each register of the pipeline register on the communication path can be verified iteratively to determine whether the timing of all the registers converges. In some embodiments, the initial size of all or some of the path blocks is adjusted under the constraint of the distance range for register timing convergence, so that the EDA tool automatically optimizes the register positions in the path blocks. The adjusted sizes of the path blocks and positions of the registers are then obtained, and it is verified whether the distance between any two registers meets the timing constraint; if not, the adjustment is repeated until the timing of the pipeline register converges. For example, if the separation distance between any two adjacent registers falls outside the target distance range for timing closure, the initial size of the corresponding block (e.g., the length and width of a rectangular block) is adjusted, so that the EDA tool automatically optimizes the register positions in the path block. Iterative verification is performed again after the adjustment, until the timing of every register of the pipeline register converges. The target distance range has an upper boundary and a lower boundary. When the separation distance exceeds the upper boundary, the initial size of the path blocks is adjusted so that the distance between adjacent path blocks decreases. When the separation distance falls below the lower boundary, the initial size of the path blocks is adjusted so that the distance between adjacent path blocks increases. In addition, changing the initial size of a path block may affect the initial sizes of the surrounding non-path blocks, so the initial sizes of those blocks may be adjusted at the same time.
In addition, adjusting the initial size of the path blocks between other groups of ports may affect the adjustment of the path blocks between the ports of the present group, so that ultimately the initial sizes of all of the blocks, or of only some of them, may be adjusted. Accordingly, if the initial size of a block is adjusted and registers between the ports of this group or of other groups fall outside the range of their block, the positions of all or some of the registers are adjusted as well. The adjustment may be stepwise or linear, and the application is not limited in this respect.
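The iterative adjustment described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: path blocks are modelled as a one-dimensional chain of widths, each block holds one register at its centre (a stand-in for the EDA tool's automatic placement), and the names `register_positions`, `adjust_block_sizes`, `step` and `max_iters` are hypothetical. The effect on neighbouring non-path blocks is not modelled.

```python
def register_positions(widths):
    """Place one register at the centre of each block in a 1-D chain.

    Assumption: this stands in for the EDA tool's automatic placement.
    """
    positions, left = [], 0.0
    for w in widths:
        positions.append(left + w / 2.0)
        left += w
    return positions


def adjust_block_sizes(widths, lower, upper, step=0.05, max_iters=1000):
    """Stepwise resize blocks until every adjacent-register gap is in [lower, upper].

    Returns (widths, converged). A gap above `upper` shrinks both blocks
    involved; a gap below `lower` enlarges them.
    """
    widths = list(widths)
    for _ in range(max_iters):
        pos = register_positions(widths)
        gaps = [b - a for a, b in zip(pos, pos[1:])]
        violations = [i for i, g in enumerate(gaps) if g > upper or g < lower]
        if not violations:
            return widths, True  # timing-distance check passes everywhere
        for i in violations:
            scale = 1.0 - step if gaps[i] > upper else 1.0 + step
            widths[i] *= scale
            widths[i + 1] *= scale
    return widths, False  # gave up; an extra register may be needed instead


if __name__ == "__main__":
    new_widths, ok = adjust_block_sizes([100, 100, 160, 100], lower=80, upper=120)
    print(ok, [round(w, 1) for w in new_widths])
```

The loop uses a fixed percentage step, which corresponds to the stepwise adjustment mentioned above; a linear adjustment proportional to the size of the violation would be an equally valid choice.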
Fig. 5 is a schematic diagram showing the effect of adjusting the initial size of the blocks after the pipeline register layout in the embodiment of the present application. Comparing fig. 4 with fig. 5 of the specification, it can be seen that the initial sizes of the adjusted blocks have changed. According to the wiring rules of the pipeline register, pipelines are densely distributed in the blocks at the center of the integrated circuit module; correspondingly, for those blocks, the target distance over which the clock tree can converge within one clock period is slightly shorter than the average distance. In contrast, relatively few pipeline registers are distributed in the blocks at the edge of the integrated circuit module, so the clock tree runs faster there and can converge over a distance slightly longer than the average distance. In summary, when the initial sizes of the blocks are fine-tuned, the blocks at the center of the integrated circuit module can be shortened and the blocks at the edge of the integrated circuit module can be enlarged.
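One way to express the centre-versus-edge rule above is a position-dependent size multiplier. The sketch below is only an illustration: the linear interpolation, the `max_adjust` fraction and the function name `position_scale` are assumptions, not values or names given in the application.

```python
def position_scale(row, col, rows, cols, max_adjust=0.1):
    """Return a size multiplier for the block at (row, col) in a rows x cols grid.

    The multiplier is (1 - max_adjust) at the grid centre, where blocks are
    shortened, and (1 + max_adjust) on the boundary, where blocks are enlarged.
    """
    def offset(i, n):
        # Normalised distance from the centre along one axis, in [0, 1].
        if n == 1:
            return 0.0
        centre = (n - 1) / 2.0
        return abs(i - centre) / centre

    r = max(offset(row, rows), offset(col, cols))  # 0 at centre, 1 on the edge
    return 1.0 - max_adjust + 2.0 * max_adjust * r


if __name__ == "__main__":
    for row in range(5):
        print([round(position_scale(row, col, 5, 5), 2) for col in range(5)])
```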
In another implementation of this embodiment, if there are irregular blocks, the sizes of those blocks may be adjusted correspondingly. The embodiments of the present application do not limit the shape of the blocks; a block may be, for example, rectangular (including square) as shown in fig. 4, another polygon, a circle, or an irregular shape.
The initial size of a block is related to the shape of the block. For example, when the block is rectangular, the initial size refers to its length, width, or diagonal length; when the block is another polygon, the initial size refers to the side length or diagonal length of the polygon; and when the block is circular, the initial size refers to its diameter or radius. When the block is an irregular shape, the initial size may refer to the shortest distance or the average distance between two points of the shape.
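The shape-dependent meaning of "initial size" can be illustrated with a small helper. This is a sketch under assumptions: the function name `characteristic_sizes` and its keyword arguments are hypothetical, and for an irregular shape the distances are taken over a caller-supplied set of sample points, since the text does not say which points are meant.

```python
import math
from itertools import combinations


def characteristic_sizes(shape, **dims):
    """Return the candidate 'initial size' measures for a block of a given shape."""
    if shape == "rectangle":
        length, width = dims["length"], dims["width"]
        return {"length": length, "width": width,
                "diagonal": math.hypot(length, width)}
    if shape == "circle":
        radius = dims["radius"]
        return {"radius": radius, "diameter": 2 * radius}
    if shape == "irregular":
        # Assumption: distances are measured between the supplied sample points.
        dists = [math.dist(p, q) for p, q in combinations(dims["points"], 2)]
        return {"shortest": min(dists), "average": sum(dists) / len(dists)}
    raise ValueError(f"unsupported shape: {shape}")


if __name__ == "__main__":
    print(characteristic_sizes("rectangle", length=3, width=4))
    print(characteristic_sizes("circle", radius=2))
    print(characteristic_sizes("irregular", points=[(0, 0), (3, 0), (3, 4)]))
```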
It can be seen that when an EDA tool is used for chip design in the prior art, unnecessary pipeline detours, such as the path from the stage-3 to the stage-7 pipeline registers in fig. 2, tend to occur for large-scale, large-area, or even irregularly shaped integrated circuit modules. With the method provided by the embodiment of the application, such pipeline detours are greatly reduced (as shown in fig. 5).
Furthermore, because of the default behavior of EDA tools, the physical distance spanned by each register stage tends to vary greatly: in fig. 2, the physical distance spanned by stage 3 is much greater than that spanned by stages 1 and 2, resulting in a timing violation at the stage-3 register. The embodiment of the application not only addresses pipeline detours but also constrains the register layout through block division, thereby reducing the occurrence of excessively long physical distances.
In addition, adjusting the initial size of the blocks further constrains the register positions so that the timing of the pipeline register converges. This greatly alleviates the turnaround-time problem caused by the large number of iterations needed to reach convergence in existing register layout flows, shortening iteration time and improving design efficiency.
Based on the same technical concept, the embodiment of the application also provides a layout system for pipeline registers, which can be used to implement any of the pipeline register layout methods above. For example, refer to fig. 6, a software architecture diagram of the pipeline register layout system of the present application, which comprises:
A dividing module 10, configured to divide the integrated circuit module into a plurality of blocks, wherein the initial size of each block is constrained by a first distance.
Specifically, fig. 4 of the specification shows the effect of dividing the integrated circuit module into blocks. In this embodiment, the first distance is the average longest distance between two adjacent register stages of the pipeline register. This average longest distance is obtained from the wiring experience of the relevant technicians and is not limited herein. That the initial size of a block is constrained by the first distance means that the length from the center of a rectangular block to the center of its neighboring block does not exceed the first distance. If the side length of the block is L, the distance from the center of the block to the center of the adjacent block is . The initial size of the block may also be regarded as not exceeding the first distance, or the product of the first distance and a coefficient. In another embodiment, the constraint means that the initial size of the block does not exceed the first distance; for example, the length and width of a rectangular block do not exceed the longest distance, with some margin left. As another example, the constraint means that the initial size of the block does not exceed the product of the first distance and a coefficient; for example, the diagonal length of a rectangular block does not exceed the product of the first distance and a coefficient, where the coefficient is determined by the relationship between the diagonal and the length and width of the rectangle. As a further example, the diameter of a circular block does not exceed the longest distance, with some margin left, or the radius of a circular block does not exceed half of the longest distance, with some margin left.
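As an illustration of the dividing step, the sketch below splits a rectangular module into a regular grid whose block length and width do not exceed the first distance, with an optional margin. It follows only the "length and width do not exceed the first distance" reading above; the function name `divide_into_blocks`, the `margin` parameter and the assumption of a rectangular module with equal-sized blocks are hypothetical.

```python
import math


def divide_into_blocks(module_width, module_height, first_distance, margin=0.0):
    """Split a rectangular module into the coarsest grid of equal blocks whose
    sides do not exceed first_distance * (1 - margin).

    Returns (rows, cols, block_width, block_height).
    """
    limit = first_distance * (1.0 - margin)
    if limit <= 0:
        raise ValueError("first_distance and margin must leave a positive limit")
    cols = math.ceil(module_width / limit)
    rows = math.ceil(module_height / limit)
    return rows, cols, module_width / cols, module_height / rows


if __name__ == "__main__":
    # A hypothetical 1000 x 600 module with a first distance of 250 units.
    print(divide_into_blocks(1000, 600, 250))
    print(divide_into_blocks(1000, 600, 250, margin=0.1))
```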
A determining module 20, configured to determine a communication path between the first port and the second port according to the positions of the plurality of blocks. The communication path is the path that passes through the fewest path blocks when the first port and the second port are connected, and the path blocks comprise a starting block, an ending block and M intermediate blocks, where M >= 0. The starting block is adjacent to the first port, the ending block is adjacent to the second port, and the M intermediate blocks are connected in sequence between the starting block and the ending block.
Specifically, in this embodiment, the first port is an input port and the second port is an output port. A plurality of pipeline registers may extend from one input port as a starting point and connect to different output ports; if a pipeline register exists between a first port and any second port, that first port and second port are referred to as a group of ports. A plurality of pipeline registers may also be arranged between each group of ports, and when each pipeline register is routed, all the blocks are traversed by the above method to find the optimal communication path.
In another implementation of this embodiment, any two adjacent blocks can be connected. For any group of ports, the block closest to the input port is taken as the starting block and the block closest to the output port is taken as the ending block. All the blocks are traversed, and the lengths of all feasible paths between the starting block and the ending block are computed. The path with the shortest length is taken as the communication path of that port group. By repeating these steps for all the ports, the communication path of each group of ports is found.
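The selection of the starting and ending blocks and the search for the path with the fewest blocks can be sketched as below. Note the swapped-in technique: instead of enumerating every feasible path as the text describes, this sketch uses breadth-first search, which returns a path with the fewest blocks on a grid where only edge-adjacent blocks are connected. The grid model, the `blocked` set (for example, cut-outs of an irregular module) and all function names are assumptions.

```python
from collections import deque


def nearest_block(port_xy, rows, cols, block_w, block_h):
    """Return (row, col) of the grid block closest to a port at coordinates (x, y)."""
    x, y = port_xy
    col = min(cols - 1, max(0, int(x // block_w)))
    row = min(rows - 1, max(0, int(y // block_h)))
    return row, col


def fewest_block_path(rows, cols, start, goal, blocked=frozenset()):
    """Breadth-first search over edge-adjacent blocks.

    Returns the list of (row, col) blocks from start to goal, or None if the
    goal cannot be reached.
    """
    if start in blocked or goal in blocked:
        return None
    parent = {start: None}
    queue = deque([start])
    while queue:
        cur = queue.popleft()
        if cur == goal:
            path = []
            while cur is not None:  # walk back through the parents
                path.append(cur)
                cur = parent[cur]
            return path[::-1]
        r, c = cur
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            inside = 0 <= nxt[0] < rows and 0 <= nxt[1] < cols
            if inside and nxt not in blocked and nxt not in parent:
                parent[nxt] = cur
                queue.append(nxt)
    return None


if __name__ == "__main__":
    rows, cols, bw, bh = 3, 4, 250.0, 200.0
    start = nearest_block((0.0, 50.0), rows, cols, bw, bh)     # input port
    goal = nearest_block((1000.0, 550.0), rows, cols, bw, bh)  # output port
    path = fewest_block_path(rows, cols, start, goal)
    print(path, "M =", len(path) - 2)
```

When the starting and ending blocks coincide, the returned path contains a single block, matching the single-path-block case of the method.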
A layout module 30, configured to lay out the pipeline register in the path blocks through which the communication path passes, wherein each path block includes at least one register.
Specifically, the path blocks are the blocks through which the communication path passes. For a given pipeline register, the communication path passes through each path block at most once and cannot pass through the same path block repeatedly. Laying out the pipeline register in the path blocks through which the communication path passes means that the registers of the pipeline register are distributed evenly among the blocks, each within the range of its block, and that the difference between the separation distances from each register to its adjacent registers stays within a controllable range.
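A minimal sketch of this layout step, assuming one register at the centre of each path block on a regular grid (the real positions would be chosen by the EDA tool within each block), is given below. The names `place_registers` and `spacing_spread` are hypothetical; the spread is the difference between the largest and smallest gap between consecutive registers, which can be compared against whatever tolerance the designer treats as controllable.

```python
import math


def place_registers(path, block_w, block_h):
    """Place one register at the centre of each (row, col) block on the path."""
    return [((c + 0.5) * block_w, (r + 0.5) * block_h) for r, c in path]


def spacing_spread(points):
    """Difference between the largest and smallest gap of consecutive registers."""
    gaps = [math.dist(a, b) for a, b in zip(points, points[1:])]
    return max(gaps) - min(gaps) if gaps else 0.0


if __name__ == "__main__":
    path = [(0, 0), (0, 1), (1, 1)]  # a hypothetical three-block path
    regs = place_registers(path, block_w=250.0, block_h=200.0)
    print(regs, spacing_spread(regs))
```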
In another implementation of this embodiment, a block may contain more than one register. It should be noted that in special cases, for example when the distance between two registers on a path is greater than the distance at which timing can converge and timing convergence cannot be achieved by fine-tuning the block length and width, an additional register needs to be provided. The additional register must obey the wiring constraint rules and be located on a reasonable path within the corresponding block range.
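The special case above, where an extra register is needed, can be sketched in one dimension: registers are described by their position along the path, and any gap longer than the convergence distance is split evenly. The function name `insert_extra_registers` and the even split are assumptions; checking that each added register stays inside its block and obeys the wiring rules is not modelled.

```python
import math


def insert_extra_registers(positions, max_gap):
    """Insert evenly spaced registers wherever two consecutive registers are
    farther apart than max_gap. Positions are distances along the path."""
    if not positions:
        return []
    result = [positions[0]]
    for prev, cur in zip(positions, positions[1:]):
        gap = cur - prev
        pieces = max(1, math.ceil(gap / max_gap))  # segments this gap must become
        for k in range(1, pieces):
            result.append(prev + gap * k / pieces)
        result.append(cur)
    return result


if __name__ == "__main__":
    print(insert_extra_registers([0.0, 100.0, 350.0], max_gap=120.0))
```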
An adjustment module 40, configured to adjust the initial sizes of all or part of the blocks and the positions of the registers in all or part of the path blocks so that the timing of the pipeline registers converges.
The blocks after the initial size adjustment are shown in fig. 5 of the specification. According to the wiring rules of the pipeline register, pipelines are densely distributed in the blocks at the central position; correspondingly, for those blocks, the target distance over which the clock tree can converge within one clock period is slightly shorter than the average distance. In contrast, relatively few pipeline registers are distributed in the blocks at the edge of the integrated circuit module, so the clock tree runs faster there and can converge over a distance slightly longer than the average distance. In summary, when the length and width of the blocks are fine-tuned, the length and width of the blocks at the center of the integrated circuit module need to be shortened, and the length and width of the blocks at the edge of the integrated circuit module need to be increased.
In another embodiment of the pipeline register layout system provided by the present application, based on the above system embodiment, the determining module is specifically configured to:
traverse all paths between the starting block and the ending block according to the positions of the blocks to obtain a path set; and
take the shortest path in the path set as the communication path.
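The two steps above can be sketched literally as an exhaustive enumeration. The sketch assumes a regular grid in which only edge-adjacent blocks are connected and a path never visits the same block twice; the names `all_simple_paths` and `communication_path` are hypothetical. Because the number of such paths grows very quickly with grid size, this form is only practical for small block counts.

```python
def all_simple_paths(rows, cols, start, goal):
    """Enumerate every path from start to goal that visits no block twice."""
    paths = []

    def dfs(cur, visited, path):
        if cur == goal:
            paths.append(list(path))
            return
        r, c = cur
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nxt[0] < rows and 0 <= nxt[1] < cols and nxt not in visited:
                visited.add(nxt)
                path.append(nxt)
                dfs(nxt, visited, path)
                path.pop()          # backtrack
                visited.remove(nxt)

    dfs(start, {start}, [start])
    return paths


def communication_path(rows, cols, start, goal):
    """Pick the path with the fewest blocks from the full path set."""
    return min(all_simple_paths(rows, cols, start, goal), key=len)


if __name__ == "__main__":
    path_set = all_simple_paths(3, 3, (0, 0), (2, 2))
    print(len(path_set), communication_path(3, 3, (0, 0), (2, 2)))
```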
Specifically, in this embodiment, the starting block is the block closest to the input port, and the ending block is the block closest to the output port. The M intermediate blocks are connected in sequence between the starting block and the ending block, where M >= 0; that is, the number of intermediate blocks is greater than or equal to 0, and the maximum value of M is limited by the total number of blocks into which the integrated circuit module is divided.
Preferably, referring to fig. 5 of the specification, the pipeline register shown in the figure passes through 11 path blocks on its communication path. The 11 path blocks comprise a starting block, 9 intermediate blocks and an ending block connected in sequence.
In another implementation of this embodiment, the communication path of a pipeline register contains only 2 path blocks: a starting block and an ending block. The pipeline register has at least one register in each of the starting block and the ending block.
In another implementation of this embodiment, the communication path of a pipeline register contains only 1 path block. That path block is both the starting block and the ending block, and the first port is connected to the second port directly through it. In this case, the pipeline register needs at least two registers in the path block, because a pipeline register needs at least one register to store the current instruction state and one register to store the previous instruction state in order to ensure proper operation of the pipeline. In practical processor designs, more pipeline registers are typically included to support deeper pipelines and increase the efficiency of instruction execution.
It should be understood that the division of the modules in the above system is only a division of logical functions; the modules may be fully or partially integrated into one physical entity or may be physically separate. Further, the modules in the system may be implemented as software invoked by a processor. For example, the system comprises a processor connected to a memory in which instructions are stored, and the processor invokes the instructions stored in the memory to implement any one of the above methods or the functions of each module of the system. The processor is, for example, a general-purpose processor such as a CPU or a microprocessor, and the memory is a memory inside or outside the system. Alternatively, the modules in the system may be implemented as hardware circuits, and the functions of some or all of the modules may be realized through the design of the hardware circuits, which may be understood as one or more processors. For example, in one implementation, the hardware circuit is an application-specific integrated circuit (ASIC) that implements the functions of some or all of the modules through the design of the logical relationships among the elements in the circuit. As another example, in another implementation, the hardware circuit may be implemented by a programmable logic device (PLD), which may include a large number of logic gates whose interconnections are configured by a configuration file so as to implement the functions of some or all of the above modules. All modules of the above system may be implemented entirely as processor-invoked software, entirely as hardware circuits, or partly as processor-invoked software with the remainder as hardware circuits.
In an embodiment of the present application, the processor is a circuit with signal processing capability. In one implementation, the processor may be a circuit capable of reading and executing instructions, such as a central processing unit (CPU), a microprocessor, a graphics processing unit (GPU), or a digital signal processor (DSP). In another implementation, the processor may perform a function through the logical relationships of a hardware circuit that is fixed or reconfigurable, for example a hardware circuit implemented as an ASIC or a PLD such as a field programmable gate array (FPGA). In a reconfigurable hardware circuit, the process in which the processor loads the configuration file to configure the hardware circuit can be understood as the processor loading instructions to implement the functions of some or all of the above modules. Furthermore, the processor may be a hardware circuit designed for artificial intelligence, which may be understood as an ASIC, such as an NPU, TPU, or DPU.
It will be seen that each module in the above system may be one or more processors (or processing circuits) configured to implement the above methods, for example: CPU, GPU, NPU, TPU, DPU, microprocessor, DSP, ASIC, FPGA, or a combination of at least two of these processor forms.
Furthermore, the modules in the above system may be integrated together in whole or in part, or may be implemented independently. In one implementation, these modules are integrated together and implemented in the form of an SOC. The SOC may include at least one processor for implementing any of the methods above or for implementing the functions of the modules of the system, where the at least one processor may be of different types, including, for example, a CPU and FPGA, a CPU and artificial intelligence processor, a CPU and GPU, and the like.
Based on the same concept, the application also discloses a pipeline register layout system, as shown in fig. 7, which comprises a processor and a memory, wherein the memory stores instructions and the processor invokes the instructions to execute the pipeline register layout method of any one of the above method embodiments.
In particular, the memory includes computer program instructions that, when executed by a processor, cause the processor to perform the pipeline register layout method described in any of the method embodiments described above.
Based on the same concept, the application also discloses an integrated circuit comprising a plurality of blocks and a plurality of groups of input/output ports, and further comprising pipeline registers laid out by the pipeline register layout method of any one of the above method embodiments.
The layout method, the system and the integrated circuit of the pipeline register of the present application share the same technical concept; the technical details of the embodiments apply to one another and are not repeated here.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and storage media according to embodiments of the application. It will be understood that each flowchart and/or block of the flowchart illustrations and/or block diagrams, and combinations of flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present application have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. It is therefore intended that the following claims be interpreted as including the preferred embodiments and all such alterations and modifications as fall within the scope of the application.
It will be apparent to those skilled in the art that various modifications and variations can be made to the present application without departing from the spirit or scope of the application. Thus, it is intended that the present application also include such modifications and alterations insofar as they come within the scope of the appended claims or the equivalents thereof.

Claims (10)

1. A method of laying out pipeline registers for use in a layout of pipeline registers of an integrated circuit module, the integrated circuit module comprising a plurality of ports including a first port and a second port, the method comprising:
dividing the integrated circuit module into a plurality of blocks, wherein an initial size of each block is constrained by a first distance;
determining a communication path between the first port and the second port according to the positions of the plurality of blocks; the communication path is the path that passes through the fewest path blocks when the first port and the second port are connected, and the path blocks comprise a starting block, an ending block and M intermediate blocks, wherein M >= 0; the starting block is adjacent to the first port, the ending block is adjacent to the second port, and the M intermediate blocks are sequentially connected between the starting block and the ending block;
laying out the pipeline registers in the path blocks through which the communication paths pass, wherein each path block comprises at least one register;
and adjusting the initial size of the path block according to the register timing constraint condition so that the timing of the pipeline register converges.
2. A method of laying out a pipeline register as claimed in claim 1, wherein the plurality of blocks are the same size and shape.
3. A method of laying out a pipeline register as claimed in claim 2, wherein the plurality of blocks are rectangles, and the side length of the rectangles is constrained by the first distance.
4. The method of claim 1, wherein determining the communication path between the first port and the second port based on the locations of the plurality of blocks comprises:
traversing all paths between the starting block and the ending block according to the positions of the blocks to obtain a path set;
and taking the shortest path in the path set as the communication path.
5. The method of claim 1, wherein said adjusting the initial size of the path block according to the register timing constraint so that the timing of the pipeline register converges comprises:
under the constraint of the distance range of the register timing convergence, adjusting the initial size of all or part of the path block so that an EDA tool automatically optimizes the register positions in the path block;
and acquiring the size of the path block and the positions of the registers after adjustment, verifying whether the distance between any two registers meets the time sequence constraint, and if not, performing readjustment until the time sequence of the pipeline registers converges.
6. A layout system of pipeline registers for use in a layout of pipeline registers of an integrated circuit module, the integrated circuit module comprising a plurality of ports including a first port and a second port, the system comprising:
a dividing module for dividing the integrated circuit module into a plurality of blocks, wherein the initial size of each block is constrained by a first distance;
a determining module, configured to determine a communication path between the first port and the second port according to the positions of the plurality of blocks; the communication path is the path that passes through the fewest path blocks when the first port and the second port are connected, and the path blocks comprise a starting block, an ending block and M intermediate blocks, wherein M >= 0; the starting block is adjacent to the first port, the ending block is adjacent to the second port, and the M intermediate blocks are sequentially connected between the starting block and the ending block;
a layout module for laying out the pipeline registers in the path blocks through which the communication paths pass, wherein each path block includes at least one register;
and an adjusting module, configured to adjust the initial size of the path block according to the register timing constraint condition so that the timing of the pipeline register converges.
7. The pipeline register layout system of claim 6, wherein the plurality of blocks are the same size and shape.
8. The pipeline register layout system according to claim 6, wherein the determining module is specifically configured to:
traversing all paths between the starting block and the ending block according to the positions of the blocks to obtain a path set;
and taking the shortest path in the path set as the communication path.
9. A pipeline register layout system comprising a processor and a memory, the memory storing instructions, the processor invoking the instructions to cause the processor to perform the pipeline register layout method of any of claims 1-5.
10. An integrated circuit comprising an integrated circuit module, the integrated circuit module comprising a plurality of ports, the plurality of ports comprising a first port and a second port, the integrated circuit module further comprising a pipeline register laid out by the pipeline register layout method of any one of claims 1-5.
CN202310862159.3A 2023-07-14 2023-07-14 Layout method, system and integrated circuit of pipeline register Active CN116595938B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310862159.3A CN116595938B (en) 2023-07-14 2023-07-14 Layout method, system and integrated circuit of pipeline register

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310862159.3A CN116595938B (en) 2023-07-14 2023-07-14 Layout method, system and integrated circuit of pipeline register

Publications (2)

Publication Number Publication Date
CN116595938A true CN116595938A (en) 2023-08-15
CN116595938B CN116595938B (en) 2023-09-15

Family

ID=87601189

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310862159.3A Active CN116595938B (en) 2023-07-14 2023-07-14 Layout method, system and integrated circuit of pipeline register

Country Status (1)

Country Link
CN (1) CN116595938B (en)

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140075404A1 (en) * 2012-09-13 2014-03-13 Taiwan Semiconductor Manufacturing Company Limited Group bounding box region-constrained placement for integrated circuit design
US8893071B1 (en) * 2013-07-12 2014-11-18 Xilinx, Inc. Methods of pipelining a data path in an integrated circuit
EP3101568A1 (en) * 2015-06-03 2016-12-07 Altera Corporation Methods for performing register retiming operations into synchronization regions interposed between circuits associated with different clock domains
CN113919275A (en) * 2020-09-21 2022-01-11 台积电(南京)有限公司 Method for optimizing the layout of an integrated circuit
CN112347733A (en) * 2020-11-26 2021-02-09 北京百瑞互联技术有限公司 Integrated circuit layout initialization and optimization method, device, storage medium and equipment
CN113792520A (en) * 2021-09-23 2021-12-14 西安紫光国芯半导体有限公司 Layout wiring method, layout wiring device, synchronous circuit and integrated circuit chip
CN114282486A (en) * 2021-12-30 2022-04-05 海光信息技术股份有限公司 Layout method and device for sequential logic device, electronic device and storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
孔天明, 洪先龙: "分级的时延驱动布局算法", 半导体学报, no. 03 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117892673A (en) * 2024-03-18 2024-04-16 上海韬润半导体有限公司 Timing sequence convergence structure and method based on register and digital-analog hybrid chip
CN117892673B (en) * 2024-03-18 2024-05-31 上海韬润半导体有限公司 Timing sequence convergence structure and method based on register and digital-analog hybrid chip

Also Published As

Publication number Publication date
CN116595938B (en) 2023-09-15

Similar Documents

Publication Publication Date Title
CN110009092B (en) Activation functions for deep neural networks
US11347509B2 (en) Encoding and decoding variable length instructions
US20220253319A1 (en) Hardware Unit for Performing Matrix Multiplication with Clock Gating
CN108564168A (en) A kind of design method to supporting more precision convolutional neural networks processors
CN116595938B (en) Layout method, system and integrated circuit of pipeline register
US10642578B2 (en) Approximating functions
US8473881B1 (en) Multi-resource aware partitioning for integrated circuits
US20210349838A1 (en) Priority Based Arbitration
US20230085669A1 (en) Priority based arbitration between shared resource requestors using priority vectors and binary decision tree
CN111709205A (en) FPGA wiring method
US9317641B2 (en) Gate substitution based system and method for integrated circuit power and timing optimization
CN114792124A (en) Implementing dilated convolutions in hardware
US10776451B2 (en) Configurable FFT architecture
US8595668B1 (en) Circuits and methods for efficient clock and data delay configuration for faster timing closure
CN113128149B (en) Power consumption-based netlist partitioning method for multi-die FPGA
CN113128150A (en) Clock domain-based netlist partitioning method for multi-die FPGA
CN112131824A (en) Chip winding method based on standard unit barrier layer
CN118586338B (en) Method and device for simultaneous boxing layout of field programmable gate array
US7681160B1 (en) Weight based look up table collapsing for programmable logic devices
US20230280980A1 (en) Find first function
US20240184966A1 (en) FPGA Compiler Flow for Heterogeneous Programmable Logic Elements
CN115081371A (en) FPGA layout method based on IP core layout range constraint
US8972920B1 (en) Re-budgeting connections of a circuit design
CN117273093A (en) Mapping neural networks to hardware
CN118246498A (en) Mapping neural networks to hardware

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant