CN114861590A

CN114861590A - An indexing method applied to large-scale layout data

Info

Publication number: CN114861590A
Application number: CN202210609318.4A
Authority: CN
Inventors: 王学香; 秦泽鑫; 曹鹏
Original assignee: Southeast University
Current assignee: Southeast University
Priority date: 2022-05-31
Filing date: 2022-05-31
Publication date: 2022-08-05
Anticipated expiration: 2042-05-31
Also published as: CN114861590B

Abstract

The invention discloses an indexing method applied to large-scale layout data, which comprises the following steps: classifying input layout data according to different layout layers, and abstracting each module into a plurality of two-dimensional plane polygons; then, each layer of the layout is divided into planes in a recursive manner by adopting a quadtree independently; traversing each partitioned sub-region by using a space filling curve, and allocating an index value to each module according to the sequence of each polygon on the curve; and finally, storing the index values in a B + tree form. The invention can flexibly and quickly construct indexes for large-scale layout data and efficiently perform incremental updating of the layout data with lower system overhead. Based on the index structure obtained by the index method, the layout data at the corresponding two-dimensional coordinates can be quickly searched in the two-dimensional plane range of the layout.

Description

An indexing method applied to large-scale layout data

技术领域technical field

本发明涉及一种应用于大规模版图数据的索引方法，特别是涉及对于版图数据的增量更新与点查找。The invention relates to an indexing method applied to large-scale layout data, in particular to incremental update and point search for layout data.

背景技术Background technique

EDA技术是在电子CAD技术基础上发展起来的计算机软件系统，是指以计算机为平台，融合了应用电子技术、计算机技术、信息处理及智能化技术的最新成果，进行电子设计的平台。在EDA工具的帮助下，集成电路才能够以传奇般的速度发展，延续着摩尔定律的神话。EDA technology is a computer software system developed on the basis of electronic CAD technology. With the help of EDA tools, integrated circuits can develop at a legendary speed, continuing the myth of Moore's Law.

同时，集成电路的发展和工艺的不断进步也带来了以下几点影响：At the same time, the development of integrated circuits and the continuous progress of technology have also brought the following impacts:

1、随着工艺尺寸的不断缩小，芯片中集成的晶体管数目不断增大。因此在先进工艺下，芯片的版图数据大小也扩大了数倍。1. As the process size continues to shrink, the number of transistors integrated in the chip continues to increase. Therefore, under the advanced technology, the size of the layout data of the chip is also expanded several times.

2、随着工艺尺寸的不断缩小，先进工艺会带来新的设计规则，不仅增加芯片设计时的复杂度，也增大了芯片的版图数据大小。2. With the continuous shrinking of the process size, advanced processes will bring new design rules, which not only increase the complexity of chip design, but also increase the size of the layout data of the chip.

3、随着工艺尺寸的不断缩小，先进工艺还会引入新的设计工具，从而导致版图数据的规模急剧膨胀，并随着版图的规模呈线性增长。3. As the size of the process continues to shrink, the advanced process will also introduce new design tools, resulting in a sharp expansion of the size of the layout data, which increases linearly with the size of the layout.

目前大规模版图经过光掩模的数据已经达到了几百GB的规模，不管是在芯片的后端布局布线还是物理验证等阶段，需求量最大的便是在海量的版图数据中搜索出特定位置处的版图数据。而原有的一些应用于版图数据的索引方法没有随着集成电路的发展而不断更新，所以在对版图数据构建索引之后的实际使用效率低，系统开销大。并且当版图局部数据发生变化时，基于这些索引方法所形成的索引结构无法高效地完成更新，这也导致版图数据的增量更新效率较低。At present, the data of large-scale layout through photomask has reached the scale of hundreds of GB. Whether it is in the back-end layout and wiring of the chip or the physical verification stage, the most demanded is to search for a specific location in the massive layout data. Layout data at . However, some original indexing methods applied to layout data are not continuously updated with the development of integrated circuits, so the actual use efficiency after indexing the layout data is low, and the system overhead is high. And when the local data of the layout changes, the index structure formed based on these indexing methods cannot complete the update efficiently, which also leads to the low efficiency of incremental update of the layout data.

因此需要专门设计一个适用于大规模版图的版图索引方法，使得EDA工具借助该方法能够高效率地对版图中的局部信息数据进行读取，便于后续对这些版图数据进行处理，同时也便于后续版图数据发生改变后的增量更新。Therefore, it is necessary to design a layout index method suitable for large-scale layouts, so that EDA tools can efficiently read local information data in the layout with this method, which is convenient for subsequent processing of these layout data, and also facilitates subsequent layouts. Incremental updates after data changes.

发明内容SUMMARY OF THE INVENTION

有鉴于此，本发明的目的在于提供一种应用于大规模版图数据的索引方法，用以解决EDA工具应对大规模版图数据时索引构建效率低、查找数据速度慢和增量更新时系统开销大的问题。In view of this, the object of the present invention is to provide a kind of indexing method applied to large-scale layout data, in order to solve the problem that when EDA tools deal with large-scale layout data, the index construction efficiency is low, the data search speed is slow and the system overhead is large during incremental update. The problem.

为了实现上述目的，本发明采用如下技术方案：In order to achieve the above object, the present invention adopts the following technical solutions:

一种应用于大规模版图数据的索引方法，该方法包括：An indexing method applied to large-scale layout data, the method includes:

步骤S1、读入集成电路的版图数据，同时根据数据格式对流数据进行解析；提取每个模块的顶点坐标，将每个模块都抽象为二维平面上的多边形，并根据原模块在版图上的层数对这些多边形进行分层，使得位于版图同一层的模块，其对应的抽象化后的多边形仍处于同一层；Step S1, read the layout data of the integrated circuit, and analyze the stream data according to the data format; extract the vertex coordinates of each module, abstract each module into a polygon on a two-dimensional plane, and analyze the data according to the layout of the original module. Layers layer these polygons, so that the modules located in the same layer of the layout, the corresponding abstracted polygons are still in the same layer;

步骤S2、针对步骤S1中经过抽象化处理后的多层二维平面，以层为单位，独立地对每一层二维平面采用递归划分的方式进行区域的四等分，根据每个子区域内部多边形的数量，决定是否需要对该子区域继续进行划分，直至每个子区域内的多边形数量不超过预设的阈值；其中，在进行区域划分的同时，按照Hilbert曲线的顺序遍历每个子区域，同时分配一个独一无二的索引值，其命名为CellId，用以建立抽象化后的多边形与索引值之间的映射关系；Step S2, for the multi-layer two-dimensional plane after the abstraction process in step S1, take the layer as a unit, independently use the recursive division method for each layer of two-dimensional plane to quarterly divide the area, according to the internal The number of polygons determines whether the sub-area needs to be further divided until the number of polygons in each sub-area does not exceed the preset threshold; wherein, while dividing the area, each sub-area is traversed in the order of the Hilbert curve, and at the same time Allocate a unique index value, named CellId, to establish the mapping relationship between the abstracted polygon and the index value;

步骤S3、若某个子区域内的区域划分已结束并即将开始对另一个区域进行划分，则以当前划分完成的子区域内的所有索引值为键值，索引值所对应的区域内所有多边形信息为数据，构建一棵B+树；如果已经存在一棵B+树，那么将索引值插入该树并调整结构，保持B+树的特性；Step S3, if the area division in a certain sub-area has ended and is about to start dividing another area, then all the index values in the sub-area that are currently divided are the key values, and all the polygon information in the area corresponding to the index values. For the data, build a B+ tree; if there is already a B+ tree, insert the index value into the tree and adjust the structure to maintain the characteristics of the B+ tree;

步骤S4、当版图数据发生变化时，则进行索引结构的增量更新，其包括：Step S4, when the layout data changes, perform incremental update of the index structure, which includes:

针对步骤S3中构建的B+树，查询改动后相关区域内的多边形数目是否超过阈值；For the B+ tree constructed in step S3, query whether the number of polygons in the relevant area after the modification exceeds the threshold;

若没有超过，则更新B+树中相应结点内的多边形信息，若超过了，则对该区域进行一次四等分，并在B+树中对相关结点进行调整；If it does not exceed, update the polygon information in the corresponding node in the B+ tree. If it exceeds, the area is divided into quarters, and the relevant nodes are adjusted in the B+ tree;

步骤S5、当需要查询指定坐标处的版图数据时，通过指定的坐标值计算出索引值，从而在B+树中定位到包含该坐标的区域所对应的结点，通过遍历结点内每个多边形并判断与指定坐标的位置关系，获知相应的版图数据信息。Step S5, when it is necessary to query the layout data at the specified coordinates, calculate the index value through the specified coordinate value, so as to locate the node corresponding to the area containing the coordinates in the B+ tree, and traverse each polygon in the node by traversing each polygon. And determine the positional relationship with the specified coordinates, and obtain the corresponding layout data information.

进一步的，在所述步骤S1中，所述读入集成电路的版图数据，其具体包括：Further, in the step S1, the layout data of the read-in integrated circuit specifically includes:

步骤S101、输入GDSII文件的数据流；Step S101, input the data stream of the GDSII file;

步骤S102、针对该数据流，依次截取其16位二进制数，得到当前模块长度和当前的模块类型；Step S102, for this data stream, intercept its 16-bit binary number successively to obtain the current module length and the current module type;

步骤S103、判断得到的模块是否为所需模块，若不是所需的，则执行指针后移的操作，略过当前模块；若是所需的，则执行步骤S104；Step S103, judging whether the obtained module is the required module, if not, then perform the operation of moving the pointer backward, and skip the current module; if it is required, then perform step S104;

步骤S104、翻译并记录当前模块内的多边形信息，同时根据已知长度向后移动指针；Step S104, translate and record the polygon information in the current module, while moving the pointer backwards according to the known length;

步骤S105、判断数据流是否结束，若尚未结束，则回到步骤S102；若已经结束，则输出多边形信息的集合。Step S105 , judging whether the data stream has ended, if not, go back to step S102 ; if it has ended, output the set of polygon information.

进一步的，在所述步骤S105中，所述的多边形信息，其包括每个多边形的顶点信息，也包含了每个多边形所位于的图层编号，用以依照图层编号将多边形区分开。Further, in the step S105, the polygon information includes vertex information of each polygon, and also includes the layer number where each polygon is located, so as to distinguish the polygons according to the layer number.

进一步的，在所述步骤S2中，采用如下的方法进行区域的四等分：Further, in the step S2, the following method is used to divide the area into quarters:

采用Hilbert曲线，该Hilbert曲线为分形曲线，低阶的Hilbert曲线通过迭代扩展为高阶的Hilbert曲线，其中，对于最低阶的Hilbert曲线，其由四个正方形的网格单元的中心点依次串联而成，该四个正方形的网格单元构成一个正方形平面；A Hilbert curve is adopted, which is a fractal curve, and a low-order Hilbert curve is extended to a high-order Hilbert curve through iteration. For the lowest-order Hilbert curve, the center points of the four square grid cells are connected in series to form the Hilbert curve. , the four square grid cells form a square plane;

当某一阶的Hilbert曲线迭代为更高一阶的Hilbert曲线时，采用与四叉树采用相类似的递归划分方法；即每次均在前一次的基础上，根据图形的密集程度，对原网格单元中的一个或多个，进行再次划分，每个网格单元均等分为四个更小的网格单元，直至每个网格单元内的多边形数目低于预先设定的阈值才停止划分。When a Hilbert curve of a certain order is iterated to a higher-order Hilbert curve, a recursive division method similar to the quadtree is adopted; that is, each time on the basis of the previous one, according to the density of the graph, the original grid is divided One or more of the units are divided again, each grid unit is equally divided into four smaller grid units, and the dividing is stopped until the number of polygons in each grid unit is lower than a preset threshold.

进一步的，在所述步骤S4中，所述更新B+树中相应结点内的多边形信息，其具体包括：Further, in the step S4, the updating of the polygon information in the corresponding nodes in the B+ tree specifically includes:

步骤S401、针对发生改动的多边形信息，判断其是否需要进行划分，若不需要，则按照Hilbert曲线顺序进行区域划分；若需要，则执行步骤S402；Step S401, for the polygon information that has changed, determine whether it needs to be divided, if not, then divide the area according to the order of the Hilbert curve; if necessary, execute step S402;

步骤S402、在新区域生成对应的CellId，该CellId保存相关多边形信息；Step S402, generating a corresponding CellId in the new area, where the CellId stores relevant polygon information;

步骤S403、判断已有索引中是否存在与新CellId相关的值，若存在，则将新旧CellId以及相关多边形信息进行合并；若不存在，则将多边形信息输入CellId，并将CellId插入B+树中；Step S403, determine whether there is a value related to the new CellId in the existing index, if so, merge the old and new CellId and related polygon information; if not, input the polygon information into CellId, and insert CellId into the B+ tree;

步骤S404、输出更新后的索引结构。Step S404, outputting the updated index structure.

进一步的，在所述步骤S5中，所述在B+树中定位到包含该坐标的区域所对应的结点，其具体包括：Further, in the step S5, the node corresponding to the region containing the coordinates is located in the B+ tree, which specifically includes:

步骤S501、输入二维点坐标；Step S501, input two-dimensional point coordinates;

步骤S502、计算出恰好覆盖该点的最小CellId；Step S502, calculating the minimum CellId that just covers the point;

步骤S503、输入CellId给B+树；Step S503, input CellId to B+ tree;

步骤S504、查找出覆盖该CellId的结点；Step S504, find out the node covering the CellId;

步骤S505、遍历该结点内所有多边形信息；Step S505, traverse all polygon information in the node;

步骤S506、执行判断，判断是否存在包含该坐标点的多边形，若存在，则汇总相关多边形信息，若不存在，则该坐标点处无版图数据；Step S506, execute judgment, judge whether there is a polygon containing the coordinate point, if so, summarize relevant polygon information, if not, there is no layout data at the coordinate point;

步骤S507、输出查找的结果。Step S507, output the search result.

本发明的有益效果是：The beneficial effects of the present invention are:

1、本发明方法可在版图区域划分的过程中进行索引值的计算与更新，从而提升版图数据的索引结构的构建效率；1. The method of the present invention can calculate and update the index value in the process of dividing the layout area, thereby improving the construction efficiency of the index structure of the layout data;

2、本发明方法可为版图数据构建起索引结构，从而提升查找指定坐标处的版图数据的效率；2. The method of the present invention can build an index structure for layout data, thereby improving the efficiency of finding layout data at specified coordinates;

3、本发明方法支持版图数据的增量更新。3. The method of the present invention supports incremental update of layout data.

附图说明Description of drawings

图1为实施例1中提供的一种应用于大规模版图数据的索引方法的流程示意图；1 is a schematic flowchart of an indexing method applied to large-scale layout data provided in Embodiment 1;

图2为实施例1中提供的简单版图的样例图；Fig. 2 is the sample diagram of the simple layout provided in embodiment 1;

图3为步骤1中对版图数据进行抽象化的流程示意图；FIG. 3 is a schematic flowchart of abstracting layout data in step 1;

图4为步骤2中对版图进行区域划分并同时分配索引值的流程示意图；Fig. 4 is the schematic flow chart of carrying out regional division to the layout in step 2 and allocating index value simultaneously;

图5为步骤2中的Hilbert曲线遍历子区域的示意图，其中，该图5中的a图表示为一阶Hilbert曲线，该图5中的b图表示为二阶Hilbert曲线；FIG. 5 is a schematic diagram of the Hilbert curve traversing the sub-region in step 2, wherein, the a graph in FIG. 5 is represented as a first-order Hilbert curve, and the b graph in FIG. 5 is represented as a second-order Hilbert curve;

图6为步骤2中名为CellId的索引值的结构示意图及其样例图，其中，该图6中的a图表示为CellId的索引值的结构示意图，该图6中的b图表示为名为CellId的索引值的一个样例图；FIG. 6 is a schematic diagram of the structure of the index value named CellId in step 2 and a sample diagram thereof, wherein the picture a in FIG. 6 is a schematic diagram of the structure of the index value of CellId, and the picture b in FIG. 6 is a schematic diagram of the structure of the index value of CellId A sample image of the index value of CellId;

图7为步骤2中分配索引值的一个样例图，其中，该图7中的a图表示为CellId在一阶Hilbert曲线上的分配，该图7中的b图表示为CellId在二阶Hilbert曲线上的分配；FIG. 7 is a sample diagram of assigning index values in step 2, wherein the a graph in FIG. 7 represents the assignment of CellId on the first-order Hilbert curve, and the b graph in FIG. 7 represents the CellId on the second-order Hilbert curve distribution on the curve;

图8为四种不同的基础Hilbert曲线示意图，其中，该图8中的a图表示为标准顺序的单元Hilbert曲线，该图8中的b图表示为轴旋转的单元Hilbert曲线，该图8中的c图表示为上下倒置的单元Hilbert曲线，该图8中的d图表示为轴旋转左右倒置的单元Hilbert曲线；FIG. 8 is a schematic diagram of four different basic Hilbert curves, wherein, the a graph in FIG. 8 represents the unit Hilbert curve in standard order, and the b graph in FIG. The c diagram of is represented as a unit Hilbert curve that is upside down, and the d diagram in FIG. 8 is represented as a unit Hilbert curve whose axis is rotated left and right inverted;

图9为步骤3中所要构建的B+树的一个样例图；Fig. 9 is a sample diagram of the B+ tree to be constructed in step 3;

图10为实施例1中提供的版图数据进行增量更新时的流程示意图；10 is a schematic flowchart of the incremental update of the layout data provided in Embodiment 1;

图11为基于实施例1方法所形成的索引结构，查找任意坐标处的版图数据的流程示意图。FIG. 11 is a schematic flowchart of searching layout data at arbitrary coordinates based on the index structure formed by the method of Embodiment 1.

具体实施方式Detailed ways

为使本发明实施例的目的、技术方案和优点更加清楚，下面将结合本发明实施例中的附图，对本发明实施例中的技术方案进行清楚、完整地描述，显然，所描述的实施例是本发明一部分实施例，而不是全部的实施例。基于本发明中的实施例，本领域普通技术人员在没有做出创造性劳动前提下所获得的所有其他实施例，都属于本发明保护的范围。In order to make the purposes, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention. Obviously, the described embodiments These are some embodiments of the present invention, but not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art without creative efforts shall fall within the protection scope of the present invention.

实施例1Example 1

参见图1-图11，本实施例提供一种应用于大规模版图数据的索引方法，该方法的流程如图1所示，该方法具体包括：Referring to FIG. 1 to FIG. 11 , this embodiment provides an indexing method applied to large-scale layout data. The flow of the method is shown in FIG. 1 , and the method specifically includes:

步骤S1、读入集成电路的版图数据，同时根据数据格式对流数据进行解析；提取每个模块的顶点坐标，将每个模块都抽象为二维平面上的多边形，并根据原模块在版图上的层数对这些多边形进行分层，使得位于版图同一层的模块，其对应的抽象化后的多边形仍处于同一层。Step S1, read the layout data of the integrated circuit, and analyze the stream data according to the data format; extract the vertex coordinates of each module, abstract each module into a polygon on a two-dimensional plane, and analyze the data according to the layout of the original module. The number of layers layers these polygons, so that the modules located in the same layer of the layout, the corresponding abstracted polygons are still in the same layer.

具体的说，在本实施例中，该步骤S1为版图数据的读入与抽象化的操作，在本实施例中，一个简单的版图如图2所示，版图数据在描述这些元素位置与大小等相关信息时，是以二进制的格式进行描述与存储的。因此在本实施例中，要按照图3所示的流程，进行版图数据的读入，即根据预先所知的版图数据所采用的存储信息的格式，编写程序以数据流的形式读入版图数据文件，并通过进制转换获取版图每个模块的相关信息。Specifically, in this embodiment, step S1 is the operation of reading and abstracting layout data. In this embodiment, a simple layout is shown in FIG. 2 , and the layout data describes the positions and sizes of these elements. and other related information, it is described and stored in binary format. Therefore, in this embodiment, the layout data is read in accordance with the flow shown in FIG. 3 , that is, according to the format of the stored information used by the layout data known in advance, the program is written to read in the layout data in the form of a data stream. file, and obtain information about each module of the layout through hex conversion.

更具体的说，在本实施例中，该版图数据的读入操作，其具体包括：More specifically, in this embodiment, the read-in operation of the layout data specifically includes:

在执行完成上述的读入操作之后，便是版图数据的抽象化，即将每个模块抽象化为二维平面上的多边形，后续输出给索引的构建算法。由于版图通常情况下包含多个图层，因此输出的多边形信息中，既包含了每个多边形的顶点信息，也包含了每个多边形所位于的图层编号，便于将海量的多边形依照图层编号区分开。After the above-mentioned read-in operation is performed, the layout data is abstracted, that is, each module is abstracted into a polygon on a two-dimensional plane, which is subsequently output to the index construction algorithm. Since the layout usually contains multiple layers, the output polygon information includes not only the vertex information of each polygon, but also the layer number where each polygon is located, which is convenient for mass polygons according to the layer number. differentiate.

步骤S2、以层为单位，独立地对每一层二维平面采用类似于四叉树的方式递归地进行区域划分，直至每个子区域内的多边形数目不超过预设的阈值；在进行区域划分的同时，按照Hilbert曲线的顺序遍历每个子区域，同时分配一个独一无二的索引值，这里将其命名为CellId。Step S2, taking the layer as a unit, independently recursively divide the two-dimensional plane of each layer in a manner similar to a quadtree, until the number of polygons in each sub-region does not exceed a preset threshold; At the same time, each sub-region is traversed in the order of the Hilbert curve, and a unique index value is assigned, which is named CellId here.

具体的说，在本实施例中，该步骤S2为索引关系的构建的操作，在该步骤中，在对版图平面进行区域划分的同时，为每个区域分配一个索引值，以此完成抽象化后的多边形与索引值之间的映射关系，从而实现索引关系的构建，该步骤S2的具体流程包括：Specifically, in this embodiment, step S2 is an operation of constructing an index relationship. In this step, while the layout plane is divided into regions, an index value is assigned to each region, so as to complete the abstraction The mapping relationship between the polygon and the index value, so as to realize the construction of the index relationship. The specific process of this step S2 includes:

步骤S201、针对步骤S105中得到的多边形信息的集合，按照Hilbert曲线顺序进行区域划分；Step S201, for the set of polygon information obtained in step S105, perform area division according to the order of the Hilbert curve;

步骤S202、对划分完成之后的新区域，更新CellId；Step S202, for the new area after the division is completed, update the CellId;

步骤S203、更新每个CellId内的多边形编号信息；Step S203, update the polygon number information in each CellId;

步骤S204、输出CellId集合。Step S204, outputting the CellId set.

更具体的说，在本实施例中，在版图上进行区域划分，其包括：More specifically, in this embodiment, region division is performed on the layout, which includes:

由图5a和图5b可知，上述的Hilbert曲线是一种分形曲线，可以通过迭代实现。低阶的Hilbert曲线可以迭代的扩展为高阶的曲线。图5a中的Hilbert曲线是最低阶的基础形式，仅仅连接了四个点，编号为0-3。同时可以发现如果将整个正方形平面等分为四个正方形的网格单元，那这四个点恰好位于每个网格单元的中心处。图5b则是在图5a中的基础上生成的。对于图5a中的每一个网格单元，均使用形如图5a的一阶Hilbert曲线进行替代，从而构建出形如图5b的二阶Hilbert曲线。与图5a类似，二阶Hilbert曲线共连接了16个点，若将整个正方形平面等分为16个网格单元，那么每个点都是所在网格单元的中心点。因此对于Hilbert曲线来说，第i阶曲线可以通过使用第i-1阶曲线替代一阶基础曲线中的每个网格单元实现。最终当i趋近于无穷大时，Hilbert曲线可以把空间中的所有点连接起来。It can be seen from Fig. 5a and Fig. 5b that the above-mentioned Hilbert curve is a fractal curve, which can be realized by iteration. Lower-order Hilbert curves can be iteratively extended to higher-order curves. The Hilbert curve in Figure 5a is the lowest-order basis form, with only four points connected, numbered 0-3. At the same time, it can be found that if the entire square plane is equally divided into four square grid cells, then these four points are located at the center of each grid cell. Figure 5b is generated on the basis of Figure 5a. For each grid element in Fig. 5a, a first-order Hilbert curve as shown in Fig. 5a is used to replace it, thereby constructing a second-order Hilbert curve as shown in Fig. 5b. Similar to Figure 5a, the second-order Hilbert curve connects a total of 16 points. If the entire square plane is equally divided into 16 grid cells, then each point is the center point of the grid cell where it is located. So for the Hilbert curve, the ith-order curve can be realized by replacing each mesh element in the first-order base curve with the ith-1th-order curve. Eventually, as i approaches infinity, the Hilbert curve can connect all points in space.

综上所示，本实施例的索引方法中所使用的版图区域划分方法，应该与四叉树采用相类似的递归划分方法。即每次均在前一次的基础上，根据图形的密集程度，对原网格单元中的一个或多个，进行再次划分，每个网格单元均等分为四个更小的网格单元，直至每个网格单元内的多边形数目低于预先设定的阈值才停止划分。To sum up, the layout area division method used in the indexing method of this embodiment should adopt a recursive division method similar to that of the quadtree. That is, on the basis of the previous time, one or more of the original grid cells are divided again according to the density of the graphics, and each grid cell is equally divided into four smaller grid cells. The division will not stop until the number of polygons in each grid cell is lower than a preset threshold.

更具体的说，在本实施例中，索引关系所使用的索引值，其包括：More specifically, in this embodiment, the index value used by the index relationship includes:

由于Hilbert曲线每迭代一次，都会对原网格单元中的每一个都进行一次划分，然后将这些网格单元的中心点按照线性顺序依次连接并编号。所以每迭代一次，网格单元数都是原先的4倍。因此一旦划分结束，那么每个网格单元的编号也唯一确定下来了。后续如果想进一步划分，那么每个网格单元都要重新编号，这样做不仅会减慢区域划分的速度，也会增加程序运行时的开销。Since each iteration of the Hilbert curve will divide each of the original grid cells once, and then connect and number the center points of these grid cells in linear order. So each iteration, the number of grid cells is 4 times the original. Therefore, once the division is over, the number of each grid cell is uniquely determined. If you want to divide it further later, then each grid cell must be renumbered, which will not only slow down the speed of area division, but also increase the overhead of program runtime.

为了弥补这一缺点，本实施例采用了一种特殊的，名为CellId的键值来作为索引值，以下是其相关的介绍。In order to make up for this shortcoming, this embodiment uses a special key value named CellId as the index value, and the following is the related introduction.

图6a是这种名为CellId的索引值的结构示意图，图6b是一个CellId的样例，它共有64个二进制位数，在此将它们按照由高到低的顺序分为三类：Figure 6a is a schematic diagram of the structure of such an index value named CellId, and Figure 6b is an example of a CellId, which has a total of 64 binary digits. Here, they are divided into three categories in order from high to low:

第0至6位：标志位。这五位数据表示了该CellId位于版图上的哪一图层。因为它共用了7位二进制位来表示数据，因此它能表示的数值大小为0至127。所以该索引结构支持的输入版图数据的最大图层数为128。Bits 0 to 6: Flag bits. These five bits of data indicate which layer on the layout the CellId is located. Because it uses 7 binary bits to represent data, it can represent values ranging from 0 to 127. Therefore, the maximum number of layers of input layout data supported by this index structure is 128.

第7至n位：数据位，其中8≤n≤63。这些数据位所表达的值就是该CellId在Hilbert曲线上的编号值。因为它最多可以用56位二进制位来表示数据，因此它能表示的数值大小为0至2⁵⁶-1。Bits 7 to n: data bits, where 8≤n≤63. The value expressed by these data bits is the number value of the CellId on the Hilbert curve. Because it can represent data with up to 56 binary bits, it can represent values of size 0 to 2 ⁵⁶ -1.

第n至63位：截止位。截止位紧跟着数据位之后，并且截止位的开头第一个数字为1，剩余位数用0补足，用于表示CellId的数据位的结束。Bits n to 63: cutoff bits. The cutoff bit follows the data bit, and the first digit at the beginning of the cutoff bit is 1, and the remaining digits are filled with 0 to indicate the end of the data bit of CellId.

更具体的说，在本实施例中，索引值的分配与更新，其包括：More specifically, in this embodiment, the allocation and update of index values include:

对于初始的一阶Hilbert曲线而言，它将平面分为4个区域，因此只需要第7至8位数据位便可以给这4个区域分别分配00、01、10和11这四个编号，此时第9至63位均为截止位，相关索引值的编号情况如图7a所示。For the initial first-order Hilbert curve, it divides the plane into 4 regions, so only the 7th to 8th data bits are needed to assign the four numbers 00, 01, 10, and 11 to these 4 regions, respectively. At this time, the 9th to 63rd bits are all cut-off bits, and the numbering of the relevant index values is shown in Figure 7a.

对这一阶Hilbert曲线进行一次迭代得到二阶Hilbert曲线之后，原先的4个区域被划分成16个区域，那么只需要第7至10位数据位便可以分别给这16个区域分配编号。并且观察图7a可知，一阶曲线中编号为0的区域，在二阶曲线中的编号为0至3，在三阶曲线中的编号为0至15。因此假设用两位数据位去表示一阶Hilbert曲线中编号为0至3的区域，那么它们的数值分别为00、01、10、11。单拎出编号为00的区域进行观察，那么这个区域在二阶Hilbert曲线上的编号0至3就可以用4位数据位去表示，分别为0000、0001、0010、0011，相关索引值的编号情况如图7b所示。After one iteration of this order Hilbert curve to obtain the second order Hilbert curve, the original 4 regions are divided into 16 regions, then only the 7th to 10th data bits are needed to assign numbers to these 16 regions respectively. 7a, it can be seen that the regions numbered 0 in the first-order curve are numbered 0 to 3 in the second-order curve, and the regions numbered 0 to 15 in the third-order curve. Therefore, it is assumed that two data bits are used to represent the regions numbered 0 to 3 in the first-order Hilbert curve, then their values are 00, 01, 10, and 11, respectively. Simply pick out the area numbered 00 for observation, then the number 0 to 3 of this area on the second-order Hilbert curve can be represented by 4 data bits, respectively 0000, 0001, 0010, 0011, the number of the relevant index value The situation is shown in Figure 7b.

一阶Hilbert曲线为单元Hilbert曲线，顾名思义，它是组成高阶Hilbert曲线的基本单元。二阶Hilbert曲线就是由轴旋转、上下倒置和轴旋转左右倒置这三种单元Hilbert曲线按照一定顺序首尾相连而成的。The first-order Hilbert curve is a unit Hilbert curve, as the name implies, it is the basic unit that composes a higher-order Hilbert curve. The second-order Hilbert curve is formed by connecting the three element Hilbert curves of axis rotation, up and down inversion, and axis rotation left and right inversion in a certain order.

为了保存这四种单元Hilbert曲线的信息，可以发现每种单元Hilbert曲线的区别就在于它们连接这四个网格单元的顺序不同，因此可以用数组的形式加以保存。In order to save the information of the Hilbert curves of these four elements, it can be found that the difference between the Hilbert curves of each element lies in the order in which they connect the four grid elements, so they can be saved in the form of arrays.

以下分别说明如何用数组形式表示这四种单元Hilbert曲线：The following describes how to represent these four element Hilbert curves in array form:

首先可以如图8a所示设置横纵坐标，那么四个网格单元的坐标分别为，用0代表(0,0)，用1代表(0,1)，用2代表(1,0)，用3代表(1,1)。First, the horizontal and vertical coordinates can be set as shown in Figure 8a, then the coordinates of the four grid cells are, respectively, use 0 to represent (0,0), 1 to represent (0,1), and 2 to represent (1,0), Use 3 to represent (1,1).

标准顺序如图8a所示：按照(0,0)、(0,1)、(1,0)、(1,1)的顺序连接，因此用数组表示标准顺序的单元Hilbert曲线为[0,1,3,2]；The standard order is shown in Figure 8a: connect in the order of (0,0), (0,1), (1,0), (1,1), so the unit Hilbert curve representing the standard order with an array is [0, 1,3,2];

轴旋转如图8b所示：按照(0,0)、(1,0)、(1,1)、(0,1)的顺序，相应的数组为[0,2,3,1]；The axis rotation is shown in Figure 8b: in the order of (0,0), (1,0), (1,1), (0,1), the corresponding array is [0,2,3,1];

上下倒置如图8c所示：按照(1,1)、(1,0)、(0,0)、(0,1)的顺序，相应的数组为[3,2,0,1]；Upside down as shown in Figure 8c: in the order of (1,1), (1,0), (0,0), (0,1), the corresponding array is [3,2,0,1];

轴旋转左右倒置如图8d所示：按照(1,1)、(0,1)、(0,0)、(1,0)的顺序，相应的数组为[3,1,0,2]。The axis rotation is inverted left and right as shown in Figure 8d: in the order of (1,1), (0,1), (0,0), (1,0), the corresponding array is [3,1,0,2] .

Hilbert曲线的迭代有其固定算法，因此若初始的一阶Hilbert曲线形状确定了，那么之后所有阶数的Hilbert曲线形状均确定下来了。所以在迭代过程中，会同时保存下一次迭代结束后的Hilbert曲线形状，图8a～图8d中每个格子的右下角展示了下一次迭代时该格子内的Hilbert曲线形状。The iteration of the Hilbert curve has its fixed algorithm, so if the shape of the initial first-order Hilbert curve is determined, then the shape of the Hilbert curve of all subsequent orders is determined. Therefore, during the iteration process, the shape of the Hilbert curve after the next iteration is saved at the same time. The lower right corner of each grid in Figures 8a to 8d shows the shape of the Hilbert curve in the grid at the next iteration.

在存储了原始形状信息后，便可以用00、01、10和11这四个数来分别表示图8a～图8d的四种单元Hilbert曲线。接下来介绍如何在一阶Hilbert曲线的基础上，借助已知的单元Hilbert曲线信息生成二阶Hilbert曲线。After the original shape information is stored, four numbers of 00, 01, 10 and 11 can be used to represent the four types of unit Hilbert curves in FIGS. 8 a to 8 d, respectively. Next, we will introduce how to generate a second-order Hilbert curve with the help of known element Hilbert curve information based on the first-order Hilbert curve.

首先构造一个一行四列的辅助矩阵[1,0,0,3]，以二进制数进行表示则为[01,00,00,11]。First construct an auxiliary matrix [1,0,0,3] with one row and four columns, which is [01,00,00,11] in binary numbers.

以图8c为例，已知该一阶Hilbert曲线是上下倒置类型的单元Hilbert曲线，可以用10表示。将10与辅助矩阵中的每个元素进行异或运算可以得到：Taking Fig. 8c as an example, it is known that the first-order Hilbert curve is an upside-down type element Hilbert curve, which can be represented by 10. XORing 10 with each element in the auxiliary matrix gives:

[01,00,00,11]^10＝[11,10,10,01][01,00,00,11]^10=[11,10,10,01]

上文提到上下倒置类型的单元Hilbert曲线用数组表示为[3,2,0,1]，即按照(1,1)、(1,0)、(0,0)、(0,1)的顺序，所以将这两个一行四列的矩阵每一项进行匹配，可知：The unit Hilbert curve of the upside-down type mentioned above is represented by an array as [3,2,0,1], that is, according to (1,1), (1,0), (0,0), (0,1) order, so match each item of these two matrices with one row and four columns, we can see:

(1,1)格内用编号为11，即形如图8d的单元Hilbert曲线进行填充；(1,1) The cell numbered 11 is filled with the Hilbert curve of the cell in the shape of Figure 8d;

(1,0)格内用编号为10，即形如图8c的单元Hilbert曲线进行填充；The cell (1,0) is filled with the cell numbered 10, that is, the Hilbert curve in the shape of Figure 8c;

(0,0)格内用编号为10，即形如图8c的单元Hilbert曲线进行填充；The (0,0) cell is filled with the cell number 10, that is, the cell Hilbert curve as shown in Figure 8c;

(0,1)格内用编号为01，即形如图8b的单元Hilbert曲线进行填充。The (0,1) cell is filled with the cell Hilbert curve numbered 01, which is shaped like Figure 8b.

二阶Hilbert曲线迭代至三阶Hilbert曲线的过程也同理。The process of iterating the second-order Hilbert curve to the third-order Hilbert curve is also the same.

步骤S3、若某个子区域内的区域划分已结束并即将开始对另一个区域进行划分，则以当前划分完成的子区域内的所有索引值为键值，索引值所对应的区域内所有多边形信息为数据，构建一棵B+树；如果已经存在一棵B+树，那么将索引值插入该树并调整结构，保持B+树的特性。Step S3, if the area division in a certain sub-area has ended and is about to start dividing another area, then all the index values in the sub-area that are currently divided are the key values, and all the polygon information in the area corresponding to the index values. For the data, build a B+ tree; if there is already a B+ tree, insert the index value into the tree and adjust the structure to maintain the characteristics of the B+ tree.

具体的说，在本实施例中，该步骤S3为执行基于B+树的索引值的存储，在该步骤中，包括：经过上述操作，海量多边形按照Hilbert曲线的顺序，以CellId近似包含它们的位置信息，构建起了抽象化的版图数据与CellId间的索引关系。这里要进行索引值的存储工作。Specifically, in this embodiment, the step S3 is to perform the storage of the index value based on the B+ tree. In this step, it includes: after the above operations, the massive polygons are in the order of the Hilbert curve and approximate their positions with CellId. information to build an index relationship between abstract layout data and CellId. Here, we need to store the index value.

假设目前均在第0图层，以CellId分别对一阶与二阶Hilbert曲线中的每个网格单元进行编号，由于CellId位数过长，以下截取前15位CellId用于对比：Assuming that it is currently on the 0th layer, each grid cell in the first-order and second-order Hilbert curves is numbered with CellId. Since the number of CellId digits is too long, the first 15 CellIds are intercepted for comparison:

一阶Hilbert曲线中编号为1的区域其对应的CellId为：The corresponding CellId of the region numbered 1 in the first-order Hilbert curve is:

000000000100000000000000100000

编号为2的区域其对应的CellId为：The corresponding CellId of the area numbered 2 is:

000000001100000000000001100000

该区域被二阶Hilbert曲线划分为了四块，编号值为5至8，这四块区域其对应的CellId，按照编号值由小到大的顺序排列，则分别为：The area is divided into four blocks by the second-order Hilbert curve, and the numbered values are from 5 to 8. The corresponding CellIds of these four areas are arranged in the order from small to large, respectively:

000000001001000…000000001001000…

000000001011000…000000001011000…

000000001101000…000000001101000…

000000001111000…000000001111000…

可以看出在一阶Hilbert曲线中，编号为1的区域其CellId值小于编号为2的区域。因此倘若后续采用四叉树类似的划分方法，即根据多边形的疏密程度，对局部区域例如编号为1的区域不断划分，那么该区域内的所有后续划分所构建的CellId值均小于编号区域2内的CellId。It can be seen that in the first-order Hilbert curve, the CellId value of the region numbered 1 is smaller than that of the region numbered 2. Therefore, if a division method similar to a quadtree is used in the future, that is, according to the density of the polygon, the local area, such as the area numbered 1, is continuously divided, then the CellId value constructed by all subsequent divisions in this area is smaller than the numbered area 2. CellId inside.

因此如果按照Hilbert曲线遍历网格单元的顺序去构建索引关系，那么最终输出的一系列CellId是严格按照自小到大的顺序进行排列的。所以对于线性数据，自然而然地想到可以采用以B+树的形式存储这些CellId，即如图9所示。Therefore, if the index relationship is constructed according to the order in which the grid cells are traversed by the Hilbert curve, the final output of a series of CellIds is arranged in strict order from small to large. Therefore, for linear data, it is natural to think that these CellIds can be stored in the form of a B+ tree, as shown in FIG. 9 .

步骤S4、当版图数据发生了变化，可进行索引结构的增量更新。在B+树中查询改动后相关区域内的多边形数目是否超过阈值。若没有超过，则更新B+树中相应结点内的多边形信息，若超过了，则对该区域进行一次四等分，并在B+树中对相关结点进行调整。Step S4, when the layout data changes, incremental update of the index structure can be performed. Query the B+ tree whether the number of polygons in the relevant area exceeds the threshold after the modification. If it does not exceed, update the polygon information in the corresponding node in the B+ tree. If it exceeds, the area is divided into quarters, and the relevant nodes are adjusted in the B+ tree.

具体的说，在本实施例中，该步骤S4为执行版图数据的增量更新，具体的说，在该步骤中，当索引结构已经建立完毕后，如果对版图数据进行增添、删除、修改等操作，那么应当在已经构建好的索引结构的基础上，根据版图数据的改动而变动索引结构。这样既能提升新索引结构构建的效率，也能减少系统的开销。在CellId这种数据形式作为索引值的支持下，本实施例能进行局部索引结构的调整。Specifically, in this embodiment, step S4 is to perform incremental update of the layout data. Specifically, in this step, after the index structure has been established, if the layout data is added, deleted, modified, etc. operation, then the index structure should be changed according to the changes of the layout data on the basis of the already constructed index structure. This can not only improve the efficiency of building the new index structure, but also reduce the overhead of the system. With the support of the data form of CellId as the index value, this embodiment can adjust the local index structure.

更具体的说，本实施例的增量化更新的流程图如图10所示，其包括：More specifically, the flowchart of incremental update in this embodiment is shown in FIG. 10 , which includes:

步骤S403、判断已有索引中是否存在与新CellId相关的值，若存在，则将新旧CellId以及相关多边形信息进行合并；若不存在，则将多边形信息输入CellId，并将CellId插入B+树中；Step S403, judging whether there is a value related to the new CellId in the existing index, if so, merge the old and new CellId and related polygon information; if not, input the polygon information into CellId, and insert CellId into the B+ tree;

更具体的说，针对上述的流程，其最关键的步骤在于将新CellId插入旧B+树的过程。因为B+树中的CellId是严格按照由小到大的顺序存储的。尽管由于每个局部区域划分的深度不同，导致相对应的CellId数据位的长度不同，但是所有的CellId相应的区域是不会出现重合情况。例如在图7a中中，B+树不会同时存储编号为00的区域与0010区域，并且00区域对应的CellId与0000结点对应的CellId无法共存。More specifically, for the above process, the most critical step is the process of inserting the new CellId into the old B+ tree. Because the CellIds in the B+ tree are stored strictly in ascending order. Although the lengths of the corresponding CellId data bits are different due to the different depths of division of each local area, all areas corresponding to the CellId will not overlap. For example, in FIG. 7a, the B+ tree does not store the area numbered 00 and the area 0010 at the same time, and the CellId corresponding to the 00 area and the CellId corresponding to the 0000 node cannot coexist.

因此为了保证预插入的CellId相应的区域不会与B+树中所有CellId相应的区域重合，需要将新CellId转换为区间形式，搜索B+树中是否存在与该区间重合的CellId。例如在图7a中中，对00区域进行任意深度的划分，每划分一次CellId的数据位便向后扩展两位，已有的数据位并不会发生变化。所以得到的CellId的值均位于[00000000010…,00000000110…]这个区间内。Therefore, in order to ensure that the region corresponding to the pre-inserted CellId does not overlap with the regions corresponding to all CellIds in the B+ tree, it is necessary to convert the new CellId into an interval format, and search the B+ tree for any CellId that overlaps with the interval. For example, in FIG. 7a, the 00 area is divided into arbitrary depths, and the data bits of CellId are expanded by two bits each time the data bits are divided, and the existing data bits do not change. Therefore, the obtained CellId values are all located in the interval [00000000010...,00000000110...].

最终会有三种可能的情形：There are ultimately three possible scenarios:

(1)不存在与新CellId的网格单元相交的CellId。此时仅需把新CellId插入B+树相应的位置。为防止插入之后该结点处的key值数量超出预设值，还需对B+树进行微调；(1) There is no CellId that intersects the grid cell of the new CellId. At this time, it is only necessary to insert the new CellId into the corresponding position of the B+ tree. In order to prevent the number of key values at the node from exceeding the preset value after insertion, it is necessary to fine-tune the B+ tree;

(2)存在CellId，其相应网格单元完全覆盖了新CellId的网格单元。将新CellId中的多边形添加进原B+树中的CellId，重复图10中的流程；(2) There is a CellId whose corresponding grid unit completely covers the grid unit of the new CellId. Add the polygon in the new CellId to the CellId in the original B+ tree, and repeat the process in Figure 10;

(3)存在CellId，其相应网格单元完全被新CellId的网格单元所覆盖。对新CellId的网格单元进行一次划分，重复图10中的流程。(3) There is a CellId whose corresponding grid unit is completely covered by the grid unit of the new CellId. The grid cells of the new CellId are divided once, and the process in FIG. 10 is repeated.

步骤S5、当需要查询指定坐标处的版图数据时，可通过指定的坐标值计算出索引值，从而在B+树中定位到包含该坐标的区域所对应的结点，通过遍历结点内每个多边形并判断与指定坐标的位置关系，获知相应的版图数据信息。Step S5, when it is necessary to query the layout data at the specified coordinates, the index value can be calculated by the specified coordinate value, so as to locate the node corresponding to the area containing the coordinates in the B+ tree, and by traversing each node in the node. Polygon and determine the positional relationship with the specified coordinates, and obtain the corresponding layout data information.

具体的说，在本实施例中，该步骤S5为执行版图数据的点查找，点查找是指给定版图某层上的某二维坐标，查找出包含了该坐标点的所有抽象化后的多边形所对应的版图数据，相关流程如图11所示，其包括：Specifically, in this embodiment, step S5 is to perform point search of layout data. Point search refers to a given two-dimensional coordinate on a certain layer of the layout, and all abstracted coordinates including the coordinate point are found. The layout data corresponding to the polygon, the related process is shown in Figure 11, which includes:

步骤S503、输入CellId给B+树；Step S503, input CellId to B+ tree;

步骤S507、输出查找的结果。Step S507, output the search result.

更具体的说，在前端送来点查找需求之后，可以将该点近似的看做是一个极小的矩形，从而为其生成一个恰好能够覆盖其面积的最小子区域所对应的CellId。More specifically, after the front end sends a point search request, the point can be approximately regarded as a very small rectangle, so as to generate a CellId corresponding to the smallest sub-region that can just cover its area.

上述部分已经提出本实施提供的索引方法所形成的索引结构其是B+树形式的，每个结点处的索引值都为CellId，因此可以将所要查询的坐标点通过放缩的方法，放大至该CellId所对应的面积，由于CellId的数据位长度限制，因此这种放缩的误差不会很大，反而能借助放缩来提升查找的效率。然后在B+树中查找与CellId的相关结点，取出结点内的所有多边形信息，与原待查询的二维坐标点判断相互的位置关系，得到相应的结果。The above part has already proposed that the index structure formed by the index method provided by this implementation is in the form of a B+ tree, and the index value at each node is CellId, so the coordinate point to be queried can be enlarged to For the area corresponding to the CellId, due to the limitation of the data bit length of the CellId, the error of this scaling is not very large, but the search efficiency can be improved by means of scaling. Then search for the relevant node with CellId in the B+ tree, take out all polygon information in the node, and determine the mutual positional relationship with the original two-dimensional coordinate point to be queried, and obtain the corresponding result.

综上所述，本发明提供的一种应用于大规模版图数据的索引方法，其使用了一种由标志位、数据位和截止位组合而成的索引值结构，在区域划分的过程中可随着递归的深入，通过拓宽数据位的长度并减少截止位的长度的方式，自顶向下地不断更新索引值，提升索引构建的效率；同时基于该索引值结构可以在不同区域根据模块的疏密程度采用不同的区域划分深度，提升了索引结构的灵活性。To sum up, the present invention provides an indexing method applied to large-scale layout data, which uses an index value structure composed of flag bits, data bits and cutoff bits, and can be used in the process of region division. With the deepening of recursion, by expanding the length of data bits and reducing the length of cut-off bits, the index value is continuously updated from top to bottom to improve the efficiency of index construction. The density level adopts different area division depths, which improves the flexibility of the index structure.

本发明在基于该索引方法建立索引结构之后，每当版图数据发生了增加、删除和修改等变化后，可以在原有索引结构的基础上，仅对发生变化的局部区域所对应的结点进行更新，从而实现版图数据的增量更新，大大节省了系统开销。After the present invention establishes an index structure based on the index method, whenever the layout data is changed, such as addition, deletion and modification, only the nodes corresponding to the changed local area can be updated on the basis of the original index structure. , so as to realize the incremental update of the layout data, which greatly saves the system overhead.

本发明未详述之处，均为本领域技术人员的公知技术。The parts that are not described in detail in the present invention are known techniques of those skilled in the art.

以上详细描述了本发明的较佳具体实施例。应当理解，本领域的普通技术人员无需创造性劳动就可以根据本发明的构思作出诸多修改和变化。因此，凡本技术领域中技术人员依本发明的构思在现有技术的基础上通过逻辑分析、推理或者有限的实验可以得到的技术方案，皆应在由权利要求书所确定的保护范围内。The preferred embodiments of the present invention have been described in detail above. It should be understood that those skilled in the art can make many modifications and changes according to the concept of the present invention without creative efforts. Therefore, any technical solutions that can be obtained by those skilled in the art through logical analysis, reasoning or limited experiments on the basis of the prior art according to the concept of the present invention shall fall within the protection scope determined by the claims.

Claims

1. an indexing method applied to large-scale layout data, is characterized in that, this method comprises:

Step S1, read the layout data of the integrated circuit, and analyze the stream data according to the data format; extract the vertex coordinates of each module, abstract each module into a polygon on a two-dimensional plane, and analyze the data according to the layout of the original module. Layers layer these polygons, so that the modules located in the same layer of the layout, the corresponding abstracted polygons are still in the same layer;

Step S2, for the multi-layer two-dimensional plane after the abstraction process in step S1, take the layer as a unit, independently use the recursive division method for each layer of two-dimensional plane to quarterly divide the area, according to the internal The number of polygons determines whether the sub-area needs to continue to be divided until the number of polygons in each sub-area does not exceed the preset threshold; wherein, while dividing the area, each sub-area is traversed in the order of the Hilbert curve, and at the same time Allocate a unique index value, named CellId, to establish the mapping relationship between the abstracted polygon and the index value;

Step S3, if the area division in a certain sub-area has ended and is about to start dividing another area, then all the index values in the sub-area that are currently divided are the key values, and all the polygon information in the area corresponding to the index values. For the data, build a B+ tree; if there is already a B+ tree, insert the index value into the tree and adjust the structure to maintain the characteristics of the B+ tree;

Step S4, when the layout data changes, perform incremental update of the index structure, which includes:

For the B+ tree constructed in step S3, query whether the number of polygons in the relevant area after the modification exceeds the threshold;

If it does not exceed, update the polygon information in the corresponding node in the B+ tree. If it exceeds, the area is divided into quarters, and the relevant nodes are adjusted in the B+ tree;

Step S5, when it is necessary to query the layout data at the specified coordinates, calculate the index value through the specified coordinate value, so as to locate the node corresponding to the area containing the coordinates in the B+ tree, and traverse each polygon in the node by traversing each polygon. And determine the positional relationship with the specified coordinates, and obtain the corresponding layout data information.

2. The indexing method applied to large-scale layout data according to claim 1, wherein, in the step S1, the layout data of the read-in integrated circuit specifically comprises:

Step S101, input the data stream of the GDSII file;

Step S102, for this data stream, intercept its 16-bit binary number successively to obtain the current module length and the current module type;

Step S103, judging whether the obtained module is the required module, if not, then perform the operation of moving the pointer backward, and skip the current module; if it is required, then perform step S104;

Step S104, translate and record the polygon information in the current module, while moving the pointer backwards according to the known length;

Step S105 , judging whether the data stream has ended, if not, go back to step S102 ; if it has ended, output the set of polygon information.

3. An indexing method applied to large-scale layout data according to claim 2, characterized in that, in the step S105, the polygon information, which includes the vertex information of each polygon, also includes The layer number where each polygon is located, used to distinguish polygons by layer number.

4. a kind of indexing method applied to large-scale layout data according to claim 3, is characterized in that, in described step S2, adopts following method to carry out the quartering of the area:

A Hilbert curve is adopted, which is a fractal curve, and a low-order Hilbert curve is extended to a high-order Hilbert curve through iteration. For the lowest-order Hilbert curve, the center points of the four square grid cells are connected in series to form the Hilbert curve. , the four square grid cells form a square plane;

When a Hilbert curve of a certain order is iterated to a higher-order Hilbert curve, a recursive division method similar to the quadtree is adopted; that is, each time on the basis of the previous one, according to the density of the graph, the original grid is divided One or more of the units are divided again, each grid unit is equally divided into four smaller grid units, and the dividing is stopped until the number of polygons in each grid unit is lower than a preset threshold.

5. A kind of indexing method applied to large-scale layout data according to claim 4, is characterized in that, in described step S4, described updating the polygon information in the corresponding node in B+ tree, it specifically comprises:

Step S401, for the polygon information that has changed, determine whether it needs to be divided, if not, then divide the area according to the order of the Hilbert curve; if necessary, execute step S402;

Step S402, generating a corresponding CellId in the new area, where the CellId stores relevant polygon information;

Step S403, judging whether there is a value related to the new CellId in the existing index, if so, merge the old and new CellId and related polygon information; if not, input the polygon information into CellId, and insert CellId into the B+ tree;

Step S404, outputting the updated index structure.

6. A kind of indexing method applied to large-scale layout data according to claim 5, is characterized in that, in described step S5, described in the B+ tree locating the node corresponding to the area containing this coordinate , which specifically includes:

Step S501, input two-dimensional point coordinates;

Step S502, calculating the minimum CellId that just covers the point;

Step S503, input CellId to B+ tree;

Step S504, find out the node covering the CellId;

Step S505, traverse all polygon information in the node;

Step S506, execute judgment, judge whether there is a polygon containing the coordinate point, if so, summarize relevant polygon information, if not, there is no layout data at the coordinate point;

Step S507, output the search result.