CN100440859C

CN100440859C - A Recursive Flow Classification Method and System Based on Bitmap Aggregation

Info

Publication number: CN100440859C
Application number: CNB2005100749196A
Authority: CN
Inventors: 都珂; 王军
Original assignee: ZTE Corp
Current assignee: Nanjing Creates A Code Science And Technology LLC
Priority date: 2005-06-06
Filing date: 2005-06-06
Publication date: 2008-12-03
Anticipated expiration: 2025-06-06
Also published as: CN1878123A

Abstract

The invention discloses a recursive stream classification method and system for bitmap aggregation. Firstly, a value is taken out from a value range of a rule field, and a set of rules is traversed to obtain a bitmap BM; the aggregate bit part and details are obtained by aggregating it. The bitmap CBM of the aggregated form of the bit part; if the CBM does not exist in the bitmap set to which the corresponding index block belongs, generate a new EqID and add it to the bitmap set, otherwise use the same EqID; create the EqID of the CBM in the index block The mapping relationship with the value of the rule field; in the same way, construct the index block of each value within the value range of each rule field, and obtain the 0th level index table; according to the recursive structure, recursively recursively follow the upper level index block , to get the next level of index block until recursively becoming an index block; after the classification system receives the data packet, it matches it to the rule with the highest priority according to the index table. The invention can reduce storage space overhead and accelerate bitmap generation and search speed.

Description

A Recursive Flow Classification Method and System Based on Bitmap Aggregation

技术领域 technical field

本发明涉及一种基于位图的递推流分类方法和系统，尤其涉及数据通讯领域中根据数据包上多个字段，将符合预定规则的特定的数据包从其它数据包中区别出来的方法和系统。The present invention relates to a bitmap-based recursive flow classification method and system, in particular to a method and system for distinguishing a specific data packet conforming to predetermined rules from other data packets according to multiple fields on the data packet in the field of data communication system.

背景技术 Background technique

IP网络进入了“IP电信网”的发展阶段，由此也带来了许多技术上新的挑战，许多新业务，诸如：包过滤、QoS、策略路由、NAT、流计费等，传统的基于目的地址路由匹配的数据包转发过程已经不再满足网络运营和管理的要求，基于整个IP头，甚至传输层头部的IP包(流)分类转发技术成为这些新业务的核心技术之一，其中快速流分类系统是分类转发技术的基础和关键。IP network has entered the development stage of "IP telecommunication network", which also brings many new technical challenges, many new services, such as: packet filtering, QoS, policy routing, NAT, flow accounting, etc., traditional based The data packet forwarding process of destination address routing matching no longer meets the requirements of network operation and management. The IP packet (flow) classification and forwarding technology based on the entire IP header and even the transport layer header has become one of the core technologies of these new services. The fast flow classification system is the basis and key of classification and forwarding technology.

快速流分类系统是数据通讯设备数据包处理流程的一部分，快速流分类系统有一个可配置的规则集，即若干条有优先级的规则的集合，每条规则描述一簇数据包头部的若干特征条件，快速流分类系统就是要在数据包到来时，从数据包满足其要求的若干规则中找出优先级最高的规则来。The fast flow classification system is a part of the data packet processing flow of the data communication equipment. The fast flow classification system has a configurable rule set, that is, a collection of several rules with priority. Each rule describes several characteristics of a cluster of data packet headers. condition, the fast flow classification system is to find out the rule with the highest priority from several rules that the data packet meets its requirements when the data packet arrives.

包分类问题可形式化定义为：一个有N个规则的分类器{Rk}，其中k＝1，2，...，N；Rk包括2个实体：The packet classification problem can be formally defined as: a classifier {Rk} with N rules, where k=1, 2, ..., N; Rk includes 2 entities:

1、一个d元的范围[l1k，r1k]、[l2k，r2k]、......[ldk，rdk]。1. A d-element range [l1k, r1k], [l2k, r2k], ... [ldk, rdk].

2、一个指定分类器中优先权的数字Pri(Rk)；2. A number Pri(Rk) specifying the priority in the classifier;

对一个到达分类器的被一d元组(P1，P2，...，Pd)指定相应域的数据包P，d维分类器问题就是找一个特定的值k，称为k^*，满足：For a data packet P that arrives at the classifier and is assigned a corresponding domain by a d-tuple (P1, P2, ..., Pd), the problem of a d-dimensional classifier is to find a specific value k, called k ^* , that satisfies:

对任意k≠k^*，pri(Rk^*)＞Pri(Rk)；且对任意i：1≤i≤d，有lik≤Pi≤rik流分类器不可能等到数据包到达后再一条一条地匹配，以找到优先级最高的规则。快速流分类器通常分为两个部分，一部分是控制层面根据分类规则集构造分类索引表的预处理模块；另一部分是数据层面将实际数据包通过分类索引表匹配到优先权最高规则的索引模块，可参照图1。For any k≠k ^* , pri(Rk ^* )＞Pri(Rk); and for any i: 1≤i≤d, lik≤Pi≤rik flow classifier cannot wait until the data packets arrive and then match them one by one , to find the highest priority rule. The fast flow classifier is usually divided into two parts, one part is the preprocessing module that constructs the classification index table according to the classification rule set at the control plane; the other part is the index module that matches the actual data packets to the highest priority rule through the classification index table at the data plane , can refer to Figure 1.

1999～2000年间，流分类技术有过一个研究的高潮，取得了一些成果，涌现了基于Trie算法流分类系统、递推流分类(RFC，Recursive FlowClassification)系统、基于层次化空间分割(Hi-Cuts)算法的流分类系统、基于位向量(BV)算法的流分类系统等快速流分类系统，这些系统各有优点，也各有缺点，有些分类系统分类速度随着分类规则集中规则数的增加而下降；有些系统需要配置很大的内存，系统成本较高；有些分类系统规则变化响应时间较长。这些算法都只能满足了数百条分类规则的应用场合的需求，其中递推流分类系统因为分类速度较快，且与分类规则数无关，是应用最为广泛的流分类系统。From 1999 to 2000, there was a climax of research on flow classification technology, and some achievements were made. A flow classification system based on Trie algorithm, a recursive flow classification (RFC, Recursive Flow Classification) system, and a hierarchical space segmentation (Hi-Cuts) system emerged. ) algorithm-based flow classification system, bit-vector (BV) algorithm-based flow classification system and other fast flow classification systems, these systems have their own advantages and disadvantages, and the classification speed of some classification systems increases with the increase of the number of rules in the classification rule set. Decrease; some systems need to configure a large amount of memory, and the system cost is high; some classification systems have a long response time for rule changes. These algorithms can only meet the application requirements of hundreds of classification rules. Among them, the recursive traffic classification system is the most widely used traffic classification system because of its fast classification speed and independent of the number of classification rules.

递推流分类系统是1999年由Pankaj Gupta和Nick McKeown提出的一个在多字段上进行包分类的流分类系统，这个系统的核心包括：1)一个索引表的结构；2)一个由规则集构造索引表的预处理算法单元；3)一个索引算法单元。递推流分类系统没有增量式更新算法，一旦规则集发生变化，需要重新执行预处理算法单元。The recursive flow classification system is a multi-field packet classification system proposed by Pankaj Gupta and Nick McKeown in 1999. The core of this system includes: 1) a structure of an index table; 2) a structure constructed by a rule set A preprocessing algorithm unit of the index table; 3) an index algorithm unit. The recursive flow classification system does not have an incremental update algorithm. Once the rule set changes, the preprocessing algorithm unit needs to be re-executed.

其中，索引表结构如图2所示，索引表分为若干级(phase)，每一级又有若干个索引块(Chunk)，分级的结构就是采用的递推结构，每一级索引表包含两个字段，即索引值和对应的eqID，第一级的索引值是数据包的字段值，后面每一级的索引值是递推结构中前面两个或几个表的索引结果(即eqID的加权和)。在第Phase 0索引表中，规则的每个字段对应一个索引块；而Phase J(J＞0)索引表中的每一个Chunk是由上一级索引表的若干个Chunk合并得到的。每一索引块的索引结果称为等价类标识(eqId，equal-class identification)，所谓等价类，就是m元子空间(m≤d)中匹配到相同规则的子集(不考虑优先权)。位图用于表示等价类和规则集中规则的匹配情况，位图是一个N位的数据，其中N是规则集中的规则数，如果等价类与第i条规则匹配，则位图的第i位为“1”，否则为“0”。这样，等价类索引值就可以通过索引块中对应的等价类标识映射到位图。Among them, the structure of the index table is shown in Figure 2. The index table is divided into several stages (phases), and each stage has several index blocks (Chunks). The hierarchical structure is the recursive structure adopted, and each stage of the index table contains Two fields, that is, the index value and the corresponding eqID. The index value of the first level is the field value of the data packet, and the index value of each subsequent level is the index result of the first two or several tables in the recursive structure (ie eqID weighted sum). In the Phase 0 index table, each field of the rule corresponds to an index block; while each Chunk in the Phase J (J>0) index table is obtained by merging several Chunks in the upper-level index table. The index result of each index block is called the equivalence class identification (eqId, equal-class identification). The so-called equivalence class is the subset that matches the same rule in the m-element subspace (m≤d) (regardless of priority ). The bitmap is used to indicate the matching between the equivalence class and the rules in the rule set. The bitmap is an N-bit data, where N is the number of rules in the rule set. If the equivalence class matches the i-th rule, the bitmap’s The i bit is "1", otherwise it is "0". In this way, the equivalence class index value can be mapped to a bitmap through the corresponding equivalence class identifier in the index block.

索引算法单元收到一个数据包，由数据包(packet)对应位置的每一个字段，查找一个phase0索引表，由若干上一级的索引结果的运算值(如，加权和)索引下一级表，直到得到最后一级索引的结果，该结果就是packet在d元空间上等价类的标识(eqId)，也就匹配到了一个位图，这个位图表明了这个等价类和哪些规则匹配，从中可以找出优先权最高的规则。值得说明的是，由于每个eqId对应的位图在预处理阶段就确定了，对应的优先权最高的规则也是确定的，大多数分类系统在预处理阶段就把eqId和对应的优先权最高的规则关联起来，从而直接得到数据包所符合的优先级最高的规则，索引算法单元不需要真的处理位图。The index algorithm unit receives a data packet, searches a phase0 index table for each field corresponding to the position of the data packet (packet), and indexes the lower-level table by the calculated values (eg, weighted sum) of several upper-level index results , until the result of the last level index is obtained, the result is the identity (eqId) of the equivalence class of the packet in the d-element space, and a bitmap is matched. This bitmap indicates which rules the equivalence class matches. From this you can find the rule with the highest priority. It is worth noting that since the bitmap corresponding to each eqId is determined in the preprocessing stage, and the corresponding rule with the highest priority is also determined, most classification systems combine the eqId and the corresponding highest priority rule in the preprocessing stage The rules are associated, so as to directly obtain the highest priority rule that the data packet conforms to, and the index algorithm unit does not need to actually process the bitmap.

近年来分类规则数有爆炸增长的趋势。递推流分类系统与其它流分类系统一样暴露出了固有的弱点，一个具有图2结构的递推流分类系统，当规则集中的规则数达到2000条扩展访问列表规则时，等价类规则匹配位图(CBM，Equal Class Bitmap)表需要26M以上的内存，且在采用了Pentium1.6GHz CPU的系统中，等价类规则匹配位图响应规则变化的时间达2000秒。动辄数十兆的内存开销和数百秒的规则变化响应时间是妨碍超过2000条规则的快速流分类技术进入实用的关键因素。In recent years, the number of classification rules has exploded. The recursive flow classification system has exposed inherent weaknesses like other flow classification systems. For a recursive flow classification system with the structure shown in Figure 2, when the number of rules in the rule set reaches 2000 extended access list rules, the equivalence class rules match The bitmap (CBM, Equal Class Bitmap) table requires more than 26M of memory, and in a system using a Pentium1.6GHz CPU, it takes 2000 seconds for the equivalence class rule to match the bitmap response rule change. Tens of megabytes of memory overhead and hundreds of seconds of rule change response time are the key factors that prevent the rapid flow classification technology with more than 2,000 rules from being practical.

发明内容 Contents of the invention

本发明要解决的技术问题是提出一种位图聚合的递推流分类方法，可以减少等价类规则匹配位图表的存储空间开销，加快等价类规则匹配位图生成和查找的操作速度。本发明还要提供一种可实现上述方法的系统。The technical problem to be solved by the present invention is to propose a recursive stream classification method for bitmap aggregation, which can reduce the storage space overhead of the equivalence class rule matching bitmap and accelerate the operation speed of generating and searching the equivalence class rule matching bitmap. The present invention also provides a system capable of realizing the above method.

为了解决上述技术问题，本发明提供了一种位图聚合的递推流分类方法，应用于递推流分类系统，包括以下步骤：In order to solve the above-mentioned technical problems, the present invention provides a recursive flow classification method of bitmap aggregation, which is applied to a recursive flow classification system, comprising the following steps:

(a)首先构造一个初始位图，使其每一位对应于规则集合的一条规则，然后从一个规则字段取值范围内取出一个值，遍历规则集合，若该值满足当前规则，将位图中该规则对应的位设置为1，否则设置为0，得到一个位图BM；(a) First construct an initial bitmap, so that each bit corresponds to a rule of the rule set, then take a value from the value range of a rule field, traverse the rule set, and if the value satisfies the current rule, the bitmap The bit corresponding to the rule in is set to 1, otherwise it is set to 0, and a bitmap BM is obtained;

(b)对位图BM进行聚合度为q的聚合，得到包括定长的聚合位部分和变长的细节位部分的聚合形式位图CBM，其聚合位部分的第i位对应于位图BM的第i个q元组，该q元组全为0时该位设置为0，否则该位设置为1，其细节位部分由位图BM去掉全为0的q元组后构成；(b) Aggregate the bitmap BM with the degree of aggregation q, and obtain the aggregated form bitmap CBM including the fixed-length aggregate bit part and the variable-length detail bit part, and the i-th bit of the aggregate bit part corresponds to the bitmap BM The i-th q-tuple of , when the q-tuple is all 0, this bit is set to 0, otherwise the bit is set to 1, and its detail bit part is formed by removing the q-tuple that is all 0 from the bitmap BM;

(c)比较聚合形式位图CBM和该规则字段对应索引块所属的位图集合中的聚合形式位图，检查是否已有相同的位图，若没有，为位图CBM生成新的等价类标识并将其加入到该位图集合，否则位图CBM使用其相同位图对应的等价类标识；然后，在所述索引块中建立位图CBM的等价类标识和当前规则字段值的映射关系；(c) Compare the aggregated form bitmap CBM with the aggregated form bitmap in the bitmap set to which the rule field corresponds to the index block, check whether the same bitmap exists, if not, generate a new equivalence class for the bitmap CBM Identify and add it to the bitmap collection, otherwise the bitmap CBM uses the equivalence class identification corresponding to the same bitmap; then, establish the equivalence class identification of the bitmap CBM and the current rule field value in the index block Mapping relations;

(d)对该规则字段取值范围内的每一个值，按步骤(a)到(c)的方法处理完成后，该规则字段对应的索引块构造完成，再按同样方法构造出每个规则字段对应的索引块，形成完整的第0级索引表；(d) After each value within the value range of the rule field is processed according to steps (a) to (c), the index block corresponding to the rule field is constructed, and then each rule is constructed in the same way The index block corresponding to the field forms a complete level 0 index table;

(e)按照设定的递推结构，对上一级的索引块进行递推，得到其下一级的索引块，递推时两个位图按其聚合形式进行比较，两个位图的“与”运算以聚合形式位图的“与”运算方法完成，如此逐级递推，直至所有索引块最后递推成为一个索引块；(e) According to the set recursive structure, recurse the index block of the upper level to obtain the index block of the lower level. During the recursion, the two bitmaps are compared according to their aggregation forms. The "AND" operation is completed by the "AND" operation method of the aggregated bitmap, and it is repeated step by step until all index blocks are finally recursively become an index block;

(f)递推流分类系统收到一个数据包后，根据该数据包与流分类相关的每一个字段查找第0级的一个索引表，由若干上一级的索引结果运算得到的索引值来索引下一级表，直到得到最后一级索引结果，然后将该数据包匹配到其满足要求的优先权最高的规则。(f) After the recursive flow classification system receives a data packet, it looks up an index table at level 0 according to each field related to the flow classification of the data packet, and calculates the index value obtained by calculating the index results of several upper levels. Index the lower-level table until the last-level index result is obtained, and then match the packet to the rule with the highest priority that satisfies the requirements.

进一步地，上述方法还可具有以下特点：所述步骤(b)进一步分为以下步骤：Further, the above method can also have the following characteristics: the step (b) is further divided into the following steps:

(b1)按设定的聚合度q将N位的位图BM分为N/q个q位组，第i个q位组对应于位图中第q*i～q*i+q-1位，i＝0，1，......，N/q-1，其对应的聚合形式位图CBM包括N/q位的聚合位部分和变长的细节位部分；(b1) Divide the N-bit bitmap BM into N/q q-bit groups according to the set aggregation degree q, and the i-th q-bit group corresponds to the q*i～q*i+q-1 in the bitmap Bit, i=0, 1,..., N/q-1, its corresponding aggregation form bitmap CBM includes an aggregation bit part of N/q bits and a variable-length detail bit part;

(b2)逐一判断位图BM的第i个q位组，i＝0，1，......，N/q-1，对每个q位组，判断其各个位是否全为0，如果是，将所述聚合形式位图的聚合位部分第i位设置为0，否则将该聚合位部分的第i位设置为1，并将该q位组在位图BM中的对应位追加拷贝到聚合形式位图CBM的细节位部分；(b2) judge the i-th q-bit group of the bitmap BM one by one, i=0, 1, ..., N/q-1, for each q-bit group, judge whether each bit is all 0 , if yes, set the i-th bit of the aggregation bit part of the aggregation form bitmap to 0, otherwise set the i-th bit of the aggregation bit part to 1, and set the corresponding bit of the q bit group in the bitmap BM Append and copy to the detail bit part of the aggregate form bitmap CBM;

(b3)对位图BM的所有q位组处理完成后，其对应的聚合形式位图CBM构造完成。(b3) After the processing of all q-byte groups of the bitmap BM is completed, the construction of the corresponding aggregation form bitmap CBM is completed.

进一步地，上述方法还可具有以下特点：所述聚合形式位图的聚合位部分采用q位边界对齐，不足部分补0。Furthermore, the above-mentioned method may also have the following characteristics: the aggregation bit part of the aggregation form bitmap adopts q-bit boundary alignment, and 0 is filled in the insufficient part.

进一步地，上述方法还可具有以下特点：所述步骤(b3)后还包括步骤：再将所述聚合形式位图细节位部分的变长长度写入到其聚合位部分的前面，作为所述聚合形式位图的变长长度部分。Further, the above-mentioned method can also have the following characteristics: after the step (b3), it also includes the step of writing the variable length of the detailed bit part of the aggregated form bitmap to the front of the aggregated bit part as the The variable length portion of the aggregated form bitmap.

进一步地，上述方法还可具有以下特点：所述细节位部分的变长长度是用细节位部分包含的q位组个数表示的。Further, the above method may also have the following feature: the variable length of the detail bit part is represented by the number of q-byte groups contained in the detail bit part.

进一步地，上述方法还可具有以下特点：所述聚合形式位图的比较算法是：依次比较聚合形式位图CBM1和CBM2的变长长度部分、聚合位部分和细节位部分，只要有一个部分不同，直接返回二个位图不同的结果，结束；如果三个部分都相同，则返回二个位图相同的结果。Further, the above method can also have the following characteristics: the comparison algorithm of the aggregated form bitmap is: sequentially compare the variable length part, the aggregated bit part and the detail bit part of the aggregated form bitmap CBM1 and CBM2, as long as one part is different , directly return the result that the two bitmaps are different, and end; if the three parts are the same, return the result that the two bitmaps are the same.

进一步地，上述方法还可具有以下特点：对两个聚合形式位图CBM1和CBM2进行与运算得到聚合形式的结果位图CBM的方法是：Further, the above method can also have the following characteristics: the method of performing an AND operation on two aggregated form bitmaps CBM1 and CBM2 to obtain the aggregated form result bitmap CBM is:

依次比较CBM1和CBM2的聚合位部分的每一个相同位，每次比较时：Each identical bit of the aggregated bit portion of CBM1 and CBM2 is compared in turn, each time:

如果参与比较的两个位至少有一个为0，则将CBM聚合位部分的相同位设置为0；If at least one of the two bits involved in the comparison is 0, set the same bit of the CBM aggregation bit part to 0;

如果参与比较的两个位均为1，将该两个位在各自位图的细节位部分对应的q位组进行按位“与”操作，得到q位组Q，如果Q全为0，将CBM聚合位部分的相同位设置为0，否则将CBM聚合位部分的相同位设置为1，并将CBM细节位部分与其聚合位部分该位对应的q位组赋值为Q。If the two bits involved in the comparison are both 1, perform a bitwise "AND" operation on the q-bit group corresponding to the two bits in the detail bit part of the respective bitmap to obtain the q-bit group Q. If Q is all 0, the The same bit of the CBM aggregation bit part is set to 0, otherwise the same bit of the CBM aggregation bit part is set to 1, and the q bit group corresponding to the bit of the CBM detail bit part and its aggregation bit part is assigned as Q.

进一步地，上述方法还可具有以下特点：所述步骤(b3)中，还将所述聚合形式位图细节位部分的变长长度写入其聚合位部分的前面，作为该聚合形式位图的变长长度部分；且在进行位图CBM1和CBM2的“与”运算时，在两者聚合位部分的所有位比较和处理完后，还将得到的CBM细节位部分的变长长度写入到其变长长度部分，得到包括三部分的结果位图CBM。Further, the above-mentioned method can also have the following characteristics: in the step (b3), the variable length of the detail bit part of the aggregated form bitmap is also written in front of the aggregated bit part, as the aggregated form bitmap variable length part; and when carrying out the "AND" operation of bitmaps CBM1 and CBM2, after all the bits of the aggregated bit part of the two have been compared and processed, the variable length of the CBM detail bit part obtained will also be written into The variable length part is obtained to obtain the result bitmap CBM comprising three parts.

进一步地，上述方法还可具有以下特点：所述步骤(e)对上一级的两个索引块Chunk 1和Chunk 2递推得到下一级索引块Chunk 3时，对分属两个索引块Chunk 1和Chunk 2的两个等价类标识EqID1和EqID2，执行以下步骤：Further, the above method can also have the following characteristics: when the step (e) recursively obtains the next-level index block Chunk 3 from the two upper-level index blocks Chunk 1 and Chunk 2, the pair belongs to two index blocks The two equivalence classes of Chunk 1 and Chunk 2 identify EqID1 and EqID2, perform the following steps:

(e1)将EqID1和EqID2对应的聚合形式位图进行“与”操作，生成属于下一级索引块Chunk 3的一个聚合形式的位图CBM3；(e1) performing an "AND" operation on the aggregation form bitmaps corresponding to EqID1 and EqID2 to generate an aggregation form bitmap CBM3 belonging to the next-level index block Chunk 3;

(e2)检查聚合形式位图CBM3在Chunk 3所属的位图集合中是否存在，若不存在，为其生成新的等价类标识，并将其加入到Chunk 3所属位图集合中；若存在，则位图CBM3使用与其相同的位图的等价类标识；(e2) Check whether the aggregation form bitmap CBM3 exists in the bitmap set to which Chunk 3 belongs, if not, generate a new equivalence class identifier for it, and add it to the bitmap set to which Chunk 3 belongs; if it exists , the bitmap CBM3 uses the equivalence class identifier of the same bitmap;

(e3)计算EqID1×(Chunk 2的等价类个数)+EqID2，作为Chunk 3中对应于位图CBM3的索引值，然后将位图CBM3的等价类标识填入Chunk3中该索引值对应的位置，形成字段值到该等价类标识的映射关系；(e3) Calculate EqID1×(number of equivalence classes of Chunk 2)+EqID2 as the index value corresponding to bitmap CBM3 in Chunk 3, and then fill in the equivalence class identifier of bitmap CBM3 into the corresponding index value in Chunk3 to form a mapping relationship from the field value to the equivalence class identifier;

对Chunk 1和Chunk 2中的任意两个等价类标识，按上述方法处理完成后，得到最终的递推结果Chunk 3。For any two equivalence class identifiers in Chunk 1 and Chunk 2, after processing according to the above method, the final recursive result Chunk 3 is obtained.

进一步地，上述方法还可具有以下特点：所述聚合度q为8、16、32、64或128。Further, the above method may also have the following characteristics: the degree of polymerization q is 8, 16, 32, 64 or 128.

本发明提供的位图聚合的递推流分类系统包括用于根据分类规则集构造分类索引表的预处理模块和将实际数据包通过分类索引表匹配到优先权最高规则的索引模块，所述预处理模块包含预处理总控单元、第0级索引表构造单元和第J级索引表构造单元，所述索引模块包含索引表和位图存储单元、索引算法单元，其特点是：The recursive flow classification system for bitmap aggregation provided by the present invention includes a preprocessing module for constructing a classification index table according to a classification rule set and an index module for matching actual data packets to the rule with the highest priority through the classification index table. The processing module includes a preprocessing master control unit, a zero-level index table construction unit, and a J-level index table construction unit. The index module includes an index table, a bitmap storage unit, and an index algorithm unit. Its characteristics are:

所述第0级索引表构造单元包含聚合运算子单元和位图比较子单元，在构造各个规则字段对应的索引块时，所述聚合运算子单元用于对未经聚合的位图BM进行聚合度为q的聚合，得到包括定长的聚合位部分和变长的细节位部分的聚合形式位图CBM，其聚合位部分的第i位对应于位图BM的第i个q元组，该q元组全为0时该位设置为0，否则该位设置为1，其细节位部分由位图BM去掉全为0的q元组后构成；而所述位图比较子单元按两个位图的聚合形式完成该两个位图的比较；The 0-level index table construction unit includes an aggregation operation subunit and a bitmap comparison subunit. When constructing the index block corresponding to each rule field, the aggregation operation subunit is used to aggregate the unaggregated bitmap BM Aggregation with a degree of q to obtain an aggregate bitmap CBM including a fixed-length aggregation bit part and a variable-length detail bit part, the i-th bit of the aggregation bit part corresponds to the i-th q-tuple of the bitmap BM, the When the q tuple is all 0, this bit is set to 0, otherwise this bit is set to 1, and its detail bit part is formed after removing the q tuple that is all 0 from the bitmap BM; and the bitmap comparison subunit is divided into two The aggregation form of the bitmap completes the comparison of the two bitmaps;

所述第J级索引表构造单元包含位图比较子单元和“与”运算子单元，在按照设定的递推结构，对上一级索引块进行递推得到下一级索引块时，位图以聚合后的形式表示，所述位图比较子单元按两个位图的聚合形式完成该两个位图的比较，所述“与”运算子单元以聚合形式位图的“与”运算方法完成两个位图的“与”运算；The J-level index table construction unit includes a bitmap comparison subunit and an "AND" operation subunit. When recursing the upper-level index block to obtain the next-level index block according to the set recursive structure, the bitmap The graph is expressed in an aggregated form, the bitmap comparison subunit completes the comparison of the two bitmaps in the aggregated form of the two bitmaps, and the "AND" operation subunit performs the "AND" operation on the aggregated bitmaps method to complete the "AND" operation of two bitmaps;

所述索引表存储单元用于保存构造得到的各级索引表。The index table storage unit is used to store the constructed index tables of all levels.

进一步地，上述系统还可具有以下特点：所述聚合运算子单元按以下方式完成对位图的聚合运算：Further, the above-mentioned system may also have the following characteristics: the aggregation operation subunit completes the aggregation operation on the bitmap in the following manner:

(I)按设定的聚合度q将N位的位图BM分为N/q个q位组，第i个q位组对应于位图中第q*i～q*i+q-1位，i＝0，1，......，N/q-1，其对应的聚合形式位图CBM包括N/q位的聚合位部分和变长的细节位部分；(1) Divide the N-bit bitmap BM into N/q q-bit groups according to the set aggregation degree q, and the i-th q-bit group corresponds to the q*i～q*i+q-1 in the bitmap Bit, i=0, 1,..., N/q-1, its corresponding aggregation form bitmap CBM includes an aggregation bit part of N/q bits and a variable-length detail bit part;

(II)逐一判断位图BM的第i个q位组，i＝0，1，......，N/q-1，对每个q位组，判断其各个位是否全为0，如果是，将所述聚合形式位图的聚合位部分第i位设置为0，否则将该聚合位部分的第i位设置为1，并将该q位组在位图BM中的对应位追加拷贝到聚合形式位图CBM的细节位部分；(II) judge the i-th q-bit group of the bitmap BM one by one, i=0, 1, ..., N/q-1, for each q-bit group, judge whether its each bit is all 0 , if yes, set the i-th bit of the aggregation bit part of the aggregation form bitmap to 0, otherwise set the i-th bit of the aggregation bit part to 1, and set the corresponding bit of the q bit group in the bitmap BM Append and copy to the detail bit part of the aggregate form bitmap CBM;

(III)对位图BM的所有q位组处理完成后，其对应的聚合形式位图CBM构造完成。(III) After the processing of all q-byte groups of the bitmap BM is completed, the construction of the corresponding aggregation form bitmap CBM is completed.

进一步地，上述系统还可具有以下特点：所述聚合运算子单元在进行位图的聚合运算时，在所述步骤(III)中，还将所述聚合形式位图CBM的细节位部分的变长长度写入到其聚合位部分的前面，作为所述聚合形式位图CBM的变长长度部分。Further, the above-mentioned system can also have the following characteristics: when the aggregation operation subunit performs the aggregation operation of the bitmap, in the step (III), it also changes the detail bit part of the aggregation form bitmap CBM The long length is written in front of its aggregated bit portion as the variable length portion of the aggregated form bitmap CBM.

进一步地，上述系统还可具有以下特点：所述位图比较子单元对两个聚合形式的位图进行比较时，依次比较这两个位图的变长长度部分、聚合位部分和细节位部分，只要有一个部分不同，就直接返回二个位图不同的结果，结束；如果三个部分都相同，则返回二个位图相同的结果。Further, the above-mentioned system may also have the following characteristics: when the bitmap comparing subunit compares two bitmaps in aggregate form, it sequentially compares the variable length part, the aggregate bit part and the detail bit part of the two bitmaps , as long as one part is different, it will directly return the result that the two bitmaps are different, and end; if all three parts are the same, return the result that the two bitmaps are the same.

进一步地，上述系统还可具有以下特点：所述“与”运算子单元对两个聚合形式位图CBM1和CBM2进行“与”运算得到聚合形式的结果位图CBM的方法是：Further, the above-mentioned system may also have the following characteristics: the method for the "AND" operation subunit to perform "AND" operation on two aggregated bitmaps CBM1 and CBM2 to obtain the aggregated result bitmap CBM is:

进一步地，上述系统还可具有以下特点：所述位图比较子单元进行以聚合形式表示的位图的比较时，在步骤(III)中，还将聚合形式位图CBM的细节位部分的变长长度写入其聚合位部分的前面，作为其变长长度部分；相应地，所述“与”运算子单元进行以聚合形式位图的“与”运算时，在将位图CBM1和CBM2聚合位部分的所有位比较并处理完后，还将聚合形式位图CBM的细节位部分的变长长度写入到该CBM的变长长度部分。Further, the above-mentioned system can also have the following characteristics: when the bitmap comparison subunit compares the bitmaps represented in the aggregated form, in step (III), it will also change the detail bit part of the aggregated form bitmap CBM The long length is written in front of its aggregation bit part as its variable length part; correspondingly, when the "AND" operation subunit performs the "AND" operation of the bitmap in the aggregated form, when the bitmaps CBM1 and CBM2 are aggregated After all the bits in the bit part are compared and processed, the variable length of the detail bit part of the aggregated form bitmap CBM is written into the variable length part of the CBM.

由上可知，本发明通过对递推流分类算法中等价类规则匹配位图的聚合，减少了递推流分类系统在规则集预处理生成索引表阶段等价类规则匹配位图表的存储空间开销，也减少了等价类规则匹配位图生成操作(按位与运算)和等价类规则匹配位图表查找的CPU时间开销，从而缩短预处理时间，加快规则变化响应速度。As can be seen from the above, the present invention reduces the storage space overhead of the equivalence rule matching bitmap in the recursive flow classification system at the stage of rule set preprocessing to generate an index table by aggregating the equivalence class rule matching bitmap in the recursive flow classification algorithm , and also reduces the CPU time overhead of equivalence class rule matching bitmap generation operation (bitwise AND operation) and equivalence class rule matching bitmap search, thereby shortening preprocessing time and speeding up the response speed of rule changes.

附图说明 Description of drawings

图1是本发明实施例位图聚合的递推流分类系统结构框图；Fig. 1 is the structural block diagram of the recursive flow classification system of the bitmap aggregation of the embodiment of the present invention;

图2是典型递推IP流分类系统的递推结构图；Fig. 2 is a recursive structural diagram of a typical recursive IP flow classification system;

图3是本发明实施例位图聚合示意图；Fig. 3 is a schematic diagram of bitmap aggregation according to an embodiment of the present invention;

图4是本发明实施例位图聚合的递推流分类系统Phase 0索引表的构造流程图；Fig. 4 is the construction flowchart of the recursive flow classification system Phase 0 index table of the embodiment of the present invention bitmap aggregation;

图5是本发明实施例聚合形式的位图“与”操作的流程图；Fig. 5 is the flow chart of the bitmap "AND" operation of the aggregation form of the embodiment of the present invention;

图6是本发明实施例聚合形式的位图比较操作的流程图；Fig. 6 is a flow chart of the bitmap comparison operation in the form of aggregation according to an embodiment of the present invention;

图7是本发明实施例位图聚合的递推流分类系统Phase J(J＞0)索引表的构造流程图。Fig. 7 is a flow chart of construction of a phase J (J>0) index table of the recursive stream classification system for bitmap aggregation according to an embodiment of the present invention.

具体实施方式 Detailed ways

位图是N位的数据，在预处理模块存储了大量的位图，并且大多数时间都在进行位图的“与”及“比较”操作。通过大量的研究分析，发明人发现，实际上位图中“1”位是很少，一个等价类往往只与极少的规则匹配。也就是说位图是稀疏的，因此，本发明通过对稀疏的位图表项进行聚合处理，用长度更短的数据来表达位图所携带的信息，从而减少位图信息的存储所占用的内存空间，以及减少位图“与”及“比较”操作所花费的时间，也就是响应时间。The bitmap is N-bit data, and a large number of bitmaps are stored in the preprocessing module, and the "AND" and "comparison" operations of the bitmaps are performed most of the time. Through a lot of research and analysis, the inventors found that, in fact, there are very few "1" bits in the bitmap, and an equivalence class often only matches very few rules. That is to say, the bitmap is sparse. Therefore, the present invention uses shorter data to express the information carried by the bitmap by aggregating the sparse bitmap items, thereby reducing the memory occupied by the storage of the bitmap information. Space, and reduce the time it takes for bitmap "AND" and "Compare" operations, that is, response time.

如图1所示，本实施例所述位图聚合的递推流分类系统主要由以下控制层面和数据层面两部分组成：As shown in Figure 1, the recursive flow classification system for bitmap aggregation described in this embodiment is mainly composed of the following two parts: the control plane and the data plane:

控制层面是根据分类规则集构造分类索引表的预处理模块，该模块进一步包括：The control plane is a preprocessing module for constructing a classification index table according to the classification rule set, which further includes:

预处理总控单元，用于控制Phase 0索引表构造模块构造Phase 0的各个索引块(chunk)，然后控制Phase J(J＞0)索引表构造模块依次递推构造PhaseJ(J＞0)的各个chunk。The preprocessing master control unit is used to control the Phase 0 index table construction module to construct each index block (chunk) of Phase 0, and then control the Phase J (J>0) index table construction module to recursively construct PhaseJ (J>0) in turn Each chunk.

Phase 0索引表构造单元，用于根据构造各个规则字段对应的索引块。其包含聚合运算子单元和位图比较子单元(各个子单元在图中未示出)，在构造时，所述聚合运算子单元用于对未经聚合的位图BM进行聚合度为q的聚合，得到包括定长的聚合位部分和变长的细节位部分的聚合形式位图CBM，其聚合位部分的第i位对应于位图BM的第i个q元组，该q元组全为0时该位设置为0，否则该位设置为1，其细节位部分由位图BM去掉全为0的q元组后构成；而所述位图比较子单元在比较两个位图时，是按这两个位图的聚合形式来比较的。Phase 0 index table construction unit, used to construct index blocks corresponding to each rule field. It includes an aggregation operation subunit and a bitmap comparison subunit (each subunit is not shown in the figure), and when constructed, the aggregation operation subunit is used to perform an aggregation degree of q on the unaggregated bitmap BM Aggregate to obtain the aggregated form bitmap CBM including the fixed-length aggregation bit part and the variable-length detail bit part, the i-th bit of the aggregation bit part corresponds to the i-th q-tuple of the bitmap BM, and the q-tuple is all When it is 0, this bit is set to 0, otherwise this bit is set to 1, and its detail bit part is formed after removing the q tuples that are all 0 from the bitmap BM; and the bitmap comparison subunit compares two bitmaps , are compared as aggregates of the two bitmaps.

Phase J(J＞0)索引表构造单元，用于按照设定的递推结构，对上一级Chunk进行递推计算得到下一级的Chunk，如此逐级递推直至最终得到一个Chunk。其包含位图比较子单元和“与”运算子单元，在递推时，位图以聚合后的形式表示，所述位图比较子单元按两个位图的聚合形式完成该两个位图的比较，所述“与”运算子单元以聚合形式位图的“与”运算方法完成两个位图的“与”运算。Phase J (J>0) index table construction unit is used to recursively calculate the upper-level Chunk to obtain the next-level Chunk according to the set recursive structure, and so on until finally a Chunk is obtained. It includes a bitmap comparison subunit and an "AND" operation subunit. During recursion, the bitmap is expressed in an aggregated form, and the bitmap comparison subunit completes the two bitmaps in the form of aggregation of the two bitmaps The "AND" operation sub-unit completes the "AND" operation of two bitmaps by using the "AND" operation method of aggregated bitmaps.

数据层面是将实际数据包通过分类索引表匹配到优先权最高规则的索引模块，该单元包括索引表和位图存储单元，以及索引算法单元，其中：The data plane is an index module that matches the actual data packet to the highest priority rule through the classification index table. This unit includes the index table and bitmap storage unit, and the index algorithm unit, where:

索引表和位图存储单元保存了构造得到的各级索引块，每个索引块包括一个索引表和对应的以聚合形式表示的位图。在另一实施例中，聚合形式的位图可以只在索引表构造阶段生成，可以不保存聚合形式的位图，只要有索引表就可以了，当然，需要预先直接将最后一级的EqID对应到优先权最高的规则。The index table and bitmap storage unit stores the constructed index blocks of various levels, and each index block includes an index table and a corresponding bitmap expressed in an aggregated form. In another embodiment, the aggregated bitmap can be generated only in the index table construction stage, and the aggregated bitmap may not be saved, as long as there is an index table. Of course, it is necessary to directly map the last level of EqID to to the highest priority rule.

索引算法单元收到一个数据包后，由数据包(packet)的每一个字段，查找phase 0的一个索引表，按照相同的递推结构，由上一级的索引结果(即EqID)的加权和作为索引值来索引下一级表，直到得到最后一级索引的结果，即为packet在d元空间上等价类的标识(eqId)，最后找到该标识对应的优先权最高的规则，可以在预处理阶段把eqId和对应的优先权最高的规则直接关联起来。After the index algorithm unit receives a data packet, each field of the data packet (packet) is used to search an index table of phase 0, and according to the same recursive structure, the weighted sum of the upper-level index results (ie EqID) Use the index value to index the next-level table until the result of the last-level index is obtained, which is the identifier (eqId) of the equivalence class of the packet in the d-element space, and finally find the rule with the highest priority corresponding to the identifier, which can be found in The preprocessing phase directly associates the eqId with the corresponding highest priority rule.

本发明的位图聚合没有改变递推分类系统的索引块的形式，所以本发明的索引模块与传统的递推分类系统相同，只是位图采用聚合后的形式来表示。但是由于位图的聚合，Phase 0索引表构造单元和Phase J(J＞0) 索引表构造单元的构造流程需要做相应的改进。The bitmap aggregation of the present invention does not change the form of the index block of the recursive classification system, so the index module of the present invention is the same as the traditional recursive classification system, except that the bitmap is expressed in an aggregated form. However, due to the aggregation of bitmaps, the construction process of Phase 0 index table construction unit and Phase J (J>0) index table construction unit needs to be improved accordingly.

Phase 0索引表构造模块构造一个规则字段(一般来说是8～16bits)对应的Chunk的方法如图4所示，包括以下步骤：The method for the Phase 0 index table construction module to construct a Chunk corresponding to a rule field (generally 8-16 bits) is shown in Figure 4, including the following steps:

步骤110，构造一初始位图，长度等于规则集合中的规则数，位图中每一bit位设置为0，对应于一条规则；Step 110, constructing an initial bitmap whose length is equal to the number of rules in the rule set, and each bit in the bitmap is set to 0, corresponding to a rule;

步骤120，从该规则字段取值范围(1，2B-1)内依次取出一个值，B是规则中某个字段的宽度；Step 120, take out a value sequentially from the value range (1, 2B-1) of the rule field, where B is the width of a certain field in the rule;

步骤130，依先后顺序遍历规则集合，如果该值满足当前规则，则将位图中该规则对应的bit位设置为1，否则设置为0，得到当前位图BM；Step 130, traversing the rule set in sequence, if the value satisfies the current rule, then set the bit corresponding to the rule in the bitmap to 1, otherwise set it to 0, and obtain the current bitmap BM;

步骤140，对位图BM进行聚合度为q的聚合，得到该位图的聚合形式表示CBM，下文也称为聚合形式位图CBM，其聚合位部分的第i位对应于位图BM的第i个q元组，该q元组全为0时该位设置为0，否则该位设置为1，其细节位部分由位图BM去掉全为0的q元组后构成；Step 140, aggregate the bitmap BM with an aggregation degree of q, and obtain the aggregated form of the bitmap to represent CBM, which is also referred to as the aggregated form bitmap CBM hereinafter, and the i-th bit of the aggregated bit part corresponds to the bitmap BM's i q-tuples, when the q-tuples are all 0, the bit is set to 0, otherwise the bit is set to 1, and the detail bits are formed by removing the q-tuples that are all 0 from the bitmap BM;

步骤150，将聚合形式位图CBM与该规则字段对应索引块所属位图集合中的位图(也是聚合形式的)进行比较，检查是否有相同的位图，若没有，则为位图CBM生成新的EqID(等价类标识)并将其加入到该位图集合中，否则，位图CBM使用其相同位图对应的EqID；Step 150, compare the aggregated form bitmap CBM with the bitmap (also in aggregated form) in the bitmap set to which the index block corresponding to the rule field belongs, check whether there is the same bitmap, if not, generate the bitmap CBM New EqID (equivalence class identifier) and add it to the bitmap collection, otherwise, the bitmap CBM uses the EqID corresponding to the same bitmap;

步骤160，将位图CBM的EqID填入Chunk中当前规则字段值对应的位置，形成字段值到EqID的映射关系；Step 160, fill in the EqID of the bitmap CBM into the corresponding position of the current rule field value in the Chunk, forming a mapping relationship from the field value to the EqID;

步骤170，判断该规则字段取值范围内的所有值是否处理完毕，如果没有，返回步骤120，如果处理完毕，则该规则字段对应的Chunk构造完成。Step 170, judge whether all the values within the value range of the rule field have been processed, if not, return to step 120, if the processing is completed, the Chunk corresponding to the rule field is constructed.

通过以同样的方法构造每个规则字段对应的Chunk，Phase 0索引表构造模块最终构造出完整的Phase 0的索引表。By constructing the Chunk corresponding to each rule field in the same way, the Phase 0 index table construction module finally constructs a complete Phase 0 index table.

下面将以一个实例详细论述位图聚合运算方法。The bitmap aggregation operation method will be discussed in detail below with an example.

我们对N Bits的稀疏位图进行q Bits的聚合压缩。q称为聚合度，可以取8、16、32、64、128等值以便于CPU处理。图3是一个聚合度为8的位图聚合的一个实例。We aggregate and compress q Bits of sparse bitmaps of N Bits. q is called the degree of aggregation, and can take values such as 8, 16, 32, 64, and 128 to facilitate CPU processing. Figure 3 is an example of bitmap aggregation with an aggregation degree of 8.

聚合后的位图由三部分组成：第一部分是变长长度部分，用于表示聚合形式位图细节位部分的变长长度，即其包含的q元组个数。该变长长度部分是冗余信息，但是该字段是含有丰富的位图信息，是信息总体的特征，有助于加快比较的速度，进一步改进系统性能；第二部分是N/q位的聚合位部分(q位边界对齐，不足部分补0)，聚合位部分的第i位是未经聚合的位图中第i*q～i*q+1位“或”运算的结果，该位为0表示其对应q位组全为0，该位为1表示其对应q位组至少有一个“1”位；第三部分是变长的细节位部分，由未经聚合的位图去掉等于0的q位组而成，细节位部分的第k个q位组(第k*q～k*q+q-1位)对应于聚合位部分的第k个“1”位，如果聚合位部分的第k个“1”位出现在第i位上，则细节位部分的第k个q位组与未经聚合位图的第i个q位组(第i*q～i*q+q-1位)相等。The aggregated bitmap is composed of three parts: the first part is the variable length part, which is used to indicate the variable length of the detail part of the aggregated bitmap, that is, the number of q-tuples it contains. The variable length part is redundant information, but this field contains rich bitmap information, which is the overall feature of the information, which helps to speed up the comparison and further improve system performance; the second part is the aggregation of N/q bits The bit part (q bit boundary is aligned, and the insufficient part is filled with 0). The i-th bit of the aggregated bit part is the result of the "or" operation of the i*q~i*q+1 bits in the unaggregated bitmap. This bit is 0 means that its corresponding q-bit group is all 0, and this bit is 1 means that its corresponding q-bit group has at least one "1" bit; the third part is the variable-length detail bit part, which is equal to 0 after being removed from the unaggregated bitmap The k-th q-bit group (k*q～k*q+q-1 bit) of the detail bit part corresponds to the k-th "1" bit of the aggregation bit part, if the aggregation bit part The k-th "1" bit of the k-th "1" appears on the i-bit, then the k-th q-bit group of the detail bit part and the i-th q-bit group of the unaggregated bitmap (i*q～i*q+q -1 bit) are equal.

聚合的方法如下：依次对未经聚合的位图的第i个q位组进行处理(即第q*i～q*i+q-1位，i＝0，1，...)，如果该q位组不全为0，则聚合后的位图第二部分的第i位设置为1，并把未经聚合的位图的这个非0的局部(第q*i～q*i+q-1位)追加拷贝到聚合后的位图的第三部分(细节位部分)。否则，聚合形式位图的第二部分(聚合位部分)的第i位设置为0。第三部分相当于未经聚合的位图忽略了全0的q位组后的形式。未经聚合的位图的所有q位组处理完成后，再调整聚合形式位图的第一部分(变长长度部分)为写入第三部分(细节位部分)的q位组个数。The aggregation method is as follows: sequentially process the i-th q-bit group of the unaggregated bitmap (that is, the q*i～q*i+q-1 bits, i=0, 1, ...), if The q bit group is not all 0, then the i-th bit of the second part of the aggregated bitmap is set to 1, and the non-zero part of the unaggregated bitmap (q*i～q*i+q -1 bit) append copied to the third part (detail part) of the aggregated bitmap. Otherwise, the i-th bit of the second part of the aggregated form bitmap (the aggregated bits part) is set to zero. The third part is equivalent to the form after the unaggregated bitmap ignores the q bits of all 0s. After the processing of all q-bytes of the unaggregated bitmap is completed, the first part (variable length part) of the aggregated bitmap is adjusted to the number of q-bytes written in the third part (detail part).

图3是一个30条规则的分类器中某等价类所对应位图的聚合示意图，该等价类和规则{0，1，2，20}匹配，取聚合度q＝8。因为未经聚合的位图的0～8位、16～23位中有“1”位，所以聚合形式位图的聚合位部分的第1、3位为“1”，其它位为“0”，其细节位部分中包含了未经聚合的位图中的非全0的q位组，即0～8位和16～23位。FIG. 3 is a schematic diagram of aggregation of bitmaps corresponding to an equivalence class in a 30-rule classifier. The equivalence class matches the rules {0, 1, 2, 20}, and the aggregation degree q=8. Because there are "1" bits in bits 0-8 and bits 16-23 of the unaggregated bitmap, the first and third bits of the aggregate bit part of the aggregated bitmap are "1", and the other bits are "0". , the detail bit part includes non-all 0 q-bit groups in the unaggregated bitmap, that is, bits 0-8 and bits 16-23.

在位图聚合的递推流分类系统中需要对聚合形式位图做两种操作，第一种是两个聚合形式位图的“与”运算，以求得两个等价类交集(新等价类)的位图(聚合形式)，其流程参见图5。第二种是两个聚合形式位图的比较操作，用于判断两个等价类是否可以合并，其流程参见图6。In the recursive flow classification system of bitmap aggregation, two operations need to be performed on the aggregated bitmap. The first one is the "AND" operation of two aggregated bitmaps to obtain the intersection of two equivalence classes (new etc. valence class) bitmap (aggregation form), its process is shown in Figure 5. The second is a comparison operation of two aggregated bitmaps, which is used to judge whether two equivalence classes can be merged. The process is shown in FIG. 6 .

图5的流程用于对两个聚合形式位图CBM1和CBM2进行与运算，得到聚合形式的结果位图CBM，包括如下步骤：The process in Figure 5 is used to perform an AND operation on two aggregated bitmaps CBM1 and CBM2 to obtain an aggregated result bitmap CBM, including the following steps:

步骤210，依次读出CBM1和CBM2的“聚合位部分”的一个相同位(通常从第0位开始)；Step 210, sequentially read an identical bit (usually starting from the 0th bit) of the "aggregation bit part" of CBM1 and CBM2;

步骤220，判断参与比较的两个bit位是否至少有一个为0，如果是，执行下一步，否则，执行步骤240；Step 220, judging whether at least one of the two bit positions participating in the comparison is 0, if yes, perform the next step, otherwise, perform step 240;

步骤230，将结果CBM的聚合位部分的对应bit位设置为0，执行步骤270；Step 230, the corresponding bit of the aggregation bit part of the result CBM is set to 0, and step 270 is performed;

步骤240，将CBM1和CBM2“细节位部分”中与各自“聚合位部分”该位对应的两个q位组进行按位“与”运算，得到q位组Q；Step 240, performing a bitwise "AND" operation on the two q-bit groups corresponding to the respective bits in the "detail bit part" of CBM1 and CBM2 to obtain the q-bit group Q;

步骤250，判断该q位组Q是否全为0，如果是，执行步骤230，否则，执行下一步；Step 250, judge whether the q bit group Q is all 0, if yes, execute step 230, otherwise, execute the next step;

步骤260，将结果位图CBM的聚合位部分的对应bit位设置为1，并将结果位图CBM细节位部分中与该bit位对应的q位组(如该bit位为聚合位部分第k个“1”位，则该q位组为细节位部分第k个q位组)赋值为Q；Step 260, the corresponding bit of the aggregation bit part of the result bitmap CBM is set to 1, and the q bit group corresponding to the bit in the detail bit part of the result bitmap CBM (as the bit is the kth bit of the aggregation bit part) "1" bits, then the q bit group is the kth q bit group of the detail bit part) is assigned Q;

步骤270，判断CBM1和CBM2“聚合位部分”的所有位是否处理完毕，如果是，执行下一步，否则，返回步骤210；Step 270, judging whether all bits of the "aggregation bit part" of CBM1 and CBM2 have been processed, if yes, perform the next step, otherwise, return to step 210;

步骤280，调整CBM的变长长度部分为写入的细节位部分的q位组个数，得到最终的结果位图CBM。Step 280, adjusting the variable length part of the CBM to the number of q-bytes in the detail bit part to obtain the final result bitmap CBM.

图6示出了对两个聚合形式位图CBM1和CBM2进行比较操作的流程，包括以下步骤：Fig. 6 shows the process of comparing two aggregated form bitmaps CBM1 and CBM2, including the following steps:

步骤310，比较聚合形式位图CBM1和CBM2的变长长度部分是否相同，如果不同，执行步骤350，否则继续下一步；Step 310, comparing whether the variable length parts of the aggregated form bitmaps CBM1 and CBM2 are the same, if not, execute step 350, otherwise continue to the next step;

步骤320，比较聚合形式位图CBM1和CBM2的聚合位部分是否相同，如果不同，执行步骤350，否则继续下一步；Step 320, compare whether the aggregation bit parts of the aggregation form bitmap CBM1 and CBM2 are the same, if not, execute step 350, otherwise continue to the next step;

步骤330，比较聚合形式位图CBM1和CBM2的细节位部分是否相同，如果不同，执行步骤350，否则继续下一步；Step 330, compare whether the detail bits of the aggregated form bitmaps CBM1 and CBM2 are the same, if not, execute step 350, otherwise continue to the next step;

步骤340，返回二者相同的结果，结束；Step 340, return the same result of both, end;

步骤350，返回二者不同的结果，结束。Step 350, return different results, and end.

可以看出，相比传统递推流分类系统中的位图“与”及“比较”操作，聚合后的位图数据长度更短，“与”及“比较”操作花费时间也短，所以位图聚合的递推流分类系统预处理模块处理时间短，响应时间减少。It can be seen that compared with the bitmap "AND" and "comparison" operations in the traditional recursive stream classification system, the length of the aggregated bitmap data is shorter, and the "AND" and "comparison" operations take less time, so the bitmap The preprocessing module of the recursive stream classification system for graph aggregation has short processing time and reduced response time.

Phase J(J＞0)索引表构造模块通过对前一级生成的Chunk进行递推计算，得到其下一级的索引表，此过程一直进行下去(J不断增加)，直到所有的Chunk最后递推成为一个Chunk。The Phase J (J>0) index table construction module obtains the next-level index table by recursively calculating the Chunk generated by the previous level. This process continues (J keeps increasing) until all the Chunks are finally handed over Push becomes a Chunk.

对两个Chunk 1和Chunk 2进行递推得到递推结果Chunk 3的方法如图7所示，包括以下步骤：The method of recursing two Chunk 1 and Chunk 2 to obtain the recursive result Chunk 3 is shown in Figure 7, including the following steps:

步骤410，取出分属Chunk 1和Chunk 2的两个等价类标识EqID1和EqID2，将其对应的聚合形式位图CBM1和CBM2进行“与”操作，生成属于Chunk 3的一个聚合形式的结果位图CBM3；Step 410, take out the two equivalence class identifiers EqID1 and EqID2 belonging to Chunk 1 and Chunk 2, and perform "AND" operation on their corresponding aggregation form bitmaps CBM1 and CBM2 to generate an aggregation form result bit belonging to Chunk 3 Figure CBM3;

步骤420，检查聚合形式位图CBM3在Chunk 3所属的位图集合中是否存在，若不存在，执行下一步，若存在，执行步骤440；Step 420, check whether the aggregated form bitmap CBM3 exists in the bitmap set to which Chunk 3 belongs, if not, execute the next step, and if exist, execute step 440;

步骤430，为位图CBM3生成新的EqID，并将其加入到Chunk 3所属的位图集合中，执行步骤450；Step 430, generate a new EqID for the bitmap CBM3, and add it to the bitmap collection to which Chunk 3 belongs, and execute step 450;

步骤440，位图CBM3使用相同位图对应的EqID；Step 440, the bitmap CBM3 uses the EqID corresponding to the same bitmap;

步骤450，计算EqID1×Chunk 2的等价类个数+EqID2，结果作为Chunk3中对应于位图CBM3的索引值；Step 450, calculate the number of equivalence classes + EqID2 of EqID1 × Chunk 2, and use the result as the index value corresponding to bitmap CBM3 in Chunk3;

步骤460，将该位图CBM3的EqID填入Chunk 3中该索引值对应的位置，形成字段值到EqID的映射关系；Step 460, fill the EqID of the bitmap CBM3 into the position corresponding to the index value in Chunk 3, forming a mapping relationship from the field value to the EqID;

步骤470，判断Chunk 1和Chunk 2中任意两个等价类标识的组合是否都已处理完毕，如果未处理完，返回步骤410，如果处理完成，则得到了递推结果Chunk 3，结束。Step 470, judge whether the combination of any two equivalence class identifiers in Chunk 1 and Chunk 2 has been processed, if not, return to step 410, if the processing is completed, the recursive result Chunk 3 is obtained, and end.

对于具有图2递推结构的分类系统，传统递推分类系统和采用本发明位图聚合的递推分类系统在Pentium M1.6GHz计算机上仿真对比如下：For the classification system with Fig. 2 recursive structure, traditional recursive classification system and the recursive classification system that adopts bitmap aggregation of the present invention compare as follows on Pentium M1.6GHz computer emulation:

一，1000条标准acl分类规则的分类器，传统递推分类器算法和位图聚合的递推分类器(聚合度32)的比较见表1。1. The classifier of 1000 standard ACL classification rules, the traditional recursive classifier algorithm and the recursive classifier of bitmap aggregation (aggregation degree 32) are compared in Table 1.

表1：1000条标准acl规则分类器比较Table 1: Comparison of classifiers for 1000 standard ACL rules

内存占用 memory usage 响应时间 Response time 传统递推分类系统 traditional recursive classification system 13,173,102 13,173,102 ＜1s <1s 位图聚合的递推分类系统 A Recursive Classification System for Bitmap Aggregations 1,654,800 1,654,800 ＜1s <1s

二，1000条扩展acl分类规则的分类器，传统递推分类器算法和位图聚合的递推分类器(聚合度32)的比较见表2。Second, the classifier with 1000 extended ACL classification rules, the comparison between the traditional recursive classifier algorithm and the recursive classifier (aggregation degree 32) of bitmap aggregation is shown in Table 2.

表2：1000条扩展acl规则分类器比较Table 2: Comparison of classifiers for 1000 extended ACL rules

内存占用 memory usage 响应时间 Response time 传统递推分类系统 traditional recursive classification system 13,173,102 13,173,102 157s 157s 位图聚合的递推分类系统 A Recursive Classification System for Bitmap Aggregations 7,526,704 7,526,704 ＜1s <1s

三，2000条扩展acl分类规则的分类器，传统递推分类器算法和位图聚合的递推分类器(聚合度32)的比较见表3。Three, the classifier with 2000 extended ACL classification rules, the comparison of traditional recursive classifier algorithm and bitmap aggregation recursive classifier (aggregation degree 32) is shown in Table 3.

表3：2000条扩展acl规则分类器比较Table 3: Comparison of classifiers for 2000 extended ACL rules

内存占用 memory usage 响应时间 Response time 传统递推分类系统 traditional recursive classification system 26,728,467 26,728,467 2,052s 2,052s 位图聚合的递推分类系统 A Recursive Classification System for Bitmap Aggregations 18,032,924 18,032,924 4s 4s

实践证明改进的规则变化响应时间优化了10～100倍，空间开销优化了2～10倍。可见改进取得了相当好的效果，其内存和响应时间开销完全能够满足2000条以上扩展acl规则的分类器的要求。Practice has proved that the improved rule change response time is optimized by 10 to 100 times, and the space overhead is optimized by 2 to 10 times. It can be seen that the improvement has achieved quite good results, and its memory and response time overhead can fully meet the requirements of a classifier with more than 2000 extended ACL rules.

Claims

1, a kind of bit-map aggregated recursive stream sorting method is applied to the recursive stream sorting system, may further comprise the steps:

(a) at first construct an initial bitmap, make its each rule corresponding to regular collection, in a rule field span, take out a value then, the traversal rule set, if this value satisfies current rule, position that should the rule correspondence in the bitmap is set to 1, otherwise puts 0, obtains a bitmap BM;

(b) bitmap BM is carried out the polymerization that the degree of polymerization is q, obtain comprising the polymerization bit position of fixed length and the polymerized form bitmap CBM of elongated details bit position, the i position of its polymerization bit position is corresponding to i the q tuple of bitmap BM, this q tuple is that 0 o'clock this position is set to 0 entirely, otherwise this position is set to 1, and it is to constitute after 0 the q tuple entirely that its details bit position is removed by bitmap BM;

(c) the polymerized form bitmap during relatively the affiliated bitmap of polymerized form bitmap CBM and this rule field manipulative indexing piece is gathered, check and whether have identical bitmap, if do not have, for bitmap CBM generates new equivalence class sign and it is joined this bitmap set, otherwise bitmap CBM uses the equivalence class sign of its identical bitmap correspondence; Then, in described index block, set up the equivalence class sign of bitmap CBM and the mapping relations of current rule field value;

(d) to each value in this rule field span, after (a) finishes dealing with to the method for (c) set by step, the index block structure of this rule field correspondence is finished, and goes out the index block of each rule field correspondence again by same method construct, forms the 0th grade of complete concordance list;

(e) according to the recursion structure of setting, index block to upper level carries out recursion, obtain the index block of its next stage, two bitmaps compare by its polymerized form during recursion, the AND operation of two bitmaps is finished with the AND operation method of polymerized form bitmap, so increase by degrees, become an index block until the last recursion of all index blocks;

(f) after a packet is received by the recursive stream sorting system, search a concordance list of the 0th grade according to this packet each field relevant with traffic classification, the index value that is obtained by the indexed results computing of some upper levels comes index next stage table, up to obtaining the afterbody indexed results, then will this data packet matched the highest rule of priority that meets the demands to it.

2, the method for claim 1 is characterized in that, described step (b) is further divided into following steps:

(b1) by the degree of polymerization q that sets the bitmap BM of N position is divided into N/q q hyte, i q hyte is corresponding to the q*i in the bitmap～q*i+q-1 position, i=0,1, ..., N/q-1, its corresponding polymerized form bitmap CBM comprises the polymerization bit position of N/q position and elongated details bit position;

(b2) judge i the q hyte of bitmap BM one by one, i=0,1 ..., N/q-1, to each q hyte, judge whether its each position is 0 entirely, if the polymerization bit position i position of described polymerized form bitmap is set to 0, otherwise the i position of this polymerization bit position is set to 1, and the details bit position that copies polymerized form bitmap CBM to is appended in the corresponding position of this q hyte in bitmap BM;

(b3) all q hytes of bitmap BM are finished dealing with after, its corresponding polymerized form bitmap CBM structure is finished.

3, method as claimed in claim 2 is characterized in that, the polymerization bit position of described polymerized form bitmap adopts the alignment of q bit boundary, and insufficient section mends 0.

4, method as claimed in claim 2, it is characterized in that, also comprise step after the described step (b3): again the q hyte number of described polymerized form bitmap details bit position is written to the front of its polymerization bit position, as the elongated length part of described polymerized form bitmap.

5, method as claimed in claim 4, it is characterized in that, the comparison algorithm of described polymerized form bitmap is: the elongated length part, polymerization bit position and the details bit position that compare polymerized form bitmap CBM1 and CBM2 successively, as long as there is a part different, directly return two results that bitmap is different, finish; If three parts are all identical, then return two results that bitmap is identical.

6, method as claimed in claim 2 is characterized in that, the method for two polymerized form bitmap CBM1 and CBM2 being carried out obtaining with computing the CBM of bitmap as a result of polymerized form is:

Each identical bits that compares the polymerization bit position of CBM1 and CBM2 successively, when comparing at every turn:

Have at least one to be 0 if participate in two positions relatively, then the identical bits of CBM polymerization bit position is set to 0;

Be 1 if participate in two positions relatively, the step-by-step AND-operation is carried out in the q hyte of the details bit position correspondence of bitmap separately in these two positions, obtain q hyte Q, if Q is 0 entirely, the identical bits of CBM polymerization bit position is set to 0, otherwise the identical bits of CBM polymerization bit position is set to 1, and is Q with this corresponding q hyte assignment of CBM details bit position and its polymerization bit position.

7, method as claimed in claim 6 is characterized in that, in the described step (b3), also the q hyte number of described polymerized form bitmap details bit position is write the front of its polymerization bit position, as the elongated length part of this polymerized form bitmap; And when carrying out the AND operation of bitmap CBM1 and CBM2, after all bit comparisons of both polymerization bit positions and handling, also the elongated length with the CBM details bit position that obtains is written to its elongated length part, obtains comprising the CBM of bitmap as a result of three parts.

8, as the described method of arbitrary claim in the claim 1 to 7, it is characterized in that, when described step (e) obtains next stage index block Chunk3 to two index block Chunk1 of upper level and Chunk2 recursion, two equivalence classes that adhere to two index block Chunk1 and Chunk2 separately are identified EqID1 and EqID2, carry out following steps:

(e1) the polymerized form bitmap with EqID1 and EqID2 correspondence carries out AND-operation, generates the bitmap CBM3 of a polymerized form that belongs to next stage index block Chunk3;

(e2) checking in the bitmap set of polymerized form bitmap CBM3 under Chunk3 whether exist, if do not exist, is that it generates new equivalence class sign, and it is joined under the Chunk3 in the bitmap set; If exist, then bitmap CBM3 uses the equivalence class sign of the bitmap identical with it;

(e3) calculate EqID1 * (the equivalence class number of Chunk2)+EqID2, as among the Chunk3 corresponding to the index value of bitmap CBM3, then the equivalence class sign of bitmap CBM3 is inserted the position of this index value correspondence among the Chunk3, formed the mapping relations of field value to this equivalence class sign;

Any two equivalence classes sign among Chunk1 and the Chunk2 after finishing dealing with as stated above, obtains final recursion Chunk3 as a result.

9, the method for claim 1 is characterized in that, described degree of polymerization q is 8,16,32,64 or 128.

10, a kind of bit-map aggregated recursive stream sorting system, comprise and be used for matching the priority index module of high rule by the classified index table according to the pretreatment module of classifying rules collection structural classification concordance list with actual data packet, described pretreatment module comprises preliminary treatment Master Control Unit, the 0th grade of concordance list structural unit and J level concordance list structural unit, described index module comprises concordance list and bitmap memory cell, Index Algorithm unit, it is characterized in that:

Described preliminary treatment Master Control Unit is used to control each index block that the 0th grade of concordance list structural unit constructed the 0th grade, control then J level concordance list structural unit successively recursion construct each index block of J level;

Described the 0th grade of concordance list structural unit comprises polymerization operator unit and bitmap compares subelement, when the index block of each rule field correspondence of structure, described polymerization operator unit is used for the bitmap BM without polymerization is carried out the polymerization that the degree of polymerization is q, obtain comprising the polymerization bit position of fixed length and the polymerized form bitmap CBM of elongated details bit position, the i position of its polymerization bit position is corresponding to i the q tuple of bitmap BM, this q tuple is that 0 o'clock this position is set to 0 entirely, otherwise this position is set to 1, and it is to constitute after 0 the q tuple entirely that its details bit position is removed by bitmap BM; And described bitmap comparison subelement is finished the comparison of these two bitmaps by the polymerized form of two bitmaps;

Described J level concordance list structural unit comprises bitmap relatively subelement and AND operation subelement, in recursion structure according to setting, the upper level index block is carried out recursion when obtaining the next stage index block, the form of bitmap after with polymerization represented, described bitmap comparison subelement is finished the comparison of these two bitmaps by the polymerized form of two bitmaps, and described AND operation subelement is finished the AND operation of two bitmaps with the AND operation method of polymerized form bitmap;

Described concordance list and bitmap memory cell are used to preserve the concordance lists at different levels that structure obtains;

After described Index Algorithm unit is used to receive a packet, each field by packet, search a concordance list of the 0th grade, according to identical recursion structure, weighted sum by the indexed results of upper level is come index next stage table as index value, up to the result who obtains the afterbody index, find the corresponding the highest rule of priority of this sign at last, at pretreatment stage this sign and the corresponding the highest regular direct correlation of priority are got up.

11, system as claimed in claim 10 is characterized in that, described polymerization operator unit is done in such a manner the polymerization computing to bitmap:

(I) by the degree of polymerization q that sets the bitmap BM of N position is divided into N/q q hyte, i q hyte is corresponding to the q*i in the bitmap～q*i+q-1 position, i=0,1, ..., N/q-1, its corresponding polymerized form bitmap CBM comprises the polymerization bit position of N/q position and elongated details bit position;

(II) judge i the q hyte of bitmap BM one by one, i=0,1 ..., N/q-1, to each q hyte, judge whether its each position is 0 entirely, if the polymerization bit position i position of described polymerized form bitmap is set to 0, otherwise the i position of this polymerization bit position is set to 1, and the details bit position that copies polymerized form bitmap CBM to is appended in the corresponding position of this q hyte in bitmap BM;

(III) all q hytes of bitmap BM are finished dealing with after, its corresponding polymerized form bitmap CBM structure is finished.

12, system as claimed in claim 11, it is characterized in that, described polymerization operator unit is when the polymerization computing of carrying out bitmap, in described step (III), also the q hyte number of the details bit position of described polymerized form bitmap CBM is written to the front of its polymerization bit position, as the elongated length part of described polymerized form bitmap CBM.

13, system as claimed in claim 12, it is characterized in that, when described bitmap comparison subelement compares the bitmap of two polymerized forms, the elongated length part, polymerization bit position and the details bit position that compare these two bitmaps successively, as long as there is a part different, just directly return two results that bitmap is different, finish; If three parts are all identical, then return two results that bitmap is identical.

14, system as claimed in claim 10 is characterized in that, described AND operation subelement carries out the method that AND operation obtains the CBM of bitmap as a result of polymerized form to two polymerized form bitmap CBM1 and CBM2 and is:

15, system as claimed in claim 14, it is characterized in that, when described bitmap comparison subelement carries out the comparison of the bitmap represented with polymerized form, in step (III), also the elongated length of the details bit position of polymerized form bitmap CBM is write the front of its polymerization bit position, as its elongated length part; Correspondingly, when described AND operation subelement carries out with the AND operation of polymerized form bitmap, with after all bit comparisons of bitmap CBM1 and CBM2 polymerization bit position and handling, also the elongated length of the details bit position of polymerized form bitmap CBM is written to the elongated length part of this CBM.