CN100521655C - Dynamic sharing device of physical queue based on the stream queue - Google Patents
- Publication number
- CN100521655C CNB2006101655909A CN200610165590A
- Authority
- CN
- China
- Prior art keywords
- module
- dynamic
- dynamic sharing
- sub
- ram
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Landscapes
- Data Exchanges In Wide-Area Networks (AREA)
Abstract
The invention belongs to the technical field of service flow management in computer networks. The device comprises a physical storage device (RAM) and, integrated on an FPGA/ASIC, a writing module, a scheduling module, a RAM management module, a dynamic sharing list, and a dynamic sharing module. The RAM buffers the data packets and the next pointer of each data block; the writing module writes data packets into the corresponding RAM addresses; the scheduling module schedules among the flows that are in the active state according to a configured scheduling policy; the RAM management module manages, in a configured manner, the data structures in which data packets are stored in the RAM; the dynamic sharing list maintains the mapping between active flows and physical queues; and the dynamic sharing module provides fast lookup, insertion, and deletion mechanisms. The device reduces the number of physical queues required and realizes per-flow queuing.
Description
Technical Field
The present invention is a physical-queue dynamic sharing device that performs per-flow queuing of service flows at wire speed. It can be applied in high-speed broadband network forwarding equipment to provide quality of service (QoS) guarantees, and belongs to the field of computer networks.
Background Art
By adopting per-flow queuing for buffer management in computer network forwarding devices such as routers and switches, the quality of service (QoS) between different service flows can be strictly guaranteed. Traditionally, per-flow queuing requires that different flows be managed in physical isolation, that is, a separate physical queue must be maintained for every flow. Network measurements show that in a high-speed broadband network (for example, a network with 2.5 Gbps or 10 Gbps of bandwidth) more than a million different active service flows can coexist at the same time, so a per-flow queuing scheme would have to maintain millions of physical queues. The management unit would then occupy too many resources and consume too much processing time, making the scheme difficult to implement. Therefore, although per-flow buffer management can provide QoS guarantees, it has conventionally been regarded as non-scalable and infeasible in high-speed broadband networks.
Although more than a million different active service flows can coexist on a high-speed broadband link, simulation experiments on real network traces show that the number of queues that are non-empty at any given moment is in fact only on the order of a few hundred or a few thousand. The intuitive explanation is that every packet is stored in a forwarding device such as a router for only a very short time (milliseconds or even nanoseconds), so whenever the inter-arrival gap between packets of the same flow exceeds this order of magnitude, the flow's queue is temporarily empty. Based on this fact, the present invention proposes a scalable device that physically maintains only a small number of queues (for example, fewer than 1k queues) and, by dynamically sharing these queues, guarantees that at any moment each active service flow in the forwarding device is assigned its own queue and no queue ever holds more than one flow, thereby realizing per-flow queuing and guaranteeing quality of service.
Summary of the Invention
In a per-flow queuing storage management system, the number of flows transmitted on a link can be measured as follows in order to determine how many resources must be reserved: 1) flow classification: packets are classified according to the 5-tuple in the packet header, with different packets belonging to different flows; 2) determining whether a flow is still in transmission: a flow is considered to start transmitting when its first packet arrives, and a timeout τ is set to detect whether its transmission has ended. If no packet of the flow arrives within time τ, the transmission of the flow is considered to be over. In the measurements, τ is set to 60 seconds or even longer. The measurements show that the number of flows passing through a node can reach hundreds of thousands or even millions. Consequently, if a per-flow queuing system maintained one queue per flow, the number of queues would also reach hundreds of thousands or even millions, making per-flow storage management impractical in high-speed routers. In reality, however, at certain moments during a flow's transmission the queue corresponding to the flow stores no packets; although the flow's transmission has not ended, the packet storage system has no need to keep a queue for it at those moments. Experimental and modeling results show that the number of queues actually occupied is far smaller than the number of flows transmitting concurrently. Based on this, only a small number of queues need to be implemented in the packet storage system, and per-flow queuing is achieved through dynamic physical queues.
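For illustration, the following C sketch applies the timeout-based counting described above to a toy packet sequence. The fixed-size flow table, the integer flow identifiers, and the helper names are assumptions made for this sketch only; the measurement in the text classifies packets by the 5-tuple and uses τ = 60 s or longer.

```c
/* Minimal sketch (assumption): counting flows that are "in transmission"
 * using a timeout tau.  A real measurement classifies packets by the
 * 5-tuple; here a packet is reduced to (flow_id, arrival_time). */
#include <stdio.h>

#define MAX_FLOWS 1024
#define TAU       60.0          /* timeout in seconds, as in the text */

static long   flow_id[MAX_FLOWS];
static double last_seen[MAX_FLOWS];
static int    nflows = 0;

/* Record a packet arrival: start a new flow or refresh an existing one. */
static void on_packet(long id, double now)
{
    for (int i = 0; i < nflows; i++)
        if (flow_id[i] == id) { last_seen[i] = now; return; }
    if (nflows < MAX_FLOWS) {
        flow_id[nflows]   = id;
        last_seen[nflows] = now;
        nflows++;
    }
}

/* A flow counts as "in transmission" if it has seen a packet within TAU. */
static int concurrent_flows(double now)
{
    int count = 0;
    for (int i = 0; i < nflows; i++)
        if (now - last_seen[i] <= TAU)
            count++;
    return count;
}

int main(void)
{
    on_packet(101, 0.0);
    on_packet(202, 10.0);
    on_packet(101, 30.0);
    /* At t = 75 s flow 202 has timed out, so only flow 101 is counted. */
    printf("flows in transmission at t=75s: %d\n", concurrent_flows(75.0));
    return 0;
}
```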
The present invention proposes a scalable device for high-speed broadband networks that realizes per-flow queuing at wire speed. Through a dynamic sharing mechanism, a large number of service flows are queued per flow in a small number of physical queues, while guaranteeing that at no time are different service flows buffered in the same physical queue. Two flow states are defined: 1) active state: a flow is considered active when it has packets stored in a physical queue; 2) silent state: a flow is considered silent once all of its packets stored in the network forwarding device have been forwarded and before any new packet belonging to it arrives. A flow occupies a physical queue only while it is active, and accordingly a mapping fn→PQq between its flow number fn (a currently active flow) and a physical queue PQq can be established; the set {fn→PQq} is called the active flow mapping. When a flow changes from active to silent, its flow-number-to-physical-queue mapping must be deleted from the active flow mapping and the corresponding physical queue released; when a flow changes from silent to active, a new physical queue must be allocated to it and its flow-number-to-physical-queue mapping added to the active flow list.
The system structure of the device of the present invention is shown in Figure 7, in which:
The RAM is the physical storage device in which the data packets are stored.
The RAM management module manages the data structures in which data packets are stored in the RAM, including partitioning the memory, allocating and reclaiming free blocks, providing free-block write addresses to the writing module, and providing the scheduling module with the address of the head element of the queue being scheduled.
The dynamic sharing list stores and maintains the active-flow mapping fn→PQq.
The dynamic sharing module: 1) given the flow number fn provided by the writing module, uses the information in the dynamic sharing list to determine whether a physical queue currently corresponds to flow fn; if so, it returns the corresponding physical queue number PQq; otherwise it returns a free queue number to the writing module and adds a new mapping for the flow to the dynamic sharing list; 2) upon receiving a message from the scheduling module that a physical queue has become empty, it deletes the mapping entry of that physical queue from the dynamic sharing list.
The writing module is responsible for writing arriving packets. It first requests from the dynamic sharing module the physical queue corresponding to the flow of the currently arriving packet, then, based on the result returned by the dynamic sharing module, obtains the tail write address of that physical queue from the RAM management module and writes the data packet into the corresponding RAM address.
The scheduling module is responsible for scheduling data packets. It first determines, according to the scheduling algorithm, the physical queue that should be scheduled, then obtains the read address of the head of that physical queue from the RAM management module and reads the packet out of the RAM. When a physical queue becomes empty after scheduling, the scheduling module notifies the dynamic sharing module to delete the mapping entry of that physical queue from the dynamic sharing list.
The problem addressed by the present invention can be summarized as follows: the invention solves how millions of service flows on a link (especially in the backbone network) can share the limited physical queue resources of a network forwarding device while still being queued per flow. The main challenges are:
1) On the time scale of a high-speed forwarding device's processing, the set of service flows buffered in the device changes randomly, so the device must respond promptly to changes in the currently active flows. The present invention provides an effective mechanism for maintaining and updating the mapping between active flows and physical queues.
2) In a high-speed network the inter-arrival time between packets is very small, so a fast lookup mechanism is needed to locate the physical queue into which an arriving packet should be stored. The present invention provides a fast algorithm for looking up the mapping between active flows and physical queues.
3) Memory bandwidth and capacity are among the potential bottlenecks of high-speed forwarding devices, and RAM management has a great impact on both. The present invention uses efficient data structures and control methods for RAM management.
The present invention is characterized in that the device comprises a physical storage device RAM and, integrated on an FPGA/ASIC, a writing module, a scheduling module, a RAM management module, a dynamic sharing list, and a dynamic sharing module, wherein:
the physical storage device RAM is used to buffer the data packets and the next pointer of each data block, and may be SRAM, DRAM, or any other physical storage device;
the writing module has an input for the arriving packets of active flows; an input for the physical queue corresponding to the flow of the currently arriving packet, connected to the corresponding output of the dynamic sharing module; and an input for the tail write address, in a free block of the RAM, of that physical queue, connected to the corresponding output of the RAM management module. It also has an output for the request signal for the corresponding physical queue, connected to the corresponding input of the dynamic sharing module; an output for the request signal for the tail write address of that physical queue, connected to the corresponding input of the RAM management module; and a data packet output connected to the corresponding input of the RAM. The writing module first requests from the dynamic sharing module the physical queue corresponding to the flow of the currently arriving packet, then, based on the result returned by the dynamic sharing module, obtains the tail write address of that physical queue from the RAM management module and writes the data packet into the corresponding RAM address;
the scheduling module, used for scheduling data packets, has an input for the physical queue that should currently be scheduled, connected to the corresponding output of the dynamic sharing module; an input for the read address of the head of that physical queue, connected to the corresponding output of the RAM management module; and a read input, corresponding to that head read address, connected to the corresponding read output of the RAM. It also has active-flow outputs, namely request-signal outputs to the dynamic sharing module, the RAM management module, and the RAM, including a request sent to the dynamic sharing module asking it to delete from the dynamic sharing list the mapping entry of a physical queue that has become empty after scheduling. The scheduling module schedules among the flows in the active state according to the configured scheduling policy, determines the physical queue that should be scheduled, obtains the read address of the head of that physical queue from the RAM management module, and reads the corresponding data packet out of the RAM; the active state means that the physical queue corresponding to a flow has data packets stored in it;
the RAM management module, connected to the RAM, the writing module, and the scheduling module, manages in a configured manner the data structures in which data packets are stored in the RAM, including partitioning the memory, allocating and reclaiming free blocks, providing free-block write addresses to the writing module, and providing the scheduling module with the address of the head element of the queue being scheduled;
the dynamic sharing list maintains a one-to-one mapping between the active flows A and the physical queues PQ, A and PQ having the same number of elements;
the dynamic sharing module is interconnected with the dynamic sharing list; it provides a fast lookup mechanism responsible for finding in the dynamic sharing list the physical queue corresponding to an active flow; a fast insertion mechanism responsible for inserting a new active-flow-to-physical-queue mapping into the dynamic sharing list when a packet of a newly active flow arrives; and a fast deletion mechanism responsible for deleting an active-flow-to-physical-queue mapping from the dynamic sharing list when the physical queue allocated to a flow becomes empty.
The scheduling module schedules among the active physical queues in a round-robin manner.
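As an illustration only, the round-robin behaviour can be sketched in C as follows; the queues are reduced to packet counters in a fixed array, which is a simplification of this sketch rather than the structure used in the device (there the queue state lives in the RAM management module).

```c
/* Round-robin sketch (assumption: each physical queue reduced to a packet
 * count).  A queue that becomes empty is reported, mirroring the message
 * sent to the dynamic sharing module in the text. */
#include <stdio.h>

#define NQUEUES 8

static int pkts[NQUEUES];   /* packets left in each physical queue */
static int cursor = 0;      /* current round-robin position        */

/* Schedule the next non-empty queue; returns -1 when all queues are empty. */
static int schedule_next(void)
{
    for (int step = 0; step < NQUEUES; step++) {
        int q = (cursor + step) % NQUEUES;
        if (pkts[q] > 0) {
            cursor = (q + 1) % NQUEUES;  /* next time, continue after q */
            pkts[q]--;                   /* one packet (or byte quantum) forwarded */
            if (pkts[q] == 0)
                printf("queue %d became empty -> delete its mapping\n", q);
            return q;
        }
    }
    return -1;
}

int main(void)
{
    pkts[1] = 2; pkts[5] = 1; pkts[7] = 3;
    int q;
    while ((q = schedule_next()) >= 0)
        printf("scheduled queue %d\n", q);
    return 0;
}
```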
The RAM management module divides the RAM storage space into fixed-size storage blocks whose size equals the size of one data transfer unit.
The RAM management module uses a stack structure to allocate and reclaim free storage blocks: when a data packet leaves and returns its free blocks, the top-of-stack pointer of the free-block stack moves up, and when a data packet arrives and free blocks are allocated, the top-of-stack pointer moves down.
For the allocation and reclamation of free blocks, when a data packet arrives the RAM management module pops a free-block address from the top of the stack, allocates the block, and stores the packet in it; when a data packet leaves, the occupied block becomes a free block again and the RAM management module pushes its address back onto the stack, reclaiming it.
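A minimal C sketch of this free-block stack, assuming block addresses are plain integer indices:

```c
/* Free-block stack sketch: an address is popped when a packet arrives
 * (allocation) and pushed back when a packet leaves (reclamation), so the
 * top-of-stack pointer moves down on allocation and up on release. */
#include <stdio.h>

#define NBLOCKS 16

static int stack[NBLOCKS];
static int top = 0;              /* number of free blocks on the stack */

static void init_free_blocks(void)
{
    for (int b = 0; b < NBLOCKS; b++)
        stack[top++] = b;        /* initially every block is free */
}

/* Allocation: pop a free-block address from the top of the stack. */
static int alloc_block(void)
{
    return (top > 0) ? stack[--top] : -1;   /* -1 means no free block */
}

/* Reclamation: push the released block address back onto the stack. */
static void free_block(int addr)
{
    if (top < NBLOCKS)
        stack[top++] = addr;
}

int main(void)
{
    init_free_blocks();
    int a = alloc_block();
    int b = alloc_block();
    printf("allocated blocks %d and %d\n", a, b);
    free_block(a);
    printf("next allocation reuses block %d\n", alloc_block());
    return 0;
}
```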
The RAM management module manages the physical queues as linked lists: for each physical queue it maintains a head pointer to the head of the queue and a tail pointer to its tail, and each data block of the queue stores a next pointer to the following data block.
The RAM management module updates the tail pointer when the writing module writes a new data block, and updates the head pointer when the scheduling module dequeues a data block.
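A minimal C sketch of such a linked-list queue, with the RAM modelled as an array of blocks and the per-block next pointers kept in a parallel array (an assumption of this sketch):

```c
/* Linked-list physical queue sketch: a head and tail pointer per queue,
 * plus a next pointer stored with every data block (here an index into a
 * block array instead of a RAM address). */
#include <stdio.h>

#define NBLOCKS 16
#define NIL     (-1)

static int next_ptr[NBLOCKS];            /* "next" pointer kept with each block */

struct queue { int head, tail; };

static void queue_init(struct queue *q) { q->head = q->tail = NIL; }

/* Enqueue: the writing module appends a new block and updates the tail pointer. */
static void enqueue(struct queue *q, int blk)
{
    next_ptr[blk] = NIL;
    if (q->tail == NIL)
        q->head = blk;                   /* queue was empty */
    else
        next_ptr[q->tail] = blk;
    q->tail = blk;
}

/* Dequeue: the scheduling module removes the head block and updates the head pointer. */
static int dequeue(struct queue *q)
{
    int blk = q->head;
    if (blk == NIL) return NIL;
    q->head = next_ptr[blk];
    if (q->head == NIL) q->tail = NIL;   /* the queue just became empty */
    return blk;
}

int main(void)
{
    struct queue q;
    queue_init(&q);
    enqueue(&q, 3); enqueue(&q, 7); enqueue(&q, 1);
    int b;
    while ((b = dequeue(&q)) != NIL)
        printf("dequeued block %d\n", b);
    return 0;
}
```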
The dynamic sharing list comprises a direct-addressed storage table and a number of dynamic sharing sub-tables. Each position of the direct-addressed table contains a one-bit flag and a pointer to a sub-table: the flag is "1" when a sub-table exists at that position and the pointer to it is valid, and "0" when no sub-table exists at that position and the pointer is a null pointer.
The position in the direct-addressed storage table is determined by the value obtained by applying a hash function once to the flow number of the arriving packet; the hash function may be any uniform hash function such as MD-5 or SHA-1.
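The following C sketch shows this top-level structure of the dynamic sharing list: a direct-addressed table whose slots carry a one-bit flag and a sub-table pointer, with the sub-tables kept as linked lists of (flow, queue) entries. The folding hash is only a placeholder for the uniform hash (e.g. MD-5 or SHA-1) named in the text, and the table size is an assumed example value.

```c
/* Sketch of the dynamic sharing list: direct-addressed table with a flag
 * bit and a sub-table pointer per slot; sub-tables are singly linked lists. */
#include <stdio.h>
#include <stdlib.h>

#define TABLE_SIZE 256

struct entry {                    /* one active-flow -> physical-queue mapping */
    unsigned long flow;
    int           queue;
    struct entry *next;
};

struct slot {
    unsigned char has_subtable;   /* the one-bit flag described in the text */
    struct entry *subtable;       /* valid only when has_subtable == 1      */
};

static struct slot table[TABLE_SIZE];

/* Placeholder hash: fold the flow number into a table index. */
static unsigned hash_flow(unsigned long flow)
{
    return (unsigned)(flow * 2654435761UL) % TABLE_SIZE;
}

/* Look up an active flow; returns its queue number, or -1 if it is not mapped. */
static int lookup(unsigned long flow)
{
    struct slot *s = &table[hash_flow(flow)];
    if (!s->has_subtable) return -1;
    for (struct entry *e = s->subtable; e != NULL; e = e->next)
        if (e->flow == flow) return e->queue;
    return -1;
}

/* Insert a new mapping; the text appends it at the tail of the sub-list. */
static void insert(unsigned long flow, int queue)
{
    struct slot  *s = &table[hash_flow(flow)];
    struct entry *e = malloc(sizeof *e);
    e->flow = flow; e->queue = queue; e->next = NULL;
    if (!s->has_subtable) {
        s->subtable = e;
        s->has_subtable = 1;
    } else {
        struct entry *t = s->subtable;
        while (t->next) t = t->next;
        t->next = e;
    }
}

int main(void)
{
    insert(12345, 7);
    printf("flow 12345 -> queue %d\n", lookup(12345));
    printf("flow 99999 -> queue %d (not mapped)\n", lookup(99999));
    return 0;
}
```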
The dynamic sharing sub-tables may be organized as linked lists.
The dynamic sharing sub-tables may be organized as binary search trees.
The dynamic sharing sub-tables may be organized as balanced binary search trees.
The lookup mechanism of the dynamic sharing module first determines, by applying the hash function to the flow number of the arriving packet, which dynamic sharing sub-table to search, and then searches within that sub-table.
When the sub-tables are organized as linked lists as described above, the lookup within the corresponding dynamic sharing sub-table is performed by sequential search of the linear list.
When a binary search tree or a balanced binary search tree is used, the flow number being looked up is first compared with the flow number stored at the root node: if it is larger, the search moves to the right subtree; if it is smaller, to the left subtree. The flow number stored at the root of that subtree is then compared in the same way, and so on, until the desired flow number is found.
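A C sketch of this comparison-driven lookup, using a hand-built tree consistent with the example of Embodiment 2 (keys 56, 18, 36, 66, 86, 74); the node layout and the helper names are assumptions of this sketch.

```c
/* Binary-search-tree lookup sketch: compare the flow number with the key at
 * each node and descend right (larger) or left (smaller) until it is found. */
#include <stdio.h>
#include <stdlib.h>

struct node {
    long flow;                   /* flow number stored at this node  */
    int  queue;                  /* associated physical queue number */
    struct node *left, *right;
};

static struct node *make(long flow, int queue,
                         struct node *left, struct node *right)
{
    struct node *n = malloc(sizeof *n);
    n->flow = flow; n->queue = queue; n->left = left; n->right = right;
    return n;
}

/* Returns the queue number of the flow, or -1 if it is not in the sub-table. */
static int bst_lookup(const struct node *root, long flow)
{
    while (root) {
        if (flow > root->flow)      root = root->right;  /* larger: right subtree */
        else if (flow < root->flow) root = root->left;   /* smaller: left subtree */
        else                        return root->queue;  /* found                 */
    }
    return -1;
}

int main(void)
{
    struct node *root =
        make(56, 1, make(18, 2, NULL, make(36, 3, NULL, NULL)),
                    make(66, 4, NULL, make(86, 5, make(74, 6, NULL, NULL), NULL)));
    printf("flow 74 -> queue %d\n", bst_lookup(root, 74));     /* path 56-66-86-74 */
    printf("flow 99 -> queue %d (absent)\n", bst_lookup(root, 99));
    return 0;
}
```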
The insertion mechanism of the dynamic sharing module first determines, from the hashed value of the packet's flow number, which dynamic sharing sub-table to insert into, and then inserts the new node into that sub-table.
When the linked-list organization is used, the node to be inserted is appended at the tail of the sub-table's linked list by sequential linear-list insertion.
When the binary-search-tree organization is used, the flow number of the entry to be inserted is first compared with the flow number stored at the root node: if it is larger, the insertion moves to the right subtree; if it is smaller, to the left subtree. The flow number stored at the root of that subtree is then compared in the same way, and so on, until a leaf node is reached; if the flow number to be inserted is larger than the flow number stored at the leaf, the new entry is inserted as its right child, otherwise as its left child.
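A C sketch of this insertion rule; the recursive formulation and the handling of a flow that is already present (it is simply left in place) are choices made for this sketch, not taken from the text.

```c
/* Binary-search-tree insertion sketch: descend by comparing flow numbers
 * until an empty child position is reached, then attach the new node there. */
#include <stdio.h>
#include <stdlib.h>

struct node {
    long flow;                   /* flow number (the search key)     */
    int  queue;                  /* associated physical queue number */
    struct node *left, *right;
};

/* Insert a (flow, queue) mapping; returns the (possibly new) subtree root. */
static struct node *bst_insert(struct node *root, long flow, int queue)
{
    if (root == NULL) {                        /* empty position: new leaf */
        struct node *n = malloc(sizeof *n);
        n->flow = flow; n->queue = queue;
        n->left = n->right = NULL;
        return n;
    }
    if (flow > root->flow)                     /* larger: right subtree  */
        root->right = bst_insert(root->right, flow, queue);
    else if (flow < root->flow)                /* smaller: left subtree  */
        root->left = bst_insert(root->left, flow, queue);
    return root;                               /* equal: already present */
}

static void inorder(const struct node *n)
{
    if (!n) return;
    inorder(n->left);
    printf("%ld ", n->flow);
    inorder(n->right);
}

int main(void)
{
    struct node *root = NULL;
    long keys[] = { 56, 18, 66, 36, 86, 74, 30 };  /* 30 is the example inserted in Figure 12 */
    for (int i = 0; i < 7; i++)
        root = bst_insert(root, keys[i], i + 1);
    inorder(root);               /* prints the flow numbers in sorted order */
    printf("\n");
    return 0;
}
```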
When the balanced-binary-search-tree organization is used, the sub-table entry is first inserted as in the binary-search-tree organization, and the tree is then adjusted so that the depth difference between the left and right subtrees of every node does not exceed 1.
The deletion mechanism of the dynamic sharing module first determines, from the hashed value of the packet's flow number, which dynamic sharing sub-table contains the node to be deleted, and then performs the deletion in that sub-table.
When the linked-list organization is used, the dynamic sharing module finds the node to be deleted by sequential search of the linear list and then deletes it.
When the binary-search-tree organization is used, the node to be deleted is first located by binary-search-tree lookup and then removed: if it is a leaf, its parent is simply set to point to NULL; if it has only a left (or right) child, its parent is set to point directly to that left (or right) child; if it has both a left and a right child, the right subtree of the deleted node is attached as the right subtree of the largest-valued node in its left subtree, and the adjusted left subtree then takes the deleted node's place as the subtree of its parent.
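A C sketch of the three deletion cases just described; the recursive formulation is a convenience of this sketch, while the two-children case follows the text: the right subtree is hung under the largest node of the left subtree, and the left subtree is then promoted into the deleted node's place.

```c
/* Binary-search-tree deletion sketch covering the three cases in the text. */
#include <stdio.h>
#include <stdlib.h>

struct node {
    long flow;
    int  queue;
    struct node *left, *right;
};

static struct node *make(long flow, struct node *left, struct node *right)
{
    struct node *n = malloc(sizeof *n);
    n->flow = flow; n->queue = 0; n->left = left; n->right = right;
    return n;
}

/* Delete the node carrying `flow` and return the new subtree root. */
static struct node *bst_delete(struct node *root, long flow)
{
    if (root == NULL) return NULL;
    if (flow < root->flow) {
        root->left = bst_delete(root->left, flow);
    } else if (flow > root->flow) {
        root->right = bst_delete(root->right, flow);
    } else {
        struct node *replacement;
        if (root->left == NULL) {          /* leaf, or only a right child */
            replacement = root->right;
        } else if (root->right == NULL) {  /* only a left child           */
            replacement = root->left;
        } else {                           /* both children               */
            /* Hang the right subtree under the largest node of the left
             * subtree, then promote the left subtree. */
            struct node *max = root->left;
            while (max->right) max = max->right;
            max->right = root->right;
            replacement = root->left;
        }
        free(root);
        return replacement;
    }
    return root;
}

static void inorder(const struct node *n)
{
    if (!n) return;
    inorder(n->left);
    printf("%ld ", n->flow);
    inorder(n->right);
}

int main(void)
{
    /* A tree consistent with the lookup example of Embodiment 2. */
    struct node *root =
        make(56, make(18, make(12, NULL, NULL), make(36, NULL, NULL)),
                 make(66, NULL, make(86, make(74, NULL, NULL), NULL)));
    root = bst_delete(root, 74);   /* leaf case                       */
    root = bst_delete(root, 56);   /* root with two children          */
    inorder(root); printf("\n");   /* remaining flows in sorted order */
    return 0;
}
```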
When the balanced-binary-search-tree organization is used, the node to be deleted is first removed using the binary-search-tree deletion procedure, and the tree is then adjusted so that the depth difference between the left and right subtrees of every node does not exceed 1.
The dynamic sharing list may be replaced by a content-addressable memory (CAM).
The device realizes the sharing of physical queues through the following steps:
Step 1: when a data packet enters the writing module, the writing module uses the packet's flow number to request the corresponding physical queue from the dynamic sharing module.
Step 2: the dynamic sharing module looks up the entry corresponding to that flow number in the dynamic sharing list; if it exists, the physical queue write address is returned to the writing module; otherwise a new physical queue is allocated to the flow, its queue number is returned to the writing module, and the new entry is written into the dynamic sharing list.
Step 3: once the writing module knows the write address of the corresponding physical queue, it obtains a free block from the RAM management module and writes into the RAM; the RAM management module updates the tail pointer of that physical queue and the top-of-stack pointer of the free-block stack.
Step 4: the scheduling module schedules the physical queues according to the scheduling algorithm, asks the RAM management module for the head pointer of the queue currently being scheduled, and reads the data out of the RAM; at the same time the RAM management module reclaims the data block into the free-block stack, updates the top-of-stack pointer, and updates the head pointer of the corresponding physical queue.
Step 5: when the queue currently being scheduled becomes empty, the scheduling module notifies the dynamic sharing module to delete its entry from the dynamic sharing list.
The scalable device proposed by the present invention, through its dynamic sharing mechanism, guarantees that different service flows are never buffered in the same physical queue at any moment even though only a small amount of physical queue resources is provided, thereby realizing wire-speed per-flow queuing for high-speed broadband networks. The experimental results show that in 60% of cases the result is returned and queue sharing completed within one clock cycle, and in 90% of cases within two clock cycles.
Experimental Verification
To verify the proposed device, experiments were carried out on two traces of real network traffic from NLANR (http://pma.nlanr.net/Special/): 1) OC-48, with a link capacity of 2.5 Gbps; 2) OC-192, with a link capacity of 10 Gbps. The number of concurrently transmitting flows is shown in Figure 1: the flow count measured on OC-48 fluctuates around 300,000, and that measured on OC-192 fluctuates around 360,000.
In the simulation, packets are stored into different queues according to their arrival times, and the scheduler divides the bandwidth fairly among the occupied physical queues using round-robin: each time a physical queue is scheduled it forwards a fixed number of bytes (1500 bytes), after which the scheduler moves on to the next physical queue. The performance parameters collected in the experiments are as follows:
1) Average rate: the ratio of the amount of data passing through the node in a time interval to the length of that interval, denoted s.
2) Number of coexisting flows: with a timeout of τ applied to each flow, the number of flows currently in transmission, denoted Ns(τ).
3) Number of active flows: the number of flows occupying a queue at a given moment, i.e., the number of occupied queues, denoted Na.
The statistics process samples Na and Ns(τ) every 25 ns and accordingly obtains Max{Na} and Max{Ns(τ)} over a given period (for example 1 second or 10 minutes). The system load L is defined as the ratio of the average rate of the real traffic to the egress bandwidth; since the average rate of the real traffic cannot be changed, the simulated load L is adjusted by changing the egress bandwidth so that performance parameters can be collected under different loads. Figure 2 shows how the physical queue occupancy varies with time when the system load L is set to 0.97. As the figure shows, the maximum number of occupied queues is only 89 for OC-48 and only 406 for OC-192. Comparing Figure 2 with the number of originally coexisting flows in Figure 1, the device of the present invention greatly reduces the number of physical queues that need to be provided.
The result of dividing the dynamic sharing list into multiple sub-tables with the hash function is shown in the table below.
As the table shows, after hashing, the average length of each sub-table is less than 3. Although in the worst case a sub-table may still grow to a length of more than 10, the probability of a sub-table holding more than 1 or 2 entries is already very small, and increasing the number of sub-tables further reduces their length considerably.
Figures 3 and 4 show the operational complexity when the sub-tables of the dynamic sharing list are organized as linked lists and as binary search trees, respectively. In both figures, (a) gives the relationship between the lookup cycles and the number of sub-tables when the active flow already exists in the sub-table, while (b) gives the relationship when a new active flow arrives and a new entry must be inserted. As the figures show, the lookup overhead decreases as the number of sub-tables increases; once the number of sub-tables exceeds 64, the operation overheads shown in both (a) and (b) are below 1.5 clock cycles.
Figure 5 shows the average time for the dynamic sharing management module to return to the writing module the physical queue corresponding to a service flow. Parts (a) and (b) give the average time overhead for the real OC-48 and OC-192 traffic, respectively, including one cycle for computing the hash function. As the figure shows, the time overhead decreases as the number of sub-tables increases; once the number of sub-tables exceeds 256, the overheads shown in both (a) and (b) are below 2 clock cycles.
Figure 6 shows the cumulative probability function of the time overhead when there are 256 sub-tables organized as binary search trees. As the figure shows, in 60% of cases the result is returned within one clock cycle, and in 90% of cases within two clock cycles.
Description of Drawings
Figure 1: the number of coexisting flows as a function of time, where the solid line is the OC-192 curve and the dashed line is the OC-48 curve.
Figure 2: the number of occupied physical queues as a function of time, where the curve marked with downward triangles is the OC-192 curve and the curve marked with upward triangles is the OC-48 curve.
Figure 3: the operational complexity when the sub-tables of the dynamic sharing list are organized as linked lists, where the curve marked with downward triangles is the OC-192 curve and the curve marked with upward triangles is the OC-48 curve; (a) shows the case of looking up an already existing active flow, and (b) shows the case of inserting a new active-flow-to-physical-queue mapping.
Figure 4: the operational complexity when the sub-tables of the dynamic sharing list are organized as binary search trees, where the curve marked with downward triangles is the OC-192 curve and the curve marked with upward triangles is the OC-48 curve; (a) shows the case of looking up an already existing active flow, and (b) shows the case of inserting a new active-flow-to-physical-queue mapping.
Figure 5: the average time for the dynamic sharing management module to return the physical queue corresponding to a service flow, where the curve marked with upward triangles is for sub-tables organized as linked lists and the curve marked with squares is for sub-tables organized as binary search trees; (a) shows the OC-48 case and (b) the OC-192 case.
Figure 6: the cumulative distribution function of the average time for the dynamic sharing management module to return the physical queue corresponding to a service flow, when the sub-tables of the dynamic sharing list are organized as binary search trees.
Figure 7: system structure diagram.
Figure 8: implementation of the queue structure under block storage.
Figure 9: partitioning the dynamic sharing list with a hash function.
Figure 10: organizing the hash-partitioned sub-tables of the dynamic sharing list as linked lists.
Figure 11: an example of organizing a sub-table of the dynamic sharing list as a binary search tree.
Figure 12: an example of inserting an entry into a sub-table of the dynamic sharing list organized as a binary search tree.
Figure 13: examples of deleting an entry from a sub-table of the dynamic sharing list organized as a binary search tree, where (a) deletes 74, (b) deletes 12, and (c) deletes 66.
Detailed Description of the Embodiments
Embodiment 1
The writing module is responsible for writing arriving packets. It first requests from the dynamic sharing module the physical queue corresponding to the flow of the currently arriving packet, then, based on the result returned by the dynamic sharing module, obtains the tail write address of that physical queue from the RAM management module and writes the data packet into the corresponding RAM address.
For ease of management, the RAM management module usually divides the packet storage space into fixed-size storage blocks. The packets transmitted on the network, however, are of variable length, so each packet is cut into fixed-size units called data units (DUs), which are reassembled into the original variable-length packet before leaving the forwarding device. The storage blocks into which the RAM is divided have the same size as a DU, so one DU occupies exactly one storage block.
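A trivial C sketch of the segmentation step; the DU size of 64 bytes is an assumed example value, as the text does not fix one.

```c
/* Cutting a variable-length packet into fixed-size data units (DUs):
 * each DU occupies exactly one storage block. */
#include <stdio.h>

#define DU_SIZE 64   /* bytes per DU / storage block (assumed example value) */

/* Number of DUs (and therefore storage blocks) needed for one packet. */
static int du_count(int packet_len)
{
    return (packet_len + DU_SIZE - 1) / DU_SIZE;   /* ceiling division */
}

int main(void)
{
    printf("a 60-byte packet needs %d DU(s)\n",   du_count(60));
    printf("a 1500-byte packet needs %d DU(s)\n", du_count(1500));
    return 0;
}
```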
In the packet memory, an unoccupied storage block is called a free block, and the RAM management module organizes the addresses of the free blocks with a stack structure. When a data packet arrives, the RAM management module pops a free-block address from the top of the stack, allocates the block, and stores the packet in it; when a data packet leaves, the occupied block becomes a free block again and the RAM management module pushes its address back onto the stack, reclaiming it.
The RAM management module manages the physical queues as linked lists. For each list it maintains a head pointer to the head of the physical queue and a tail pointer to its tail. The data of a physical queue are stored in the RAM, and a next pointer is kept with each data block, pointing to the following data block. The head and tail pointers of all physical queues are stored in the RAM management module; the number of physical queues is not fixed in advance, and physical queues are created and released dynamically according to the number of currently active flows. The present invention uses the on-chip SRAM of the FPGA/ASIC to hold the physical-queue head and tail pointers and the free-block stack, while the data packets and the next pointers are stored in the external RAM device, as shown in Figure 8.
When a data packet arrives or leaves, the pointer data must be updated, and the update involves two parts. 1) Updating the next pointers, in units of DUs: when a DU arrives, an enqueue operation is performed, a free block is taken from the free-block stack and the data written into it, and the next pointer of the block holding the previous DU is set to point to the current DU's storage location; when a DU leaves, a dequeue operation is performed, the freed block is appended to the end of the free-block list, and the next pointer of the previous free block is updated to point to the reclaimed block. 2) Updating the head and tail pointers, in units of data packets: when a new packet arrives and is stored, the tail of the queue changes and the tail pointer is updated to the new position; when a packet leaves, the head of the queue changes and the head pointer is updated to the new position. For the free-block stack, the top-of-stack pointer moves up after a packet leaves and returns its free blocks, and moves down when a packet arrives and free blocks are allocated.
Per-flow queuing requires that packets of different flows be stored in different queues at every moment; the dynamic sharing management module and the dynamic sharing list are introduced precisely so that different flows can share the physical queue resources without conflict. The dynamic sharing mechanism of the present invention is characterized as follows:
1) The active flow set A is a subset of the set F of all flows, i.e., A ⊆ F;
2) A one-to-one mapping is maintained between A and the physical queues PQ; A and PQ have the same number of elements, which is much smaller than the number of elements in F;
3) Dynamic sharing management provides a fast lookup mechanism responsible for finding in the dynamic sharing list the physical queue corresponding to an active flow;
4) Dynamic sharing management provides a fast insertion mechanism responsible for inserting a new active-flow-to-physical-queue mapping when a packet of a newly active flow arrives;
5) Dynamic sharing management provides a fast deletion mechanism responsible for deleting an active-flow-to-physical-queue mapping when the physical queue allocated to a flow becomes empty.
Different service flows can be distinguished by the five-tuple in the TCP/IP packet header (source IP address, destination IP address, source port number, destination port number, protocol). Since this five-tuple occupies 13 bytes in total (taking IPv4 as an example), the set F of all flows can contain up to 2^104 elements. Although the number of active flows does not exceed 1K, the active flows change dynamically within a set F of up to 2^104 elements, which makes dynamic sharing management very difficult. The problem is solved in two steps: first, a hash function is used to divide the whole dynamic sharing list into multiple sub-tables; then the sub-tables are organized with a linked-list or binary-search-tree structure.
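A C sketch of deriving a sub-table index from the IPv4 five-tuple; the FNV-style folding hash below is only an illustrative placeholder for the uniform hash (e.g. MD-5 or SHA-1) named in the text, and the struct layout and sub-table count are assumptions of this sketch.

```c
/* Hashing the 13 identity bytes of the five-tuple down to a sub-table index. */
#include <stdio.h>
#include <stdint.h>

#define NUM_SUBTABLES 256

struct five_tuple {
    uint32_t src_ip, dst_ip;
    uint16_t src_port, dst_port;
    uint8_t  protocol;            /* 4+4+2+2+1 = 13 bytes of flow identity */
};

/* Placeholder hash: fold the five fields with FNV-style multiply/xor steps. */
static unsigned subtable_index(const struct five_tuple *ft)
{
    uint32_t words[4] = { ft->src_ip, ft->dst_ip,
                          ((uint32_t)ft->src_port << 16) | ft->dst_port,
                          ft->protocol };
    uint32_t h = 2166136261u;
    for (int i = 0; i < 4; i++) {
        h ^= words[i];
        h *= 16777619u;
    }
    return h % NUM_SUBTABLES;
}

int main(void)
{
    struct five_tuple ft = { 0x0A000001u, 0x0A000002u, 1234, 80, 6 /* TCP */ };
    printf("flow hashed to sub-table %u\n", subtable_index(&ft));
    return 0;
}
```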
Figure 9 shows how the hash function divides the whole dynamic sharing list into multiple sub-tables. In the figure, the dynamic sharing list is a direct-addressed storage table, each position of which corresponds to one sub-table. After passing through a hash function h(·), an active flow is hashed onto the corresponding sub-table: for example, h(f7) is hashed onto sub-table s1, while h(f3) and h(f5) are hashed onto sub-table s2. The hash function may be any uniform hash function such as MD-5 or SHA-1.
Thanks to the hashing, different flows are distributed evenly over the sub-tables, and experiments show that the number of active flows on each sub-table is small (the experimental verification section lists the results in detail). Each sub-table can therefore simply be organized as a linked list, whose structure is shown in Figure 10. Each position of the dynamic sharing list uses one bit to indicate whether the sub-table has any element: 1 means the sub-table exists, and a pointer to the sub-table's elements follows; 0 means there is no sub-table, and the pointer that follows is a null pointer. Besides the pointer to the next element, each element of a sub-table contains a flow number f and a physical queue number PQ. Lookups, insertions, and deletions on each sub-table can be handled as operations on a linear list.
When the writing module sends the dynamic sharing module a lookup request for flow number f5, the request first goes through one hash computation and is hashed to sub-table s2, whose entries are then searched sequentially. Flow f5 is found after two lookups, and its corresponding physical queue number q8 is returned.
When the writing module sends the dynamic sharing module a lookup request for flow number f6, the request is first hashed to sub-table s3; since the flag of s3 is 0, a physical queue is allocated and an entry is inserted directly into s3.
When the scheduling module has scheduled all elements of queue q7 out, the corresponding entry of sub-table s2 is deleted, after which the entry remaining on s2 is <f5, q8>. To make this deletion simpler, a previous pointer pointing to the preceding entry can be added to each entry of a sub-table.
The scheduling module schedules the active physical queues in a round-robin manner. For example, if the currently active physical queues are q1 and q8, the scheduling module schedules the two queues in turn: if q1 is scheduled at the current moment, q8 is scheduled at the next moment, and so on. If a new queue q5 is inserted, the scheduling module schedules the three queues in turn; if q1 is subsequently scheduled empty, the scheduling module schedules the two queues q5 and q8 in turn.
Embodiment 2
The writing module works in the same way as in Embodiment 1.
The RAM management module works in the same way as in Embodiment 1.
The scheduling module works in the same way as in Embodiment 1.
The dynamic sharing list is likewise first divided into different sub-tables by the hash function of Figure 9, but each sub-table is organized as a binary search tree to reduce the time spent within the sub-table. Each node of the binary search tree again contains an active flow number and the corresponding physical queue number. An example of such a tree is shown in Figure 11; for convenience, only the value of the flow number is given at each node. A binary search tree has the property that the values in the left subtree are smaller than the value at the root node, while the values in the right subtree are larger.
To look up the entry of the active flow with flow number 74 in the binary search tree, the root node is examined first: 74 > 56, so move to the right subtree; 74 > 66, so move to the right subtree; 74 < 86, so move to the left subtree, where the entry is found.
If the desired node is not found during the lookup, a new node is inserted at a leaf. Figure 12 gives an example of inserting flow number 30: 30 < 56, so move to the left subtree; 30 > 18, so move to the right subtree; 30 < 36 and the left subtree of 36 is empty, so 30 is inserted as the left leaf child of 36.
Deletion from the binary search tree falls into three cases, as shown in Figure 13. When the deleted node is a leaf, its parent is simply set to point to NULL, as in Figure 13(a). When the deleted node has only one left (or right) child, its parent is set to point directly to that left (or right) child, as in Figure 13(b). When the deleted node has both a left and a right child, the right subtree of the deleted node is attached as the right subtree of the largest-valued node in its left subtree, and the adjusted left subtree then takes the deleted node's place as the subtree of its parent, as in Figure 13(c).
Embodiment 3
The writing module works in the same way as in Embodiment 1.
The RAM management module works in the same way as in Embodiment 1.
The scheduling module works in the same way as in Embodiment 1.
The dynamic sharing list is likewise first divided into different sub-tables by the hash function of Figure 9, but each sub-table is organized as a balanced binary search tree. The difference from an ordinary binary search tree is that the depth difference between the left and right subtrees of every node does not exceed 1. Lookup in the balanced binary search tree is the same as the binary-search-tree lookup of Embodiment 2. When an entry is inserted, it is first inserted exactly as in the binary search tree of Embodiment 2, and the resulting tree is then adjusted so that the depth difference between the left and right subtrees does not exceed 1. When an entry is deleted, it is first deleted exactly as in the binary-search-tree deletion of Embodiment 2, and the resulting tree is then adjusted so that the depth difference between the left and right subtrees does not exceed 1.
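As an illustration of the rebalancing step, the following C sketch shows the height bookkeeping and the two single rotations that restore the condition that the left and right subtree depths of every node differ by at most 1; a complete balanced-tree (AVL-style) insert or delete would also apply the corresponding double rotations while unwinding from the modified leaf, which is omitted here as an assumption of this sketch.

```c
/* Height bookkeeping and single rotations for the balanced sub-table tree. */
#include <stdio.h>

struct node {
    long flow;
    int  queue;
    int  height;                  /* height of the subtree rooted here */
    struct node *left, *right;
};

static int height(const struct node *n) { return n ? n->height : 0; }
static int imax(int a, int b)           { return a > b ? a : b; }

static void update_height(struct node *n)
{
    n->height = 1 + imax(height(n->left), height(n->right));
}

/* Balance factor: positive means the left subtree is deeper. */
static int balance(const struct node *n)
{
    return height(n->left) - height(n->right);
}

/* Right rotation: promotes the left child when the left side is too deep. */
static struct node *rotate_right(struct node *y)
{
    struct node *x = y->left;
    y->left  = x->right;
    x->right = y;
    update_height(y);
    update_height(x);
    return x;                     /* new subtree root */
}

/* Left rotation: promotes the right child when the right side is too deep. */
static struct node *rotate_left(struct node *x)
{
    struct node *y = x->right;
    x->right = y->left;
    y->left  = x;
    update_height(x);
    update_height(y);
    return y;
}

/* Restore |balance| <= 1 at one node (single-rotation cases only). */
static struct node *rebalance(struct node *n)
{
    update_height(n);
    if (balance(n) >  1 && balance(n->left)  >= 0) return rotate_right(n);
    if (balance(n) < -1 && balance(n->right) <= 0) return rotate_left(n);
    /* The mixed (left-right / right-left) cases would need a double rotation. */
    return n;
}

int main(void)
{
    /* A right-leaning chain 10 -> 20 -> 30 becomes balanced by one left rotation. */
    struct node c = { 30, 0, 1, NULL, NULL };
    struct node b = { 20, 0, 2, NULL, &c };
    struct node a = { 10, 0, 3, NULL, &b };
    struct node *root = rebalance(&a);
    printf("new root flow: %ld (expected 20)\n", root->flow);
    return 0;
}
```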
Claims (27)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CNB2006101655909A CN100521655C (en) | 2006-12-22 | 2006-12-22 | Dynamic sharing device of physical queue based on the stream queue |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CNB2006101655909A CN100521655C (en) | 2006-12-22 | 2006-12-22 | Dynamic sharing device of physical queue based on the stream queue |
Publications (2)
Publication Number | Publication Date |
---|---|
CN101009646A CN101009646A (en) | 2007-08-01 |
CN100521655C true CN100521655C (en) | 2009-07-29 |
Family
ID=38697789
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CNB2006101655909A Expired - Fee Related CN100521655C (en) | 2006-12-22 | 2006-12-22 | Dynamic sharing device of physical queue based on the stream queue |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN100521655C (en) |
Families Citing this family (19)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101252536B (en) * | 2008-03-31 | 2010-06-02 | 清华大学 | Multi-queue packet buffer management and output queue scheduling system for routers |
US8601216B2 (en) * | 2010-08-31 | 2013-12-03 | Oracle International Corporation | Method and system for removing cache blocks |
CN102045258B (en) * | 2010-12-22 | 2012-12-12 | 北京星网锐捷网络技术有限公司 | Data caching management method and device |
US8782375B2 (en) * | 2012-01-17 | 2014-07-15 | International Business Machines Corporation | Hash-based managing of storage identifiers |
CN102664753B (en) * | 2012-04-17 | 2015-12-02 | 杭州华三通信技术有限公司 | A kind of forwarding tree generation method and device thereof |
CN102629914A (en) * | 2012-04-27 | 2012-08-08 | 深圳市邦彦信息技术有限公司 | Method and device for buffering Ethernet data packets |
US9106443B2 (en) * | 2012-10-26 | 2015-08-11 | Cisco Technology, Inc. | Forwarding table optimization with flow data |
WO2014101090A1 (en) * | 2012-12-28 | 2014-07-03 | 华为技术有限公司 | Message caching method and device |
CN104281539B (en) * | 2013-07-10 | 2019-02-26 | 北京旋极信息技术股份有限公司 | A kind of buffer memory management method and device |
CN106330741B (en) * | 2015-06-15 | 2020-04-24 | 深圳市中兴微电子技术有限公司 | Message transmission method and device |
CN105162724B (en) | 2015-07-30 | 2018-06-26 | 华为技术有限公司 | A kind of data are joined the team and go out group method and queue management unit |
CN106095604A (en) * | 2016-06-21 | 2016-11-09 | 京信通信技术(广州)有限公司 | The communication method between cores of a kind of polycaryon processor and device |
CN106375130A (en) * | 2016-09-28 | 2017-02-01 | 苏州迈科网络安全技术股份有限公司 | Smart bandwidth management method and system based on Linux |
CN108196966A (en) * | 2018-01-29 | 2018-06-22 | 天津芯海创科技有限公司 | Data comparison method, device and the chip of multi-data source |
CN108650189A (en) * | 2018-04-03 | 2018-10-12 | 郑州云海信息技术有限公司 | A kind of flow-balance controlling method and device |
CN108646993B (en) * | 2018-05-17 | 2021-08-31 | 张安东 | Output file uniqueness guaranteeing method based on biological attribute and fluorescent printing technology |
CN111756586B (en) * | 2020-07-27 | 2021-05-18 | 中南大学 | A priority queue-based fair bandwidth allocation method, switch and readable storage medium in a data center network |
CN113012412B (en) * | 2021-03-03 | 2022-10-18 | 福建鸿鹄环境发展有限公司 | Intelligent data acquisition method and system based on dynamic acquisition statistical analysis of instrument and video data |
CN116708305B (en) * | 2023-08-03 | 2023-10-27 | 深圳市新国都支付技术有限公司 | Financial data transaction cryptographic algorithm application method and device |
Non-Patent Citations (1)
Title |
---|
A Self-Timed, Fully-Parallel Content Addressable Queue for Switching Applications. Jason Podaima, Glenn Gulak. Proceedings of the IEEE 1999 Custom Integrated Circuits Conference, 1999. *
Also Published As
Publication number | Publication date |
---|---|
CN101009646A (en) | 2007-08-01 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN100521655C (en) | Dynamic sharing device of physical queue based on the stream queue | |
CN100481813C (en) | Optimization of routing forwarding database in a network processor | |
US8184540B1 (en) | Packet lifetime-based memory allocation | |
US12101260B1 (en) | Multi-destination traffic handling optimizations in a network device | |
CN102035719B (en) | Method and device for processing message | |
CN101009645A (en) | Stream queue-based extensible device for CAM-based broadband network service stream | |
US10230639B1 (en) | Enhanced prefix matching | |
CN101635682B (en) | Storage management method and storage management system | |
US11470016B1 (en) | Efficient buffer utilization for network data units | |
US9769092B2 (en) | Packet buffer comprising a data section and a data description section | |
CN113411270A (en) | Message buffer management method for time-sensitive network | |
CN113454957B (en) | A storage management method and device | |
CN103297350B (en) | Implementing method and switching equipment of cell switching system | |
US10999223B1 (en) | Instantaneous garbage collection of network data units | |
Hu et al. | Per-flow queueing by dynamic queue sharing | |
EP1508225B1 (en) | Method for data storage in external and on-chip memory in a packet switch | |
US11522817B1 (en) | Spatial dispersion buffer | |
US11201831B1 (en) | Packed ingress interface for network apparatuses | |
US12289256B1 (en) | Distributed link descriptor memory | |
CN108173784B (en) | Aging method and device for data packet cache of switch | |
Kabra et al. | Fast buffer memory with deterministic packet departures | |
US11888691B1 (en) | Foldable ingress buffer for network apparatuses | |
Wang et al. | Block-based packet buffer with deterministic packet departures | |
US12231342B1 (en) | Queue pacing in a network device | |
EP4443847A1 (en) | Multi-stage scheduler |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | ||
CF01 | Termination of patent right due to non-payment of annual fee |
Granted publication date: 20090729 Termination date: 20171222 |