
CN108833299B - Large-scale network data processing method based on reconfigurable switching chip architecture - Google Patents


Info

Publication number
CN108833299B
CN108833299B
Authority
CN
China
Prior art keywords
message
packet
header
network data
slice
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201711448872.4A
Other languages
Chinese (zh)
Other versions
CN108833299A (en)
Inventor
陶淑婷
赵沛
闫攀
毛雅欣
牛建泽
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Microelectronic Technology Institute
Mxtronics Corp
Original Assignee
Beijing Microelectronic Technology Institute
Mxtronics Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Microelectronic Technology Institute, Mxtronics Corp filed Critical Beijing Microelectronic Technology Institute
Priority to CN201711448872.4A priority Critical patent/CN108833299B/en
Publication of CN108833299A publication Critical patent/CN108833299A/en
Application granted granted Critical
Publication of CN108833299B publication Critical patent/CN108833299B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 45/00 Routing or path finding of packets in data switching networks
    • H04L 45/74 Address processing for routing
    • H04L 45/745 Address table lookup; Address filtering
    • H04L 47/00 Traffic control in data switching networks
    • H04L 47/10 Flow control; Congestion control
    • H04L 47/215 Flow control; Congestion control using token-bucket
    • H04L 47/22 Traffic shaping
    • H04L 47/24 Traffic characterised by specific attributes, e.g. priority or QoS
    • H04L 47/2441 Traffic characterised by specific attributes relying on flow classification, e.g. using integrated services [IntServ]
    • H04L 47/32 Flow control; Congestion control by discarding or delaying data units, e.g. packets or frames
    • H04L 47/50 Queue scheduling
    • H04L 47/62 Queue scheduling characterised by scheduling criteria
    • H04L 47/625 Queue scheduling characterised by scheduling criteria for service slots or service orders
    • H04L 47/6275 Queue scheduling characterised by scheduling criteria for service slots or service orders based on priority
    • H04L 49/00 Packet switching elements
    • H04L 49/30 Peripheral units, e.g. input or output ports
    • H04L 49/3009 Header conversion, routing tables or routing tags
    • H04L 49/90 Buffering arrangements
    • H04L 69/00 Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
    • H04L 69/22 Parsing or analysis of headers

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Computer Security & Cryptography (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention discloses a large-scale network data processing method based on a reconfigurable switching chip architecture: (1) receive multiple packet streams from a physical link and store them; (2) divide each packet into N packet slices; when the number of slices is greater than 1, perform steps (3)-(5) and step (6), otherwise perform steps (4) and (6); (3) store the packet slices containing the packet data payload, and add the corresponding storage address pointer information to the header slice; (4) assign a sequence number to the header slice, parse the header to obtain the packet type, and, according to the packet type, parse, classify and forward headers independently in parallel to update the header slice; (5) extract the packet data payload and splice it with the corresponding header into a complete packet; (6) according to the sequence numbers carried in the headers, apply traffic shaping and queue management to the parallel-processed packets in order, then forward them.

Description

Large-scale network data processing method based on reconfigurable switching chip architecture
Technical Field
The invention relates to a large-scale network data processing method based on a reconfigurable switching chip architecture, and belongs to the technical field of wired communication.
Background
With the development of the world economy and of science and technology, the number of network users is climbing rapidly, and the functional and bandwidth requirements of network services keep growing, posing ever greater challenges to network technology. On top of continuously increasing bandwidth, there are more and more network protocols and increasingly complex and changeable network structures, so the demands on the programmability and multifunctionality of network entities such as routers, switches and gateways keep rising; the reconfigurable switching chip architecture was proposed to meet these growing needs. The powerful high-speed data processing capability of a reconfigurable switching chip comes mainly from integrating multiple microprocessors on the chip, each containing multiple hardware threads, together with hardware acceleration techniques; dedicated co-processing units give the designer additional freedom. With a reconfigurable switching chip, a developer can quickly program and flexibly provide the functions customers require, giving the network system both high performance and high flexibility.
A reconfigurable switching chip carries many kinds of packet processing tasks. Effectively supporting reconfigurable implementations of services such as packet forwarding, routing table lookup and traffic management, and improving the chip's flexibility through optimization while guaranteeing packet processing performance, make supporting large-scale network data processing on a reconfigurable switching chip difficult.
Disclosure of Invention
The technical problem solved by the invention is as follows: addressing the performance and flexibility requirements of the reconfigurable switching chip, and based on an analysis of the packet forwarding and processing characteristics of the reconfigurable chip, a large-scale network data processing method based on a reconfigurable switching chip architecture is provided that achieves flexibility while guaranteeing packet processing performance.
The technical solution of the invention is as follows: a large-scale network data processing method based on a reconfigurable switching chip architecture comprises the following steps:
(1) receiving multiple packet streams from the physical link and storing them;
(2) dividing each packet stored in step (1) into N packet slices according to a preset slice size, where N ≥ 1 and each slice is at least as large as the packet header; when the number of slices is greater than 1, executing steps (3)-(5) and step (6); otherwise, executing steps (4) and (6);
(3) storing the packet slices containing the packet data payload, and adding the corresponding payload storage address pointer information to the header slice containing the packet header;
(4) assigning a sequence number to the header slice containing the packet header information, parsing the header to obtain the packet type, and, according to the packet type, parsing, classifying and forwarding headers independently in parallel to update the header slice;
(5) extracting the packet data payload from the cache according to the payload storage address information carried in the header, and splicing it with the corresponding header into a complete packet;
(6) according to the sequence numbers carried in the headers, applying traffic shaping and queue management to the parallel-processed packets in order, then splitting them into multiple streams for forwarding.
The step (1) is realized specifically as follows:
(1.1) receiving messages from a physical link through a plurality of ports;
(1.2) carrying out message identification, verification and filtering on the received messages, filtering out invalid messages, and storing the remaining valid messages in a receiving buffer area;
(1.3) aggregating the packet streams into one data stream in order of arrival time;
and (1.4) caching the messages obtained in the step (1.3) in sequence.
In step (4), multiple parallel microengines parse, classify and forward headers independently in parallel, specifically:
(4.1) polling the working state of every thread of each microengine, and submitting each received header to the microengine with the most idle threads;
(4.2) the microengine receiving the packet loads the corresponding microcode instructions and, according to them, schedules multiple threads to access the relevant table entries in the corresponding storage units of the memory module in a round-robin, non-preemptive manner, completing parsing, classification and forwarding of the header data frame to update the header slice.
The threads within each microengine work in a pipelined manner.
The specific method of accessing the relevant table entries in the corresponding storage units of the memory module in a round-robin, non-preemptive manner in step (4.2) is:
(4.2.1) recording the thread numbers of all microengine threads that are ready to access a storage unit in the memory, together with the storage unit each needs to access;
(4.2.2) polling whether the storage unit is currently being accessed; when a thread finishes accessing the storage unit, sequentially searching the recorded thread numbers for a thread ready to access that unit and granting it access.
When a microengine accesses the DDR memory, it first invokes a search engine and directs it to search the table entries in the DDR using a hash algorithm or a binary-tree search algorithm, finds the entry matching the header being processed by the microengine, and feeds the search result back to the microengine.
The microengines are integrated on one chip.
The chip contains a dedicated instruction set for network packet processing, including multiplication instructions, cyclic redundancy check instructions, content-addressing instructions and FFS (find-first-set) instructions; according to the microcode, the microengine schedules threads to execute these instructions to complete the corresponding packet processing.
In step (6), traffic shaping is applied to packets using a priority-based token bucket algorithm.
In step (6), queue management uses priority queues, flow-based weighted queues, fair queues, or PQ or CQ queuing methods.
Compared with the prior art, the invention has the beneficial effects that:
(1) The invention inherits the reconfigurable core idea and separates data forwarding from control: data-plane processing, i.e. high-speed packet forwarding between input and output ports, runs mainly on the microengine processing cores, executes at line rate, and fully exploits the independence of packets through parallel processing; control-plane functions run on hardware coprocessors, which handle routing table lookup and traffic management and complete high-level QoS control.
(2) Packet processing in the invention is programmable: the microcode runs on the microengines, and its reloadability greatly eases system upgrades.
(3) For protocol identification and classification, the invention can identify packets by protocol-specific information such as protocol type, port number and destination address.
(4) For packet disassembly and reassembly, the invention can process packets in slices and guarantees the forwarding order of packets when they are reassembled.
(5) For header processing, multiple parallel microengines perform end-to-end processing of multiple headers simultaneously, and each microengine contains multiple threads, enabling high-bandwidth line-rate processing.
(6) The invention can shape traffic according to given protocol or application requirements so that the output meets delay and delay-jitter requirements, and after shaping sends packets to the corresponding queues for priority handling, thereby providing QoS guarantees.
(7) The invention uses dedicated hardware acceleration units to co-process specific tasks, such as the search engine SE, the order-preserving engine OE, traffic shaping TM and queue management QM, improving processing speed.
Drawings
Fig. 1 is a diagram of large-scale network data processing based on a reconfigurable switching chip architecture according to the present invention;
Fig. 2 is a flow chart of traffic shaping according to an embodiment of the present invention;
Fig. 3 is a flow chart of queue management according to an embodiment of the present invention;
Fig. 4 is a system block diagram of the large-scale network data processing method based on a reconfigurable switching chip architecture according to an embodiment of the present invention.
Detailed Description
The invention will be further described with reference to the accompanying drawings and specific embodiments.
As shown in fig. 1, the present invention provides a large-scale network data processing method based on a reconfigurable switching chip architecture, which specifically comprises the following steps:
(1) Receiving and storing the multiple packet streams from the physical link; specifically:
(1.1) receiving packets from multiple ports;
(1.2) carrying out packet identification, verification and filtering on the received packets, filtering out invalid packets, and storing the remaining valid packets in a receive buffer;
(1.3) aggregating the packet streams into one data stream in order of arrival time;
(1.4) caching the packets obtained in step (1.3) in sequence.
(2) Dividing each packet stored in step (1) into N packet slices according to the preset slice size, where N ≥ 1 and each slice is at least as large as the packet header; when the number of slices is greater than 1, executing steps (3)-(5) and step (6); otherwise, executing steps (4) and (6).
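To make the slicing rule concrete, the following minimal C sketch splits a packet into N slices. It is an illustration only, not part of the patent; the 80-byte slice size (a typical value mentioned for the IBM module below), the slice_t layout and the function name are assumptions:

```c
#include <stdlib.h>
#include <string.h>

#define SLICE_SIZE 80          /* assumed preset slice size, in bytes */

typedef struct {
    unsigned char data[SLICE_SIZE];
    size_t        len;         /* valid bytes in this slice           */
    int           is_header;   /* 1 for the header slice              */
} slice_t;

/* Split a packet into N >= 1 slices of at most SLICE_SIZE bytes each.
 * Slice 0 always carries the packet header; the caller checks N > 1
 * to decide whether payload slices must be stored separately (step 3). */
static size_t slice_packet(const unsigned char *pkt, size_t pkt_len,
                           slice_t **out)
{
    size_t n = (pkt_len + SLICE_SIZE - 1) / SLICE_SIZE;
    if (n == 0) n = 1;                     /* N >= 1 even for empty input */
    slice_t *s = calloc(n, sizeof *s);
    if (s == NULL) return 0;
    for (size_t i = 0; i < n; i++) {
        size_t off = i * SLICE_SIZE;
        size_t len = pkt_len - off < SLICE_SIZE ? pkt_len - off : SLICE_SIZE;
        memcpy(s[i].data, pkt + off, len);
        s[i].len = len;
        s[i].is_header = (i == 0);         /* slice 0 holds the header */
    }
    *out = s;
    return n;
}
```

Requiring each slice to be at least as large as the header guarantees that the header stages in steps (3) and (4) never need to touch a payload slice.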
(3) Storing the packet slices containing the packet data payload, and adding the corresponding payload storage address pointer information to the packet slice containing the packet header;
(4) assigning a sequence number to the header slice containing the packet header information, parsing the header to obtain the packet type, and, according to the packet type, parsing, classifying and forwarding headers independently in parallel to update the header slice;
The specific method of using multiple parallel microengines to parse, classify and forward headers independently in parallel is as follows:
(4.1) polling the working state of every thread of each microengine, and submitting each received header to the microengine with the most idle threads;
(4.2) the microengine receiving the packet loads the corresponding microcode instructions and, according to them, schedules multiple threads to access the relevant table entries in the corresponding storage units of the memory module in a round-robin, non-preemptive manner, completing parsing, classification and forwarding of the header data frame to update the header slice. The specific method is:
(4.2.1) recording the thread numbers of all microengine threads that are ready to access a storage unit in the memory, together with the storage unit each needs to access;
(4.2.2) polling whether the storage unit is currently being accessed; when a thread finishes accessing the storage unit, sequentially searching the recorded thread numbers for a thread ready to access that unit and granting it access.
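A minimal C sketch of this round-robin, non-preemptive arbitration follows; the thread and unit counts and the function names are invented for illustration and do not come from the patent:

```c
#include <stdbool.h>

#define MAX_THREADS 64   /* assumed total microengine threads      */
#define MAX_UNITS   8    /* assumed memory units under arbitration */

static int  wanted_unit[MAX_THREADS];   /* -1 = thread not waiting */
static bool unit_busy[MAX_UNITS];

void arb_init(void)
{
    for (int t = 0; t < MAX_THREADS; t++) wanted_unit[t] = -1;
}

/* (4.2.1) record that `thread` is ready to access `unit`. */
void arb_request(int thread, int unit) { wanted_unit[thread] = unit; }

/* (4.2.2) when `prev` finishes its access to `unit` (or the unit is
 * idle, prev = -1), search the recorded thread numbers in round-robin
 * order for a thread waiting on that unit and grant it access.  The
 * grant is non-preemptive: a running access is never interrupted.
 * Returns the granted thread number, or -1 if nobody is waiting. */
int arb_grant_next(int prev, int unit)
{
    unit_busy[unit] = false;
    for (int i = 1; i <= MAX_THREADS; i++) {
        int t = (prev + i) % MAX_THREADS;
        if (wanted_unit[t] == unit) {
            wanted_unit[t] = -1;
            unit_busy[unit] = true;       /* unit now owned by t */
            return t;
        }
    }
    return -1;
}
```

Starting the scan just after the previously served thread is what makes the grant rotate fairly instead of always favoring low thread numbers.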
When a microengine accesses the DDR memory, it first invokes a search engine and directs it to search the table entries in the DDR using a hash algorithm or a binary-tree search algorithm, finds the entry matching the header being processed by the microengine, and feeds the search result back to the microengine.
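The patent names the algorithm family (hash or binary-tree search) but not its details. The sketch below shows the hash variant under stated assumptions: the 12-byte key, table size, FNV-1a hash and linear probing are all illustrative choices, not the chip's specification:

```c
#include <stdint.h>
#include <string.h>

#define TABLE_SIZE 4096                 /* assumed DDR table capacity */

typedef struct {
    uint8_t  key[12];                   /* e.g. a VLAN/MPLS lookup key */
    uint32_t result;                    /* forwarding information      */
    int      valid;
} entry_t;

static entry_t table[TABLE_SIZE];

/* FNV-1a: a simple illustrative hash; the real search engine's
 * algorithm is not specified by the patent. */
static uint32_t hash(const uint8_t *key, size_t len)
{
    uint32_t h = 2166136261u;
    for (size_t i = 0; i < len; i++) { h ^= key[i]; h *= 16777619u; }
    return h;
}

/* Linear probing from the hashed slot; returns the matching entry
 * (the search result fed back to the microengine) or NULL on miss. */
const entry_t *se_lookup(const uint8_t key[12])
{
    uint32_t h = hash(key, 12) % TABLE_SIZE;
    for (int probe = 0; probe < TABLE_SIZE; probe++) {
        const entry_t *e = &table[(h + probe) % TABLE_SIZE];
        if (!e->valid) return NULL;     /* empty slot: key absent */
        if (memcmp(e->key, key, 12) == 0) return e;
    }
    return NULL;
}
```

A binary-tree variant would keep the entries sorted by key and replace se_lookup with a logarithmic descent.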
The threads within each microengine work in a pipelined manner. The microengines are integrated on one chip. The chip contains a dedicated instruction set for network packet processing, including multiplication instructions, cyclic redundancy check instructions, content-addressing instructions and FFS (find-first-set) instructions; according to the microcode, the microengine schedules threads to execute these instructions to complete the corresponding packet processing.
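As a rough software illustration of what two of those dedicated instructions compute, the snippet below uses a GCC builtin and plain C as stand-ins; on the chip each would be a single microcode-scheduled instruction, and the CRC-32 polynomial here is an illustrative choice:

```c
#include <stdint.h>

/* FFS: find the first set bit (1-based, 0 if none), e.g. to pick the
 * highest-priority non-empty queue out of a ready bitmap. */
static inline int ffs32(uint32_t x) { return __builtin_ffs((int)x); }

/* CRC step: fold one byte into a CRC-32 (reflected 0xEDB88320
 * polynomial), the kind of work a cyclic-redundancy-check
 * instruction would complete in one operation. */
static inline uint32_t crc32_byte(uint32_t crc, uint8_t b)
{
    crc ^= b;
    for (int i = 0; i < 8; i++)
        crc = (crc >> 1) ^ (0xEDB88320u & (uint32_t)-(int32_t)(crc & 1u));
    return crc;
}
```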
(5) Extracting the packet data payload from the cache according to the payload storage address information carried in the header, and splicing it with the corresponding header into a complete packet;
(6) according to the sequence numbers carried in the headers, applying traffic shaping and queue management to the parallel-processed packets in order, then splitting them into multiple streams for forwarding.
As shown in fig. 2, the specific method of traffic-shaping packets with the priority-based token bucket algorithm is: first, classify packets according to preset matching rules; packets that do not match any rule are sent directly, without token bucket processing, while packets that match a rule must be processed by the token bucket. When the bucket holds enough tokens, the packet can be sent immediately, and the number of tokens in the bucket decreases according to the packet length; when the tokens are insufficient, the packet cannot be sent until new tokens are generated in the bucket. The packet rate is thereby limited to at most the token generation rate, achieving the goal of rate limiting. After traffic shaping, the packet is passed to the QM module.
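The decision logic just described maps onto a small data structure. Below is a minimal C sketch, assuming one bucket per priority class; the field names, byte-based accounting and nanosecond timestamps are illustrative assumptions rather than the patent's design:

```c
#include <stdint.h>

typedef struct {
    uint64_t tokens;       /* current tokens, in bytes            */
    uint64_t burst;        /* bucket depth, in bytes              */
    uint64_t rate;         /* token generation rate, bytes/second */
    uint64_t last_ns;      /* last refill timestamp               */
} tbucket_t;

/* Refill tokens for the elapsed time, capped at the bucket depth. */
static void tb_refill(tbucket_t *tb, uint64_t now_ns)
{
    uint64_t add = (now_ns - tb->last_ns) * tb->rate / 1000000000u;
    tb->tokens = tb->tokens + add > tb->burst ? tb->burst : tb->tokens + add;
    tb->last_ns = now_ns;
}

/* Returns 1 if the packet may be sent now (tokens are consumed),
 * 0 if it must wait for new tokens, as in the fig. 2 flow. */
int tb_conform(tbucket_t *tb, uint64_t pkt_len, uint64_t now_ns)
{
    tb_refill(tb, now_ns);
    if (tb->tokens >= pkt_len) {
        tb->tokens -= pkt_len;   /* tokens decrease by packet length */
        return 1;
    }
    return 0;
}
```

tb_conform is called once per matching packet; packets that failed the classifier bypass the bucket entirely, exactly as in the flow above.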
In step (6), queue management uses priority queues, flow-based weighted queues, fair queues, or PQ or CQ queuing methods.
Based on the above large-scale network data processing method, the invention further provides a large-scale network data processing system based on a reconfigurable switching chip architecture, whose structure is shown in fig. 4. The system comprises XGE1~XGEn ports, a MAC module (Medium Access Control), an aggregation module RMUX (Roll Multiplexer), an input buffer module IBM (Ingress Buffer Management), a packet analysis module PA (Packet Analysis), a polling scheduling module PBA (Packet Allocation), an order-preserving engine module OE (Order-preserving Engine), a microengine cluster module NPE (Network Processing Engine), a packet editing module PE (Packet Editing), a traffic shaping module TM (Traffic Management), a queue management module QM (Queue Management), and an output buffer module EBM.
S1: the XGE1~XGEn ports send received packets to the MAC module; XGE stands for Ten-Gigabit Ethernet;
S2: the MAC (Medium Access Control) module identifies, checks and filters received packets, filters out invalid packets, and stores the remaining valid packets in a receive buffer. The MAC module consists of three parts, a control module, a sending module and a receiving module, and supports full-duplex communication.
The control module comprises a general-processor interface, registers and the like, and implements the general processor's control of the MAC; it also keeps statistics on the packets received and sent through the interface, including counts of unicast, multicast, broadcast, short packets, long packets, and CRC-correct/CRC-error frames.
The sending module mainly completes data-frame transmission: it reads data byte by byte from the send buffer, fills in the Ethernet frame CRC and preamble, and converts the data into the physical-layer XGE format for transmission; during transmission a frame-gap counter guarantees the minimum gap between two Ethernet frames.
The receiving module mainly receives data frames: it takes data from the physical-layer XGE interface, identifies, checks and filters the packets, and stores them in the receive buffer.
S3: the aggregation module RMUX (Roll Multiplexer) merges the packet streams into one data stream in order of arrival time and sends it to the IBM module;
S4: the input buffer module IBM (Ingress Buffer Management) buffers input packets in sequence and divides each into N packet slices according to the preset slice size, where N ≥ 1 and each slice is at least as large as the packet header (a typical slice size is 80 bytes); the slices are then sent to the packet analysis module PA;
S5: the packet analysis module PA (Packet Analysis): when the number of slices is greater than 1, it stores the slices containing the packet data payload in the RB (Resource Buffer) module and adds the corresponding payload storage address pointer information to the slice containing the packet header; it parses out the packet type, which may be ARP (Address Resolution Protocol), IPv4 (Internet Protocol version 4) or IPv6 (Internet Protocol version 6), and forwards the header to the polling scheduling module PBA.
Further, if PA analysis finds that a packet needs processing above layer 4, the packet is sent, after NPE processing, to the general processor for higher-level protocol handling.
S6: the polling scheduling module PBA polls the working state of every thread of each microengine in the network packet header processor, attaches to each received header the sequence number issued by the order-preserving engine module OE, and submits the header to the microengine with the most idle threads;
S7: the order-preserving engine module OE (Order-preserving Engine): to prevent packets from getting out of order after microengine processing, OE assigns a sequence number to each header before it enters a microengine and sends it to the polling scheduling module PBA.
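Sequence-number order preservation is a reorder-buffer pattern, and a minimal C sketch of it follows; the window size and the names are assumptions, and the sketch presumes at most WINDOW headers are outstanding at once, which the patent does not state:

```c
#include <stddef.h>

#define WINDOW 256                     /* assumed reorder window */

typedef struct { void *hdr; int present; } slot_t;

static slot_t   window[WINDOW];
static unsigned next_seq_in;           /* next number OE hands out */
static unsigned next_seq_out;          /* next number to release   */

/* OE side: tag each header before it enters a microengine. */
unsigned oe_assign(void) { return next_seq_in++; }

/* Completion side: headers finish in any order; buffer each one and
 * release the in-order prefix so forwarding order matches arrival
 * order. Returns how many headers became releasable. */
size_t oe_complete(unsigned seq, void *hdr, void (*emit)(void *))
{
    window[seq % WINDOW] = (slot_t){ hdr, 1 };
    size_t released = 0;
    while (window[next_seq_out % WINDOW].present) {
        slot_t *s = &window[next_seq_out % WINDOW];
        emit(s->hdr);                  /* hand header to the next stage */
        s->present = 0;
        next_seq_out++;
        released++;
    }
    return released;
}
```

Headers may finish on any microengine in any order; only the in-order prefix is released, so the forwarding order in S10-S11 matches the arrival order.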
S8: the network packet header processor parses, classifies and forwards headers independently in parallel to update the header slices, and sends the processed header slices to the packet editing module PE.
The network packet header processor comprises the microengine cluster module NPE, a task scheduling module RBA, and storage units. Wherein:
the micro-engine cluster module NPE (network Processing Engine) consists of a plurality of parallel micro-engines, each micro-engine completes the complete Processing of a message, each micro-engine comprises a plurality of threads, and each thread works in a pipeline working mode. The micro engine which receives the message loads a corresponding microcode instruction from the instruction memory IMEM, and according to the microcode instruction, a plurality of threads are scheduled by the task scheduling module RBA to access related table items in corresponding memory units in the memory module in a rotating non-preemptive mode, so that the analysis, classification and forwarding processing of a message header data frame are completed, and a message header slice is updated. And sending the processed message header to the PE module.
The specific way the RBA accesses the relevant table entries in the corresponding storage units of the memory module in a round-robin, non-preemptive manner is: the RBA records the thread numbers of all microengine threads that are ready to access a storage unit in the memory, together with the storage unit each needs to access; it polls whether a storage unit is being accessed and, when a thread finishes accessing it, sequentially searches the recorded thread numbers for a thread ready to access that unit and grants it access.
The storage unit comprises a DDR memory, a TCAM and an on-chip memory LMEM. Wherein:
The DDR memory stores table entries, such as the VLAN table and the MPLS table, whose services place relatively low demands on processing speed. The microengine invokes the search engine through the task scheduler and directs it to search the entries in the DDR (Double Data Rate) memory with the corresponding search algorithm, find the entry matching the header being processed, and feed the search result back to the microengine.
The TCAM (Ternary Content Addressable Memory) stores entries with higher processing-speed requirements, such as the MAC address table and the routing table. These tables are stored in TCAM form; during lookup, the task scheduling module converts information from the packet header into a TCAM search key, matches it against the MAC address table and the routing table, finds the required matching entry, and feeds it back to the microengine.
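In software, a TCAM lookup can be modeled as value/mask matching with lowest-index-wins priority, as in the hedged sketch below; real TCAM hardware compares every row in parallel in a single cycle, and the 32-bit key and table depth here are illustrative assumptions:

```c
#include <stdint.h>

#define TCAM_DEPTH 1024                 /* assumed table depth */

typedef struct {
    uint32_t value;     /* bits to compare                          */
    uint32_t mask;      /* 1 = care, 0 = don't-care (the "ternary") */
    uint32_t result;    /* e.g. next-hop / egress port index        */
    int      valid;
} tcam_row_t;

static tcam_row_t tcam[TCAM_DEPTH];

/* Sequential model of the parallel hardware comparison: the first
 * (lowest-index) row whose cared-about bits equal the key wins.
 * Returns the row index, or -1 on miss. */
int tcam_match(uint32_t key)
{
    for (int i = 0; i < TCAM_DEPTH; i++)
        if (tcam[i].valid && ((key ^ tcam[i].value) & tcam[i].mask) == 0)
            return i;
    return -1;
}
```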
The on-chip memory LMEM (Local Memory) stores the flow table and is accessed directly by the microengine threads through the task scheduler.
S9: the packet editing module PE (Packet Editing) modifies the data content of the packet header, extracts the packet data payload from the buffer according to the payload storage address information carried in the header, splices the payload and the corresponding header into a complete packet, and sends it to the traffic shaping module TM;
S10: the traffic shaping module TM (Traffic Management) shapes the packet traffic and sends the shaped packets to the queue management module QM;
Specifically, as shown in fig. 2, network QoS is guaranteed by the priority-based token bucket algorithm described above: packets are first classified according to the preset matching rules; packets that do not match any rule are sent directly without token bucket processing, while matching packets are sent only when the bucket holds enough tokens, the token count decreasing with packet length; when tokens are insufficient, a packet waits until new tokens are generated. The packet rate is thereby limited to at most the token generation rate, achieving rate limiting.
S11: the queue management module QM (Queue Management) performs queue management on the packets and sends the queue-managed packets to the EBM (Egress Buffer Management) module;
Specifically, as shown in fig. 3, queues are created by index. When there is no congestion at an interface, a packet is sent out as soon as it arrives; if congestion occurs, packets are classified into different queues, and the queue scheduling mechanism handles the priorities separately, serving high-priority queues first. Once a queue reaches a certain maximum length, a RED or WRED policy can drop packets to avoid network overload. After queue management, packets are passed to the EBM module.
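The combination of strict-priority service and RED/WRED-style early drop can be sketched compactly. This is a simplified illustration under assumed thresholds, not the QM coprocessor's algorithm; in particular, real WRED works on a smoothed average queue length rather than the instantaneous length used here:

```c
#include <stdlib.h>

#define NUM_PRIO  8
#define QLEN_MAX  512          /* assumed per-queue maximum length  */
#define WRED_MIN  256          /* assumed WRED drop-start threshold */

static int qlen[NUM_PRIO];     /* current length of each queue */

/* WRED-style admission: always accept below WRED_MIN, always drop
 * at QLEN_MAX, drop with linearly growing probability in between. */
static int wred_admit(int len)
{
    if (len < WRED_MIN)  return 1;
    if (len >= QLEN_MAX) return 0;
    return rand() % (QLEN_MAX - WRED_MIN) >= (len - WRED_MIN);
}

/* Enqueue a packet of priority class prio; returns 0 if dropped. */
int qm_enqueue(int prio)
{
    if (!wred_admit(qlen[prio])) return 0;
    qlen[prio]++;
    return 1;
}

/* Strict-priority scheduling: serve the highest-priority non-empty
 * queue first (class 0 = highest). Returns the class served, or -1
 * if all queues are empty. */
int qm_dequeue(void)
{
    for (int p = 0; p < NUM_PRIO; p++)
        if (qlen[p] > 0) { qlen[p]--; return p; }
    return -1;
}
```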
S12: the output buffer module EBM buffers outgoing packets and sends them to the MAC module.
S13: the MAC module receives the packet, stores it in the send buffer, then reads the data from the send buffer, fills in the Ethernet frame CRC (Cyclic Redundancy Check) and preamble, and converts the packet into the physical-layer XGE format for transmission.
Further, in the large-scale network data processing system based on the reconfigurable switching chip architecture shown in fig. 4, the units inside the box are on-chip processing units and those outside the box are off-chip processing units.
Further, in the large-scale network data processing system based on the reconfigurable switching chip architecture shown in fig. 4, OE, SE, TM, and QM are hardware coprocessors.
Further, in the large-scale network data processing system based on the reconfigurable switching chip architecture shown in fig. 4, TM and QM may be implemented either on-chip or off-chip; implementing them on-chip gives the switching chip higher processing speed and lower power consumption.
For packet data processing, the invention adopts an optimized architecture, a dedicated instruction set and hardware units, and can meet the line-rate requirements of high-speed packet processing. The programmable packet handling covers programmable packet parsing, packet match lookup, packet editing and packet forwarding, making packet processing more flexible and faster and better suited to large-scale network data processing. High-speed, high-capacity intelligent packet functions such as parsing, classification and forwarding are completed by the microengines; complex, frequently executed functions such as routing table lookup, packet order preservation, traffic management and queue management use hardware coprocessors to further raise processing performance, achieving an organic combination of service flexibility and high performance.
Details not described in the invention are within the common knowledge of those skilled in the art.

Claims (10)

1. A large-scale network data processing method based on a reconfigurable switching chip architecture, characterized by comprising the following steps: (1) receiving multiple packet streams from a physical link and storing them; (2) dividing each packet stored in step (1) into N packet slices according to a preset slice size, where N ≥ 1 and each slice is at least as large as the packet header; when the number of slices is greater than 1, executing steps (3)-(5) and step (6); otherwise, executing steps (4) and (6); (3) storing the packet slices containing the packet data payload, and adding the corresponding payload storage address pointer information to the header slice containing the packet header; (4) assigning a sequence number to the header slice containing the packet header information, parsing the header to obtain the packet type, and, according to the packet type, parsing, classifying and forwarding headers independently in parallel to update the header slice; (5) extracting the packet data payload from the cache according to the payload storage address information carried in the header, and splicing it with the corresponding header into a complete packet; (6) according to the sequence numbers carried in the headers, applying traffic shaping and queue management to the parallel-processed packets in order, then splitting them into multiple streams for forwarding.

2. The large-scale network data processing method based on a reconfigurable switching chip architecture according to claim 1, characterized in that step (1) is implemented as: (1.1) receiving packets from the physical link through multiple ports; (1.2) performing packet identification, verification and filtering on the received packets, filtering out invalid packets, and storing the remaining valid packets in a receive buffer; (1.3) aggregating the packet streams into one data stream in order of arrival time; (1.4) buffering the packets obtained in step (1.3) in sequence.

3. The large-scale network data processing method based on a reconfigurable switching chip architecture according to claim 1, characterized in that step (4) uses multiple parallel microengines to parse, classify and forward headers independently in parallel, specifically: (4.1) polling the working state of every thread of each microengine, and submitting each received header to the microengine with the most idle threads; (4.2) the microengine receiving the packet loads the corresponding microcode instructions and, according to them, schedules multiple threads to access the relevant table entries in the corresponding storage units of the memory module in a round-robin, non-preemptive manner, completing parsing, classification and forwarding of the header data frame to update the header slice.

4. The large-scale network data processing method based on a reconfigurable switching chip architecture according to claim 3, characterized in that the threads within each microengine work in a pipelined manner.

5. The large-scale network data processing method based on a reconfigurable switching chip architecture according to claim 3, characterized in that the specific method of accessing the relevant table entries in the corresponding storage units of the memory module in a round-robin, non-preemptive manner in step (4.2) is: (4.2.1) recording the thread numbers of all microengine threads that are ready to access a storage unit in the memory, together with the storage unit each needs to access; (4.2.2) polling whether the storage unit is currently being accessed; when a thread finishes accessing the storage unit, sequentially searching the recorded thread numbers for a thread ready to access that unit and granting it access.

6. The large-scale network data processing method based on a reconfigurable switching chip architecture according to claim 5, characterized in that the storage units include a DDR memory storing the VLAN table and the MPLS table; when a microengine accesses the DDR memory, it first invokes a search engine and directs it to search the table entries in the DDR using a hash algorithm or a binary-tree search algorithm, finds the entry matching the header being processed by the microengine, and feeds the search result back to the microengine.

7. The large-scale network data processing method based on a reconfigurable switching chip architecture according to claim 3, characterized in that the microengines are integrated on one chip.

8. The large-scale network data processing method based on a reconfigurable switching chip architecture according to claim 7, characterized in that the chip contains a dedicated instruction set for network packet processing, including multiplication instructions, cyclic redundancy check instructions, content-addressing instructions and FFS instructions; according to the microcode, the microengine schedules threads to execute these instructions to complete the corresponding packet processing.

9. The large-scale network data processing method based on a reconfigurable switching chip architecture according to claim 1, characterized in that step (6) applies traffic shaping to packets using a priority-based token bucket algorithm.

10. The large-scale network data processing method based on a reconfigurable switching chip architecture according to claim 1, characterized in that step (6) performs queue management on packets using priority queues, flow-based weighted queues, fair queues, or PQ or CQ queuing methods.
CN201711448872.4A 2017-12-27 2017-12-27 Large-scale network data processing method based on reconfigurable switching chip architecture Active CN108833299B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711448872.4A CN108833299B (en) 2017-12-27 2017-12-27 Large-scale network data processing method based on reconfigurable switching chip architecture

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201711448872.4A CN108833299B (en) 2017-12-27 2017-12-27 Large-scale network data processing method based on reconfigurable switching chip architecture

Publications (2)

Publication Number Publication Date
CN108833299A CN108833299A (en) 2018-11-16
CN108833299B true CN108833299B (en) 2021-12-28

Family

ID=64153941

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711448872.4A Active CN108833299B (en) 2017-12-27 2017-12-27 Large-scale network data processing method based on reconfigurable switching chip architecture

Country Status (1)

Country Link
CN (1) CN108833299B (en)

Families Citing this family (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109684269B (en) * 2018-12-26 2020-06-02 成都九芯微科技有限公司 PCIE (peripheral component interface express) exchange chip core and working method
CN110177046B (en) * 2019-04-18 2021-04-02 中国人民解放军战略支援部队信息工程大学 Security exchange chip based on mimicry thought, implementation method and network exchange equipment
CN110716797A (en) * 2019-09-10 2020-01-21 无锡江南计算技术研究所 DDR4 performance balance scheduling structure and method for multiple request sources
CN113037635B (en) * 2019-12-09 2022-10-11 郑州芯兰德网络科技有限公司 Multi-source assembling method and device for data block in ICN router
CN111031044A (en) * 2019-12-13 2020-04-17 浪潮(北京)电子信息产业有限公司 A message parsing hardware device and message parsing method
CN116868555A (en) * 2021-02-20 2023-10-10 华为技术有限公司 Switching system
CN113098798B (en) * 2021-04-01 2022-06-21 烽火通信科技股份有限公司 Method for configuring shared table resource pool, packet switching method, chip and circuit
CN112995067B (en) * 2021-05-18 2021-09-07 中国人民解放军海军工程大学 A coarse-grained reconfigurable data processing architecture and its data processing method
CN113691469B (en) * 2021-07-27 2023-12-26 新华三技术有限公司合肥分公司 Message disorder rearrangement method and single board
CN113949669B (en) * 2021-10-15 2023-12-01 湖南八零二三科技有限公司 Vehicle-mounted network switching device and system capable of automatically configuring and analyzing according to flow
CN117768947A (en) * 2022-09-26 2024-03-26 华为技术有限公司 Data communication method, exchange chip, communication node and communication network
CN115714693B (en) * 2022-11-16 2025-03-18 芯云晟(杭州)电子科技有限公司 Low power consumption control method, medium and low power consumption processor of network message pre-classification processor
CN116192660A (en) * 2023-02-28 2023-05-30 上海祖暅科技合伙企业(有限合伙) A Reconfigurable Packet Analysis Module
CN117319332B (en) * 2023-11-30 2024-04-02 成都北中网芯科技有限公司 Programmable hardware acceleration method for network message slicing and network processing chip

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1499792A (en) * 2002-11-11 2004-05-26 华为技术有限公司 Method for raising retransmission capability of network processor for servicing multiple data parts
CN1558626A (en) * 2004-02-10 2004-12-29 中兴通讯股份有限公司 Method for realizing group control function by means of network processor
CN1677952A (en) * 2004-03-30 2005-10-05 武汉烽火网络有限责任公司 Method and apparatus for wire speed parallel forwarding of packets
CN101276294A (en) * 2008-05-16 2008-10-01 杭州华三通信技术有限公司 Method and apparatus for parallel processing heteromorphism data
CN101442486A (en) * 2008-12-24 2009-05-27 华为技术有限公司 Method and apparatus for distributing micro-engine
CN101616097A (en) * 2009-07-31 2009-12-30 中兴通讯股份有限公司 A management method and system for network processor output port queue
EP2372962A1 (en) * 2010-03-31 2011-10-05 Alcatel Lucent Method for reducing energy consumption in packet processing linecards
CN105511954A (en) * 2014-09-23 2016-04-20 华为技术有限公司 Method and device for message processing
CN106612236A (en) * 2015-10-21 2017-05-03 深圳市中兴微电子技术有限公司 Many-core network processor and micro engine message scheduling method and micro engine message scheduling system thereof

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7831974B2 (en) * 2002-11-12 2010-11-09 Intel Corporation Method and apparatus for serialized mutual exclusion


Also Published As

Publication number Publication date
CN108833299A (en) 2018-11-16

Similar Documents

Publication Publication Date Title
CN108833299B (en) Large-scale network data processing method based on reconfigurable switching chip architecture
CN108809854B (en) Reconfigurable chip architecture for large-flow network processing
US11038993B2 (en) Flexible processing of network packets
US12095882B2 (en) Accelerated network packet processing
US11374858B2 (en) Methods and systems for directing traffic flows based on traffic flow classifications
CN101771627B (en) Equipment and method for analyzing and controlling node real-time deep packet on internet
US10735325B1 (en) Congestion avoidance in multipath routed flows
US11258726B2 (en) Low latency packet switch architecture
US7177276B1 (en) Pipelined packet switching and queuing architecture
US11818022B2 (en) Methods and systems for classifying traffic flows based on packet processing metadata
US10778588B1 (en) Load balancing for multipath groups routed flows by re-associating routes to multipath groups
US10693790B1 (en) Load balancing for multipath group routed flows by re-routing the congested route
JP2003508851A (en) Network processor, memory configuration and method
JP2003508954A (en) Network switch, components and operation method
JP2003508957A (en) Network processor processing complex and method
US10819640B1 (en) Congestion avoidance in multipath routed flows using virtual output queue statistics
US11863467B2 (en) Methods and systems for line rate packet classifiers for presorting network packets onto ingress queues
WO2001050259A1 (en) Method and system for frame and protocol classification
CN114327833A (en) Efficient flow processing method based on software-defined complex rule
US9590897B1 (en) Methods and systems for network devices and associated network transmissions
US11693664B2 (en) Methods and systems for distributing instructions amongst multiple processing units in a multistage processing pipeline
Mariño et al. Loopback strategy for in-vehicle network processing in automotive gateway network on chip
US10454831B1 (en) Load-balanced forwarding of network packets generated by a networking device
US20240163230A1 (en) Systems and methods for using a packet processing pipeline circuit to extend the capabilities of rate limiter circuits

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant