CN105634958A

CN105634958A - Packet forwarding method and device based on multi-core system

Info

Publication number: CN105634958A
Application number: CN201510990469.9A
Authority: CN
Inventors: 刘健男
Original assignee: Neusoft Corp
Current assignee: Neusoft Corp
Priority date: 2015-12-24
Filing date: 2015-12-24
Publication date: 2016-06-01
Anticipated expiration: 2035-12-24
Also published as: CN105634958B

Abstract

The present invention provides a packet forwarding method and device based on a multi-core system. The method comprises: receiving a packet and extracting a five-tuple from the packet; performing a lock-free lookup in a global session hash table to determine whether a session corresponding to the five-tuple exists; if no such session exists, creating a new session in the CPU that received the packet; if such a session exists, determining the CPU to which the session belongs; if the CPU to which the session belongs is the CPU that received the packet, forwarding the packet according to the session in the CPU that received the packet; and if the CPU to which the session belongs is not the CPU that received the packet, forwarding the packet to the CPU to which the session belongs, which then forwards the packet according to the session. The method enables packet forwarding on a multi-core system.

Description

Packet forwarding method and device based on a multi-core system

Technical field

The present invention relates to the field of network communication technology, and in particular to a packet forwarding method and device based on a multi-core system.

Background art

With the development of technology, multi-core processors are now in common use. Multi-core processors appeared in the expectation that processing capacity would multiply as the number of central processing units (CPUs) increases. However, because resources must be shared among the cores, the cores compete with one another. A CPU usually takes a lock when it uses a shared resource, and the use of locks degrades performance when multiple cores run concurrently.

When a multi-core system forwards packets, it forwards them on the basis of a session table. The design of the session table resource and the lookup of the session table are therefore key factors in the packet forwarding of a multi-core system. The session table is generally found by looking up a global session hash table. At present this lookup is performed under a lock, and locking itself incurs a performance cost that affects the packet forwarding performance of the multi-core system.

Summary of the invention

The present invention aims to solve, at least to some extent, one of the technical problems in the related art.

To this end, one object of the present invention is to propose a packet forwarding method based on a multi-core system, which can improve the packet forwarding performance of the multi-core system.

Another object of the present invention is to propose a packet forwarding device based on a multi-core system.

To achieve the above objects, the packet forwarding method based on a multi-core system proposed by an embodiment of the first aspect of the present invention comprises: receiving a packet and extracting a five-tuple from the packet; performing a lock-free lookup in a global session hash table to determine whether a session corresponding to the five-tuple exists; if it does not exist, creating a new session in the CPU that received the packet and forwarding the packet according to the new session; if it exists, determining the CPU to which the session belongs; if the CPU to which the session belongs is the CPU that received the packet, forwarding the packet according to the session in the CPU that received the packet; and if the CPU to which the session belongs is not the CPU that received the packet, forwarding the packet to the CPU to which the session belongs, which forwards the packet according to the session.

In the packet forwarding method proposed by the embodiment of the first aspect, the global session hash table is looked up in a lock-free manner. Compared with a lookup under a lock, this avoids the performance loss caused by locking itself and thus improves the packet forwarding performance of the multi-core system. Furthermore, the session is an independent local resource of each CPU: it can be accessed only by the CPU that created it, CPUs that did not create it are forbidden from accessing it, and the creating CPU accesses it without locks, which further reduces the performance loss caused by locking.

To achieve the above objects, the packet forwarding device based on a multi-core system proposed by an embodiment of the second aspect of the present invention comprises: an extraction module, configured to receive a packet and extract a five-tuple from the packet; a lookup module, configured to perform a lock-free lookup in a global session hash table to determine whether a session corresponding to the five-tuple exists; a creation module, configured to, when the session does not exist, create a new session in the CPU that received the packet and forward the packet according to the new session; a determination module, configured to, when the session exists, determine the CPU to which the session belongs; a first forwarding module, configured to, when the CPU to which the session belongs is the CPU that received the packet, forward the packet according to the session in the CPU that received the packet; and a second forwarding module, configured to, when the CPU to which the session belongs is not the CPU that received the packet, forward the packet to the CPU to which the session belongs, which forwards the packet according to the session.

In the packet forwarding device proposed by the embodiment of the second aspect, the global session hash table is looked up in a lock-free manner. Compared with a lookup under a lock, this avoids the performance loss caused by locking itself and thus improves the packet forwarding performance of the multi-core system. Furthermore, the session is an independent local resource of each CPU: it can be accessed only by the CPU that created it, CPUs that did not create it are forbidden from accessing it, and the creating CPU accesses it without locks, which further reduces the performance loss caused by locking.

Additional aspects and advantages of the present invention will be given in part in the following description, and in part will become apparent from that description or be learned through practice of the present invention.

Brief description of the drawings

The above and/or additional aspects and advantages of the present invention will become apparent and easy to understand from the following description of the embodiments with reference to the accompanying drawings, in which:

Fig. 1 is a schematic flowchart of a packet forwarding method based on a multi-core system according to one embodiment of the present invention;

Fig. 2 is a schematic flowchart of a packet forwarding method based on a multi-core system according to another embodiment of the present invention;

Fig. 3 is a schematic flowchart of adding a hash entry in an embodiment of the present invention;

Figs. 4-7 are schematic diagrams of the changes to the bucket linked list when a hash entry is added in an embodiment of the present invention;

Fig. 8 is a schematic flowchart of deleting a hash entry in an embodiment of the present invention;

Figs. 9-12 are schematic diagrams of the changes to the bucket linked list when a hash entry is deleted in an embodiment of the present invention;

Fig. 13 is a schematic diagram of the session structure in an embodiment of the present invention;

Fig. 14 is a schematic diagram of an experimental scenario in an embodiment of the present invention;

Fig. 15 is a schematic structural diagram of a packet forwarding device based on a multi-core system according to another embodiment of the present invention;

Fig. 16 is a schematic structural diagram of a packet forwarding device based on a multi-core system according to another embodiment of the present invention.

Detailed description of the embodiments

Embodiments of the present invention are described in detail below. Examples of the embodiments are shown in the accompanying drawings, in which identical or similar reference numerals denote, throughout, identical or similar modules or modules having identical or similar functions. The embodiments described below with reference to the drawings are exemplary; they serve only to explain the present invention and are not to be construed as limiting it. On the contrary, the embodiments of the present invention include all changes, modifications and equivalents falling within the spirit and scope of the appended claims.

Fig. 1 is a schematic flowchart of a packet forwarding method based on a multi-core system according to one embodiment of the present invention. The method comprises:

S11: Receive a packet and extract a five-tuple from the packet.

A multi-core system may contain multiple CPUs. After a CPU receives a packet, it can extract the five-tuple of the packet from the packet.

The five-tuple of a packet comprises: the source Internet Protocol (IP) address, the source port number, the destination IP address, the destination port number and the protocol type.

S12: Perform a lock-free lookup in the global session hash table to determine whether a session corresponding to the five-tuple exists.

Unlike the conventional locking approach, in this embodiment the lookup in the global session hash table is lock-free.

The global session hash table contains multiple hash entries. A hash entry usually stores a key-value pair (key, value), where the key is the five-tuple and the value is the session.

Whether a session corresponding to the five-tuple extracted from the packet exists can therefore be determined from the hash entries.

Furthermore, the hash entries can be optimized to reduce the amount of storage; see the related description below.

S13: If it does not exist, create a new session in the CPU that received the packet and forward the packet according to the new session.

For example, after CPU_A receives a packet, if CPU_A looks up the global session hash table and finds no session corresponding to the five-tuple of the packet, CPU_A creates a new session in CPU_A and can then forward the packet according to the new session.

S14: If it exists, determine the CPU to which the session belongs.

For example, after CPU_A receives a packet, if CPU_A looks up the global session hash table and finds a session corresponding to the five-tuple of the packet, CPU_A determines the CPU to which this session belongs. Assume that this CPU is CPU_B.

S15: If the CPU to which the session belongs is the CPU that received the packet, forward the packet according to the session in the CPU that received the packet.

For example, when CPU_B and CPU_A are the same CPU, CPU_A can forward the packet according to its own session.

In each CPU, the set of sessions is called a session table. Each entry of the session table is a session (also called a session entry), and packets are forwarded on the basis of sessions.

A session is a structure that contains the basic information of the whole session, various access policies, statistics and so on. The basic information includes, for example, the five-tuple and the media access control (MAC) address. Each session occupies approximately 450 bytes.

S16: If the CPU to which the session belongs is not the CPU that received the packet, forward the packet to the CPU to which the session belongs, which forwards the packet according to the session.

For example, when CPU_B and CPU_A are different, CPU_A forwards the packet to CPU_B, and CPU_B forwards the packet according to the session in CPU_B.

In this embodiment, a session can be accessed only by the CPU that created it, and that CPU accesses it without locks.

In this embodiment, the global session hash table is looked up in a lock-free manner. Compared with a lookup under a lock, this avoids the performance loss caused by locking itself and thus improves the packet forwarding performance of the multi-core system. Furthermore, the session is an independent local resource of each CPU: it can be accessed only by the CPU that created it, CPUs that did not create it are forbidden from accessing it, and the creating CPU accesses it without locks, which further reduces the performance loss caused by locking.
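The decision flow of S11-S16 can be illustrated with a small single-threaded simulation. This is only a sketch under stated assumptions: the names `FiveTuple`, `Session`, `Cpu`, `inbox` and `drain_inbox` are hypothetical, a plain Python dict stands in for the lock-free global session hash table, and the choice to repeat the lookup on the owning CPU after hand-over is an assumption of this sketch.

```python
from collections import deque
from typing import NamedTuple


class FiveTuple(NamedTuple):
    sip: str
    sport: int
    dip: str
    dport: int
    protocol: int


class Session:
    def __init__(self, key, owner_cpu):
        self.key = key
        self.owner_cpu = owner_cpu  # the CPU that created the session
        self.packets = 0            # example of per-session statistics


class Cpu:
    def __init__(self, cpu_id, global_table, cpus):
        self.cpu_id = cpu_id
        self.global_table = global_table  # shared: five-tuple -> Session
        self.cpus = cpus                  # all CPUs, indexed by id
        self.inbox = deque()              # packets handed over by other CPUs

    def forward(self, session, packet):
        # Only the owning CPU reaches this point, so no lock is taken.
        session.packets += 1

    def receive(self, key, packet):
        session = self.global_table.get(key)      # S12: lookup
        if session is None:                       # S13: create locally
            session = Session(key, self.cpu_id)
            self.global_table[key] = session
            self.forward(session, packet)
        elif session.owner_cpu == self.cpu_id:    # S14 + S15: local session
            self.forward(session, packet)
        else:                                     # S16: hand over to the owner
            self.cpus[session.owner_cpu].inbox.append((key, packet))

    def drain_inbox(self):
        # Assumption of this sketch: the owner simply runs the lookup again.
        while self.inbox:
            key, packet = self.inbox.popleft()
            self.receive(key, packet)
```

In this sketch a session's counters are only ever modified by its owning CPU, which is the property that lets the text dispense with locks on the session itself.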

Fig. 2 is a schematic flowchart of a packet forwarding method based on a multi-core system according to another embodiment of the present invention. The method comprises:

S21: A CPU receives a packet sent by a network interface card (NIC). The NIC distributes packets to the CPUs using the receive side scaling (RSS) algorithm, and the secret key used by the RSS algorithm satisfies the following condition: after the secret key is divided into groups of two bytes, adjacent groups have the same value.

A session (also called a connection) has two directions: client -> server (called the left direction) and server -> client (called the right direction).

In this embodiment both directions of the same session are handled by the same CPU, which reduces the performance overhead caused by cross-core hand-over, cache misses and locks.

To ensure that both directions of the same session are assigned to the same CPU, a secret key satisfying the above condition is used.

The principle is as follows:

First, the RSS algorithm is analysed. Its output is result, which determines the queue to which a packet is assigned, and each queue is processed by one CPU. To distribute both directions of the same session to the same CPU, packets of the two directions must therefore yield the same result.

The RSS algorithm can be described in words as follows:

For an input (input key) and a 320-bit secret key K:

Initially, result = 0;

For each bit b of the input (input key): if b = 1, then result = result XOR the most significant 32 bits of K;

otherwise, result is unchanged;

K is then shifted left by one bit.

Second, the input (input key) is analysed.

In actual NIC hardware, the input key of the RSS algorithm is the four-tuple of the packet (source IP address, destination IP address, source port, destination port). The input key takes the following forms:

1) TCP/IPv4 [SIP (32 bit), DIP (32 bit), SPORT (16 bit), DPORT (16 bit)]

2) IPv4 [SIP (32 bit), DIP (32 bit)]

3) TCP/IPv6 [SIP (128 bit), DIP (128 bit), SPORT (16 bit), DPORT (16 bit)]

4) IPv6 [SIP (128 bit), DIP (128 bit)]

5) UDP/IPv4 [SIP (32 bit), DIP (32 bit), SPORT (16 bit), DPORT (16 bit)]

6) UDP/IPv6 [SIP (128 bit), DIP (128 bit), SPORT (16 bit), DPORT (16 bit)]

Because the five-tuple of a packet contains the key values used by the hardware, the two directions of a connection can be assigned to the same CPU directly at the hardware level. For the input key taken from the five-tuple, when there is no network address translation (NAT), the left and right directions are known to contain the same number of 1 bits, so the number of XOR operations performed during RSS is also the same.

Finally, K is designed so that both directions of the same connection are assigned to the same queue.

Because the input key is not fixed, the value XORed in at each step must be the same for both directions, and K must be constrained accordingly. Analysing the RSS algorithm: K is shifted left by one bit at every step, whether or not b is 1. In other words, the first 4 bytes of the input (SIP) are XORed with the first 4 bytes of K; bytes 4-7 (DIP) with bytes 4-7 of K; bytes 8-9 (SPORT) with bytes 8-9 of K; and bytes 10-11 (DPORT) with bytes 10-11 of K. Therefore:

For an IPv4 packet, the value of each XOR for SIP and DIP is guaranteed to be the same if the secret key K satisfies the following conditions:

0-3B (SIP)+4-7B (DIP)=8B-15B

9-10B (SPORT)+11-12B (DPORT)=13B-16B

For an IPv6 packet, K needs to satisfy the following conditions:

0-15B (SIP)+16-31B (DIP)=32B-63B

32-33B (SPORT)+34-35B (DPORT)=36B-39B

Taking 0-3B (SIP)+4-7B (DIP)=8B-15B as an example, the formula means that bytes 0-3 of K equal bytes 8-15 of K; the other formulas are read in the same way.

Taking the minimum of the two cases, K must be set to a special key in which every group of two bytes is identical. That is, after K is divided into groups of two bytes, adjacent groups have the same value. This guarantees that packets of both directions of the same connection are assigned to the same queue and hence processed by the same CPU.
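The RSS computation described in words above, and the effect of a key whose two-byte groups all repeat, can be sketched as follows. The function names `toeplitz_hash` and `tcp4_input` and the key value `0x6D5A` are assumptions chosen for illustration; the text only requires that adjacent two-byte groups of K be equal. The bit-by-bit procedure itself is the Toeplitz hash commonly used for RSS.

```python
import socket
import struct


def toeplitz_hash(data: bytes, key: bytes) -> int:
    """For every 1 bit of the input, XOR in the current top 32 bits of K;
    K is shifted left by one bit after every input bit."""
    key_bits = len(key) * 8
    if len(data) * 8 + 32 > key_bits:
        raise ValueError("key too short for this input")
    k = int.from_bytes(key, "big")
    result = 0
    for i in range(len(data) * 8):
        bit = (data[i // 8] >> (7 - i % 8)) & 1
        if bit:
            # Top 32 bits of K after it has been shifted left i times.
            result ^= (k >> (key_bits - 32 - i)) & 0xFFFFFFFF
    return result


def tcp4_input(sip: str, dip: str, sport: int, dport: int) -> bytes:
    """Input key for TCP/IPv4: SIP (32 bit), DIP (32 bit), SPORT, DPORT."""
    return (socket.inet_aton(sip) + socket.inet_aton(dip)
            + struct.pack("!HH", sport, dport))


# 320-bit key made of one repeated two-byte group (value chosen arbitrarily).
SYMMETRIC_KEY = bytes([0x6D, 0x5A]) * 20

left = toeplitz_hash(tcp4_input("10.0.0.1", "192.168.1.9", 40000, 443),
                     SYMMETRIC_KEY)
right = toeplitz_hash(tcp4_input("192.168.1.9", "10.0.0.1", 443, 40000),
                      SYMMETRIC_KEY)
print(left == right)  # both directions map to the same hash
```

Because the key repeats every 16 bits, the 32-bit window seen at bit position i is the same as the one at i + 16 or i + 32, so swapping the addresses and swapping the ports leaves the XOR result unchanged.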

In this embodiment, a connection can be accessed by only one CPU. Operations by other CPUs on the session are restricted: other CPUs are forbidden to write, delete or create session resources belonging to this CPU. Session resources are local resources and can be accessed only by the CPU that created them. Specifically:

(1) Both directions of the same connection can be processed only by the same CPU, namely the CPU that created the session.

(2) When the session hash table is looked up, if the connection of the packet belongs to another CPU, the packet must be sent to that CPU for processing through a cross-core queue. This guarantees lock-free session operation across cores and avoids locking when multiple cores access the same resource concurrently.

(3) Each CPU runs only one thread; multiple threads never run on the same CPU. Session resources are therefore per-core resources, that is, per-thread resources, and no mutual exclusion for concurrent multi-threaded access arises.

S22: The CPU extracts the five-tuple of the packet.

For example, the source IP address, destination IP address, source port number, destination port number and protocol type of the packet are extracted.

S23: The CPU performs a lock-free lookup in the global session hash table to determine whether a session corresponding to the five-tuple of the packet exists.

To ensure safety and robustness, access to the global session hash table is conventionally protected by a read-write lock, which provides concurrent mutual-exclusion protection for the shared resource. Algorithms such as RCU (read-copy-update) and atomic variables were later adopted so that reads and writes can proceed together and concurrent hash-table lookups perform better. Experiments and theoretical analysis show, however, that the performance of the concurrent lookup algorithms in wide use today is actually not high.

The hash lookup algorithm used in this embodiment takes no lock during lookup; only write operations exclude one another. It delivers high performance while still guaranteeing safety and robustness. For the detailed reasoning and the analysis of experimental data, see the related description below.

The global session hash table contains multiple hash entries and is empty initially. After a CPU creates a session, it adds the hash entry corresponding to the session to the global session hash table; after the session is released, the hash entry corresponding to the session is deleted from the global session hash table.

When a multi-core system runs with heavy concurrency, read locks, too, incur a considerable performance cost, so performance cannot grow linearly as cores run concurrently. The method therefore performs lookups without locks and takes a write lock only for creation and deletion. Specifically:

(1) Concurrent additions and deletions on the same bucket are not allowed.

(2) Concurrent addition and lookup on the same bucket are allowed.

(3) Concurrent deletion and lookup on the same bucket are allowed.

(4) The hash key of a session entry does not change during its lifetime; that is, the five-tuple of a session entry never changes during packet forwarding. If a change is really needed, the session entry must first be deleted and then created anew.

In some embodiments, referring to Fig. 3, the procedure for adding a hash entry may comprise:

S31: Obtain the hash entry to be added.

After a new session is created in a CPU, the corresponding hash entry must be added to the global session hash table.

A hash entry is usually expressed as (key, value), where the key is the five-tuple of the session and the value is the session.

Because both directions of the same session are processed by the same CPU, the (key, value) pairs of both directions can be added to the hash table after a session is created: c -> s [left key] and s -> c [right key]. In the usual case without NAT rules:

If the left key (leftkey) is (sip, sport, dip, dport, protocol), then the right key (rightkey) is (dip, dport, sip, sport, protocol).
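For the case without NAT rules, the relationship between the two keys amounts to swapping the source and destination fields. A minimal sketch, in which the names `FiveTuple` and `right_key` are hypothetical:

```python
from typing import NamedTuple


class FiveTuple(NamedTuple):
    sip: str
    sport: int
    dip: str
    dport: int
    protocol: int


def right_key(left: FiveTuple) -> FiveTuple:
    """Key of the server -> client direction when no NAT rule applies."""
    return FiveTuple(left.dip, left.dport, left.sip, left.sport, left.protocol)


leftkey = FiveTuple("10.0.0.1", 40000, "192.168.1.9", 443, 6)
rightkey = right_key(leftkey)
```

Applying `right_key` twice returns the original left key, which is a quick sanity check that the two directions describe the same connection.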

For IP addresses subject to a NAT rule, the NAT rules are matched before the session is created for the packet, and the correct rightkey is computed as the expected right-direction key and hung on the hash table.

A hash entry normally records (key, value). The key is the five-tuple, i.e. (sip, dip, sport, dport, protocol), which occupies 40 bytes in total; the value is the session.

A session has two directions, each identified by a five-tuple, whose fields occupy the following space:

Source IP address: 4 bytes for IPv4, 16 bytes for IPv6

Source port: 2 bytes for both IPv4 and IPv6

Destination IP address: 4 bytes for IPv4, 16 bytes for IPv6

Destination port: 2 bytes for both IPv4 and IPv6

Transport layer protocol: 1 byte

Overall, to accommodate both IPv4 and IPv6, the key structure is defined as a union, which ensures consistency, and it is cache-line aligned to keep memory access fast. Its size is 40 bytes. Moreover, a session has two directions, so even before anything else is stored in a session entry, the keys alone occupy 80 bytes. This is a very large in-memory structure.

In session-based forwarding, the five-tuple is not only the unique key of the session table; it is also needed during forwarding.

A session itself occupies about 450 bytes, which is a comparatively large structure, and it holds all the information about the connection. To reduce the amount of storage, the value can store the address of the session.

To save further space, both the key and the value of a hash entry are kept inside the session structure, and only the address of the session is hung on the hash table. The five-tuple has one more property: a connection has keys for two directions, client (left) and server (right). That is, once a connection is established, an address is hung on the hash table for each direction. Both directions refer to the same session, but the direction must be marked when the session is hung on the session hash table. What is hung on the hash table is therefore not just the address; it also includes the direction of the session.

Memory addresses in a computer generally have 0 as their last bit. To save space, and to ensure that lock-free access from multiple cores never observes an inconsistent (key, value) pair, the address hung on the hash table is addr | direction, where 0x00 denotes the left direction and 0x01 denotes the right direction.

If the direction of the session is left, bucket = hash(lkey) and the recorded address is Addr(session) | 0x00; if the direction of the session is right, bucket = hash(rkey) and the recorded address is Addr(session) | 0x01.

In summary, the (key, value) hung on a hash bucket is simply address (addr) | direction. Its size is 64 bits, in keeping with the principle of atomic access. This both guarantees safe, atomic access during lock-free reads and reduces space usage. The direction may look like a saving of only a few bytes, but network security vendors routinely guarantee 10 million session entries or more, and the keys alone would otherwise take about 80 bytes per session, so this design saves a great deal of space.
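The addr | direction encoding can be sketched with plain integers standing in for 64-bit session addresses. The names `pack_entry`, `unpack_entry`, `LEFT` and `RIGHT` are hypothetical, and the sketch assumes that session addresses are at least 2-byte aligned so that the lowest bit is free.

```python
LEFT = 0x00    # client -> server
RIGHT = 0x01   # server -> client
DIRECTION_MASK = 0x01


def pack_entry(session_addr: int, direction: int) -> int:
    """Combine a session address and a direction into one 64-bit word."""
    if session_addr & DIRECTION_MASK:
        raise ValueError("session address must have its lowest bit clear")
    if direction not in (LEFT, RIGHT):
        raise ValueError("direction must be LEFT or RIGHT")
    if not 0 <= session_addr < 2 ** 64:
        raise ValueError("address does not fit in 64 bits")
    return session_addr | direction


def unpack_entry(entry: int):
    """Split a stored word back into (session address, direction)."""
    return entry & ~DIRECTION_MASK, entry & DIRECTION_MASK


entry = pack_entry(0x7F0012345000, RIGHT)
addr, direction = unpack_entry(entry)
```

Because address and direction travel in a single word, a reader that loads the word once sees both together, which is the consistency property the text relies on for lock-free reads.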

S32: Compute the corresponding bucket linked list from the five-tuple in the hash entry.

For example, the bucket is computed as:

bucket = hash_tbl[hash(key)]

where key is the five-tuple.

For example, the bucket linked list so determined is shown in Fig. 4; it originally contains the hash entries key2_addr and key1_addr.

S33: Under a lock, add the hash entry to be added at the head of the bucket linked list.

In this embodiment, a hash entry is always added at the head of the list; it is never inserted between existing hash entries.

The lock is taken before the addition and released after it. Expressed as an algorithm:

lock(bucket)

old_head = bucket->next

new->next = old_head

bucket->next = new

unlock(bucket)

Adding a hash entry at the head of the bucket linked list may comprise:

S331: Obtain the original first entry of the bucket linked list.

As shown in Fig. 5, the original first entry is denoted key2_addr and the hash entry to be added is denoted key3_addr.

S332: Leave the existing pointers of the bucket linked list unchanged, and point the pointer of the hash entry to be added at the original first entry.

As shown in Fig. 6, the pointer of key3_addr is pointed at key2_addr.

The advantage is that, because reads of the hash table take no lock, a lookup that runs without a read lock while an entry is being added still finds the correct entry and does not miss an entry.

S333: Change the head of the bucket linked list from pointing at the original first entry to pointing at the hash entry to be added.

As shown in Fig. 7, the head is changed from pointing at key2_addr to pointing at key3_addr.
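Steps S331-S333, together with a lookup that takes no lock, can be sketched as follows. The class and method names (`Entry`, `Bucket`, `SessionHash`, `add`, `lookup`) are hypothetical. The sketch relies on a single reference assignment being atomic in CPython; in a language such as C the head update would additionally need appropriate memory ordering, which the text does not discuss.

```python
import threading


class Entry:
    def __init__(self, key, value):
        self.key = key
        self.value = value
        self.next = None


class Bucket:
    def __init__(self):
        self.head = None
        self.write_lock = threading.Lock()  # taken by writers only


class SessionHash:
    def __init__(self, n_buckets=1024):
        self.buckets = [Bucket() for _ in range(n_buckets)]

    def _bucket(self, key):
        return self.buckets[hash(key) % len(self.buckets)]

    def add(self, key, value):
        new = Entry(key, value)
        bucket = self._bucket(key)
        with bucket.write_lock:
            old_head = bucket.head  # S331: original first entry
            new.next = old_head     # S332: new entry points at it
            bucket.head = new       # S333: head now points at the new entry
        return new

    def lookup(self, key):
        # No lock: a reader sees either the old head or the new head,
        # and both lead to a complete chain.
        node = self._bucket(key).head
        while node is not None:
            if node.key == key:
                return node.value
            node = node.next
        return None
```

The order of the two assignments matters: the new entry is fully linked before it becomes reachable from the head, so a concurrent reader never follows a dangling `next`.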

In some embodiments, referring to Fig. 8, the procedure for deleting a hash entry may comprise:

S81: Determine the hash entry to be deleted.

S82: Compute the corresponding bucket linked list from the five-tuple in the hash entry.

For example, the bucket is computed as:

bucket = hash_tbl[hash(key)]

where key is the five-tuple of the session.

For example, the bucket linked list so determined is shown in Fig. 9.

S83: Under a lock, delete the hash entry to be deleted from the bucket linked list.

The lock must be taken before the deletion and released after it. That is, while an entry of a bucket linked list is being deleted, no concurrent addition or deletion on the same bucket is allowed, but lookups (reads) are allowed.

Specifically, deleting a hash entry from the bucket linked list may comprise:

S831: Determine the hash entry preceding the hash entry to be deleted.

For example, referring to Fig. 9, the hash entry to be deleted is denoted key2_addr and the preceding hash entry is denoted key3_addr.

S832: Leave the pointer of the hash entry to be deleted unchanged, and point the pointer of the preceding hash entry at the hash entry following the one to be deleted.

For example, referring to Fig. 10, the hash entry following the one to be deleted is denoted key1_addr. The pointer of key2_addr is left unchanged, and the pointer of key3_addr is changed from pointing at key2_addr to pointing at key1_addr.

The benefit of leaving the next pointer of key2_addr untouched for the moment is that a read operation running during the deletion can still follow the next pointers of the original list and find the correct entry; the chain is not broken, so no entry is missed.

S833: Release the pointer of the hash entry to be deleted.

For example, referring to Fig. 11, the pointer of key2_addr is released.

S834: Delete the hash entry to be deleted.

For example, referring to Fig. 12, key2_addr is deleted and can be returned to the free memory pool.

The bucket linked list then no longer contains key2_addr; key3_addr points to key1_addr.
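The deletion steps S831-S834 can be sketched in the same style. The names `Entry`, `Bucket`, `remove` and `free_pool` are hypothetical. The text does not state how long to wait between S832 and S833; this sketch performs S833 immediately, which is an assumption that is only safe when no reader is still standing on the removed entry.

```python
import threading


class Entry:
    def __init__(self, key, next=None):
        self.key = key
        self.next = next


class Bucket:
    def __init__(self, head=None):
        self.head = head
        self.write_lock = threading.Lock()  # taken by writers only


def remove(bucket, key, free_pool):
    """Unlink the entry with the given key and return it to free_pool."""
    with bucket.write_lock:
        prev, node = None, bucket.head
        while node is not None and node.key != key:  # S831: find predecessor
            prev, node = node, node.next
        if node is None:
            return False
        # S832: bypass the entry but leave node.next untouched, so a reader
        # currently on this entry can still reach the rest of the chain.
        if prev is None:
            bucket.head = node.next
        else:
            prev.next = node.next
    node.next = None         # S833: release the entry's own pointer
    free_pool.append(node)   # S834: hand the entry back to the pool
    return True
```

With the chain key3_addr -> key2_addr -> key1_addr from Fig. 9, removing key2_addr leaves key3_addr -> key1_addr, matching Fig. 12.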

The concurrency safety of lock-free reads and deletion is analysed below.

As stated above, each CPU has only one forwarding thread, so multiple threads on the same CPU never look up and delete from the session hash table at the same time, and there is no concurrency problem within one CPU. Next, the concurrency safety of lock-free reads and locked deletion is analysed.

When a session lookup runs in parallel with a deletion, multiple cores operate at the same time and reads are unlocked, so a lookup may find a hash entry that is just being deleted. This could raise the problem of accessing an illegal address or an invalid session entry. These safety problems do not arise in the present invention, for the following reasons:

(1) The session table uses a memory pool, so no illegal address is ever accessed. The session table is allocated in advance from the memory pool in a fixed quantity, and its memory addresses remain valid permanently. As long as the memory pool exists, an address cannot be an illegal memory address. Even if a session entry that has just been found is deleted by another CPU, no memory access violation occurs.

(2) Consider the case where the lookup and the deletion take place on different CPUs; a session entry can be deleted only on the CPU that created the session. Suppose a packet is received by CPU_A, which performs a lookup and finds that the session belongs to CPU_B, and that at this very moment CPU_B deletes the session entry. Could the packet then have found an invalid session entry (because it has just been deleted)? This problem does not exist, because after the lookup the packet is handed over to CPU_B. Once on CPU_B, the key is computed again and the hash table is looked up again. Because the hash entry has been deleted by then, the invalid session entry is not found.

S24: If it does not exist, create a new session on the CPU that received the packet, and forward the packet according to the new session.

S25: If it exists, determine the CPU to which the session belongs.

S26: If the CPU to which the session belongs is the CPU that received the packet, forward the packet according to the session entry corresponding to the session on the CPU that received the packet.

S27: If the CPU to which the session belongs is not the CPU that received the packet, forward the packet to the CPU to which the session belongs, and that CPU forwards the packet according to the session entry corresponding to the session.

For details of S24 to S27, refer to the description of S13 to S16, which is not repeated here.
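The decision flow of S24 to S27 can be sketched as follows. This is a minimal illustration, not the patented implementation: the class and method names (Session, Forwarder, handle) are hypothetical, a Python dict stands in for the lock-free global session hash table, and the handover to another CPU is modeled as a plain function call on which the owning CPU repeats the lookup.

```python
from dataclasses import dataclass, field


@dataclass
class Session:
    key: tuple       # five-tuple extracted from the packet
    owner_cpu: int   # CPU that created the session


@dataclass
class Forwarder:
    # Stand-in for the global session hash table (hypothetical simplification).
    table: dict = field(default_factory=dict)
    log: list = field(default_factory=list)

    def lookup(self, key):
        # In the described design this is a lock-free bucket walk.
        return self.table.get(key)

    def handle(self, cpu, key):
        """Process one packet on `cpu`; return the CPU that forwarded it."""
        session = self.lookup(key)
        if session is None:
            # S24: no session yet, create it on the receiving CPU.
            self.table[key] = Session(key, cpu)
            self.log.append(("create", cpu))
        elif session.owner_cpu != cpu:
            # S27: hand the packet over; the owning CPU looks it up again.
            self.log.append(("handover", cpu, session.owner_cpu))
            return self.handle(session.owner_cpu, key)
        # S26 (or the tail of S24): forward on the CPU that owns the session.
        self.log.append(("forward", cpu))
        return cpu
```

Because the owning CPU repeats the lookup after the handover, a session deleted in the meantime is simply not found on the second lookup, which matches the safety argument in point (2) above.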

Further, each CPU can request memory for sessions in advance, so that after a CPU creates a session, the session is stored in that memory. The number of entries in each session pool is requested globally in advance, according to the license. This reduces the number of memory allocation (alloc) calls during operation and prevents excessive memory fragmentation.

A first-in-first-out (First In First Out, FIFO) queue and a cache can be used to manage the allocation and release of memory. A FIFO queue stores the session addresses of the pool and serves the get/put operations that allocate and release sessions. The allocation and release queue of each session pool also has a first-level cache of size 32, so that the 32 most frequently used entries are used preferentially, which raises the cache hit rate and reduces the performance impact of cache misses. To improve local cache utilization, N cache slots are kept outside the FIFO ring for the top-N most frequently used entries: on release, the most recently used entry is put into the cache first, and only if the cache is full is it put into the FIFO ring. On alloc, an entry is likewise taken from the cache first; if the cache is empty, N entries are allocated in a batch from the FIFO into the cache, and allocation then continues from the cache. This improves cache utilization for the resource and therefore improves performance.
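The allocation scheme described above can be sketched as a FIFO ring backed by a small cache. This is only an illustration under stated assumptions: the class name SessionPool is hypothetical, integer slot indices stand in for preallocated session addresses, and the refill policy (batch-fill the cache from the ring when the cache is empty) is one reading of the text.

```python
from collections import deque


class SessionPool:
    """Preallocated session slots: a FIFO ring plus a small LIFO cache."""

    def __init__(self, capacity, cache_size=32):
        # Slot indices stand in for preallocated session addresses.
        self.ring = deque(range(capacity))
        self.cache = []              # most recently freed slot is reused first
        self.cache_size = cache_size

    def alloc(self):
        if not self.cache:
            # Cache empty: refill it in one batch from the FIFO ring.
            for _ in range(min(self.cache_size, len(self.ring))):
                self.cache.append(self.ring.popleft())
        if not self.cache:
            return None              # pool exhausted
        return self.cache.pop()

    def free(self, slot):
        if len(self.cache) < self.cache_size:
            self.cache.append(slot)  # keep hot slots close at hand
        else:
            self.ring.append(slot)   # cache full: return the slot to the ring
```

Reusing the most recently freed slot first means the memory it refers to is more likely to still be in the CPU cache, which is the effect the text attributes to the first-level cache.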

In addition, when a session is stored in memory, it is stored as a structure. The structure includes a left-direction part, a right-direction part, and a common part, and the frequently accessed elements of each part are kept in cache.

Specifically, this may include:

(1) Collecting statistics on the frequently accessed elements per cache line (cacheline).

The top-N accessed elements can be divided into two classes: read-only and write-only.

The elements that are read most often at the same time on the forwarding critical path are packed compactly together and aligned to a cacheline, to reduce the impact of cache misses on performance, because an access that spans two cachelines means two loads or two stores.

Structure elements that are written should, as far as possible, not be placed in a cacheline that holds read-only elements, so that the CPU can, as far as possible, read everything it needs with a single cacheline read.

(2) structure segmentation layout

A session table entry is uniquely identified by the five-tuple of the packet (sip, dip, sport, dport). Packets in both directions must be handled by the same CPU, and the session-based forwarding path accesses the internal elements of the session entry at high frequency. For a firewall with high security and high robustness, however, the structure of a session entry contains not only the per-direction key value, state machine, IP address, MAC address, ingress/egress device, neighbor entry, CAM entry, packet statistics, and so on, but also shared content such as the connection state machine and the various policies of the connection.

As described above, packet forwarding can generally be divided, by forwarding logic and accessed content, into the client -> server direction, the server -> client direction, and a common part. In most cases, the c -> s direction mainly accesses only the c -> s part plus the common part, and the s -> c direction accesses only the s -> c part plus the common part.

Therefore, referring to Figure 13, the session structure generally has three parts:

Left-direction part: content exclusive to the client -> server direction;

Right-direction part: content exclusive to the server -> client direction;

Common part: content shared by the left and right directions.

In addition, the filled area in each part of the figure represents the elements accessed at high frequency. These frequently accessed elements are packed compactly, according to their size, into as few cachelines as possible, so that a high cache hit rate improves fast-forwarding performance.
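The three-part layout can be illustrated with ctypes, which follows C structure layout rules. This is a sketch under assumptions: a 64-byte cache line, and hypothetical field names and sizes chosen only to show each part being padded to a cache-line boundary. It shows offsets within the structure and does not control the alignment of the base address.

```python
import ctypes

CACHE_LINE = 64  # assumed cache-line size in bytes


class DirectionPart(ctypes.Structure):
    """Content exclusive to one direction (hypothetical fields)."""
    _fields_ = [
        ("ip", ctypes.c_uint32),
        ("port", ctypes.c_uint16),
        ("mac", ctypes.c_uint8 * 6),
        ("out_dev", ctypes.c_uint32),
        ("pkt_count", ctypes.c_uint64),
        ("_pad", ctypes.c_uint8 * 40),   # pad the 24 used bytes up to 64
    ]


class CommonPart(ctypes.Structure):
    """Content shared by both directions (hypothetical fields)."""
    _fields_ = [
        ("state", ctypes.c_uint32),
        ("policy_id", ctypes.c_uint32),
        ("owner_cpu", ctypes.c_uint16),
        ("_pad", ctypes.c_uint8 * 54),   # pad the 10 used bytes up to 64
    ]


class Session(ctypes.Structure):
    _fields_ = [
        ("left", DirectionPart),    # client -> server
        ("right", DirectionPart),   # server -> client
        ("common", CommonPart),
    ]
```

With this layout a packet in the client -> server direction touches only the cache lines of `left` and `common`, never the line that holds `right`.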

In summary, the key technical points are as follows:

First, analysis of the impact of locks on performance under multi-core concurrency.

Under multi-core concurrency, the use of shared resources causes contention and mutual exclusion between the CPUs. In session-based forwarding, throughput depends entirely on the lookup of the global session table and the processing of session entries.

Existing technical papers and inventions on lock-free algorithms for multi-core forwarding pay no attention to the performance cost of read locks or of optimized atomic variables. Indeed, outside high-traffic throughput tests on 10-Gigabit networks, or even on 10G network cards, the performance cost of a read lock is almost negligible; in a high-performance forwarding environment, however, even the cost of a read lock cannot be ignored. The present invention explains the cost of locks through principle and experimental data and, based on this conclusion, designs a fast lock-free lookup algorithm for the session hash table. The algorithm not only resolves conflicts between concurrent reads and writes of a contended resource, but also removes the cost of reads when there is no contention, so that concurrent multi-core lookups are not prevented from scaling with the number of cores by locks used to protect against resource contention.

Second, the design of concurrent multi-core session table lookup and processing. The core mechanisms are multi-core session forwarding based on a per-core mechanism and a safe, robust lock-free hash table lookup design. Making session resources per-core greatly reduces the probability of resource contention and mutual exclusion under multi-core concurrency. Making the global session hash table lock-free ensures that forwarding throughput scales with the number of cores, while the methods for adding and deleting entries and for processing sessions ensure safety and robustness.

Session entry resources are per-core, and local resources may only be operated on by the local CPU core. A session of the same connection covers both directions.

The RSS queue-distribution algorithm of the hardware network card is used with a suitably chosen hardware secret key so that both directions of the same session are assigned to the same CPU. The same connection is therefore assigned to the same CPU without any cross-core handover, packets of the same connection avoid handover during processing, and the session resources are used exclusively during forwarding, so that session processing also scales with the number of cores in multi-core forwarding.
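Why a suitable key sends both directions to the same CPU can be checked with a software version of the Toeplitz hash that RSS commonly uses. The sketch below is illustrative only: the function names are hypothetical, the input layout (source IP, destination IP, source port, destination port) is the usual IPv4/TCP RSS input, and the key value 0x6D5A repeated is an assumed example that satisfies the condition stated elsewhere in this description, namely that adjacent two-byte groups of the key are identical.

```python
import ipaddress
import struct


def toeplitz_hash(key: bytes, data: bytes) -> int:
    """32-bit Toeplitz hash of `data` under `key` (bits taken MSB first)."""
    key_int = int.from_bytes(key, "big")
    key_bits = len(key) * 8
    result = 0
    for i in range(len(data) * 8):
        byte, bit = divmod(i, 8)
        if data[byte] & (0x80 >> bit):
            # XOR in the 32-bit window of the key that starts at bit i.
            result ^= (key_int >> (key_bits - 32 - i)) & 0xFFFFFFFF
    return result


def rss_input(sip: str, dip: str, sport: int, dport: int) -> bytes:
    """Concatenate the hash input fields in network byte order."""
    return (ipaddress.IPv4Address(sip).packed
            + ipaddress.IPv4Address(dip).packed
            + struct.pack("!HH", sport, dport))


# Assumed example key: every two-byte group is identical, so the key has a
# period of 16 bits and the hash is unchanged when source and destination swap.
SYMMETRIC_KEY = bytes([0x6D, 0x5A]) * 20
```

Swapping the addresses shifts each input bit by 32 positions and swapping the ports shifts each bit by 16 positions; with a key that repeats every 16 bits, each bit meets the same key window in both directions, so the two hashes, and therefore the selected queue and CPU, are equal.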

The session lookup hash table, whose key is the network five-tuple of the session, is a global resource. Lookups take no lock, while additions and deletions must take a write lock. During forwarding, the key value of a given connection never changes, and the way entries are added to and deleted from the linked list keeps lock-free reads safe and robust.

Third, efficient memory management design for session entries and the hash table.

Session entry resources are preallocated and cached, and are per-core, which avoids both mutual-exclusion contention on shared resources between cores and memory fragmentation. The design of the (key, value) storage and of the hash table structure ensures efficient cache utilization and uses space efficiently while saving it, so that even when forwarding very high traffic of 160G, memory access does not become a performance bottleneck. Memory access speed is greatly improved, and performance grows linearly with the number of cores.

The technical effects are as follows:

First, performance is substantially improved relative to read-write locks, RCU, atomic variables, and the like.

The use of shared resources and locks is designed on the basis of a comparative analysis of how the various kinds of lock affect performance, so as to achieve the best high-performance concurrent access to shared resources. Based on this analysis and on experimental comparison, the lookup of the session hash table uses a lock-free lookup algorithm.

Most current lock-free algorithms rely on methods such as atomic variables and RCU for reads, in order to reduce the cost of busy waiting on conflict. Read locks, which are often not even considered, can nevertheless have a large impact on performance. Both principle and lab tests lead to the conclusion that read locks, atomic variables, and RCU still cause a significant performance drop, which is the direct reason the present invention uses a lock-free algorithm.

First, the cycles of the lock operation itself have a certain cost, although this accounts for only a small part.

Second, locking operations, including read locks and atomic variables, cause the hardware instructions to flush all data in the write buffer to memory. In multi-core forwarding at high throughput of nearly 80G, calling these instructions frequently affects performance.

Third, lock instructions and atomic variables prohibit the CPU from reordering instructions at run time. In fast forwarding, the hash linked list is looked up once per packet, so the call frequency is very high, and instruction optimization is important for performance. This is another reason why performance cannot scale with the number of concurrent cores.

Finally, and most importantly, the cache lock is not used well (and performance is relatively low even when a cache lock is used), or a bus lock is used in the actual call. For example, when the memory region protected by the lock spans multiple cache lines, or when resource contention is very frequent, the hardware still falls back to a bus lock to implement the lock. In the most common case, the memory region protected by one core's lock is in its cache while other cores still attempt to access it (for reading or writing), which causes a bus lock. With a cache lock, when CPU1 modifies lock i in a cache line, CPU2 cannot cache the cache line holding lock i at the same time; the other CPUs are blocked so that CPU1 has exclusive use of the shared memory while it holds the lock. The cache line is then monopolized now by one CPU and now by another, which continually generates cache-coherence RFO requests and degrades concurrent performance. If the same lock is frequently accessed by multiple cores at the same time (whether for reading or writing), the cache lock cannot be used to improve performance. In particular, most current processors do not support cache locking; among Intel processors, only the Pentium 4, Intel Xeon, and P6 processors support this function.

Referring to Figure 14, the experiment is based on DPDK layer-2 forwarding. With the session table already built, two CPUs each look up one of two adjacent bucket entries of the hash table. Suppose CPU2 looks up using bucket2 and key2_addr, and CPU1 looks up using bucket1 and key1_addr. Clearly, even with no lock at all, no problem can arise. The following compares layer-2 forwarding performance for 64-byte UDP small packets:

Read lock before the lookup, read unlock after the lookup:

total_tx_speed: 14880546.400 (pps) 7618.840 (Mbps)

total_rx_speed: 10562110.350 (pps) 5407.800 (Mbps)

total_lossrate: 29.021 (%)

No read lock before the lookup; lookup only:

total_tx_speed: 14880572.000 (pps) 7618.853 (Mbps)

total_rx_speed: 14880298.700 (pps) 7618.713 (Mbps)

total_lossrate: 0.002 (%)
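The reported loss rates and bit rates follow directly from the packet rates. The short check below assumes that loss rate is (tx - rx) / tx and that the Mbps figures count 64 bytes per frame without preamble or inter-frame gap; both assumptions are inferred from the numbers themselves, and the function names are hypothetical.

```python
def loss_rate_percent(tx_pps: float, rx_pps: float) -> float:
    """Share of transmitted packets that were not received, in percent."""
    return (tx_pps - rx_pps) / tx_pps * 100


def mbps(pps: float, frame_bytes: int = 64) -> float:
    """Bit rate in Mbps for a given packet rate and frame size."""
    return pps * frame_bytes * 8 / 1e6


with_read_lock = loss_rate_percent(14880546.400, 10562110.350)
without_lock = loss_rate_percent(14880572.000, 14880298.700)
```

Under these assumptions the first case gives about 29.02 % loss and the second about 0.002 %, in agreement with the figures listed above.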

The experiment shows that the common belief in multi-core design, that read locks and atomic variables do not affect performance when there is no resource contention, is wrong: their impact on performance is a silent one. As technology has developed, many developers have focused on optimizing locks where conflicts may occur and have ignored the performance cost of reads even when no conflict occurs. The present invention starts from exactly this point and uses a truly lock-free lookup to guarantee high throughput in fast forwarding.

In a high-performance fast-forwarding design, finding the correct session entry quickly and safely is the key factor that determines performance. Because read-write locks, atomic variables, and RCU, and the various lock-free lookup algorithms built on them, all have a considerable impact on performance and cannot guarantee that performance scales under multi-core concurrency, a completely lock-free hash lookup algorithm for the session table was designed.

Second, multi-core forwarding throughput grows linearly (scales with the number of cores).

Because resources were made per-core from the start of the design and session hash lookup is lock-free, almost no resources are shared between cores during multi-core forwarding, and there are no conflicts. In particular, the hardware RSS algorithm of Intel network cards assigns both directions of the same connection directly to the same CPU, so no software handover between cores is needed to ensure that each resource is handled by a single CPU, and no resource is contended for by multiple cores.

In summary, the above design ensures that performance grows linearly with the number of CPUs.

Third, efficient memory/cache utilization.

The design of the session hash table saves memory space. Session entries use a memory pool, which makes memory allocation and release efficient and reduces memory fragmentation. The internal structure of the session table makes full use of the cache and improves memory access performance.

Fourth, safety and robustness.

The core of the present invention is the bold use of a lock-free lookup algorithm for the hash session table, yet the algorithm as a whole is safe and robust, and long-term testing has confirmed its robustness in real scenarios. The main reason is that, to support lock-free lookup, the algorithm adheres to the following principles: creation and deletion must take a lock, and a connection can only be processed on the CPU that created its session; the session connection table is a per-core resource; the key value of a session table entry cannot be changed; entries can only be added (created) at the head of a bucket list in the session hash table; and pointer updates on addition to and deletion from the session hash table must follow fixed rules.

In summary, the whole design is safe and robust in principle, and it has also stood the test of time in actual testing and use. It is a throughput forwarding design with high robustness, high security, and high performance.

Figure 15 is a schematic structural diagram of a packet forwarding device based on a multi-core system according to another embodiment of the present invention. The device 150 includes: an extraction module 151, a lookup module 152, a creation module 153, a determination module 154, a first forwarding module 155, and a second forwarding module 156.

The extraction module 151 is configured to receive a packet and extract a five-tuple from the packet.

Optionally, the extraction module 151 being configured to receive a packet includes:

receiving a packet sent by a network card, wherein the network card uses the RSS algorithm to distribute packets to CPUs, and the secret key used by the RSS algorithm satisfies the following condition: when the key is divided into groups of two bytes, any two adjacent groups have the same value.

The lookup module 152 is configured to look up, without locking, whether a session corresponding to the five-tuple exists in the global session hash table.

Optionally, the global session hash table includes multiple hash table entries, and each hash table entry records a session address and a session direction.

The creation module 153 is configured to, when the session does not exist, create a new session on the CPU that received the packet and forward the packet according to the new session.

The determination module 154 is configured to, when the session exists, determine the CPU to which the session belongs.

The first forwarding module 155 is configured to, when the CPU to which the session belongs is the CPU that received the packet, forward the packet according to the session on the CPU that received the packet.

The second forwarding module 156 is configured to, when the CPU to which the session belongs is not the CPU that received the packet, forward the packet to the CPU to which the session belongs, and that CPU forwards the packet according to the session.

In some embodiments, referring to Figure 16, this device also includes:

A write module 157, configured to add a hash table entry to the global session hash table with locking, or to delete a hash table entry with locking.

Optionally, the write module being configured to add a hash table entry with locking includes:

obtaining the hash table entry to be added;

computing the corresponding bucket linked list from the five-tuple in the hash table entry;

adding, with locking, the hash table entry to be added at the head of the bucket linked list.

Further, the write module 157 being configured to add the hash table entry to be added at the head of the bucket linked list includes:

obtaining the original first node of the bucket linked list;

keeping the original pointers of the bucket linked list unchanged, and pointing the pointer of the hash table entry to be added at the original first node;

changing the head of the bucket linked list from pointing at the original first node to pointing at the hash table entry to be added.
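The head-insertion order described above can be sketched as follows. The names Entry and Bucket are hypothetical, and a threading.Lock stands in for the write lock. The point illustrated is the order of the two pointer updates: the new entry is linked to the old first node before the head is switched, so a reader walking the list without a lock always sees a complete chain.

```python
import threading


class Entry:
    def __init__(self, key, session):
        self.key = key
        self.session = session
        self.next = None


class Bucket:
    def __init__(self):
        self.head = None
        self.lock = threading.Lock()   # taken by writers only

    def add(self, entry):
        with self.lock:
            entry.next = self.head     # step 1: point at the original first node
            self.head = entry          # step 2: publish by switching the head

    def find(self, key):
        node = self.head               # readers take no lock
        while node is not None:
            if node.key == key:
                return node.session
            node = node.next
        return None
```

In C on real hardware, the store to the head would additionally have to be ordered after the store to the new entry's next pointer (a write barrier); this Python sketch does not model memory ordering.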

Optionally, the write module 157 being configured to delete a hash table entry with locking includes:

determining the hash table entry to be deleted;

deleting, with locking, the hash table entry to be deleted from the bucket linked list.

Further, the write module 157 being configured to delete the hash table entry to be deleted from the bucket linked list includes:

determining the hash table entry preceding the hash table entry to be deleted;

keeping the pointer of the hash table entry to be deleted unchanged, and pointing the pointer of the preceding hash table entry at the hash table entry that follows the hash table entry to be deleted;

releasing the pointer of the hash table entry to be deleted;

deleting the hash table entry to be deleted.
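The deletion rule can be sketched in the same style. The names are again hypothetical and the sketch is not the patented implementation. The detail illustrated is that the deleted entry keeps its own next pointer, so a lock-free reader that is standing on the entry at the moment of deletion can still reach the rest of the bucket list.

```python
import threading


class Entry:
    def __init__(self, key, session):
        self.key = key
        self.session = session
        self.next = None


class Bucket:
    def __init__(self):
        self.head = None
        self.lock = threading.Lock()   # taken by writers only

    def add(self, entry):
        with self.lock:
            entry.next = self.head
            self.head = entry

    def remove(self, key):
        """Unlink the entry with `key`; return it so its slot can be released."""
        with self.lock:
            prev, node = None, self.head
            while node is not None and node.key != key:
                prev, node = node, node.next
            if node is None:
                return None
            if prev is None:
                self.head = node.next  # deleting the first node
            else:
                prev.next = node.next  # bypass the deleted entry
            # node.next is deliberately left unchanged for concurrent readers.
            return node
```

Because session entries come from a preallocated pool, the unlinked entry's memory stays valid after it is released, which is what makes it safe for a reader to still dereference it.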

In some embodiments, referring to Figure 16, this device also includes:

A storage module 158, configured to request memory for sessions in advance and to store the session in the corresponding memory, wherein a FIFO queue and a cache are used to manage the allocation and release of memory.

Optionally, the storage module 158 being configured to store the session in the corresponding memory includes:

storing the session as a structure, wherein the structure includes a left-direction part, a right-direction part, and a common part, and the frequently accessed elements of each part are kept in cache.

For details of the above modules, refer to the description in the method embodiments, which is not repeated here.

It should be noted that, in the description of the present invention, the terms "first", "second", and the like are used for descriptive purposes only and are not to be understood as indicating or implying relative importance. In addition, in the description of the present invention, unless otherwise stated, "multiple" means at least two.

Any process or method described in a flowchart or otherwise described herein may be understood as representing a module, segment, or portion of code that includes one or more executable instructions for implementing a specific logical function or step of the process. The scope of the preferred embodiments of the present invention includes other implementations in which functions may be performed out of the order shown or discussed, including substantially simultaneously or in the reverse order depending on the functions involved, as should be understood by those skilled in the art to which the embodiments of the present invention belong.

It should be understood that each part of the present invention can be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, multiple steps or methods can be implemented in software or firmware that is stored in a memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, they can be implemented by any one or a combination of the following techniques known in the art: a discrete logic circuit having logic gates for implementing logic functions on data signals, an application-specific integrated circuit having suitable combinational logic gates, a programmable gate array (PGA), a field programmable gate array (FPGA), and so on.

Those of ordinary skill in the art will appreciate that all or part of the steps of the above method embodiments can be completed by a program instructing the relevant hardware. The program can be stored in a computer-readable storage medium and, when executed, performs one of the steps of the method embodiments or a combination thereof.

In addition, the functional units in the embodiments of the present invention may be integrated into one processing module, or each unit may exist physically on its own, or two or more units may be integrated into one module. The integrated module can be implemented in the form of hardware or in the form of a software functional module. If the integrated module is implemented in the form of a software functional module and is sold or used as an independent product, it may also be stored in a computer-readable storage medium.

The storage medium mentioned above may be a read-only memory, a magnetic disk, an optical disc, or the like.

In this specification, a description referring to the terms "one embodiment", "some embodiments", "example", "specific example", or "some examples" means that a specific feature, structure, material, or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the present invention. In this specification, such terms do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials, or characteristics described may be combined in a suitable manner in any one or more embodiments or examples.

Although embodiments of the present invention have been shown and described above, it should be understood that the above embodiments are exemplary and are not to be construed as limiting the present invention; those of ordinary skill in the art can change, modify, replace, and vary the above embodiments within the scope of the present invention.

Claims

1. A packet forwarding method based on a multi-core system, characterized by comprising:

receiving a packet, and extracting a five-tuple from the packet;

looking up, without locking, whether a session corresponding to the five-tuple exists in a global session hash table;

if it does not exist, creating a new session on the CPU that received the packet, and forwarding the packet according to the new session;

if it exists, determining the CPU to which the session belongs;

if the CPU to which the session belongs is the CPU that received the packet, forwarding the packet according to the session on the CPU that received the packet;

if the CPU to which the session belongs is not the CPU that received the packet, forwarding the packet to the CPU to which the session belongs, and that CPU forwarding the packet according to the session.

2. The method according to claim 1, characterized in that receiving the packet comprises:

3. The method according to claim 1, characterized by further comprising:

adding a hash table entry to the global session hash table with locking, or deleting a hash table entry with locking.

4. The method according to claim 3, characterized in that adding a hash table entry with locking comprises:

obtaining the hash table entry to be added;

5. The method according to claim 4, characterized in that adding the hash table entry to be added at the head of the bucket linked list comprises:

obtaining the original first node of the bucket linked list;

6. The method according to claim 3, characterized in that deleting a hash table entry with locking comprises:

determining the hash table entry to be deleted;

7. The method according to claim 6, characterized in that deleting the hash table entry to be deleted from the bucket linked list comprises:

determining the hash table entry preceding the hash table entry to be deleted;

releasing the pointer of the hash table entry to be deleted;

deleting the hash table entry to be deleted.

8. The method according to any one of claims 1 to 7, characterized in that the global session hash table includes multiple hash table entries, and each hash table entry records a session address and a session direction.

9. The method according to any one of claims 1 to 7, characterized by further comprising:

requesting memory for sessions in advance, and storing the session in the corresponding memory, wherein a FIFO queue and a cache are used to manage the allocation and release of memory.

10. The method according to claim 9, characterized in that storing the session in the corresponding memory comprises:

11. The method according to claim 1, characterized in that the session can only be accessed by the CPU that created the session, and the CPU that created the session accesses it without locking.

12. A packet forwarding device based on a multi-core system, characterized by comprising:

an extraction module, configured to receive a packet and extract a five-tuple from the packet;

a lookup module, configured to look up, without locking, whether a session corresponding to the five-tuple exists in a global session hash table;

a creation module, configured to, when the session does not exist, create a new session on the CPU that received the packet and forward the packet according to the new session;

a determination module, configured to, when the session exists, determine the CPU to which the session belongs;

a first forwarding module, configured to, when the CPU to which the session belongs is the CPU that received the packet, forward the packet according to the session on the CPU that received the packet;

a second forwarding module, configured to, when the CPU to which the session belongs is not the CPU that received the packet, forward the packet to the CPU to which the session belongs, and that CPU forwards the packet according to the session.