Disclosure of Invention
This application provides a method and a device for processing a congestion flow, which can prevent the queue depth of a queue in a network device from rapidly reaching the PFC threshold and avoid loss of packets corresponding to the congestion flow.
In a first aspect, an embodiment of the present application provides a method for processing a congestion flow. The method is applied to a first network device and includes: acquiring a first message, where the first message includes first indication information, the first indication information is used to indicate adjusting a sending rate of a first congestion flow sent by a third network device, the third network device is a source device of the first congestion flow, and the first congestion flow is a congestion flow identified by a second network device when the second network device is congested; and processing the first message.
In this embodiment of the application, the first indication information in the first message may be used to indicate adjusting the sending rate of the first congestion flow sent by the third network device. Through the first message, the source device of the first congestion flow (that is, the third network device) may adjust the sending rate of the first congestion flow, so that the sending rate matches the congestion degree of the network, and a priority queue in a network device in the network (including the first network device and/or the second network device, etc.) is prevented from quickly reaching the PFC threshold. The first network device acquires the first message and then processes it. For example, the first network device may process the first message according to its relationship with the third network device; that is, when this relationship differs (e.g., whether the first network device is connected to the third network device), the manner in which the first network device processes the first message differs. Although the processing manner may differ, the end result is that the third network device can adjust the sending rate of the first congestion flow.
In a possible implementation manner, the first packet further includes second indication information, where the second indication information is used to indicate that the first congestion flow sent by the third network device is isolated to a congestion queue in the third network device.
In this embodiment, the first message includes the second indication information, so that the third network device can isolate the first congestion flow, and it is avoided that when the first congestion flow and the non-congestion flow are in the same queue, the first congestion flow needs to be back-pressed, which may affect the sending of the non-congestion flow. The second indication information is used to indicate that the first congestion flow sent by the third network device is isolated to the congestion queue in the third network device, and may also be understood as: the second indication information is used to indicate that the first congestion flow in the third network device is isolated (switched to) a congestion queue in the third network device, or, it can be further understood that: the second indication information is used to indicate that the first congestion flow corresponding to the third network device is isolated (switched to) a congestion queue in the third network device.
In a possible implementation manner, the first indication information is used to indicate that adjusting the sending rate of the first congestion flow sent by the third network device includes: the first indication information is used for indicating to adjust the sending rate of the congestion queue in the third network device.
In a possible implementation manner, before the obtaining the first packet, the method further includes: identifying the first congested flow, isolating the first congested flow to a congestion queue in the first network device; the acquiring the first message includes: and generating the first message under the condition that the congestion queue in the first network equipment is congested.
In this embodiment, the first network device may be a network device where a congestion point in a network is located, or the first network device may be an upstream node of the network device where the congestion point is located. When the congestion queue in the first network device is congested, the first network device generates the first message so as to instruct the third network device to adjust the sending rate of the first congestion flow that it sends. Therefore, by judging whether the congestion queue is congested, the third network device can adjust the sending rate of the first congestion flow more accurately according to the congestion degree in the network. Meanwhile, judging whether the congestion queue is congested avoids adjusting the sending rate of the first congestion flow, and thereby affecting the first congestion flow, when the congestion queue is not actually congested.
In a possible implementation manner, after the processing the first packet, the method further includes: sending a third PFC message to the upstream node of the first network device when the queue depth of the congestion queue in the first network device is greater than or equal to the priority-based flow control (PFC) threshold, where the third PFC message is used to instruct the congestion queue in the upstream node of the first network device to stop sending the first congestion flow.
In this embodiment, by adjusting the sending rate of the first congestion flow, the third network device may prevent the queue depth of the congestion queue in the first network device from quickly reaching the PFC threshold, or even avoid triggering PFC as far as possible.
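For illustration only, the PFC threshold check on the congestion queue described above may be sketched as follows. The class name `CongestionQueue`, the byte unit of the threshold, and the list `sent_upstream` (which stands in for the link to the upstream node) are assumptions of this sketch and do not appear in the application.

```python
from dataclasses import dataclass, field


@dataclass
class CongestionQueue:
    pfc_threshold: int              # hypothetical unit: bytes
    depth: int = 0
    upstream_paused: bool = False
    # Stand-in for the link to the upstream node (assumption of this sketch).
    sent_upstream: list = field(default_factory=list)

    def enqueue(self, size: int) -> None:
        """Account for an arriving packet and apply the PFC threshold check."""
        self.depth += size
        # Queue depth >= PFC threshold: ask the upstream node to stop sending.
        if self.depth >= self.pfc_threshold and not self.upstream_paused:
            self.sent_upstream.append("third_pfc_message")
            self.upstream_paused = True
```

In this sketch the third PFC message is sent once, when the threshold is first reached, and is not repeated while the upstream node is already paused.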
In a possible implementation manner, the first network device and the second network device are the same network device.
In this embodiment of the application, when the first network device is a network device at a congestion point where congestion occurs in a network (that is, a network where the first network device is located), the first network device may further generate the first packet when the first congestion flow is identified and isolated to a congestion queue in the first network device. The first network device identifies and isolates the first congestion flow, so that the situations that other data flows cannot be cached in a queue where the first congestion flow is located, even packets corresponding to other data flows are lost and the like due to congestion of the queue where the first congestion flow is located can be avoided. Meanwhile, by identifying the first congested flow, the first network device may explicitly instruct the third network device to adjust the sending rate of the first congested flow.
In one possible implementation, before the identifying the first congested flow, the method further includes: receiving a third packet, where the third packet is used to indicate that the first congestion flow sent by the first network device is isolated to a congestion queue in the first network device, and the first network device and the second network device are different network devices; the identifying the first congested flow, the isolating the first congested flow to a congestion queue in the first network device comprising: and identifying the first congestion flow according to the third message, and isolating the first congestion flow to a congestion queue in the first network equipment.
In this embodiment, the first network device may also be an upstream node of the second network device, and in this case, after the first network device receives the third packet and recognizes the first congestion flow, it may further determine whether to generate the first packet according to a congestion degree of a congestion queue of the first network device.
In this embodiment, the first network device may generate the first packet when the congestion queue of the first network device is congested. Thereby, the efficiency with which the third network device adjusts the sending rate of the first congestion flow that it sends may be improved. The case in which the congestion queue in the first network device is congested includes: a queue depth of the congestion queue in the first network device is greater than or equal to a first depth threshold. In an embodiment of the present application, the identifying the first congested flow includes: identifying the first congested flow from a congestion queue in the first network device.
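As a minimal sketch, the generation of the first packet when the queue depth reaches the first depth threshold may look as follows. The field names of `FirstMessage` and the function name are hypothetical; the application does not define a concrete message layout.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FirstMessage:
    flow_id: str         # hypothetical identifier of the first congestion flow
    adjust_rate: bool    # stands for the first indication information
    isolate: bool        # stands for the optional second indication information


def maybe_generate_first_message(queue_depth: int, first_depth_threshold: int,
                                 flow_id: str, isolate: bool = False
                                 ) -> Optional[FirstMessage]:
    """Generate the first message only when the congestion queue is congested."""
    if queue_depth >= first_depth_threshold:
        return FirstMessage(flow_id=flow_id, adjust_rate=True, isolate=isolate)
    return None
```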
In one possible implementation, the identifying the first congested flow includes: identifying the first congested flow from a management queue in the first network device.
In this embodiment, the first congestion flow may be a congestion flow that the first network device identifies from its management queue; or, a congestion flow identified for the first network device from its congestion queue.
In one possible implementation, the identifying the first congested flow from the management queue in the first network device includes: identifying the first congested flow from a management queue in the first network device in the event the management queue in the first network device is congested.
In this embodiment of the present application, a case that a management queue in a first network device is congested includes: a queue depth of a management queue in the first network device is greater than or equal to a second depth threshold.
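The following sketch illustrates one possible way to identify the first congested flow from the management queue once the second depth threshold is reached. The selection rule used here (the flow occupying the most bytes in the queue) is an assumption of this sketch; the application does not specify how the congested flow is chosen. The function name and the `(flow_id, size)` packet representation are likewise hypothetical.

```python
from collections import Counter
from typing import Iterable, Optional, Tuple


def identify_congested_flow(management_queue: Iterable[Tuple[str, int]],
                            second_depth_threshold: int) -> Optional[str]:
    """Return the flow to isolate, or None if the management queue is not congested.

    Each queue entry is a (flow_id, packet_size_in_bytes) pair.
    """
    packets = list(management_queue)
    depth = sum(size for _, size in packets)
    if depth < second_depth_threshold:
        return None  # management queue is not congested
    bytes_per_flow = Counter()
    for flow_id, size in packets:
        bytes_per_flow[flow_id] += size
    # Assumed heuristic: the flow holding the most bytes is the congested flow.
    return bytes_per_flow.most_common(1)[0][0]
```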
In one possible implementation, the method further includes: generating a second packet, where the second packet is used to instruct an upstream node of the first network device to isolate the first congestion flow to a congestion queue in the upstream node of the first network device; and sending the second message to an upstream node of the first network equipment.
In this embodiment, when the first network device identifies the first congestion flow and isolates the first congestion flow, in order to avoid congestion of other network devices caused by the first congestion flow, the first network device may further send a second packet to an upstream node thereof.
In a possible implementation manner, the second packet is generated when a congestion queue in the first network device is congested.
In a possible implementation manner, before the obtaining the first packet, the method further includes: receiving a third packet, where the third packet is used to indicate that the first congestion flow sent by the first network device is isolated to a congestion queue in the first network device, and the first network device and the second network device are different network devices; and identifying the first congestion flow according to the third message, and isolating the first congestion flow to a congestion queue in the first network equipment.
In this embodiment of the application, when the first network device is a network device where a non-congestion point in a network is located, the first network device may further identify the first congestion flow according to a third packet sent by another network device, and then isolate the first congestion flow to a congestion queue in the first network device.
In a possible implementation manner, the obtaining the first packet includes: receiving the first packet from a downstream node of the first network device.
In this embodiment, the first network device may generate the first packet and may also receive the first packet sent by other network devices. Optionally, the first network device may receive the first packet after receiving the third packet, identifying the first congestion flow, and isolating the first congestion flow to a congestion queue in the first network device; alternatively, the first network device may also directly receive the first packet, and then instruct the third network device to adjust the sending rate of the first congestion flow sent by the third network device.
In a possible implementation manner, the processing the first packet includes: generating a first PFC message according to the first message, wherein the first PFC message comprises duration information of a stop of a queue where the first congestion flow is sent by the third network device; and sending the first PFC message to the third network equipment.
In this embodiment of the application, when the first network device acquires the first packet from its downstream node, the first network device may further send a first PFC packet to the third network device, so that the third network device adjusts the sending rate of the first congestion flow that it sends according to the first PFC packet. Optionally, the first PFC message may also be understood as including duration information indicating how long the queue where the first congestion flow sent by the third network device is located is back-pressed. Because the first PFC packet itself indicates, to the third network device, the adjustment of the sending rate of the first congestion flow, this manner is simple, reliable, and easy to implement. Even if the third network device does not support congestion isolation, the sending rate of the first congestion flow sent by the third network device can still be adjusted through the first PFC message.
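As an illustrative sketch, assuming that the duration information is carried in the same way as the pause time of a standard PFC frame (a 16-bit count of pause quanta, one quantum being 512 bit times), the duration may be converted as follows. The function name and the choice of nanoseconds and Gbit/s as units are assumptions of this sketch.

```python
PAUSE_QUANTUM_BITS = 512   # one pause quantum equals 512 bit times
MAX_QUANTA = 0xFFFF        # the pause time field is 16 bits wide


def duration_to_pause_quanta(duration_ns: int, link_speed_gbps: int) -> int:
    """Convert a pause duration into pause quanta, rounded up and clamped."""
    # A link running at 1 Gbit/s carries exactly 1 bit per nanosecond.
    bit_times = duration_ns * link_speed_gbps
    quanta = -(-bit_times // PAUSE_QUANTUM_BITS)  # integer ceiling division
    return min(quanta, MAX_QUANTA)
```

For example, under these assumptions a pause of 1000 ns on a 10 Gbit/s link corresponds to 10000 bit times, which rounds up to 20 quanta.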
In a possible implementation manner, the queue in which the first congestion flow is located includes a congestion queue in which the first congestion flow is located and/or an original queue in which the first congestion flow is located.
In a possible implementation manner, the processing the first packet includes: sending the first message to the third network equipment; or generating a fourth message according to the first message, where the fourth message includes fourth indication information, and the fourth indication information is used to indicate that the first congestion flow sent by the third network device is isolated to a congestion queue in the third network device; and sending the fourth message to the third network equipment.
In the embodiment of the present application, for the specific description of the fourth indication information, reference may also be made to the above-mentioned second indication information.
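The processing manners described above may be sketched as follows. The selection rule (sending a fourth message when the third network device supports congestion isolation, and a first PFC message otherwise) is an assumption of this sketch; the application only states that the processing manner may differ. The dictionary keys, including `priority`, are hypothetical.

```python
def process_first_message(first_message: dict,
                          source_supports_isolation: bool,
                          pause_duration_ns: int = 1000) -> dict:
    """Return the message that the first network device sends to the third network device."""
    if source_supports_isolation:
        # Fourth message: carries the fourth indication information, i.e. isolate
        # the first congestion flow to a congestion queue in the source device.
        return {
            "type": "fourth_message",
            "flow_id": first_message["flow_id"],
            "isolate_to_congestion_queue": True,
        }
    # First PFC message: back-pressures the queue where the flow is located
    # for the given duration, without requiring congestion isolation support.
    return {
        "type": "first_pfc_message",
        "priority": first_message["priority"],
        "pause_duration_ns": pause_duration_ns,
    }
```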
In a possible implementation manner, the first indication information is used to indicate that the sending rate of the first congestion flow sent by the third network device is reduced; or, the first indication information is used to indicate to stop the third network device from sending the first congestion flow.
In this embodiment of the application, the first indication information is used to indicate to stop the third network device from sending the first congestion flow, and may be understood as: the first indication information is used for indicating the third network equipment to stop sending the first congestion flow; alternatively, the first indication information is used to indicate that the first congestion flow transmitted by the third network device is stopped from being transmitted, and the like.
In one possible implementation, the method further includes: acquiring a fifth message, where the fifth message includes fifth indication information, and the fifth indication information is used to indicate that the sending rate of the first congestion flow sent by the third network device is increased; or, the fifth indication information is used to indicate to resume the third network device to send the first congestion flow; and processing the fifth message.
In this embodiment of the application, the fifth indication information is used to indicate to resume the third network device to send the first congestion flow, and may be understood as: the fifth indication information is used for indicating the third network equipment to resume sending the first congestion flow; alternatively, the fifth indication information is used to indicate that the sending of the first congestion flow by the third network device is resumed, and the like.
In a possible implementation manner, the obtaining the fifth packet includes: generating the fifth packet if the first congested flow is identified as a non-congested flow; or, generating the fifth packet when the congestion queue in the first network device is not congested.
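A minimal sketch of generating the fifth packet when the congestion queue is no longer congested is given below. It assumes that "not congested" means the queue depth has fallen below the same depth threshold that marks congestion, and that a message is generated only when the state changes; both points, and the class name `CongestionSignaller`, are assumptions of this sketch.

```python
from typing import Optional


class CongestionSignaller:
    """Tracks whether the congestion queue is congested and emits messages on changes."""

    def __init__(self, depth_threshold: int) -> None:
        self.depth_threshold = depth_threshold
        self.congested = False

    def on_depth_sample(self, depth: int) -> Optional[str]:
        now_congested = depth >= self.depth_threshold
        if now_congested and not self.congested:
            self.congested = True
            return "first_message"   # ask the source to reduce or stop
        if not now_congested and self.congested:
            self.congested = False
            return "fifth_message"   # ask the source to increase or resume
        return None                  # no state change, nothing to send
```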
In a possible implementation manner, the obtaining the fifth packet includes: receiving the fifth packet from a downstream node of the first network device.
In a possible implementation manner, the processing the fifth packet includes: generating a second PFC message according to the fifth message, where the second PFC message is used to indicate that the queue where the first congestion flow sent by the third network device is located is recovered; and sending the second PFC message to the third network equipment.
In a possible implementation manner, the processing the fifth packet includes: sending the fifth message to the third network device; or generating a sixth message according to the fifth message, where the sixth message includes eighth indication information, and the eighth indication information is used to indicate that the first congestion flow sent by the third network device is switched to a non-congestion queue in the third network device; and sending the sixth message to the third network equipment.
In a second aspect, an embodiment of the present application further provides a method for processing a congestion flow, where the method is applied to a third network device, and the method includes: and acquiring a message, and adjusting the sending rate of the first congestion flow sent by the third network equipment according to the message.
In this embodiment, the message may include at least one of the first PFC message, the first message, and the fourth message.
In a possible implementation manner, a first PFC message is obtained, where the first PFC message includes duration information that a queue where the first congestion flow sent by the third network device is located is stopped; and adjusting the sending rate of the first congestion flow sent by the third network equipment according to the first PFC message.
In a possible implementation manner, the queue in which the first congestion flow is located includes a congestion queue in which the first congestion flow is located and/or an original queue in which the first congestion flow is located.
In a possible implementation manner, before adjusting the sending rate of the first congestion flow sent by the third network device according to the first PFC message, the method further includes: acquiring a fourth message, where the fourth message includes fourth indication information, and the fourth indication information is used to indicate that the first congestion flow sent by the third network device is isolated to a congestion queue in the third network device; and isolating the first congestion flow sent by the third network device to a congestion queue in the third network device according to the fourth message (or fourth indication information in the fourth message).
In a possible implementation manner, the fourth packet further includes third indication information, where the third indication information is used to indicate and adjust a sending rate of a first congestion flow sent by the third network device, the third network device is a source device of the first congestion flow, and the first congestion flow is a congestion flow identified by the second network device when the second network device is congested.
In a possible implementation manner, the adjusting, according to the first PFC packet, a sending rate of a first congestion flow sent by the third network device includes: adjusting the sending rate of the first congestion flow sent by the third network device according to the first PFC message and the fourth message (or third indication information in the fourth message).
In a possible implementation manner, before adjusting the sending rate of the first congestion flow sent by the third network device according to the first PFC message, the method further includes: the method includes the steps of obtaining a first message, and adjusting the sending rate of a first congestion flow sent by a third network device according to the first message and a first PFC message, wherein the first message includes first indication information, the first indication information is used for indicating and adjusting the sending rate of the first congestion flow sent by the third network device, the third network device is a source device of the first congestion flow, and the first congestion flow is a congestion flow identified by a second network device under the condition that the second network device is congested.
In a possible implementation manner, the first packet further includes second indication information, where the second indication information is used to indicate that the first congestion flow sent by the third network device is isolated to a congestion queue in the third network device.
In a possible implementation manner, the adjusting, according to the first PFC packet, a sending rate of a first congestion flow sent by the third network device includes: isolating the first congestion flow sent by the third network device to a congestion queue in the third network device according to the first message, and adjusting the sending rate of the first congestion flow sent by the third network device according to the first message and the first PFC message.
In a possible implementation manner, the first indication information is used to indicate that the sending rate of the first congestion flow sent by the third network device is reduced; or, the first indication information is used to indicate to stop the third network device from sending the first congestion flow.
In this embodiment, the third network device may further adjust the sending rate of the first congestion flow sent by the third network device only through the first packet or the fourth packet.
In one possible implementation, the method further includes: receiving a second PFC message, where the second PFC message is used to indicate that a queue where the first congestion flow sent by the third network device is located is recovered; and recovering the transmission of the first congestion flow according to the second PFC message.
In one possible implementation, the method further includes: receiving a fifth message, where the fifth message includes fifth indication information, and the fifth indication information is used to indicate that a sending rate of a first congestion flow sent by the third network device is increased; or, the fifth indication information is used to indicate to resume the third network device to send the first congestion flow; and increasing the sending rate of the first congestion flow sent by the third network equipment according to the fifth message; or, resuming the sending of the first congestion flow by the third network device according to the fifth packet.
In this embodiment of the application, the fifth packet may further include sixth indication information.
In a possible implementation manner, a sixth packet is received, where the sixth packet includes eighth indication information, and the eighth indication information is used to indicate that the first congestion flow sent by the third network device is switched to a non-congestion queue in the third network device; and the first congestion flow is switched to the non-congestion queue in the third network device according to the sixth message.
In this embodiment, the third network device may switch the first congested flow to the non-congested queue according to the sixth packet, and then increase the sending rate of the first congested flow or resume the sending of the first congested flow. Or, the sixth packet may further include seventh indication information, so that the third network device increases the sending rate of the first congestion flow or resumes sending the first congestion flow according to the sixth packet.
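For illustration, the handling of the indication information at the third network device (the source device) may be sketched as follows. The indication names in the set, the halving and doubling of the rate, and the class name `SourceFlowState` are assumptions of this sketch; the application does not specify by how much the sending rate is reduced or increased.

```python
from dataclasses import dataclass


@dataclass
class SourceFlowState:
    rate_mbps: float
    max_rate_mbps: float
    queue: str = "non_congestion"
    paused: bool = False


def handle_indications(state: SourceFlowState, indications: set) -> SourceFlowState:
    """Apply the indication information carried by a received message."""
    if "isolate" in indications:        # second / fourth indication information
        state.queue = "congestion"
    if "stop" in indications:           # first indication information (stop variant)
        state.paused = True
    if "reduce" in indications:         # first indication information (reduce variant)
        state.rate_mbps = state.rate_mbps / 2          # assumed reduction factor
    if "switch_back" in indications:    # eighth indication information
        state.queue = "non_congestion"
    if "resume" in indications:         # fifth indication information (resume variant)
        state.paused = False
    if "increase" in indications:       # fifth indication information (increase variant)
        state.rate_mbps = min(state.max_rate_mbps, state.rate_mbps * 2)  # assumed factor
    return state
```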
The beneficial effects of the second aspect can be seen in the beneficial effects of the first aspect, which are not described herein in detail.
In a third aspect, the present application provides a first network device, configured to perform the method of the first aspect or any possible implementation manner of the first aspect. The first network device comprises corresponding means for performing the method of the first aspect or any possible implementation of the first aspect.
For example, the first network device may include a transceiving unit and a processing unit.
In a fourth aspect, the present application provides a third network device configured to perform the method of the second aspect or any possible implementation manner of the second aspect. The third network device comprises corresponding means for performing the method of the second aspect or any possible implementation of the second aspect.
For example, the third network device may comprise a transceiving unit and a processing unit.
In a fifth aspect, the present application provides a first network device, which includes a processor, and the processor may be configured to execute the method shown in the first aspect or any possible implementation manner of the first aspect.
In the embodiment of the present application, in the process of executing the method, a process of sending a message or receiving a message (hereinafter, referred to as information) in the method may be understood as a process of outputting information by a processor and a process of receiving input information by the processor. When outputting information, the processor outputs the information to the transceiver for transmission by the transceiver. This information, after being output by the processor, may also need to be further processed before reaching the transceiver. Similarly, when the processor receives incoming information, the transceiver receives the information and inputs it to the processor. Further, after the transceiver receives the information, the information may need to be further processed before being input to the processor.
Based on the above principle, for example, sending the first packet may be understood as the processor outputting the first packet. As another example, receiving the first packet may be understood as the processor receiving the input first packet, and so on.
The operations involving a processor, such as transmitting and/or receiving, may be understood generally as operations involving processor outputs and/or inputs, unless specifically indicated otherwise, or if not contradicted by actual role or inherent logic in the associated description.
In implementation, the processor may be a processor dedicated to performing the methods, or may be a processor executing computer instructions in a memory to perform the methods, such as a general-purpose processor. For example, the processor may also be adapted to execute a program stored in the memory, which when executed, causes the first network device to perform the method as illustrated in the first aspect above or any possible implementation of the first aspect.
In one possible implementation, the memory is located outside the first network device.
In one possible implementation, the memory is located within the first network device.
In the embodiments of the present application, the processor and the memory may also be integrated into one device, that is, the processor and the memory may also be integrated together.
In a possible implementation manner, the first network device further includes a transceiver, and the transceiver is configured to receive a message or send a message, and the like.
In a sixth aspect, the present application provides a third network device comprising a processor configured to perform a method as set forth in the second aspect or any possible implementation manner of the second aspect. In the embodiments of the present application, specific descriptions about the processor may refer to the description of the fifth aspect, and are not described in detail here.
In one possible implementation, the memory is located outside the third network device.
In one possible implementation, the memory is located within the third network device.
In a possible implementation manner, the third network device further includes a transceiver, and the transceiver is configured to receive a message or send a message, and the like.
In a seventh aspect, the present application provides a first network device comprising a logic circuit and an interface, the logic circuit being coupled to the interface, the logic circuit being configured to perform the method according to the first aspect or any possible implementation manner of the first aspect.
Illustratively, the logic circuit is configured to obtain a first packet; alternatively, the logic circuit may obtain an input first packet through the interface. The logic circuit may further be configured to process the first packet, or may output the first packet through the interface.
In an eighth aspect, the present application provides a third network device comprising a logic circuit and an interface, the logic circuit being coupled to the interface, the logic circuit being operable to perform the method as set forth in the second aspect or any possible implementation manner of the second aspect.
In a ninth aspect, the present application provides a computer readable storage medium for storing a computer program which, when run on a computer, causes the method illustrated in the first aspect or any possible implementation of the first aspect described above to be performed.
In a tenth aspect, the present application provides a computer-readable storage medium for storing a computer program which, when run on a computer, causes the method shown in the second aspect or any possible implementation of the second aspect described above to be performed.
In an eleventh aspect, the present application provides a computer program product comprising a computer program or computer code which, when run on a computer, causes the method illustrated in the first aspect or any possible implementation of the first aspect described above to be performed.
In a twelfth aspect, the present application provides a computer program product comprising a computer program or computer code which, when run on a computer, causes the method shown in the second aspect or any possible implementation of the second aspect described above to be performed.
In a thirteenth aspect, the present application provides a computer program which, when run on a computer, performs the method shown in the first aspect or any possible implementation of the first aspect described above.
In a fourteenth aspect, the present application provides a computer program which, when run on a computer, performs the method of the second aspect described above or any possible implementation of the second aspect.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more clear, the present application will be further described with reference to the accompanying drawings.
The terms "first" and "second," and the like in the description, claims, and drawings of the present application are used solely to distinguish between different objects and not to describe a particular order. Furthermore, the terms "comprising" and "having," as well as any variations thereof, are intended to cover non-exclusive inclusions. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those steps or elements listed, but may optionally include other steps or elements not listed, or other steps or elements inherent to such process, method, article, or apparatus.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the application. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. Those skilled in the art will explicitly and implicitly appreciate that the embodiments described herein may be combined with other embodiments.
In this application, "at least one" means one or more, "a plurality" means two or more, and "at least two" means two or more. The term "and/or" describes an association relationship between associated objects and indicates that three relationships may exist. For example, "A and/or B" may mean: only A is present, only B is present, or both A and B are present, where A and B may be singular or plural. The character "/" generally indicates that the associated objects before and after it are in an "or" relationship. "At least one of the following items" or similar expressions refer to any combination of these items. For example, at least one of a, b, or c may represent: a, b, c, "a and b", "a and c", "b and c", or "a and b and c".
The terms referred to in the present application are described in detail below.
1. PFC
With the PFC technique, a network device, a server, or the like can create 8 virtual channels on one Ethernet link and assign a priority to each virtual channel, so that any one of the virtual channels can be paused and resumed individually. Here, an Ethernet link may be understood as the Ethernet link between the corresponding ports of two devices. Illustratively, fig. 1 shows that each of device A and device B includes 8 priority queues; the Ethernet link between device A and device B may be understood as the link on which device A is connected through port A and device B is connected through port B. If device A can also communicate with device B through port C, another Ethernet link may exist between port C and a port of device B, such as port D. The sending queues of this other Ethernet link are the 8 priority queues corresponding to port C of device A, and its receiving queues are the 8 priority queues corresponding to port D of device B.
The description of the relationship between the priority queue and the port is also applicable to the congestion queue and the management queue shown in the present application, and the description thereof is omitted here.
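The per-port, per-priority pause and resume behaviour described above can be illustrated with a minimal sketch. The class name `Port` and its methods are hypothetical names chosen for illustration only; they are not part of the present application or of any standard API.

```python
from dataclasses import dataclass, field

NUM_PRIORITIES = 8  # PFC provides 8 virtual channels (priorities) per Ethernet link


@dataclass
class Port:
    """Hypothetical model of one port whose 8 priority queues can be paused individually."""
    name: str
    paused: list = field(default_factory=lambda: [False] * NUM_PRIORITIES)

    def pause(self, priority: int) -> None:
        # Pause only the given priority; the other 7 are unaffected.
        self.paused[priority] = True

    def resume(self, priority: int) -> None:
        self.paused[priority] = False

    def can_send(self, priority: int) -> bool:
        return not self.paused[priority]


# Device A's port A: pausing priority 3 leaves priority 4 free to send.
port_a = Port("A")
port_a.pause(3)
```

Because the pause state is kept per port, a second port such as port C would hold its own independent set of 8 flags.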
2. Classification of data streams
In one possible implementation, data flows may be divided into large flows and small flows. A data flow is first treated as a mouse flow (i.e., a small flow) and enters a mouse flow queue; then, when the number of bytes transmitted in the data flow exceeds a certain threshold, the data flow is identified as an elephant flow (i.e., a large flow) and enters an elephant flow queue.
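As a minimal sketch of this byte-count classification, the following code counts bytes per flow and reclassifies a flow once the total exceeds a threshold. The class name `FlowClassifier` and the default threshold value are assumptions made for illustration; the application does not specify a threshold.

```python
ELEPHANT_BYTES = 1_000_000  # assumed threshold, not specified by the application


class FlowClassifier:
    """Hypothetical classifier: every flow starts as a mouse flow."""

    def __init__(self, threshold: int = ELEPHANT_BYTES):
        self.threshold = threshold
        self.bytes_seen = {}  # flow id -> total bytes transmitted so far

    def observe(self, flow_id: str, nbytes: int) -> str:
        # Accumulate the byte count and report the current classification.
        total = self.bytes_seen.get(flow_id, 0) + nbytes
        self.bytes_seen[flow_id] = total
        return "elephant" if total > self.threshold else "mouse"
```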
In another possible implementation, data flows may also be divided into congested flows and uncongested flows. A data flow is first treated as an uncongested flow and enters an uncongested flow queue; when network congestion occurs, once a data flow is identified as a congested flow, it enters a congestion queue to realize congestion isolation.
Methods for identifying a congested flow include the following:
A. The network device may identify whether a data flow is a congested flow according to the packet rate corresponding to the data flow and/or the packet length corresponding to the data flow. Illustratively, a data flow may be identified as a congested flow if the packet rate corresponding to the data flow is greater than a rate threshold and/or the packet length corresponding to the data flow is greater than a length threshold. The specific values of the rate threshold and the length threshold are not limited in the present application. It will be appreciated that the method of identifying congested flows shown here is merely an example.
B. The network device may identify whether a data flow is a congested flow according to the number of explicit congestion notification (ECN) messages corresponding to the data flow. For example, if the number of ECN messages corresponding to a data flow is greater than a preset number, the data flow may be identified as a congested flow. For example, if the number of ECN-marked messages of a data flow passing through a network device within a period of time is greater than the preset number, the data flow may be a congested flow.
Optionally, in some implementations, the network device may also identify an elephant flow as a congested flow.
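The two identification methods above can be sketched as simple predicates. The `FlowStats` record and the function names are hypothetical, and the "or" combination shown for method A is only one of the "and/or" variants the text allows.

```python
from dataclasses import dataclass


@dataclass
class FlowStats:
    """Hypothetical per-flow statistics collected over some period of time."""
    packet_rate: float   # packets per second
    packet_length: int   # bytes
    ecn_marked: int      # number of ECN-marked messages seen


def congested_by_rate_or_length(stats: FlowStats, rate_threshold: float,
                                length_threshold: int) -> bool:
    # Method A ("or" variant): either measurement exceeding its threshold is enough.
    return (stats.packet_rate > rate_threshold
            or stats.packet_length > length_threshold)


def congested_by_ecn(stats: FlowStats, preset_number: int) -> bool:
    # Method B: more ECN-marked messages than the preset number.
    return stats.ecn_marked > preset_number
```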
An important step in the CI procedure is the identification of congested flows by ECN-enabled Active Queue Management (AQM) schemes specified in IETF RFC 3168. The Congestion Isolation Protocol (CIP) congestion detection function (98.3.1) of the congestion isolation aware forwarding process (98.3) is responsible for implementing AQM. A Congestion Point (CP) algorithm (30.2.1) for detecting a Congestion Control Flow (CCF) of a congestion aware bridge is also defined in some implementations. The method can be used to detect congested flows in CI aware systems. Many other possible methods are also discussed in IETF RFC 7567, such as those that include support for end-to-end ECN congestion control. That is, the ECN congestion control method may be used separately from the CI method, but the ECN congestion control method may also be combined with the CI method.
A congested flow may also be understood as: in a higher layer protocol for end-to-end congestion control, a set of frame sequences is considered to belong to a single flow that experiences congestion in a congestion isolation aware system. I.e. the CI method can also be applied in end-to-end congestion control.
It is understood that the number of congested flows in the network is not limited in this application. Alternatively, for the same network device, the congested flows in the network may be segregated into one congestion queue or at least two congestion queues in the network device, and so on.
In this application, a data flow may be referred to as an uncongested flow if it is not identified as a congested flow.
3. Congestion queue and management queue
The queue for buffering congested flows may be referred to as a congestion queue, the queue for buffering uncongested flows may be referred to as a management queue, or may also be referred to as an uncongested queue, and so on. In general, the priority of a congestion queue may be less than the priority of a management queue.
In this application, the priority of the congestion queue may be determined according to a type-length-value (TLV) of a congestion isolation indicator. Alternatively, the priority of the congestion queue may be set by the network device, and the application does not limit how to set the priority of the congestion queue. The priority information of the congestion queue may be obtained by each network device according to the TLV, or the priority information of the congestion queue may be stored in a congestion flow table, or the priority information of the congestion queue may also be included in a packet, such as the first packet or the fourth packet.
4. Isolation
Isolation in this application may also be understood as adjustment, switching, or the like. Optionally, a network device isolating a congestion flow to a congestion queue in the network device may also be understood as: the network device lowering the priority queue of the congested flow. If the priority queue of the data flow is a management queue, after being isolated, the data flow can be buffered to a congestion queue, that is, the data flow is sent through the congestion queue. If the congestion flow needs to be isolated to the congestion queue in the network device, after receiving the congestion flow, the network device may schedule the congestion flow to the congestion queue according to the congestion flow table, and send the congestion flow through the congestion queue.
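A minimal sketch of isolation, assuming the congestion flow table can be modelled as a set of flow identifiers: once a flow is recorded in the table, its subsequent packets are scheduled to the congestion queue instead of the management queue. The class and constant names are hypothetical.

```python
MANAGEMENT_QUEUE = "management"
CONGESTION_QUEUE = "congestion"


class Device:
    """Hypothetical network device with one management queue and one congestion queue."""

    def __init__(self):
        self.congestion_flow_table = set()  # flows identified as congested
        self.queues = {MANAGEMENT_QUEUE: [], CONGESTION_QUEUE: []}

    def isolate(self, flow_id: str) -> None:
        # Record the flow so that later packets are switched to the congestion queue.
        self.congestion_flow_table.add(flow_id)

    def enqueue(self, flow_id: str, packet: bytes) -> str:
        # Look up the congestion flow table to pick the queue for this packet.
        queue = (CONGESTION_QUEUE if flow_id in self.congestion_flow_table
                 else MANAGEMENT_QUEUE)
        self.queues[queue].append((flow_id, packet))
        return queue
```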
5. Congestion Isolation Message (CIM) messages
Table 1 shows an encapsulation format of the CIM packet, that is, the CIM packet may include an EtherType (e.g., PDU EtherType), a network layer header (e.g., IPv4 header), a transport layer header (e.g., UDP header), and a Protocol Data Unit (PDU). It is understood that the network layer header shown in table 1 is illustrated using IPv4, and the application is equally applicable to IPv6.
TABLE 1
Name | Byte (Octet) | Length (bytes)
PDU EtherType | 1 | 2
Network layer header (IPv4 header) | 3 | 20
Transport layer header (UDP header) | 23 | 8
CIM PDU | 31 | 65-529
The network layer header may include a source Internet Protocol (IP) address (source address) of the CIM packet and a destination IP address (destination address) of the CIM packet. The transport layer header may include source port information (source port) of the CIM packet and destination port information (destination port) of the CIM packet. The CIM PDU may include a destination Media Access Control (MAC) address of the CIM packet and a source MAC address of the CIM packet. The source IP address, the destination IP address, the source MAC address, the destination MAC address, the source port information, and the destination port information may be included in a congestion isolation peer table (CI peer table), and the CI peer table may be obtained according to information in a Link Layer Discovery Protocol (LLDP) and a CI type-length-value (TLV). The destination IP address of the congested flow, the source IP address of the congested flow, the IP protocol type (included in the network layer header), the source port information of the congested flow, and the destination port information of the congested flow may be referred to as quintuple information of the congested flow.
It is understood that the byte values shown in table 1 indicate the starting byte of each field, namely the 1st, 3rd, 23rd, and 31st bytes of the CIM message. Illustratively, the PDU EtherType is 2 bytes long and starts at byte 1, so the network layer header starts at byte 3; the network layer header is 20 bytes long, so the transport layer header starts at byte 23.
The network device generates a CIM message according to the MSDU information of the original message in the congestion flow table and the address information of the node adjacent to the network device in the CI peer table. Alternatively, the network device may also generate a CIM packet according to the source MAC address information of the congestion flow in the congestion flow table (i.e., MAC address information of a third network device shown below) and MSDU information of the original packet. Or, the network device may also generate another CIM packet according to the received CIM packet, and the like. In the present application, the content of the CIM PDU in the CIM packet may be as shown in table 2.
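The starting bytes in table 1 follow directly from the field lengths. The short sketch below recomputes them with a running sum; the function name `start_bytes` and the 1-based numbering convention in the code are illustrative assumptions that mirror the table.

```python
from itertools import accumulate

# (field name, length in bytes) for the fixed-length fields that precede the CIM PDU
FIELDS = [
    ("PDU EtherType", 2),
    ("Network layer header (IPv4)", 20),
    ("Transport layer header (UDP)", 8),
]


def start_bytes(fields, first_byte=1):
    """Return the 1-based starting byte of each field and of the CIM PDU that follows."""
    lengths = [length for _, length in fields]
    # Running sum: each field starts where the previous one ends.
    starts = accumulate([first_byte] + lengths)
    names = [name for name, _ in fields] + ["CIM PDU"]
    return dict(zip(names, starts))


offsets = start_bytes(FIELDS)
```

With the lengths above, the computed starting bytes are 1, 3, 23, and 31, matching the Byte column of table 1.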
TABLE 2
In the present application, the encapsulated MSDU field in the CIM message may include information related to the congested flow. For example, the encapsulated MSDU field may include a priority queue (including a congestion queue) of the first congested flow in the third network device, address information of at least one first network device or at least one second network device through which the first congested flow passes when routed from the third network device (i.e., the source device) to another third network device (e.g., the destination device), a priority queue (including a congestion queue, a management queue, etc.) of the first congested flow in each first network device, or a priority queue of the first congested flow in each second network device. Alternatively, the encapsulated MSDU field includes quintuple information related to the congested flow.
The common features of the CIM messages are emphasized here, and other differences of each CIM message will be described below with reference to specific embodiments, which will not be described in detail here.
6. Congestion flow table
Generally, when a network is congested, a network device needs to identify a congested flow and create a congested flow table. As shown in table 3, table 3 exemplarily shows the contents of the congestion flow table. It is understood that table 3 is only an example, and in a specific implementation, when a network is congested, a network device may further obtain information related to a congested flow, such as a destination MAC address of the congested flow, a source MAC address of the congested flow (i.e., a MAC address of a source device of the congested flow), a VLAN of the congested flow, and the like, and store the information related to the congested flow. In other words, the contents shown in table 3 may be stored in other forms in the network device, and the present application is not limited thereto.
TABLE 3
For convenience of description, the following will describe a method for processing a congestion flow provided in the present application by taking a congestion flow table as an example.
7. Upstream node of network equipment or downstream node of network equipment
In this application, when a data flow is sent from a source node (a source MAC address of a congested flow in a congested flow table is a MAC address of the source node) to a destination node (a destination MAC address of a congested flow in a congested flow table is a MAC address of the destination node), the data flow may be forwarded by a plurality of network devices. Illustratively, a data stream needs to be sent from the network device 1 to the network device 2, and the data stream passes through the first network device and the second network device in sequence in the transmission process of the data stream. The second network device may be understood as a downstream node of the first network device and the first network device may be understood as an upstream node of the second network device.
In the embodiments shown below, for example, a data flow is sent from a third network device and needs to be sent to another third network device, where the first network device and the second network device may be included between the third network device and the other third network device. For example, a data flow needs to be sent from server 1 to server 2, and passes through the source switch, the intermediate switch, and the destination switch in sequence. The source switch may be understood as an upstream node of the intermediate switch and of the destination switch and, correspondingly, the destination switch may be understood as a downstream node of the intermediate switch and of the source switch. Likewise, the intermediate switch is a downstream node of the source switch and an upstream node of the destination switch.
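The upstream and downstream relationships can be sketched by modelling the forwarding path as an ordered list. The helper names and the list representation are hypothetical and serve only to illustrate the definitions above.

```python
def upstream_nodes(path, node):
    """All nodes the flow traverses before reaching `node`."""
    return path[:path.index(node)]


def downstream_nodes(path, node):
    """All nodes the flow traverses after leaving `node`."""
    return path[path.index(node) + 1:]


# Switches traversed by a flow from server 1 to server 2, in forwarding order.
path = ["source switch", "intermediate switch", "destination switch"]
```

For the intermediate switch, `upstream_nodes` yields the source switch and `downstream_nodes` yields the destination switch, in line with the example in the text.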
Generally, when the queue depth of a receive queue in a network device is greater than or equal to a priority-based flow control (PFC) threshold, such as an Xoff threshold, the network device sends a PFC message (which may also be referred to as a PFC frame or the like) to its upstream node, where the PFC message includes information about the duration for which the send queue in the upstream node that corresponds to the receive queue is to stop sending. In this case, the send queue in the upstream node stops sending packets to the receive queue in the network device but may still receive packets, so the send queue in the upstream node soon reaches the Xoff threshold as well. By analogy, congestion is propagated in a backpressure manner (also referred to as congestion root), forming a "congestion tree" that may eventually lead to traffic disruption. In addition, the data flows forming the congestion tree also cause head-of-line blocking, so that other data flows cannot be buffered in the blocked queue, and the like.
Therefore, by means of a congestion isolation (CI) technology, a data flow that may cause congestion is isolated into a queue with a lower priority (i.e., a congestion queue), which can effectively alleviate the head-of-line blocking caused by PFC in a network. In the isolation process, when the congestion queue is congested, the network device corresponding to the congestion queue sends a PFC message to its upstream node, so that the congestion queue in the upstream node stops sending messages to the congestion queue in the network device. However, the packets sent to the congestion queue in the network device may still be buffered in the non-congestion queue in the upstream node, and in this case, since the congestion queue in the network device is already congested, problems such as packet loss or packet overflow are likely to occur.
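The backpressure propagation described above can be illustrated with a toy simulation. It assumes a linear chain of hops in which the last hop cannot drain at all, every hop forwards its whole buffer once per tick unless the next hop has paused it, and the source injects a fixed amount per tick. The function name, the tick model, and all parameters are illustrative assumptions, not the behaviour of any real device.

```python
def simulate_backpressure(num_hops: int, xoff: int, rate: int, ticks: int):
    """Return, per hop, whether its queue depth has reached the Xoff threshold.

    Hop num_hops-1 is the congestion point and never drains.
    paused[i] == True means hop i has sent a PFC message to its upstream node.
    """
    depth = [0] * num_hops
    paused = [False] * num_hops
    for _ in range(ticks):
        # Forward from each hop to the next one, starting nearest the congestion point.
        for i in range(num_hops - 2, -1, -1):
            if not paused[i + 1]:
                depth[i + 1] += depth[i]
                depth[i] = 0
        # The source keeps sending into hop 0 until hop 0 pauses it.
        if not paused[0]:
            depth[0] += rate
        # A hop whose depth reaches Xoff pauses its upstream node.
        paused = [d >= xoff for d in depth]
    return paused
```

In this model the pause state spreads from the congestion point back toward the source one hop at a time, which is the congestion tree effect that the congestion isolation technique aims to limit.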
The application provides a method and equipment for processing a congestion flow, which can avoid the problems of message loss or overflow and the like. Further, on one hand, the method provided by the application can effectively solve the problem of packet loss of the messages corresponding to the data stream in the transmission process, and avoid the problem of congestion tree in the network caused by diffusion of the PFC messages in the network. On the other hand, the method provided by the application can also avoid the situation that the congestion queue and the management queue in the network equipment both stop sending messages to cause the damage of the non-congestion flow, and ensure the transmission of the non-congestion flow.
The method provided by the application can be applied to data center networks, campus networking and the like. Optionally, the method provided by the application can also be applied to high-performance computing, high-performance distributed storage, big data, artificial intelligence and the like. Specifically, the method provided by the present application may be applied to a network device, and the network device may be a computer, a server, a switch (or referred to as a switch device, a switch chip, etc.), a router, a network card, etc. in any form, and the present application does not limit the specific form of the network device. Optionally, the method provided by the present application may also be applied to a network architecture composed of at least two network devices. For example, the at least two network devices may include a first network device and a third network device; alternatively, the at least two network devices may include a first network device, a second network device, at least one third network device, and the like, and the network architecture is not limited in this embodiment of the application. The third network device may be understood as a server, and the first network device and the second network device may be understood as a switch or a router between two servers.
For example, fig. 2a and fig. 2b are schematic diagrams of a network architecture provided in an embodiment of the present application, respectively. In fig. 2a and 2b, the TOR may be understood as a top of rack (TOR) switch, and may also be understood as a source switch, i.e., a switch connected to a server. Agg can be understood as an aggregation node, i.e., an aggregation switch. Fig. 2a and 2b mainly differ in the number of switches through which the data flow originating from the server side passes. It is understood that the network architectures shown in fig. 2a and fig. 2b are merely examples and should not be construed as limiting the present application. Illustratively, the third network device may be a server shown in fig. 2a or fig. 2b, and the first network device may be any type of switch shown in fig. 2a or fig. 2b, such as a source switch TOR connected to the server, or a Spine shown in fig. 2a, or an Agg switch shown in fig. 2b. In a specific implementation, the method provided by the embodiment of the present application may also be applied to a scenario in which one first network device is included between at least two third network devices.
The method for processing the congested flow provided by the embodiment of the present application will be described below with reference to a specific location of the first network device.
Fig. 3a is a flowchart illustrating a method for processing a congestion flow according to an embodiment of the present application, where the method can be applied to the network architectures shown in fig. 2a to 2 b. The method can be applied to a first network device and a third network device, and the first network device can be understood as a network device where a congestion point is located when a network where the first network device is located is congested. In other words, in fig. 3a the first network device and the second network device (i.e. the network device where the congestion point is located) may be understood as one and the same network device. As shown in fig. 3a, the method comprises:
In one possible implementation, the method shown in fig. 3a may include step 301.
301. The first network device identifies a first congested flow, and isolates the first congested flow to a congestion queue in the first network device.
After the first network device identifies the first congestion flow, a congestion flow table may be created, and for the specific description of the congestion flow table, reference may be made to table 3 above, and details thereof are not described here. The first congested flow is a congested flow identified by the first network device. The first congestion flow may be a congestion flow identified by the first network device from its management queue, or may be a congestion flow identified by the first network device from its congestion queue, as described in detail below.
In one possible implementation, the identifying, by the first network device, the first congested flow includes: a first congested flow is identified from a management queue in a first network device. I.e., the first congested flow is a congested flow that the first network device identifies from its administrative queue.
Optionally, the first network device may identify whether the data flow is a congestion flow according to a packet rate and/or a packet length corresponding to the data flow. For this description, reference is made to the above description, which is not detailed here.
Optionally, the first network device may further identify the first congested flow when a management queue in the first network device is congested. As an example, the metric for congestion of the management queue of the first network device may be as follows:
A. Whether the management queue of the first network device is congested is measured according to the queue depth of the management queue in the first network device. If the queue depth of the management queue of the first network device is greater than or equal to a second threshold, the management queue in the first network device is congested. The queue depth is used to measure the buffered amount of the management queue; the embodiment of the present application does not limit the specific value of the second threshold.
B. Whether the management queue of the first network device is congested is determined according to the number of times that the queue depth of the management queue is greater than or equal to the second threshold within a preset duration. If, within the preset duration, the number of times that the queue depth of the management queue in the first network device is greater than or equal to the second threshold is greater than or equal to a preset number, the management queue in the first network device is congested. The embodiment of the present application does not limit the specific values of the preset duration or the preset number.
C. Whether the management queue in the first network device is congested is identified according to the duration for which a data flow is buffered in the management queue in the first network device. The longer the data flow is buffered in the management queue in the first network device, the higher the probability that the management queue in the first network device is congested.
D. Whether the management queue in the first network device is congested is identified according to the number of ECN messages corresponding to the management queue in the first network device.
It can be understood that the above-described method for determining whether congestion occurs in the management queue in the first network device is only an example, and the embodiment of the present application does not limit this.
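Criteria A and B above can be sketched as two small predicates. The function names are hypothetical, and `samples` is assumed to be a list of queue depth readings taken within the preset duration; how the readings are collected is outside this sketch.

```python
def congested_by_depth(depth: int, second_threshold: int) -> bool:
    # Criterion A: a single depth reading at or above the second threshold.
    return depth >= second_threshold


def congested_by_crossings(samples, second_threshold: int, preset_number: int) -> bool:
    # Criterion B: count the readings within the preset duration that are at or
    # above the second threshold, and compare the count with the preset number.
    crossings = sum(1 for depth in samples if depth >= second_threshold)
    return crossings >= preset_number
```

Criterion B is less sensitive to a single short burst than criterion A, because one high reading alone does not reach the preset number.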
In the case that the management queue in the first network device is congested, after the first network device identifies the first congestion flow, the first network device directly performs step 304; alternatively, the first network device may also perform step 304 after performing step 302. For a specific description of step 302 or step 304, reference may be made to the following description, which is not detailed here.
In one possible implementation, the identifying, by the first network device, the first congested flow includes: a first congestion flow is identified from a congestion queue in a first network device. I.e., the first congested flow is a congested flow that the first network device identifies from its congestion queue. In other words, the first network device may first identify at least one congested flow and then segregate the at least one congested flow to a congestion queue in the first network device. A first congested flow is then identified from a congestion queue in the first network device. In other words, the embodiment of the present application does not limit the order of the two steps shown in step 301.
Thus, the method for the first network device to identify the first congested flow is as follows:
A. The first congestion flow is identified from the at least one congestion flow according to the message rate or the message length corresponding to the at least one congestion flow.
B. In the case that the congestion queue is congested, the first congestion flow is identified from the at least one congestion flow corresponding to the congestion queue. For the criteria for determining whether the congestion queue is congested, reference may be made to the above criteria for determining whether the management queue is congested, which are not described in detail here. For the same criterion, the specific values of the thresholds corresponding to the congestion queue and the management queue, such as the preset duration, the preset number, or the threshold for measuring the queue depth, may be different, and are not listed one by one here. It can be understood that, in the embodiment of the present application, as to how to identify the first congestion flow from the congestion queue in the first network device, reference may be made to the above-described method for identifying the first congestion flow from the management queue in the first network device, and details of the method are not described here. In the case that the congestion queue in the first network device is congested, after the first network device identifies the first congestion flow, the first network device directly performs step 304; alternatively, the first network device may also perform step 304 after performing step 302.
In this embodiment of the present application, isolating the first congestion flow to the congestion queue in the first network device may also be understood as: switching the first congestion flow to a congestion queue in the first network device; or, adjusting (or switching) a transmission queue of the first congestion flow to a congestion queue in the first network device; or, adjusting (or switching) the transmission queue of the first congestion flow from a management queue in the first network device to a congestion queue in the first network device. It is understood that, unless otherwise specified, statements below that are similar to "isolating the first congestion flow to the congestion queue in the first network device" may likewise be understood as any of the expressions shown here, and are not further described below.
In one possible implementation, the method shown in fig. 3a may further include step 302 and step 303.
302. The first network device generates a second message, where the second message is used to instruct an upstream node of the first network device to isolate the first congestion flow to a congestion queue in the upstream node of the first network device; and the first network device sends the second message to the upstream node of the first network device. Correspondingly, the upstream node of the first network device receives the second message.
In this embodiment, the second packet is mainly used to indicate the upstream node of the first network device to isolate the first congestion flow. Therefore, the embodiment of the present application does not limit the specific format of the second packet. For example, the second message may include a second CIM message, and the format of the second CIM message may be as shown in table 1 or table 2 above. That is, the first network device may generate the second CIM packet according to the flow information in the congestion flow table and the address information of the network device (such as the upstream node) adjacent to the first network device in the CI peer table.
In conjunction with the above description regarding step 301, the generating, by the first network device, the second packet may include: generating a second message under the condition that a management queue in the first network equipment is congested; or, generating the second message when the congestion queue in the first network device is congested. The two ways for generating the second message shown here may be referred to the description of step 301, and are not described in detail here.
The second message described above is illustrated using table 2, but the format of the CIM PDU in the second message may also be as shown in table 5, in which case the isolation type field may be used to carry tenth indication information, and the tenth indication information may be used to instruct the upstream node of the first network device to isolate the first congestion flow to the congestion queue in the upstream node of the first network device. It is understood that the tenth indication information may be carried in the VLAN identifier field. Illustratively, the tenth indication information may be 000 in the isolation type field, etc.
303. The upstream node of the first network device isolates the first congestion flow to a congestion queue in the upstream node of the first network device according to the second message.
After the upstream node of the first network device receives the second packet, it may also create a congestion flow table according to the congestion flow information in the second packet, and isolate the first congestion flow to its congestion queue. If the second message includes the second CIM message, the information of the first congestion flow may be obtained according to the MSDU field encapsulated in the CIM PDU in the second CIM message. For the description of the information about the first congested flow carried in the encapsulated MSDU field, reference may be made to the description of the CIM packet shown above, and details thereof are not described here.
In this embodiment of the application, after the first network device performs step 301, it may also directly perform step 304; or after step 301 is executed, step 302 is executed, and the like, and the steps specifically executed by the first network device are not limited in this embodiment of the application.
304. The first network device generates a first message, where the first message includes first indication information, the first indication information is used to indicate that a sending rate of a first congestion flow sent by a third network device is to be adjusted, the third network device is a source device of the first congestion flow, and the first congestion flow is a congestion flow identified by the first network device when the first network device is congested.
The first message may include a first CIM message, in which case the encapsulated MSDU field in the CIM PDU in the first message may include at least one of the following: a priority queue (including a congestion queue) of the first congested flow in the third network device; address information of at least one first network device or at least one second network device through which the first congested flow is routed from the third network device (i.e., a source device) to another third network device (e.g., a destination device); a priority queue (including a congestion queue, a management queue, etc.) of the first congested flow in each first network device; or a priority queue of the first congested flow in each second network device.
Optionally, the first packet further includes second indication information, where the second indication information is used to indicate that the first congestion flow sent by the third network device is isolated to a congestion queue in the third network device. The first packet may include a first CIM packet, and in combination with the formats of the CIM packets shown in tables 1 and 2, the format of the CIM PDU of the first CIM packet may be as shown in tables 4 and 5.
TABLE 4
TABLE 5
The first CIM packet shown in tables 4 and 5 is described by using an example in which the first indication information and the second indication information are carried in a connectivity isolation type (CI type) field. Optionally, as can be seen from table 2, 4 bits remain in the virtual local area network (VLAN) identifier field, so the first indication information and the second indication information may also be carried in the VLAN identifier field in the first CIM message. Alternatively, the first indication information and the second indication information are respectively carried in the isolation type field and the VLAN identifier field. The embodiment of the present application does not limit the specific field or area in which the first indication information and the second indication information are carried. As an example, the second indication information may be 001, and the first indication information may be different from the second indication information, such as 100.
The first indication information and the second indication information shown above are different information, and in a specific implementation, the first indication information and the second indication information may also be the same information, such as 001. In other words, the first packet may include an indication information instructing the third network device to adjust the sending rate of the first congestion flow sent by the third network device and to isolate the first congestion flow sent by the third network device to the congestion queue in the third network device.
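The two encodings described above (separate values such as 100 and 001, or one shared value such as 001) can be sketched as a small decoder. This is only an illustrative sketch: the 3-bit field width, the table names, and the function `decode_isolation_type` are assumptions made for the example, and the actual field layout is whatever tables 4 and 5 define.

```python
# Action names used only in this sketch (hypothetical labels).
ADJUST_RATE = "adjust_rate"   # role of the first indication information
ISOLATE = "isolate"           # role of the second indication information

# Example code points taken from the text; real encodings are not fixed here.
SEPARATE_CODES = {0b100: {ADJUST_RATE}, 0b001: {ISOLATE}}
COMBINED_CODES = {0b001: {ADJUST_RATE, ISOLATE}}


def decode_isolation_type(field: int, combined: bool = False) -> set[str]:
    """Map an isolation type value to the actions it requests.

    Assumes a 3-bit field; unknown code points yield an empty set.
    """
    code = field & 0b111
    table = COMBINED_CODES if combined else SEPARATE_CODES
    return set(table.get(code, set()))
```

With `combined=True`, the single value 001 requests both rate adjustment and isolation, which mirrors the case where the first and second indication information are the same information.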
In a possible implementation manner, the step of using the first indication information to indicate to adjust the sending rate of the first congestion flow sent by the third network device includes: the first indication information is used for indicating to adjust a sending rate of a congestion queue in the third network device.
In this embodiment, the sending rate of the first congested flow sent by the third network device may be adjusted more accurately, for example, the purpose of accurately adjusting the sending rate of the first congested flow may be achieved by a Data Center Quantized Congestion Notification (DCQCN) algorithm. When the purpose of adjusting the transmission rate of the first congestion flow is achieved by adjusting the transmission rate of the congestion queue in the third network device, the congestion flow transmitted by the third network device may also include a second congestion flow, which may also be isolated into the congestion queue in the third network device. In this case, the sending of the second congestion flow may be affected, but by adjusting the sending rate of the congestion queue, a more serious situation such as service interruption in the network is avoided.
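As a rough illustration of how a source device might lower the rate of its congestion queue, the following sketch applies the rate-decrease step commonly published for DCQCN at the sender. The text only names DCQCN; the function name, the default gain `g = 1/256`, and the idea that receipt of the first message triggers this step are assumptions of the example, not details stated in this application.

```python
def dcqcn_rate_decrease(rate_bps: float, alpha: float, g: float = 1 / 256):
    """One DCQCN-style decrease step for a queue's sending rate.

    rate_bps: current sending rate of the congestion queue (assumed unit: bit/s)
    alpha:    current congestion estimate in [0, 1]
    g:        averaging gain (assumed default)
    Returns (new_rate, target_rate, new_alpha).
    """
    target_rate = rate_bps                  # remember the rate for later recovery
    new_rate = rate_bps * (1 - alpha / 2)   # cut the rate in proportion to alpha
    new_alpha = (1 - g) * alpha + g         # congestion seen: raise the estimate
    return new_rate, target_rate, new_alpha
```

Because the adjustment is applied to the whole congestion queue in this sketch, any second congestion flow isolated into the same queue would be slowed as well, which matches the trade-off noted above.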
In this embodiment of the application, optionally, the step of using the first indication information to indicate that the adjustment of the sending rate of the first congestion flow sent by the third network device includes: the first indication information is used for indicating that the sending rate of the first congestion flow sent by the third network equipment is reduced; or, the first indication information is used to indicate that the third network device stops sending the first congestion flow. As for the specific role of the first indication information, it can be distinguished by different bit sequences; or, the third network device determines the specific role of the first indication information according to the information of the first congestion flow, and the like, which is not limited in this embodiment of the application.
In one possible implementation, the first packet is generated when a congestion queue in the first network device is congested. For the measure of whether the congestion queue in the first network device is congested, reference may be made to the measure of whether the management queue is congested described in step 301 above, and details are not described here. The embodiment of the present application does not limit whether the congestion degree of the congestion queue in the first network device shown here is consistent with the congestion degree of the congestion queue in the first network device in the related description of step 301. If the two are consistent, the first network device can generate the first message after identifying the first congestion flow. If not, the congestion degree shown here may be greater than the congestion degree described in relation to step 301, which makes it easier for the first network device to first identify the first congested flow and then indicate which congested flow the third network device needs to adjust. Alternatively, if not, the congestion degree shown here may be less than the congestion degree described in relation to step 301, in which case the first network device may directly generate the first packet when its congestion queue is congested.
305. The first network equipment sends a first message to an upstream node thereof; correspondingly, the upstream node of the first network device receives the first packet.
The steps performed by the upstream node of the first network device may refer to the method shown in fig. 3b, as described in connection with the first network device in step 316, and will not be described in detail here.
It is understood that the method shown in step 304 and step 305 is exemplified by the first network device generating the first message, but in a specific implementation, after step 301 to step 303, the upstream node of the first network device may also generate the first message when the congestion queue of the upstream node is congested. The upstream node of the first network device then sends a first packet to its upstream node, where the steps performed with respect to the upstream node of the first network device may be as described in relation to the first network device in step 316 of fig. 3b, and are not described in detail here.
Optionally, the upstream node of the first network device may also be a third network device. For example, when the congestion queue (or management queue) in the source switch or the source router connected to the third network device is congested, the first network device may directly send the first packet to the third network device. The specific description of the first network device is not repeated here.
306. The third network device adjusts a transmission rate of the first congested flow transmitted by the third network device.
Illustratively, the third network device may reduce the transmission rate of the first congestion flow that it sends; alternatively, the third network device may stop sending the first congested flow.
Alternatively, step 306 may further include: the third network device isolates the first congestion flow sent by the third network device to a congestion queue in the third network device.
For the detailed description and specific implementation of step 306, reference may also be made to the description of step 317 and step 318 in the method shown in fig. 3b, which is not detailed here.
In this embodiment of the application, after step 306, the method shown in fig. 3a may further include:
When the queue depth of the congestion queue in the first network device is greater than or equal to the priority-based flow control (PFC) threshold, the first network device sends a third PFC message to the upstream node of the first network device, where the third PFC message is used to instruct the congestion queue in the upstream node of the first network device to stop sending the first congestion flow.
In this embodiment of the present application, although it may be avoided that the queue depth of the congestion queue in the network reaches the Xoff threshold to the greatest extent, if the queue depth of the congestion queue in the first network device reaches the Xoff threshold, the first network device may send the third PFC packet to the upstream node thereof, so that the congestion queue in the upstream node of the first network device may suspend sending the first congestion flow, and situations such as packet loss or overflow of the congestion queue in the first network device are avoided.
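The fallback described above, where a PFC message is sent only once the congestion queue reaches the Xoff threshold, can be sketched as follows. The class `CongestionQueue`, the function `maybe_build_pfc`, and the dictionary layout of the returned message are hypothetical names chosen for the example; they do not represent a real PFC frame format.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CongestionQueue:
    priority: int         # priority of the congestion queue
    depth: int            # current queue depth (assumed unit: bytes)
    xoff_threshold: int   # PFC (Xoff) threshold in the same unit


def maybe_build_pfc(queue: CongestionQueue, pause_quanta: int = 0xFFFF) -> Optional[dict]:
    """Return a simplified PFC request if the queue reached Xoff, else None."""
    if queue.depth >= queue.xoff_threshold:
        # Ask the upstream node to pause the matching priority queue.
        return {"priority": queue.priority, "pause_quanta": pause_quanta}
    return None
```

In this sketch the comparison is "greater than or equal to", in line with the condition stated for sending the third PFC message.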
It can be understood that the method for sending the third PFC message by the first network device shown here may also be applied to other network devices in the network, and the embodiment of the present application does not limit this. It is understood that the detailed description of the third PFC message applies to the method shown in fig. 3b, and the detailed description is omitted below.
In this embodiment of the application, the first indication information in the first message may be used to indicate to adjust a sending rate of the first congestion flow sent by the third network device, and through the first message, the source device of the first congestion flow (that is, the third network device) may adjust the sending rate of the first congestion flow, so that the sending rate of the first congestion flow can be matched with a congestion degree of a network, and a priority queue in a network device (including the first network device and/or the second network device, etc.) in the network is prevented from quickly reaching the PFC threshold.
Fig. 3b is a flowchart illustrating a method for processing a congestion flow according to an embodiment of the present application, where the method can be applied to the network architectures shown in fig. 2a to 2b. The method can be applied to a first network device and a third network device, and the first network device can be understood as a network device where a non-congestion point is located when the network where the first network device is located is congested. In other words, in fig. 3b, the first network device and the second network device (i.e. the network device where the congestion point is located) are different network devices, and the first network device is an upstream node of the second network device. As shown in fig. 3b, the method comprises:
In one possible implementation, the method shown in fig. 3b includes steps 311 to 313.
311. The second network device identifies a first congested flow, and isolates the first congested flow to a congestion queue in the second network device.
It is understood that the detailed description about step 311 can also refer to the description about step 301 in fig. 3a, and the detailed description is omitted here.
312. The second network device generates a third message, where the third message is used to instruct the upstream node of the second network device to isolate the first congestion flow to a congestion queue in that upstream node, and sends the third message to the upstream node of the second network device. Correspondingly, the upstream node of the second network device receives the third packet.
In the embodiment of the present application, for specific descriptions of step 311 and step 312, refer to step 301 and step 302 in fig. 3a, for example, the third packet may refer to the second packet, and details thereof are not described here.
313. The first network device identifies the first congestion flow according to the third message.
The information of the first congested flow may be included in an encapsulated MSDU field in a CIM PDU of the third packet. For a specific introduction of the information on the first congestion flow included in the encapsulated MSDU field, reference is made to the above, and details thereof are not described here.
In this embodiment of the application, when the first network device needs to send the third packet to the upstream node of the first network device, the method shown in fig. 3b may further include: and the first network equipment generates a new third message according to the third message, wherein the destination address in the new third message is the address of the upstream node of the first network equipment, and the destination address in the third message is the address of the first network equipment.
314. The first network device isolates the first congestion flow to a congestion queue in the first network device according to the third message.
In one possible implementation, the method shown in fig. 3b may further include step 315 and step 316.
In this embodiment of the application, the second network device may further directly perform step 315 after performing step 311, and the like, which is not limited in this embodiment of the application.
315. The second network device generates a first message, where the first message includes first indication information, the first indication information is used to indicate that a sending rate of a first congestion flow sent by a third network device is to be adjusted, the third network device is a source device of the first congestion flow, and the first congestion flow is a congestion flow identified by the second network device when the second network device is congested.
316. The second network device sends the first message to its upstream node, namely the first network device; correspondingly, the first network device receives the first packet.
In this embodiment of the application, after receiving the first packet, the first network device may perform different processing according to the relationship between the first network device and the third network device. If the first network device is a source switch or a source router connected to the third network device, or the first network device is on a network segment directly connected to the third network device, the first network device may perform step 317. Generally, a first network device or a second network device in a network can know whether it is on a network segment directly connected to a third network device. Alternatively, a first network device or a second network device in the network can know whether it is a downstream node directly connected to the third network device, that is, the first network device can know whether its upstream node is the third network device. For example, the first network device may determine whether it is on the same network segment as the third network device according to the MAC address of the third network device. Illustratively, the same network segment may be defined as follows: an AND operation is performed on each of the two IP addresses and the subnet mask to obtain a network number for each address; if the network numbers are the same, the two IP addresses are in the same network segment, and otherwise they are not. It is understood that the description of the network segment shown here is merely an example, and in a specific implementation, the first network device may also know whether its upstream node is a third network device through other methods.
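The network segment definition above can be sketched with a bitwise AND of each IP address and the subnet mask. The function name `same_segment` and the IPv4-only handling are assumptions of this example; a device could equally use other methods, as the text notes.

```python
import ipaddress


def same_segment(ip_a: str, ip_b: str, netmask: str) -> bool:
    """Return True if both IPv4 addresses share a network number under netmask."""
    mask = int(ipaddress.IPv4Address(netmask))
    net_a = int(ipaddress.IPv4Address(ip_a)) & mask  # network number of ip_a
    net_b = int(ipaddress.IPv4Address(ip_b)) & mask  # network number of ip_b
    return net_a == net_b
```

For instance, under the mask 255.255.255.0 the addresses 192.168.1.10 and 192.168.1.200 yield the same network number, while 192.168.1.10 and 192.168.2.10 do not.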
If the first network device is not a source switch or a source router connected to the third network device, the first network device may forward the first packet, or generate a new first packet. For example, if the destination address in the network layer header of the first packet is the address of the source switch or the source router, the first network device may forward the first packet directly. For another example, if the destination address in the network layer header of the first packet is the address of the third network device, the first network device may also directly forward the first packet. For still another example, if the destination address in the network layer header of the first packet is only the address of the first network device, the destination address in the new first packet generated by the first network device is the address of the upstream node of the first network device. In other words, the first network device needs to generate a new first packet according to the first packet, where the source address in the network layer header of the new first packet may be an IP address of the first network device, and the destination address in the network layer header of the new first packet may be an IP address of the upstream node of the first network device; the source MAC address in the PDU in the new first message may be a MAC address of the first network device, and the destination MAC address in the CIM PDU in the new first message may be a MAC address of the upstream node of the first network device. Optionally, after the first network device receives the first packet, when the destination address in the first packet is only the address of the first network device, the first network device may also obtain the new first packet by modifying the address information in the first packet.
The specific manner for modification may be as described above, such as modifying the source address information in the network layer header in the first message to the IP address of the first network device. It is understood that the modification of the address shown here is merely an example, and in a specific implementation, the first network device may also modify the source port information in the first message, and the like, which is not limited in this embodiment of the application.
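The forward-or-regenerate behavior described above can be sketched as follows. The types `FirstMessage` and `Node` and the function `relay_first_message` are hypothetical, and the message is reduced to its address fields plus an opaque payload; the sketch assumes a single IP address per device.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Node:
    ip: str
    mac: str


@dataclass(frozen=True)
class FirstMessage:
    src_ip: str
    dst_ip: str
    src_mac: str
    dst_mac: str
    payload: bytes   # stands in for the CIM PDU contents


def relay_first_message(msg: FirstMessage, me: Node, upstream: Node) -> FirstMessage:
    """Forward the message unchanged, or re-address it to the upstream node."""
    if msg.dst_ip != me.ip:
        # Addressed to the source switch/router or the source device: forward as is.
        return msg
    # Addressed only to this device: build a new first message for the upstream node.
    return replace(msg, src_ip=me.ip, dst_ip=upstream.ip,
                   src_mac=me.mac, dst_mac=upstream.mac)
```

Using `replace` on a frozen dataclass keeps the received message intact and produces the new first message as a copy with modified address information, which corresponds to the "modify the address information" option in the text.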
Steps 315 and 316 above are described by using an example in which the second network device generates the first message. In a specific implementation, however, although the second network device is the network device where the congestion point is located, congestion may also occur in the congestion queue of the first network device, and in this case the first network device may generate the first message. That is, after steps 311 to 314, the method shown in fig. 3b may further include: the first network device generates a first message when the congestion queue of the first network device is congested. The first network device then sends the first packet to its upstream node. For the specific description of the first network device generating the first packet, reference may be made to step 316 or the related description of steps 304 and 305, which is not described in detail here.
317. The first network device generates a first PFC message according to the first message, where the first PFC message includes duration information indicating how long the queue in which the first congestion flow sent by the third network device is located stops sending. The first network device sends the first PFC message to the third network device; correspondingly, the third network device receives the first PFC message.
In this embodiment, the time unit of the duration information may be the time required for the physical layer chip to transmit 512 bits of data. In other words, one time unit of the duration information indicates that the congestion queue of the third network device suspends sending the packets corresponding to the first congestion flow for the time required by the physical layer chip of the third network device to send 512 bits of data. Alternatively, the time unit of the duration information may also be milliseconds, microseconds, or the like, which is not limited in this embodiment of the application. For example, the maximum value of the duration information may be 0xFFFF. The PFC message may include, in addition to the duration information, an identifier of the priority queue corresponding to the duration information. In other words, the PFC message may further include an identifier of the queue in which the first congestion flow is located. Optionally, the first network device may obtain the information of the first congestion flow according to the encapsulated MSDU field in the CIM PDU in the first message received by the first network device, and thereby learn the queue in which the first congestion flow is located. Alternatively, the first network device may also obtain, through other manners, the queue in which the first congestion flow is located, which is not limited in this embodiment of the application.
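To make the 512-bit time unit concrete, the following sketch converts a duration value into seconds for a given link rate. The function name `pause_duration_seconds` and the use of the nominal link rate in bit/s are assumptions of the example.

```python
def pause_duration_seconds(quanta: int, link_bps: float) -> float:
    """Convert a PFC duration value into seconds.

    One unit is the time needed to send 512 bits at the link rate;
    the value is assumed to fit in 16 bits (0 to 0xFFFF).
    """
    if not 0 <= quanta <= 0xFFFF:
        raise ValueError("duration must be in the range 0..0xFFFF")
    return quanta * 512 / link_bps
```

Under these assumptions, one unit on a 10 Gbit/s link is 512 / 10^10 s = 51.2 ns, and the maximum value 0xFFFF corresponds to roughly 3.36 ms.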
Optionally, the queue in which the first congestion flow is located includes a congestion queue in which the first congestion flow is located and/or an original queue in which the first congestion flow is located. That is, for the third network device, the first congestion flow may not be isolated to the congestion queue in the third network device, or the first congestion flow may not be completely isolated (e.g., a part of the packet corresponding to the first congestion flow) to the congestion queue in the third network device, or the first congestion flow may be isolated to the congestion queue in the third network device.
In the embodiment of the application, when receiving the first message, the first network device may instruct the third network device to adjust the sending rate of the congestion flow that it sends by sending the PFC message to the third network device.
Alternatively, step 317 may also be replaced with: the first network equipment sends a first message to the third network equipment, and the third network equipment receives the first message.
In this embodiment, the destination address in the first message may be an address of the third network device. For example, the source address in the network layer header of the first packet may be an IP address of the first network device, and the destination address in the network layer header of the first packet may be an IP address of the third network device; the source MAC address in the PDU in the first message may be a MAC address of the first network device, and the destination MAC address in the CIM PDU in the first message may be a MAC address of the third network device.
Alternatively, in conjunction with step 317, step 317 may further include: the first network device generates a fourth message according to the first message, where the fourth message includes fourth indication information, and the fourth indication information is used to indicate that the first congestion flow sent by the third network device is isolated to a congestion queue in the third network device. The first network device sends the fourth message to the third network device; correspondingly, the third network device receives the fourth message. In this case, the third network device may perform step 318 according to the fourth packet and the first PFC packet.
Optionally, the fourth message may further include third indication information, the first indication information may be referred to for a specific description of the third indication information, the second indication information may be referred to for a specific description of the fourth indication information, and the description of the first message may be referred to for the fourth message, which is not described in detail here. The difference between the first message and the fourth message is that the destination address in the first message is different from the destination address of the fourth message, or the source address of the first message is different from the source address of the fourth message, and so on, which is not described herein again. In this case, the first network device may send the fourth packet to the third network device alone, and may cause the third network device to perform step 318 without passing through the first PFC packet.
318. The third network device adjusts a transmission rate of the first congestion flow transmitted by the third network device.
Optionally, before adjusting the sending rate of the first congestion flow sent by the third network device, the third network device may further isolate the first congestion flow sent by the third network device to a congestion queue in the third network device.
In this embodiment of the application, the first indication information in the first message may be used to indicate to adjust a sending rate of the first congestion flow sent by the third network device, and through the first message, the source device of the first congestion flow (that is, the third network device) may adjust the sending rate of the first congestion flow, so that the sending rate of the first congestion flow can be matched with a congestion degree of a network, and a priority queue in a network device (including the first network device and/or the second network device, etc.) in the network is prevented from quickly reaching the PFC threshold.
The above-described embodiments use examples in which the third network device reduces the transmission rate of the first congested flow that it transmits, or stops transmitting the first congested flow. The following uses an example in which the third network device increases the transmission rate of the first congested flow that it transmits, or resumes transmitting the first congested flow, to describe another method for processing a congested flow provided in the embodiments of the present application. Similarly, the embodiment of the present application describes the method by taking the first network device and the third network device as examples.
Fig. 4a is a flowchart illustrating a method for processing a congestion flow according to an embodiment of the present application, where the method may be applied to a first network device and a third network device. The first network device in the method may correspond to the first network device shown in fig. 3a. For example, the first network device may be a network device that identifies the first congested flow as a congested flow. As shown in fig. 4a, the method comprises:
401. The first network device generates a fifth message, where the fifth message includes fifth indication information, and the fifth indication information is used to indicate that the sending rate of the first congestion flow sent by the third network device is to be increased; or, the fifth indication information is used to instruct the third network device to resume sending the first congestion flow.
If the first indication information is used to instruct the third network device to stop sending the first congestion flow, the fifth indication information may be used to instruct the third network device to resume sending the first congestion flow. If the first indication information is used to indicate that the transmission rate of the first congested flow transmitted by the third network device is decreased, the fifth indication information may be used to indicate that the transmission rate of the first congested flow transmitted by the third network device is increased.
When the fifth packet includes the fifth CIM packet, for a specific description of the fifth indication information, the first indication information shown in table 4 and table 5 may be referred to. For example, the fifth indication information may be carried in the isolation type field shown in table 4, or in the virtual local area network identifier field, and the like, which is not limited in this embodiment of the application.
Optionally, the first network device may further generate the fifth packet when the first congested flow is identified as a non-congested flow; or, the first network device may further generate the fifth packet when the congestion queue of the first network device is not congested.
In this embodiment of the application, when the first network device identifies that the first congested flow is an uncongested flow, the first network device may generate a fifth packet, so that the third network device increases a sending rate of the first congested flow sent by the third network device or resumes sending the first congested flow in units of flows. And/or, generating a fifth message when the congestion queue in the first network device is not congested, so that the third network device increases the sending rate of the first congestion flow sent by the third network device or resumes sending the first congestion flow. For example, the first network device may also determine whether its congestion queue is already uncongested based on whether the queue depth of its congestion queue is less than or equal to a third depth threshold. The third depth threshold may be less than the first depth threshold shown above. The third depth threshold shown here is merely an example, and in particular implementations, it may also be determined whether the congestion queue of the first network device is already uncongested according to other methods.
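The use of a third depth threshold that is lower than the first depth threshold can be sketched as a simple hysteresis check. The function name `update_congested` is hypothetical, and the sketch assumes that congestion is declared when the queue depth is greater than or equal to the first depth threshold; the text here only states the condition for leaving the congested state.

```python
def update_congested(depth: int, congested: bool,
                     first_threshold: int, third_threshold: int) -> bool:
    """Return the new congestion state of a congestion queue.

    Enter the congested state at depth >= first_threshold (assumed) and leave
    it at depth <= third_threshold, where third_threshold < first_threshold.
    """
    if third_threshold >= first_threshold:
        raise ValueError("third threshold must be below the first threshold")
    if not congested and depth >= first_threshold:
        return True
    if congested and depth <= third_threshold:
        return False   # the point at which a fifth message could be generated
    return congested
```

Keeping a gap between the two thresholds avoids flipping between the congested and uncongested states when the queue depth hovers around a single value.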
In a possible implementation manner, the fifth packet may further include sixth indication information, where the sixth indication information is used to indicate that the first congested flow sent by the third network device is isolated to the non-congested queue. Alternatively, it can also be understood that: the sixth indication information is used to indicate that the first congested flow sent by the third network device is switched to a non-congested queue, and the like, and is not described in detail here.
It is understood that, regarding the specific description of the fifth indication information and the sixth indication information, reference may be made to the above description regarding the first indication information and the second indication information similarly.
402. The first network device sends the fifth packet to an upstream node of the first network device; correspondingly, the upstream node of the first network device receives the fifth packet.
403. The third network device increases the sending rate of the first congested flow that it sends, or resumes sending the first congested flow.
In this embodiment of the present application, the fifth packet generated by the first network device may be sent directly to the third network device, in which case the destination address in the fifth packet is an address related to the third network device. Alternatively, the fifth packet generated by the first network device may be sent to the source switch or source router, in which case the third network device may receive a second PFC message and perform step 403 according to the second PFC message. Alternatively, the third network device may perform step 403 according to a received sixth packet, or according to both the received second PFC message and the sixth packet, and so on. For the specific description of step 403, reference may also be made to the description of step 413 in fig. 4b below, which is not described in detail here. Unlike step 413 in fig. 4b, the message received by the third network device here may arrive in either of two ways: from the first network device, as in step 413, or from another network device (e.g., a source switch or source router) that is on the same network segment as the third network device. Generally, network devices in a network know whether the first network device is connected to the third network device; therefore, the first network device may perform the corresponding method according to whether it is connected to the third network device.
In this embodiment, the third network device may further isolate the first congested flow sent by the third network device to a non-congested queue and the like.
It is understood that, with regard to the specific description of fig. 4a, reference may also be made to the methods illustrated in fig. 3a and 3b, which are not described in detail here.
Fig. 4b is a flowchart illustrating a method for processing a congestion flow according to an embodiment of the present application, where the method may be applied to a first network device and a third network device. In this case, the second network device may identify that the first congested flow is a non-congested flow, or that its congestion queue is not congested, and that the first network device is an upstream node of the second network device. As shown in fig. 4b, the method comprises:
411. The second network device identifies the first congested flow as an uncongested flow.
412. The second network device generates a fifth packet and sends the fifth packet to an upstream node of the second network device, namely the first network device; correspondingly, the first network device receives the fifth packet.
The fifth packet includes fifth indication information, where the fifth indication information is used to indicate to increase a sending rate of the first congestion flow sent by the third network device, or is used to indicate to resume the third network device from sending the first congestion flow. Optionally, the fifth packet may further include sixth indication information, where the sixth indication information is used to indicate that the first congestion flow sent by the third network device is isolated to the non-congestion queue in the third network device.
In this embodiment, when the first network device receives the fifth packet, different processing may be performed according to a relationship between the first network device and the third network device. The different processes shown here may correspond to the description of the processing of the first packet by the first network device in step 316 in fig. 3b, and are not described in detail here.
413. The first network device generates a second PFC message according to the fifth packet, where the second PFC message is used to indicate that the queue in which the first congestion flow sent by the third network device is located is to be resumed. The first network device sends the second PFC message to the third network device; correspondingly, the third network device receives the second PFC message. In this case, the third network device may perform step 414 according to the second PFC message.
In this embodiment of the application, the duration information in the second PFC message may be 0, so that the third network device may resume sending the first congestion flow after receiving the second PFC message.
In this embodiment of the application, the queue in which the first congestion flow is located includes a congestion queue in which the first congestion flow is located and/or an original queue in which the first congestion flow is located. That is, for a third network device, the first congested flow may not have been isolated to a congestion queue in the third network device, or the first congested flow may not have been completely isolated to a congestion queue in the third network device, or the first congested flow may have been isolated to a congestion queue in the third network device.
Alternatively, step 413 may further include: the first network device sends the fifth packet to the third network device, and the third network device receives the fifth packet. In this embodiment, the destination address of the fifth packet may be an address of the third network device; for the description of the address, reference may be made to the destination address in the first packet described in step 317, and details are not described here. In this case, the third network device may perform step 414 according to the fifth packet and the second PFC message. Alternatively, the first network device may directly send the fifth packet to the third network device, in which case the third network device may perform step 414 according to the fifth packet.
Alternatively, step 413 may be replaced with: the first network device generates a sixth packet according to the fifth packet, where the sixth packet includes seventh indication information, and the seventh indication information is used to indicate increasing the sending rate of the first congestion flow sent by the third network device. Optionally, the sixth packet may further include eighth indication information, where the eighth indication information is used to indicate that the first congestion flow sent by the third network device is switched to a non-congestion queue in the third network device. In this case, the third network device may perform step 414 according to the sixth packet.
414. The third network device increases the transmission rate of the first congested flow transmitted by the third network device or resumes the transmission of the first congested flow by the third network device.
Optionally, the third network device may further switch the first congested flow sent by the third network device to a non-congested queue in the third network device.
It is understood that the specific description of the fifth message or the sixth message referred to in fig. 4a and 4b may refer to the descriptions of the first message and the fourth message shown above, and will not be described in detail here.
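The handling of the fifth packet in step 413 can be sketched as follows. This is a minimal illustration of one of the alternatives described above, assuming that a first network device directly connected to the third network device answers with a second PFC message whose duration is 0, and that any other device forwards the fifth packet upstream. The `Message` type, its field names, and the string labels are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Message:
    kind: str                       # "fifth_packet" or "second_pfc" (illustrative labels)
    receiver: str                   # device that receives the message next
    duration: Optional[int] = None  # only set for PFC messages; 0 means "resume"


def process_fifth_packet(upstream: str, source: str,
                         connected_to_source: bool) -> Message:
    """Decide what the first network device emits after receiving a fifth packet."""
    if connected_to_source:
        # Directly connected to the third network device (the source):
        # tell it to resume the queue holding the first congestion flow.
        return Message(kind="second_pfc", receiver=source, duration=0)
    # Otherwise pass the fifth packet on toward the source.
    return Message(kind="fifth_packet", receiver=upstream)
```

For example, `process_fifth_packet("L3", "S3", connected_to_source=False)` returns a fifth packet addressed to `L3`, while the same call with `connected_to_source=True` returns a second PFC message for `S3` with duration 0. The other alternatives (sending the fifth packet together with the second PFC message, or generating a sixth packet) are not modeled here.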
It can be understood that the embodiments shown in fig. 3a, 3b, 4a and 4b each have their own emphasis; for implementations not described in detail in one embodiment, reference may be made to the other embodiments, and details are not repeated here. The four embodiments shown above may also be combined with each other. Illustratively, fig. 3a may be combined with fig. 4a or with fig. 4b, and fig. 3b may likewise be combined with fig. 4a or with fig. 4b. When fig. 3a is combined with fig. 4a, identifying the first congested flow and identifying the first congested flow as an uncongested flow may both be performed by the first network device. When fig. 3a is combined with fig. 4b, identifying the first congested flow may be performed by the first network device, and identifying the first congested flow as a non-congested flow may be performed by the second network device.
Referring to fig. 5a, fig. 5a is a schematic diagram of a network architecture according to an embodiment of the present application. Here, S1 to S5 may be understood as servers, L1 to L4 may be understood as switches connected to the servers, which may be called source switches or TORs for short, and s11 and s12 may be understood as intermediate switches (or as Spine as shown in fig. 2a). Without loss of generality, the embodiments shown below are described using fig. 5a as an example, and the description of fig. 5a is therefore not repeated below.
The server in fig. 5a may be used to execute the method of the third network device illustrated in the above embodiments, and L1 to L4, s11 and s12 may be understood as the above first network device or the above second network device.
In the following embodiments, the method provided by the embodiments of the present application is described by assuming that the network shown in fig. 5a is in a congested state with L4 as the congestion point, and by taking the link formed by L4, s12, L3, and S3 as an example. However, the method provided by the embodiments of the present application is also applicable to the network devices on other links in the network shown in fig. 5a, which is not described in detail below.
Fig. 5b is a schematic flowchart of a congestion isolation method provided in an embodiment of the present application, and as shown in fig. 5b, the method includes:
501. When the queue depth of the management queue in L4 is greater than or equal to the second depth threshold, L4 identifies a first congestion flow from the management queue and isolates the first congestion flow to a congestion queue in L4 (e.g., congestion queue 3).
L4 isolates the first congestion flow that it sends to the congestion queue in L4, where the congestion queue is at least one of the 8 priority queues corresponding to the sending port that L4 uses when sending the third CIM message to s12. It is understood that, regardless of whether a data stream or the third CIM packet is transmitted between L4 and s12, the congestion queue may be at least one of the 8 priority queues corresponding to the port between L4 and s12 described above.
As shown in fig. 5c, the relationship between the second depth threshold and the first depth threshold or the third depth threshold shown in fig. 5c is merely an example, and the relationship between the second depth threshold and the first depth threshold or the third depth threshold is not limited in the embodiment of the present application. For the description of the first depth threshold, the second depth threshold and the third depth threshold, reference may be made to the foregoing embodiments, which are not described in detail here. As shown in fig. 5c (one), the queue depth of management queue 4 in L4 is greater than the second depth threshold, then L4 may isolate the first congestion flow to congestion queue 3.
In this embodiment, L4 may also create a congestion flow table. For the detailed description of the congestion flow table, reference may be made to the description of table 2 above, which is not described in detail here.
502. L4 generates a second CIM packet, where the second CIM packet is used to instruct s12 to isolate the first congestion flow sent by s12 to the congestion queue, the source IP address of the second CIM packet is the address of L4, and the destination IP address of the second CIM packet is the address of s12.
It can be understood that, for the method by which L4 generates the second CIM packet, reference may be made to the method by which the first network device generates the second packet in the foregoing embodiments, and details are not described here. It can be understood that the second CIM packet in the embodiment of the present application is illustrated in a point-to-point manner, but the embodiment of the present application is also applicable to scenarios in which the second CIM packet spans three or four hops.
503. L4 sends the second CIM message to s12, and correspondingly s12 receives the second CIM message.
504. s12 segregates the first congestion flow it sends to the congestion queue.
s12 isolates the first congestion flow that it sends to the congestion queue, where the congestion queue is at least one of the 8 priority queues corresponding to the receiving port on which s12 receives the second CIM packet sent by L4.
In the embodiment of the present application, for example, the second depth threshold in L4 is shown as (one) in fig. 5c, and the queue depth for the management queue in s12 may be shown as (two) in fig. 5c or (four) in fig. 5 c. In other words, when the first congestion flow is isolated to the congestion queue at s12, whether the queue depth of the management queue of s12 is greater than or equal to the second depth threshold is not limited in the embodiment of the present application. For this description, the same applies to L3 and the like, which will not be described in detail below. It will be appreciated that in fig. 5c, each rectangle may represent a data stream, e.g. for the management queue 4, different depth rectangles may represent a data stream. For example, in fig. 5c (two), two data streams are included in the management queue 4. As in (four) of fig. 5c, two congested flows are included in the congestion queue 3. Or, it may also be called that two congested flows are buffered in a congestion queue, etc.
As an example, the first depth threshold in fig. 5c may satisfy the following formula:
headroom=2×MTU+2×Link BW×link Propagation time×n
wherein the headroom is used to represent a difference between the first depth threshold and the Xoff threshold; the MTU is used to indicate the maximum transmission unit, such as the maximum transmission unit of the data stream received by the server, and the maximum size of the data service unit received by the server; link BW is used to represent Link bandwidth; the link Propagation time is used to indicate the transmission time of a data stream (including a congestion stream) on a one-hop link (i.e., a point-to-point transmission mode), and n is used to indicate the number of hops that the data stream (or a packet as well) is transmitted in the network.
The value of the first depth threshold needs to ensure that the server has enough time to adjust the sending rate of the first congestion flow that it sends without the first network device or the second network device triggering the Xoff threshold. In other words, the first network device or the second network device needs enough buffer space to absorb the in-flight packets before the server stops sending the first congestion flow. For example, in an instance of the above formula:
1500 represents the MTU in bytes, and 1500 × 8 represents the number of bits corresponding to 1500 bytes; 100 represents the link bandwidth in Gbit/s, and 100 × 1024 × 1024 × 0.000001 represents the number of bits transmitted in a duration of 1 us when the link bandwidth is 100 Gbit/s; and 4 is the number of hops.
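The headroom formula above can be evaluated with a short helper. This is a sketch under stated assumptions: the function and parameter names are hypothetical, all quantities are kept as integers, and the link bandwidth is expressed in bits per microsecond using decimal units (100 Gbit/s taken as 100,000 bits per microsecond). That convention differs from the factors of 1024 quoted in the text, so the resulting number is illustrative rather than the value intended by the embodiment.

```python
def headroom_bits(mtu_bytes: int, link_bw_bits_per_us: int,
                  propagation_time_us: int, hops: int) -> int:
    """headroom = 2 x MTU + 2 x Link BW x link Propagation time x n, in bits."""
    mtu_bits = mtu_bytes * 8
    # Bits that can be in flight across `hops` one-hop links.
    in_flight_bits = link_bw_bits_per_us * propagation_time_us * hops
    return 2 * mtu_bits + 2 * in_flight_bits


# MTU 1500 bytes, 100 Gbit/s (decimal), 1 us per hop, 4 hops.
print(headroom_bits(1500, 100_000, 1, 4))  # prints 824000
```

Under these assumptions the two MTU terms contribute 24,000 bits and the in-flight term contributes 800,000 bits, so the in-flight data dominates the headroom at this bandwidth.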
In this embodiment of the application, after s12 obtains the second CIM packet, s12 may create a congestion flow entry according to the second CIM packet; when a data flow hits the congestion flow entry, s12 isolates that data flow to the congestion queue. Such a data flow may also be referred to as a congestion flow.
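The entry lookup described above might be sketched as follows. The contents of a congestion flow entry are not given in this passage, so identifying a flow by a 5-tuple is an assumption of this sketch, as are the class and method names; the queue numbers 4 and 3 follow the management queue 4 and congestion queue 3 used as examples in fig. 5c.

```python
from typing import NamedTuple, Set


class FlowKey(NamedTuple):
    """Hypothetical flow identifier; the real entry format may differ."""
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: str


MANAGEMENT_QUEUE = 4  # example queue number from fig. 5c
CONGESTION_QUEUE = 3  # example queue number from fig. 5c


class CongestionFlowTable:
    def __init__(self) -> None:
        self._entries: Set[FlowKey] = set()

    def add_entry(self, flow: FlowKey) -> None:
        """Create an entry, e.g. after a second CIM packet names this flow."""
        self._entries.add(flow)

    def remove_entry(self, flow: FlowKey) -> None:
        """Delete an entry once the flow is no longer treated as congested."""
        self._entries.discard(flow)

    def select_queue(self, flow: FlowKey) -> int:
        """A flow that hits an entry is isolated to the congestion queue."""
        return CONGESTION_QUEUE if flow in self._entries else MANAGEMENT_QUEUE
```

A flow that has no entry stays in the management queue, so only the flows named by a CIM packet are isolated.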
505. s12 generates a second CIM packet, where the second CIM packet is used to instruct L3 to isolate the first congestion flow sent by L3 to the congestion queue, the source IP address of the second CIM packet is the address of s12, and the destination IP address of the second CIM packet is the address of L3.
506. s12 sends a second CIM message to L3, and correspondingly, L3 receives the second CIM message.
507. L3 segregates the first congestion flow it sends to the congestion queue.
L3 isolates the first congestion flow that it sends to the congestion queue, where the congestion queue is at least one of the 8 priority queues corresponding to the receiving port on which L3 receives the third CIM packet sent by s12.
508. When the queue depth of the congestion queue in L4 is greater than or equal to the first depth threshold, L4 generates a first CIM packet, where the first CIM packet includes first indication information, and a destination address of the first CIM packet is an address of S3.
It is understood that L4 shown in step 508 is only an example, and in a specific implementation, the first CIM packet and the like may also be generated according to the queue depth of the congestion queue in s12 or L3, which is not described in detail in this embodiment of the present application. It can be understood that, for other descriptions of the first CIM packet shown in the embodiment of the present application, reference may be made to the first packet shown in the foregoing embodiment. The destination address of the first CIM message shown here is an address of S3, which can also be understood as: the destination IP address in the network layer header of the first CIM packet is an IP address of S3, or the destination MAC address in the PDU of the first CIM packet is a MAC address of S3, etc., which will not be described in detail herein.
509. L4 sends the first CIM message to s12, and s12 receives the first CIM message accordingly.
In this embodiment of the application, when s12 receives the first CIM packet, the relationship between the queue depth of the congestion queue in s12 and the first depth threshold may be as shown in (two) in fig. 5c or as shown in (three) in fig. 5 c.
510. s12 sending the first CIM message to L3 according to the destination address in the first CIM message; accordingly, L3 receives the first CIM message.
511. L3 sends the first CIM message to S3 according to the destination address in the first CIM message; accordingly, S3 receives the first CIM message.
513. S3 reduces the sending rate of the first congested flow that it sends, or stops sending the first congested flow.
In a possible implementation manner, when receiving the first CIM packet, the S3 may implicitly determine, according to the first indication information in the first CIM packet, that the sending of the first congestion flow needs to be stopped. That is, after receiving the first CIM packet, S3 may isolate the first congestion flow directly according to the first CIM packet, and stop sending the first congestion flow.
In yet another possible implementation, the method shown in fig. 5b may further include step 512.
512. The L3 sends a first PFC message to S3, where the first PFC message includes duration information that a queue where the first congestion flow sent in S3 is located is stopped; accordingly, S3 receives the first PFC message.
For example, if the destination address of the first CIM packet generated by L4 is the address of L3, L3 may also directly perform step 512 after receiving the first CIM packet. The L3 may not need to send the first CIM message to the S3.
In this embodiment of the application, since L4 and s12 are not directly connected to the source of the first congestion flow (the server), when acquiring the first CIM packet, L4 and s12 can only send the first CIM packet to their upstream devices and cannot directly send the first PFC packet to those upstream devices. For example, if L4 sent the first PFC message to s12, the congestion queue in s12 would stop sending packets. In this case, the queue depth of the congestion queue in s12 would quickly reach the Xoff threshold, causing s12 to send the first PFC packet to L3; likewise, the queue depth of the congestion queue in L3 would also quickly reach the Xoff threshold. Because the server would still be sending the first congested flow, the first congested flow buffered in the congestion queue in L3 could be dropped or could overflow.
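The delivery path of the first CIM packet in fig. 5b can be traced with a small sketch. The `UPSTREAM` map encodes only the example link L4, s12, L3, S3; the function name and the message labels are hypothetical. The sketch assumes that the first CIM packet is forwarded unchanged until it reaches its destination, and that only a switch directly connected to the server may answer with a first PFC message.

```python
from typing import Dict, List, Tuple

# Example link from fig. 5a: each device maps to its upstream neighbour.
UPSTREAM: Dict[str, str] = {"L4": "s12", "s12": "L3", "L3": "S3"}
SERVERS = {"S3"}


def deliver_first_cim(origin: str, destination: str) -> List[Tuple[str, str, str]]:
    """Return (sender, receiver, message) hops for a CIM with a fixed destination."""
    trace: List[Tuple[str, str, str]] = []
    node = origin
    while node != destination:
        next_hop = UPSTREAM[node]
        trace.append((node, next_hop, "first_cim"))  # forwarded by destination address
        node = next_hop
    if node not in SERVERS:
        # The destination is a switch: it may only emit PFC toward a server.
        if UPSTREAM[node] not in SERVERS:
            raise ValueError(f"{node} is not directly connected to the server")
        trace.append((node, UPSTREAM[node], "first_pfc"))
    return trace
```

With destination `S3` the trace contains three CIM hops (steps 509 to 511); with destination `L3` the last hop is a first PFC message from `L3` to `S3` (step 512), and no PFC message is ever exchanged between switches.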
Fig. 6 is a schematic flowchart of a congestion isolation method provided in an embodiment of the present application, and as shown in fig. 6, the method includes:
for the detailed description of step 601 to step 607, reference may be made to the method shown in fig. 5b, which is not described in detail here.
608. When the queue depth of the congestion queue in L4 is greater than or equal to the first depth threshold, L4 generates a first CIM message, where the first CIM message includes first indication information, and the destination address of the first CIM message is an address of s12.
It is understood that L4 in step 608 is only an example; in a specific implementation, the first CIM packet may also be generated according to the queue depth of the congestion queue in s12 or L3. In this case, the destination address of the first CIM packet generated by s12 is the address of L3, and the destination address of the first CIM packet generated by L3 is the address of S3.
609. L4 sends the first CIM message to s12, and s12 receives the first CIM message accordingly.
In this embodiment of the application, when s12 receives the first CIM packet, the relationship between the queue depth of the congestion queue in s12 and the first depth threshold may be as shown in (two) in fig. 5c or as shown in (three) in fig. 5 c.
610. s12 generates a new first CIM packet according to the first CIM packet.
611. s12 sends a new first CIM message to L3; accordingly, L3 receives the new first CIM message.
612. The L3 sends a first PFC message to S3; accordingly, S3 receives the first PFC message.
613. S3 reduces the sending rate of the first congested flow that it sends, or stops sending the first congested flow.
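Steps 608 to 613 can be illustrated with a hop-by-hop relay, in which each switch generates a new first CIM packet addressed to its immediate upstream node and the switch connected to the server sends the first PFC message. The `UPSTREAM` map, the function name, and the record layout are assumptions of this sketch.

```python
from typing import Dict, List, Tuple

# Example link from fig. 5a: each device maps to its upstream neighbour.
UPSTREAM: Dict[str, str] = {"L4": "s12", "s12": "L3", "L3": "S3"}
SERVERS = {"S3"}


def relay_hop_by_hop(origin: str) -> List[Tuple[str, str, str]]:
    """Return (sender, destination address, message) records, one per hop."""
    trace: List[Tuple[str, str, str]] = []
    node = origin
    while node not in SERVERS:
        next_hop = UPSTREAM[node]
        if next_hop in SERVERS:
            # The switch connected to the server converts the CIM into PFC.
            trace.append((node, next_hop, "first_pfc"))
        else:
            # A new first CIM packet is generated for the next upstream switch.
            trace.append((node, next_hop, "first_cim"))
        node = next_hop
    return trace
```

Starting from `L4`, the records are a CIM packet to `s12`, a new CIM packet to `L3`, and a PFC message to `S3`. Starting from `s12` gives the shorter sequence in which the first CIM packet generated by s12 is addressed to `L3`.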
It can be understood that the method flows shown in fig. 5b and fig. 6 are only examples, and for other examples, reference may be made to fig. 3a, fig. 3b, fig. 4a, fig. 4b, and the like, and the embodiments of the present application are not described again.
According to the method provided by the embodiments of the application, the congestion flow information is sent to the TOR switch in the first CIM message by means of the isolation type field, the TOR switch identifies the first congestion flow, and the PFC message is sent to the server. This reduces the spread of PFC messages in the network, avoids the head-of-line blocking problem, avoids deadlock in the network, and improves network performance.
Each of the embodiments shown above has its own emphasis; for methods and the like that are not described in detail in one embodiment, reference may be made to the other embodiments.
The following will describe the apparatus provided by the embodiments of the present application.
Fig. 7 is a schematic structural diagram of a network device according to an embodiment of the present application, where the network device includes a processing unit 701 and a transceiver unit 702.
In some embodiments of the present application, the network device may be configured to perform the steps performed by the first network device in the above-described embodiments.
Exemplarily, the processing unit 701 is configured to obtain a first packet, where the first packet includes first indication information, where the first indication information is used to indicate and adjust a sending rate of a first congestion flow sent by a third network device, the third network device is a source device of the first congestion flow, and the first congestion flow is a congestion flow identified by a second network device when the second network device is congested; the processing unit 701 is further configured to process the first packet.
In a possible implementation manner, the processing unit 701 is further configured to identify a first congestion flow, and isolate the first congestion flow to a congestion queue in the first network device; and generating a first message under the condition that the congestion queue in the first network equipment is congested.
In a possible implementation manner, the transceiving unit 702 is configured to send, to an upstream node of the first network device, a third PFC message when a queue depth of a congestion queue in the first network device is greater than or equal to a priority-based flow control (PFC) threshold, where the third PFC message is used to instruct the congestion queue in the upstream node of the first network device to stop sending the first congestion flow.
In a possible implementation manner, the transceiver 702 is configured to receive a third message, where the third message is used to indicate that a first congestion flow sent by a first network device is isolated to a congestion queue in the first network device, and the first network device and a second network device are different network devices; the processing unit 701 is specifically configured to identify the first congestion flow according to the third packet, and isolate the first congestion flow to a congestion queue in the first network device.
In one possible implementation, the processing unit 701 is specifically configured to identify a first congestion flow from a management queue in a first network device.
In a possible implementation manner, the processing unit 701 is specifically configured to, in a case that a management queue in a first network device is congested, identify a first congested flow from the management queue in the first network device.
In a possible implementation manner, the processing unit 701 is further configured to generate a second packet, where the second packet is used to instruct an upstream node of the first network device to isolate the first congestion flow to a congestion queue in the upstream node of the first network device; the transceiving unit 702 is further configured to send a second packet to an upstream node of the first network device.
In a possible implementation manner, the processing unit 701 is specifically configured to generate the second message when a congestion queue in the first network device is congested.
In a possible implementation manner, the transceiver 702 is configured to receive a third message, where the third message is used to indicate that a first congestion flow sent by a first network device is isolated to a congestion queue in the first network device, and the first network device and a second network device are different network devices; the processing unit 701 is further configured to identify the first congestion flow according to the third packet, and isolate the first congestion flow to a congestion queue in the first network device.
In a possible implementation manner, the processing unit 701 is specifically configured to receive, through the transceiving unit 702, a first packet from a downstream node of the first network device.
In a possible implementation manner, the processing unit 701 is specifically configured to generate a first PFC message according to the first message, where the first PFC message includes duration information for which the queue in which the first congestion flow sent by the third network device is located is stopped; and send the first PFC message to the third network device through the transceiving unit 702.
In a possible implementation manner, the processing unit 701 is specifically configured to send a first packet to a third network device through the transceiver unit; or, the processing unit 701 is specifically configured to generate a fourth message according to the first message, where the fourth message includes fourth indication information, and the fourth indication information is used to indicate that the first congestion flow sent by the third network device is isolated to a congestion queue in the third network device; and sending the fourth message to the third network device through the transceiving unit.
In a possible implementation manner, the processing unit 701 is further configured to obtain a fifth message, where the fifth message includes fifth indication information, and the fifth indication information is used to indicate that the sending rate of the first congestion flow sent by the third network device is increased; or, the fifth indication information is used to indicate to resume the third network device to send the first congestion flow; and processing the fifth message.
In a possible implementation manner, the processing unit 701 is specifically configured to generate a fifth packet if the first congested flow is identified as a non-congested flow; or, generating the fifth message when the congestion queue in the first network device is not congested.
In a possible implementation manner, the processing unit 701 is specifically configured to receive, by the transceiver unit 702, a fifth packet from a downstream node of the first network device.
In a possible implementation manner, the processing unit 701 is specifically configured to generate a second PFC message according to the fifth message, where the second PFC message is used to indicate that a queue where the first congestion flow sent by the third network device is located is recovered; and transmitting the second PFC message to the third network device through the transceiving unit 702.
In a possible implementation manner, the processing unit 701 is specifically configured to send a fifth packet to the third network device by the transceiver unit; or, the processing unit 701 is specifically configured to generate a sixth message according to the fifth message, where the sixth message includes eighth indication information, and the eighth indication information is used to indicate that the first congestion flow sent by the third network device is switched to a non-congestion queue in the third network device; and transmits the sixth message to the third network device through the transceiving unit 702.
In the embodiment of the present application, for specific descriptions of the first packet, the second packet, the third packet, the fourth packet, the first PFC packet, or the second PFC packet, reference may be made to the description in the above illustrated method embodiment, and details are not repeated here.
The descriptions of the transceiver and the processing unit shown in the embodiments of the present application are only examples, and for specific implementations of the transceiver and the processing unit, reference may also be made to the embodiments of the methods shown in the present application, and a detailed description thereof is omitted here. For example, the transceiver unit and the processing unit may be configured to perform the steps performed by the first network device in the above embodiments. Illustratively, the processing unit may be further configured to perform the step of generating the second packet in step 301 and step 302, step 304, and the like shown in fig. 3a, and the transceiving unit may be further configured to perform the step of sending the second packet in step 302, step 305, and the like shown in fig. 3 a. Illustratively, the processing unit may be further configured to execute the steps of generating the PFC message in step 313, step 314, step 317, and the like shown in fig. 3b, which are not listed here.
In other embodiments of the present application, the network device may be configured to perform the steps performed by the third network device in the above-described embodiments.
In a possible implementation manner, the processing unit 701 is configured to obtain a first PFC message, where the first PFC message includes information about the duration for which the queue holding the first congestion flow sent by the third network device is stopped; and to adjust the sending rate of the first congestion flow sent by the third network device according to the first PFC message.
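As a minimal sketch of how such duration information could drive a rate adjustment, assume that the duration is carried as pause quanta in the style of IEEE 802.1Qbb (one quantum equals 512 bit times at the port speed). The proportional back-off rule, the observation window `window_s`, and the floor `min_bps` are assumptions made for illustration; this application does not specify the adjustment formula.

```python
PAUSE_QUANTUM_BITS = 512  # one PFC pause quantum = 512 bit times


def pause_duration_seconds(quanta: int, link_bps: float) -> float:
    """Convert a pause time expressed in quanta to seconds for a given link speed."""
    return quanta * PAUSE_QUANTUM_BITS / link_bps


def adjusted_rate(current_bps: float, quanta: int, link_bps: float,
                  window_s: float, min_bps: float = 1e6) -> float:
    """Hypothetical rule: scale the rate by the fraction of the window not paused."""
    paused = min(pause_duration_seconds(quanta, link_bps), window_s)
    return max(current_bps * (1.0 - paused / window_s), min_bps)
```

A longer indicated pause therefore yields a lower sending rate, and the floor keeps the flow from being starved entirely.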
In a possible implementation manner, the processing unit 701 is configured to obtain a fourth packet, and isolate a first congestion flow sent by a third network device to a congestion queue in the third network device according to the fourth packet (or fourth indication information in the fourth packet).
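The isolation step can be sketched with a toy queue model. The class and method names are hypothetical, and moving packets that are already queued (rather than only redirecting new packets) is an assumption; this application does not prescribe either behavior.

```python
from collections import deque


class EgressQueues:
    """Toy model of a sender with one normal queue and one congestion queue."""

    def __init__(self):
        self.normal = deque()
        self.congestion = deque()
        self.isolated = set()  # flow IDs currently mapped to the congestion queue

    def enqueue(self, flow_id, packet):
        """Place a packet in the queue that its flow is currently mapped to."""
        target = self.congestion if flow_id in self.isolated else self.normal
        target.append((flow_id, packet))

    def isolate(self, flow_id):
        """Map the flow to the congestion queue and move its queued packets, in order."""
        self.isolated.add(flow_id)
        keep = deque()
        for item in self.normal:
            (self.congestion if item[0] == flow_id else keep).append(item)
        self.normal = keep
```

After `isolate` is called, packets of other flows remain in the normal queue, so a pause applied to the congestion queue does not block them.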
In a possible implementation manner, the processing unit 701 is specifically configured to adjust a sending rate of a first congestion flow sent by a third network device according to the first PFC message and the fourth PFC message (or third indication information in the fourth PFC message).
In a possible implementation manner, the processing unit 701 is configured to obtain the first packet, and adjust a sending rate of the first congestion flow sent by the third network device according to the first packet and the first PFC packet.
For example, the processing unit may obtain the first message and the like through the transceiver unit, which is not limited in this embodiment of the application. The specific implementation manner of the processing unit may correspond to the related description in the method embodiment.
In a possible implementation manner, the processing unit 701 is configured to isolate, according to the first packet, a first congestion flow sent by the third network device to a congestion queue in the third network device, and adjust a sending rate of the first congestion flow sent by the third network device according to the first packet and the first PFC packet.
In a possible implementation manner, the transceiving unit 702 is configured to receive a second PFC message, where the second PFC message is used to indicate that the queue holding the first congestion flow sent by the third network device is resumed; and transmission of the first congestion flow is resumed according to the second PFC message.
In a possible implementation manner, the transceiving unit 702 is configured to receive a fifth message; increasing the sending rate of the first congestion flow sent by the third network equipment according to the fifth message; or, according to the fifth message, the first network device is recovered to send the first congestion flow.
In a possible implementation manner, the transceiving unit 702 is configured to receive the sixth packet and send the sixth packet to the third network device.
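The reactions of the third network device described in the implementation manners above can be summarized in a small dispatcher sketch. The message labels, the `Sender` class, and the rate-doubling step are hypothetical; treating the first PFC message as a pause of the flow is likewise a simplifying assumption.

```python
class Sender:
    """Toy state for the source device of the first congestion flow."""

    def __init__(self, rate_bps: float, max_bps: float):
        self.rate_bps = rate_bps
        self.max_bps = max_bps
        self.paused = False
        self.in_congestion_queue = False

    def on_message(self, kind: str) -> None:
        if kind == "first_pfc":      # queue holding the flow is stopped
            self.paused = True
        elif kind == "second_pfc":   # queue holding the flow is resumed
            self.paused = False
        elif kind == "fourth":       # isolate the flow to the congestion queue
            self.in_congestion_queue = True
        elif kind == "fifth":        # increase the sending rate (assumed: double, capped)
            self.rate_bps = min(self.rate_bps * 2, self.max_bps)
        elif kind == "sixth":        # switch the flow back to a non-congestion queue
            self.in_congestion_queue = False
        else:
            raise ValueError(f"unknown message kind: {kind}")
```

The text also allows the fifth message to trigger resumption of sending instead of a rate increase; this sketch models only the rate increase.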
In the embodiment of the present application, for specific descriptions of the first packet, the second packet, the third packet, the fourth packet, the first PFC packet, or the second PFC packet, reference may be made to the description in the above illustrated method embodiment, and details are not repeated here.
The descriptions of the transceiver and the processing unit shown in the embodiments of the present application are only examples, and for specific implementations of the transceiver and the processing unit, reference may also be made to the embodiments of the methods shown in the present application, and a detailed description thereof is omitted here. For example, the transceiver unit and the processing unit may be configured to perform the steps performed by the third network device in the above embodiments. Illustratively, the processing unit may be further configured to execute step 306 shown in fig. 3a, and the transceiver unit may be further configured to execute the receiving step in step 305 shown in fig. 3a, and the like. Illustratively, the processing unit may also be configured to perform step 318 shown in fig. 3b, etc., which are not listed here.
The division of the modules in the embodiments of the present application is schematic and is only one logical function division; in actual implementation, there may be other division manners. In addition, the functional modules or units in the embodiments of the present application may be integrated in one processor, or each may exist alone physically, or two or more modules or units may be integrated in one module or unit. The integrated modules or units may be implemented in the form of hardware, or may be implemented in the form of software functional modules.
In a possible implementation manner, when the network device shown in fig. 7 is used to execute the steps executed by the first network device, the network device may be any form of computer, switch, router, network card, or the like; or any form of apparatus in a computer, switch, router, network card, or the like; or any form of apparatus used in combination with a computer, switch, router, network card, or the like. The processing unit 701 may be one or more processors, and the transceiver unit 702 may be a transceiver. Alternatively, the transceiver unit 702 may comprise a transmitting unit and a receiving unit, where the transmitting unit may be a transmitter and the receiving unit may be a receiver, and the transmitting unit and the receiving unit may be integrated into one device, such as a transceiver. In the embodiment of the present application, the processor and the transceiver may be coupled, and the connection manner between the processor and the transceiver is not limited in the embodiment of the present application.
In a possible implementation manner, when the network device shown in fig. 7 is used to execute the steps executed by the third network device, the network device may be any form of computer, server, network card, or the like; or any form of apparatus in a computer, server, network card, or the like; or any form of apparatus used in combination with a computer, server, network card, or the like. The processing unit 701 may be one or more processors, and the transceiver unit 702 may be a transceiver. Alternatively, the transceiver unit 702 may comprise a transmitting unit and a receiving unit, where the transmitting unit may be a transmitter and the receiving unit may be a receiver, and the transmitting unit and the receiving unit may be integrated into one device, such as a transceiver. In the embodiment of the present application, the processor and the transceiver may be coupled, and the connection manner between the processor and the transceiver is not limited in the embodiment of the present application.
As shown in fig. 8, the network device 80 includes one or more processors 820 and a transceiver 810.
Optionally, the processor and the transceiver may be configured to perform functions or operations performed by the network device as the first network device. Illustratively, the processor is configured to generate a first message, a second message, a third message, a fourth message, a first PFC message or a second PFC message, etc.; the transceiver is used for transmitting the first message, the second message, the third message, the fourth message, the first PFC message or the second PFC message and the like.
Alternatively, the processor and the transceiver may be configured to perform functions or operations, etc., performed when the network device is used as a third network device. For example, the processor may be configured to adjust a transmission rate of the first congestion flow it transmits, or to isolate the first congestion flow it transmits to the congestion queue. The transceiver may be configured to receive a first PFC message, a second PFC message, a first message, a fourth message, or the like.
It is understood that for the functions or operations performed by the transceiver and/or the processor, etc., reference may be made to the various embodiments shown in fig. 7, or, alternatively, reference may also be made to the method embodiments shown in fig. 3a to 4b, fig. 5b, or fig. 6, etc., which are not described in detail herein.
In various implementations of the network device shown in fig. 8, the transceiver may include a receiver to perform the function (or operation) of receiving and a transmitter to perform the function (or operation) of transmitting. The transceiver is used for communicating with other devices/apparatuses over a transmission medium.
Optionally, network device 80 may also include one or more memories 830 for storing program instructions and/or data. The memory 830 is coupled with the processor 820. The coupling in the embodiments of the present application is an indirect coupling or a communication connection between devices, units or modules, and may be an electrical, mechanical or other form for information interaction between the devices, units or modules. The processor 820 may operate in conjunction with the memory 830. Processor 820 may execute program instructions stored in memory 830. Optionally, at least one of the one or more memories may be included in the processor.
The specific connection medium among the transceiver 810, the processor 820 and the memory 830 is not limited in the embodiments of the present application. In fig. 8, the memory 830, the processor 820 and the transceiver 810 are connected by a bus 840, the bus is represented by a thick line in fig. 8, and the connection manner among other components is only schematically illustrated and is not limited. The bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one thick line is shown in FIG. 8, but this is not intended to represent only one bus or type of bus.
In the embodiments of the present application, the processor may be a general-purpose processor, a digital signal processor, an application specific integrated circuit, a field programmable gate array or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like, which may implement or execute the methods, steps, and logic blocks disclosed in the embodiments of the present application. A general purpose processor may be a microprocessor or any conventional processor or the like. The steps of the method disclosed in connection with the embodiments of the present application may be directly implemented by a hardware processor, or implemented by a combination of hardware and software modules in a processor.
In the embodiment of the present application, the memory may include, but is not limited to, a nonvolatile memory such as a hard disk drive (HDD) or a solid-state drive (SSD), a random access memory (RAM), an erasable programmable read-only memory (EPROM), a read-only memory (ROM), or a compact disc read-only memory (CD-ROM). The memory is any storage medium that can be used to carry or store program code in the form of instructions or data structures and that can be read and/or written by a computer (such as the network devices shown and described herein), but is not limited thereto. The memory in the embodiments of the present application may also be circuitry or any other device capable of performing a storage function, for storing program instructions and/or data.
It can be understood that the network device shown in the embodiment of the present application may also have more components than those shown in fig. 8, and the embodiment of the present application does not limit this.
It will be appreciated that the methods performed by the processor and transceiver shown above are merely examples, and reference may be made to the methods described above for the steps specifically performed by the processor and transceiver.
In another possible implementation manner, when the network device is a system on chip, such as a system on chip in a switch, a router, a network card, or the like, the processing unit 701 may be one or more logic circuits, and the transceiving unit 702 may be an input/output interface, which is also referred to as a communication interface, an interface circuit, an interface, or the like. Alternatively, the transceiving unit 702 may comprise a transmitting unit and a receiving unit, where the transmitting unit may be an output interface and the receiving unit may be an input interface, and the transmitting unit and the receiving unit may be integrated into one unit, such as an input/output interface.
As shown in fig. 9, the network device includes a logic circuit 901 and an interface 902. That is, the processing unit 701 may be implemented by the logic circuit 901, and the transceiver unit 702 may be implemented by the interface 902.
The logic circuit 901 may be a chip, a processing circuit, an integrated circuit, or a system on chip (SoC), and the interface 902 may be a communication interface, an input/output interface, or the like. In the embodiments of the present application, the logic circuit and the interface may be coupled to each other. The specific connection manner of the logic circuit and the interface is not limited in the embodiments of the present application.
Alternatively, the logic circuit and the interface may be configured to perform a function or an operation performed by the network device as the first network device. Illustratively, the logic circuit is configured to generate a first packet, a second packet, a third packet, a fourth packet, a first PFC packet, or a second PFC packet; the interface is used for outputting the first message, the second message, the third message, the fourth message, the first PFC message or the second PFC message. For another example, the logic circuit may further obtain the first message, the second message, the third message, and the like through the interface.
Alternatively, the logic circuit and the interface may be configured to perform a function or an operation performed by the network device as a third network device. Illustratively, the logic circuit is configured to adjust a sending rate of the first congestion flow output by the third network device, or to isolate the first congestion flow output by the third network device to the congestion queue. The interface may be used to input a first PFC message, a second PFC message, a first message, or a fourth message, etc.
It is understood that for the functions or operations performed by the interface and/or logic circuit, etc., reference may be made to the various embodiments shown in fig. 7, or, alternatively, reference may also be made to the method embodiments shown in fig. 3 a-4 b, 5b, or 6, etc., and will not be described in detail here.
In the embodiment of the present application, for specific descriptions of the first packet, the second packet, the third packet, the fourth packet, the first PFC packet, or the second PFC packet, reference may be made to the description in the above illustrated method embodiment, and details are not repeated here.
Furthermore, the present application also provides a computer program for implementing the operations and/or processes performed by the first network device in the methods provided by the present application.
The present application also provides a computer program for implementing the operations and/or processes performed by the third network device in the methods provided herein.
The present application also provides a computer-readable storage medium having stored therein computer code, which, when run on a computer, causes the computer to perform the operations and/or processes of the methods provided herein performed by the first network device.
The present application also provides a computer-readable storage medium having stored therein computer code, which, when run on a computer, causes the computer to perform the operations and/or processes of the methods provided herein performed by a third network device.
The present application also provides a computer program product comprising computer code or a computer program which, when run on a computer, causes the operations and/or processes performed by the first network device in the methods provided herein to be performed.
The present application also provides a computer program product comprising computer code or a computer program which, when run on a computer, causes the operations and/or processes performed by the third network device in the methods provided herein to be performed.
The embodiment of the present application further provides a communication system, where the communication system includes a first network device and a second network device, and optionally, the communication system may further include a third network device. For specific steps performed by the first network device, the second network device, and the third network device, reference may be made to the foregoing embodiments, and details are not described here.
In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative: the division of the units is only one logical division, and there may be other divisions in actual implementation; for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted or not implemented. In addition, the shown or discussed mutual coupling, direct coupling, or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may also be an electrical, mechanical or other form of connection.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the technical effects of the solutions provided by the embodiments of the present application.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present application essentially, or the part thereof contributing to the prior art, or all or part of the technical solution, may be embodied in the form of a software product, which is stored in a readable storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present application. The aforementioned readable storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
The above description covers only specific embodiments of the present application, but the protection scope of the present application is not limited thereto. Any change or substitution that a person skilled in the art can readily conceive of within the technical scope disclosed in the present application shall fall within the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.