Disclosure of Invention
In view of this, the present application provides an intelligent network card system, an intelligent network card, a network device, a control plane switching method, and a forwarding plane switching method, so as to solve the technical problems of poor network stability and low network energy efficiency ratio in a network device. The application provides the following technical solutions:
in a first aspect, the present application provides an intelligent network card system, which is applied to a network device, and includes a network card module, a programmable integrated circuit module, and a main processor in the network device, wherein a processor unit is integrated in the programmable integrated circuit module;
the network card module is connected with the programmable integrated circuit module through a communication bus;
the network card module and the programmable integrated circuit module are both connected with the main processor;
an information synchronization channel is established between the processor unit and the main processor;
the main processor is a first control plane, and the processor unit is a second control plane; the network card module is a first forwarding plane, the programmable integrated circuit module is a second forwarding plane, and the main processor is a third forwarding plane.
Optionally, the priority of the first control plane is higher than the priority of the second control plane;
the first forwarding plane has a higher priority than the second forwarding plane, and the second forwarding plane has a higher priority than the third forwarding plane.
Optionally, the current main control plane is the first control plane, and the current standby control plane is the second control plane, or the current main control plane is the second control plane, and the current standby control plane is the first control plane; the current master control plane and the current standby control plane are both used for:
detecting whether the current main control plane and the current standby control plane are abnormal or not;
when the current standby control plane is detected to be abnormal, controlling the current standby control plane to restart;
and when detecting that the current main control plane is abnormal, controlling the current standby control plane to be switched to a new main control plane, controlling the current main control plane to be switched to a new standby control plane, and controlling the current main control plane to be restarted.
Optionally, the current master control plane and the current standby control plane are both further configured to:
when it is detected that the abnormality of the current standby control plane has been eliminated, synchronizing the control information of the current main control plane to the current standby control plane;
and when it is detected that the abnormality of the current main control plane has been eliminated, synchronizing the control information of the new main control plane to the current main control plane.
Optionally, the current master control plane is configured to, when it is detected that the current forwarding plane is abnormal, migrate the forwarding rule from the current forwarding plane to a forwarding plane with a next priority of the current forwarding plane according to a sequence of priorities from high to low;
wherein the current primary control plane is the first control plane or the second control plane.
Optionally, the current master control plane is configured to, when it is detected that a software failure occurs in the first forwarding plane, migrate a forwarding rule from the first forwarding plane to the second forwarding plane, and control the first forwarding plane to restart the system; migrating the forwarding rule from the second forwarding plane to the first forwarding plane after detecting that the software failure of the first forwarding plane is eliminated;
the current master control plane is configured to, when detecting that a software failure occurs in the second forwarding plane, migrate the forwarding rule from the second forwarding plane to the third forwarding plane, and notify an upper application to perform processing;
and the current main control plane is used for informing an upper layer application to process when the third forwarding plane is monitored to have a software fault.
Optionally, the current master control plane is configured to migrate the forwarding rule from the first forwarding plane to the second forwarding plane when detecting that the resource of the first forwarding plane is insufficient, and migrate the forwarding rule from the second forwarding plane back to the first forwarding plane when detecting that the resource of the first forwarding plane becomes sufficient;
the current primary control plane is configured to migrate the forwarding rule from the second forwarding plane to the third forwarding plane when detecting that the resource of the second forwarding plane is insufficient, and migrate the forwarding rule from the third forwarding plane back to the second forwarding plane when detecting that the resource of the second forwarding plane becomes sufficient.
In a second aspect, the present application further provides an intelligent network card, which is applied to a network device, and includes a network card module and a programmable integrated circuit module, wherein a processor unit is integrated in the programmable integrated circuit module;
the network card module is connected with the programmable integrated circuit module through a communication bus;
the network card module and the programmable integrated circuit module are both connected with a main processor in the network device;
an information synchronization channel is established between the processor unit and the main processor;
the main processor is a first control plane, and the processor unit is a second control plane; the network card module is a first forwarding plane, the programmable integrated circuit module is a second forwarding plane, and the main processor is a third forwarding plane.
In a third aspect, the present application further provides a network device, including the intelligent network card system described in any one of the possible implementation manners of the first aspect.
In a fourth aspect, the present application further provides a method for switching a control plane of an intelligent network card, where the method is applied to a main processor or a processor unit of an intelligent network card system according to any one of possible implementation manners of the first aspect, and the method includes:
detecting whether the current main control plane and the current standby control plane are abnormal or not;
when the current standby control plane is detected to be abnormal, controlling the current standby control plane to restart;
and when detecting that the current main control plane is abnormal, controlling the current standby control plane to be switched to a new main control plane, controlling the current main control plane to be switched to a new standby control plane, and controlling the current main control plane to be restarted.
Optionally, the method further comprises: when it is detected that the abnormality of the current standby control plane has been eliminated, synchronizing the control information of the current main control plane to the current standby control plane;
and when it is detected that the abnormality of the current main control plane has been eliminated, synchronizing the control information of the new main control plane to the current main control plane.
In a fifth aspect, the present application further provides a forwarding plane switching method for an intelligent network card, which is applied to the intelligent network card system described in any one of the possible implementation manners of the first aspect, and the method includes:
when the main control plane detects that the current forwarding plane is abnormal, the forwarding rules are transferred from the current forwarding plane to a target forwarding plane with the next priority of the current forwarding plane according to the sequence of the priorities from high to low;
and when it is detected that the abnormal forwarding plane has recovered, migrating the forwarding rule from the target forwarding plane back to the current forwarding plane.
The intelligent network card system comprises a network card module, a programmable integrated circuit module and a main processor, wherein the network card module and the programmable integrated circuit module form an intelligent network card, and the main processor is a processor in network equipment applying the intelligent network card; the network card module is connected with the programmable integrated circuit module, and the network card module and the programmable integrated circuit are both connected with the main processor; thus, the main processor and the processor unit in the programmable integrated circuit module can be used as the control plane of the intelligent network card; the network card module, the programmable integrated circuit and the main processor can be used as a forwarding plane of the intelligent network card. Compared with the traditional scheme only depending on a host CPU, the intelligent network card system is provided with two control planes and three forwarding planes, and when one control plane or one forwarding plane is abnormal, other control planes or other forwarding planes can be started; the reliability of the intelligent network card is improved; moreover, a forwarding plane with high network energy efficiency ratio can be selected according to actual requirements, so that the network energy efficiency ratio of the intelligent network card is improved.
Detailed Description
Before describing the embodiments provided in the present application in detail, the terms control plane and forwarding plane are first introduced:
the control plane refers to the part of the system that transmits instructions and computes table entries; it provides the network information and forwarding lookup table entries that are needed before data can be processed and forwarded;
the forwarding plane refers to the part of the system that encapsulates and forwards data packets. Operations such as receiving, decapsulating, encapsulating, and forwarding data packets all fall within the scope of the forwarding plane.
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
Referring to fig. 1, a schematic block diagram of an intelligent network card system provided in an embodiment of the present application is shown, where the system includes a network card module 1, a programmable integrated circuit module 2, and a main processor 3;
the network card module 1 and the programmable integrated circuit module 2 form an intelligent network card, and the intelligent network card is generally applied to network devices such as servers. The main processor 3 is a processor within the network device.
In an embodiment of the present Application, the Network Card module 1 may adopt a Network Card (NIC) chip, where the NIC chip belongs to an Application Specific Integrated Circuit (ASIC) chip, and is an ASIC designed specifically for implementing a Network Card function.
The Programmable integrated circuit module 2 may be a Field Programmable Gate Array (FPGA) chip, and a processor unit is embedded therein.
The main processor 3 is a Host CPU in a network device, wherein the network device may be a server, and the server may be one server device or a server cluster composed of a plurality of server devices.
The network card module 1 and the programmable integrated circuit module 2 are connected through a communication bus, and the programmable integrated circuit module 2 is provided with a network interface for connecting an external network.
The network card module 1 and the programmable integrated circuit module 2 are both connected with the main processor 3 through a communication bus, and an information synchronization channel is established between the processor unit in the programmable integrated circuit module 2 and the main processor 3 for synchronizing control information.
For example, the NIC chip and the FPGA chip may be connected by an Ethernet bus, and the NIC chip and the FPGA chip are each connected with the Host CPU of the server through a PCIe bus.
The initialization process of the intelligent network card is described below by taking an NIC chip, an FPGA chip and a Host CPU as examples:
during initialization, configuring both a Host CPU and a CPU embedded in an FPGA chip as a control plane of an intelligent network card; moreover, a user can configure any one of the control planes as a main control plane and the other control plane as a standby control plane according to the requirement of the user; for example, a Host CPU can be configured as a main control plane, and a CPU embedded in an FPGA is configured as a standby control plane; or, configuring a CPU embedded in the FPGA as a main control plane, and configuring a Host CPU as a standby control plane.
And establishing an information synchronization channel between the Host CPU and the CPU embedded in the FPGA for synchronizing control information.
Configuring an NIC chip, an FPGA and a Host CPU as forwarding planes, wherein the NIC chip is a first forwarding plane, the FPGA is a second forwarding plane and the Host CPU is a third forwarding plane;
In addition, as an ASIC chip, the NIC chip has the advantages of high message-processing bandwidth and low latency, but has the disadvantages that it can only process messages of low complexity and that its resources are limited; compared with the NIC chip, the FPGA has slightly lower message-processing bandwidth and slightly higher latency, but can process more complex messages and has more resources than the NIC chip; compared with the NIC chip and the FPGA, the Host CPU has the lowest message-processing bandwidth and the highest latency, but can process complex messages and has the most abundant resources.
The user may configure the priorities of the three forwarding planes according to actual requirements, for example, if the user has a relatively high requirement on the delay, the priorities may be, in order from high to low: a first forwarding plane, a second forwarding plane, and a third forwarding plane.
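The initialization choices described above can be written down as a small configuration object. The following Python sketch is illustrative only: the names `ControlPlane`, `ForwardingPlane`, `SmartNicConfig`, and `latency_first_config` are hypothetical and do not come from the application, and no real driver or firmware interface is implied.

```python
from dataclasses import dataclass
from enum import Enum


class ControlPlane(Enum):
    HOST_CPU = "Host CPU"                     # first control plane
    FPGA_EMBEDDED_CPU = "FPGA embedded CPU"   # second control plane


class ForwardingPlane(Enum):
    NIC = "NIC chip"        # first forwarding plane
    FPGA = "FPGA"           # second forwarding plane
    HOST_CPU = "Host CPU"   # third forwarding plane


@dataclass(frozen=True)
class SmartNicConfig:
    """Hypothetical container for the user's initialization choices."""
    main_control: ControlPlane
    standby_control: ControlPlane
    forwarding_priority: tuple  # forwarding planes, highest priority first

    def __post_init__(self):
        # Either control plane may be main, but the two roles must differ.
        if self.main_control is self.standby_control:
            raise ValueError("main and standby control planes must differ")
        # The priority order must name each forwarding plane exactly once.
        if (len(self.forwarding_priority) != len(ForwardingPlane)
                or set(self.forwarding_priority) != set(ForwardingPlane)):
            raise ValueError("priority order must list every forwarding plane once")


def latency_first_config() -> SmartNicConfig:
    """Example from the text: a latency-sensitive user prefers NIC, then FPGA, then Host CPU."""
    return SmartNicConfig(
        main_control=ControlPlane.HOST_CPU,
        standby_control=ControlPlane.FPGA_EMBEDDED_CPU,
        forwarding_priority=(
            ForwardingPlane.NIC,
            ForwardingPlane.FPGA,
            ForwardingPlane.HOST_CPU,
        ),
    )
```

A user with different requirements would pass a different `forwarding_priority` order or swap the main and standby control planes; the validation only enforces that the configuration is complete and unambiguous.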
After the initialization configuration process is completed, each of the two control planes monitors whether it or the other control plane is abnormal, and if either control plane is abnormal, the control plane without the abnormality takes over and continues the control flow; the main control plane also monitors whether the currently used forwarding plane is abnormal, and if it is, the forwarding rules are migrated from the current forwarding plane to the forwarding plane with the next priority according to the priority order of the forwarding planes, so that data can still be forwarded and processed normally.
Compared with the traditional scheme only depending on a host CPU, the intelligent network card system provided by the embodiment has two control planes and three forwarding planes, and when any one control plane or forwarding plane is abnormal, other control planes or forwarding planes can be started; the reliability of the intelligent network card is improved; moreover, a forwarding plane with high network energy efficiency ratio can be selected according to actual requirements, so that the network energy efficiency ratio of the intelligent network card is improved.
Referring to fig. 2, a flowchart of a control plane switching method according to an embodiment of the present application is shown. The method is applied to the first control plane and the second control plane. In this embodiment, each of the two control planes can monitor whether it or the other control plane is abnormal, and when either control plane is detected to be abnormal, the other control plane takes over the control function.
As shown in fig. 2, the method may include the steps of:
S110, detecting whether the current main control plane and the current standby control plane are abnormal or not; if the current standby control plane is detected to be abnormal, executing S120; if the current main control plane is detected to be abnormal, executing S140.
The main control plane and the standby control plane are configured when the system is initialized, and a user can configure the main control plane and the standby control plane according to the actual requirements of the user. The current main control plane is a main control plane when the current detection action is executed, and the current standby control plane is a standby control plane when the current detection action is executed.
In an embodiment of the application, the main control plane is a Host CPU, and the standby control plane is an embedded CPU of an FPGA, so that both the Host CPU and the embedded CPU of the FPGA can monitor whether the Host CPU and the embedded CPU of the FPGA are abnormal or not.
The abnormal conditions of the control plane include software faults and insufficient resources that may occur on the CPU; for example, software faults include crashes, process exceptions, and program deadlocks, and insufficient resources include a shortage of CPU or memory resources.
S120, controlling the current standby control plane to restart.
When the current standby control plane is detected to be abnormal, the current standby control plane is controlled to restart.
If the main control plane detects that the standby control plane is abnormal, it notifies the standby control plane to restart so as to eliminate the abnormality.
If the standby control plane detects that it is itself abnormal, it restarts itself automatically so as to eliminate the abnormality.
S130, if the abnormality of the current standby control plane is eliminated, the current main control plane synchronizes the control information to the current standby control plane.
If the abnormality of the current standby control plane is eliminated by the restart, the control information of the current main control plane is synchronized to the current standby control plane through the information synchronization channel, so that the current standby control plane can take over the control process if the current main control plane becomes abnormal.
S140, controlling the current main control plane to restart, and controlling the current standby control plane to become the new main control plane.
If the current main control plane is detected to be abnormal, the current standby control plane is switched to be the new main control plane, the current main control plane is switched to be the new standby control plane, and the current main control plane is controlled to restart.
In the embodiment of the application, the two actions of controlling the current main control plane to restart and controlling the current standby control plane to be changed into a new main control plane can be executed in parallel or in sequence, for example, the restart can be controlled first, and then the main control plane is switched; or, the main control plane may be switched first, and then the restart is controlled.
In one embodiment of the application, the current main control plane is a Host CPU, and the current standby control plane is an embedded CPU of an FPGA;
if the Host CPU detects that it is itself abnormal, the embedded CPU of the FPGA becomes the main control plane, and at the same time the Host CPU restarts to eliminate the abnormality.
If the embedded CPU of the FPGA detects that the Host CPU is abnormal, the embedded CPU of the FPGA becomes the main control plane and notifies the Host CPU to restart to eliminate the abnormality.
S150, if the abnormality of the current main control plane is eliminated, the new main control plane synchronizes the control information to the current main control plane, which now serves as the standby control plane.
Still taking as an example the case in which the current main control plane is the Host CPU and the current standby control plane is the embedded CPU of the FPGA: after the abnormality of the Host CPU is eliminated, the control information in the embedded CPU of the FPGA is synchronized to the Host CPU.
In the control plane switching method provided in this embodiment, both the two control planes can monitor the abnormality of the control plane itself and the other control plane, and if it is monitored that the main control plane is abnormal, the standby control plane is changed into the main control plane, and the control plane with the abnormality is controlled to restart. And if the backup control plane is detected to be abnormal, restarting the backup control plane and synchronizing the control information of the main control plane to the backup control plane so that the backup control plane can enter a main control state at any time. The method realizes redundancy control through two control planes, and improves the reliability of the intelligent network card.
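Steps S110 to S150 can be sketched as a small role-tracking object. This is a minimal illustration under stated assumptions, not the implementation of the application: the class `ControlPlanePair`, its methods, and the dictionary used to stand in for control information are all hypothetical, restarts are only recorded in a log rather than performed, and the case in which both control planes are abnormal at once is not covered because the text does not describe it.

```python
class ControlPlanePair:
    """Tracks which of two control planes is main and which is standby."""

    def __init__(self, main, standby):
        self.main = main
        self.standby = standby
        # Stand-in for the control information held by each plane.
        self.control_info = {main: {}, standby: {}}
        # Ordered record of the actions taken, for illustration only.
        self.log = []

    def on_abnormal(self, plane):
        """S110 dispatch: react to an abnormality detected on one plane."""
        if plane == self.standby:
            # S120: the standby plane is simply restarted.
            self.log.append(("restart", plane))
        elif plane == self.main:
            # S140: swap roles, then restart the failed (former main) plane.
            failed = self.main
            self.main, self.standby = self.standby, failed
            self.log.append(("switch", self.main))
            self.log.append(("restart", failed))

    def on_recovered(self, plane):
        """S130 / S150: once the restarted plane is healthy, copy the
        main plane's control information to it over the sync channel."""
        if plane == self.standby:
            self.control_info[plane] = dict(self.control_info[self.main])
            self.log.append(("sync", self.main, plane))
```

Because a failed main control plane is demoted to standby in S140, the same `on_recovered` path serves both S130 and S150: in either case the plane that has just recovered is the standby, and it receives a copy of the current main plane's control information.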
Referring to fig. 3, a flowchart of a forwarding plane switching method provided in an embodiment of the present application is shown, where the method is applied to the main control plane shown in fig. 1. And the main control plane monitors whether the three forwarding planes are abnormal or not and migrates the forwarding rules to the forwarding planes without the abnormality.
In this embodiment, an NIC chip is taken as a first forwarding plane, an FPGA is taken as a second forwarding plane, and a Host CPU is taken as a third forwarding plane, and the priority of the NIC chip is higher than that of the FPGA, and the priority of the FPGA is higher than that of the Host CPU.
As shown in fig. 3, the method may include the steps of:
S210, when the main control plane detects that the current forwarding plane is abnormal, the forwarding rules are migrated from the current forwarding plane to a target forwarding plane, namely the forwarding plane with the next priority after the current forwarding plane in order of priority from high to low.
The current forwarding plane refers to the forwarding plane that is forwarding data according to the forwarding rules at the time the main control plane performs the step of detecting whether a forwarding plane is abnormal.
S220, when the current forwarding plane is detected to have recovered, the forwarding rules are migrated from the target forwarding plane back to the current forwarding plane.
That is, when the main control plane detects that the current forwarding plane has returned to normal, the forwarding rules are migrated from the target forwarding plane back to the current forwarding plane.
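The priority-ordered fallback of S210 and the return migration of S220 can be sketched as follows. The class `ForwardingSelector` and its method names are hypothetical. The sketch also assumes that rules return only to the plane most recently migrated away from, which is one possible reading of the text and not something it states explicitly.

```python
class ForwardingSelector:
    """Selects the active forwarding plane from a priority-ordered list."""

    def __init__(self, priority):
        self.priority = list(priority)   # highest priority first
        self.active = self.priority[0]
        self.left_behind = []            # planes migrated away from, oldest first

    def on_abnormal(self):
        """S210: move the rules to the next-priority plane.
        Returns the new active plane, or None if no lower-priority plane exists."""
        idx = self.priority.index(self.active)
        if idx + 1 >= len(self.priority):
            return None
        self.left_behind.append(self.active)
        self.active = self.priority[idx + 1]
        return self.active

    def on_recovered(self, plane):
        """S220: move the rules back once the plane last migrated away from
        has recovered. Returns the active plane after the call."""
        if self.left_behind and self.left_behind[-1] == plane:
            self.left_behind.pop()
            self.active = plane
        return self.active
```

With the priority order `["nic", "fpga", "host_cpu"]`, an abnormality on the NIC chip makes the FPGA active, and a later recovery of the NIC chip makes the NIC chip active again.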
Referring to fig. 4, a flowchart of a forwarding plane switching method provided in an embodiment of the present application is shown, where the embodiment will describe in detail a switching control process between forwarding planes; the method may be applied within a main processor or processor unit as shown in fig. 1.
As shown in fig. 4, the method may include the steps of:
S310, if the main control plane detects that the first forwarding plane is abnormal, judging the abnormality type of the first forwarding plane;
if the abnormality type is software failure, executing S320; if the abnormality type is resource shortage, executing S340;
in one embodiment of the present application, the software failures that may occur in a forwarding plane may include a crash, a process exception, a program deadlock, and the like; the resource shortage may include a shortage of memory or CPU resources.
The primary control plane in this embodiment may be the first control plane or the second control plane.
S320, migrating the forwarding rule from the first forwarding plane to the second forwarding plane, and controlling the first forwarding plane to restart;
S330, when it is detected that the software failure of the first forwarding plane has been eliminated, migrating the forwarding rule from the second forwarding plane back to the first forwarding plane.
S340, migrating the forwarding rule from the first forwarding plane to the second forwarding plane;
S350, when it is detected that the resources of the first forwarding plane have become sufficient, migrating the forwarding rule from the second forwarding plane back to the first forwarding plane.
S360, if the main control plane detects that the second forwarding plane is abnormal, judging the abnormality type of the second forwarding plane;
if the abnormality type is software failure, executing S370; if the abnormality type is resource shortage, executing S380;
S370, migrating the forwarding rule from the second forwarding plane to the third forwarding plane, and notifying the upper layer application to perform processing;
S380, migrating the forwarding rule from the second forwarding plane to the third forwarding plane, and when it is detected that the resources of the second forwarding plane have become sufficient, migrating the forwarding rule from the third forwarding plane back to the second forwarding plane.
S390, if the main control plane detects that a software failure occurs in the third forwarding plane, notifying the upper layer application to perform processing.
In the forwarding plane switching method provided by this embodiment, the main control plane monitors the abnormal conditions of the three forwarding planes, and when it is monitored that the currently used forwarding plane is abnormal, the forwarding rule is migrated from the current forwarding plane to the forwarding plane of the next priority of the current forwarding plane, so as to ensure that network forwarding is performed normally, thereby improving the reliability of the intelligent network card system.
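The per-plane, per-fault decisions of S310 to S390 can be summarized as two pure functions that return the actions the main control plane would take. The sketch below is hypothetical in every name (`Fault`, `handle_abnormal`, `handle_recovered`, and the action tuples) and only encodes the cases the text describes. A resource shortage on the third forwarding plane and a return migration after a software failure of the second forwarding plane are not described in the text, so the sketch returns no action for them.

```python
from enum import Enum


class Fault(Enum):
    SOFTWARE_FAILURE = 1
    RESOURCE_SHORTAGE = 2


# Hypothetical labels for the first, second, and third forwarding planes.
NIC, FPGA, HOST_CPU = "nic", "fpga", "host_cpu"


def handle_abnormal(plane, fault):
    """Return the ordered actions taken when `plane` shows `fault`."""
    if plane == NIC:
        actions = [("migrate", NIC, FPGA)]                 # S320 / S340
        if fault is Fault.SOFTWARE_FAILURE:
            actions.append(("restart", NIC))               # S320 only
        return actions
    if plane == FPGA:
        actions = [("migrate", FPGA, HOST_CPU)]            # S370 / S380
        if fault is Fault.SOFTWARE_FAILURE:
            actions.append(("notify_upper_layer", FPGA))   # S370 only
        return actions
    if plane == HOST_CPU and fault is Fault.SOFTWARE_FAILURE:
        return [("notify_upper_layer", HOST_CPU)]          # S390
    return []  # case not described in the text


def handle_recovered(plane, fault):
    """Return the migrate-back actions once `fault` on `plane` is cleared."""
    if plane == NIC:
        return [("migrate", FPGA, NIC)]                    # S330 / S350
    if plane == FPGA and fault is Fault.RESOURCE_SHORTAGE:
        return [("migrate", HOST_CPU, FPGA)]               # second half of S380
    return []  # case not described in the text
```

For example, `handle_abnormal(NIC, Fault.SOFTWARE_FAILURE)` yields a migration to the FPGA followed by a restart of the NIC chip, while the same call with `Fault.RESOURCE_SHORTAGE` yields only the migration.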
On the other hand, the application also provides an embodiment of the control plane switching device.
Referring to fig. 5, a block diagram of a control plane switching device according to an embodiment of the present application is shown, where the control plane switching device is applied to the main processor or to the embedded processor unit of the programmable integrated circuit module 2.
As shown in fig. 5, the apparatus may include: a first detecting unit 110, a first restarting unit 120, a first synchronizing unit 130, a second restarting unit 140, a first switching unit 150, a second synchronizing unit 160;
The first detecting unit 110 is configured to detect whether the main control plane and the standby control plane are abnormal.
If the standby control plane is detected to be abnormal, the first restarting unit 120 is triggered (corresponding to S120); if the main control plane is detected to be abnormal, the second restarting unit 140 and the first switching unit 150 are triggered (corresponding to S140).
The first restarting unit 120 is configured to control the standby control plane to restart when it is detected that the standby control plane is abnormal.
The first synchronizing unit 130 is configured to synchronize the control information of the main control plane to the standby control plane when the abnormality of the standby control plane is eliminated.
The second restarting unit 140 is configured to control the main control plane to restart.
The first switching unit 150 is configured to switch the current standby control plane to be the new main control plane.
The second synchronizing unit 160 is configured to synchronize the control information of the new main control plane, which has no abnormality, to the former main control plane when the abnormality of the former main control plane is eliminated.
In the control plane switching apparatus provided in this embodiment, both the two control planes can monitor the abnormality of the control plane itself and the other control plane, and if it is monitored that the main control plane is abnormal, the standby control plane is changed into the main control plane, and the control plane with the abnormality is controlled to restart. And if the backup control plane is detected to be abnormal, restarting the backup control plane and synchronizing the control information of the main control plane to the backup control plane so that the backup control plane can enter a main control state at any time. The device realizes redundancy control through two control planes, and improves the reliability of the intelligent network card.
Referring to fig. 6, a block diagram of a forwarding plane switching apparatus according to an embodiment of the present application is shown. The apparatus is applied to the main processor, or to the embedded processor unit of the programmable integrated circuit module 2, shown in fig. 1.
As shown in fig. 6, the apparatus may include:
a second detecting unit 210, configured to detect an abnormality type of the first forwarding plane when detecting that the first forwarding plane has an abnormality;
the exception types include software failures and resource shortages.
A first migration unit 220, configured to migrate the forwarding rule from the first forwarding plane to the second forwarding plane when the first forwarding plane has a software failure;
and a third restart unit 230 for controlling the first forwarding plane to restart.
A second migration unit 240, configured to migrate the forwarding rule from the second forwarding plane back to the first forwarding plane after detecting that the failure of the first forwarding plane is eliminated.
A third migration unit 250, configured to migrate the forwarding rule from the first forwarding plane to the second forwarding plane when the first forwarding plane has insufficient resources.
A fourth migration unit 260, configured to migrate the forwarding rule from the second forwarding plane back to the first forwarding plane when the resource of the first forwarding plane is detected to be sufficient.
A third detecting unit 270, configured to determine an exception type of the second forwarding plane when it is detected that the second forwarding plane has an exception.
A fifth migration unit 280, configured to migrate the forwarding rule from the second forwarding plane to the third forwarding plane when the second forwarding plane has a software failure, and notify the upper layer application to perform processing.
A sixth migrating unit 290, configured to migrate the forwarding rule from the second forwarding plane to the third forwarding plane when the second forwarding plane has insufficient resources.
A seventh migrating unit 2100, configured to migrate the forwarding rules from the third forwarding plane back to the second forwarding plane when it is detected that the resources of the second forwarding plane become sufficient.
A fourth detecting unit 2110, configured to notify an upper-layer application to perform processing when detecting that a software failure occurs in the third forwarding plane.
In the forwarding plane switching device provided in this embodiment, the main control plane monitors the abnormal conditions of the three forwarding planes, and when it is monitored that the currently used forwarding plane is abnormal, the forwarding rule is migrated from the current forwarding plane to the forwarding plane of the next priority of the current forwarding plane, so as to ensure that network forwarding is performed normally, thereby improving the reliability of the intelligent network card system.
In another aspect, the present application further provides an intelligent network card, where the intelligent network card includes a network card module and a programmable integrated circuit module in the intelligent network card system shown in fig. 1, and please refer to the related description in the intelligent network card system for related contents.
In another aspect, the present application further provides a network device (e.g., a server), where the network device includes the intelligent network card system shown in fig. 1.
While, for purposes of simplicity of explanation, the foregoing method embodiments have been described as a series of acts or a combination of acts, it will be appreciated by those skilled in the art that the present application is not limited by the order of the acts described, as some steps may occur in other orders or concurrently with other steps in accordance with the application. Further, those skilled in the art should also appreciate that the embodiments described in the specification are preferred embodiments and that the acts and modules referred to are not necessarily required in this application.
The embodiments in the present description are described in a progressive manner: each embodiment focuses on its differences from the other embodiments, and the same or similar parts among the embodiments may be referred to one another. Because the devices disclosed in the embodiments correspond to the methods disclosed in the embodiments, the devices are described briefly, and the relevant details can be found in the description of the methods.
Finally, it should also be noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
For convenience of description, the above devices are described as being divided into various units by function, and are described separately. Of course, the functionality of the units may be implemented in one or more software and/or hardware when implementing the present application.
From the above description of the embodiments, it is clear to those skilled in the art that the present application can be implemented by software plus a necessary general-purpose hardware platform. Based on such understanding, the technical solutions of the present application may be embodied, in whole or in part, in the form of a software product, which may be stored in a storage medium, such as a ROM/RAM, a magnetic disk, an optical disk, etc., and includes several instructions for enabling a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method according to the embodiments or some parts of the embodiments of the present application.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the application. Thus, the present application is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.