
CN109245926B - Intelligent network card, intelligent network card system and control method - Google Patents

Intelligent network card, intelligent network card system and control method

Info

Publication number
CN109245926B
Authority
CN
China
Prior art keywords
plane
control plane
forwarding
current
network card
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810988015.1A
Other languages
Chinese (zh)
Other versions
CN109245926A (en)
Inventor
林楷智
贡维
石江涛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhengzhou Yunhai Information Technology Co Ltd
Original Assignee
Zhengzhou Yunhai Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhengzhou Yunhai Information Technology Co Ltd
Priority to CN201810988015.1A
Publication of CN109245926A
Application granted
Publication of CN109245926B
Legal status: Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 Management of faults, events, alarms or notifications
    • H04L41/0654 Management of faults, events, alarms or notifications using network fault recovery
    • H04L41/0668 Management of faults, events, alarms or notifications using network fault recovery by dynamic selection of recovery network elements, e.g. replacement by the most appropriate element after failure
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L12/00 Data switching networks
    • H04L12/02 Details
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 Management of faults, events, alarms or notifications
    • H04L41/0654 Management of faults, events, alarms or notifications using network fault recovery
    • H04L41/0659 Management of faults, events, alarms or notifications using network fault recovery by isolating or reconfiguring faulty entities
    • H04L41/0661 Management of faults, events, alarms or notifications using network fault recovery by isolating or reconfiguring faulty entities by reconfiguring faulty entities
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/08 Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
    • H04L43/0805 Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters by checking availability
    • H04L43/0817 Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters by checking availability by checking functioning
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/14 Arrangements for monitoring or testing data switching networks using software, i.e. software packages

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Computer Security & Cryptography (AREA)
  • Environmental & Geological Engineering (AREA)
  • Hardware Redundancy (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The present application provides an intelligent network card, an intelligent network card system and a control method. The system includes a network card module, a programmable integrated circuit module and a main processor, where the main processor is a processor in the network device that uses the intelligent network card. The network card module is connected with the programmable integrated circuit module, and the network card module and the programmable integrated circuit module are both connected with the main processor. In this way, both the main processor and the processor unit in the programmable integrated circuit module can serve as control planes of the intelligent network card, and the network card module, the programmable integrated circuit module and the main processor can all serve as forwarding planes of the intelligent network card. Compared with the traditional scheme that relies only on the host CPU, the intelligent network card system has two control planes and three forwarding planes; when one control plane or forwarding plane is abnormal, another control plane or forwarding plane can be enabled, which improves the reliability of the intelligent network card.

Description

Intelligent network card, intelligent network card system and control method
Technical Field
The application relates to the technical field of computer networks, in particular to an intelligent network card, an intelligent network card system and a control method.
Background
An Intelligent Network Interface Card (iNIC) is a high-performance network access card built around a network processor. It adopts a multi-core, multi-threaded network processor architecture, is mainly used to implement features such as virtual switching, security isolation and Quality of Service (QoS), and is applied in cloud computing network virtualization solutions.
With the continuous growth of service types and data volume in cloud data centers, the traditional scheme of implementing the internal network of a network device (for example, a server) by relying on the Host CPU suffers from poor network stability and a low network energy efficiency ratio.
Disclosure of Invention
In view of this, the present application provides an intelligent network card, an intelligent network card system and a control method, so as to solve the technical problems of poor network stability and low network energy efficiency ratio in a network device. The application provides the following technical solutions:
in a first aspect, the present application provides an intelligent network card system, which is applied to a network device, and includes a network card module, a programmable integrated circuit module, and a main processor in the network device, wherein a processor unit is integrated in the programmable integrated circuit module;
the network card module is connected with the programmable integrated circuit module through a communication bus;
the network card module and the programmable integrated circuit module are both connected with the main processor;
an information synchronization channel is established between the processor unit and the main processor;
the main processor is a first control plane, and the processor unit is a second control plane; the network card module is a first forwarding plane, the programmable integrated circuit module is a second forwarding plane, and the main processor is a third forwarding plane.
Optionally, the priority of the first control plane is higher than the priority of the second control plane;
the first forwarding plane has a higher priority than the second forwarding plane, and the second forwarding plane has a higher priority than the third forwarding plane.
Optionally, the current main control plane is the first control plane, and the current standby control plane is the second control plane, or the current main control plane is the second control plane, and the current standby control plane is the first control plane; the current master control plane and the current standby control plane are both used for:
detecting whether the current main control plane and the current standby control plane are abnormal or not;
when the current standby control plane is detected to be abnormal, controlling the current standby control plane to restart;
and when detecting that the current main control plane is abnormal, controlling the current standby control plane to be switched to a new main control plane, controlling the current main control plane to be switched to a new standby control plane, and controlling the current main control plane to be restarted.
Optionally, the current master control plane and the current standby control plane are both further configured to:
when it is detected that the abnormality of the current standby control plane has been eliminated, synchronizing the control information of the current main control plane to the current standby control plane;
and when it is detected that the abnormality of the current main control plane has been eliminated, synchronizing the control information of the new main control plane to the current main control plane.
Optionally, the current master control plane is configured to, when it is detected that the current forwarding plane is abnormal, migrate the forwarding rule from the current forwarding plane to a forwarding plane with a next priority of the current forwarding plane according to a sequence of priorities from high to low;
wherein the current primary control plane is the first control plane or the second control plane.
Optionally, the current master control plane is configured to, when it is detected that a software failure occurs in the first forwarding plane, migrate a forwarding rule from the first forwarding plane to the second forwarding plane, and control the first forwarding plane to restart the system; migrating the forwarding rule from the second forwarding plane to the first forwarding plane after detecting that the software failure of the first forwarding plane is eliminated;
the current master control plane is configured to, when detecting that a software failure occurs in the second forwarding plane, migrate the forwarding rule from the second forwarding plane to the third forwarding plane, and notify an upper application to perform processing;
and the current main control plane is used for informing an upper layer application to process when the third forwarding plane is monitored to have a software fault.
Optionally, the current master control plane is configured to migrate the forwarding rule from the first forwarding plane to the second forwarding plane when detecting that the resource of the first forwarding plane is insufficient, and migrate the forwarding rule from the second forwarding plane back to the first forwarding plane when detecting that the resource of the first forwarding plane becomes sufficient;
the current primary control plane is configured to migrate the forwarding rule from the second forwarding plane to the third forwarding plane when detecting that the resource of the second forwarding plane is insufficient, and migrate the forwarding rule from the third forwarding plane back to the second forwarding plane when detecting that the resource of the second forwarding plane becomes sufficient.
In a second aspect, the present application further provides an intelligent network card, which is applied to a network device, and includes a network card module and a programmable integrated circuit module, wherein a processor unit is integrated in the programmable integrated circuit module;
the network card module is connected with the programmable integrated circuit module through a communication bus;
the network card module and the processor unit of the programmable integrated circuit module are both connected with a main processor in the network device;
an information synchronization channel is established between the processor unit and the main processor;
the main processor is a first control plane, and the processor unit is a second control plane; the network card module is a first forwarding plane, the programmable integrated circuit module is a second forwarding plane, and the main processor is a third forwarding plane.
In a third aspect, the present application further provides a network device, including the intelligent network card system described in any one of the possible implementation manners of the first aspect.
In a fourth aspect, the present application further provides a method for switching a control plane of an intelligent network card, where the method is applied to a main processor or a processor unit of an intelligent network card system according to any one of possible implementation manners of the first aspect, and the method includes:
detecting whether the current main control plane and the current standby control plane are abnormal or not;
when the current standby control plane is detected to be abnormal, controlling the current standby control plane to restart;
and when detecting that the current main control plane is abnormal, controlling the current standby control plane to be switched to a new main control plane, controlling the current main control plane to be switched to a new standby control plane, and controlling the current main control plane to be restarted.
Optionally, the method further comprises: when it is detected that the abnormality of the current standby control plane has been eliminated, synchronizing the control information of the current main control plane to the current standby control plane;
and when it is detected that the abnormality of the current main control plane has been eliminated, synchronizing the control information of the new main control plane to the current main control plane.
In a fifth aspect, the present application further provides a forwarding plane switching method for an intelligent network card, which is applied to the intelligent network card system described in any one of the possible implementation manners of the first aspect, and the method includes:
when the main control plane detects that the current forwarding plane is abnormal, the forwarding rules are transferred from the current forwarding plane to a target forwarding plane with the next priority of the current forwarding plane according to the sequence of the priorities from high to low;
and when it is detected that the abnormal forwarding plane has recovered to normal, migrating the forwarding rule from the target forwarding plane back to the current forwarding plane.
The intelligent network card system comprises a network card module, a programmable integrated circuit module and a main processor, wherein the network card module and the programmable integrated circuit module form an intelligent network card, and the main processor is a processor in network equipment applying the intelligent network card; the network card module is connected with the programmable integrated circuit module, and the network card module and the programmable integrated circuit are both connected with the main processor; thus, the main processor and the processor unit in the programmable integrated circuit module can be used as the control plane of the intelligent network card; the network card module, the programmable integrated circuit and the main processor can be used as a forwarding plane of the intelligent network card. Compared with the traditional scheme only depending on a host CPU, the intelligent network card system is provided with two control planes and three forwarding planes, and when one control plane or one forwarding plane is abnormal, other control planes or other forwarding planes can be started; the reliability of the intelligent network card is improved; moreover, a forwarding plane with high network energy efficiency ratio can be selected according to actual requirements, so that the network energy efficiency ratio of the intelligent network card is improved.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings needed to be used in the description of the embodiments or the prior art will be briefly introduced below, it is obvious that the drawings in the following description are only embodiments of the present application, and for those skilled in the art, other drawings can be obtained according to the provided drawings without creative efforts.
Fig. 1 shows a schematic block diagram of an intelligent network card system provided in an embodiment of the present application;
Fig. 2 is a flowchart illustrating a control plane switching method provided in an embodiment of the present application;
Fig. 3 shows a flowchart of a forwarding plane switching method provided in an embodiment of the present application;
Fig. 4 is a flowchart illustrating another forwarding plane switching method provided in an embodiment of the present application;
Fig. 5 is a block diagram illustrating a control plane switching apparatus provided in an embodiment of the present application;
Fig. 6 shows a block diagram of a forwarding plane switching apparatus according to an embodiment of the present application.
Detailed Description
Before describing the embodiments provided in the present application in detail, the control plane and the forwarding plane are first introduced:
The control plane refers to the part of the system that transmits instructions and computes table entries; it provides the network information and forwarding lookup table entries that are required before data can be processed and forwarded.
The forwarding plane refers to the part of the system that encapsulates and forwards data packets. Operations such as receiving, decapsulating, encapsulating and forwarding data packets all fall within the scope of the forwarding plane.
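The division of labor described above can be illustrated with a minimal sketch. The class names, the dictionary-based table and the address and port strings below are all hypothetical; the patent does not specify any data structure. The sketch only shows that the control plane computes and installs table entries, while the forwarding plane merely looks them up.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class ForwardingTable:
    """Lookup entries shared by the two planes (hypothetical structure)."""
    entries: Dict[str, str] = field(default_factory=dict)


class ControlPlane:
    """Computes table entries and installs them before data is forwarded."""

    def __init__(self, table: ForwardingTable) -> None:
        self.table = table

    def install_route(self, destination: str, out_port: str) -> None:
        self.table.entries[destination] = out_port


class ForwardingPlane:
    """Only consults the table; it never computes entries itself."""

    def __init__(self, table: ForwardingTable) -> None:
        self.table = table

    def forward(self, destination: str) -> Optional[str]:
        # Returns the output port, or None when the control plane
        # has not installed a matching entry.
        return self.table.entries.get(destination)


table = ForwardingTable()
control = ControlPlane(table)
data_path = ForwardingPlane(table)
control.install_route("10.0.0.2", "port1")
```

Because both objects share one table, replacing either plane with another implementation leaves the other untouched, which is the property the redundancy scheme in this application relies on.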
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
Referring to fig. 1, a schematic block diagram of an intelligent network card system provided in an embodiment of the present application is shown, where the system includes a network card module 1, a programmable integrated circuit module 2, and a main processor 3;
the network card module 1 and the programmable integrated circuit module 2 form an intelligent network card, and the intelligent network card is generally applied to network devices such as servers. The main processor 3 is a processor within the network device.
In an embodiment of the present application, the network card module 1 may be a Network Interface Card (NIC) chip. The NIC chip is an Application Specific Integrated Circuit (ASIC) designed specifically to implement the network card function.
The Programmable integrated circuit module 2 may be a Field Programmable Gate Array (FPGA) chip, and a processor unit is embedded therein.
The main processor 3 is a Host CPU in a network device, wherein the network device may be a server, and the server may be one server device or a server cluster composed of a plurality of server devices.
The network card module 1 and the programmable integrated circuit module 2 are connected through a communication bus, and the programmable integrated circuit module 2 is provided with a network interface for connecting an external network.
The network card module 1 and the programmable integrated circuit module 2 are both connected with the main processor 3 through a communication bus, and an information synchronization channel, used for synchronizing control information, is established between the programmable integrated circuit module 2 and the main processor 3.
For example, the NIC chip and the FPGA chip may be connected by an Ethernet bus, and both the NIC chip and the FPGA chip are connected with the Host CPU of the server through a PCIe bus.
The initialization process of the intelligent network card is described below by taking an NIC chip, an FPGA chip and a Host CPU as examples:
During initialization, both the Host CPU and the CPU embedded in the FPGA chip are configured as control planes of the intelligent network card. The user can configure either control plane as the main control plane and the other as the standby control plane as required; for example, the Host CPU can be configured as the main control plane and the CPU embedded in the FPGA as the standby control plane, or the CPU embedded in the FPGA can be configured as the main control plane and the Host CPU as the standby control plane.
An information synchronization channel is established between the Host CPU and the CPU embedded in the FPGA for synchronizing control information.
The NIC chip, the FPGA and the Host CPU are configured as forwarding planes, where the NIC chip is the first forwarding plane, the FPGA is the second forwarding plane and the Host CPU is the third forwarding plane.
In addition, as an ASIC, the NIC chip processes messages with high bandwidth and low latency, but it can only handle messages of low complexity and its resources are limited. Compared with the NIC chip, the FPGA has slightly lower message processing bandwidth and slightly higher latency, but it can process more complex messages and has more resources than the NIC. Compared with the NIC and the FPGA, the Host CPU has the lowest message processing bandwidth and the highest latency, but it can process complex messages and has the most abundant resources.
The user may configure the priorities of the three forwarding planes according to actual requirements, for example, if the user has a relatively high requirement on the delay, the priorities may be, in order from high to low: a first forwarding plane, a second forwarding plane, and a third forwarding plane.
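The initialization steps above can be sketched as a small configuration routine. Everything here is an assumption made for illustration: the names `host_cpu`, `fpga_embedded_cpu`, `nic_chip` and `fpga`, the `initialize` function, and the convention that a smaller number means a higher priority are not taken from the patent.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Tuple


class Role(Enum):
    MAIN = "main"
    STANDBY = "standby"


@dataclass
class ControlPlaneConfig:
    name: str
    role: Role


@dataclass
class ForwardingPlaneConfig:
    name: str
    priority: int  # assumed convention: smaller number = higher priority


def initialize(
    main_is_host: bool = True,
) -> Tuple[List[ControlPlaneConfig], List[ForwardingPlaneConfig]]:
    """Configure two control planes and three forwarding planes."""
    # Either control plane may be the main one; the other becomes standby.
    host_role = Role.MAIN if main_is_host else Role.STANDBY
    fpga_role = Role.STANDBY if main_is_host else Role.MAIN
    control = [
        ControlPlaneConfig("host_cpu", host_role),
        ControlPlaneConfig("fpga_embedded_cpu", fpga_role),
    ]
    # Latency-sensitive ordering from the text: NIC chip, then FPGA, then Host CPU.
    forwarding = [
        ForwardingPlaneConfig("host_cpu", 3),
        ForwardingPlaneConfig("nic_chip", 1),
        ForwardingPlaneConfig("fpga", 2),
    ]
    forwarding.sort(key=lambda plane: plane.priority)
    return control, forwarding
```

A user with different requirements, for example one who values rule complexity over latency, would only need to assign different priority numbers; the rest of the routine is unchanged.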
After the initialization configuration process is completed, each of the two control planes can monitor whether it or the other control plane is abnormal; if either control plane is abnormal, the control plane that is not abnormal can be enabled to continue the control flow. The main control plane also monitors whether the forwarding plane currently in use is abnormal; if it is, the forwarding rules are migrated from the current forwarding plane to the forwarding plane with the next priority level, according to the priority order of the forwarding planes, so that data can still be forwarded and processed normally.
Compared with the traditional scheme only depending on a host CPU, the intelligent network card system provided by the embodiment has two control planes and three forwarding planes, and when any one control plane or forwarding plane is abnormal, other control planes or forwarding planes can be started; the reliability of the intelligent network card is improved; moreover, a forwarding plane with high network energy efficiency ratio can be selected according to actual requirements, so that the network energy efficiency ratio of the intelligent network card is improved.
Referring to fig. 2, a flowchart of a control plane switching method according to an embodiment of the present application is shown. The method is applied to the first control plane and the second control plane. In this embodiment, each control plane can monitor whether it or the other control plane is abnormal; when either control plane is detected to be abnormal, the other control plane takes over the control function.
As shown in fig. 2, the method may include the steps of:
S110, detecting whether the current main control plane and the current standby control plane are abnormal or not; if the current standby control plane is detected to be abnormal, executing S120; if it is detected that the current main control plane is abnormal, S140 is executed.
The main control plane and the standby control plane are configured when the system is initialized, and a user can configure the main control plane and the standby control plane according to the actual requirements of the user. The current main control plane is a main control plane when the current detection action is executed, and the current standby control plane is a standby control plane when the current detection action is executed.
In an embodiment of the application, the main control plane is a Host CPU, and the standby control plane is an embedded CPU of an FPGA, so that both the Host CPU and the embedded CPU of the FPGA can monitor whether the Host CPU and the embedded CPU of the FPGA are abnormal or not.
The abnormal conditions of a control plane include software faults and insufficient resources that may occur on the CPU. For example, software faults include a system hang, a process exception and a program deadlock, while insufficient resources include a shortage of CPU or memory.
And S120, controlling the current standby control plane to restart.
And when the current standby control plane is detected to be abnormal, controlling the current standby control plane to restart.
If the main control plane detects that the standby control plane has abnormity, the standby control plane is informed to restart so as to eliminate the abnormity.
If the standby control plane detects that it is itself abnormal, it restarts automatically to eliminate the abnormality.
S130, if the abnormality of the current standby control plane is eliminated, the current main control plane synchronizes the control information to the current standby control plane.
If the abnormality of the current standby control plane is eliminated by the restart, the control information of the current main control plane is synchronized to the current standby control plane through the information synchronization channel, so that the current standby control plane can take over the control process when the current main control plane becomes abnormal.
And S140, controlling the current main control plane to restart, and controlling the current standby control plane to be changed into a new main control plane.
And if the current main control plane is detected to be abnormal, controlling the current standby control plane to be switched to a new main control plane, switching the current main control plane to a new standby control plane, and controlling the current main control plane to be restarted.
In the embodiment of the application, the two actions of controlling the current main control plane to restart and controlling the current standby control plane to be changed into a new main control plane can be executed in parallel or in sequence, for example, the restart can be controlled first, and then the main control plane is switched; or, the main control plane may be switched first, and then the restart is controlled.
In one embodiment of the application, the current main control plane is a Host CPU, and the current standby control plane is an embedded CPU of an FPGA;
if the Host CPU detects that the Host CPU is abnormal, the embedded CPU of the FPGA becomes a main control plane, and meanwhile, the Host CPU is restarted to eliminate the abnormality.
And if the embedded CPU of the FPGA detects that the Host CPU is abnormal, the embedded CPU of the FPGA is controlled to become a main control plane, and the Host CPU is informed to be restarted to eliminate the abnormality.
S150, if the abnormality of the current main control plane is eliminated, the new main control plane synchronizes the control information to the current main control plane.
The case in which the current main control plane is the Host CPU and the current standby control plane is the embedded CPU of the FPGA is again taken as an example: after the abnormality of the Host CPU is eliminated, the control information in the embedded CPU of the FPGA is synchronized to the Host CPU.
In the control plane switching method provided in this embodiment, both the two control planes can monitor the abnormality of the control plane itself and the other control plane, and if it is monitored that the main control plane is abnormal, the standby control plane is changed into the main control plane, and the control plane with the abnormality is controlled to restart. And if the backup control plane is detected to be abnormal, restarting the backup control plane and synchronizing the control information of the main control plane to the backup control plane so that the backup control plane can enter a main control state at any time. The method realizes redundancy control through two control planes, and improves the reliability of the intelligent network card.
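Steps S110 to S150 can be sketched as a single monitoring pass over a main/standby pair. This is an illustrative model only: the class and method names are hypothetical, a restart is assumed to always eliminate the abnormality, the standby is assumed to be synchronized once when the pair is created, and the standby is checked before the main plane, an ordering the patent does not prescribe.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class ControlPlane:
    name: str
    abnormal: bool = False
    control_info: Dict[str, str] = field(default_factory=dict)
    restarts: int = 0

    def restart(self) -> None:
        # Assumption: a restart always eliminates the abnormality.
        self.restarts += 1
        self.abnormal = False


class ControlPlanePair:
    """Main/standby pair linked by an information synchronization channel."""

    def __init__(self, main: ControlPlane, standby: ControlPlane) -> None:
        self.main = main
        self.standby = standby
        self._sync()  # assumed initial synchronization

    def _sync(self) -> None:
        # Copy the control information of the main plane to the standby plane.
        self.standby.control_info = dict(self.main.control_info)

    def check_once(self) -> str:
        """Run one detection pass and return the action taken."""
        if self.standby.abnormal:      # S110 -> S120
            self.standby.restart()
            self._sync()               # S130
            return "standby_restarted"
        if self.main.abnormal:         # S110 -> S140
            # The standby becomes the new main; the former main becomes standby.
            self.main, self.standby = self.standby, self.main
            self.standby.restart()
            self._sync()               # S150
            return "switched"
        return "ok"
```

Calling `check_once` periodically from both control planes would approximate the mutual monitoring described in this embodiment; in this sketch the switch and the restart happen in sequence, which is one of the orderings the text allows.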
Referring to fig. 3, a flowchart of a forwarding plane switching method provided in an embodiment of the present application is shown, where the method is applied to the main control plane shown in fig. 1. And the main control plane monitors whether the three forwarding planes are abnormal or not and migrates the forwarding rules to the forwarding planes without the abnormality.
In this embodiment, an NIC chip is taken as a first forwarding plane, an FPGA is taken as a second forwarding plane, and a Host CPU is taken as a third forwarding plane, and the priority of the NIC chip is higher than that of the FPGA, and the priority of the FPGA is higher than that of the Host CPU.
As shown in fig. 3, the method may include the steps of:
S210, when the main control plane detects that the current forwarding plane is abnormal, the forwarding rules are transferred from the current forwarding plane to a target forwarding plane with the next priority of the current forwarding plane according to the sequence of the priorities from high to low.
The current forwarding plane refers to a forwarding plane for executing forwarding data according to a forwarding rule when the main control plane executes a step of detecting whether the forwarding plane has an abnormality.
S220, when it is detected that the current forwarding plane has recovered to normal, migrating the forwarding rule from the target forwarding plane back to the current forwarding plane.
That is, when the main control plane detects that the current forwarding plane has recovered to normal, the forwarding rule is migrated from the target forwarding plane back to the current forwarding plane.
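Steps S210 and S220 can be sketched as follows. The class, its method names and the representation of forwarding rules as plain strings are assumptions for illustration. The sketch handles one level of fallback at a time and does not model cascaded migrations, where rules already moved to a target plane are moved again.

```python
from typing import Dict, List, Optional, Tuple


class ForwardingPlanes:
    """Forwarding planes ordered from highest to lowest priority."""

    def __init__(self, names_by_priority: List[str]) -> None:
        self.order = list(names_by_priority)
        self.rules: Dict[str, List[str]] = {name: [] for name in self.order}
        # abnormal plane -> (target plane, rules that were moved there)
        self.migrated: Dict[str, Tuple[str, List[str]]] = {}

    def next_priority(self, name: str) -> Optional[str]:
        index = self.order.index(name)
        return self.order[index + 1] if index + 1 < len(self.order) else None

    def on_abnormal(self, current: str) -> Optional[str]:
        """S210: move the rules to the plane with the next priority."""
        target = self.next_priority(current)
        if target is None:
            return None  # lowest-priority plane: nowhere to migrate
        moved = self.rules[current]
        self.rules[current] = []
        self.rules[target].extend(moved)
        self.migrated[current] = (target, moved)
        return target

    def on_recovered(self, current: str) -> bool:
        """S220: move the rules back once the plane is normal again."""
        record = self.migrated.pop(current, None)
        if record is None:
            return False
        target, moved = record
        self.rules[target] = [r for r in self.rules[target] if r not in moved]
        self.rules[current].extend(moved)
        return True
```

Keeping a record of what was moved lets the rules that already belonged to the target plane stay in place when the original plane recovers.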
Referring to fig. 4, a flowchart of a forwarding plane switching method provided in an embodiment of the present application is shown, where the embodiment will describe in detail a switching control process between forwarding planes; the method may be applied within a main processor or processor unit as shown in fig. 1.
As shown in fig. 4, the method may include the steps of:
S310, if the main control plane detects that the first forwarding plane is abnormal, determining the abnormality type of the first forwarding plane;
If the abnormality type is a software failure, S320 is executed; if the abnormality type is a resource shortage, S340 is executed.
In one embodiment of the present application, the software failures that may occur in a forwarding plane include a system hang, a process exception, a program deadlock, and the like; a resource shortage may include a shortage of memory or of CPU.
The main control plane in this embodiment may be the first control plane or the second control plane.
S320, migrating the forwarding rules from the first forwarding plane to the second forwarding plane, and controlling the first forwarding plane to restart;
S330, when it is detected that the software failure of the first forwarding plane has been eliminated, migrating the forwarding rules from the second forwarding plane back to the first forwarding plane.
S340, migrating the forwarding rules from the first forwarding plane to the second forwarding plane;
S350, when it is detected that the resources of the first forwarding plane have become sufficient, migrating the forwarding rules from the second forwarding plane back to the first forwarding plane.
S360, if the main control plane detects that the second forwarding plane is abnormal, determining the abnormality type of the second forwarding plane;
If the abnormality type is a software failure, S370 is executed; if the abnormality type is a resource shortage, S380 is executed.
S370, migrating the forwarding rules from the second forwarding plane to the third forwarding plane, and notifying the upper-layer application to handle the failure;
S380, migrating the forwarding rules from the second forwarding plane to the third forwarding plane, and, when it is detected that the resources of the second forwarding plane have become sufficient, migrating the forwarding rules from the third forwarding plane back to the second forwarding plane.
S390, if the main control plane detects a software failure in the third forwarding plane, notifying the upper-layer application to handle the failure.
In the forwarding plane switching method provided by this embodiment, the main control plane monitors the three forwarding planes for abnormalities. When the forwarding plane currently in use is found to be abnormal, the forwarding rules are migrated from it to the forwarding plane with the next lower priority, so that network forwarding continues normally and the reliability of the intelligent network card system is improved.
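The branching in steps S310 to S390 can be condensed into a small decision function. The sketch below only returns a list of action descriptions; the `Fault` enum, the function name `plan_actions`, and the action strings are assumptions made for illustration. The source does not say what happens on a resource shortage in the third forwarding plane, so the sketch returns no action for that case.

```python
from enum import Enum
from typing import List


class Fault(Enum):
    SOFTWARE = "software failure"
    RESOURCE = "resource shortage"


def plan_actions(plane: int, fault: Fault) -> List[str]:
    """Return the ordered actions for an abnormal forwarding plane (1, 2 or 3)."""
    if plane == 1:
        actions = ["migrate rules 1->2"]                    # S320 / S340
        if fault is Fault.SOFTWARE:
            actions.append("restart plane 1")               # S320
        actions.append("on recovery: migrate rules 2->1")   # S330 / S350
        return actions
    if plane == 2:
        actions = ["migrate rules 2->3"]                    # S370 / S380
        if fault is Fault.SOFTWARE:
            actions.append("notify upper-layer application")   # S370
        else:
            actions.append("on recovery: migrate rules 3->2")  # S380
        return actions
    if plane == 3 and fault is Fault.SOFTWARE:
        return ["notify upper-layer application"]           # S390
    # Not specified in the source (e.g. resource shortage on plane 3).
    return []
```

Keeping the decision separate from the execution of the actions makes the asymmetry visible: only the first forwarding plane is restarted by the main control plane, while software failures in the second and third forwarding planes are escalated to the upper-layer application.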
On the other hand, the application also provides an embodiment of the control plane switching device.
Referring to fig. 5, a block diagram of a control plane switching device according to an embodiment of the present disclosure is shown, where the control plane switching device is applied to a host processor or an embedded processor unit of a programmable integrated circuit module 2.
As shown in fig. 5, the apparatus may include: a first detecting unit 110, a first restarting unit 120, a first synchronizing unit 130, a second restarting unit 140, a first switching unit 150, a second synchronizing unit 160;
The first detecting unit 110 is configured to detect whether the main control plane or the standby control plane is abnormal.
If the standby control plane is detected to be abnormal, S120 is executed; if the main control plane is detected to be abnormal, S140 is executed.
The first restarting unit 120 is configured to control the standby control plane to restart when the standby control plane is detected to be abnormal.
The first synchronizing unit 130 is configured to synchronize the control information of the main control plane to the standby control plane once the abnormality of the standby control plane has been eliminated.
The second restarting unit 140 is configured to control the main control plane to restart.
The first switching unit 150 is configured to switch the current standby control plane to be the new main control plane.
The second synchronizing unit 160 is configured to synchronize the control information from the control plane without an abnormality to the restarted control plane once the abnormality of the restarted control plane has been eliminated.
In the control plane switching apparatus provided in this embodiment, each of the two control planes monitors both itself and the other control plane for abnormalities. If the main control plane is found to be abnormal, the standby control plane becomes the new main control plane, and the abnormal control plane is restarted. If the standby control plane is found to be abnormal, it is restarted, and the control information of the main control plane is then synchronized to it, so that the standby control plane can take over as the main control plane at any time. The apparatus achieves redundant control through the two control planes and thereby improves the reliability of the intelligent network card.
Referring to fig. 6, a block diagram of a forwarding plane switching apparatus according to an embodiment of the present invention is shown. The apparatus is applied to the host processor, or to the embedded processor unit of the programmable integrated circuit module 2, shown in fig. 1.
As shown in fig. 6, the apparatus may include:
a second detecting unit 210, configured to detect an abnormality type of the first forwarding plane when detecting that the first forwarding plane has an abnormality;
the exception types include software failures and resource shortages.
A first migration unit 220, configured to migrate the forwarding rule from the first forwarding plane to the second forwarding plane when the first forwarding plane has a software failure;
A third restarting unit 230, configured to control the first forwarding plane to restart.
A second migration unit 240, configured to migrate the forwarding rule from the second forwarding plane back to the first forwarding plane after detecting that the failure of the first forwarding plane is eliminated.
A third migration unit 250, configured to migrate the forwarding rule from the first forwarding plane to the second forwarding plane when the first forwarding plane has insufficient resources.
A fourth migration unit 260, configured to migrate the forwarding rule from the second forwarding plane back to the first forwarding plane when the resource of the first forwarding plane is detected to be sufficient.
A third detecting unit 270, configured to determine an exception type of the second forwarding plane when it is detected that the second forwarding plane has an exception.
A fifth migration unit 280, configured to migrate the forwarding rule from the second forwarding plane to the third forwarding plane when the second forwarding plane has a software failure, and notify the upper layer application to perform processing.
A sixth migrating unit 290, configured to migrate the forwarding rule from the second forwarding plane to the third forwarding plane when the second forwarding plane has insufficient resources.
A seventh migrating unit 2100, configured to migrate the forwarding rules from the third forwarding plane back to the second forwarding plane when it is detected that the resources of the second forwarding plane become sufficient.
A fourth detecting unit 2110, configured to notify the upper-layer application to handle the failure when a software failure is detected in the third forwarding plane.
In the forwarding plane switching device provided in this embodiment, the main control plane monitors the three forwarding planes for abnormalities. When the forwarding plane currently in use is found to be abnormal, the forwarding rules are migrated from it to the forwarding plane with the next lower priority, so that network forwarding continues normally and the reliability of the intelligent network card system is improved.
In another aspect, the present application further provides an intelligent network card, which includes the network card module and the programmable integrated circuit module of the intelligent network card system shown in fig. 1; for details, refer to the description of the intelligent network card system above.
In another aspect, the present application further provides a network device (e.g., a server), where the network device includes the intelligent network card system shown in fig. 1.
While, for simplicity of explanation, the foregoing method embodiments are described as a series of acts or a combination of acts, those skilled in the art will appreciate that the present application is not limited by the described order of acts, as some steps may occur in other orders or concurrently with other steps in accordance with the application. Further, those skilled in the art should also appreciate that the embodiments described in the specification are preferred embodiments, and that the acts and modules referred to are not necessarily required by this application.
The embodiments in this description are described in a progressive manner: each embodiment focuses on its differences from the other embodiments, and the same or similar parts of the embodiments may be cross-referenced. Because the device disclosed in an embodiment corresponds to the method disclosed in an embodiment, the device is described only briefly, and the relevant details can be found in the description of the method.
Finally, it should also be noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
For convenience of description, the above devices are described as being divided into various units by function, each described separately. Of course, when implementing the present application, the functions of the units may be implemented in one or more pieces of software and/or hardware.
From the above description of the embodiments, it is clear to those skilled in the art that the present application can be implemented by software plus a necessary general-purpose hardware platform. Based on such understanding, the technical solutions of the present application may be essentially or partially implemented in the form of a software product, which may be stored in a storage medium, such as a ROM/RAM, a magnetic disk, an optical disk, etc., and which includes several instructions for enabling a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method according to the embodiments, or some parts of the embodiments, of the present application.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the application. Thus, the present application is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (12)

1. An intelligent network card system, characterized in that the system is applied in a network device and comprises a network card module, a programmable integrated circuit module, and a main processor in the network device, wherein a processor unit is integrated in the programmable integrated circuit module;
the network card module is connected with the programmable integrated circuit module through a communication bus;
the network card module and the programmable integrated circuit module are both connected to the main processor;
an information synchronization channel for synchronizing control information is established between the processor unit and the main processor;
wherein the main processor is a first control plane and the processor unit is a second control plane; the network card module is a first forwarding plane, the programmable integrated circuit module is a second forwarding plane, and the main processor is a third forwarding plane.
2. The system according to claim 1, characterized in that:
the priority of the first control plane is higher than the priority of the second control plane;
the priority of the first forwarding plane is higher than the priority of the second forwarding plane, and the priority of the second forwarding plane is higher than the priority of the third forwarding plane.
3. The system according to claim 1 or 2, characterized in that the current main control plane is the first control plane and the current standby control plane is the second control plane, or the current main control plane is the second control plane and the current standby control plane is the first control plane; the current main control plane and the current standby control plane are both configured to:
detect whether the current main control plane and the current standby control plane are abnormal;
when it is detected that the current standby control plane is abnormal, control the current standby control plane to restart;
when it is detected that the current main control plane is abnormal, control the current standby control plane to switch to be the new main control plane and the current main control plane to switch to be the new standby control plane, and control the current main control plane to restart.
4. The system according to claim 3, characterized in that the current main control plane and the current standby control plane are both further configured to:
when it is detected that the abnormality of the current standby control plane has been eliminated, synchronize the control information of the current main control plane to the current standby control plane;
when it is detected that the abnormality of the current main control plane has been eliminated, synchronize the control information of the new main control plane to the current main control plane.
5. The system according to claim 1, characterized in that the current main control plane is configured to, when it is detected that the current forwarding plane is abnormal, migrate the forwarding rules, in descending order of priority, from the current forwarding plane to the forwarding plane with the next priority after the current forwarding plane;
wherein the current main control plane is the first control plane or the second control plane.
6. The system according to claim 5, characterized in that:
the current main control plane is configured to, when a software failure is detected in the first forwarding plane, migrate the forwarding rules from the first forwarding plane to the second forwarding plane and control the first forwarding plane to restart the system; and, after it is detected that the software failure of the first forwarding plane has been eliminated, migrate the forwarding rules from the second forwarding plane back to the first forwarding plane;
the current main control plane is configured to, when a software failure is detected in the second forwarding plane, migrate the forwarding rules from the second forwarding plane to the third forwarding plane and notify the upper-layer application to perform processing;
the current main control plane is configured to, when a software failure is detected in the third forwarding plane, notify the upper-layer application to perform processing.
7. The system according to claim 5, characterized in that:
the current main control plane is configured to, when it is detected that the resources of the first forwarding plane are insufficient, migrate the forwarding rules from the first forwarding plane to the second forwarding plane, and, when it is detected that the resources of the first forwarding plane have become sufficient, migrate the forwarding rules from the second forwarding plane back to the first forwarding plane;
the current main control plane is configured to, when it is detected that the resources of the second forwarding plane are insufficient, migrate the forwarding rules from the second forwarding plane to the third forwarding plane, and, when it is detected that the resources of the second forwarding plane have become sufficient, migrate the forwarding rules from the third forwarding plane back to the second forwarding plane.
8. An intelligent network card, characterized in that the intelligent network card is applied in a network device and comprises a network card module and a programmable integrated circuit module, wherein a processor unit is integrated in the programmable integrated circuit module;
the network card module is connected with the programmable integrated circuit module through a communication bus;
the network card module and the processor unit of the programmable integrated circuit module are both connected to the main processor in the network device;
an information synchronization channel for synchronizing control information is established between the processor unit and the main processor;
wherein the main processor is a first control plane and the processor unit is a second control plane; the network card module is a first forwarding plane, the programmable integrated circuit module is a second forwarding plane, and the main processor is a third forwarding plane.
9. A network device, characterized by comprising the intelligent network card system according to any one of claims 1-7.
10. A control plane switching method for an intelligent network card, characterized in that the method is applied in the main processor or the processor unit of the intelligent network card system according to any one of claims 1-7, the method comprising:
detecting whether the current main control plane and the current standby control plane are abnormal;
when it is detected that the current standby control plane is abnormal, controlling the current standby control plane to restart;
when it is detected that the current main control plane is abnormal, controlling the current standby control plane to switch to be the new main control plane and the current main control plane to switch to be the new standby control plane, and controlling the current main control plane to restart.
11. The method according to claim 10, characterized in that the method further comprises:
when it is detected that the abnormality of the current standby control plane has been eliminated, synchronizing the control information of the current control plane to the current standby control plane;
when it is detected that the abnormality of the current main control plane has been eliminated, synchronizing the control information of the new main control plane to the current main control plane.
12. A forwarding plane switching method for an intelligent network card, characterized in that the method is applied in the intelligent network card system according to any one of claims 1-7, the method comprising:
when the main control plane detects that the current forwarding plane is abnormal, migrating the forwarding rules, in descending order of priority, from the current forwarding plane to a target forwarding plane with the next priority after the current forwarding plane;
after it is detected that the abnormal forwarding plane has returned to normal, migrating the forwarding rules from the target forwarding plane back to the current forwarding plane.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810988015.1A CN109245926B (en) 2018-08-28 2018-08-28 Intelligent network card, intelligent network card system and control method

Publications (2)

Publication Number Publication Date
CN109245926A CN109245926A (en) 2019-01-18
CN109245926B true CN109245926B (en) 2021-10-15

Family

ID=65068808

Country Status (1)

Country Link
CN (1) CN109245926B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant