Embodiment
To make the objectives, technical solutions, and advantages of the present invention clearer, embodiments of the present invention are described in further detail below with reference to the accompanying drawings.
The present invention mainly addresses the following problem. In a system composed of multiple nodes, when the CPU of one or more nodes fails, the node no longer has to be removed from the system, debugged in isolation, and then reassembled into the system once debugging is finished. Instead, the CPU operation information of a single node is collected and displayed online in real time, and the CPU fault is debugged online using the collected information, which shortens system troubleshooting time and improves system maintenance efficiency.
Embodiment 1
Referring to Fig. 1, this embodiment provides an information processing method, the method comprising:
101. When a plurality of nodes managed by a DMS (Device Management System) are running, the DMS obtains the CPU (Central Processing Unit) operation information of each node;
102. When a node among the plurality of nodes fails, the DMS identifies the faulty node according to the CPU operation information of each node;
103. The DMS debugs the faulty node.
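Steps 101 to 103 can be sketched as a single monitoring cycle. This is a minimal illustration, not the implementation of the embodiment: the names `dms_cycle`, `find_faulty_nodes`, and `CpuInfo`, the `error_code` field, and the use of callbacks for collection, diagnosis, and debugging are all assumptions introduced here.

```python
from typing import Callable, Dict, List

# Hypothetical type: the CPU operation information of one node.
CpuInfo = Dict[str, int]


def find_faulty_nodes(infos: Dict[str, CpuInfo],
                      is_faulty: Callable[[CpuInfo], bool]) -> List[str]:
    """Step 102: identify faulty nodes from the collected CPU operation information."""
    return [node_id for node_id, info in infos.items() if is_faulty(info)]


def dms_cycle(collect: Callable[[], Dict[str, CpuInfo]],
              is_faulty: Callable[[CpuInfo], bool],
              debug: Callable[[str, CpuInfo], None]) -> List[str]:
    """Run steps 101 to 103 once and return the IDs of the nodes that were debugged."""
    infos = collect()                             # step 101: obtain CPU operation information
    faulty = find_faulty_nodes(infos, is_faulty)  # step 102: identify faulty nodes
    for node_id in faulty:                        # step 103: debug each faulty node online
        debug(node_id, infos[node_id])
    return faulty
```

Passing the three stages in as callbacks keeps the sketch independent of how the information actually travels between the node and the DMS.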
Preferably, each node managed by the DMS includes an FPGA (Field Programmable Gate Array), an XDP (eXtended Debug Port), and a CPU. XDP is a debug interface defined by Intel and is used to debug the CPU. Before the DMS obtains the CPU operation information of each node, each node managed by the DMS performs the following steps:
The FPGA of the node receives the CPU operation information transmitted by the CPU of the node through the XDP;
The FPGA converts the CPU operation information into information that the DMS can display and transmits the converted information to the CPU of a BMC (Baseboard Management Controller), so that the CPU of the BMC encapsulates the converted information and transmits the encapsulated information to the DMS.
Preferably, transmitting the converted information to the CPU of the BMC comprises:
The FPGA sends an interrupt request to the CPU of the BMC, so that the CPU of the BMC retrieves the converted information from the FPGA.
Preferably, the DMS debugging the faulty node comprises:
The DMS obtains the fault recovery information of the faulty node according to the CPU operation information of the faulty node;
The DMS transmits the fault recovery information of the faulty node to the FPGA through the CPU of the BMC, so that the FPGA converts the fault recovery information into an instruction executable by the CPU of the faulty node and transmits the instruction to the CPU of the faulty node through the XDP.
The beneficial effects of this embodiment include the following. When a plurality of nodes managed by the device management system (DMS) are running, the DMS obtains the CPU operation information of each node; when a node among the plurality of nodes fails, the DMS identifies the faulty node according to the CPU operation information of each node; and the DMS debugs the faulty node. Because the CPU operation information of each node is obtained online, when a node in the system fails, the faulty node can be found promptly from that information and debugged online without being removed and debugged separately, which solves the prior-art problem that debugging a faulty node is time-consuming.
Embodiment 2
This embodiment provides an information processing method in which a node includes an FPGA, an XDP, and a CPU. The XDP signals on the mainboard are converted by FPGA logic, and the converted signals are then passed to the CPU of the BMC in some manner. The CPU of the BMC encapsulates the information according to the IPMI (Intelligent Platform Management Interface) protocol format, and the online CPU fault handling mechanism is then implemented over the channel between the BMC and the DMS. The BMC may be a module integrated on the node or an independent module; this embodiment imposes no specific limitation on this. Referring to Fig. 2, the information processing method in this embodiment specifically comprises:
201. When a plurality of nodes managed by the DMS are running, the DMS obtains the CPU operation information of each node.
In this embodiment, for each node managed by the DMS, the FPGA in the node is attached as a device to the local bus of the BMC, and the XDP signals on the mainboard are parsed by FPGA logic, so that the XDP debug interface of the node CPU and the FPGA can communicate with each other. Before the DMS obtains the CPU operation information of each node, each node managed by the DMS performs the following steps: the FPGA of the node receives the CPU operation information transmitted by the CPU of the node through the XDP; the FPGA converts this interface information into information that the DMS can display and transmits the converted information to the CPU of the BMC; and the CPU of the BMC encapsulates the converted information and transmits the encapsulated information to the DMS through an FE PHY (Fast Ethernet PHY).
In this embodiment, in a specific implementation, a data receiving area OUT, a data buffer 1, a data buffer 2, and a data receiving area IN are provided on the FPGA. Data receiving area OUT stores the data that enters the FPGA from the XDP of the node CPU. Data buffer 1 stores the data obtained after the FPGA parses the data in data receiving area OUT. Data buffer 2 stores the commands that the DMS issues to the node, that is, the fault recovery information. Data receiving area IN stores the data obtained after the FPGA converts the data in data buffer 2 into the language of the CPU. Based on this division of data areas, obtaining the CPU operation information of each node in the system specifically comprises:
1) When the system is running, the XDP debug interface information of the node CPU enters the data receiving area OUT of the FPGA;
Here, the XDP debug interface information of the node CPU is the CPU operation information.
2) The FPGA performs protocol parsing on the data in data receiving area OUT and places the parsed data in data buffer 1;
Here, performing protocol parsing on the data in data receiving area OUT is the process of converting the CPU operation information into information that the DMS can display. When the system is running, the operation information of the node CPU is binary code. If this code were displayed directly on the DMS, it could appear as garbled characters that the DMS cannot recognize, so the FPGA needs to convert the CPU operation information into a form that the DMS can display.
3) The FPGA sends an interrupt request to the CPU of the BMC. After receiving the interrupt request, the CPU of the BMC retrieves the data from data buffer 1, encapsulates the data in the IPMI protocol format, and transmits it to the DMS over the channel between the CPU of the BMC and the FE PHY; the DMS then processes and displays the packets.
Here, the sequence in which the FPGA sends the interrupt request to the CPU of the BMC, and the CPU of the BMC retrieves the data from data buffer 1, encapsulates it in the IPMI protocol format, and transmits it to the DMS over the channel between the CPU of the BMC and the FE PHY, is the process in which the converted information is transmitted to the CPU of the BMC and the CPU of the BMC encapsulates the converted information and transmits the encapsulated information to the DMS through the FE PHY.
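The upstream flow from the node CPU to the DMS can be sketched as follows. This is only an illustration under stated assumptions: the two-byte record layout (register index, value) is invented for the example, the JSON envelope marked `ipmi-placeholder` stands in for IPMI encapsulation and is not the IPMI wire format, and the function names are hypothetical.

```python
import json
from collections import deque

out_area = deque()  # data receiving area OUT: raw bytes arriving from the node CPU's XDP
buffer_1 = deque()  # data buffer 1: parsed records that the DMS can display


def xdp_to_fpga(raw: bytes) -> None:
    """Step 1: XDP debug interface information enters data receiving area OUT."""
    out_area.append(raw)


def fpga_parse() -> None:
    """Step 2: parse the data in OUT and place the parsed records in data buffer 1.

    Assumed layout for this sketch only: byte 0 is a register index, byte 1 its value.
    """
    while out_area:
        raw = out_area.popleft()
        buffer_1.append({"register": raw[0], "value": raw[1]})


def bmc_on_interrupt() -> list:
    """Step 3: on the FPGA's interrupt request, the BMC CPU drains data buffer 1
    and wraps each record for transmission to the DMS.

    The JSON envelope is a placeholder for IPMI encapsulation, not real IPMI.
    """
    packets = []
    while buffer_1:
        record = buffer_1.popleft()
        packets.append(json.dumps({"proto": "ipmi-placeholder", "payload": record}))
    return packets
```

For example, calling `xdp_to_fpga(bytes([3, 255]))`, then `fpga_parse()`, then `bmc_on_interrupt()` yields one packet whose payload is the displayable record for register 3.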
202. When a node among the plurality of nodes fails, the DMS identifies the faulty node according to the CPU operation information of each node.
In this embodiment, the CPU operation information of each node in the system is obtained in real time while the system is running. When one or more nodes in the system fail, the DMS analyzes the CPU operation information of each node and finds the one or more faulty nodes. The analysis specifically comprises: comparing the operation result in the operation information with a preset operation result, and if the operation result exceeds the preset operation result, determining that the node has failed. Analyzing the operation information of a CPU to determine whether it has failed belongs to the prior art and is not described further in this embodiment.
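The comparison against a preset operation result can be illustrated with a short sketch. The text does not specify the form of the operation result, so this example assumes a single numeric value per node and reads "exceeds" as strictly greater than; the function name `faulty_nodes` is hypothetical.

```python
from typing import Dict, List


def faulty_nodes(results: Dict[str, float], preset: float) -> List[str]:
    """Return the nodes whose operation result exceeds the preset operation result.

    Assumption for this sketch: each node reports one numeric operation result,
    and a node is treated as faulty only when its value is strictly greater
    than the preset value.
    """
    return [node for node, value in results.items() if value > preset]
```

With results of 0.4, 1.7, and 1.0 for three nodes and a preset of 1.0, only the node reporting 1.7 is flagged.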
203. The DMS debugs the faulty node.
In this embodiment, after finding the faulty node, the DMS switches to the faulty node, analyzes the cause of the fault, and issues an instruction to the faulty node according to that cause. The instruction contains the fault recovery information of the faulty node and is used to debug the CPU of the faulty node online, so that the faulty node can resume normal operation after executing the fault recovery information. Debugging the faulty node specifically comprises: the DMS obtains the fault recovery information of the faulty node according to the CPU operation information of the faulty node; the DMS transmits the fault recovery information of the faulty node to the FPGA through the CPU of the BMC, so that the FPGA converts the fault recovery information into an instruction executable by the CPU of the faulty node and transmits the instruction to the CPU of the faulty node through the XDP. In this step, the instruction issued by the DMS cannot be run directly by the node CPU and must be converted into an instruction that the node CPU can execute. In this embodiment the FPGA performs the conversion, but the conversion is not limited to this manner, and this embodiment imposes no specific limitation on it.
In a specific implementation, based on the data areas divided on the FPGA, the information flow by which the DMS debugs the faulty node after finding it runs in the opposite direction to the information flow in step 201. The specific information flow comprises:
1) The DMS issues a command, encapsulates the command in the IPMI protocol format, and passes the encapsulated command to the CPU of the BMC;
The command issued by the DMS contains the fault recovery information of the node.
2) The CPU of the BMC parses the IPMI-format information and transmits the parsed information to the FPGA over the channel between the BMC and the FPGA, and the FPGA places the received parsed information in data buffer 2;
3) The FPGA converts the data in data buffer 2 into content in the XDP protocol format and places the converted content in data receiving area IN. The data in data receiving area IN then enters the node CPU through the XDP, so that the CPU executes the command and the node resumes normal operation.
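The downstream flow from the DMS to the node CPU can be sketched in the same spirit. Everything specific here is an assumption made for illustration: the `opcode` and `operand` fields, the two-byte frame produced for the XDP side, and the JSON envelope marked `ipmi-placeholder`, which stands in for IPMI encapsulation and is not the IPMI wire format.

```python
import json
from collections import deque

buffer_2 = deque()  # data buffer 2: commands (fault recovery information) from the DMS
in_area = deque()   # data receiving area IN: commands converted for the node CPU


def dms_issue(recovery: dict) -> str:
    """Step 1: the DMS wraps the fault recovery information and hands it to the BMC CPU.

    The JSON envelope is a placeholder for IPMI encapsulation, not real IPMI.
    """
    return json.dumps({"proto": "ipmi-placeholder", "payload": recovery})


def bmc_forward(packet: str) -> None:
    """Step 2: the BMC CPU unwraps the packet; the FPGA stores the result in data buffer 2."""
    buffer_2.append(json.loads(packet)["payload"])


def fpga_convert() -> None:
    """Step 3a: convert the data in data buffer 2 and place it in data receiving area IN.

    Assumed frame layout for this sketch only: one opcode byte followed by one operand byte.
    """
    while buffer_2:
        command = buffer_2.popleft()
        in_area.append(bytes([command["opcode"], command["operand"]]))


def xdp_deliver() -> list:
    """Step 3b: drain data receiving area IN toward the node CPU through the XDP."""
    frames = list(in_area)
    in_area.clear()
    return frames
```

The functions mirror the upstream direction in reverse: the buffer closest to the DMS is filled first, and the area closest to the CPU is drained last.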
204. After debugging of the faulty node is complete, the system operates normally.
In this step, after debugging of the faulty node is complete and the node operates normally, the system also operates normally and continues to obtain the CPU operation information of each node, repeating steps 201 to 204; this is not described further in this embodiment.
The beneficial effects of this embodiment include the following. When a plurality of nodes managed by the device management system (DMS) are running, the DMS obtains the CPU operation information of each node; when a node among the plurality of nodes fails, the DMS identifies the faulty node according to the CPU operation information of each node; and the DMS debugs the faulty node. Because the CPU operation information of each node is obtained online, when a node in the system fails, the faulty node can be found promptly from that information and debugged online without being removed and debugged separately, which solves the prior-art problem that debugging a faulty node is time-consuming. In addition, using the FPGA to carry the communication between the XDP of the node and the BMC is simple to implement and highly practicable.
Embodiment 3
Referring to Fig. 3, this embodiment provides a device management system, comprising: an acquisition module 301, a fault diagnosis module 302, and a fault debugging module 303.
The acquisition module 301 is configured to obtain the CPU operation information of each node when a plurality of nodes managed by the DMS are running;
The fault diagnosis module 302 is configured to identify, when a node among the plurality of nodes fails, the faulty node according to the CPU operation information of each node;
The fault debugging module 303 is configured to debug the faulty node.
Preferably, the fault debugging module 303 comprises:
An acquiring unit, configured to obtain the fault recovery information of the faulty node according to the CPU operation information of the faulty node;
A command issuing unit, configured to transmit the fault recovery information of the faulty node to the FPGA through the CPU of the BMC, so that the FPGA converts the fault recovery information into an instruction executable by the CPU of the faulty node and transmits the instruction to the CPU of the faulty node through the XDP.
Referring to Fig. 4, this embodiment further provides a node, comprising: a field programmable gate array (FPGA) 100, a debug interface XDP 200, and a central processing unit (CPU) 300. Referring to Fig. 5, the FPGA comprises: a receiving module 401 and an information processing module 402.
The receiving module 401 is configured to receive the CPU operation information transmitted by the CPU through the XDP;
The information processing module 402 is configured to convert this interface information into information that the DMS can display and to transmit the converted information to the CPU of the BMC, so that the CPU of the BMC encapsulates the converted information and transmits the encapsulated information to the DMS.
Preferably, the information processing module 402 comprises:
A first transmitting unit, configured to send an interrupt request to the CPU of the BMC, so that the CPU of the BMC retrieves the converted information from the FPGA.
Preferably, the information processing module 402 further comprises:
A receiving unit, configured to receive the fault recovery information of the node that the DMS transmits through the CPU of the BMC;
A converting unit, configured to convert the fault recovery information into an instruction executable by the CPU of the node;
A second transmitting unit, configured to transmit the instruction to the CPU through the XDP.
The FPGA comprises four data areas: a data receiving area OUT, a data buffer 1, a data buffer 2, and a data receiving area IN. Data receiving area OUT stores the data that enters the FPGA from the XDP of the CPU. Data buffer 1 stores the data obtained after the FPGA parses the data in data receiving area OUT. Data buffer 2 stores the commands that the DMS issues to the node, that is, the fault recovery information. Data receiving area IN stores the data obtained after the FPGA converts the data in data buffer 2 into the language of the CPU.
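The four data areas can be modelled as one small structure. This is a software stand-in for what would be on-chip memory regions in the FPGA; the class name `FpgaDataAreas`, the use of queues, the element types, and the `pending` helper are assumptions made for illustration.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, Dict


@dataclass
class FpgaDataAreas:
    # Data receiving area OUT: raw data entering the FPGA from the CPU's XDP.
    out: Deque[bytes] = field(default_factory=deque)
    # Data buffer 1: the data from OUT after the FPGA has parsed it, bound for the DMS.
    buffer_1: Deque[str] = field(default_factory=deque)
    # Data buffer 2: commands (fault recovery information) issued by the DMS to this node.
    buffer_2: Deque[str] = field(default_factory=deque)
    # Data receiving area IN: data from buffer 2 after conversion into the CPU's language.
    in_: Deque[bytes] = field(default_factory=deque)

    def pending(self) -> Dict[str, int]:
        """Report how many items are waiting in each data area."""
        return {
            "OUT": len(self.out),
            "buffer 1": len(self.buffer_1),
            "buffer 2": len(self.buffer_2),
            "IN": len(self.in_),
        }
```

Grouping the areas this way makes the two directions explicit: OUT and data buffer 1 serve the path from the CPU to the DMS, while data buffer 2 and IN serve the path from the DMS to the CPU.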
Referring to Fig. 6, this embodiment further provides an information processing system, comprising: the device management system 400 described above and the node 500 described above.
The beneficial effects of this embodiment include the following. When a plurality of nodes managed by the device management system (DMS) are running, the DMS obtains the CPU operation information of each node; when a node among the plurality of nodes fails, the DMS identifies the faulty node according to the CPU operation information of each node; and the DMS debugs the faulty node. Because the CPU operation information of each node is obtained online, when a node in the system fails, the faulty node can be found promptly from that information and debugged online without being removed and debugged separately, which solves the prior-art problem that debugging a faulty node is time-consuming.
The device management system, node, and system provided in this embodiment belong to the same concept as the method embodiments. For their specific implementation, see the method embodiments; the details are not repeated here.
A person of ordinary skill in the art will understand that all or part of the steps of the above embodiments may be implemented by hardware, or by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium, which may be a read-only memory (ROM), a magnetic disk, an optical disc, or the like.
The above are merely preferred embodiments of the present invention and are not intended to limit the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall fall within the protection scope of the present invention.