CN117978612B

CN117978612B - Network fault detection method, storage medium and electronic equipment

Info

Publication number: CN117978612B
Application number: CN202410363187.5A
Authority: CN
Inventors: 杨斌; 戴勇
Original assignee: Chengdu Greatech Electrics Co ltd
Current assignee: Chengdu Greatech Electrics Co ltd
Priority date: 2024-03-28
Filing date: 2024-03-28
Publication date: 2024-06-04
Anticipated expiration: 2044-03-28
Also published as: CN117978612A

Abstract

The invention provides a network fault detection method, a storage medium and electronic equipment, belonging to the field of network detection. The method comprises the following steps: sending, based on network topology information of a network topology, a network detection message to each network device in the network topology, wherein the network topology comprises at least one network device, and the network device comprises at least one of a network service device and a network terminal device; receiving a network response message fed back by the network device, wherein the network response message at least comprises a response content field, a timestamp at which the network device received the network detection message and a timestamp at which the network device sent the network response message; inputting the network topology information and the network response message into a pre-trained network fault detection model to obtain a network fault detection result; and determining, based on the network fault detection result, the fault type of each faulty network device in the network topology. The accuracy and the efficiency of network fault detection can thereby be effectively improved.

Description

Network fault detection method, storage medium and electronic equipment

Technical Field

The present invention relates to the field of network detection, and in particular to a network fault detection method, a storage medium, and an electronic device.

Background

In the related art, when a device in a network has a fault, engineers are usually required to check the possible faults of the device one by one, which makes it difficult to accurately determine the type of the fault and greatly consumes manpower and material resources.

Disclosure of Invention

The embodiment of the invention provides a network fault detection method, a storage medium and electronic equipment.

According to a first aspect of an embodiment of the present invention, there is provided a network fault detection method applied to a first network device, where the first network device is a network device in a network topology, the method including:

sending, based on the network topology information of the network topology, a network detection message to each network device in the network topology, wherein the network topology comprises at least one network device, and the network device comprises at least one of a network service device and a network terminal device;

receiving a network response message fed back by the network device, wherein the network response message at least comprises a response content field, a timestamp at which the network device received the network detection message and a timestamp at which the network device sent the network response message;

inputting the network topology information and the network response message into a pre-trained network fault detection model to obtain a network fault detection result;

and determining, based on the network fault detection result, the fault type of each faulty network device in the network topology.

Optionally, inputting the network topology information and the network response message into a pre-trained network fault detection model to obtain a network fault detection result, including:

inputting the network topology information and the network response message into the network fault detection model, and using the network fault detection model to infer the fault type corresponding to each network device to obtain the network fault detection result, wherein the network fault detection result comprises, for each network device in the network topology, a target confidence predicted by the network fault detection model that the network device is in each preset fault type.

Optionally, the training of the network fault detection model includes:

Acquiring an example sample set and sample network topology information corresponding to sample network equipment in the example sample set; the sample set comprises a plurality of batches of first samples and a plurality of batches of second samples, wherein each batch of first samples comprises a network response message fed back by sample network equipment under a preset fault type, and each batch of second samples comprises a network response message fed back by the sample network equipment outside the preset fault type;

inputting the sample network topology information and the first sample into a network fault detection model to perform model parameter iteration, and determining a first iteration cost parameter of the model parameter iteration;

Inputting the sample network topology information and the second sample into the network fault detection model, and reasoning the fault type corresponding to the sample network equipment in the second sample by utilizing the network fault detection model to obtain a first fault detection result; the first fault detection result comprises a first confidence coefficient of predicting that the sample network equipment is in each preset fault type by the network fault detection model;

Determining a second iteration cost parameter of the model parameter iteration based on the first confidence and a predefined detection confidence index;

And iterating the model parameters of the network fault detection model based on the first iteration cost parameter and the second iteration cost parameter to obtain the trained network fault detection model.

Optionally, the sample set further includes pre-labeling information corresponding to the first sample, where the pre-labeling information is used to indicate a real fault type corresponding to the sample network device in the first sample;

inputting the sample network topology information and the first sample into the network fault detection model for model parameter iteration, and determining a first iteration cost parameter of the model parameter iteration, includes:

Inputting the sample network topology information and the first sample into the network fault detection model, and reasoning the fault type corresponding to the sample network equipment in the first sample by utilizing the network fault detection model to obtain a second fault detection result; the second fault detection result comprises a second confidence coefficient of predicting that the sample network equipment is in each preset fault type by the network fault detection model;

determining a first iteration cost parameter of model parameter iteration based on the pre-labeling information and the second fault detection result;

The network fault detection model comprises a first processing unit and a second processing unit; inputting the sample network topology information and the first sample into the network fault detection model, and using the network fault detection model to infer the fault type corresponding to the sample network device in the first sample to obtain a second fault detection result, includes:

Inputting the sample network topology information and the first sample to the first processing unit, and executing feature engineering on the first sample by using the first processing unit to obtain a first sample feature vector;

Inputting the first sample feature vector to the second processing unit, and using the second processing unit to weight the first sample feature vector with clustering attention information to obtain a first target feature vector; the clustering attention information comprises a plurality of clustering attention distribution vectors, each clustering attention distribution vector corresponds to one preset fault type, and the number of elements of the first target feature vector is equal to the number of the preset fault types;

Carrying out normalized data mapping on the first target feature vector by using a preset normalization function to obtain a second target feature vector, and taking the second target feature vector as a second fault detection result; wherein, the value corresponding to the element in the second target feature vector represents a second confidence that the network fault detection model predicts that the sample network device is in the corresponding preset fault type;

The determining a first iteration cost parameter of the model parameter iteration based on the pre-labeling information and the second fault detection result includes:

based on the real fault type corresponding to the pre-labeling information, determining a corresponding first attention index from the cluster attention distribution vectors, and taking each cluster attention distribution vector other than the first attention index as a second attention index;

determining a first vector included angle between the first sample feature vector and the first attention index, and determining a second vector included angle between the first sample feature vector and each of the second attention indexes;

determining the angle deviation between the first vector included angle and a predefined target included angle to obtain a first cost-related parameter;

determining a first iteration cost parameter of model parameter iteration based on the first cost-related parameter and each second vector included angle;

wherein the correlation coefficient of the first iteration cost parameter and the first cost-related parameter is negative, and the correlation coefficient of the second vector included angle and the first iteration cost parameter is positive.
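The angle quantities named in the steps above can be sketched as follows. This is a minimal illustration rather than the patented implementation: the function names are hypothetical, and the sign convention of the angle deviation (target included angle minus first vector included angle) is an assumption, since the text does not state one. How these quantities are combined into the first iteration cost parameter is not specified in the text and is therefore left out.

```python
import math
from typing import List, Sequence, Tuple


def included_angle(u: Sequence[float], v: Sequence[float]) -> float:
    """Return the angle in radians between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # Clamp to [-1, 1] to guard against rounding just outside acos's domain.
    cosine = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.acos(cosine)


def angle_terms(
    feature: Sequence[float],
    cluster_vectors: Sequence[Sequence[float]],
    true_index: int,
    target_angle: float,
) -> Tuple[float, List[float], float]:
    """Compute the first angle, the second angles, and the angle deviation.

    cluster_vectors[k] plays the role of the cluster attention distribution
    vector for preset fault type k; true_index is the labeled fault type.
    """
    first_angle = included_angle(feature, cluster_vectors[true_index])
    second_angles = [
        included_angle(feature, vec)
        for k, vec in enumerate(cluster_vectors)
        if k != true_index
    ]
    # Assumed sign convention: positive when the feature lies inside the
    # target angle around its labeled cluster vector.
    angle_deviation = target_angle - first_angle
    return first_angle, second_angles, angle_deviation
```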

Optionally, the network fault detection model includes a first processing unit and a second processing unit; inputting the sample network topology information and the second sample into the network fault detection model, and using the network fault detection model to infer the fault type corresponding to the sample network device in the second sample to obtain a first fault detection result, includes:

Inputting the sample network topology information and the second sample to the first processing unit, and executing feature engineering on the second sample by using the first processing unit to obtain a second sample feature vector;

Inputting the second sample feature vector to the second processing unit, and using the second processing unit to weight the second sample feature vector with clustering attention information to obtain a third feature vector; the clustering attention information comprises a plurality of clustering attention distribution vectors, each clustering attention distribution vector corresponds to one preset fault type, and the number of elements of the third feature vector is equal to the number of the preset fault types;

carrying out normalized data mapping on the third feature vector by using a preset normalization function to obtain a fourth feature vector, and taking the fourth feature vector as a first fault detection result; and the value corresponding to the element in the fourth feature vector represents a first confidence that the network fault detection model predicts that the sample network device is in the corresponding preset fault type.
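The weighting and normalization steps above can be illustrated with a short sketch. It assumes that the weighting is a dot product between the sample feature vector and each cluster attention distribution vector, and that the preset normalization function is softmax; the text names neither, so both choices and all function names are hypothetical.

```python
import math
from typing import List, Sequence


def cluster_scores(
    feature: Sequence[float], cluster_vectors: Sequence[Sequence[float]]
) -> List[float]:
    """Weight the feature vector by each cluster attention distribution vector.

    Produces one score per preset fault type, so the output length equals
    the number of preset fault types.
    """
    return [sum(w * x for w, x in zip(vec, feature)) for vec in cluster_vectors]


def softmax(scores: Sequence[float]) -> List[float]:
    """Map raw scores to confidences that sum to 1 (assumed normalization)."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_confidences(
    feature: Sequence[float], cluster_vectors: Sequence[Sequence[float]]
) -> List[float]:
    """Return one confidence per preset fault type for a single sample."""
    return softmax(cluster_scores(feature, cluster_vectors))
```

With softmax, the largest confidence is never below 1/K for K preset fault types, so a detection confidence index would need to be set above 1/K for the "all confidences at or below the index" case to be reachable.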

Optionally, the determining a second iteration cost parameter of the model parameter iteration includes:

Determining a third vector included angle between the second sample feature vector and each cluster attention distribution vector;

Determining a target included angle based on the detection confidence index;

determining the maximum angle deviation between the third vector included angle and the target included angle to obtain a second cost related parameter;

determining a second iteration cost parameter of the model parameter iteration based on the second cost related parameter;

wherein the correlation coefficient of the second cost-related parameter and the second iteration cost parameter is positive.

Optionally, the determining a second iteration cost parameter of the model parameter iteration based on the first confidence and a predefined detection confidence indicator includes:

determining the deviation between the first confidence corresponding to each preset fault type and the detection confidence index to obtain a third cost-related parameter corresponding to each preset fault type;

in response to at least one third cost-related parameter being greater than or equal to zero, determining the size ordering of the third cost-related parameters and determining the second iteration cost parameter based on the largest third cost-related parameter; or, in response to every third cost-related parameter being below zero, determining that the value of the second iteration cost parameter is zero.
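The rule above behaves like a hinge on the largest per-type deviation. The sketch below assumes that each deviation is the first confidence minus the detection confidence index, and that the cost is taken to be the largest deviation itself; the text says only that the cost is determined "based on" the largest one, so that direct use is an assumption, as is the function name.

```python
from typing import Sequence


def second_iteration_cost(
    first_confidences: Sequence[float], detection_confidence_index: float
) -> float:
    """Penalize an out-of-set sample that looks like a known fault type.

    Returns the largest non-negative deviation of a confidence above the
    index, or 0.0 when every confidence is below the index.
    """
    deviations = [c - detection_confidence_index for c in first_confidences]
    largest = max(deviations)
    return largest if largest >= 0 else 0.0
```

For example, with confidences `[0.2, 0.9, 0.1]` and an index of `0.6`, only the second fault type exceeds the index, and the cost is its margin of about 0.3; with confidences `[0.2, 0.3]` the cost is zero.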

Optionally, iterating the model parameters of the network fault detection model based on the first iteration cost parameter and the second iteration cost parameter to obtain a trained network fault detection model, including:

Detecting a first batch number of the first samples and a second batch number of the second samples in the example sample set;

determining a first influence index corresponding to the first iteration cost parameter based on the first batch number, and determining a second influence index corresponding to the second iteration cost parameter based on the second batch number;

Performing influence adjustment on the first iteration cost parameter and the second iteration cost parameter based on the first influence index and the second influence index to obtain a target iteration cost parameter;

and iterating the model parameters of the network fault detection model based on the target iteration cost parameters to obtain the trained network fault detection model.
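One plausible reading of the influence adjustment above is a weighted sum whose weights follow the share of each sample kind in the training set. The proportional weighting below is an assumption made for illustration; the text does not say how the influence indexes are derived from the batch numbers, and the function name is hypothetical.

```python
def target_iteration_cost(
    first_cost: float,
    second_cost: float,
    first_batches: int,
    second_batches: int,
) -> float:
    """Combine the two iteration costs using batch-count-based weights."""
    total = first_batches + second_batches
    if total <= 0:
        raise ValueError("at least one batch is required")
    # Assumed influence indexes: each cost is weighted by the share of
    # batches of its sample kind.
    first_influence = first_batches / total
    second_influence = second_batches / total
    return first_influence * first_cost + second_influence * second_cost
```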

According to a second aspect of embodiments of the present invention, there is provided a computer readable storage medium having stored thereon a computer program which when executed by a processing device performs the steps of the method of the first aspect.

According to a third aspect of an embodiment of the present invention, there is provided an electronic apparatus including:

A storage device having a computer program stored thereon;

processing means for executing said computer program in said storage means to carry out the steps of the method of the first aspect.

The beneficial effects of the embodiment of the invention include, for example: the network fault detection model can be trained in advance, network detection messages are sent to all network devices in the network topology based on the network topology information, and after corresponding network response messages are received, the network topology information and the network response messages are input into the network fault detection model, so that fault types corresponding to all the network devices in the network topology are detected by using the network fault detection model, and the accuracy and the efficiency of network fault detection are effectively improved.

Drawings

In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings that are needed in the embodiments will be briefly described below, it being understood that the following drawings only illustrate some embodiments of the present invention and therefore should not be considered as limiting the scope, and other related drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.

FIG. 1 is a flow chart illustrating a method of network failure detection according to an embodiment of the present invention;

Fig. 2 is a schematic diagram of an electronic device, according to an embodiment of the invention.

200-electronic device; 201-processing device; 202-read-only memory; 203-random access memory;

204-bus; 205-input/output interface; 206-input means; 207-output means;

208-storage means; 209-communication means.

Detailed Description

For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are some embodiments of the present invention, but not all embodiments of the present invention. The components of the embodiments of the present invention generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations.

Thus, the following detailed description of the embodiments of the invention, as presented in the figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.

It should be noted that: like reference numerals and letters denote like items in the following figures, and thus once an item is defined in one figure, no further definition or explanation thereof is necessary in the following figures.

Furthermore, the terms "first," "second," and the like, if any, are used merely for distinguishing between descriptions and not for indicating or implying a relative importance.

It should be noted that the features of the embodiments of the present invention may be combined with each other without conflict.

Fig. 1 is a flowchart of a network fault detection method according to an embodiment of the present invention. The method may be applied to a first network device, which is a network device in a network topology and may be, for example, any one of the network devices in that topology. The network topology may be the network topology of a certain area, such as an inter-city network, and the network device may be, for example, a computer device, an optical modem, or a server device accessing a tetracos telecommunication network. As shown in Fig. 1, the method includes:

step S101, based on the network topology information of the network topology, sending a network detection message to each network device in the network topology.

The network topology at least comprises one network device, and the network device comprises at least one of network service equipment and network terminal equipment.

In this step, the network topology information may be stored in advance in the first network device. Alternatively, the first network device may sniff each device in the network at a preset period and generate the corresponding network topology information. In some embodiments, the network topology information may include the routing tables in the network topology, the number of network devices, the type of each network device, the IP address or routing address of each network device, and so forth. Based on the network topology information, the first network device can accurately and reliably send the corresponding network detection message to each network device in the network.

The network service device may include a fiber transceiver, a network base station, and other devices, for example, a core network device or an access network device in a 5G network. The network terminal device may include a computer device, a mobile terminal, and the like. The first network device may be any one of them.

In some embodiments, the network detection message may be a message transmitted based on the TCP/IP protocol. Optionally, the network detection message may include a detection content field and an identifier of the first network device, such as a MAC address and/or an IP address. Optionally, the detection content field in the network detection message may be used to instruct the network device to feed back information such as the number and time of the corresponding network messages to the first network device.
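As an illustration of the message described above, the following sketch builds one detection message per target device. The class name, the field names, the JSON encoding and the keys inside the detection content field are all hypothetical; the text specifies only that the message carries a detection content field and an identifier of the first network device.

```python
import json
from dataclasses import asdict, dataclass
from typing import Dict, Iterable


@dataclass
class DetectionMessage:
    detection_content: dict  # what the target device is asked to report
    sender_ip: str           # identifier of the first network device
    sender_mac: str

    def to_bytes(self) -> bytes:
        """Serialize the message for sending (JSON is an assumed encoding)."""
        return json.dumps(asdict(self), sort_keys=True).encode("utf-8")


def build_probes(
    target_ips: Iterable[str],
    sender_ip: str,
    sender_mac: str,
    response_count: int = 3,
) -> Dict[str, DetectionMessage]:
    """Create one detection message for every target in the topology."""
    content = {
        "report": ["message_count", "timestamps"],
        "response_count": response_count,
    }
    return {
        ip: DetectionMessage(dict(content), sender_ip, sender_mac)
        for ip in target_ips
    }
```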

Step S102, receiving a network response message fed back by the network equipment.

The network response message at least comprises a response content field, a timestamp of the network device receiving the network detection message and a timestamp of the network device sending the network response message.

It is understood that the first network device may receive a network response message fed back by one or more network devices. Each network device may feed back one or more network response messages.

In this step, the timestamp of the network device receiving the network detection message and the timestamp of the network device sending the network response message may be included in the response content field.

Optionally, the response content field may also include an index of the network response message. For example, the network device determines that N network response messages need to be fed back continuously according to the network detection messages, and the index may be used to indicate that a certain network response message is an ith network response message in the N network response messages.
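A response message with the fields described above might be represented as follows. The class and field names, the 1-based index, and the two derived quantities (time spent on the device and missing indexes) are assumptions added for illustration; the text does not say which features are derived from the timestamps or the index.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class ResponseMessage:
    device_ip: str
    received_at: float  # when the device received the detection message
    sent_at: float      # when the device sent this response message
    index: int          # this message is the index-th of the N expected


def processing_delay(message: ResponseMessage) -> float:
    """Time the device spent between receiving the probe and replying."""
    return message.sent_at - message.received_at


def missing_indexes(messages: Iterable[ResponseMessage], expected: int) -> List[int]:
    """List the indexes (1..expected) for which no response arrived."""
    seen = {m.index for m in messages}
    return [i for i in range(1, expected + 1) if i not in seen]
```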

Step S103, inputting the network topology information and the network response message into a pre-trained network fault detection model to obtain a network fault detection result.

In this step, the network fault detection model may be used at least to detect whether a network device is in a fault state of a preset fault type. The preset fault types of an optical fiber transceiver may include, for example, no signal transmission and abnormal optical power, and the preset fault types of a computer device may include, for example, overload and network configuration faults, which is not limited in the embodiments of the present disclosure.

In this step, inputting the network topology information and the network response message into the pre-trained network fault detection model to obtain the network fault detection result may include: inputting the network topology information and the network response message into the network fault detection model, and using the network fault detection model to infer the fault type corresponding to each network device to obtain the network fault detection result, wherein the network fault detection result comprises, for each network device in the network topology, a target confidence predicted by the network fault detection model that the network device is in each preset fault type.

The network fault detection model may be trained based on an example sample set, where the example sample set may include network response messages of a sample network device and sample network topology information corresponding to the sample network device. Through training on the sample set, the network fault detection model can learn the characteristics of the network response messages fed back by faulty network devices in the network topology, so as to detect whether a network device has a fault.

The training process for the network failure detection model will be described in detail in some embodiments below, and will not be described in detail herein.

Step S104, based on the network fault detection result, determining the fault type of the network equipment with faults in the network topology.

In this step, the first network device may determine whether each network device is in a state of each preset fault type based on the target confidence that each network device is in each preset fault type. For example, if the confidence that a network device is in a first failure type is higher than the detection confidence indicator, then it may be determined that the network device is in the first failure type. Or if the confidence that a certain network device is in the second fault type is lower than the detection confidence index, determining that the network device does not have a corresponding fault.

It may be understood that, if the first network device does not receive the network response message fed back by a certain network device in step S102, it may still be determined that the network device has a fault, and the fault type of the network device may be a preset fault type corresponding to the device type of the network device.
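The decision logic of step S104, including the no-response case, can be sketched as below. The function name, the dictionary layouts, and the `"unreachable"` fallback label are hypothetical; the text says only that a silent device is assigned a preset fault type corresponding to its device type.

```python
from typing import Dict, List


def classify_devices(
    device_types: Dict[str, str],
    confidences: Dict[str, Dict[str, float]],
    detection_confidence_index: float,
    default_fault_by_type: Dict[str, str],
) -> Dict[str, List[str]]:
    """Map every device in the topology to its detected fault types.

    device_types lists all devices known from the topology information;
    confidences holds model output only for devices that responded.
    """
    report: Dict[str, List[str]] = {}
    for device_id, device_type in device_types.items():
        if device_id not in confidences:
            # No response message was received: fall back to the preset
            # fault type for this device type (fallback label is assumed).
            report[device_id] = [default_fault_by_type.get(device_type, "unreachable")]
            continue
        report[device_id] = sorted(
            fault
            for fault, confidence in confidences[device_id].items()
            if confidence > detection_confidence_index
        )
    return report
```

An empty list in the report means the device responded and no preset fault type exceeded the detection confidence index.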

In the embodiment of the invention, the network fault detection model can be trained in advance, the network detection message is sent to each network device in the network topology based on the network topology information, and after the corresponding network response message is received, the network topology information and the network response message are input into the network fault detection model, so that the fault types corresponding to each network device in the network topology are detected by using the network fault detection model, and the accuracy and the efficiency of network fault detection are effectively improved.

In some embodiments, training of the network failure detection model may specifically include the steps of:

(1) Acquiring an example sample set and sample network topology information corresponding to sample network equipment in the example sample set; the sample set comprises a plurality of batches of first samples and a plurality of batches of second samples, wherein each batch of first samples comprises a network response message fed back by the sample network equipment under a preset fault type, and each batch of second samples comprises a network response message fed back by the sample network equipment outside the preset fault type;

In this step, the sample set may be a set of samples, which may include two types of samples, respectively denoted as a first sample and a second sample. In this step, the sample set may be obtained locally or from the cloud, which is not limited in the embodiment of the present invention.

In the embodiment of the present application, a batch of first samples may include network response messages fed back by the sample network device under a preset fault type, where the sample network device may be any network device in the sample network topology, which is not limited in the present application. The preset fault types are known fault types set in advance; they can be set flexibly as needed, and their number may be greater than or equal to two. A batch of first samples may include at least one network response message, each fed back by the sample network device while in a certain preset fault type state. A batch of second samples may include network response messages fed back by the sample network device outside the preset fault types, for example, network response messages fed back by the sample network device when it has a fault of unknown type or has no fault of any preset fault type. It should be noted that, for samples of the same batch, the sample network device corresponding to the network response messages is preferably the same, and the sample network topology information is also the same. The sample network device and sample network topology information corresponding to different batches of samples may be the same or different.

(2) And inputting the sample network topology information and the first sample into a network fault detection model to perform model parameter iteration, and determining a first iteration cost parameter of the model parameter iteration.

In this step, for the obtained sample set, the sample network topology information and the first sample may be input into the network fault detection model to perform model parameter iteration, and the iteration cost parameter of the model parameter iteration process may be determined and recorded as the first iteration cost parameter.

(3) Inputting sample network topology information and a second sample into a network fault detection model, and reasoning fault types corresponding to sample network equipment in the second sample by using the network fault detection model to obtain a first fault detection result; the first fault detection result includes a first confidence that the network fault detection model predicts that the sample network device is in each of the preset fault types.

In this step, the sample network topology information and the second sample in the sample set are also input into the network fault detection model, and the network fault detection model is used to infer the fault type corresponding to the sample network device in the second sample, so as to obtain a fault detection result, which is recorded as the first fault detection result. Specifically, in the embodiment of the present invention, the data format of the first fault detection result may be a matrix whose number of elements equals the number of preset fault types, where the value of each element represents the confidence with which the network fault detection model infers that the sample network device is in the corresponding preset fault type.

It may be understood that, in the embodiment of the present invention, the network fault detection model predicts a first confidence for each preset fault type of the sample network device in the second sample. A higher confidence for a certain preset fault type indicates that the network fault detection model considers the sample network device in the second sample more likely to be in that preset fault type; conversely, a lower confidence indicates that the model considers the sample network device less likely to be in that preset fault type.

In the embodiment of the invention, a detection confidence index can be preset. When the first confidence corresponding to a certain preset fault type is greater than the detection confidence index, the network fault detection model can be considered to judge that the fault type corresponding to the sample network device in the second sample is a known fault type, that is, one of the preset fault types; in this case, the fault type corresponding to the sample network device is determined as the preset fault type with the largest first confidence. In contrast, if the first confidences corresponding to all the preset fault types are smaller than or equal to the detection confidence index, the network fault detection model can be considered to judge that the sample network device in the second sample has an unknown fault type or no fault, that is, its fault type is not one of the preset fault types; in this case, the sample network device can be treated as having an unknown fault type or no fault.
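The rule in the paragraph above amounts to a thresholded arg-max over the preset fault types. A minimal sketch follows; the function name and the use of `None` to stand for "unknown fault type or no fault" are illustrative choices, not taken from the text.

```python
from typing import Dict, Optional


def decide_fault_type(
    first_confidences: Dict[str, float], detection_confidence_index: float
) -> Optional[str]:
    """Return the most confident preset fault type, or None if none qualifies.

    None stands for "unknown fault type or no fault": every confidence is
    smaller than or equal to the detection confidence index.
    """
    best_type = max(first_confidences, key=first_confidences.get)
    if first_confidences[best_type] > detection_confidence_index:
        return best_type
    return None
```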

(4) A second iteration cost parameter of the model parameter iteration is determined based on the first confidence and the predefined detection confidence indicator.

In this step, after the first fault detection result corresponding to the second sample is obtained, the iteration cost parameter contributed by the second sample to the model parameter iteration, namely the second iteration cost parameter, may be determined based on the first confidence in the first fault detection result and the predefined detection confidence index.

In the embodiment of the application, the second iteration cost parameter for the model parameter iteration on the second sample can be determined based on the values of the first confidence and the detection confidence index. It will be appreciated that if the first fault detection result of the second sample obtained by current prediction meets the condition that every first confidence is less than or equal to the detection confidence index, the second iteration cost parameter corresponding to the second sample may be determined to be a smaller value, for example 0. Conversely, if the result does not meet this condition, that is, at least one first confidence is greater than the detection confidence index, the second iteration cost parameter may be determined to be a larger value, for example a value greater than zero; the application does not limit its specific size.

In the embodiment of the invention, if the first fault detection result of the second sample obtained by current prediction does not meet the condition that every first confidence is less than or equal to the detection confidence index, the second iteration cost parameter corresponding to the second sample can be determined based on the margin by which the first confidence exceeds the detection confidence index. When the first confidence that is greater than the detection confidence index exceeds it by only a small margin, the second iteration cost parameter can be determined to be a value greater than zero but relatively small; when it exceeds the detection confidence index by a larger margin, the second iteration cost parameter can be determined to be a correspondingly larger value greater than zero.

(5) Iterating the model parameters of the network fault detection model based on the first iteration cost parameter and the second iteration cost parameter to obtain the trained network fault detection model.

In this step, after the first iteration cost parameter and the second iteration cost parameter are obtained, parameter iteration of the network fault detection model can be realized based on the two iteration cost parameters, so as to obtain the network fault detection model after training is completed.

Specifically, in the embodiment of the invention, influence adjustment can be performed on the first iteration cost parameter and the second iteration cost parameter to obtain a target iteration cost parameter, and the model parameters of the network fault detection model are iterated with a back propagation algorithm based on the target iteration cost parameter. The influence assigned to the first iteration cost parameter and to the second iteration cost parameter may be set flexibly as required. In some embodiments, the two iteration cost parameters may be given the same influence. In other embodiments, the numbers of batches of first samples and of second samples in the example sample set may first be counted, the number of batches of first samples being recorded as the first batch number and the number of batches of second samples as the second batch number. The influence corresponding to the first iteration cost parameter is then determined based on the first batch number and recorded as the first influence index, and the influence corresponding to the second iteration cost parameter is determined based on the second batch number and recorded as the second influence index. The influence of each type of sample is proportional to its number of batches, and the influences of the two types of samples sum to 1.
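The batch-proportional influence setting can be illustrated with a short sketch. The source states only that each influence is proportional to the number of batches and that the influences sum to 1; combining the two cost parameters as a weighted sum, and the names `influence_indices` and `target_iteration_cost`, are assumptions made for illustration.

```python
from typing import Tuple


def influence_indices(first_batches: int, second_batches: int) -> Tuple[float, float]:
    """First and second influence indices, proportional to batch counts, summing to 1."""
    total = first_batches + second_batches
    if first_batches < 0 or second_batches < 0 or total == 0:
        raise ValueError("batch counts must be non-negative and not both zero")
    return first_batches / total, second_batches / total


def target_iteration_cost(first_cost: float, second_cost: float,
                          first_batches: int, second_batches: int) -> float:
    w1, w2 = influence_indices(first_batches, second_batches)
    # Assumption: the "influence adjustment" is taken to be a weighted sum.
    return w1 * first_cost + w2 * second_cost


# 30 batches of first samples and 10 of second samples give influences 0.75 and 0.25.
print(target_iteration_cost(2.0, 4.0, first_batches=30, second_batches=10))  # 2.5
```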

In the embodiment of the invention, the parameter iteration of the network fault detection model can be performed in a loop. After one round of parameter iteration, the network fault detection model with the updated parameters is used again for reasoning, new iteration cost parameters are determined, and the parameters of the network fault detection model are iterated again. These steps are repeated until a preset end-of-training condition is met, at which point training is considered complete and the trained network fault detection model is obtained.
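The loop described above can be sketched in a few lines. The source does not say what the preset end-of-training condition is; this sketch assumes it is either a maximum number of rounds or the cost falling below a tolerance. The single scalar parameter and `toy_round` are hypothetical stand-ins for the real model update, used only so the loop can run.

```python
from typing import Callable, Tuple


def iterate_until_done(params: float,
                       one_round: Callable[[float], Tuple[float, float]],
                       max_rounds: int = 100,
                       cost_tolerance: float = 1e-6) -> Tuple[float, int]:
    """Repeat 'update parameters, re-evaluate cost' until the assumed end condition."""
    rounds_done = 0
    for rounds_done in range(1, max_rounds + 1):
        # One round: update the parameters, then infer again to get the new cost.
        params, cost = one_round(params)
        if cost < cost_tolerance:  # assumed end-of-training condition
            break
    return params, rounds_done


def toy_round(p: float) -> Tuple[float, float]:
    # Hypothetical stand-in: one gradient step on the cost (p - 3)^2.
    new_p = p - 0.1 * 2.0 * (p - 3.0)
    return new_p, (new_p - 3.0) ** 2


final_params, rounds = iterate_until_done(0.0, toy_round)
print(round(final_params, 2), rounds)
```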

It can be understood that, in the training method of the network fault detection model provided in the embodiment of the present application, an example sample set is obtained, where the example sample set includes a plurality of batches of first samples and a plurality of batches of second samples. Each batch of first samples includes network response messages fed back by the sample network device under a preset fault type, that is, network response messages corresponding to a known fault type; each batch of second samples includes network response messages fed back by the sample network device outside the preset fault types, that is, network response messages corresponding to no fault or an unknown fault type. The sample network topology information and the first sample are then input into the network fault detection model to carry out model parameter iteration, and a first iteration cost parameter of the model parameter iteration is determined. The sample network topology information and the second sample are input into the network fault detection model to obtain the first confidence with which the model predicts that the sample network device is in each preset fault type. For the second sample, it is desired that the network fault detection model does not classify it into any preset fault type, so the predefined detection confidence index can be used to judge whether the first confidence is too large, and a second iteration cost parameter of the model parameter iteration is determined accordingly. The model parameters of the network fault detection model can therefore be iterated based on the first iteration cost parameter and the second iteration cost parameter to obtain the trained network fault detection model.
According to the technical scheme, the network fault detection model is trained by using two types of samples, so that the accuracy of the network fault detection model obtained by training can be effectively improved, and the application of accurate and reliable network fault detection is facilitated.

Specifically, in one possible implementation manner, the sample set further includes pre-labeling information corresponding to the first sample, where the pre-labeling information is used to indicate a real fault type corresponding to the sample network device in the first sample; inputting the sample network topology information and the first sample into a network fault detection model for model parameter iteration, and determining a first iteration cost parameter of the model parameter iteration, wherein the method comprises the following steps:

Inputting the sample network topology information and the first sample into a network fault detection model, and reasoning the fault type corresponding to the sample network equipment in the first sample by using the network fault detection model to obtain a second fault detection result; the second fault detection result comprises a second confidence coefficient of the network fault detection model for predicting that the sample network equipment is in each preset fault type;

and determining a first iteration cost parameter of the model parameter iteration based on the pre-labeling information and the second fault detection result.

The network fault detection model does not distinguish between the first sample and the second sample, and the fault detection results it outputs have the same form. Therefore, in the embodiment of the present invention, when the first sample is used to iterate the model parameters, the sample network topology information and the first sample are input to the network fault detection model, and a fault detection result with the same form as the first fault detection result is obtained and recorded as the second fault detection result. The second fault detection result comprises the confidence with which the network fault detection model predicts that the sample network device of the first sample is in each preset fault type, recorded as the second confidence. The second fault detection result, the meaning of the second confidence, and the subsequent manner of determining the fault type of the sample network device of the first sample based on the second confidence may be implemented by referring to the reasoning process for the second sample in the foregoing embodiment, and are not described in detail here.

In the embodiment of the invention, the sample set can comprise pre-labeling information corresponding to the first sample, and the pre-labeling information can be used for indicating the real fault type corresponding to the sample network equipment in the first sample.

In the embodiment of the invention, after the second fault detection result corresponding to the first sample is obtained, the first iteration cost parameter corresponding to the first sample can be determined by using the pre-labeling information. Specifically, the pre-labeling information characterizes the real fault type corresponding to the sample network device in the first sample, and the second fault detection result includes the second confidence with which the network fault detection model predicts that the sample network device is in each preset fault type. Based on the difference between the real fault type and the fault detection result, the prediction accuracy of the network fault detection model can be evaluated, so as to determine the first iteration cost parameter. Here, a loss function may be selected to determine the first iteration cost parameter based on the pre-labeling information and the second fault detection result. The loss function may be, for example, a 0-1 loss function, a square loss function, an absolute loss function, a logarithmic loss function, a cross entropy loss function, or any other loss function usable for an artificial intelligence model; these are not described further here, and the embodiment of the present invention does not limit the type of loss function used.
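As one concrete instance of the loss functions listed above, the following sketch computes a cross entropy cost from the second confidences and the pre-labeled real fault type. The source does not prescribe this choice; the function name `cross_entropy_cost` and the clamping constant `eps` are assumptions.

```python
import math
from typing import Sequence


def cross_entropy_cost(second_confidences: Sequence[float],
                       true_type: int,
                       eps: float = 1e-12) -> float:
    """Cross entropy between a one-hot real fault type and the predicted confidences."""
    # Clamp to avoid log(0) when the model assigns zero confidence to the real type.
    return -math.log(max(second_confidences[true_type], eps))


# The cost is small when the real type (index 1) receives high confidence...
print(cross_entropy_cost([0.1, 0.8, 0.1], true_type=1))
# ...and large when it receives low confidence.
print(cross_entropy_cost([0.8, 0.1, 0.1], true_type=1))
```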

Specifically, in one possible implementation, the network failure detection model includes a first processing unit and a second processing unit; inputting the sample network topology information and the first sample into a network fault detection model, and reasoning the fault type corresponding to the sample network equipment in the first sample by using the network fault detection model to obtain a second fault detection result, wherein the method comprises the following steps of:

Inputting sample network topology information and a first sample into a first processing unit, and executing feature engineering on the first sample by using the first processing unit to obtain a first sample feature vector;

Inputting the first sample feature vector to a second processing unit, and weighting the first sample feature vector by using clustering attention information by using the second processing unit to obtain a first target feature vector; the clustering attention information comprises a plurality of clustering attention distribution vectors, each clustering attention distribution vector corresponds to one preset fault type, and the number of elements of the first target feature vector is equal to the number of the preset fault types;

Carrying out normalized data mapping on the first target feature vector by using a preset normalization function to obtain a second target feature vector, and taking the second target feature vector as a second fault detection result; the value corresponding to the element in the second target feature vector represents a second confidence that the network fault detection model predicts that the sample network device is in the preset fault type.

In the embodiment of the application, the sample network topology information and the first sample can be input into the first processing unit of the network fault detection model, the first processing unit performs feature engineering on the first sample, and the obtained sample feature vector is recorded as the first sample feature vector. The first sample feature vector may then be input to the second processing unit. The second processing unit may be a fully connected layer that contains the clustering attention information and weights the first sample feature vector; the result of the weighting is a vector, which in this embodiment is recorded as the first target feature vector. The clustering attention information may take the form of a matrix comprising a plurality of cluster attention distribution vectors, each of which corresponds to one preset fault type. Weighting one cluster attention distribution vector with the first sample feature vector yields the value of one element, which is a raw output result. After each cluster attention distribution vector has been weighted with the first sample feature vector, the resulting element values form the first target feature vector. The number of elements of the first target feature vector therefore equals the number of preset fault types, and the magnitude of each element represents the confidence with which the network fault detection model predicts that the sample network device is in the corresponding preset fault type.

It should be noted that, in the obtained first target feature vector, the value corresponding to the element represents the confidence level of the network fault detection model predicting that the sample network device is in the corresponding preset fault type, but the value corresponding to the element may be far greater than 1, and is not suitable to be directly output as the confidence level in the fault detection result. For this, the first target feature vector may be normalized by using a preset normalization function, for example, softmax, to obtain a second target feature vector, and then the second target feature vector is used as a fault detection result, that is, a second fault detection result.
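The weighting and normalization steps can be sketched as follows. The sketch assumes that "weighting" means a dot product between the sample feature vector and each cluster attention distribution vector, as in a bias-free fully connected layer, and uses softmax as the preset normalization function mentioned above. The helper names and the numeric values are illustrative assumptions.

```python
import math
from typing import List, Sequence


def weight_features(feature: Sequence[float],
                    attention: Sequence[Sequence[float]]) -> List[float]:
    """One raw score per preset fault type (assumed: dot product with each attention vector)."""
    return [sum(w * x for w, x in zip(row, feature)) for row in attention]


def softmax(scores: Sequence[float]) -> List[float]:
    """Map raw scores to confidences in (0, 1) that sum to 1."""
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


feature = [2.0, 1.0]                                # first sample feature vector
attention = [[3.0, 0.0], [0.0, 3.0], [1.0, 1.0]]    # three preset fault types
first_target = weight_features(feature, attention)  # raw values, may exceed 1
second_target = softmax(first_target)               # second fault detection result
print(first_target)   # [6.0, 3.0, 3.0]
print(second_target)
```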

Specifically, in one possible implementation manner, determining a first iteration cost parameter of the model parameter iteration based on the pre-labeling information and the second fault detection result includes:

Determining a first vector included angle between the first sample feature vector and the first attention index, and determining a second vector included angle between the first sample feature vector and each of the second attention indexes;

Determining the angle deviation between the first vector included angle and a predefined target included angle to obtain a first cost-related parameter;

Determining a first iteration cost parameter of the model parameter iteration based on the first cost-related parameter and each second vector included angle;

Wherein the correlation coefficient between the first cost-related parameter and the first iteration cost parameter is negative, and the correlation coefficient between each second vector included angle and the first iteration cost parameter is positive.

In the embodiment of the invention, the second confidence in the second fault detection result is obtained by normalizing, with the preset normalization function, the values of the elements of the first target feature vector, and those values are in turn obtained by weighting the first sample feature vector with the cluster attention distribution vectors. Suppose the pre-labeling information indicates that the real fault type corresponding to the sample network device in the first sample is the first fault type. When the second fault detection result is predicted, the second confidence corresponding to the first fault type should be as large as possible: the closer it is to 1, the better the detection effect of the network fault detection model, and the further it falls below 1, the worse the detection effect. In the embodiment of the present invention, the corresponding first attention index may be determined from the cluster attention distribution vectors, namely the cluster attention distribution vector which, when weighted with the first sample feature vector, yields the value of the element at the first position of the first target feature vector. The value of this cluster attention distribution vector is therefore related to the second confidence corresponding to the first fault type. The cluster attention distribution vectors other than the first attention index may be taken as second attention indexes.

It will be appreciated that when vectors are weighted, the result depends on the moduli of the vectors and on the vector included angle between them. When fault type detection is performed on a given first sample, the first sample feature vector is fixed, so the fault detection result in principle depends only on the vector included angle, and the moduli do not contribute to the fault detection result. Therefore, in practice, the magnitude of the value of each element of the first target feature vector depends on the vector included angle between the first sample feature vector and the corresponding cluster attention distribution vector: the larger the vector included angle between the first sample feature vector and a certain cluster attention distribution vector, the more the network fault detection model tends to give the preset fault type corresponding to that cluster attention distribution vector as the second fault detection result.

In the embodiment of the invention, after the first attention index corresponding to the real fault type and the second attention indexes not corresponding to the real fault type are determined, the vector included angle between the first sample feature vector and the first attention index can be determined and recorded as the first vector included angle, and the vector included angle between the first sample feature vector and each second attention index can be determined and recorded as a second vector included angle. It can be understood that the larger the first vector included angle and the smaller the second vector included angles, the more the current network fault detection model tends to output a second fault detection result consistent with the real fault type, and the better its performance; conversely, the smaller the first vector included angle and the larger the second vector included angles, the more the model tends to output a second fault detection result inconsistent with the real fault type, and the poorer its performance.

In some embodiments, a target included angle is further set, the purpose of which is to make the network fault detection model assign a first sample feature vector lying within a certain range of the first attention index to the preset fault type corresponding to the first attention index. Based on the difference between the first vector included angle and the target included angle, a first cost-related parameter can be determined. The correlation coefficient between the first cost-related parameter and the first iteration cost parameter is negative, and the correlation coefficient between each second vector included angle and the first iteration cost parameter is positive.
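One way to obtain a first iteration cost parameter with the stated signs of correlation is sketched below. It rests on several assumptions that are not in the source: the "vector included angle" is quantified by cosine similarity, so that a larger value means closer alignment as the text describes; the "angle deviation" is a plain difference from the target value; and the terms are combined with a scaled softmax cross entropy. The names `cosine`, `first_iteration_cost`, `target` and `scale` are hypothetical.

```python
import math
from typing import Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between u and v (assumed measure of the included angle)."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def first_iteration_cost(feature: Sequence[float],
                         attention: Sequence[Sequence[float]],
                         true_type: int,
                         target: float = 0.2,
                         scale: float = 10.0) -> float:
    sims = [cosine(feature, row) for row in attention]
    # First cost-related parameter: deviation of the first measure from the target.
    first_related = sims[true_type] - target
    logits = [scale * (first_related if k == true_type else s)
              for k, s in enumerate(sims)]
    # Log-sum-exp minus the true-type logit: decreases as first_related grows,
    # increases as any second measure grows.
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_sum - logits[true_type]


attention = [[1.0, 0.0], [0.0, 1.0]]
aligned = first_iteration_cost([1.0, 0.0], attention, true_type=0)
misaligned = first_iteration_cost([1.0, 0.0], attention, true_type=1)
print(aligned < misaligned)  # True
```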

In one possible implementation, the network fault detection model includes a first processing unit and a second processing unit; inputting the sample network topology information and the second sample into the network fault detection model, and reasoning the fault type corresponding to the sample network device in the second sample by using the network fault detection model to obtain a first fault detection result, comprises the following steps:

Inputting the sample network topology information and the second sample into a first processing unit, and executing feature engineering on the second sample by using the first processing unit to obtain a second sample feature vector; inputting the second sample feature vector to a second processing unit, and weighting the second sample feature vector by using clustering attention information by using the second processing unit to obtain a third feature vector; the clustering attention information comprises a plurality of clustering attention distribution vectors, each clustering attention distribution vector corresponds to one preset fault type, and the number of elements of the third feature vector is equal to the number of the preset fault types; carrying out normalized data mapping on the third feature vector by using a preset normalization function to obtain a fourth feature vector, and taking the fourth feature vector as a first fault detection result; the value corresponding to the element in the fourth feature vector represents a first confidence that the network fault detection model predicts that the sample network device is in the corresponding preset fault type.

In the embodiment of the invention, the structure of the network fault detection model and the principle and flow of its reasoning also apply to the second sample. Specifically, the sample network topology information and the second sample may be input to the first processing unit of the network fault detection model, the first processing unit performs feature engineering on the second sample, and the obtained sample feature vector is recorded as the second sample feature vector. The second sample feature vector may then be input to the second processing unit. As before, the second processing unit contains the clustering attention information and weights the second sample feature vector; the result of the weighting is a vector, which in this embodiment is recorded as the third feature vector. The clustering attention information may take the form of a matrix comprising a plurality of cluster attention distribution vectors, each of which corresponds to one preset fault type. Weighting one cluster attention distribution vector with the second sample feature vector yields the value of one element, which is a raw output result. After each cluster attention distribution vector has been weighted with the second sample feature vector, the resulting element values form the third feature vector. The number of elements of the third feature vector therefore equals the number of preset fault types, and the magnitude of each element represents the likelihood with which the network fault detection model predicts that the sample network device is in the corresponding preset fault type.

Then, the third feature vector may be subjected to normalized data mapping by using the preset normalization function to obtain a fourth feature vector, and the fourth feature vector is taken as the fault detection result, that is, the first fault detection result. The values of the elements of the fourth feature vector are the first confidences, in the first fault detection result, with which the network fault detection model predicts that the sample network device is in the corresponding preset fault types. Performing normalized data mapping on the third feature vector with the preset normalization function constrains the value of each element to between 0 and 1, giving a concrete value suitable for expressing a confidence.

Specifically, in one possible implementation, determining the second iteration cost parameter of the model parameter iteration based on the first confidence and the predefined detection confidence indicator includes:

Determining a third vector included angle between the second sample feature vector and the attention distribution vector of each cluster; determining a target included angle based on the detection confidence index; determining the angle deviation between the maximum third vector included angle and the target included angle to obtain a second cost related parameter; determining a second iteration cost parameter of the model parameter iteration based on the second cost-related parameter; wherein the correlation coefficients of the second cost correlation parameter and the second iteration cost parameter are positive.

In the embodiment of the invention, when the second iteration cost parameter is determined, in some cases the vector included angle between the second sample feature vector and each cluster attention distribution vector can be determined and recorded as a third vector included angle. The target included angle is positively correlated with the detection confidence index, so the corresponding target included angle can be determined based on the predefined detection confidence index.

In the embodiment of the invention, the largest third vector included angle can be determined, and the angle deviation between this third vector included angle and the target included angle is then determined; the obtained value is recorded as the second cost-related parameter. It can be understood that if the second cost-related parameter is positive, the current network fault detection model predicts that the fault type corresponding to the sample network device in the second sample is a known fault type; if it is negative, the model predicts that the sample network device in the second sample has no fault or an unknown fault type. Therefore, the larger the value of the second cost-related parameter, the more likely the model's prediction for the second sample is wrong and the worse the model performance. The second iteration cost parameter can thus be determined based on the second cost-related parameter, with a positive correlation between the two: the larger the second cost-related parameter, the larger the second iteration cost parameter.

In response to the existence of a third cost-related parameter higher than or equal to zero, determining a parameter size sequence of the third cost-related parameters, and determining the second iteration cost parameter based on the largest third cost-related parameter; or, in response to every third cost-related parameter being below zero, determining that the parameter value of the second iteration cost parameter is zero.

The parameter size sequence of the third cost-related parameters may be arranged from small to large or from large to small according to the values of the respective third cost-related parameters, which is not limited.

In the embodiment of the invention, when the second iteration cost parameter of the model parameter iteration is determined based on the first confidence and the predefined detection confidence index, the deviation between the first confidence of each preset fault type and the detection confidence index can be determined, giving the third cost-related parameter corresponding to each preset fault type. It can be appreciated that when a third cost-related parameter is higher than or equal to zero, the network fault detection model determines the fault type corresponding to the sample network device to be the preset fault type with the largest first confidence; when all third cost-related parameters are smaller than zero, the network fault detection model judges that the sample network device in the second sample has no fault or an unknown fault type. Therefore, in the embodiment of the present invention, in response to a third cost-related parameter having a value higher than or equal to zero, the parameter size sequence of the third cost-related parameters may be determined, and the second iteration cost parameter is determined based on the largest third cost-related parameter; for example, the largest third cost-related parameter may be used directly as the second iteration cost parameter. In response to every third cost-related parameter being below zero, the parameter value of the second iteration cost parameter may be determined to be zero.
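The rule above can be sketched as follows, taking the example in the text in which the largest third cost-related parameter is used directly as the second iteration cost parameter. The deviation is assumed to be the plain difference between each first confidence and the detection confidence index; the function name and the example values are illustrative.

```python
from typing import Sequence


def second_iteration_cost(first_confidences: Sequence[float],
                          detection_index: float) -> float:
    # Third cost-related parameter for each preset fault type
    # (assumed: first confidence minus detection confidence index).
    related = [c - detection_index for c in first_confidences]
    # Parameter size sequence, arranged here from large to small.
    ordered = sorted(related, reverse=True)
    largest = ordered[0]
    # If any parameter is >= 0, use the largest one; otherwise the cost is zero.
    return largest if largest >= 0 else 0.0


# A second sample wrongly pushed towards a preset fault type incurs a positive cost.
print(second_iteration_cost([0.1, 0.7, 0.2], 0.6))
# A second sample kept below the index for every preset fault type incurs no cost.
print(second_iteration_cost([0.3, 0.3, 0.4], 0.6))  # 0.0
```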

Referring now to fig. 2, a schematic diagram of an electronic device 200 suitable for use in implementing embodiments of the present disclosure is shown, which may be, for example, a network device in a method embodiment corresponding to fig. 1. The terminal devices in the embodiments of the present disclosure may include, but are not limited to, mobile terminals such as mobile phones, notebook computers, digital broadcast receivers, PDAs (personal digital assistants), PADs (tablet computers), PMPs (portable multimedia players), in-vehicle terminals (e.g., in-vehicle navigation terminals), and the like, and stationary terminals such as digital TVs, desktop computers, and the like. The electronic device shown in fig. 2 is merely an example and should not be construed to limit the functionality and scope of use of the disclosed embodiments.

As shown in fig. 2, the electronic device 200 may include a processing means 201, and the processing means 201 may be, for example, a central processing unit, an image processor, or the like, which may perform various appropriate actions and processes according to a program stored in a read only memory 202 or a program loaded from a storage means 208 into a random access memory 203. In the random access memory 203, various programs and data necessary for the operation of the electronic device 200 are also stored. The processing means 201, the read only memory 202 and the random access memory 203 are connected to each other by a bus 204. An input/output interface 205 is also connected to the bus 204.

In general, the following means may be connected to the input/output interface 205: input means 206 including, for example, a touch screen, touchpad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, etc.; output means 207 including, for example, a Liquid Crystal Display (LCD), a speaker, a vibrator, and the like; storage means 208 including, for example, magnetic tape, hard disk, etc.; and communication means 209. The communication means 209 may allow the electronic device 200 to communicate with other devices wirelessly or by wire to exchange data. While fig. 2 shows an electronic device 200 having various means, it is to be understood that not all of the illustrated means are required to be implemented or provided. More or fewer means may be implemented or provided instead.

In particular, according to embodiments of the present disclosure, the processes described above with reference to flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a non-transitory computer readable medium, the computer program comprising program code for performing the method shown in the flowchart. In such an embodiment, the computer program may be downloaded and installed from a network via the communication means 209, or installed from the storage means 208 or from the read only memory 202. When the computer program is executed by the processing means 201, the above-described functions defined in the methods of the embodiments of the present disclosure are performed.

It should be noted that the computer readable medium described in the present disclosure may be a computer readable signal medium or a computer readable storage medium, or any combination of the two. The computer readable storage medium can be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this disclosure, a computer-readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device.

A computer-readable signal medium, by contrast, may include a data signal propagated in baseband or as part of a carrier wave, with computer-readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electromagnetic, optical, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.

Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, fiber optic cables, RF (radio frequency), and the like, or any suitable combination of the foregoing.

In some embodiments, clients and servers may communicate using any currently known or future developed network protocol, such as HTTP (HyperText Transfer Protocol), and may be interconnected with any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include a local area network ("LAN"), a wide area network ("WAN"), an internetwork (e.g., the Internet), and peer-to-peer networks (e.g., ad hoc peer-to-peer networks), as well as any currently known or future developed networks.

The computer readable medium may be contained in the electronic device; or may exist alone without being incorporated into the electronic device.

The computer-readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to perform the steps referred to in the embodiments.

Computer program code for carrying out operations of the present disclosure may be written in one or more programming languages, including, but not limited to, object oriented programming languages such as Java, Smalltalk, and C++, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).

The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.

The modules described in the embodiments of the present disclosure may be implemented in software or hardware. The name of a module does not in some cases define the module itself.

The functions described above herein may be performed, at least in part, by one or more hardware logic components. For example, without limitation, exemplary types of hardware logic components that may be used include: a Field Programmable Gate Array (FPGA), an Application Specific Integrated Circuit (ASIC), an Application Specific Standard Product (ASSP), a system on a chip (SOC), a Complex Programmable Logic Device (CPLD), and the like.

In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. The machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.

The foregoing is merely illustrative of the present invention, and the present invention is not limited thereto, and any changes or substitutions easily contemplated by those skilled in the art within the scope of the present invention should be included in the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims

1. The network fault detection method is characterized by being applied to first network equipment, wherein the first network equipment is network equipment in network topology, and comprises the following steps:

Determining the fault type of the network equipment with the fault in the network topology based on the network fault detection result;

The training of the network fault detection model comprises the following steps:

2. The method of claim 1, wherein inputting the network topology information and the network response message into a pre-trained network fault detection model to obtain a network fault detection result comprises:

3. The method of claim 1, wherein the sample set further includes pre-labeling information corresponding to the first sample, the pre-labeling information being used to indicate a true fault type corresponding to the sample network device in the first sample;

4. The method of claim 1, wherein the network fault detection model comprises a first processing unit and a second processing unit; and wherein inputting the sample network topology information and the second sample into the network fault detection model, and inferring the fault type corresponding to the sample network equipment in the second sample by using the network fault detection model to obtain a first fault detection result, comprises:

5. The method of claim 4, wherein determining a second iteration cost parameter for the model parameter iteration comprises:

Determining a target included angle based on the detection confidence index;

6. The method of claim 1, wherein the determining a second iteration cost parameter for a model parameter iteration based on the first confidence level and a predefined detection confidence indicator comprises:

7. The method of claim 1, wherein iterating model parameters of the network fault detection model based on the first iteration cost parameter and the second iteration cost parameter to obtain a trained network fault detection model comprises:

8. A computer readable storage medium, on which a computer program is stored, characterized in that the program, when being executed by a processing device, carries out the steps of the method according to any one of claims 1-7.

9. An electronic device, comprising:

A storage device having a computer program stored thereon;

Processing means for executing said computer program in said storage means to carry out the steps of the method according to any one of claims 1-7.