CN110225540A

CN110225540A - Fault detection method for centralized access network

Info

Publication number: CN110225540A
Application number: CN201910383058.1A
Authority: CN
Inventors: 叶冠文; 王园园; 张宗帅; 孙茜
Original assignee: Beijing Zhongke Polytron Technologies Inc
Current assignee: Beijing Zhongke Polytron Technologies Inc
Priority date: 2019-01-30
Filing date: 2019-05-09
Publication date: 2019-09-10

Abstract

A method for performing anomaly detection on a centralized access network, comprising: 1) for one layer in the network architecture of the centralized access network, detecting abnormal events using a first fault detector corresponding to that layer, wherein the first fault detector is obtained by training with a negative selection algorithm on the operating data of the layer itself; 2) when the first fault detector detects that multiple items of fault information exist for a component, using a second fault detector to perform an intersection operation on the multiple items of fault information of the component so as to determine the fault of the component, wherein one item of fault information is represented as the set of one abnormal event occurring on the component and all abnormal events associated with that abnormal event.

Description

Fault detection method for centralized access network

Technical Field

The present invention relates to fault detection in wireless communication systems, and more particularly to fault detection for centralized access networks.

Background

A centralized access network (C-RAN) is a novel resource management and control system, which creates a large number of virtual base stations in a large-scale centralized resource pool on demand through a uniform and open interface to realize resource sharing among a plurality of virtual base stations. However, for such a shared resource pool, once a problem occurs in the resource pool, multiple base stations associated with the resource pool may fail, thereby affecting the services of access users in a wide range, and even causing the entire network to crash. It is therefore desirable to provide a fault management system for a centralized access network.

Fault detection is the first step of fault management, and its quality directly determines how effective fault management is. Traditional fault detection relies heavily on manual work and on the experience of operation and maintenance personnel, which makes misjudgments and missed detections likely. With the emergence of new services, centralized access network equipment is becoming more complex and networks are growing in scale, which makes accurate and efficient fault detection harder to achieve.

Some fault detection methods that reduce human involvement have been proposed in the art. They can be roughly classified into two categories. 1. Probe-based methods periodically send detection data into the network and judge whether a fault has occurred from the network's response. These methods have a high detection rate, but probe packets must be sent continuously; because the data scale of a centralized access network is huge, the probe traffic adds extra overhead to the system. 2. Methods based on data mining and neural networks train rules or models for fault detection from fault data samples, and the larger the fault sample set, the more accurate the trained rules or models. However, for most devices it is difficult to obtain a large number of fault samples at one time, so such methods are not well suited to a centralized access network.

Disclosure of Invention

Therefore, an object of the present invention is to overcome the above-mentioned drawbacks of the prior art, and to provide a method for performing anomaly detection on a centralized access network, including:

1) aiming at one layer in a network architecture of a centralized access network, a first fault detector corresponding to the layer is adopted for detecting abnormal events, wherein the first fault detector is obtained by adopting a negative selection algorithm for training based on the self operating data of the layer;

2) when the first fault detector detects that a plurality of fault information exist in a component, a second fault detector is adopted to perform intersection operation on the plurality of fault information of the component so as to determine the fault of the component, wherein one fault information is represented as a set of one abnormal event occurring on the component and all abnormal events related to the abnormal event.

Preferably, the method comprises: performing the above step 1) and/or step 2) for one layer in a network architecture of a centralized access network, and in case that a failure exists in a component of the layer, performing the above step 1) and/or step 2) for an adjacent layer of the layer until it is determined that there is no failure in the component of the current layer or all layers are traversed.

Preferably, the method comprises: performing the above step 1) and/or step 2) for a layer other than the lowest layer in the network architecture of the centralized access network, and, in case a failure is found in a component of said layer, performing the above step 1) and/or step 2) for the layer below it, until it is determined that no component of the current layer has a failure or the lowest layer has been examined.

Preferably, according to the method, the operating data used for the negative selection algorithm and the first fault detector are selected, for each of the following layers in the network architecture of the centralized access network, from the corresponding set:

computing resource layer: device temperature, device voltage, CPU (central processing unit) utilization, memory utilization, network interface traffic, and network interface rate;

virtualization layer: virtual CPU utilization, virtual memory utilization, and virtual baseband utilization;

network element layer: number of access users of a virtual base station, uplink rate, downlink rate, signal strength, delay, packet loss rate, and reference signal received power;

network layer: network performance parameters of the area.

Preferably, according to the method, wherein the first fault detector is trained by a method comprising:

I) for one layer in a network architecture of a centralized access network, collecting n-dimensional operating data $D = (z^{(1)}, z^{(2)}, \ldots, z^{(n)})$ of the layer to serve as a self sample set;

II) training with a negative selection algorithm based on the self sample set to obtain a candidate first fault detector, and, if the minimum distance $d_{min}$ between the candidate first fault detector and the self samples is greater than or equal to the affinity radius $r_s$ of the self samples, adding the candidate fault detector to a set of fault detectors;

III) outputting the set of fault detectors when the obtained set of fault detectors meets the set coverage rate, otherwise, repeating the step II).

Preferably, according to the method, wherein the step III) of determining whether the obtained set of fault detectors reaches a set coverage rate includes:

determining, based on $z > z_\alpha$, whether the obtained set of fault detectors has reached the set coverage, where $z = \dfrac{x - np}{\sqrt{np(1-p)}}$, x is the number of the n test samples covered by the set of fault detectors, p is the predetermined coverage, and $z_\alpha$ is the critical value corresponding to the selected significance level α.

Preferably, according to said method, wherein between steps II) and III) it is comprised:

if there is a fault detector in the set whose detection radius $r_{di}$ satisfies $r_{di} > d_{min} - r_s$, excluding the candidate fault detector from the set of fault detectors.

Preferably, according to said method, wherein said self sample set $D = (z^{(1)}, z^{(2)}, \ldots, z^{(n)})$ corresponds to n dimensions determined by performing dimension reduction, through a principal component analysis algorithm, on the m-dimensional operating data selectable for the layer, where $n < m$.

A method of training a fault detector, comprising:

I) collecting n-dimensional operating data $D = (z^{(1)}, z^{(2)}, \ldots, z^{(n)})$ generated by the system in operation to serve as a self sample set;

II) training with a negative selection algorithm based on the self sample set to obtain a candidate fault detector, wherein, if the minimum distance $d_{min}$ between the candidate fault detector and the self samples is greater than or equal to the affinity radius $r_s$ of the self samples, the candidate fault detector is added to a set of fault detectors;

Preferably, according to said method, wherein said self sample set $D = (z^{(1)}, z^{(2)}, \ldots, z^{(n)})$ corresponds to n dimensions determined by performing dimension reduction, through a principal component analysis algorithm, on the selectable m-dimensional operating data, where $n < m$.

A computer-readable storage medium, in which a computer program is stored which, when executed, is adapted to carry out the method of any of the above.

Compared with the prior art, the embodiment of the invention has the advantages that:

1. The invention can continuously and automatically detect abnormal conditions in the centralized access network, which reduces manual involvement and improves the degree of automation of the system.

2. The negative selection algorithm used by the first-stage fault detection only needs samples of normal operating parameters when the anomaly detection model is trained and does not need a large number of fault samples, so the method is easy to implement.

3. The second-stage fault detection of the invention establishes an anomaly association mapping table for anomaly reasoning, so large-scale abnormal conditions can be handled effectively and fault detection efficiency is improved.

Drawings

Embodiments of the invention are further described below with reference to the accompanying drawings, in which:

FIG. 1 is a diagram of a multi-layer network architecture of a super base station according to one embodiment of the present invention;

FIG. 2 is a flow diagram of a method for performing anomaly detection on a centralized access network in accordance with one embodiment of the present invention;

FIG. 3 is a flow diagram of a method for training an anomaly detector using a negative selection algorithm, according to one embodiment of the present invention;

FIG. 4 is a flow diagram of a method of checking detector coverage according to one embodiment of the invention;

FIG. 5 shows test results of the fault detection rate for the scheme of the present invention;

FIG. 6 shows test results of the detection time analysis for the scheme of the present invention.

Detailed Description

When studying fault detection for centralized access networks, the inventor found that current fault detection mechanisms suffer from a low detection rate, heavy manual involvement, and insufficient automation, and therefore proposes to perform fault detection with the Negative Selection Algorithm (NSA) from the field of artificial immunity. NSA borrows the "negative selection" process that immune cells undergo as they mature: anomaly detectors are trained on "self" data and are then used to detect "non-self" conditions, where "self" data refers to normal data and "non-self" refers to abnormal conditions. The method does not need a large number of fault samples for training, can generate anomaly detectors mainly from the network's own normal operating parameters, and does not send data packets into the network, so it avoids adding load to the system. Anomaly detection for centralized access networks can therefore be carried out based on this approach.

However, the abnormality detected by NSA does not necessarily mean that a failure has occurred, and further failure determination is required. In the centralized access network, one fault can be associated with one or more abnormal conditions, and at this time, fault association analysis needs to be performed on a plurality of abnormal conditions to find out a fault source where the abnormality occurs.

In view of the above, the present invention proposes a multi-stage fault detection mechanism (MFDM): first-stage fault detection (anomaly detection) determines whether the network is abnormal, and second-stage fault detection (anomaly association analysis) performs association analysis on the anomalies to find the fault source where the anomaly occurs.

In summary, for the first-stage fault detection, an abnormal detector set based on an artificial immune negative selection algorithm is obtained by training normal data of a centralized access network, and the abnormal detector set obtained by training is used for carrying out abnormal detection on the network to judge whether an abnormal condition occurs.

For the second level of fault detection, an intersection operation is performed based on a plurality of exceptions for a component to determine fault information for the component, wherein an exception is represented as a set of one exception event occurring on the component and all exception events associated with the exception event. The exception event occurring at the component is an exception event occurring at the component as determined by the first level of fault detection.

The present invention will be described in detail below with reference to the accompanying drawings and specific embodiments.

< example 1>

The method of training the first fault detector will be described below by way of one embodiment.

For most centralized access networks, the network architecture of the centralized access network can be abstracted into multiple layers, and each layer has different characteristics, so according to an embodiment of the present invention, the first fault detectors can be trained respectively for each layer in the network architecture of the centralized access network, so as to perform abnormal event detection by using the first fault detector corresponding to the layer when performing abnormal event detection.

For the above embodiment, when the first fault detector is trained, the corresponding operating data of each layer may be selected by combining the characteristics of the layer to be used as a training sample of the negative selection algorithm, and when step 1 is implemented, the to-be-detected data of the same category as the training sample may be selected to perform anomaly detection.

Referring to fig. 1, taking a super base station (a centralized access network) as an example, the network architecture is divided into, from top to bottom:

Network layer: a network area jointly served by a plurality of base stations, such as the network area of Haidian District, Beijing. Based on the characteristics of the network layer, data reflecting the network performance of the area, such as its throughput, packet loss rate, and delay, can be selected as training samples for this layer.

A network element layer: is a variety of virtual base stations established according to user requirements. The single cell of the super base station is equivalent to a traditional base station, such as a base station capable of supporting 2G, 3G and 4G. The virtual base station has all the logic functions of the conventional base station, and is different from the conventional base station in that the virtual base station does not have the physical form of the conventional base station. For the network element layer, the number of access users, uplink and downlink rates, signal strength, time delay, packet loss rate, Reference Signal Received Power (RSRP), and the like can be selected as training samples.

A virtualization layer: underlying physical computing resources such as memory, CPU, and baseband are abstracted, using real-time virtualization technology, into logical computing resources that can be called directly. Most virtualization-layer exceptions are caused by insufficient physical computing resources (migration or newly added physical devices) or by software configuration errors. Thus, for the virtualization layer, the utilization of virtual CPUs, virtual memory, virtual baseband, etc. can be selected as training samples.

Computing resource layer: namely a hardware layer, which corresponds to an external radio frequency unit, a centralized baseband pool, a centralized general server, etc., to provide software and hardware support. The exception of the computing resource layer is mainly caused by hardware, so that the temperature, voltage, CPU, memory utilization rate, network interface flow, speed and the like of hardware equipment can be selected as training samples for the layer.
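The per-layer choice of training data described above can be written down as a small lookup table. The following is only a sketch: the dictionary keys, the metric names, and the helper `select_features` are hypothetical labels chosen for illustration and do not come from the original text.

```python
# Hypothetical mapping from each layer of the architecture to the operating
# data named in the text. Key and metric names are illustrative assumptions.
LAYER_METRICS = {
    "network": ["throughput", "packet_loss_rate", "delay"],
    "network_element": [
        "access_users", "uplink_rate", "downlink_rate", "signal_strength",
        "delay", "packet_loss_rate", "rsrp",
    ],
    "virtualization": ["vcpu_usage", "vmem_usage", "vbaseband_usage"],
    "computing_resource": [
        "temperature", "voltage", "cpu_usage", "memory_usage",
        "nic_traffic", "nic_rate",
    ],
}


def select_features(layer, record):
    """Return the layer's metrics from a raw record, in a fixed order."""
    return [record[name] for name in LAYER_METRICS[layer]]


# Example: one raw record for the virtualization layer (made-up values).
sample = {"vcpu_usage": 0.42, "vmem_usage": 0.61, "vbaseband_usage": 0.18}
print(select_features("virtualization", sample))
```

Keeping the feature order fixed per layer means that training samples and samples to be detected are always of the same category and shape.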

According to one embodiment of the invention, a method of training a first fault detector comprises:

Step 1, for one layer of the network architecture of the centralized access network, collecting n-dimensional operating data $D = (z^{(1)}, z^{(2)}, \ldots, z^{(n)})$ of the layer to serve as the self sample set. Each dimension represents one kind of operating data. Taking the hardware layer as an example, 6 kinds of operating data of a hardware device, namely temperature, voltage, CPU utilization, memory utilization, network interface traffic, and network interface rate, may be selected as training samples for the layer, that is, n = 6.

In one embodiment of the present invention, the managed objects may be configured in the detection agent by an administrator, and the data uploaded for the managed objects by the management and control software, the protocol stack software, and the baseband software may be collected through the Simple Network Management Protocol (SNMP).

For example, for the management and control software: the configuration management module collects the configured information to judge whether the base station configuration entered by the administrator is reasonable; the board management module collects data on hardware such as the related computing resources, for example temperature, voltage, and CPU and memory utilization; the network port management module collects information on network calls (FTP, HTTP, SNMP, etc.), such as uplink and downlink rates, packet loss rate, and TCP/UDP packets; the service management module collects cell-related information, such as the number of users, bandwidth, and cell load of a cell; and the RRU management module collects information on the remote radio units, such as the transmission power, signal strength, and coverage of signals.

For protocol stack software, protocol processing information about protocol stack layer 2(MAC, RLC, PDCP, RRC, etc.) such as packet transmission/reception rate, packet flooding rate, jitter delay, etc. is collected by the software.

For the baseband software, the processing information of the physical layer PHY, such as data volume of the network port, throughput, baseband packet receiving and transmitting rate, power consumption, error rate, etc., is collected by the software.

A centralized access network such as a super base station is divided into multiple layers, each layer has many parameters to be detected, and the corresponding data volume is very large. Redundancy can therefore be removed from the collected parameters, for example by using a linear transformation to reduce the dimensionality of the data, so as to reduce the training time and the complexity of data processing.

In one embodiment of the invention, the parameters measurable for one layer in the super base station network architecture are summarized as m-dimensional parameters $D = (x^{(1)}, x^{(2)}, \ldots, x^{(m)})$ and converted into n-dimensional parameters $D' = (x^{(1)}, x^{(2)}, \ldots, x^{(n)})$ based on principal component analysis (PCA), where $n < m$. In the PCA algorithm, a smaller n shortens the training time but lowers the accuracy, whereas a larger n lengthens the training time and thereby reduces the efficiency of fault detection. The value of n is related to the contribution rate of the data, which is expressed as:

$$p = \frac{\sum_{i=1}^{n} \lambda_i}{\sum_{i=1}^{m} \lambda_i} \qquad (1)$$

where p is the contribution rate of the data and $\lambda_i$ is the i-th eigenvalue, in descending order, of the covariance matrix $XX^T$ of the original samples. Typically, a contribution rate greater than 90% may be considered to cover the original data space well.

The specific implementation of dimensionality reduction by the PCA algorithm can refer to the prior art. Taking self samples obtained with protocol stack software as an example, assume that, according to an embodiment of the present invention, 20 sets of data with m = 5, as shown in Table 1, are captured at the current layer of the network architecture.

TABLE 1 summary of data for Current layer

Based on the PCA algorithm, the data are centered and the corresponding covariance matrix $XX^T$ is calculated, giving the results shown in Table 2.

TABLE 2 Covariance matrix $XX^T$

0.9444	-0.3232	0.0592	0.0138	-0.0008
0.3278	0.9126	-0.2434	-0.0177	-0.0082
0.0035	0.0037	-0.0126	-0.0372	0.9992
-0.0059	0.0330	0.0420	0.9979	0.0093
0.0250	0.2481	0.9671	-0.0491	0.0093

Continuing the PCA algorithm, the eigenvalues and contribution rates corresponding to the above covariance matrix are calculated. In descending order, the eigenvalues are $\lambda = [36.7635, 10.3131, 0.4373, 0.0275, 0.0000]^T$. The total contribution rate of the first 2 eigenvalues is 99.02%, which satisfies the criterion that the contribution rate be greater than 90%, so n = 2 is selected.
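The choice of n from the eigenvalues can be checked with a few lines of code. This is a minimal sketch that assumes the contribution rate is the sum of the n largest eigenvalues divided by the sum of all eigenvalues; the function name `choose_n` and its `threshold` parameter are illustrative and not taken from the original text.

```python
def choose_n(eigenvalues, threshold=0.90):
    """Return the smallest n whose leading eigenvalues exceed the
    contribution threshold, together with the contribution rate reached.

    Assumes at least one eigenvalue is positive.
    """
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    running = 0.0
    for n, lam in enumerate(ordered, start=1):
        running += lam
        if running / total > threshold:
            return n, running / total
    return len(ordered), 1.0


# Eigenvalues quoted in the text for the m = 5 example.
eigenvalues = [36.7635, 10.3131, 0.4373, 0.0275, 0.0000]
n, rate = choose_n(eigenvalues)
print(n, f"{rate:.2%}")  # 2 99.02%
```

The first eigenvalue alone contributes about 77%, which is below the 90% criterion, so the second one is needed.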

Taking the eigenvectors corresponding to the first n = 2 eigenvalues and calculating the principal component vectors based on the PCA algorithm gives the results shown in Table 3.

TABLE 3 Principal component data with n = 2

For convenience of calculation, feature 1 and feature 2 above are normalized to dimensionless indices in the present embodiment by the following equation:

$$x_{norm} = \frac{x - x_{min}}{x_{max} - x_{min}} \qquad (2)$$

where x is the original data, $x_{max}$ and $x_{min}$ are the maximum and minimum values of the original data set, and $x_{norm}$ is the normalized data.

The results of the normalization process on the data in table 3 are shown in table 4.

TABLE 4 normalized data

Thus, a self sample set $D' = (z^{(1)}, z^{(2)})$ (n = 2) for one layer of the network architecture of the centralized access network is obtained.
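The normalization step can be sketched as follows, assuming it is ordinary min-max scaling of each feature column to the range [0, 1]. Because the values of Table 3 are not reproduced here, the input column is made-up illustration data, and the function name `min_max_normalize` is hypothetical.

```python
def min_max_normalize(values):
    """Scale a list of numbers to [0, 1] using its own minimum and maximum."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Made-up principal component column, not the data of Table 3.
feature_1 = [2.0, 4.0, 6.0]
print(min_max_normalize(feature_1))  # [0.0, 0.5, 1.0]
```

Each feature is normalized independently, so the smallest value of a column always maps to 0 and the largest to 1.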

Step 2, training with a negative selection algorithm based on the self sample set to obtain a candidate first fault detector; if the minimum distance $d_{min}$ between the candidate first fault detector and the self samples is greater than or equal to the affinity radius $r_s$ of the self samples, the candidate fault detector is added to the set of fault detectors.

According to one embodiment of the invention, the Euclidean distance can be used to calculate the affinity radius $r_s$ of the self samples. Assuming that the self samples obtained in step 1 are as shown in Table 4, let x = [0.4446, 0] and y = [0.3419, 0.0722]. The distance between the 2 real-valued vectors represents the affinity between them, and the smaller the distance, the better they match. The Euclidean distance is used to calculate the affinity between vector x and vector y:

$$d(x, y) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2} \qquad (3)$$

where $x_i$ and $y_i$ represent the i-th components of vectors x and y, respectively, and n is the number of parameter types.
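As a quick numerical check, the distance between the two example vectors can be computed directly. The sketch below assumes the standard Euclidean distance (square root of the summed squared component differences); the helper name `euclidean` is illustrative.

```python
import math

# The two normalized self samples quoted in the text.
x = [0.4446, 0.0]
y = [0.3419, 0.0722]


def euclidean(a, b):
    """Square root of the summed squared component differences."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


d = euclidean(x, y)
print(round(d, 4))  # 0.1255
```

The same result is returned by the standard library call `math.dist(x, y)` in Python 3.8 and later.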

FIG. 3 illustrates a process for training an anomaly detector using a negative selection algorithm according to one embodiment of the present invention. Referring to FIG. 3, the training process includes: randomly generating a detector and computing its minimum distance $d_{min}$ to the self samples (the distance can be calculated with the Euclidean distance formula). If $d_{min} < r_s$, the detector is rejected; if $d_{min} > r_s$, the detector can be used as a candidate detector, and the corresponding candidate detection radius is $L = d_{min} - r_s$. Let the detection radii of the existing anomaly detectors be $r_{di}$, where i is the index of the anomaly detector. To reduce the overlap between detectors, L can also be compared with every $r_{di}$: if there is an $r_{di}$ with $L < r_{di}$, the candidate detector is discarded; otherwise, the candidate detector is added to the mature detector set.

By this step, a set for the first fault detector may be obtained.
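The training loop of FIG. 3 can be sketched as below. This is a minimal illustration, not the implementation of the invention: it assumes samples normalized to the unit hypercube, accepts a candidate when its minimum distance to the self samples is greater than or equal to $r_s$, and applies the radius comparison literally as described in the text. The function names, the `max_detectors` and `max_tries` limits, and the rule used in `is_anomalous` (a sample is flagged when it falls inside any detector's radius) are assumptions made for the example.

```python
import math
import random


def train_detectors(self_samples, r_s, max_detectors=50, max_tries=10000, seed=0):
    """Generate (center, radius) detectors by negative selection."""
    rng = random.Random(seed)
    dim = len(self_samples[0])
    detectors = []
    for _ in range(max_tries):
        if len(detectors) >= max_detectors:
            break
        # Random candidate in the unit hypercube (assumes normalized data).
        candidate = [rng.random() for _ in range(dim)]
        d_min = min(math.dist(candidate, s) for s in self_samples)
        if d_min < r_s:
            continue  # too close to a self sample: reject
        radius = d_min - r_s  # candidate detection radius L
        # Radius comparison as stated in the text: discard the candidate
        # if any existing detector has a larger detection radius.
        if any(radius < r_di for _, r_di in detectors):
            continue
        detectors.append((candidate, radius))
    return detectors


def is_anomalous(sample, detectors):
    """Assumed detection rule: inside any detector's radius means non-self."""
    return any(math.dist(sample, center) < radius for center, radius in detectors)


# Two self samples from the text plus one made-up sample.
self_samples = [[0.4446, 0.0], [0.3419, 0.0722], [0.5, 0.1]]
detectors = train_detectors(self_samples, r_s=0.05)
print(len(detectors), is_anomalous([0.95, 0.95], detectors))
```

Because every detector radius is its distance to the nearest self sample minus $r_s$, no self sample can fall inside a detector as long as $r_s$ is positive.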

Step 3, outputting the set of fault detectors when the obtained set of fault detectors meets the set coverage rate, and otherwise repeating step 2.

In the present embodiment, the stop condition for generating the detector set is that the detectors reach a predetermined coverage. The coverage of the current set of fault detectors may be estimated from randomly drawn test samples using the following equation:

$$\hat{p} = \frac{x}{n} \qquad (4)$$

where $\hat{p}$ is the estimated coverage, x is the number of test samples covered by the detectors, and n is the number of test samples taken.

Assuming that p is the predetermined coverage and σ is the standard deviation, then according to the central limit theorem, when the number of test samples n is large enough, the error z of the estimated coverage $\hat{p}$ can be approximated as following a standard normal distribution:

$$z = \frac{\hat{p} - p}{\sigma}, \quad \sigma = \sqrt{\frac{p(1-p)}{n}} \qquad (5)$$

Based on expressions (4) and (5), the following expression can be derived:

$$z = \frac{x - np}{\sqrt{np(1-p)}} \qquad (6)$$

When the error z of the coverage determined by formula (6) satisfies $z > z_\alpha$, the hypothesis that the coverage p has been reached is accepted and training stops; when $z < z_\alpha$, training continues. Here α is the significance level, and a smaller α indicates a more accurate result. Typically the significance level is chosen as α = 0.05, so the confidence level is 1 - α = 0.95; $z_\alpha$ is the value of z corresponding to this confidence level, which can be found from the standard normal distribution table to be 1.645.

Referring to FIG. 4, a desired coverage p, a significance level α, and a number of test samples n are selected as needed, and n test samples are drawn at random. For each of the n test samples it is determined whether the sample is covered by a detector, and the z value is calculated from the result based on equation (6). It is then determined whether z is greater than the $z_\alpha$ determined by the significance level α: if so, the desired coverage is considered reached; otherwise, it is considered not reached and training of the first fault detector continues.
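The coverage check of FIG. 4 can be sketched as a one-sided test on a sample proportion. The sketch assumes that equation (6) has the usual form $z = (x - np)/\sqrt{np(1-p)}$, which follows from the central limit theorem argument in the text; the function name `coverage_reached` and the sample counts in the usage lines are illustrative.

```python
import math
from statistics import NormalDist


def coverage_reached(x, n, p, alpha=0.05):
    """Test whether x covered samples out of n support coverage p.

    Returns (decision, z, z_alpha). Assumes 0 < p < 1 and n large enough
    for the normal approximation to hold.
    """
    z = (x - n * p) / math.sqrt(n * p * (1 - p))
    # Critical value of the standard normal distribution for level alpha.
    z_alpha = NormalDist().inv_cdf(1 - alpha)
    return z > z_alpha, z, z_alpha


# Made-up counts: 1000 test samples, target coverage 90%.
print(coverage_reached(920, 1000, 0.90))  # stop training
print(coverage_reached(905, 1000, 0.90))  # keep training
```

With α = 0.05 the computed critical value is approximately 1.645, matching the value read from the standard normal distribution table.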

By the embodiment 1, a set of first fault detectors for a layer in a network architecture of a centralized access network can be obtained based on data of self operation of the layer, and comprehensive coverage of self samples is satisfied based on a plurality of fault detectors.

< example 2>

According to embodiment 2 of the present invention, there is provided a method for performing anomaly detection on a centralized access network, and referring to fig. 2, the method includes:

Step 1, for a centralized access network, a first fault detector is used to detect abnormal events, wherein the first fault detector is obtained by training with a negative selection algorithm on the operating data of the centralized access network itself.

According to an embodiment of the invention, the first fault detector is trained using the method of embodiment 1.

The inventor has found that faults in the layers follow a rule: if the current layer has a fault, the layer below it is very likely to be abnormal as well. For example, if an index of the network layer is abnormal, the network element layer below it usually has an abnormality too. Anomaly detection performed on a higher layer alone therefore often cannot locate the fault point directly, and the lower layers must also be analyzed. For example, according to an embodiment of the present invention, when an index of the network layer is abnormal, fault detection is performed on the network element layer to find the corresponding abnormal network element (virtual base station service segment); the virtual resources corresponding to that network element are then found through the mapping relationship of the virtualization layer to determine whether the cause is a software fault of the virtualization layer; if not, fault detection continues on the lower layer until the fault source is found. Because the number of devices and indexes to be detected grows toward the bottom layer, and gradually narrowing the fault range enables fast fault location, it is preferable to perform fault detection from high to low across the layers of the network architecture of the centralized access network: when a failure is found in a component of a higher layer, failure detection is performed for the lower layer, until it is determined that no component of the current layer has a failure or all layers have been traversed. However, it is to be understood that in other embodiments of the present invention, fault detection may also be performed from low to high across the layers.
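The high-to-low traversal can be sketched as a simple loop over the layers. In this illustration the layer names and the `detect_layer` callback are hypothetical: the callback stands in for running the first and second fault detectors on one layer and is assumed to return the list of faulty components found there.

```python
# Layers from top to bottom; the names are illustrative labels.
LAYERS = ["network", "network_element", "virtualization", "computing_resource"]


def locate_fault(detect_layer, layers=LAYERS):
    """Walk the layers top-down and stop at the first layer without a fault.

    detect_layer(layer) is a hypothetical callback returning the faulty
    components of that layer (an empty list means no fault was found).
    """
    findings = []
    for layer in layers:
        faulty = detect_layer(layer)
        if not faulty:
            break  # no fault here: the previous layer holds the fault source
        findings.append((layer, faulty))
    return findings


# Made-up detection results used only to exercise the loop.
observed = {
    "network": ["area-1"],
    "network_element": ["vbs-3"],
    "virtualization": [],
}
print(locate_fault(lambda layer: observed.get(layer, [])))
```

The last entry of the returned list is the lowest layer at which a fault was still observed, which narrows the search before the larger number of devices at the bottom layers has to be examined.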

Step 2, using a second-stage fault detector (also called an intersection operation detector) to perform an intersection operation on a plurality of detected anomalies of one component to determine fault information of the component, wherein one anomaly is represented as the set of one abnormal event occurring on the component and all abnormal events related to that abnormal event.

According to an embodiment of the present invention, where the first fault detector is used for abnormal event detection in step 1, step 2 may further perform association analysis based on the abnormal events on the component detected by the first-stage fault detection. Assume that the failures that may occur in one layer of the centralized access network are numbered F₁, …, Fᵢ (i is the total number of failures), and that the first fault detector corresponding to that layer detects a failure F₁. The abnormal event F₁ may be caused by abnormal events F₂ and F₃, so the detected anomaly can be denoted as F₁F₂F₃. The first fault detector may detect more than one failure; each of them is represented in the manner described above, and an intersection operation is then performed on all the anomalies to determine a set of abnormal events from the result.

Taking the computing resource layer as an example, suppose that the layer has 4 kinds of failures, numbered as follows: F₁ indicates the event "temperature too high", F₂ indicates the event "memory utilization too high", F₃ indicates the event "network interface overload", and F₄ indicates the event "device overload". This layer has 3 devices: B₁ represents baseband processing unit 1, S₁ represents general server 1, and R₁ represents radio frequency unit 1. In addition, at this layer, excessive memory utilization and network interface overload can cause the temperature to be too high, and network interface overload can cause the memory utilization to be too high. An abnormality of a device can therefore be represented as follows: the abnormality that the temperature of server 1 is too high is represented as S₁F₁F₂F₃, the abnormality that the memory utilization of server 1 is too high is represented as S₁F₂F₃, and the abnormality that the network interface of server 1 is overloaded is represented as S₁F₃.

If step 1 detects that server 1 has both an over-temperature anomaly and an excessive-memory-utilization anomaly, the two anomalies are intersected: S₁F₁F₂F₃ ∩ S₁F₂F₃ = S₁F₂F₃. Based on this operation, it can be determined that the fault is excessive memory utilization on server 1 (S₁F₂F₃).
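The intersection step in the server example can be sketched in Python as follows. This is only an illustrative sketch: the `CAUSES` table encodes the causal relations stated for the computing resource layer (the empty entry for F₄ is an assumption, since the text gives no causes for it), and the names `anomaly_set` and `locate_fault` are hypothetical. Matching the intersection result back to a single event is one possible reading of how the fault is identified from the result.

```python
from functools import reduce

# Causal relations from the computing resource layer example:
# each event maps to the events that can give rise to it.
CAUSES = {
    "F1": {"F2", "F3"},  # temperature too high
    "F2": {"F3"},        # memory utilization too high
    "F3": set(),         # network interface overload
    "F4": set(),         # device overload (no causes stated; assumption)
}


def anomaly_set(event):
    """Return the event together with all events associated with it."""
    seen = set()
    stack = [event]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(CAUSES[current])
    return frozenset(seen)


def locate_fault(detected_events):
    """Intersect the anomaly sets of all events detected on one component.

    Returns the intersection and the events whose own anomaly set equals it.
    Expects at least one detected event.
    """
    sets = [anomaly_set(e) for e in detected_events]
    common = reduce(frozenset.intersection, sets)
    matches = [e for e in CAUSES if anomaly_set(e) == common]
    return common, matches


# Server 1 shows both over-temperature (F1) and high memory utilization (F2).
common, matches = locate_fault(["F1", "F2"])
print(sorted(common), matches)
```

For the two anomalies S₁F₁F₂F₃ and S₁F₂F₃, the intersection is {F₂, F₃}, which is exactly the anomaly set of F₂, so the sketch reports excessive memory utilization as the fault.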

Based on embodiment 1, multistage fault detection can be performed on the centralized access network. The detection method is implemented by the two-stage fault detectors of the MFDM, which reduces the amount of manual intervention and raises the degree of automation of the system. The first-stage fault detector is obtained by negative selection algorithm training on data generated by the normal operation of the network architecture of the centralized access network, so a large number of training samples need not be specially prepared. Because a first-stage fault detector is trained separately for each layer of the network architecture, the number of training samples required is reduced, the uncertainty introduced by training on samples collected across the whole network is reduced, and the specificity and detection accuracy of the trained first-stage fault detector are improved. The second fault detector establishes a direct association mapping among abnormal events: an anomaly is expressed as the set of one abnormal event occurring on the component and all abnormal events associated with it. When the first fault detector detects multiple anomalies on one component, the second fault detector is used to determine the fault of that component, so that large-scale abnormal conditions can be handled and fault detection efficiency is improved.

< Performance test >

To verify the effectiveness of the scheme of the invention for super base station fault detection, the inventors carried out tests. The evaluation indices are the training time, the fault detection rate, and the detection time of the first fault detector. Two experiments were performed: (1) The training time of EFDM was analyzed by simulation. Training time depends on the configured coverage of the anomaly detectors and on the size of the data set, so training times for data sets of different sizes were analyzed under 3 coverage settings: 90%, 95%, and 99%. (2) The detection rate and detection time of EFDM were verified using 100 injected faults and 1000 parameter records containing fault data.

Table 5 shows the parameter indices of the test experiments.

TABLE 5 Experimental test parameter index

(1) Training time analysis

In the experiment, PCA processing was applied to data from the protocol stack software. To better compare the effect of PCA, the original data were processed into 20 groups of n = 3 dimensional data, the reduced self (autologous) samples were used for detector training, and the training times before and after PCA processing were recorded.

The training time of the EFDM algorithm depends on the coverage of the anomaly detectors. Three cases were selected, with the anomaly detector coverage set to 99%, 95%, and 90%, and the data samples were trained to generate an anomaly detector set in each case. The following averages were obtained over 10 repeated training runs: with the coverage set to 99%, training produced 777 anomaly detectors and the average training time was 2.6509 s; with the coverage set to 95%, training produced 194 anomaly detectors and the average training time was 1.4937 s; with the coverage set to 90%, training produced 54 anomaly detectors and the average training time was 0.4665 s.
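The relationship between the coverage setting and the number of detectors can be illustrated with a simplified negative selection loop. The sketch below is not the EFDM training procedure itself. It assumes data normalized to the unit hypercube, a hypothetical radius range for candidate detectors, and a Monte Carlo coverage estimate over probe points in the non-self region. Only the exclusion rule (a candidate is discarded when its radius exceeds d_min - r_s) follows the condition stated in the claims; the function name and all parameters are illustrative.

```python
import math
import random


def train_detectors(self_samples, r_s, target_coverage, seed=0,
                    max_candidates=5000, n_probes=300):
    """Generate (center, radius) detectors until the estimated coverage is reached."""
    rng = random.Random(seed)
    dim = len(self_samples[0])

    def d_min(point):
        # Distance from a point to its nearest self sample.
        return min(math.dist(point, s) for s in self_samples)

    # Probe points drawn from the non-self region. The fraction of them lying
    # inside some detector serves as the coverage estimate (assumption).
    uncovered = []
    while len(uncovered) < n_probes:
        p = [rng.random() for _ in range(dim)]
        if d_min(p) > r_s:
            uncovered.append(p)

    detectors = []
    coverage = 0.0
    for _ in range(max_candidates):
        if coverage >= target_coverage:
            break
        center = [rng.random() for _ in range(dim)]
        radius = rng.uniform(0.05, 0.25)  # hypothetical radius range
        if radius > d_min(center) - r_s:
            continue  # candidate overlaps the self region, so it is excluded
        detectors.append((center, radius))
        # Drop the probe points that the new detector covers.
        uncovered = [p for p in uncovered if math.dist(p, center) > radius]
        coverage = 1.0 - len(uncovered) / n_probes
    return detectors, coverage


# Hypothetical self samples clustered around the centre of the unit square.
data_rng = random.Random(1)
self_samples = [[0.5 + data_rng.uniform(-0.1, 0.1) for _ in range(2)]
                for _ in range(30)]
for target in (0.90, 0.95, 0.99):
    dets, cov = train_detectors(self_samples, r_s=0.05, target_coverage=target)
    print(target, len(dets), round(cov, 3))
```

With a fixed seed, a higher coverage target continues the same candidate sequence for longer, so it never yields fewer detectors than a lower target. This mirrors the qualitative trend reported above, although the absolute counts and times of this toy setup are unrelated to the experimental figures.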

To analyze the effect of different values of n on training time, the inventors also measured the training times for n = 2 and n = 5. The results are shown in Table 6.

TABLE 6 training time comparison

The experimental results show the following. (i) With other conditions unchanged, the number of anomaly detectors generated increases as the anomaly detector coverage increases, and the corresponding training time becomes longer. When the coverage is 90%, the training time is short, but the region not covered by the detectors is large and the reliability of anomaly detection cannot be guaranteed; when the coverage is set to 99%, the training time is relatively long. The experimental results therefore indicate that setting the anomaly detector coverage to 95% is a reasonable choice. (ii) As the data dimensionality decreases, the training time of the algorithm also decreases. (iii) EFDM substantially reduces the training time of the algorithm, by more than 20%.

(2) Analysis of fault detection rate and detection time

To test the fault detection rate, the inventors selected the anomaly detector set trained at 95% coverage and fed it known test sets of 100, 200, 300, 400, 500, 600, 700, 800, 900, and 1000 samples (of which anomalous data account for 50%), measuring the fault detection rate and detection time of EFDM with the data reduced to 2 and 3 dimensions. Here, the detection rate is the probability that anomalous data are detected, and the detection time is the time taken to run detection on the data. The results, averaged over several tests, are shown in Figs. 5 and 6.
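The two metrics defined above can be computed as in the following sketch. It assumes that each anomaly detector is a hypersphere given as a (center, radius) pair and that a sample is flagged as anomalous when it falls inside any detector, which is the usual convention for real-valued negative selection and is not spelled out in this section. The names `is_anomalous` and `evaluate` are hypothetical.

```python
import math
import time


def is_anomalous(sample, detectors):
    """A sample is flagged when it lies inside at least one detector."""
    return any(math.dist(sample, center) <= radius
               for center, radius in detectors)


def evaluate(detectors, samples, labels):
    """Return (detection rate, detection time in seconds).

    labels[i] is True when samples[i] is known to be anomalous.
    The detection rate is the fraction of known anomalies that were flagged.
    """
    start = time.perf_counter()
    flags = [is_anomalous(s, detectors) for s in samples]
    elapsed = time.perf_counter() - start

    n_anomalous = sum(labels)
    detected = sum(1 for flag, label in zip(flags, labels) if flag and label)
    rate = detected / n_anomalous if n_anomalous else 0.0
    return rate, elapsed


# Tiny hypothetical example: one detector and four labelled samples.
detectors = [((0.0, 0.0), 0.5)]
samples = [(0.1, 0.1), (2.0, 2.0), (0.9, 0.9), (0.2, 0.0)]
labels = [True, True, False, True]
rate, elapsed = evaluate(detectors, samples, labels)
print(rate, elapsed)
```

In this toy example two of the three known anomalies lie inside the detector, so the detection rate is 2/3. The measured time grows with both the number of samples and the number of detectors, which is why reducing the detector set or the data dimensionality shortens detection.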

As can be seen from the detection rate curves in Fig. 5, when the number of test samples exceeds 500, the fault detection rates before and after PCA processing both stabilize. The fault detection rate after PCA processing is slightly lower than without PCA processing but remains within an acceptable error range; a possible reason is that data compression introduces data errors. Comparing the detection time curves in Fig. 6, when the number of test samples is below 400, the detection times before and after PCA processing differ only slightly, whereas when the number of test samples exceeds 400, the detection time on the compressed data is significantly shorter than on the uncompressed data. Since the volume of fault detection data on an online super base station is far greater than 400 samples, EFDM can be considered feasible.

Based on the above two tests, EFDM was found to be more effective than the original negative selection algorithm: with the anomaly detector coverage preset to 95%, the training time was reduced by more than 20%. Compared with the ordinary negative selection algorithm, the EFDM provided by the invention reduces detection time and greatly improves fault detection efficiency without affecting detection accuracy.

It should be noted that not all of the steps described in the above embodiments are necessary, and those skilled in the art may make appropriate substitutions, replacements, modifications, and the like according to actual needs.

Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of the present invention and not to limit them. Although the present invention has been described in detail with reference to the embodiments, it will be understood by those skilled in the art that various changes may be made and equivalents may be substituted without departing from the spirit and scope of the invention as defined in the appended claims.

Claims

1. A method of performing anomaly detection for a centralized access network, comprising:

2. The method of claim 1, comprising: performing the above step 1) and/or step 2) for one layer in a network architecture of a centralized access network, and in case that a failure exists in a component of the layer, performing the above step 1) and/or step 2) for an adjacent layer of the layer until it is determined that there is no failure in the component of the current layer or all layers are traversed.

3. The method of claim 2, comprising: performing the above step 1) and/or step 2) for a layer other than the lowest layer in the network architecture of the centralized access network, and, in case a failure is found in a component of said layer, performing the above step 1) and/or step 2) for the layer below said layer, until it is determined that no component of the current layer has a failure or the lowest layer has been examined.

4. The method according to claim 1, wherein, for the negative selection algorithm and the first fault detector, the operating data of the layer itself is selected from the set corresponding to that layer, for each of the following layers in the network architecture of the centralized access network:

network layer: network performance parameters for the area.

5. The method of claim 1, wherein the first fault detector is trained by a method comprising:

6. The method of claim 5, wherein step III) of determining whether the obtained set of fault detectors has reached a set coverage comprises:

7. The method of claim 5, further comprising, between steps II) and III):

8. The method of claim 5, wherein the set of self (autologous) samples is D = (z⁽¹⁾, z⁽²⁾, ..., z⁽ⁿ⁾), corresponding to the n dimensions determined by applying a principal component analysis algorithm to reduce the dimensionality of the selectable m-dimensional operating data of the layer, where n < m.

9. A method of training a fault detector, comprising:

10. The method of claim 9, wherein step III) determining whether the obtained set of fault detectors has reached a set coverage comprises:

11. The method of claim 9, further comprising, between steps II) and III):

if the detection radius r_di of the candidate fault detector satisfies r_di > d_min - r_s, the candidate fault detector is excluded from the set of fault detectors.

12. The method of claim 9, wherein the set of self (autologous) samples is D = (z⁽¹⁾, z⁽²⁾, ..., z⁽ⁿ⁾), corresponding to the n dimensions determined by applying a principal component analysis algorithm to reduce the dimensionality of the selectable m-dimensional operating data of the layer, where n < m.

13. A computer-readable storage medium, in which a computer program is stored which, when being executed, is adapted to carry out the method of any one of claims 1-12.