Disclosure of Invention
The invention aims to solve the problem that fault event positioning systems in the prior art are limited and cannot adapt to dynamic changes. To this end, the invention provides a fault event positioning method based on topology model tracking analysis, which comprises the following steps:
S1, constructing a topology model: constructing a topology model of the monitored system based on its physical connection relations and running state information, wherein each edge in the topology model is assigned a weight that can be dynamically adjusted according to the connection characteristics;
S2, dynamic event tracking: tracking the fault propagation path in the topology model in real time using time-series data of fault propagation, and extracting the causal relationships of fault propagation by analyzing the state changes of each node;
S3, path optimization and tracing: optimizing the fault propagation path by applying a shortest path algorithm, removing noise data and invalid paths, and locking possible fault sources;
S4, outputting the positioning result: outputting the most probable fault source node and its related path information, while providing a confidence evaluation of the positioning result.
Preferably, in S1, the monitored system includes a power grid, a communication network, an industrial control network, a traffic network, a water supply network, a natural gas pipeline network, a computer network, a logistics network, an energy network, an intelligent building system, a rail transit system and a medical network; the connection characteristics include resistance, communication delay, bandwidth, signal attenuation, reliability, power consumption, flow, load capacity, delay jitter, security, physical distance and environmental influence; the topology model is represented by a graph structure, in which the nodes represent devices or functional units in the system and the edges represent the connection relations among the nodes.
Preferably, in S2, the factors on which the dynamic adjustment of the path weight is based include the strength of the signal, the propagation speed, the frequency characteristics of the signal, the attenuation coefficient of the signal, the noise interference level, the environmental conditions, the propagation medium characteristics, the type of fault signal, the directionality of the signal, the response delay of the node to the signal, the reliability or health status of the node, and the timeliness of the signal.
Preferably, in S3, the shortest path algorithm is the A* algorithm, and the specific steps for optimizing the fault propagation path are as follows:
S301, defining the search space of the problem: converting the topology model into a graph structure, namely representing the topology model of the monitored system as a weighted graph, wherein the nodes of the graph represent elements in the system, the edges represent the connection relations among the nodes, and the edge weights are dynamically adjusted according to the connection characteristics; and setting a starting point and an end point, namely selecting the node with the latest state change as the starting point according to the fault propagation time sequence;
S302, defining the heuristic function of the A* algorithm: the core of the A* algorithm is to combine the actual cost and the heuristic cost so that the optimal path is preferentially selected during path search; the actual cost g(n) is the cumulative path weight from the starting point to the current node; the heuristic cost h(n) is the estimated cost from the current node to the end point; the heuristic function is defined according to the fault propagation characteristics and comprises network distance, signal delay and fault propagation probability, wherein the network distance is based on the shortest path distance between nodes in the topology model, the signal delay is an estimate of the minimum possible signal delay between nodes, and the fault propagation probability is the probability of fault occurrence at a node, estimated from historical data or a machine learning model; the evaluation function is f(n) = g(n) + h(n);
S303, path search step of the A* algorithm: initialization, namely putting the starting point node into the open list, setting g(n) = 0 and f(n) = h(n) for the starting point, and setting the closed list to be empty; loop search, namely selecting the node with the minimum f(n) from the open list as the current node, stopping the search and returning the path if the current node is the end point node, and otherwise moving the current node from the open list to the closed list; updating neighbor nodes, namely traversing all neighbor nodes of the current node, skipping a neighbor node if it is in the closed list, adding a neighbor node to the open list if it is not yet in the open list, calculating its g(n), h(n) and f(n) and recording the current node as its parent node, and, if a neighbor node is already in the open list, checking whether the new path is better than the current path and, if the g(n) of the new path is smaller, updating the g(n) and f(n) of the neighbor node and setting the current node as its parent node; termination conditions, namely the optimal path is found when the end point node is added to the closed list, and no feasible path exists if the open list is empty;
S304, removing noise data and invalid paths: dynamically adjusting path weights, namely updating the edge weights in real time during the search according to the dynamic propagation characteristics of the fault signal and removing paths whose weights are excessively large; verifying path validity, namely screening and verifying candidate paths with a machine learning model, removing paths that do not match historical fault patterns, and, in combination with causality analysis, removing paths that do not conform to the fault propagation rules;
S305, returning the optimal path: outputting the shortest path from the starting point to the end point, wherein the nodes contained in the path represent the key nodes through which the fault propagates, and providing the total cost of the path and a confidence evaluation of the end point node.
Preferably, in S3, the specific steps of further analyzing the topology path in combination with the machine learning model include the following:
S31, data preparation, namely preprocessing a fault propagation path and related data thereof before combining a machine learning model, so as to ensure the integrity and the effectiveness of model input data;
S311, data collection: topological path characteristics, namely candidate paths extracted from the output of the shortest path algorithm, including path length, path weight sum and the number of nodes passed through; node characteristics, namely node state change information, including current, voltage, signal strength and load capacity, and the historical fault rate or health score of the nodes; time-series characteristics, namely time-series data of fault propagation; edge characteristics, namely the connection characteristics of each edge, including signal attenuation, delay and reliability; and the influence of external environment conditions on signal propagation;
S312, data labeling: if historical fault data exist, labeling whether each path is related to an actual fault source, for use in supervised learning;
S313, data preprocessing: normalization, namely normalizing the characteristic values; denoising, namely removing irrelevant or poor-quality data to reduce the interference of data noise on the model; and feature engineering, namely performing dimension reduction on complex features or constructing new features;
S32, selecting a proper machine learning model, namely selecting the proper machine learning model for path analysis according to the data characteristics and the analysis target;
S321, model selection: the supervised learning models include logistic regression, a support vector machine, a decision tree, a gradient boosting tree and a neural network, and the unsupervised learning models include K-means clustering and an isolation forest, wherein logistic regression is used for binary classification problems, the support vector machine is used for classifying high-dimensional data and distinguishing fault-related paths from irrelevant paths, the decision tree is used for processing complex nonlinear relations and provides feature importance analysis, the gradient boosting tree offers good classification precision and speed and is suitable for processing large-scale data, the neural network is used for modeling complex nonlinear feature relations and needs a larger data scale, K-means clustering performs cluster analysis on paths to identify abnormal paths, and the isolation forest detects abnormal paths that differ from the normal propagation path pattern;
S322, model training: if historical fault data exist, a supervised learning model is trained, wherein the input features comprise path features, node features, time-series features and edge features;
S33, model application and analysis: further analyzing and optimizing the candidate paths in combination with the machine learning model;
S331, path scoring: scoring each candidate path with the trained model, and outputting the likelihood that the path is related to the fault source;
S332, feature importance analysis: analyzing the influence of weight changes and of node state changes on path classification, and optimizing the construction of the topology model according to the feature importance analysis;
S333, fault source verification: for paths with higher scores, checking whether the end point node of the path conforms to the historical fault propagation rules;
S334, dynamic model updating: collecting new fault positioning data, adding new fault samples to the training set, and periodically retraining the model to improve its accuracy and generalization capability;
S34, outputting the analysis result: outputting the optimal path, namely the optimal fault propagation path verified by the machine learning model, and positioning the most probable fault source node; confidence evaluation, namely providing the confidence of the fault source positioning result in combination with the path scores output by the model; and auxiliary information, namely possible suboptimal paths for multi-fault-source analysis and the causal relationships between node state changes and path association.
Preferably, in S4, the confidence represents the reliability and accuracy of the current positioning result and is usually expressed as a probability value or a percentage; by quantifying the influence of different factors on the positioning result, it provides a reliability reference for the final fault source positioning and supports decision-making.
Preferably, the influence factors that determine the confidence comprise the path relevance score, path weight characteristics, historical data support, fault propagation rule verification, multi-path consistency verification and model prediction reliability, and evaluating the confidence of the positioning result according to the fault propagation path analysis specifically comprises the following steps:
S401, path relevance scoring: scoring the candidate paths with a machine learning model, wherein the higher the path score, the higher the confidence;
S402, path weight characteristics, namely whether the total weight of the path is in a reasonable range or not, wherein the lower the weight is, the more likely the path is an actual propagation path, and the higher the confidence is;
S403, historical data support, namely whether the current positioning result is matched with a historical fault event or not, and if the fault source node and the propagation path are consistent with a historical mode, improving the confidence;
S404, verifying a fault propagation rule, namely analyzing time series data of fault propagation, and checking whether a propagation path accords with a causal relationship or not;
S405, multi-path consistency verification, namely checking whether each path points to the same fault source node in a plurality of candidate paths, and if a plurality of high-scoring paths point to the same fault source node, improving the confidence;
S406, model prediction reliability: evaluating the applicability of the model to the current positioning task in combination with the performance indexes of the machine learning model, and improving the confidence if the model performs well in similar scenarios.
Preferably, each influence factor is quantified and assigned a confidence score, specifically as follows:
path relevance score: the path relevance probability output by the machine learning model is used directly and recorded as P_path;
path weight characteristic score: a score is calculated from the total weight of the path; with the path weight range set to [min_w, max_w] and the current path weight set to w, the path weight score S_weight is:
S_weight = 1 - (w - min_w) / (max_w - min_w);
historical data support score: the support score S_history is calculated from the frequency of similar fault events in the historical data:
S_history = count_similar_events / total_events;
fault propagation rule verification score: the propagation rule is verified against the time-series data; the score is 1 if the propagation path conforms to the rule and 0 otherwise, and is recorded as S_causal;
multi-path consistency score: the score is 1 if a plurality of candidate paths point to the same fault source node and 0 otherwise, and is recorded as S_consistency;
model prediction reliability score: a performance index of the model is used as the score and is recorded as S_model.
Preferably, the comprehensive confidence is calculated by a weighted average method, in which the weights of the factors are w1, w2, w3, w4, w5 and w6 respectively and are adjusted according to actual conditions;
the comprehensive confidence (Confidence) is calculated as:
Confidence = w1 * P_path + w2 * S_weight + w3 * S_history + w4 * S_causal + w5 * S_consistency + w6 * S_model;
constraint condition: w1 + w2 + w3 + w4 + w5 + w6 = 1;
example weight assignment: path relevance score w1 = 0.3, path weight characteristic score w2 = 0.2, historical data support score w3 = 0.2, propagation rule verification score w4 = 0.1, multi-path consistency score w5 = 0.1, model prediction reliability score w6 = 0.1.
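The scoring formulas and the weighted sum can be illustrated with a minimal Python sketch. The function names, the sample input values and the choice to return 0 for S_history when no historical events exist are assumptions made for illustration; only the formulas and the example weights come from the text.

```python
def weight_score(w, min_w, max_w):
    """S_weight = 1 - (w - min_w) / (max_w - min_w); a lower path weight scores higher."""
    if max_w <= min_w:
        raise ValueError("max_w must exceed min_w")
    return 1.0 - (w - min_w) / (max_w - min_w)


def history_score(count_similar_events, total_events):
    """S_history = count_similar_events / total_events.

    Assumption: return 0.0 when there is no history at all.
    """
    return count_similar_events / total_events if total_events else 0.0


def confidence(scores, weights):
    """Weighted sum of the six factor scores; the weights must sum to 1."""
    if set(scores) != set(weights):
        raise ValueError("scores and weights must cover the same factors")
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(weights[name] * scores[name] for name in weights)


# Example weight assignment taken from the text.
WEIGHTS = {"P_path": 0.3, "S_weight": 0.2, "S_history": 0.2,
           "S_causal": 0.1, "S_consistency": 0.1, "S_model": 0.1}

# Hypothetical factor values for one candidate path.
scores = {
    "P_path": 0.9,
    "S_weight": weight_score(w=4.0, min_w=2.0, max_w=12.0),
    "S_history": history_score(30, 50),
    "S_causal": 1.0,
    "S_consistency": 1.0,
    "S_model": 0.85,
}
result = confidence(scores, WEIGHTS)
print(round(result, 3))
```

With these hypothetical inputs, S_weight is 0.8 and S_history is 0.6, so the comprehensive confidence is about 0.835.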
Preferably, the specific steps of confidence verification and output comprise: verifying the confidence result, namely verifying the calculated confidence to ensure that the confidence value is correlated with the actual positioning accuracy; outputting the confidence result, namely outputting a positioning result comprising the most probable fault source node, the relevant propagation path information and the confidence of the positioning result; maintenance personnel then inspect the located position and at the same time verify the accuracy of the positioning result, and if the located position is far from the actual fault position, the fault is repositioned and the relevant coefficients in the model are adjusted according to the change in the corrected position coordinates; and outputting the key factors influencing the confidence, and providing suboptimal paths and their confidences for multi-fault-source or complex scenario analysis.
Compared with the prior art, the invention has the following advantages. In terms of improving fault positioning accuracy, the method constructs a dynamic topology model according to the physical connection relations and running state information of various monitored networks such as power grids and communication networks, and dynamically adjusts the edge weights using the characteristics of the different networks; at the same time, it tracks paths in real time using fault propagation time-series data, analyzes node state changes and causal relationships, and dynamically updates the path weights; paths are optimized by the A* algorithm combined with analysis and verification by a machine learning model, which effectively eliminates interference data, so that the fault positioning accuracy for systems such as power systems, industrial control networks and water supply networks exceeds 90%, 95% and 92% respectively. In terms of improving fault positioning efficiency, the topology model is dynamically updated and optimized to reduce redundant calculation, and the heuristic function is set reasonably so that actual and estimated costs are considered together when searching for the optimal path; by combining dynamic tracking, path optimization and the machine learning model, the position of the fault source can be locked quickly, so that when faults occur in power, communication, industrial control and other networks, the fault can be located in time and system operation can be restored. In terms of enhancing adaptability, the method is suitable for complex systems that change dynamically, and the dynamically updated model maintains high real-time performance and high positioning accuracy.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in further detail with reference to the accompanying drawings and examples, it being understood that the detailed description herein is merely a preferred embodiment of the present invention, which is intended to illustrate the present invention, and not to limit the scope of the invention, as all other embodiments obtained by those skilled in the art without making any inventive effort fall within the scope of the present invention.
Before discussing the exemplary embodiments in more detail, it should be mentioned that some exemplary embodiments are described as processes or methods depicted as flowcharts. Although a flowchart depicts operations (or steps) as a sequential process, many of the operations (or steps) can be performed in parallel, concurrently, or at the same time. Furthermore, the order of the operations may be rearranged. The process may be terminated when its operations are completed, but may have additional steps not included in the figures, the process may correspond to a method, a function, a procedure, a subroutine, etc.
The terms "first," "second," "third," "fourth," and the like in the description of the invention and in the above figures, if any, are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the invention described herein may be implemented in sequences other than those illustrated or otherwise described herein. It should also be understood that, in various embodiments of the present invention, the sequence number of each process does not mean the order of execution, and the order of execution of each process should be determined by its functions and internal logic, and should not constitute any limitation on the implementation process of the embodiments of the present invention.
It should be understood that in the present invention, "plurality" means two or more. "And/or" merely describes an association relationship between associated objects, meaning that three relationships may exist. It should be understood that in the present invention, "B corresponding to A", "A corresponds to B", or "B corresponds to A" means that B is associated with A and that B can be determined from A. Determining B from A does not mean determining B from A alone; B may also be determined from A and/or other information. A matching B means that the similarity between A and B is greater than or equal to a preset threshold.
The invention uses the A* algorithm, a heuristic search algorithm for finding the shortest path in an optimization problem. It combines two key factors, the actual cost and a heuristic estimate, with the aim of finding the most efficient path from the origin to the destination. In the invention, the applicant adapts the A* algorithm to the application scenario.
Referring to fig. 1, an embodiment of the present invention provides a method for locating fault events based on topology model tracking analysis, including the following steps:
S1, constructing a topology model:
constructing a topology model of the system based on the physical connection relations and the running state information of the monitored system;
in the topology model, each edge is assigned a weight, and the weight can be dynamically adjusted according to the connection characteristics;
S2, dynamic event tracking:
tracking the fault propagation path in the topology model in real time using time-series data of fault propagation, and extracting the causal relationships of fault propagation by analyzing the state changes of each node;
defining a dynamic updating rule for the propagation path weights, and dynamically adjusting the path weights;
S3, path optimization and tracing:
optimizing the fault propagation path by applying a shortest path algorithm, removing noise data and invalid paths, and locking possible fault sources;
further analyzing the topology path in combination with a machine learning model, and verifying the accuracy of the fault source;
S4, outputting the positioning result:
outputting the most probable fault source node and its related path information, while providing a confidence evaluation of the positioning result.
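The causal-relationship extraction in step S2 can be sketched in Python as follows. This is only an illustrative reading of the step: the function name `causal_edges`, the input layout and the rule that a node can only affect a directly connected neighbor whose state changed strictly later are all assumptions, not details given in the text.

```python
def causal_edges(adjacency, change_times):
    """Return directed (cause, effect) pairs between connected nodes.

    adjacency: dict mapping node -> iterable of neighbor nodes.
    change_times: dict mapping node -> time of its first abnormal state change.
    Assumed rule: u is a candidate cause of v if the two nodes are directly
    connected and u changed state strictly earlier than v.
    """
    edges = []
    for u, t_u in change_times.items():
        for v in adjacency.get(u, ()):
            t_v = change_times.get(v)
            if t_v is not None and t_u < t_v:
                edges.append((u, v))
    return sorted(edges)


# Hypothetical four-node system; node D never changed state.
adjacency = {"A": ["B"], "B": ["A", "C"], "C": ["B"], "D": []}
change_times = {"A": 0.0, "B": 1.2, "C": 2.5}
edges = causal_edges(adjacency, change_times)
print(edges)  # [('A', 'B'), ('B', 'C')]
```

Under this assumed rule, the earliest node in the resulting chain (A in the example) is a natural candidate for the fault source that the later steps then verify.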
As a further improvement, in step S1, the monitored system includes a power grid, a communication network, an industrial control network, a traffic network, a water supply network, a natural gas pipeline network, a computer network, a logistics network, an energy network, an intelligent building system, a rail transit system and a medical network, wherein:
water supply network: a tap water pipe network system;
computer network: a data center network or a cloud computing network, namely a connection system comprising servers, switches and routers;
logistics network: a connection system of storage and transportation nodes;
energy network: a new energy distribution network;
intelligent building system: a building automation network, namely the connections between sensors, controllers and equipment;
rail transit system: the connections between trains, rails and signal control systems;
medical network: a networking system between medical equipment and sensors.
The connection characteristics include resistance, communication delay, bandwidth, signal attenuation, reliability, power consumption, flow, load capacity, delay jitter, security, physical distance and environmental influence, wherein:
bandwidth: the capacity of a communication link, which affects the data transmission rate;
signal attenuation: the reduction in strength of a signal during propagation;
reliability: the stability of the connection, including the drop rate or failure frequency;
power consumption: the energy consumption characteristics of nodes and connections;
flow: the actual data or material flow through the connection (network traffic, liquid flow);
load capacity: the maximum load that the connection can bear, including current capacity or mechanical stress;
delay jitter: the variability of the delay, which is particularly important in communication networks;
security: the attack resistance of the connection or the confidentiality of data transmission;
physical distance: the actual physical distance between nodes, which may affect the propagation time;
environmental influence: including the influence of temperature and humidity on connection performance.
In this way, the application range of the topology model is further expanded, and the accuracy and practicability of the fault event positioning method are improved.
The topology model is represented by a graph structure, the nodes represent devices or functional units in the system, and the edges represent connection relations among the nodes.
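A minimal Python sketch of such a graph structure with adjustable edge weights is shown below. The class name `TopologyModel`, its methods, the node names and the choice of an undirected graph are illustrative assumptions; the text only specifies nodes, edges and dynamically adjustable weights.

```python
from dataclasses import dataclass, field


@dataclass
class TopologyModel:
    """Weighted graph: nodes are devices or functional units, edges are connections."""
    # adjacency[u][v] holds the current weight of the edge between u and v
    adjacency: dict = field(default_factory=dict)

    def add_edge(self, u, v, weight):
        """Add an undirected connection between u and v (undirected is an assumption)."""
        if weight < 0:
            raise ValueError("edge weight must be non-negative")
        self.adjacency.setdefault(u, {})[v] = weight
        self.adjacency.setdefault(v, {})[u] = weight

    def update_weight(self, u, v, weight):
        """Dynamically adjust the weight of an existing edge."""
        if v not in self.adjacency.get(u, {}):
            raise KeyError(f"no edge between {u!r} and {v!r}")
        self.add_edge(u, v, weight)

    def neighbors(self, u):
        """Return a copy of the mapping neighbor -> edge weight for node u."""
        return dict(self.adjacency.get(u, {}))


# Hypothetical fragment of a power grid.
grid = TopologyModel()
grid.add_edge("substation_A", "breaker_1", 1.0)
grid.add_edge("breaker_1", "feeder_7", 2.5)
grid.update_weight("breaker_1", "feeder_7", 4.0)  # e.g. the link has degraded
print(grid.neighbors("breaker_1"))
```

A nested dictionary keeps weight updates cheap, which matters if weights are adjusted continuously while faults are being tracked.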
As a further improvement, in step S2, the factors on which the dynamic adjustment of the path weight is based include the strength of the signal, the propagation speed, the frequency characteristics of the signal, the attenuation coefficient of the signal, the noise interference level, the environmental conditions, the propagation medium characteristics, the type of fault signal, the directionality of the signal, the response delay of the node to the signal, the reliability or health status of the node, and the timeliness of the signal, wherein:
frequency characteristics of the signal: fault signals exhibit different propagation characteristics in different frequency ranges; for example, the high-frequency components of certain signals may decay faster, while the low-frequency components may propagate farther;
attenuation coefficient of the signal: the degree to which the signal attenuates with distance or medium characteristics during propagation; different paths may have different attenuation coefficients because of differences in medium or environment;
noise interference level: the intensity of background noise or interference signals that may exist on a path, which affects the propagation quality and reliability of the signal;
environmental conditions: environmental factors such as temperature, humidity and electromagnetic interference, which affect the propagation speed and strength of the signal;
propagation medium characteristics: the path weights may be dynamically adjusted based on the characteristics of the propagation medium (including the resistance, capacitance and inductance of wires, or the medium type of a communication link);
type of fault signal: different types of fault signals (including short circuit, overload and broken wire) have different propagation modes and characteristics, which affects the calculation of the path weights;
directionality of the signal: fault signals may have a particular direction of propagation; for example, in certain networks signals preferentially propagate along low-impedance paths;
response delay of the node to the signal: the detection or response times of the nodes in the topology to a fault signal may be inconsistent, and this delay characteristic affects the dynamic adjustment of the weights;
reliability or health status of the node: the reliability of the node itself affects signal transmission; for example, when a node is in a sub-health state, signal transmission may be weaker or the delay may be larger;
timeliness of the signal: the timeliness of the fault signal, namely the time sequence in which signals arrive, which is used for dynamically adjusting the weights to reflect the actual propagation conditions of the path.
In this way, the accuracy of the dynamic adjustment of the path weights is improved, which in turn improves the performance and reliability of the fault positioning method.
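The text lists the factors that drive the weight adjustment but does not give an update rule. The Python sketch below shows one possible rule purely for illustration: the multiplicative form, the function name `adjusted_weight` and all parameter names are assumptions, and only a subset of the listed factors is used.

```python
def adjusted_weight(base_weight, attenuation=0.0, noise_level=0.0,
                    response_delay=0.0, node_health=1.0):
    """Return an edge weight scaled by observed propagation conditions.

    Assumed rule (not from the source): each penalty factor multiplies the
    base weight, and poor node health (0 < node_health <= 1) inflates it,
    so degraded links become more expensive in the path search.
    """
    if not 0.0 < node_health <= 1.0:
        raise ValueError("node_health must be in (0, 1]")
    if min(attenuation, noise_level, response_delay) < 0.0:
        raise ValueError("penalty factors must be non-negative")
    penalty = (1.0 + attenuation) * (1.0 + noise_level) * (1.0 + response_delay)
    return base_weight * penalty / node_health


# A healthy, undisturbed link keeps its base weight.
print(adjusted_weight(2.0))                                    # 2.0
# Hypothetical degraded link: 50% attenuation, node in a sub-health state.
print(adjusted_weight(2.0, attenuation=0.5, node_health=0.5))  # 6.0
```

Any monotone rule of this kind would serve the same purpose; the concrete form would need to be calibrated to the monitored system.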
Further, as shown in fig. 2, in step S3, the shortest path algorithm is the A* algorithm, and the specific steps for optimizing the fault propagation path are as follows:
S301, defining the search space of the problem:
converting the topology model into a graph structure, namely representing the topology model of the monitored system as a weighted graph, wherein the nodes of the graph represent elements (equipment or subnetworks) in the system and the edges represent the connection relations among the nodes;
setting a starting point and an end point, namely selecting the node with the latest state change as the starting point according to the fault propagation time sequence;
S302, defining the heuristic function of the A* algorithm:
the core of the A* algorithm is to combine the actual cost and the heuristic cost so that the optimal path is selected during path search, wherein the actual cost g(n) is the accumulated path weight from the starting point to the current node and the heuristic cost h(n) is the estimated cost from the current node to the end point; the heuristic function is defined according to the fault propagation characteristics and comprises network distance, signal delay and fault propagation probability, wherein the network distance is based on the shortest path distance between nodes in the topology model, the signal delay is an estimate of the minimum possible signal delay between nodes, and the fault propagation probability is the probability of fault occurrence at a node, estimated from historical data or a machine learning model;
evaluation function: f(n) = g(n) + h(n);
the path with the minimum evaluation value is searched preferentially;
S303, path search step of the A* algorithm:
initialization: putting the starting point node into the open list (Open List), setting g(n) = 0 and f(n) = h(n) for the starting point, and setting the closed list (Closed List) to be empty;
loop search: selecting the node with the minimum f(n) from the open list as the current node; if the current node is the end point node, stopping the search and returning the path; otherwise, moving the current node from the open list to the closed list;
updating neighbor nodes: traversing all neighbor nodes of the current node; skipping a neighbor node if it is in the closed list; if a neighbor node is not in the open list, adding it to the open list, calculating its g(n), h(n) and f(n), and recording the current node as its parent node; if a neighbor node is already in the open list, checking whether the new path is better than the current path and, if the g(n) of the new path is smaller, updating the g(n) and f(n) of the neighbor node and setting the current node as its parent node;
termination conditions: the optimal path is found when the end point node is added to the closed list; if the open list is empty, no feasible path exists;
S304, eliminating noise data and invalid paths:
dynamically adjusting path weights, namely updating the edge weights in real time during the search according to the dynamic propagation characteristics of the fault signal;
screening and verifying candidate paths with a machine learning model, and removing paths that do not match historical fault patterns;
S305, returning the optimal path:
outputting the shortest path from the starting point to the end point, wherein the nodes contained in the path represent the key nodes through which the fault propagates;
providing the overall cost of the path (propagation time or signal loss) and a confidence evaluation of the end point node (the fault source).
Key points of the optimization include:
dynamic weight updating: adjusting the edge weights according to real-time fault propagation data to improve the accuracy of the path search;
heuristic function design: the heuristic function must take the specific characteristics of the topology model into account to ensure that the estimated cost is reasonable;
machine learning verification: verifying the optimized path in combination with machine learning analysis to further improve the accuracy of fault source positioning.
Through the above steps, the A* algorithm can efficiently search fault propagation paths, reject interference data and lock possible fault source nodes.
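The search procedure in S301 to S305 can be sketched in Python as a generic heuristic search of the A* kind. This is a minimal illustration rather than the patented implementation: the function name `a_star`, the dictionary-based graph layout and the use of a priority queue with lazily discarded stale entries (instead of explicitly updating nodes already in the open list) are assumptions. The result is only guaranteed to be a least-cost path when the heuristic never overestimates the remaining cost and is consistent.

```python
import heapq
from itertools import count


def a_star(graph, start, goal, heuristic):
    """Search a weighted graph for a least-cost path from start to goal.

    graph: dict mapping node -> dict of neighbor -> non-negative edge weight.
    heuristic: function h(n) estimating the remaining cost from n to goal.
    Returns (path, total_cost), or None if no feasible path exists.
    """
    tie = count()  # tie-breaker so the heap never compares nodes directly
    g = {start: 0.0}                 # actual cost g(n) from the start
    parent = {start: None}
    open_heap = [(heuristic(start), next(tie), start)]  # ordered by f(n)
    closed = set()

    while open_heap:
        _, _, current = heapq.heappop(open_heap)
        if current in closed:
            continue  # stale entry left behind by a later, cheaper update
        if current == goal:
            path = []
            while current is not None:
                path.append(current)
                current = parent[current]
            return path[::-1], g[goal]
        closed.add(current)
        for neighbor, weight in graph.get(current, {}).items():
            if neighbor in closed:
                continue
            tentative = g[current] + weight
            if tentative < g.get(neighbor, float("inf")):
                g[neighbor] = tentative
                parent[neighbor] = current
                f = tentative + heuristic(neighbor)  # f(n) = g(n) + h(n)
                heapq.heappush(open_heap, (f, next(tie), neighbor))
    return None  # open list exhausted: no feasible path


# Hypothetical four-node topology; h(n) = 0 reduces A* to Dijkstra's algorithm.
graph = {
    "A": {"B": 1.0, "C": 4.0},
    "B": {"A": 1.0, "C": 2.0, "D": 5.0},
    "C": {"A": 4.0, "B": 2.0, "D": 1.0},
    "D": {"B": 5.0, "C": 1.0},
}
found = a_star(graph, "A", "D", lambda n: 0.0)
print(found)  # (['A', 'B', 'C', 'D'], 4.0)
```

In a real deployment the zero heuristic would be replaced by an estimate built from network distance, signal delay and fault propagation probability, as described in S302.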
As a further improvement, as shown in fig. 3, in step S3, the specific steps of further analyzing the topology path in combination with the machine learning model are as follows:
S31, data preparation:
before the machine learning model is applied, preprocessing the fault propagation path and its related data to ensure the integrity and validity of the model input data;
S311, data collection:
topological path characteristics: candidate paths extracted from the output of the shortest path algorithm, including path length, path weight sum and the number of nodes passed through;
node characteristics: node state change information, including current, voltage, signal strength and load capacity, and the historical fault rate or health score of the nodes;
time-series characteristics: time-series data of fault propagation (arrival time and propagation delay of fault signals);
edge characteristics: the connection characteristics of each edge, including signal attenuation, delay and reliability;
external environment characteristics: the influence of external environment conditions (temperature, humidity and electromagnetic interference) on signal propagation;
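As an illustration of the data collection step, the Python sketch below assembles a small feature dictionary for one candidate path. The function name `path_features`, the chosen feature names, the interpretation of path length as the number of edges and the default health score of 1.0 for unlisted nodes are all assumptions; the text only names the categories of features.

```python
def path_features(path, graph, node_health):
    """Build a flat feature dict for one candidate path.

    path: list of node names in propagation order.
    graph: dict mapping node -> dict of neighbor -> edge weight.
    node_health: dict mapping node -> health score in [0, 1]
                 (hypothetical input; unlisted nodes default to 1.0).
    """
    edge_weights = [graph[u][v] for u, v in zip(path, path[1:])]
    healths = [node_health.get(n, 1.0) for n in path]
    return {
        "path_length": len(edge_weights),   # number of edges (assumed meaning)
        "weight_sum": sum(edge_weights),    # path weight sum
        "node_count": len(path),            # number of nodes passed through
        "max_edge_weight": max(edge_weights, default=0.0),
        "min_node_health": min(healths, default=1.0),
    }


# Hypothetical topology fragment and candidate path.
graph = {
    "A": {"B": 1.0},
    "B": {"A": 1.0, "C": 2.0},
    "C": {"B": 2.0},
}
feats = path_features(["A", "B", "C"], graph, {"B": 0.4})
print(feats)
```

Time-series and environment features could be appended to the same dictionary in the same way once their data sources are defined.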
S312, data labeling:
if historical fault data exist, each path is labeled according to whether it is related to an actual fault source (1 for related, 0 for unrelated), and the labeled paths are used for supervised learning;
S313, data preprocessing:
preprocessing comprises normalization, denoising and feature engineering, wherein normalization scales the feature values (such as path weight sum and time sequence data) to a common range, and denoising removes irrelevant or poor-quality data to reduce the interference of data noise on the model;
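The preprocessing of step S313 can be illustrated with a short sketch. The text does not specify a normalization scheme, so min-max scaling is assumed here; the feature names `weight_sum` and `delay_ms`, the sample values and the rule of discarding records with missing values are likewise hypothetical.

```python
def drop_incomplete(rows, keys):
    """Denoising step (assumed rule): discard records with a missing feature value."""
    return [r for r in rows if all(r.get(k) is not None for k in keys)]

def min_max_normalize(rows, keys):
    """Scale each listed feature to the range [0, 1] across all candidate paths."""
    scaled = [dict(r) for r in rows]  # copy so the raw records stay untouched
    for k in keys:
        values = [r[k] for r in rows]
        lo, hi = min(values), max(values)
        span = hi - lo
        for r in scaled:
            r[k] = 0.0 if span == 0 else (r[k] - lo) / span
    return scaled

# Hypothetical candidate-path records.
FEATURES = ["weight_sum", "delay_ms"]
raw_paths = [
    {"weight_sum": 3.5, "delay_ms": 10.0},
    {"weight_sum": 6.0, "delay_ms": 30.0},
    {"weight_sum": 4.0, "delay_ms": None},   # incomplete record, removed
    {"weight_sum": 4.75, "delay_ms": 20.0},
]

clean_paths = drop_incomplete(raw_paths, FEATURES)
normalized_paths = min_max_normalize(clean_paths, FEATURES)
```

Scaling the features to a common range keeps a feature with large raw values, such as a delay in milliseconds, from dominating a feature such as the path weight sum.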
S32, selecting a suitable machine learning model:
a suitable machine learning model is selected for path analysis according to the data characteristics and the analysis target;
S321, model selection:
Supervised learning models: logistic regression, for the classification problem (whether a path is related to the fault source); support vector machine (SVM), for classifying high-dimensional data to distinguish fault-related paths from irrelevant paths; decision tree, for handling complex nonlinear relations and providing feature importance analysis; gradient boosting trees (XGBoost, LightGBM), for processing large-scale data; neural network, for modeling complex nonlinear feature relations, which requires a large amount of data;
Unsupervised learning models:
K-means clustering performs cluster analysis on the paths to identify abnormal paths, and Isolation Forest detects abnormal paths that deviate from the normal propagation pattern;
S322, model training:
if historical fault data exist, a supervised learning model is trained:
the input features comprise path features, node features, time sequence features and edge features, and the output label indicates whether the path is related to the fault source;
if label data are absent, an unsupervised learning model is used:
the distribution pattern of the paths is learned from the input features, and abnormal paths are identified;
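The supervised branch of step S322 can be sketched with logistic regression, one of the models listed in S321. The example below is a dependency-free illustration trained by plain gradient descent; the two input features (normalized path weight sum and normalized delay), the six labeled samples, the learning rate and the epoch count are all hypothetical choices and not values taken from the method.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(X, y, lr=0.5, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    n_features = len(X[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for xi, yi in zip(X, y):
            error = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(n_features):
                grad_w[j] += error * xi[j]
            grad_b += error
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / len(X)
    return w, b

def predict_proba(model, x):
    """Probability that a path with features x is related to the fault source."""
    w, b = model
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

# Hypothetical labeled paths: [normalized weight sum, normalized delay].
# Label 1 = related to the actual fault source, 0 = unrelated.
X_train = [[0.1, 0.2], [0.2, 0.1], [0.15, 0.3], [0.8, 0.9], [0.9, 0.7], [0.7, 0.8]]
y_train = [1, 1, 1, 0, 0, 0]

model = train_logistic(X_train, y_train)
p_low_cost = predict_proba(model, [0.1, 0.1])
p_high_cost = predict_proba(model, [0.9, 0.9])
```

In practice a library implementation of any of the listed models would normally be used; the hand-written trainer only makes the input features and the output label of S322 concrete.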
S33, model application and analysis:
the candidate paths are further analyzed and optimized in combination with the machine learning model;
S331, path scoring:
each candidate path is scored with the trained model, which outputs the probability (confidence) that the path is related to the fault source, and paths that are likely noise data (low-scoring paths) are eliminated according to the scoring result;
S332, feature importance analysis:
the influence of edge weight changes and of node state changes on the path classification is analyzed, and the construction of the topology model is optimized based on the feature importance analysis;
S333, fault source verification:
for higher-scoring paths (possible fault propagation paths), checking whether their end nodes conform to historical fault propagation rules;
S334, dynamically updating the model:
collecting new fault positioning data, adding new fault samples into the training set, and periodically retraining the model to improve its accuracy and generalization capability;
S34, outputting the analysis result:
Optimal path: the optimal fault propagation path verified by the machine learning model;
Fault source node: the most probable fault source node;
Confidence assessment: the confidence of the fault source positioning result, derived from the path scores output by the model;
Auxiliary information: possible suboptimal paths for multi-fault-source analysis, and the causal relation between node state changes and the associated paths.
Through the above steps, combining the machine learning model effectively removes noise data and invalid paths and further verifies the fault source, improving the reliability and accuracy of the fault positioning method.
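Steps S331 and S34 can be sketched together as follows: low-scoring candidate paths are discarded, and the remaining paths are ranked to produce the output. The cut-off value of 0.5, the field names and the sample scores are assumptions made only for illustration; the end node of each path is treated as the candidate fault source.

```python
def select_paths(scored_paths, threshold=0.5):
    """Drop low-scoring paths and assemble the S34 output from the rest.

    scored_paths: list of {"nodes": [...], "score": float}, where the score is
    the model's probability that the path is related to the fault source.
    threshold: hypothetical cut-off below which a path is treated as noise.
    """
    kept = [p for p in scored_paths if p["score"] >= threshold]
    if not kept:
        return None  # every candidate was rejected as noise
    kept.sort(key=lambda p: p["score"], reverse=True)
    best = kept[0]
    return {
        "fault_source": best["nodes"][-1],  # end node of the best path
        "optimal_path": best["nodes"],
        "confidence": best["score"],
        "suboptimal_paths": kept[1:],       # kept for multi-fault-source analysis
    }

# Hypothetical scored candidates.
candidates = [
    {"nodes": ["alarm", "sw1", "src_b"], "score": 0.62},
    {"nodes": ["alarm", "sw2", "sw3", "src_a"], "score": 0.91},
    {"nodes": ["alarm", "sw4", "src_c"], "score": 0.18},  # likely noise
]

result = select_paths(candidates)
```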
Furthermore, in step S4, the confidence represents the reliability and accuracy of the current positioning result and is usually expressed as a probability value (such as a fraction between 0 and 1) or a percentage; by quantifying the influence of different factors on the positioning result, it provides a reliability reference for the final fault source positioning and supports decision-making.
As a further improvement, as shown in Fig. 4, the factors that determine the confidence include the path relevance score, path weight characteristics, historical data support, fault propagation rule verification, multipath consistency verification and model prediction reliability;
Based on the fault propagation path analysis, evaluating the confidence of the positioning result specifically comprises the following steps:
S401, path relevance score: candidate paths are scored by the machine learning model; the higher the path score, the higher the confidence;
S402, path weight characteristics: whether the total weight of the path (propagation delay and signal attenuation) is within a reasonable range; the lower the weight (representing a smaller propagation cost), the more likely the path is the actual propagation path and the higher the confidence;
S403, historical data support: whether the current positioning result matches historical fault events; if the fault source node and the propagation path are consistent with a historical pattern, the confidence is increased;
S404, fault propagation rule verification: the time sequence data of fault propagation are analyzed to check whether the propagation path conforms to the causal relationship (signal propagation direction and time order);
S405, multipath consistency verification: among the candidate paths, checking whether the paths point to the same fault source node; if several high-scoring paths point to the same fault source node, the confidence is increased;
S406, model prediction reliability: the applicability of the machine learning model to the current positioning task is evaluated from its performance indexes (accuracy, precision and recall); if the model performs well in similar scenarios, the confidence is increased.
As a further improvement, each influencing factor is quantified and assigned a score, specifically as follows:
Path relevance score:
the path relevance probability output by the machine learning model is taken directly and denoted P_path;
Path weight feature scoring:
Calculating a score according to the total weight of the path:
Setting the path weight range as [ min_w, max_w ], and setting the current path weight as w;
Path weight score S_weight:
S_weight = 1 - (w - min_w) / (max_w - min_w);
(the closer the weight is to the minimum, the higher the score);
Historical data support scoring:
According to the frequency of similar fault events in the historical data, calculating a support degree score S_history:
S_history = count_similar_events / total_events;
(the higher the proportion of similar events, the higher the score);
Fault propagation law verification scoring:
The propagation rule is verified against the time sequence data; the score S_causal is 1 if the propagation path conforms to the rule and 0 otherwise;
multipath consistency score:
if multiple candidate paths (such as the 3 highest-scoring paths) point to the same fault source node, the score S_consistency is 1; otherwise it is 0;
Model predictive reliability scoring:
A model performance index (such as historical accuracy) is used as the score, denoted S_model.
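The factor scores defined above can be computed as in the following sketch. The formulas for S_weight and S_history follow the text directly; the causal check is an assumed interpretation (arrival times must not decrease from the fault source outward), each path is assumed to end at its inferred fault source, and all numeric inputs are hypothetical.

```python
def weight_score(w, min_w, max_w):
    """S_weight = 1 - (w - min_w) / (max_w - min_w); closer to min_w scores higher."""
    if max_w == min_w:
        return 1.0  # degenerate range: treat the single weight as optimal
    return 1.0 - (w - min_w) / (max_w - min_w)

def history_score(count_similar_events, total_events):
    """S_history = count_similar_events / total_events."""
    return count_similar_events / total_events if total_events else 0.0

def causal_score(arrival_times):
    """S_causal: 1 if fault-signal arrival times never decrease along the path
    (listed from the fault source outward), else 0. Assumed interpretation."""
    ordered = all(a <= b for a, b in zip(arrival_times, arrival_times[1:]))
    return 1.0 if ordered else 0.0

def consistency_score(top_paths):
    """S_consistency: 1 if all top-scoring paths end at the same source node."""
    sources = {path[-1] for path in top_paths}
    return 1.0 if len(sources) == 1 else 0.0

# Hypothetical inputs.
S_weight = weight_score(3.5, min_w=2.0, max_w=8.0)
S_history = history_score(count_similar_events=12, total_events=40)
S_causal = causal_score([0.0, 1.2, 2.9])
S_consistency = consistency_score(
    [["alarm", "sw2", "src_a"], ["alarm", "sw1", "src_a"], ["alarm", "sw4", "src_a"]]
)
```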
As a further improvement, the overall confidence is calculated by combining the above scores with a weighted average method, specifically as follows:
the weights of the six factors are set as w1, w2, w3, w4, w5 and w6 respectively, and can be adjusted according to actual conditions;
The overall confidence, Confidence, is calculated as:
Confidence = w1 * P_path + w2 * S_weight + w3 * S_history + w4 * S_causal + w5 * S_consistency + w6 * S_model;
Constraint condition: w1 + w2 + w3 + w4 + w5 + w6 = 1;
Example weight assignment: path relevance score w1=0.3, path weight characteristic score w2=0.2, historical data support score w3=0.2, propagation rule verification score w4=0.1, multipath consistency score w5=0.1, model prediction reliability score w6=0.1.
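The weighted average can be worked through with the example weight assignment. The sketch assumes the weights are meant to sum to 1, as the example assignment does; the six factor scores passed in are hypothetical values chosen only to show the arithmetic.

```python
import math

# Example weight assignment from the text (w1..w6).
WEIGHTS = {
    "P_path": 0.3,
    "S_weight": 0.2,
    "S_history": 0.2,
    "S_causal": 0.1,
    "S_consistency": 0.1,
    "S_model": 0.1,
}

def overall_confidence(scores, weights=WEIGHTS):
    """Confidence = w1*P_path + w2*S_weight + w3*S_history
                    + w4*S_causal + w5*S_consistency + w6*S_model."""
    if not math.isclose(sum(weights.values()), 1.0):
        raise ValueError("factor weights are assumed to sum to 1")
    return sum(weights[name] * scores[name] for name in weights)

# Hypothetical factor scores for one positioning result.
scores = {
    "P_path": 0.9,
    "S_weight": 0.75,
    "S_history": 0.3,
    "S_causal": 1.0,
    "S_consistency": 1.0,
    "S_model": 0.92,
}

confidence = overall_confidence(scores)
# 0.27 + 0.15 + 0.06 + 0.10 + 0.10 + 0.092 = 0.772
```

Because every factor score lies between 0 and 1 and the weights sum to 1, the resulting confidence also lies between 0 and 1 and can be reported directly as a probability-like value or a percentage.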
Specifically, the confidence verification and output comprise the following steps:
Verifying the confidence result:
the calculated confidence is verified to ensure that the confidence value is correlated with the actual positioning accuracy;
using historical data, checking whether the positioning accuracy is significantly higher when the confidence exceeds a certain threshold (such as 0.8);
Outputting the confidence result: the output positioning result comprises the most probable fault source node, the related propagation path information and the confidence of the positioning result (for example, Confidence = 0.85, or 85%);
Providing auxiliary information: the key factors influencing the confidence (such as path scores and historical data support) are output, and suboptimal paths and their confidences are provided for multi-fault-source or complex-scenario analysis.
Through the steps, the confidence evaluation can provide clear reliability quantification for the positioning result, aid decision making and improve the reliability and practicability of the fault event positioning method.
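The confidence verification described above, which checks whether results above the threshold are in fact more accurate, can be sketched as follows. The historical records are invented solely to show the comparison and are not measured results; the threshold of 0.8 is the example value from the text.

```python
def accuracy_by_threshold(records, threshold=0.8):
    """Compare positioning accuracy above and at-or-below a confidence threshold.

    records: list of (confidence, was_correct) pairs from past positioning runs.
    Returns (accuracy_above, accuracy_not_above); None where a group is empty.
    """
    above = [correct for conf, correct in records if conf > threshold]
    not_above = [correct for conf, correct in records if conf <= threshold]

    def accuracy(group):
        return sum(group) / len(group) if group else None

    return accuracy(above), accuracy(not_above)

# Hypothetical history: (confidence of the result, whether the located source was right).
history = [
    (0.90, True), (0.85, True), (0.82, False), (0.95, True),
    (0.60, False), (0.70, True), (0.40, False), (0.55, False),
]

acc_high, acc_low = accuracy_by_threshold(history)
confidence_is_informative = acc_high is not None and acc_low is not None and acc_high > acc_low
```

If the accuracy above the threshold is not clearly higher than below it, the factor weights or the threshold itself would need to be revisited before the confidence is used for decision-making.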
Example 1: Fault location in an electric power system
Topology model construction: real-time operation data of the power system are obtained, including the connection relations and operating states of equipment such as substations, lines and switches; a topology model of the power network is constructed, in which nodes represent substations and switchgear, edges represent power lines, and edge weights are dynamically adjusted according to line impedance and load conditions;
Dynamic event tracking: when a short-circuit fault occurs somewhere in the power system, time sequence data of the fault signals are collected; in the topology model, the fault signal is tracked layer by layer from the starting point of its propagation, and the propagation paths are recorded;
Path optimization and tracing: the propagation path is optimized by applying a shortest path algorithm, and interference paths caused by noise are eliminated;
Positioning result output: the fault source node and its propagation path are output, with a positioning accuracy of more than 90%.
Example 2: Fault location in a communication network
Topology model construction: topology information of the communication network is obtained, including the connection relations and traffic data of routers, switches, terminal equipment and the like; a dynamic topology model of the communication network is constructed, and edge weights are dynamically adjusted according to link delay, packet loss rate and the like;
Dynamic event tracking: when a node in the network fails, propagation path data of the fault signal are collected;
Path optimization and tracing: abnormal data are removed by using the time sequence characteristics of fault propagation, and the propagation path is optimized;
Positioning result output: the most probable fault node is output, together with a visualized result of the path optimization.
Example 3: Fault location in an industrial control network
Topology model construction: real-time data of the industrial control system are obtained, including the connection relations among PLCs (programmable logic controllers), sensors and actuators and their state information; a topology model of the industrial control network is constructed, in which nodes represent control equipment, edges represent signal transmission channels, and edge weights are dynamically adjusted according to characteristics such as delay and bandwidth;
Dynamic event tracking: in the topology model, signal propagation paths are tracked layer by layer from the fault node, and state changes and key node information are recorded;
Path optimization and tracing: the fault propagation path is optimized by applying a shortest path algorithm, and invalid paths caused by environmental interference or data anomalies are eliminated;
Positioning result output: the most probable fault source node and its propagation path are output together with a confidence evaluation of the fault positioning result, ensuring that the positioning accuracy reaches more than 95%.
Example 4: Fault location in a traffic network
Topology model construction: a topology model of the traffic network is constructed, in which nodes represent intersections, edges represent roads, and edge weights are dynamically adjusted according to traffic flow, road conditions and travel speed;
Dynamic event tracking: in the topology model, the affected area is tracked outward from the point where the fault occurred, and traffic flow changes and key nodes are recorded;
Path optimization and tracing: traffic flow in the affected area is optimized by using a shortest path algorithm, and congestion paths caused by the accident are eliminated;
Positioning result output: the most probable fault node and its influence path are output, together with a visualized result of the traffic flow optimization, ensuring that the accuracy of the decision support system reaches more than 90%.
Example 5: Fault location in a water supply network
Topology model construction: real-time monitoring data of the water supply system are obtained, including the connection relations of water sources, pipelines, valves and water meters and the water flow state; a topology model of the water supply network is constructed, in which nodes represent water sources and pipeline junction points, edges represent water flow paths, and edge weights are dynamically adjusted according to pipe diameter, water pressure and flow;
Dynamic event tracking: in the topology model, changes in water flow are tracked outward from the fault point, and the affected nodes and paths are recorded;
Path optimization and tracing: the water flow propagation path is optimized by applying a shortest path algorithm, and interference paths caused by noise or abnormal water flow are eliminated;
Positioning result output: the most probable fault source node and its related path information are output together with a confidence evaluation of the positioning result, ensuring that the positioning accuracy reaches more than 92%.
In summary, by constructing a dynamically updated system topology model and combining the time sequence characteristics and network characteristics of the fault propagation path, the invention quickly locks the fault source position and achieves fast and accurate positioning of fault events; it can be widely applied to complex networks such as power systems, communication networks and industrial control systems;
Tracking the fault propagation path through the dynamic topology model allows the fault source to be accurately located in a complex network, significantly improving positioning precision; dynamic updating and optimization of the topology model reduce redundant calculation and improve the real-time performance of fault positioning; and combining the dynamic topology model with the fault propagation path optimization algorithm achieves adaptive tracking and optimization of the fault propagation path, so that noise interference is effectively filtered.
The above embodiments are preferred embodiments of the fault event positioning method based on topology model tracking analysis of the present invention and do not limit its specific implementation; the scope of the present invention includes but is not limited to these embodiments, and all equivalent changes of shape and structure made according to the present invention fall within its scope of protection. Those skilled in the art will appreciate that all or part of the above method embodiments may be implemented by a computer program instructing the associated hardware; the program may be stored on a computer-readable storage medium and, when executed, may include the steps of the embodiments of the fault event positioning method described above. The storage medium may be a magnetic disk, an optical disk, a read-only memory (ROM), a random access memory (RAM) or the like.