CN117370052A - Microservice fault analysis method, device, equipment and storage medium
Microservice fault analysis method, device, equipment and storage medium
- Publication number
- CN117370052A CN202311190434.8A
- Authority
- CN
- China
- Prior art keywords
- service
- fault
- micro
- target
- data
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/0703—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
- G06F11/079—Root cause analysis, i.e. error or fault diagnosis
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Biomedical Technology (AREA)
- Quality & Reliability (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Debugging And Monitoring (AREA)
Abstract
The invention belongs to the field of computers and discloses a micro-service fault analysis method, device, equipment and storage medium. The method comprises: when a fault analysis request is received, acquiring historical log data corresponding to the fault analysis request; determining an abnormal feature identifier according to the fault analysis request, and extracting target feature data from the historical log data according to the abnormal feature identifier; and judging whether the target micro-service has a fault based on a preset fault judgment rule and the target feature data. Because the target feature data is extracted from the historical log data according to the abnormal feature identifier and then judged against the preset fault judgment rule, the method can improve the fault analysis efficiency of micro-services compared with the existing approach of embedding a large amount of monitoring code into each micro-service requiring fault analysis and analyzing faults through hard coupling.
Description
Technical Field
The present invention relates to the field of computer technologies, and in particular, to a method, an apparatus, a device, and a storage medium for analyzing a micro service failure.
Background
With the spread of open source and cloud computing, the technical threshold for cloud-native technology, with micro-services at its core, has dropped considerably, and the technology has begun to penetrate various industries. More and more enterprises are adopting new-generation technologies such as micro-services in their system selection to accelerate digital transformation. As the number of micro-service components grows and service granularity becomes finer, each service in a micro-service architecture needs to be configured, deployed, monitored and logged independently. Calls between systems make monitoring considerably harder, and conventional fault monitoring requires a large amount of monitoring code to be embedded in each micro-service (hard coupling), which adds network overhead and reduces code readability. How to monitor micro-services efficiently and discover their faults in time has therefore become a technical problem to be solved urgently.
The foregoing is provided merely for the purpose of facilitating understanding of the technical solutions of the present invention and is not intended to represent an admission that the foregoing is prior art.
Disclosure of Invention
The main object of the invention is to provide a micro-service fault analysis method, device, equipment and storage medium, aiming to solve the technical problem that micro-service fault monitoring in the prior art is not sufficiently effective.
To achieve the above object, the present invention provides a micro service fault analysis method, which includes the steps of:
when a fault analysis request is received, acquiring historical log data corresponding to the fault analysis request;
determining an abnormal characteristic identifier according to the fault analysis request, and extracting target characteristic data in the history log data according to the abnormal characteristic identifier;
and judging whether the target micro-service has faults or not based on a preset fault judging rule and the target characteristic data.
Optionally, the step of determining whether the target micro service has a fault based on a preset fault determination rule and the target feature data includes:
when the abnormal feature identifier is a service abnormal feature identifier, acquiring normal service data in the preset fault judgment rule;
comparing the normal service data with the target feature data to obtain a comparison result;
and when the comparison result is that the data are inconsistent, judging that the target micro-service has faults, and generating a fault analysis result according to the difference information in the comparison result.
Optionally, after the step of determining that the target micro service has a fault and generating a fault analysis result according to the difference information in the comparison result when the comparison result is inconsistent, the method further includes:
determining abnormal data in the comparison result, and extracting abnormal characteristics in the abnormal data;
determining an influence score of the abnormal feature based on a preset feature weight;
and determining the fault influence degree according to the influence score, and carrying out early warning according to the fault influence degree.
Optionally, the step of determining an abnormal feature identifier according to the fault analysis request and extracting target feature data in the history log data according to the abnormal feature identifier includes:
determining an abnormal characteristic identifier corresponding to the fault analysis request;
determining an abnormal feature extraction strategy based on the abnormal feature identification;
and extracting target feature data in the history log data according to the abnormal feature extraction strategy and the abnormal feature identification.
Optionally, after the step of determining whether the target micro service has a fault based on the preset fault determination rule and the target feature data, the method further includes:
acquiring performance information of the target micro-service when judging that the target micro-service has faults;
determining an optimization direction of the target micro-service according to the performance information;
and generating an optimization strategy according to the optimization direction.
Optionally, after the step of determining whether the target micro service has a fault based on the preset fault determination rule and the target feature data, the method further includes:
when the fault exists in the target micro-service, determining a fault-tolerant strategy corresponding to the target micro-service;
when the fault-tolerant policy is a switching service, acquiring a standby service corresponding to the target micro-service;
enabling the backup service to replace the target micro service.
Optionally, the step of determining the fault-tolerant policy corresponding to the target micro-service when determining that the target micro-service has a fault includes:
when the target micro-service is judged to have faults, fault information is obtained;
determining a fault node according to the fault information;
determining the influence degree of the fault node on the target micro-service according to the function information of the fault node;
and generating a fault-tolerant strategy corresponding to the target micro-service according to the influence degree.
In addition, to achieve the above object, the present invention also provides a micro service failure analysis apparatus, the apparatus comprising:
the receiving module is used for acquiring historical log data corresponding to the fault analysis request when the fault analysis request is received;
the extraction module is used for determining an abnormal characteristic identifier according to the fault analysis request and extracting target characteristic data in the history log data according to the abnormal characteristic identifier;
and the fault analysis module is used for judging whether the target micro-service has faults or not based on a preset fault judgment rule and the target characteristic data.
In addition, to achieve the above object, the present invention also proposes a micro-service failure analysis apparatus, the apparatus comprising: a memory, a processor, and a micro-service failure analysis program stored on the memory and executable on the processor, the micro-service failure analysis program configured to implement the steps of the micro-service failure analysis method as described above.
In addition, in order to achieve the above object, the present invention also proposes a storage medium having stored thereon a micro service failure analysis program which, when executed by a processor, implements the steps of the micro service failure analysis method as described above.
In the present invention, when a fault analysis request is received, historical log data corresponding to the fault analysis request is acquired; an abnormal feature identifier is determined according to the fault analysis request, and target feature data is extracted from the historical log data according to the abnormal feature identifier; and whether the target micro-service has a fault is judged based on a preset fault judgment rule and the target feature data, yielding a fault analysis result. Because the target feature data is extracted from the historical log data according to the abnormal feature identifier and then judged against the preset fault judgment rule, the invention can improve the fault analysis efficiency of micro-services compared with the existing approach of embedding a large amount of monitoring code into each micro-service requiring fault analysis and analyzing faults through hard coupling.
Drawings
FIG. 1 is a schematic diagram of a micro-service failure analysis device of a hardware operating environment according to an embodiment of the present invention;
FIG. 2 is a flow chart of a first embodiment of a micro service failure analysis method according to the present invention;
FIG. 3 is a flow chart of a second embodiment of the micro service fault analysis method of the present invention;
FIG. 4 is a flow chart of a third embodiment of a micro service failure analysis method according to the present invention;
FIG. 5 is a block diagram of a first embodiment of a micro service failure analysis apparatus according to the present invention.
The achievement of the objects, functional features and advantages of the present invention will be further described with reference to the accompanying drawings, in conjunction with the embodiments.
Detailed Description
It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
Referring to fig. 1, fig. 1 is a schematic structural diagram of a micro-service fault analysis device in a hardware running environment according to an embodiment of the present invention.
As shown in fig. 1, the micro service failure analysis apparatus may include: a processor 1001, such as a central processing unit (CPU), a communication bus 1002, a user interface 1003, a network interface 1004, and a memory 1005. The communication bus 1002 is used to enable connection and communication between these components. The user interface 1003 may include a display and an input unit such as a keyboard, and may optionally further include a standard wired interface and a wireless interface. The network interface 1004 may optionally include a standard wired interface and a wireless interface (e.g., a Wi-Fi interface). The memory 1005 may be a high-speed random access memory (RAM) or a stable non-volatile memory (NVM), such as a disk memory. The memory 1005 may also optionally be a storage device separate from the processor 1001 described above.
Those skilled in the art will appreciate that the structure shown in fig. 1 does not constitute a limitation of the microservice failure analysis apparatus, and may include more or fewer components than shown, or may combine certain components, or a different arrangement of components.
As shown in fig. 1, an operating system, a network communication module, a user interface module, and a micro service failure analysis program may be included in the memory 1005 as one type of storage medium.
In the micro service failure analysis apparatus shown in fig. 1, the network interface 1004 is mainly used for data communication with a network server; the user interface 1003 is mainly used for data interaction with a user; the processor 1001 and the memory 1005 in the micro service fault analysis apparatus of the present invention may be provided in the micro service fault analysis apparatus, and the micro service fault analysis apparatus calls the micro service fault analysis program stored in the memory 1005 through the processor 1001 and executes the micro service fault analysis method provided by the embodiment of the present invention.
Based on the foregoing micro service fault analysis device, an embodiment of the present invention provides a micro service fault analysis method, and referring to fig. 2, fig. 2 is a flow chart of a first embodiment of the micro service fault analysis method of the present invention.
In this embodiment, the method for analyzing the micro service fault includes the following steps:
Step S10: when a fault analysis request is received, historical log data corresponding to the fault analysis request is obtained.
It should be noted that, the execution body of the embodiment may be a computing service device with functions of data processing, network communication and program running, such as a mobile phone, a tablet computer, a personal computer, or an electronic device or a micro-service fault analysis device capable of implementing the above functions. The present embodiment and the following embodiments will be described below by taking the foregoing micro-service failure analysis apparatus as an example.
It should be noted that the fault analysis request may be an instruction generated by a user to monitor whether the target micro-service is faulty. It may include an identifier of the micro-service to be monitored, the fault type to be monitored, the service (business function) to be monitored, and the like. For example, the request may ask to monitor whether code faults such as a 404 error or a null pointer exception occur in the target micro-service, or whether a certain service of the target micro-service is abnormal. The historical log data corresponding to the fault analysis request may be the log data generated by the target micro-service during the monitoring period corresponding to the fault analysis request.
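As an illustrative, non-limiting sketch of step S10, the request and the log lookup might be modelled as follows. The class names, field names and sample records are assumptions made for illustration; the specification does not prescribe a request format or a log store.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class FaultAnalysisRequest:
    """Hypothetical request shape; the field names are illustrative only."""
    service_id: str          # identifier of the micro-service to monitor
    fault_type: str          # e.g. "404" or "null_pointer"
    period_start: datetime   # start of the monitoring period
    period_end: datetime     # end of the monitoring period


@dataclass(frozen=True)
class LogRecord:
    """Hypothetical parsed log entry."""
    service_id: str
    timestamp: datetime
    message: str


def fetch_history_logs(records, request):
    """Return the target service's records inside the monitoring period."""
    return [
        r for r in records
        if r.service_id == request.service_id
        and request.period_start <= r.timestamp <= request.period_end
    ]


records = [
    LogRecord("order-service", datetime(2023, 9, 1, 10, 0), "GET /orders/7 returned 404"),
    LogRecord("order-service", datetime(2023, 9, 2, 10, 0), "GET /orders/8 returned 200"),
    LogRecord("user-service", datetime(2023, 9, 1, 10, 5), "java.lang.NullPointerException"),
]
request = FaultAnalysisRequest(
    "order-service", "404", datetime(2023, 9, 1, 0, 0), datetime(2023, 9, 1, 23, 59)
)
history = fetch_history_logs(records, request)
print([r.message for r in history])
```

Only the record that belongs to the requested service and falls inside the monitoring period is returned; the monitored service itself needs no embedded monitoring code.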
Step S20: determining an abnormal feature identifier according to the fault analysis request, and extracting target feature data from the historical log data according to the abnormal feature identifier.
It should be noted that the abnormal feature identifier may be a feature of the fault to be monitored. For example, if the request is to monitor whether a null pointer exception occurs in the target micro-service, the abnormal feature identifier may be NullPointerException; if the request is to monitor whether a 404 error occurs in the target micro-service, the abnormal feature identifier may be 404; and if the request is to monitor whether a certain service is abnormal, the abnormal feature identifier may be a feature related to the service to be monitored, such as parameters, fields or names that the service may involve.
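A minimal sketch of this lookup and extraction is given below, assuming the logs are available as plain text lines. The mapping table, the function name and the sample lines are hypothetical and serve only to illustrate how an abnormal feature identifier can select target feature data.

```python
import re

# Hypothetical mapping from the fault type named in a request to the
# identifier searched for in the logs.
FEATURE_IDENTIFIERS = {
    "null_pointer": "NullPointerException",
    "not_found": "404",
}


def extract_target_features(log_lines, fault_type):
    """Return the log lines that contain the identifier as a whole token."""
    identifier = FEATURE_IDENTIFIERS[fault_type]
    # Word boundaries stop "404" from matching inside values such as "1404".
    pattern = re.compile(r"\b" + re.escape(identifier) + r"\b")
    return [line for line in log_lines if pattern.search(line)]


log_lines = [
    "2023-09-01 10:00:01 GET /orders/7 status=404",
    "2023-09-01 10:00:02 GET /orders/8 status=200 took=1404ms",
    "2023-09-01 10:00:03 java.lang.NullPointerException at OrderService.load",
]
print(extract_target_features(log_lines, "not_found"))
print(extract_target_features(log_lines, "null_pointer"))
```

Matching on whole tokens is a design choice of this sketch: a bare substring search for 404 would also pick up the unrelated latency value in the second line.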
It should be appreciated that, to improve the compatibility and adaptability of the target micro-service, service level agreements (SLAs) may be used to manage the service commitments made to clients when network requests are received, thereby improving service reliability and stability. An abstract interface adapter and a general data packet structure format may also be defined, so that when the data format or language of a received network request is inconsistent with that of the target micro-service, the network request is converted into a form the target micro-service can recognize.
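The adapter idea can be sketched as follows, under the assumption that incoming requests arrive either as JSON or as URL query strings. The Packet structure, the adapter classes and the field name action are hypothetical; the specification does not define the packet format.

```python
import json
from abc import ABC, abstractmethod
from dataclasses import dataclass
from urllib.parse import parse_qsl


@dataclass(frozen=True)
class Packet:
    """Hypothetical general packet structure understood by the target service."""
    action: str
    payload: dict


class RequestAdapter(ABC):
    """Abstract interface adapter: converts a raw request into a Packet."""

    @abstractmethod
    def to_packet(self, raw: str) -> Packet:
        ...


class JsonAdapter(RequestAdapter):
    def to_packet(self, raw: str) -> Packet:
        data = json.loads(raw)
        return Packet(action=data["action"], payload=data.get("payload", {}))


class QueryStringAdapter(RequestAdapter):
    def to_packet(self, raw: str) -> Packet:
        fields = dict(parse_qsl(raw))
        action = fields.pop("action")  # remaining fields become the payload
        return Packet(action=action, payload=fields)


json_packet = JsonAdapter().to_packet('{"action": "query", "payload": {"id": "7"}}')
query_packet = QueryStringAdapter().to_packet("action=query&id=7")
print(json_packet == query_packet)
```

Both adapters yield the same Packet, so the target micro-service only needs to understand one structure regardless of the format in which the request arrived.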
Further, in order to accurately determine whether the target micro service has a fault, the step of determining an abnormal feature identifier according to the fault analysis request and extracting target feature data in the history log data according to the abnormal feature identifier includes:
determining an abnormal characteristic identifier corresponding to the fault analysis request;
determining an abnormal feature extraction strategy based on the abnormal feature identification;
and extracting target feature data in the history log data according to the abnormal feature extraction strategy and the abnormal feature identification.
It should be noted that the abnormal feature extraction strategy may be extraction based on statistical features, frequency-domain features, time-domain features, and the like. Determining the abnormal feature extraction strategy based on the abnormal feature identifier may be determining the strategy according to the fault judgment condition corresponding to the abnormal feature identifier. For example, if the fault judgment condition corresponding to the abnormal feature identifier is that the fault frequency is greater than a preset frequency threshold, the abnormal feature extraction strategy may extract features in the frequency-domain manner; if the fault judgment condition is that the total number of faults is greater than a preset fault count threshold, the abnormal feature extraction strategy may extract features in the statistical manner.
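The choice between strategies can be illustrated with the sketch below. It interprets the frequency-based strategy loosely as the peak number of occurrences within a sliding time window, which is an assumption; the specification does not define how the frequency feature is computed, and the function and key names are hypothetical.

```python
from datetime import datetime, timedelta


def extract_total_count(timestamps):
    """Statistical strategy: total number of matching log entries."""
    return len(timestamps)


def extract_peak_frequency(timestamps, window=timedelta(minutes=1)):
    """Frequency strategy: largest number of entries inside any sliding window."""
    ordered = sorted(timestamps)
    peak, start = 0, 0
    for end, current in enumerate(ordered):
        # Advance the window start until it lies within `window` of `current`.
        while current - ordered[start] >= window:
            start += 1
        peak = max(peak, end - start + 1)
    return peak


# Hypothetical mapping from the kind of judgment condition to a strategy.
STRATEGIES = {
    "total_exceeds_threshold": extract_total_count,
    "frequency_exceeds_threshold": extract_peak_frequency,
}

base = datetime(2023, 9, 1, 10, 0, 0)
timestamps = [base + timedelta(seconds=s) for s in (0, 10, 20, 200, 400)]
print(STRATEGIES["total_exceeds_threshold"](timestamps))
print(STRATEGIES["frequency_exceeds_threshold"](timestamps))
```

With the sample timestamps, the statistical strategy reports five occurrences in total, while the frequency strategy reports that at most three fall within any one-minute window.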
Step S30: judging whether the target micro-service has a fault based on a preset fault judgment rule and the target feature data.
It should be noted that judging whether the target micro-service has a fault based on the preset fault judgment rule and the target feature data may be judging whether the target feature data meets a fault judgment condition in the preset fault judgment rule, and if so, judging that the target micro-service has a fault. The preset fault judgment rule includes fault judgment conditions corresponding to various types of faults. For example, when the fault analysis request is to detect whether the target micro-service has a 404 fault, the fault judgment condition may be that the 404 fault occurs more than 3 times within a preset period; when this condition is met, the target micro-service is judged to have a 404 fault. When the fault analysis request is to detect whether the target micro-service fails to acquire service data, the result of a service data acquisition request may draw on multiple data sources: data is requested from each source and the returned data is spliced into the data request result. Some of these data sources may legitimately return no data, but they cannot all be empty. In this case the fault judgment condition may be that, when the number of data sources returning no data exceeds a preset threshold, a service data acquisition failure is judged to have occurred. When the fault analysis request is to detect whether the memory of the target micro-service is full, the fault judgment condition may be that a memory-full fault occurs more than 3 times within a preset period; when this condition is met, the target micro-service is judged to have a memory-full fault.
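The rule lookup described above can be sketched as a table of conditions keyed by fault type. The count threshold of 3 comes from the examples in the text; the threshold of 2 empty data sources, the key names and the shape of the feature data are assumptions made for illustration.

```python
def exceeds_count(threshold):
    """Build a condition met when the number of occurrences exceeds `threshold`."""
    return lambda feature_data: len(feature_data["occurrences"]) > threshold


def too_many_empty_sources(threshold):
    """Build a condition met when the number of empty sources exceeds `threshold`."""
    return lambda feature_data: feature_data["empty_sources"] > threshold


# Hypothetical rule table: fault type -> fault judgment condition.
FAULT_RULES = {
    "404": exceeds_count(3),
    "memory_full": exceeds_count(3),
    "data_acquisition_failed": too_many_empty_sources(2),
}


def has_fault(fault_type, feature_data):
    """Return True when the target feature data meets the rule's condition."""
    return FAULT_RULES[fault_type](feature_data)


print(has_fault("404", {"occurrences": ["e1", "e2", "e3", "e4"]}))
print(has_fault("404", {"occurrences": ["e1", "e2", "e3"]}))
print(has_fault("data_acquisition_failed", {"empty_sources": 1}))
```

Keeping the conditions in a table separate from the monitored services means a new fault type can be covered by adding a rule rather than by changing service code.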
In a specific implementation, if a user wants to monitor hardware and/or service faults in the target micro-service, relevant information required to be acquired when judging whether the faults exist can be preset, the target micro-service prints the relevant information in a log, and then an abnormal feature identifier corresponding to the faults is set according to feature information in the faults, so that fault information related to the faults is extracted from the log printed by the target micro-service by using the abnormal feature identifier. And then judging whether the fault exists in the target micro-service or not by using a fault judging condition corresponding to the fault in a preset fault judging rule and fault information related to the fault.
In this embodiment, after the target micro-service is judged to have a fault based on the preset fault judgment rule and the target feature data, the dependency relationships between micro-services may also be obtained, and the influence range of the fault determined from them. This forms a mesh-like dependency trace, which is used to analyze the root cause of the fault and to issue early warnings according to the influence range of the fault. In this embodiment, simulated fault injection tests may also be introduced to evaluate the fault tolerance and robustness of the target micro-service and to discover potential weaknesses.
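One possible way to derive the influence range from dependency relationships is a breadth-first walk over the callers of the faulty service, sketched below. The dependency map and service names are hypothetical, and the specification does not state which traversal is used.

```python
from collections import deque

# Hypothetical dependency map: each service lists the services it calls.
DEPENDS_ON = {
    "gateway": ["order-service", "user-service"],
    "order-service": ["inventory-service", "payment-service"],
    "user-service": [],
    "inventory-service": [],
    "payment-service": [],
}


def affected_services(depends_on, faulty):
    """Return every service that directly or transitively calls `faulty`."""
    # Invert the edges so the walk goes from a callee to its callers.
    callers = {name: [] for name in depends_on}
    for caller, callees in depends_on.items():
        for callee in callees:
            callers.setdefault(callee, []).append(caller)

    seen, queue = set(), deque([faulty])
    while queue:
        current = queue.popleft()
        for caller in callers.get(current, []):
            if caller not in seen:
                seen.add(caller)
                queue.append(caller)
    return seen


print(sorted(affected_services(DEPENDS_ON, "payment-service")))
```

In this sample graph, a fault in the payment service reaches the order service and, through it, the gateway, so an early warning could be addressed to the owners of both.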
Further, in order to improve the experience of the user, after the step of determining whether the target micro service has a fault based on the preset fault determination rule and the target feature data, the method further includes:
acquiring performance information of the target micro-service when judging that the target micro-service has faults;
determining an optimization direction of the target micro-service according to the performance information;
and generating an optimization strategy according to the optimization direction.
It should be noted that the performance information may include the message passing mode, event-driven architecture, memory usage, CPU usage, communication protocol, task processing time, database structure, code quality, whether a cache is present, and similar information about the target micro-service. Determining the optimization direction of the target micro-service according to the performance information may be determining the performance bottleneck of the target micro-service by analyzing the performance information, and determining the optimization direction according to the performance bottleneck. The optimization direction may include:
Asynchronous communication: use asynchronous message passing or an event-driven architecture to reduce the waiting time caused by synchronous calls and improve the concurrency of the system.
Caching data: cache frequently accessed data to reduce the number of requests to back-end services and improve response speed.
Horizontal scaling: scale the micro-service horizontally by adding instances according to system load, improving the processing capacity and throughput of the system.
Streamlined communication protocol: optimize the communication protocol between micro-services to reduce communication overhead and the volume of data transmitted.
Asynchronous processing: design time-consuming operations as asynchronous tasks processed in the background to improve response speed.
Database optimization: design the database structure sensibly, and use appropriate indexes and query optimization techniques to improve read and write performance.
Service splitting: split a large micro-service into smaller, more focused service units to improve the scalability and flexibility of the system.
Resource management: manage the micro-service's consumption of resources such as memory and CPU sensibly to avoid waste and bottlenecks.
Code optimization: profile and optimize the micro-service's code, reducing unnecessary computation, loops and resource usage to improve execution efficiency.
Monitoring and tuning: use monitoring tools to monitor the micro-service in real time, discover performance bottlenecks and abnormal conditions promptly, and tune accordingly.
Stress testing: stress-test the system by simulating high-concurrency and large-data-volume scenarios, evaluate its performance and bottlenecks, and identify room for optimization.
Capacity planning: plan capacity according to business requirements and predicted load to ensure that the micro-service can meet future expansion needs.
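As a non-limiting sketch of how performance information might be mapped to optimization directions, the rule-based function below checks a few indicators against thresholds. The dictionary keys and every threshold value are assumptions chosen for illustration; the specification leaves the bottleneck analysis open.

```python
def optimization_directions(performance):
    """Map performance information to candidate optimization directions."""
    directions = []
    # Assumed threshold: usage above 85% is treated as a resource bottleneck.
    if performance.get("memory_usage", 0.0) > 0.85 or performance.get("cpu_usage", 0.0) > 0.85:
        directions.append("resource management")
    if performance.get("synchronous_calls", False):
        directions.append("asynchronous communication")
    if not performance.get("has_cache", True):
        directions.append("caching data")
    # Assumed threshold: tasks averaging over 2 seconds are moved to the background.
    if performance.get("avg_task_seconds", 0.0) > 2.0:
        directions.append("asynchronous processing")
    return directions


sample = {
    "memory_usage": 0.92,
    "cpu_usage": 0.40,
    "synchronous_calls": True,
    "has_cache": False,
    "avg_task_seconds": 0.3,
}
print(optimization_directions(sample))
```

An optimization strategy could then be generated from the returned directions, for example by ranking them according to how far each indicator exceeds its threshold.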
In this embodiment, when a fault analysis request is received, historical log data corresponding to the fault analysis request is acquired; an abnormal feature identifier is determined according to the fault analysis request, and target feature data is extracted from the historical log data according to the abnormal feature identifier; and whether the target micro-service has a fault is judged based on a preset fault judgment rule and the target feature data, yielding a fault analysis result. Because this embodiment extracts the target feature data from the historical log data according to the abnormal feature identifier and then judges it against the preset fault judgment rule, it can improve the fault analysis efficiency of micro-services compared with the existing approach of embedding a large amount of monitoring code into each micro-service requiring fault analysis and analyzing faults through hard coupling.
Referring to fig. 3, fig. 3 is a flowchart illustrating a second embodiment of a microservice fault analysis method according to the present invention.
Based on the first embodiment, in this embodiment, the step S30 includes:
Step S301: when the abnormal feature identifier is a service abnormal feature identifier, acquiring normal service data in the preset fault judgment rule.
It should be noted that, when the preset fault judgment rule contains preset normal service data (data in the normal state) for the fault to be monitored in the fault analysis request, the fault to be monitored may be determined to be a service fault, that is, a business-level fault. The feature or keyword that corresponds to the service fault to be monitored and is used to extract fault-related information from the logs of the target micro-service is the service abnormal feature identifier. When a fault analysis request is received, the abnormal feature identifier in the request is matched against the preset fault judgment rule to obtain a matching result; if the matching result shows that the abnormal feature identifier is a service abnormal feature identifier, the normal service data corresponding to that identifier is extracted. For some service faults to be detected, the target micro-service does not print the relevant information in its logs by default; therefore, when such a service fault is to be monitored, the target micro-service needs to be configured to print the service-related information.
Step S302: comparing the normal service data with the target feature data to obtain a comparison result.
It should be noted that, the comparing the normal service data with the target feature data may be comparing the normal service data with the target feature data in a text manner, determining whether the normal service data and the target feature data are the same, if they are different, obtaining difference data of the normal service data and the target feature data, and generating a comparison result.
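A text comparison of this kind can be sketched with Python's standard difflib module. Treating both data sets as newline-separated key=value text is an assumption of this example, as are the function name and sample values.

```python
import difflib


def compare_service_data(normal_text, target_text):
    """Compare two texts line by line and return (consistent, differences)."""
    diff = difflib.ndiff(normal_text.splitlines(), target_text.splitlines())
    # ndiff prefixes lines only in the first text with "- " and lines only in
    # the second with "+ "; hint lines starting with "? " are ignored here.
    differences = [line for line in diff if line.startswith(("- ", "+ "))]
    return (not differences, differences)


normal = "order_id=7\nstatus=PAID\namount=100"
target = "order_id=7\nstatus=FAILED\namount=100"
consistent, differences = compare_service_data(normal, target)
print(consistent)
print(differences)
```

The returned differences carry the lines that diverge from the normal service data, which is the difference information used when generating a fault analysis result.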
Step S303: when the comparison result is that the data are inconsistent, judging that the target micro-service has a fault, and generating a fault analysis result according to the difference information in the comparison result.
It should be noted that, the generating the fault analysis result according to the difference information in the comparison result may be determining difference data in normal service data and the target feature data according to the difference information, determining a cause of a possible target micro-service fault according to a source of the difference data, and generating the fault analysis result according to the cause of the fault.
In a specific implementation, when the comparison result is that the normal service data and the target feature data are inconsistent, determining that the target micro-service has a fault, determining a source or a generation path of difference information in the comparison result, determining a fault cause which possibly causes the fault according to the source or the generation path of the difference information, and generating a fault analysis result according to the fault cause.
Further, in order to timely early warn the fault when the fault occurs, after the step of determining that the target micro-service has the fault and generating the fault analysis result according to the difference information in the comparison result when the comparison result is that the data is inconsistent, the method further includes:
determining abnormal data in the comparison result, and extracting abnormal characteristics in the abnormal data;
determining an influence score of the abnormal feature based on a preset feature weight;
and determining the fault influence degree according to the influence score, and carrying out early warning according to the fault influence degree.
The abnormal feature may be a parameter value, a field, a keyword, or the like in the abnormal data. The preset feature weight may be a preset degree of influence of each abnormal feature on the target micro-service. For example, for a data acquisition task, the feature weight of the primary key in the data may be set higher and the feature weights of other fields lower; likewise, the feature weight of certain key information may be set higher and that of additional information lower. The specific weights may be customized according to the actual scenario. Determining the influence score of the abnormal feature based on the preset feature weight may be adding up the feature weight values corresponding to each abnormal feature in the abnormal data to obtain the influence score. Determining the fault influence degree according to the influence score may be: when the influence score is greater than a preset first threshold, judging the fault influence degree to be high; when the influence score is less than a preset second threshold, judging it to be low; and when the influence score is less than or equal to the preset first threshold and greater than or equal to the preset second threshold, judging it to be medium. The preset first threshold and the preset second threshold may be customized, the first threshold being greater than the second. Performing early warning according to the fault influence degree may be performing early warning according to a preset early warning strategy corresponding to that fault influence degree.
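The scoring and grading described above can be sketched as follows. The weight values, the two threshold values and the field names are hypothetical; only the summation and the three-way comparison follow the text.

```python
# Hypothetical weights: the primary key matters more than auxiliary fields.
FEATURE_WEIGHTS = {"order_id": 5.0, "amount": 3.0, "remark": 0.5}
FIRST_THRESHOLD = 6.0   # a score above this means a high influence degree
SECOND_THRESHOLD = 2.0  # a score below this means a low influence degree


def impact_score(abnormal_features, weights):
    """Sum the preset weights of the abnormal features (unknown ones count as 0)."""
    return sum(weights.get(name, 0.0) for name in abnormal_features)


def impact_level(score, first=FIRST_THRESHOLD, second=SECOND_THRESHOLD):
    """Grade the score using the first (upper) and second (lower) thresholds."""
    if score > first:
        return "high"
    if score < second:
        return "low"
    return "medium"


for features in (["order_id", "amount"], ["amount"], ["remark"]):
    score = impact_score(features, FEATURE_WEIGHTS)
    print(features, score, impact_level(score))
```

Each level could then be bound to its own early warning strategy, for example paging an on-call engineer for a high level and only recording a low one.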
In this embodiment, when the abnormal feature identifier is a service abnormal feature identifier, the normal service data in the preset fault determination rule is obtained; the normal service data is compared with the target feature data to obtain a comparison result; and when the comparison result shows that the data are inconsistent, the target micro-service is judged to have a fault and a fault analysis result is generated from the difference information in the comparison result. Because the fault judgment is made by comparing the normal service data with the target feature data, the efficiency of fault analysis for the target micro-service can be improved and its faults can be discovered in time.
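As a rough sketch of this comparison step, the following assumes that both the normal service data and the target feature data are flat key-value records. The record layout, the `MISSING` marker, and the shape of the returned result are hypothetical choices for illustration.

```python
from typing import Any, Dict, Mapping, Tuple

# Marker used when a field is absent on one side (an assumption of this sketch).
MISSING = "<missing>"


def compare_records(normal: Mapping[str, Any],
                    target: Mapping[str, Any]) -> Dict[str, Tuple[Any, Any]]:
    """Return {field: (expected, actual)} for every field that differs."""
    differences: Dict[str, Tuple[Any, Any]] = {}
    for key in sorted(set(normal) | set(target)):
        expected = normal.get(key, MISSING)
        actual = target.get(key, MISSING)
        if expected != actual:
            differences[key] = (expected, actual)
    return differences


def analyze_service_data(normal: Mapping[str, Any],
                         target: Mapping[str, Any]) -> Dict[str, Any]:
    """Judge a fault when the data are inconsistent and report the differences."""
    differences = compare_records(normal, target)
    return {"faulty": bool(differences), "differences": differences}
```

The difference map plays the role of the difference information from which the fault analysis result is generated.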
Referring to fig. 4, fig. 4 is a flowchart illustrating a third embodiment of a microservice fault analysis method according to the present invention.
Based on the above embodiments, in this embodiment, after step S30, the method further includes:
Step S40: when it is determined that the target micro-service has a fault, determining a fault-tolerant policy corresponding to the target micro-service.
It should be noted that the fault-tolerant policy includes switching to a standby service instance or adopting a recovery mechanism, where the recovery mechanism may be restarting the target micro-service or switching nodes according to the faulty node.
Further, when it is determined that the target micro-service has a fault, the step of determining a fault tolerance policy corresponding to the target micro-service includes:
acquiring fault information when it is determined that the target micro-service has a fault;
determining a fault node according to the fault information;
determining the influence degree of the fault node on the target micro-service according to the function information of the fault node;
and generating a fault-tolerant strategy corresponding to the target micro-service according to the influence degree.
The fault information may be the target feature data extracted from the history log data according to the abnormal feature identifier when it is determined that the target micro-service has a fault. Determining the fault node according to the fault information may be determining the target node corresponding to the fault, that is, a node that may have caused the fault. Determining the influence degree of the fault node on the target micro-service according to the function information of the fault node may be done by looking up the function information in a preset function influence degree mapping table, which may include the influence degree of each function of the target micro-service. Generating the fault-tolerant policy corresponding to the target micro-service according to the influence degree may proceed as follows: when the influence degree is medium or high, the fault-tolerant policy is judged to be switching service, that is, replacing the target micro-service with a standby service; when the influence degree is low, the fault-tolerant policy is judged to be node switching or service restart, that is, replacing the fault node with a standby node or restarting the target micro-service.
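The policy selection described above amounts to a table lookup followed by a branch on the influence degree. The sketch below assumes a small hypothetical mapping table; the function names, policy labels, and the choice to treat unlisted functions as high impact are illustrative assumptions, not part of the patent.

```python
from typing import Mapping

# Hypothetical preset function influence degree mapping table.
FUNCTION_INFLUENCE = {
    "payment": "high",
    "order_query": "medium",
    "report_export": "low",
}


def choose_fault_tolerant_policy(
        node_function: str,
        influence_table: Mapping[str, str] = FUNCTION_INFLUENCE) -> str:
    """Pick a fault-tolerant policy from the faulty node's function.

    Unknown functions are treated as 'high' so that the safer policy is
    chosen; this default is an assumption of the sketch.
    """
    influence = influence_table.get(node_function, "high")
    if influence in ("medium", "high"):
        # Replace the target micro-service with a standby service.
        return "switch_service"
    # Low influence: replace the faulty node or restart the service.
    return "switch_node_or_restart"
```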
Step S50: when the fault-tolerant policy is switching service, acquiring the standby service corresponding to the target micro-service.
Switching service may mean switching to a standby service instance. The standby service has the same function as the target micro-service and can replace it when the target micro-service fails.
Step S60: enabling the standby service to replace the target micro-service.
It should be noted that enabling the standby service to replace the target micro-service may consist of enabling the standby service and disabling the target micro-service.
In this embodiment, when it is determined that the target micro-service has a fault, a fault-tolerant policy corresponding to the target micro-service is determined; when the fault-tolerant policy is switching service, the standby service corresponding to the target micro-service is acquired; and the standby service is enabled to replace the target micro-service. By incorporating an automatic fault-tolerant policy, this embodiment allows the target micro-service to switch automatically to a standby service instance or to adopt a recovery mechanism when a fault occurs, thereby ensuring the availability and stability of the target micro-service.
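Steps S50 and S60 can be illustrated with a small in-memory registry. This is only a sketch: `ServiceInstance`, `ServiceRegistry`, and their methods are hypothetical names, and a real deployment would rely on its own service discovery or orchestration layer.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class ServiceInstance:
    name: str
    enabled: bool = False


@dataclass
class ServiceRegistry:
    """Hypothetical registry pairing each target service with a standby."""
    targets: Dict[str, ServiceInstance] = field(default_factory=dict)
    standbys: Dict[str, ServiceInstance] = field(default_factory=dict)

    def register(self, target: ServiceInstance, standby: ServiceInstance) -> None:
        # The target serves traffic; the standby stays idle until needed.
        target.enabled = True
        standby.enabled = False
        self.targets[target.name] = target
        self.standbys[target.name] = standby

    def fail_over(self, target_name: str) -> ServiceInstance:
        # Step S50: acquire the standby service for the faulty target.
        standby = self.standbys.get(target_name)
        if standby is None:
            raise LookupError(f"no standby registered for {target_name!r}")
        # Step S60: enable the standby service and disable the target.
        standby.enabled = True
        self.targets[target_name].enabled = False
        return standby
```

Enabling the standby before disabling the target is a design choice of this sketch, intended to avoid a window in which neither instance is enabled.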
Referring to fig. 5, fig. 5 is a block diagram showing the structure of a first embodiment of the micro-service fault analysis device according to the present invention.
As shown in fig. 5, the micro service fault analysis device according to the embodiment of the present invention includes:
the receiving module 10 is configured to obtain, when a fault analysis request is received, historical log data corresponding to the fault analysis request;
an extracting module 20, configured to determine an abnormal feature identifier according to the fault analysis request, and extract target feature data in the history log data according to the abnormal feature identifier;
the fault analysis module 30 is configured to determine whether the target micro service has a fault based on a preset fault determination rule and the target feature data.
When a fault analysis request is received, this embodiment acquires the history log data corresponding to the fault analysis request; determines an abnormal feature identifier according to the fault analysis request and extracts target feature data from the history log data according to the abnormal feature identifier; and judges whether the target micro-service has a fault based on a preset fault determination rule and the target feature data, obtaining a fault analysis result. Because this embodiment extracts the target feature data from the history log data and judges faults from that data and a preset rule, it avoids the existing approach of embedding a large amount of monitoring code into each micro-service that needs fault analysis and hard-coupling the analysis to the service, and can therefore improve the efficiency of micro-service fault analysis.
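The three modules can be pictured as a small pipeline that works only on stored logs, without adding monitoring code to the micro-service itself. The sketch below is illustrative: the request fields, the in-memory `LOG_STORE`, the substring-based extraction, and the rule passed as a callable are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class FaultAnalysisRequest:
    service: str
    abnormal_feature_id: str


# Hypothetical log store keyed by service name.
LOG_STORE: Dict[str, List[str]] = {
    "order-service": [
        "2023-09-14 10:00:01 INFO order created",
        "2023-09-14 10:00:02 ERROR TIMEOUT calling inventory",
        "2023-09-14 10:00:05 ERROR TIMEOUT calling inventory",
    ]
}


def receive(request: FaultAnalysisRequest) -> List[str]:
    """Receiving module: fetch the history log data for the request."""
    return LOG_STORE.get(request.service, [])


def extract(request: FaultAnalysisRequest, logs: List[str]) -> List[str]:
    """Extraction module: keep log lines carrying the abnormal feature identifier."""
    return [line for line in logs if request.abnormal_feature_id in line]


def analyze(target_data: List[str], rule: Callable[[List[str]], bool]) -> bool:
    """Fault analysis module: apply the preset fault determination rule."""
    return rule(target_data)


def run_fault_analysis(request: FaultAnalysisRequest,
                       rule: Callable[[List[str]], bool]) -> bool:
    logs = receive(request)
    target_data = extract(request, logs)
    return analyze(target_data, rule)
```

For instance, a rule such as `lambda rows: len(rows) >= 2` would flag the sample service as faulty once two matching log lines are found.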
It should be noted that the above-described working procedure is merely illustrative, and does not limit the scope of the present invention, and in practical application, a person skilled in the art may select part or all of them according to actual needs to achieve the purpose of the embodiment, which is not limited herein.
In addition, technical details not described in detail in this embodiment may refer to the micro service fault analysis method provided in any embodiment of the present invention, which is not described herein.
Based on the first embodiment of the micro-service fault analysis device of the present invention, a second embodiment of the micro-service fault analysis device of the present invention is proposed.
In this embodiment, the fault analysis module 30 is further configured to obtain normal service data in the preset fault determination rule when the abnormal feature identifier is a service abnormal feature identifier;
comparing the normal service data with the target feature data to obtain a comparison result;
and when the comparison result is that the data are inconsistent, judging that the target micro-service has faults, and generating a fault analysis result according to the difference information in the comparison result.
Further, the fault analysis module 30 is further configured to determine abnormal data in the comparison result, and extract abnormal features in the abnormal data;
determining an influence score of the abnormal feature based on a preset feature weight;
and determining the fault influence degree according to the influence score, and carrying out early warning according to the fault influence degree.
Further, the extracting module 20 is further configured to determine an abnormal feature identifier corresponding to the fault analysis request;
determining an abnormal feature extraction strategy based on the abnormal feature identification;
and extracting target feature data in the history log data according to the abnormal feature extraction strategy and the abnormal feature identification.
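One way to picture the extraction steps just listed is a lookup from the kind of abnormal feature identifier to an extraction strategy. The two strategies below (keyword matching and regular-expression matching), the `STRATEGIES` table, and the sample log lines are hypothetical and serve only as an illustration.

```python
import re
from typing import Callable, Dict, List

# An extraction strategy takes an identifier and log lines and returns matches.
Strategy = Callable[[str, List[str]], List[str]]


def keyword_strategy(identifier: str, logs: List[str]) -> List[str]:
    """Keep lines that contain the identifier as a plain substring."""
    return [line for line in logs if identifier in line]


def regex_strategy(identifier: str, logs: List[str]) -> List[str]:
    """Keep lines that match the identifier interpreted as a regular expression."""
    pattern = re.compile(identifier)
    return [line for line in logs if pattern.search(line)]


# Hypothetical mapping from identifier kind to extraction strategy.
STRATEGIES: Dict[str, Strategy] = {
    "keyword": keyword_strategy,
    "regex": regex_strategy,
}


def extract_target_data(identifier_kind: str, identifier: str,
                        logs: List[str]) -> List[str]:
    """Select the strategy for the identifier kind, then apply it to the logs."""
    strategy = STRATEGIES[identifier_kind]
    return strategy(identifier, logs)
```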
Further, the fault analysis module 30 is further configured to obtain performance information of the target micro-service when it is determined that the target micro-service has a fault;
determining an optimization direction of the target micro-service according to the performance information;
and generating an optimization strategy according to the optimization direction.
Further, the fault analysis module 30 is further configured to determine a fault tolerance policy corresponding to the target micro-service when it is determined that the target micro-service has a fault;
when the fault-tolerant policy is a switching service, acquiring a standby service corresponding to the target micro-service;
enabling the backup service to replace the target micro service.
Further, the fault analysis module 30 is further configured to obtain fault information when it is determined that the target micro service has a fault;
determining a fault node according to the fault information;
determining the influence degree of the fault node on the target micro-service according to the function information of the fault node;
and generating a fault-tolerant strategy corresponding to the target micro-service according to the influence degree.
Other embodiments or specific implementation manners of the micro service fault analysis device of the present invention may refer to the above method embodiments, and are not described herein.
In addition, the embodiment of the invention also provides a storage medium, wherein a micro-service fault analysis program is stored on the storage medium, and the micro-service fault analysis program realizes the steps of the micro-service fault analysis method when being executed by a processor.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or system that comprises a list of elements includes not only those elements but also other elements not expressly listed or inherent to such process, method, article, or system. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of additional identical elements in the process, method, article, or system that comprises the element.
The foregoing embodiment numbers of the present invention are merely for the purpose of description, and do not represent the advantages or disadvantages of the embodiments.
From the above description of the embodiments, it will be clear to those skilled in the art that the above-described embodiment method may be implemented by means of software plus a necessary general hardware platform, but of course may also be implemented by means of hardware, but in many cases the former is a preferred embodiment. Based on such understanding, the technical solution of the present invention may be embodied essentially or in a part contributing to the prior art in the form of a software product stored in a storage medium (e.g. read-only memory/random-access memory, magnetic disk, optical disk), comprising instructions for causing a terminal device (which may be a mobile phone, a computer, a server, an air conditioner, or a network device, etc.) to perform the method according to the embodiments of the present invention.
The foregoing describes only preferred embodiments of the present invention and does not limit the patent scope of the invention. Any equivalent structure or equivalent process transformation made using the contents of the description and drawings of the present invention, or any direct or indirect application in other related technical fields, likewise falls within the scope of patent protection of the present invention.
Claims (10)
1. A micro service fault analysis method, characterized in that the micro service fault analysis method comprises the steps of:
when a fault analysis request is received, acquiring historical log data corresponding to the fault analysis request;
determining an abnormal characteristic identifier according to the fault analysis request, and extracting target characteristic data in the history log data according to the abnormal characteristic identifier;
and judging whether the target micro-service has faults or not based on a preset fault judging rule and the target characteristic data.
2. The micro service fault analysis method as claimed in claim 1, wherein the step of judging whether the target micro service has a fault based on a preset fault determination rule and the target feature data comprises:
when the abnormal feature identifier is a service abnormal feature identifier, acquiring normal service data in the preset fault judgment rule;
comparing the normal service data with the target feature data to obtain a comparison result;
and when the comparison result is that the data are inconsistent, judging that the target micro-service has faults, and generating a fault analysis result according to the difference information in the comparison result.
3. The micro service fault analysis method as claimed in claim 2, wherein the step of determining that the target micro service has a fault when the comparison result is that the data are inconsistent, and generating a fault analysis result according to the difference information in the comparison result, further comprises:
determining abnormal data in the comparison result, and extracting abnormal characteristics in the abnormal data;
determining an influence score of the abnormal feature based on a preset feature weight;
and determining the fault influence degree according to the influence score, and carrying out early warning according to the fault influence degree.
4. The micro service fault analysis method according to claim 1, wherein the step of determining an abnormal feature identifier according to the fault analysis request, and extracting target feature data in the history log data according to the abnormal feature identifier, comprises:
determining an abnormal characteristic identifier corresponding to the fault analysis request;
determining an abnormal feature extraction strategy based on the abnormal feature identification;
and extracting target feature data in the history log data according to the abnormal feature extraction strategy and the abnormal feature identification.
5. The method for analyzing micro service faults according to any of claims 1 to 4, further comprising, after the step of judging whether a fault exists in the target micro service based on a preset fault judgment rule and the target feature data:
acquiring performance information of the target micro-service when judging that the target micro-service has faults;
determining an optimization direction of the target micro-service according to the performance information;
and generating an optimization strategy according to the optimization direction.
6. The method for analyzing micro service faults according to any of claims 1 to 4, further comprising, after the step of judging whether a fault exists in the target micro service based on a preset fault judgment rule and the target feature data:
when it is determined that the target micro-service has a fault, determining a fault-tolerant policy corresponding to the target micro-service;
when the fault-tolerant policy is a switching service, acquiring a standby service corresponding to the target micro-service;
enabling the backup service to replace the target micro service.
7. The method for analyzing micro service faults as claimed in claim 6, wherein the step of determining a fault tolerant policy corresponding to the target micro service when it is determined that the target micro service has faults comprises:
acquiring fault information when it is determined that the target micro-service has a fault;
determining a fault node according to the fault information;
determining the influence degree of the fault node on the target micro-service according to the function information of the fault node;
and generating a fault-tolerant strategy corresponding to the target micro-service according to the influence degree.
8. A micro service fault analysis device, characterized in that the micro service fault analysis device comprises:
the receiving module is used for acquiring historical log data corresponding to the fault analysis request when the fault analysis request is received;
the extraction module is used for determining an abnormal characteristic identifier according to the fault analysis request and extracting target characteristic data in the history log data according to the abnormal characteristic identifier;
and the fault analysis module is used for judging whether the target micro-service has faults or not based on a preset fault judgment rule and the target characteristic data.
9. Micro service fault analysis equipment, characterized in that the equipment comprises: a memory, a processor, and a micro service fault analysis program stored on the memory and executable on the processor, the micro service fault analysis program being configured to implement the steps of the micro service fault analysis method according to any one of claims 1 to 7.
10. A storage medium, wherein a micro service fault analysis program is stored on the storage medium, and the micro service fault analysis program, when executed by a processor, implements the steps of the micro service fault analysis method according to any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311190434.8A CN117370052B (en) | 2023-09-14 | 2023-09-14 | Microservice fault analysis method, device, equipment and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN117370052A (en) | 2024-01-09 |
CN117370052B (en) | 2024-04-26 |
Family
ID=89401218
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202311190434.8A Active CN117370052B (en) | 2023-09-14 | 2023-09-14 | Microservice fault analysis method, device, equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117370052B (en) |
Citations (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101201786A (en) * | 2006-12-13 | 2008-06-18 | 中兴通讯股份有限公司 | Method and device for monitoring fault log |
CN108376107A (en) * | 2018-03-01 | 2018-08-07 | 郑州云海信息技术有限公司 | A kind of method, apparatus, equipment and the storage medium of server failure detection |
CN109710444A (en) * | 2018-12-26 | 2019-05-03 | 九逸(北京)信息技术有限公司 | The method and relevant device of the abnormality processing of intelligent hospital information system |
CN110888783A (en) * | 2019-11-21 | 2020-03-17 | 望海康信(北京)科技股份公司 | Monitoring method and device of micro-service system and electronic equipment |
CN110888755A (en) * | 2019-11-15 | 2020-03-17 | 亚信科技(中国)有限公司 | A method and device for finding abnormal root cause nodes in a microservice system |
CN111209134A (en) * | 2020-01-02 | 2020-05-29 | 广州虎牙科技有限公司 | Log information based fault analysis method and device, storage medium and equipment |
CN111240876A (en) * | 2020-01-06 | 2020-06-05 | 远光软件股份有限公司 | Fault positioning method and device for microservice, storage medium and terminal |
CA3096768A1 (en) * | 2019-10-24 | 2021-04-24 | Next Pathway Inc. | System and method for automated microservice source code generation and deployment |
CN114153703A (en) * | 2021-12-08 | 2022-03-08 | 中国建设银行股份有限公司 | Micro-service exception positioning method and device, electronic equipment and program product |
US20220147409A1 (en) * | 2020-11-10 | 2022-05-12 | International Business Machines Corporation | Identification and/or prediction of failures in a microservice architecture for enabling automatically-repairing solutions |
US11561868B1 (en) * | 2021-12-23 | 2023-01-24 | Intel Corporation | Management of microservices failover |
Also Published As
Publication number | Publication date |
---|---|
CN117370052B (en) | 2024-04-26 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107451040B (en) | Method and device for positioning fault reason and computer readable storage medium | |
JP6160064B2 (en) | Application determination program, failure detection apparatus, and application determination method | |
CN112954031B (en) | Equipment state notification method based on cloud mobile phone | |
CN114595127B (en) | Log exception processing method, device, equipment and storage medium | |
CN112100070A (en) | Version defect detection method and device, server and storage medium | |
CN108650123B (en) | Fault information recording method, device, equipment and storage medium | |
US7496795B2 (en) | Method, system, and computer program product for light weight memory leak detection | |
CN117370052B (en) | Microservice fault analysis method, device, equipment and storage medium | |
CN114327967A (en) | Equipment repairing method and device, storage medium and electronic device | |
CN115729727A (en) | Fault repairing method, device, equipment and medium | |
CN111756594B (en) | Control method of pressure test, computer device and computer readable storage medium | |
US20090083747A1 (en) | Method for managing application programs by utilizing redundancy and load balance | |
CN114546759B (en) | Database access error monitoring and analyzing method and device and electronic equipment | |
CN112463343B (en) | Restarting method and device of business process, storage medium and electronic equipment | |
CN110362464B (en) | Software analysis method and equipment | |
CN115220992A (en) | Interface change monitoring method and device, computer equipment and storage medium | |
CN115629919A (en) | Method and device for fast switching fault system | |
CN115150253A (en) | Fault root cause determination method and device and electronic equipment | |
CN114895879A (en) | Management system design scheme determining method, device, equipment and storage medium | |
CN114650211A (en) | Fault repairing method, device, electronic equipment and computer readable storage medium | |
CN112882893A (en) | Method for real-time monitoring application service log generated by mobile terminal | |
CN115529250B (en) | Flow playback method and device, electronic equipment and storage medium | |
CN117251337B (en) | Micro-service health dial testing method, device, equipment and storage medium | |
CN119621386A (en) | Fault diagnosis method, device, electronic equipment and storage medium | |
CN119537112A (en) | Interface request processing method and device, computer equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||