Detailed Description
The following description of the exemplary embodiments of the present application, taken in conjunction with the accompanying drawings, includes various details of the embodiments of the application for the understanding of the same, which are to be considered exemplary only. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present application. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
It should be noted that the embodiments in the present application, and the features of those embodiments, may be combined with each other provided there is no conflict. The present application will be described in detail below with reference to the embodiments and the attached drawings.
Fig. 1 illustrates an exemplary system architecture 100 to which embodiments of the microservice anomaly compensation method of the present application may be applied.
As shown in fig. 1, the system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 serves as a medium for providing communication links between the terminal devices 101, 102, 103 and the server 105. The network 104 may include various connection types, such as wired communication links, wireless communication links, or fiber optic cables.
The terminal devices 101, 102, 103 interact with a server 105 via a network 104 to receive or send messages or the like. Various communication client applications, such as communication-type applications, may be installed on the terminal devices 101, 102, 103.
The terminal devices 101, 102, 103 may be hardware or software. When the terminal devices 101, 102, 103 are hardware, they may be various electronic devices having a display screen, including but not limited to a mobile phone and a notebook computer. When the terminal devices 101, 102, 103 are software, they can be installed in the electronic devices listed above, and may be implemented as multiple pieces of software or software modules (e.g., to provide microservice anomaly compensation services) or as a single piece of software or software module, which is not particularly limited herein.
The server 105 may be a server providing various services. For example, in response to receiving abnormal service alarm information, the server performs distributed link tracking according to the abnormal service identifier to obtain an abnormal call chain; acquires context parameters corresponding to the abnormal call chain; and, based on the context parameters and the abnormal call chain, performs a compensation operation on each node before the abnormal node in the abnormal call chain.
The server 105 may be hardware or software. When the server 105 is hardware, it may be implemented as a distributed server cluster composed of a plurality of servers, or as a single server. When the server 105 is software, it may be implemented as multiple pieces of software or software modules (for example, to provide the microservice anomaly compensation service), or as a single piece of software or software module, which is not particularly limited herein.
It should be noted that the microservice anomaly compensation method provided by the embodiments of the present disclosure may be executed by the server 105, by the terminal devices 101, 102, 103, or by the server 105 and the terminal devices 101, 102, 103 in cooperation with each other. Accordingly, the parts (for example, the units, sub-units, modules, and sub-modules) included in the microservice anomaly compensation apparatus may be provided entirely in the server 105, entirely in the terminal devices 101, 102, 103, or distributed between the server 105 and the terminal devices 101, 102, 103.
It should be understood that the number of terminal devices, networks, and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
Fig. 2 shows a flow diagram 200 of an embodiment of a microservice anomaly compensation method that can be applied to the present application. In this embodiment, the micro-service anomaly compensation method includes the following steps:
Step 201, in response to receiving the abnormal service alarm information, performing distributed link tracking according to the abnormal service identifier to obtain an abnormal call chain.
In this embodiment, the execution subject (for example, the server 105 or the terminal devices 101, 102, 103 shown in fig. 1) may monitor the calling condition of each service through an abnormal service monitoring and alarm system. In response to receiving the abnormal service alarm information, the execution subject parses the alarm information to obtain the abnormal service identifier, determines from the abnormal service identifier the identifier of the abnormal call link to which the abnormal service belongs, and then performs distributed link tracking according to the abnormal call link identifier, using the link tracking system, to obtain the abnormal call chain.
The services described above are typically microservices, that is, the many loosely coupled and independently deployable smaller components or services into which a single application is broken down.
Here, distributed link tracking refers to restoring a distributed request to a call link and displaying the call condition of the distributed request in one place. The call condition may include the time consumed on each service node, the specific machine that the request reached, the request status of each service node, and the like.
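By way of non-limiting illustration, step 201 might be sketched in Python as follows. The alarm format (a JSON object with a `service_id` field), the two in-memory tables that stand in for the monitoring system and the link tracking system, and every name used are assumptions made for this sketch; they are not taken from the application or from the API of any particular tracking system.

```python
import json

# Hypothetical lookup tables standing in for the monitoring and link tracking systems.
LINK_ID_BY_SERVICE = {"payment": "trace-1"}
SPANS_BY_LINK_ID = {
    "trace-1": [
        {"service": "inventory", "start": 2},
        {"service": "order", "start": 1},
        {"service": "payment", "start": 3},
    ]
}


def get_abnormal_call_chain(alarm_text):
    """Parse the alarm, resolve its call link identifier, and return the ordered chain."""
    alarm = json.loads(alarm_text)
    service_id = alarm["service_id"]          # abnormal service identifier
    link_id = LINK_ID_BY_SERVICE[service_id]  # abnormal call link identifier
    spans = sorted(SPANS_BY_LINK_ID[link_id], key=lambda s: s["start"])
    return service_id, [s["service"] for s in spans]


abnormal_service, chain = get_abnormal_call_chain('{"service_id": "payment"}')
```

In a deployment, the two tables would presumably be replaced by queries to the alarm system and to the link tracking system in use.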
The link tracking system may be any tracking system known in the art or developed in the future, such as a Zipkin system, a Jaeger system, a Dapper system, etc., and is not limited in this application.
The Dapper system introduces the concepts of trace and span. A trace indicates one call link, that is, the path of a request through all of the back-end services it passes, and each link is identified by a globally unique trace id. A span indicates a basic unit of work of the tracking service and represents one cross-service call; it includes information such as a summary, timestamped events, key-value annotations, and a span id. Each span has its own span id and a parent span id. A span without a parent span id is the root span, the parent span id of the current span is the span id of its upstream caller in the call link, and all spans associated with a specific trace share the trace id of that trace.
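As a non-limiting illustration of the trace and span relationship described above, the following Python sketch models a span with a trace id, a span id, and an optional parent span id. The class name, field names, and sample values are assumptions chosen for the sketch and do not reproduce the data model of any specific tracking system.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Span:
    trace_id: str                  # shared by every span of one trace
    span_id: str                   # identifies this span
    parent_span_id: Optional[str]  # None marks the root span
    service: str


def find_root(spans):
    """Return the single span that has no parent span id."""
    roots = [s for s in spans if s.parent_span_id is None]
    if len(roots) != 1:
        raise ValueError("expected exactly one root span")
    return roots[0]


def share_one_trace(spans):
    """Check that all spans carry the same trace id."""
    return len({s.trace_id for s in spans}) == 1


spans = [
    Span("t-1", "s-1", None, "order"),
    Span("t-1", "s-2", "s-1", "inventory"),
    Span("t-1", "s-3", "s-2", "payment"),
]
root = find_root(spans)
```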
Step 202, obtaining context parameters corresponding to the abnormal call chain.
In this embodiment, after obtaining the abnormal call chain, the execution subject may obtain the context parameters corresponding to the abnormal call chain from the service log corresponding to that call chain.
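By way of non-limiting illustration, extracting context parameters from a service log might look like the following Python sketch. The log format (one JSON record per line with `trace_id`, `service`, and `params` fields) is an assumption made for the sketch; the application does not specify a log format.

```python
import json


def get_context_parameters(log_lines, trace_id):
    """Collect the logged parameters of every service that took part in the trace."""
    context = {}
    for line in log_lines:
        record = json.loads(line)
        if record.get("trace_id") == trace_id:
            context[record["service"]] = record.get("params", {})
    return context


log = [
    '{"trace_id": "t-1", "service": "order", "params": {"order_id": 7}}',
    '{"trace_id": "t-2", "service": "order", "params": {"order_id": 8}}',
    '{"trace_id": "t-1", "service": "inventory", "params": {"sku": "A", "qty": 2}}',
]
context = get_context_parameters(log, "t-1")
```

Records belonging to other traces (here `t-2`) are ignored, so only the parameters of the abnormal call chain are returned.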
Step 203, based on the context parameter and the abnormal call chain, performing a compensation operation on each node before the abnormal node in the abnormal call chain.
In this embodiment, the execution subject may, according to the context parameters and the abnormal call chain, directly perform the compensation operation, that is, data recovery, on each node before the abnormal node corresponding to the abnormal service in the abnormal call chain. Alternatively, the execution subject may first convert the abnormal call chain into a tree-shaped flow chart and then perform data recovery on each node before the abnormal node according to the tree-shaped flow chart and the context parameters. The present application does not limit this.
When data recovery is performed directly according to the context parameters and the abnormal call chain, the call chain may be complicated, so the execution subject needs to obtain information about the parent node, sibling nodes, and the like of each node in the call chain from the corresponding service log at the same time in order to perform data recovery.
In addition, the execution subject may perform data recovery on each node before the abnormal node according to the context parameters in a post-order traversal manner, that is, sequentially in the reverse of the order in which the nodes were called in the abnormal call chain; it may instead prioritize data recovery according to the reuse degree of each node, or perform data recovery on multiple nodes concurrently, and so on. The present application does not limit this.
It should be noted that the execution subject may perform data recovery on each node before the abnormal node according to a preset compensation logic, that is, a data recovery logic. Here, the preset compensation logic may restore the state of the data of each node, for example by invoking a compensation service or sending a compensation message, so as to guarantee the consistency of the data.
The compensation logic is separate from the execution logic of the service and is implemented by the developer according to the preset design of the service.
Further, in the process of performing data recovery on a node before the abnormal node in the abnormal call chain, if data recovery of a certain node fails, the execution subject may attempt to perform data recovery again on the node.
It should be noted that the execution subject may determine the exception type of the abnormal node before performing the compensation operation on each node before the abnormal node in the abnormal call chain. Here, the exception type is mainly used to determine whether a compensation operation also needs to be performed on the abnormal node itself.
Specifically, the exception type may include a first type and a second type. If the exception type of the abnormal node is the first type, a compensation operation is performed on the abnormal node; if the exception type of the abnormal node is the second type, no compensation operation is performed on the abnormal node.
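By way of non-limiting illustration, the effect of the exception type on the set of nodes to be compensated might be sketched in Python as follows. The application does not state what distinguishes the first type from the second type, so the enum below only encodes the stated consequence; all names are assumptions made for the sketch.

```python
from enum import Enum


class ExceptionType(Enum):
    FIRST = "first"    # the abnormal node itself is also compensated
    SECOND = "second"  # the abnormal node itself is not compensated


def nodes_to_compensate(chain, abnormal_node, exception_type):
    """Return the nodes to compensate, in reverse calling order."""
    index = chain.index(abnormal_node)
    nodes = chain[:index]
    if exception_type is ExceptionType.FIRST:
        nodes = chain[:index + 1]  # include the abnormal node
    return list(reversed(nodes))


chain = ["order", "inventory", "payment"]
first = nodes_to_compensate(chain, "payment", ExceptionType.FIRST)
second = nodes_to_compensate(chain, "payment", ExceptionType.SECOND)
```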
In some optional implementations, the method further includes: outputting indication information requesting manual processing in response to the compensation operation on a node before the abnormal node in the abnormal call chain failing and the number of retries being greater than or equal to a preset retry threshold.
In this implementation, while data recovery is being performed on each node before the abnormal node in the abnormal call chain, if the compensation operation on a certain node fails, that is, if data recovery fails, the operation may be retried. Once the number of retries is greater than or equal to the preset retry threshold, no further retry is performed, and indication information requesting manual processing is output.
The retry threshold may be determined according to experience, actual requirements, and specific application scenarios, for example, 3 times, 5 times, and the like, which is not limited in this application.
In this implementation, indication information requesting manual processing is output when the compensation operation on a node before the abnormal node in the abnormal call chain fails and the number of retries is greater than or equal to the preset retry threshold. This addresses both the problem that data recovery cannot be completed because a single compensation attempt fails and the problem that a large number of retries increases system pressure, thereby improving the effectiveness of anomaly compensation and better guaranteeing the consistency of distributed data.
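By way of non-limiting illustration, the retry threshold described above might be sketched in Python as follows. The function name, the returned strings, and the default threshold of 3 are assumptions made for the sketch; the sketch treats any raised exception as a failed compensation operation.

```python
def compensate_with_retry(compensate, retry_threshold=3):
    """Run `compensate`; retry on failure until the retry count reaches the threshold."""
    retries = 0
    while True:
        try:
            compensate()
            return "compensated"
        except Exception:
            if retries >= retry_threshold:
                # No further retry: hand the node over for manual processing.
                return "manual processing requested"
            retries += 1


calls = {"always_fails": 0, "fails_twice": 0}


def always_fails():
    calls["always_fails"] += 1
    raise RuntimeError("compensation failed")


def fails_twice():
    calls["fails_twice"] += 1
    if calls["fails_twice"] <= 2:
        raise RuntimeError("compensation failed")


outcome_a = compensate_with_retry(always_fails)
outcome_b = compensate_with_retry(fails_twice)
```

With a threshold of 3, an operation that always fails is attempted four times in total (one initial attempt plus three retries) before manual processing is requested.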
In some optional implementations, performing the compensation operation on each node before the abnormal node in the abnormal call chain includes: performing the compensation operation on each node before the abnormal node in the abnormal call chain according to a post-order traversal method.
In this implementation, the execution subject may perform the compensation operation on the nodes before the abnormal node in the abnormal call chain according to a post-order traversal method, that is, perform data recovery on those nodes sequentially in the reverse of the order in which the nodes were called in the abnormal call chain.
The implementation mode can effectively guarantee the reliability of executing the compensation operation on the nodes before the abnormal node in the abnormal call chain, and further is more beneficial to guaranteeing the consistency of the distributed data.
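By way of non-limiting illustration, reverse-order compensation over a linear call chain might be sketched in Python as follows. The handler registry, the sample services, and the in-memory `stock` and `orders` tables are assumptions made for the sketch; in practice each handler would be the compensation logic that the developer supplies for the corresponding service.

```python
def compensate_before(chain, abnormal_node, handlers, context):
    """Compensate every node called before `abnormal_node`, latest call first."""
    earlier = chain[:chain.index(abnormal_node)]
    compensated = []
    for node in reversed(earlier):
        handlers[node](context.get(node, {}))  # developer-supplied compensation logic
        compensated.append(node)
    return compensated


# Hypothetical state left behind by the failed request.
stock = {"A": 8}
orders = {7: "created"}
handlers = {
    "order": lambda p: orders.update({p["order_id"]: "cancelled"}),
    "inventory": lambda p: stock.update({p["sku"]: stock[p["sku"]] + p["qty"]}),
}
context = {"order": {"order_id": 7}, "inventory": {"sku": "A", "qty": 2}}
result = compensate_before(
    ["order", "inventory", "payment"], "payment", handlers, context
)
```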
With continued reference to fig. 3, fig. 3 is a schematic diagram of an application scenario of the micro-service anomaly compensation method according to the present embodiment.
In the application scenario of fig. 3, the calling relationship between the microservices corresponding to one distributed request is microservice 301-microservice 302-microservice 303. In response to receiving the abnormal service alarm information, the execution subject 304 parses the alarm information to obtain the abnormal service identifier, for example, that of microservice 303, determines from the abnormal service identifier the identifier of the abnormal call link to which the abnormal service belongs, and performs distributed link tracking according to the abnormal call link identifier, using the link tracking system, to obtain the abnormal call chain 305. The execution subject then obtains the context parameters 306 corresponding to the abnormal call chain and, according to the context parameters and the abnormal call chain, performs the compensation operation 307 on each node before the abnormal node in the abnormal call chain, that is, performs data recovery in sequence on the distributed data corresponding to microservice 301 and microservice 302.
In the microservice anomaly compensation method of the present disclosure, in response to receiving abnormal service alarm information that includes an abnormal service identifier, distributed link tracking is performed according to the abnormal service identifier to obtain an abnormal call chain; context parameters corresponding to the abnormal call chain are acquired; and a compensation operation is performed, based on the context parameters and the abnormal call chain, on each node before the abnormal node in the abnormal call chain. The method thereby improves the performance of service execution while effectively ensuring the consistency of distributed data.
With further reference to fig. 4, a flow 400 of yet another embodiment of a microservice anomaly compensation method is illustrated. The process 400 of the micro-service anomaly compensation method of the present embodiment may include the following steps:
Step 401, in response to receiving the abnormal service alarm information, performing distributed link tracking according to the abnormal service identifier to obtain an abnormal call chain.
In this embodiment, details of implementation and technical effects of step 401 may refer to the description of step 201, and are not described herein again.
Step 402, obtaining context parameters corresponding to the abnormal call chain.
In this embodiment, reference may be made to the description of step 202 for details of implementation and technical effects of step 402, which are not described herein again.
Step 403, converting the abnormal call chain into a tree-shaped flow chart based on the parent-child call relation of each node in the abnormal call chain.
In this embodiment, the execution subject may convert the abnormal call chain into the tree-shaped flowchart in advance according to the parent-child call relationship of each node in the service log corresponding to the abnormal call chain.
The conversion process places no requirement on the execution order of sibling nodes in the abnormal call chain; it is only necessary to ensure that sibling nodes in the abnormal call chain correspond to branches at the same level in the tree-shaped flow chart.
In some optional implementations, converting the abnormal call chain into a tree-shaped flow chart based on the parent-child call relationship of each node in the abnormal call chain includes: converting the abnormal call chain into a tree-shaped flow chart based on the parent-child call relationship of each node in the abnormal call chain and the execution order of sibling nodes.
In this implementation, the execution subject may convert the abnormal call chain into the tree-shaped flow chart in advance according to the parent-child call relationship of each node, and the execution order of sibling nodes, recorded in the service log corresponding to the abnormal call chain. The compensation operation is then performed, based on the context parameters and the tree-shaped flow chart, on each node before the abnormal node in the abnormal call chain.
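By way of non-limiting illustration, converting a call chain into a tree might be sketched in Python as follows. The span records (with `id`, `parent`, and `start` fields) and the use of start time to order sibling nodes are assumptions made for the sketch; the application only requires that parent-child relationships, and optionally the execution order of siblings, be preserved.

```python
from collections import defaultdict


def to_tree(spans):
    """Group spans under their parent; siblings are kept in start-time order."""
    children = defaultdict(list)
    root = None
    for span in sorted(spans, key=lambda s: s["start"]):
        if span["parent"] is None:
            root = span["id"]
        else:
            children[span["parent"]].append(span["id"])
    return root, dict(children)


# Hypothetical spans, deliberately listed out of calling order.
spans = [
    {"id": "F", "parent": "C", "start": 6},
    {"id": "A", "parent": None, "start": 1},
    {"id": "D", "parent": "C", "start": 4},
    {"id": "B", "parent": "A", "start": 2},
    {"id": "G", "parent": "F", "start": 7},
    {"id": "C", "parent": "B", "start": 3},
    {"id": "E", "parent": "D", "start": 5},
]
root, tree = to_tree(spans)
```

Dropping the sort yields the simpler variant in which sibling order is not guaranteed and only the parent-child levels are preserved.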
Specifically, the tree-shaped flow chart is shown in fig. 5, where node A501 is the parent node of node B502, node B502 is the parent node of node C503, node C503 is the parent node of node D504 and node F505, node D504 is the parent node of node E506, and node F505 is the parent node of node G507. Node D504 and node F505 are sibling nodes, so the branch formed by node D504 and node E506 is at the same level as the branch formed by node F505 and node G507. The calling order of the sibling nodes is from right to left, i.e., node D504 and node E506 are called before node F505 and node G507.
The order in which the execution subject performs the compensation operation on the nodes before the abnormal node (e.g., node G507) in the abnormal call chain, based on the context parameters and the tree-shaped flow chart shown in fig. 5, may be the reverse of the call order indicated by the tree-shaped flow chart, i.e., node F505-node E506-node D504-node C503-node B502-node A501.
Step 404, based on the context parameters and the tree-shaped flow chart, performing a compensation operation on each node before the abnormal node in the abnormal call chain.
In this embodiment, the execution subject may perform the compensation operation, that is, data recovery, on each node before the abnormal node in the abnormal call chain according to the context parameters, in the reverse of the call order of the nodes indicated by the tree-shaped flow chart.
Compared with the embodiment corresponding to fig. 2, the flow 400 of the microservice anomaly compensation method in this embodiment highlights converting the abnormal call chain into a tree-shaped flow chart based on the parent-child call relationship of each node in the abnormal call chain, and performing the compensation operation on each node before the abnormal node according to the context parameters and the tree-shaped flow chart. Because the tree-shaped flow chart represents the abnormal call chain clearly and unambiguously, the execution subject can perform the compensation operation on each node before the abnormal node directly from the tree-shaped flow chart, which improves the efficiency of anomaly compensation.
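By way of non-limiting illustration, deriving the compensation order from a tree-shaped flow chart might be sketched in Python as follows. The dictionary encodes the parent-child relationships described for fig. 5, with children listed in calling order; the single-letter labels and function names are assumptions made for the sketch.

```python
def call_order(tree, root):
    """Pre-order walk: a node is called before its children, siblings in listed order."""
    order = [root]
    for child in tree.get(root, []):
        order.extend(call_order(tree, child))
    return order


def compensation_order(tree, root, abnormal_node):
    """Nodes called before the abnormal node, in the reverse of the call order."""
    order = call_order(tree, root)
    return list(reversed(order[:order.index(abnormal_node)]))


# Parent -> children, as described for fig. 5 (D is called before F).
FIG5_TREE = {"A": ["B"], "B": ["C"], "C": ["D", "F"], "D": ["E"], "F": ["G"]}

calls = call_order(FIG5_TREE, "A")
plan = compensation_order(FIG5_TREE, "A", "G")
```

With node G as the abnormal node, the resulting plan is F, E, D, C, B, A, which matches the order given for fig. 5.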
With further reference to fig. 6, as an implementation of the methods shown in the above figures, the present application provides an embodiment of a microservice anomaly compensation apparatus, which corresponds to the method embodiment shown in fig. 2 and which can be applied to various electronic devices.
As shown in fig. 6, the microservice abnormality compensation apparatus 600 of the present embodiment includes: a tracking module 601, an acquisition module 602, and a compensation module 603.
The tracking module 601 may be configured to, in response to receiving the abnormal service alarm information, perform distributed link tracking according to the abnormal service identifier to obtain the abnormal call chain.
The obtaining module 602 may be configured to obtain a context parameter corresponding to the abnormal call chain.
The compensation module 603 may be configured to perform a compensation operation on each node preceding the abnormal node in the abnormal call chain based on the context parameter and the abnormal call chain.
In some optional implementations of this embodiment, the compensation module further includes: a conversion unit configured to convert the abnormal call chain into a tree-shaped flow chart based on the parent-child call relationship of each node in the abnormal call chain; and an execution unit configured to perform the compensation operation on each node before the abnormal node in the abnormal call chain according to the context parameters and the tree-shaped flow chart.
In some optional implementations of this embodiment, the conversion unit is further configured to convert the abnormal call chain into a tree-shaped flow chart based on the parent-child call relationship of each node in the abnormal call chain and the execution order of sibling nodes.
In some optional implementations of this embodiment, the compensation module is further configured to perform the compensation operation on each node before the abnormal node in the abnormal call chain according to a post-order traversal method.
In some optional implementations of this embodiment, the apparatus further includes a retry module configured to output indication information requesting manual processing in response to the compensation operation on a node before the abnormal node in the abnormal call chain failing and the number of retries being greater than or equal to a preset retry threshold.
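By way of non-limiting illustration, the cooperation of the tracking module, the obtaining module, and the compensation module might be sketched in Python as follows. The class name, the callable-based module interfaces, and the stub modules are assumptions made for the sketch and say nothing about how the modules are actually implemented.

```python
class CompensationApparatus:
    """Wires the three modules together; each module is passed in as a callable."""

    def __init__(self, tracking_module, obtaining_module, compensation_module):
        self.tracking_module = tracking_module
        self.obtaining_module = obtaining_module
        self.compensation_module = compensation_module

    def on_alarm(self, alarm):
        abnormal_node, chain = self.tracking_module(alarm)  # abnormal call chain
        context = self.obtaining_module(chain)              # context parameters
        return self.compensation_module(chain, abnormal_node, context)


# Stub modules, for demonstration only.
apparatus = CompensationApparatus(
    tracking_module=lambda alarm: (alarm["service_id"], ["order", "inventory", "payment"]),
    obtaining_module=lambda chain: {node: {} for node in chain},
    compensation_module=lambda chain, bad, ctx: list(reversed(chain[:chain.index(bad)])),
)
outcome = apparatus.on_alarm({"service_id": "payment"})
```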
According to an embodiment of the present application, an electronic device and a readable storage medium are also provided.
Fig. 7 is a block diagram of an electronic device according to an embodiment of the present application.
The electronic device 700 shown in fig. 7 is an electronic device for the microservice anomaly compensation method according to an embodiment of the present application. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular phones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions are meant to be examples only, and are not meant to limit implementations of the present application that are described and/or claimed herein.
As shown in fig. 7, the electronic device includes: one or more processors 701, a memory 702, and interfaces for connecting the various components, including a high-speed interface and a low-speed interface. The various components are interconnected using different buses and may be mounted on a common motherboard or in other manners as desired. The processor may process instructions for execution within the electronic device, including instructions stored in or on the memory to display graphical information of a GUI on an external input/output apparatus (such as a display device coupled to the interface). In other embodiments, multiple processors and/or multiple buses may be used, along with multiple memories, as desired. Also, multiple electronic devices may be connected, with each device providing portions of the necessary operations (e.g., as a server array, a group of blade servers, or a multi-processor system). In fig. 7, one processor 701 is taken as an example.
The memory 702 is a non-transitory computer readable storage medium as provided herein. The memory stores instructions executable by at least one processor to cause the at least one processor to perform the micro-service anomaly compensation method provided by the application. The non-transitory computer-readable storage medium of the present application stores computer instructions for causing a computer to perform the microservice anomaly compensation method provided by the present application.
The memory 702, which is a non-transitory computer readable storage medium, may be used to store non-transitory software programs, non-transitory computer executable programs, and modules, such as program instructions/modules (e.g., the tracking module 601, the obtaining module 602, and the compensation module 603 shown in fig. 6) corresponding to the micro-service anomaly compensation method in the embodiments of the present application. The processor 701 executes various functional applications of the server and data processing by running non-transitory software programs, instructions, and modules stored in the memory 702, that is, implements the micro-service anomaly compensation method in the above-described method embodiment.
The memory 702 may include a program storage area and a data storage area, wherein the program storage area may store an operating system and an application program required for at least one function, and the data storage area may store data created through use of the electronic device for microservice anomaly compensation, and the like. Further, the memory 702 may include high speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid state storage device. In some embodiments, the memory 702 may optionally include memory located remotely from the processor 701, which may be connected to the electronic device for microservice anomaly compensation over a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The electronic device of the microservice anomaly compensation method may further include: an input device 703 and an output device 704. The processor 701, the memory 702, the input device 703 and the output device 704 may be connected by a bus or other means, and fig. 7 illustrates an example of a connection by a bus.
The input device 703 may receive input numeric or character information, and may be, for example, a touch screen, keypad, mouse, track pad, touch pad, pointer, one or more mouse buttons, track ball, joystick, or other input device. The output device 704 may include a display device, auxiliary lighting devices (e.g., LEDs), tactile feedback devices (e.g., vibrating motors), and the like. The display device may include, but is not limited to, a Liquid Crystal Display (LCD), a Light Emitting Diode (LED) display, and a plasma display. In some implementations, the display device can be a touch screen.
Various implementations of the systems and techniques described here can be realized in digital electronic circuitry, integrated circuitry, ASICs (application specific integrated circuits), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, receiving data and instructions from, and transmitting data and instructions to, a storage system, at least one input device, and at least one output device.
These computer programs (also known as programs, software applications, or code) include machine instructions for a programmable processor, and may be implemented using high-level procedural and/or object-oriented programming languages, and/or assembly/machine languages. As used herein, the terms "machine-readable medium" and "computer-readable medium" refer to any computer program product, apparatus, and/or device (e.g., magnetic discs, optical disks, memory, Programmable Logic Devices (PLDs)) used to provide machine instructions and/or data to a programmable processor, including a machine-readable medium that receives machine instructions as a machine-readable signal. The term "machine-readable signal" refers to any signal used to provide machine instructions and/or data to a programmable processor.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: Local Area Networks (LANs), Wide Area Networks (WANs), and the Internet.
The computer system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.
According to the technical scheme of the embodiment of the application, the consistency of the distributed data is effectively guaranteed, and meanwhile the performance of service execution is improved.
It should be understood that various forms of the flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present application may be executed in parallel, sequentially, or in different orders, and the present invention is not limited thereto as long as the desired results of the technical solutions disclosed in the present application can be achieved.
The above-described embodiments should not be construed as limiting the scope of the present application. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present application shall be included in the protection scope of the present application.