
CN102567181B - Predicting, diagnosing, and recovering from application failures based on resource access patterns - Google Patents


Publication number
CN102567181B
Authority
CN
China
Prior art keywords
resource
application program
resource access
error condition
access pattern
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201110442035.7A
Other languages
Chinese (zh)
Other versions
CN102567181A (en
Inventor
M·D·扬
K·H·雷厄森
E·杰瓦特
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Microsoft Technology Licensing LLC
Original Assignee
Microsoft Technology Licensing LLC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US12/978,663 (US9189308B2)
Application filed by Microsoft Technology Licensing LLC
Publication of CN102567181A
Application granted
Publication of CN102567181B
Legal status: Active
Anticipated expiration

Abstract

The present invention relates to predicting, diagnosing, and recovering from application failures based on resource access patterns. Technologies are described herein for distinguishing between normal operation of an application program and error conditions in order to predict, diagnose, and recover from application failures. Access to resources by the application program is monitored, and resource access events are logged. Resource access patterns are established from the logged resource access events using computer pattern recognition techniques. If subsequent access to resources by the application program deviates from the established patterns, a user and/or administrator of the application program is notified of a potential error condition based on the detected deviation. In addition, sequences of resource access events that deviate from the established resource access patterns may be correlated with an error condition based on temporal proximity to the time the error occurred, in order to provide diagnostic information regarding the error.

Description

Predicting, diagnosing, and recovering from application failures based on resource access patterns
Technical field
The present invention relates to fault recovery.
Background
Software applications executing on a computer system may fail for a variety of reasons, such as code bugs, user errors, incorrect input data, unavailable resources, and the like. Such application failures may result in loss of data and application downtime, and may incur the cost and time associated with recovering the application and its data. Applications running in a common environment or from a common installation may be expected to encounter the same failures given the same inputs, conditions, and/or circumstances. This may be the case when applications run in a virtual application environment.
Application virtualization allows a software application to be decoupled from the hardware of a computer, the operating system ("OS") executed by the computer, and the local configuration. Application virtualization may remove the requirement that an application be installed, configured, and maintained locally on the computer. Instead, a virtual application environment may execute on the computer and stream the application components across a network from a virtualized application package maintained centrally on a virtual application server.
It is with respect to these considerations and others that the disclosure made herein is presented.
Summary
Technologies are described herein for distinguishing between normal operation of an application program and error conditions in order to predict, diagnose, and recover from application failures. When an application program runs in a virtual application environment, the virtualization layer or virtual application environment may be aware of and control requests made by the application for resources, such as reads from data files, writes to registry keys, and the like. Using the technologies described herein, the virtualization layer may log accesses to resources and, over time, establish common patterns of resource usage. Once such resource access patterns are established, the virtualization layer may continue to monitor the application program's use of resources and provide a warning or alert when the patterns change. This proactive warning may give the user or administrator of the application program a chance to take diagnostic or corrective action quickly, thereby reducing or even preventing downtime and data loss.
It will be appreciated that application programs typically have a means of surfacing errors to users or administrators, such as pop-up dialogs or events logged in an application or system event log. However, the quality and usefulness of these error messages may vary significantly from application to application. Providing high-quality error messages that allow the source of an error to be determined requires a substantial investment, and not all software vendors make this investment. Using the log of resource accesses and the established common patterns described herein to correlate which resources were being accessed, or how the patterns changed, just before the error condition may allow users and/or administrators to diagnose application failures more quickly and implement recovery actions, thereby reducing application downtime.
In addition, because the virtualization layer is aware of all of the application program's use of resources, it is able to log resource additions, modifications, and deletions, together with the data used in these resource modifications, as they occur over time. If the application program fails, a second instance of the application program may be started immediately, and the log of resource modifications and data may be replayed, thereby restoring the application state to a point just before the failure of the first application instance. Such fast failover between application instances may limit further downtime.
According to embodiments, access to resources by an application program executing in a virtual application environment is monitored, and resource access events are logged in a resource access log. Resource access patterns are established from the logged resource access events using computer pattern recognition techniques. If subsequent access to resources by the application program deviates from the established patterns, a user and/or administrator of the application program is notified of a potential error condition based on the detected deviation.
In addition, sequences of resource access events that deviate from the established resource access patterns may be correlated with an error condition based on temporal proximity to the time the error condition occurred, in order to provide the user and/or administrator with diagnostic information regarding the error. Finally, in the event of an application failure and a subsequent restart of the application, the resource access events logged in the resource access log that concern the addition, modification, and deletion of data may be replayed in order to re-establish the application state of the application program.
It should be appreciated that the above-described subject matter may be implemented as a computer-controlled apparatus, a computer process, a computing system, or an article of manufacture such as a computer-readable medium. These and various other features will become apparent from a reading of the following Detailed Description and a review of the associated drawings.
This Summary is provided to introduce, in a simplified form, a selection of concepts that are further described below in the Detailed Description. This Summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter. Furthermore, the claimed subject matter is not limited to implementations that solve any or all disadvantages noted in any part of this disclosure.
Brief description of the drawings
Fig. 1 is a block diagram showing aspects of an illustrative operating environment and several software components provided by the embodiments presented herein;
Figs. 2-4 are flow diagrams showing methods, according to embodiments described herein, for distinguishing between normal operation of an application program and error conditions in order to predict, diagnose, and recover from application failures; and
Fig. 5 is a block diagram showing an illustrative computer hardware and software architecture for a computing system capable of implementing aspects of the embodiments presented herein.
Detailed description
The following detailed description is directed to technologies for distinguishing between normal operation of an application program and error conditions based on patterns of resource access, in order to predict, diagnose, and/or recover from application failures. While the subject matter described herein is presented in the general context of program modules that execute in conjunction with the execution of an operating system and application programs on a computer system, those skilled in the art will recognize that other implementations may be performed in combination with other types of program modules. Generally, program modules include routines, programs, components, data structures, and other types of structures that perform particular tasks or implement particular abstract data types. Moreover, those skilled in the art will appreciate that the subject matter described herein may be practiced with other computer system configurations, including hand-held devices, multiprocessor systems, microprocessor-based or programmable consumer electronics, minicomputers, mainframe computers, and the like.
In the following detailed description, references are made to the accompanying drawings, which form a part hereof and which show, by way of illustration, specific embodiments or examples. In the drawings, like reference numerals represent like elements throughout the several figures.
Fig. 1 shows an illustrative operating environment 100 according to embodiments presented herein, including several software components for distinguishing between normal operation of an application program and error conditions in order to predict, diagnose, and recover from application failures. The environment 100 includes a computer 102. The computer 102 may be a server computer; a personal computer ("PC"), such as a desktop workstation, laptop, or notebook; a personal digital assistant ("PDA"); a wireless telephone; a set-top box; a gaming console; or any other computing device capable of executing application programs.
A software application program 104 executes on the computer 102. According to embodiments, the application program 104 may execute inside a virtual application environment 106. The virtual application environment 106 may allow the computer 102 to launch and execute application programs that have not previously been installed on the computer. The virtual application environment 106 may instead stream the components of the application program 104 in real time or near-real time over a network 110 from a virtual application server 112. The virtual application environment 106 and the virtual application server 112 may be based on the MICROSOFT APP-V technology from Microsoft Corporation of Redmond, Washington, the CITRIX XENAPP™ technology from Citrix Systems, Inc. of Fort Lauderdale, Florida, or any other application streaming and virtualization platform or technology. The network 110 may be a LAN, a wide-area network ("WAN"), the Internet, or any other networking topology that connects the computer 102 to the virtual application server 112.
The software components of the application program 104 may be stored in a virtualized application package 114 located on a storage device accessible by the virtual application server 112. According to embodiments, the virtualized application package 114 consists of a number of blocks of data that contain application structure information as well as the individual component files and other elements of the application. The virtualized application package 114 may further include metadata regarding the location and configuration of the local and remote resources utilized by the application program 104 during execution. The virtualized application package 114 may be created by an administrator of the application program 104 by performing a typical installation of the application on a management server and recording the changes made to the local file system, the registry, and the like. The blocks in the virtualized application package 114 may then be streamed to the virtual application environment 106 to allow the application program 104 to be executed on the computer 102.
The virtual application environment 106 may create a separate virtual runtime environment, referred to as an "application sandbox," to execute each application program 104 streamed from the virtual application server 112. The application sandbox allows the components of the application program 104 to execute in isolation from the remainder of the system. The virtual application environment 106 may further provide a virtualization layer 116 that abstracts access to the local resources 118A and remote resources 118B (referred to herein collectively as resources 118) utilized by the application program 104 during execution. The resources 118 may include system memory, local processor time or processing threads, files stored in a file system, data stored in a registry database, application services, presentation services, database services, and the like, available locally on the computer 102 or remotely across the network 110.
The application program 104 may access the local and remote resources 118 through resource application programming interfaces ("APIs") 120 implemented by an operating system 122 or other standard software libraries installed on the computer 102. According to embodiments, the virtualization layer 116 abstracts the resource APIs 120 in order to monitor and control access requests to the local and remote resources 118 made by the application program 104 executing in the virtual application environment 106. In addition, the virtualization layer 116 may log access to the resources 118 by the application program 104 in a resource access log 124. The resource access log 124 may comprise a log file in the local file system, a number of database tables on a remote database server, a combination of the two, or any other data storage system accessible by the computer 102.
The resource access log 124 may contain a log of resource access events 126. The resource access events 126 may include details of calls to the resource APIs 120 made by the application program 104 executing in the virtual application environment 106. Each of the resource access events 126 may include a timestamp indicating when the resource access occurred, an identifier of the individual resource API 120 called, and a number of parameter values indicating the resource type, location, or other aspects of the local or remote resource 118 being accessed. The resource access events 126 may be stored as entries in a log file, as rows in a database table, as objects in a dictionary, or in any other data structure or format known in the art.
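As a minimal sketch of the record described above, a resource access event might be modeled as follows. The class name, field names, and the API identifier `registry.read` are hypothetical and chosen only for illustration; the source specifies only that an event carries a timestamp, an API identifier, and parameter values.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass(frozen=True)
class ResourceAccessEvent:
    """One logged call to a resource API (hypothetical field names)."""
    timestamp: float   # when the resource access occurred, in seconds
    api_id: str        # identifier of the resource API that was called
    params: dict = field(default_factory=dict)  # resource type, location, etc.


def to_log_line(event):
    """Serialize one event as a single JSON line suitable for a log file."""
    return json.dumps(asdict(event), sort_keys=True)


event = ResourceAccessEvent(
    timestamp=1000.0,
    api_id="registry.read",  # illustrative identifier, not a real API name
    params={"resource_type": "registry", "location": "<APP>/settings"},
)
line = to_log_line(event)
print(line)
```

The same record could equally be stored as a row in a database table; the JSON line is just one of the storage forms the description allows.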
In one embodiment, the resource access log 124 also contains resource access patterns 128. The resource access patterns 128 may comprise patterns of resource access by the application program 104 that occur regularly. For example, the application program 104 may read a particular registry key at time T1 and then write to a particular file located on a remote file system at time T1+240 msec (milliseconds). Further, this pattern of reading the registry key and writing to the file may occur more than once, for example in response to a particular event or condition, periodically, or at a particular time of day. The resource access patterns 128 may be established between specific API calls, between API calls for specific resources, or between API calls for specific quantities of resources, such as an amount of memory allocated or a number of threads started.
The resource access patterns 128 may be established from the resource access events 126 collected over some period of time using pattern recognition techniques. For example, a subset of the event types may be determined to be important, and Bayesian learning techniques may be utilized to establish the resource access patterns 128 across the resource access events 126 of those types within the collection period. The generated resource access patterns 128 may be stored, for example, as Markov chains or probability trees indicating the relative probability of occurrence between the resource access events.
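The idea of storing a pattern as a Markov chain can be sketched as below. This is an illustration under simplifying assumptions: it uses plain frequency counting of consecutive API calls rather than the Bayesian learning the description mentions, and the function name and API identifiers are hypothetical.

```python
from collections import Counter, defaultdict


def build_transition_model(api_sequence):
    """Estimate first-order transition probabilities between API calls.

    Returns a nested dict: model[previous_api][next_api] -> probability.
    """
    counts = defaultdict(Counter)
    for prev, nxt in zip(api_sequence, api_sequence[1:]):
        counts[prev][nxt] += 1

    model = {}
    for prev, next_counts in counts.items():
        total = sum(next_counts.values())
        model[prev] = {nxt: n / total for nxt, n in next_counts.items()}
    return model


# Hypothetical API identifiers observed during a collection period.
observed = [
    "registry.read", "file.write",
    "registry.read", "file.write",
    "registry.read", "net.connect",
]
model = build_transition_model(observed)
print(model)
```

In this toy log, a registry read is followed by a file write two times out of three, so that transition receives a probability of about 0.67.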
According to one embodiment, the resource access patterns 128 may be generated in near-real time on the computer 102 from the log of resource access events 126, in a background process running in parallel with the execution of the application program 104. In another embodiment, the resource access events 126 logged by the virtualization layers 116 on a number of computers 102 executing the application program 104 in the virtual application environment 106 may be aggregated at a central location. The aggregated event data may be genericized, for example by removing computer-dependent resource paths, and the resource access patterns 128 may be established from the aggregated and genericized event data. The generated resource access patterns 128 may then be sent to each of the computers 102 executing the application program 104 so that the patterns can be utilized in predicting application failures, as described in more detail below in regard to Fig. 2.
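Genericizing aggregated event data could look something like the following sketch. The placeholder tokens, the replacement table, and the example paths are all assumptions made for illustration; the source says only that computer-dependent resource paths are removed.

```python
def genericize_path(path, replacements):
    """Replace computer-specific fragments of a path with generic placeholders."""
    for specific, placeholder in replacements.items():
        path = path.replace(specific, placeholder)
    return path


# Hypothetical per-computer context: machine name and user name.
context_a = {"PC-0042": "<COMPUTER>", "alice": "<USER>"}
context_b = {"PC-0107": "<COMPUTER>", "bob": "<USER>"}

path_a = genericize_path(r"\\PC-0042\share\Users\alice\settings.ini", context_a)
path_b = genericize_path(r"\\PC-0107\share\Users\bob\settings.ini", context_b)
print(path_a)
print(path_b)
```

After this step, events from both computers refer to the same generic location, so they can contribute to one shared pattern.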
In another embodiment, the resource access log 124 also contains resource write data 130. The resource write data 130 may contain a log of calls by the application program 104 to resource APIs 120 that add, modify, or delete data, such as a registry value write or an I/O buffer write. The resource write data 130 may contain a "deep copy" of pointer or structure-type parameters so as to include the data being written. In addition, any filenames, key names, addresses, or other location parameters may be genericized using the current context of the executing application program 104. It will be appreciated that the resource write data 130 and the resource access events 126 may be integrated into a single log file or other structure in the resource access log 124. The resource write data 130 may be utilized to restore the application state of the application program 104 during recovery from an application failure, as described in more detail below in regard to Fig. 4.
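The reason a deep copy matters can be shown with a short sketch: if the log kept only a reference to the caller's buffer, later changes by the application would silently alter the logged data. The function and field names below are hypothetical.

```python
import copy

write_log = []


def record_write(api_id, location, data):
    """Log a write call together with a deep copy of the data being written."""
    write_log.append({
        "api_id": api_id,
        "location": location,
        "data": copy.deepcopy(data),  # snapshot, independent of the caller's buffer
    })


buffer = {"value": [1, 2, 3]}
record_write("registry.write", "<APP>/settings", buffer)

# The application later reuses and mutates its buffer.
buffer["value"].append(4)

print(write_log[0]["data"])
```

The logged entry still holds the data as it was at the time of the write, which is what a later replay would need.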
Referring now to Figs. 2-4, additional details will be provided regarding the embodiments presented herein. It should be appreciated that the logical operations described with respect to Figs. 2-4 are implemented (1) as a sequence of computer-implemented acts or program modules running on a computing system and/or (2) as interconnected machine logic circuits or circuit modules within the computing system. The implementation is a matter of choice dependent on the performance and other requirements of the computing system. Accordingly, the logical operations described herein are referred to variously as operations, structural devices, acts, or modules. These operations, structural devices, acts, and modules may be implemented in software, in firmware, in special-purpose digital logic, and in any combination thereof. It should also be appreciated that more or fewer operations may be performed than shown in the figures and described herein. The operations may also be performed in a different order than described.
Fig. 2 shows a routine 200, according to embodiments described herein, for predicting potential error conditions in the application program 104 based on deviations from established resource access patterns. The routine 200 may be performed by a combination of the virtualization layer 116 executing in the virtual application environment 106 on the computer 102 and/or other modules executing on the computer or on centralized application servers. It will be appreciated that the routine 200 may also be performed by other modules or components executing on other computing devices, or by any combination of modules, components, and computing devices.
The routine 200 begins at operation 202, where the virtualization layer 116 monitors access to local and remote resources 118 by the application program 104 executing in the virtual application environment 106 and logs the accesses to the resource access log 124. As described above in regard to Fig. 1, the virtualization layer 116 may log the details of calls by the application program 104 to the resource APIs 120 as resource access events 126, including a timestamp indicating when the resource access occurred, an identifier of the individual resource API called, and a number of parameter values indicating the resource type, location, or other aspects of the local or remote resource 118 being accessed.
From operation 202, the routine 200 proceeds to operation 204, where the resource access patterns 128 are established. It will be appreciated that, over some period of time, a large number of resource access events 126 may be logged in the resource access log 124 by the virtualization layer 116. As described above in regard to Fig. 1, the virtualization layer 116 or some other module or process may utilize the logged resource access events 126 to establish the resource access patterns 128. For example, the virtualization layer 116 may utilize pattern recognition techniques, such as Bayesian networks, to establish the relative probabilities of occurrence between two or more resource access events. The established resource access patterns 128 may then be stored in the resource access log 124 as Markov chains or probability trees.
The resource access patterns 128 may be generated in near-real time on the computer 102 by the virtualization layer 116. Alternatively, the logged resource access events 126 may be aggregated from a number of computers 102 at a central location, genericized, and used to establish resource access patterns 128 across multiple instances of the application program 104 executing in the virtual application environments 106 of the individual computers. The generic resource access patterns 128 established from the aggregated resource access events 126 may then be used to predict error conditions on any computer 102 executing the application program 104, in the manner described herein.
The routine 200 proceeds from operation 204 to operation 206, where the virtualization layer 116 detects a deviation of the application program 104 executing in the virtual application environment 106 from the established resource access patterns 128. For example, the virtualization layer 116 may detect a sequence of resource API calls whose probability of occurrence falls below a specified threshold, based on Bayesian analysis of the resource access patterns 128. Similarly, the virtualization layer 116 may detect a sequence of resource API calls that has a high probability of occurring with a known error condition, as established in the resource access patterns 128. In one embodiment, if the probability of the sequence of resource API calls does not fall below the specified threshold, the virtualization layer 116 logs the corresponding resource access events 126 so that the resource access patterns 128 can be updated with the new probabilities in the background process described above. In this way, the resource access patterns 128 may be continually updated during execution of the application program 104 in the virtual application environment 106.
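The threshold test in operation 206 can be illustrated with a small sketch that scores a sequence of API calls against a transition model. The model values, the threshold of 0.05, and the floor probability assigned to transitions never seen before are all assumptions for illustration, and simple chained transition probabilities stand in for the Bayesian analysis named in the description.

```python
def sequence_probability(model, sequence, floor=1e-6):
    """Multiply the transition probabilities along a sequence of API calls.

    Transitions absent from the model get a small floor probability
    (an assumption made here so unseen behavior scores as very unlikely).
    """
    probability = 1.0
    for prev, nxt in zip(sequence, sequence[1:]):
        probability *= model.get(prev, {}).get(nxt, floor)
    return probability


def deviates(model, sequence, threshold):
    """Return True when the sequence is less likely than the threshold."""
    return sequence_probability(model, sequence) < threshold


# Hypothetical established pattern, expressed as transition probabilities.
model = {
    "registry.read": {"file.write": 0.9, "net.connect": 0.1},
    "file.write": {"registry.read": 1.0},
}

normal = ["registry.read", "file.write", "registry.read"]
unusual = ["registry.read", "net.connect", "file.write"]

print(deviates(model, normal, threshold=0.05))
print(deviates(model, unusual, threshold=0.05))
```

A sequence flagged this way would be the trigger for the alert described in the next operation, while a sequence that passes could be fed back to update the model.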
If a deviation of the application program 104 from the established resource access patterns 128 is detected, the routine 200 proceeds from operation 206 to operation 208, where the virtualization layer 116 raises an alert regarding the pattern deviation. The alert may be directed to a user or administrator of the application program 104. The alert may be sent via e-mail, text message, or system message queue; raised as a system-level event; logged in an application or system event log; or otherwise delivered to the administrator via a messaging system accessible by the computer 102. This proactive alert may give the administrator a chance to take diagnostic or corrective action quickly, thereby reducing or possibly preventing the downtime and data loss that would follow a potential, impending error condition. From operation 208, the routine 200 ends.
Fig. 3 shows a routine 300, according to embodiments described herein, for correlating resource access events 126 with a known error condition in the application program 104 in order to allow diagnosis of the error. The routine 300 may be performed by a combination of the virtualization layer 116 executing in the virtual application environment 106 on the computer 102 and/or other modules executing on the computer or on centralized application servers. It will be appreciated that the routine 300 may also be performed by other modules or components executing on other computing devices, or by any combination of modules, components, and computing devices.
The routine 300 begins at operation 302, where, in the manner described above in regard to operation 202, the virtualization layer 116 monitors access to local and remote resources 118 by the application program 104 executing in the virtual application environment 106 and logs the accesses to the resource access log 124. The routine 300 then proceeds to operation 304, where an error condition in the application program 104 is detected. For example, the error condition may be detected in the application program 104 by a user or administrator of the application through conventional means, such as a pop-up error dialog, an event logged in an application or system event log, and the like.
From operation 304, the routine 300 proceeds to operation 306, where resource access events 126 in the resource access log 124 are correlated with the detected error condition. The administrator may provide the time of occurrence of the error condition, or the time of occurrence may be identified from a particular call to a resource API 120 logged in the resource access events 126. The virtualization layer 116 or another module may then identify the subset of resource access events 126 in the resource access log 124 that fall within a temporal proximity of the time of occurrence of the error condition. For example, all resource access events 126 that occurred within a 10-second window before the error condition may be correlated with that error condition.
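The time-window correlation described above can be sketched as a simple filter. The event dictionaries and API identifiers are hypothetical; the 10-second default mirrors the example in the text.

```python
def events_near_error(events, error_time, window_seconds=10.0):
    """Return the events logged within the window leading up to the error."""
    window_start = error_time - window_seconds
    return [e for e in events if window_start <= e["timestamp"] <= error_time]


# Hypothetical logged events; timestamps are in seconds.
events = [
    {"timestamp": 85.0, "api_id": "registry.read"},
    {"timestamp": 92.0, "api_id": "file.write"},
    {"timestamp": 99.5, "api_id": "net.connect"},
    {"timestamp": 101.0, "api_id": "file.read"},
]

correlated = events_near_error(events, error_time=100.0)
print([e["api_id"] for e in correlated])
```

Only the two events that fall between 90 and 100 seconds are selected; the earlier event is outside the window and the later one occurred after the error.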
In one embodiment, only those sequences of resource API calls logged in the resource access events 126 that fall within a temporal proximity of the error condition and that deviate from the established resource access patterns 128 are correlated with the error condition. For example, the virtualization layer 116 or another module may identify sequences of resource API calls logged in the resource access events 126 that occurred within 10 seconds of the time of occurrence of the error condition and whose probability of occurrence falls below a specified threshold, based on Bayesian analysis of the resource access patterns 128 in the resource access log 124. It will be appreciated that the specified probability threshold for correlating deviations from the resource access patterns 128 with an error condition may be higher than the probability threshold used for predicting error conditions based on deviations from the resource access patterns, as described above in regard to operation 206.
The routine 300 proceeds from operation 306 to operation 308, where the resource access events 126 correlated with the error condition are shown to the user or administrator of the application program 104. The correlated resource access events 126 may be shown through a user interface dialog or in a report sent via e-mail, text message, system message queue, or the like. Providing the resource access events 126 from the resource access log 124 that are correlated in time with a particular error condition may allow the user or administrator to quickly diagnose the cause of the error condition and implement appropriate recovery actions in order to reduce downtime and data loss. From operation 308, the routine 300 ends.
Fig. 4 shows a routine 400, according to embodiments described herein, for recovering from an error condition in the application program 104. The routine 400 may be performed by a combination of the virtualization layer 116 executing in the virtual application environment 106 on the computer 102 and/or other modules executing on the computer or on centralized application servers. It will be appreciated that the routine 400 may also be performed by other modules or components executing on other computing devices, or by any combination of modules, components, and computing devices.
The routine 400 begins at operation 402, where, in the manner described above in regard to operation 202, the virtualization layer 116 monitors access to local and remote resources 118 by the application program 104 executing in the virtual application environment 106 and logs the accesses to the resource access log 124. In addition, as described above in regard to Fig. 1, the virtualization layer 116 logs resource write data 130 for calls by the application program 104 to resource APIs 120 that add, modify, or delete data. The resource write data 130 may include a deep copy of the pointer or structure-type parameters specified in the API calls, and the data may be further processed using the current context of the executing application program 104 in order to genericize filenames, key names, addresses, or other location parameters in the resource write data 130.
From operation 402, the routine 400 proceeds to operation 404, where the application program 104 fails because of an error condition. For example, the application program may fail because of a software bug, a user error, incorrect input data, unavailable resources, a hardware failure in the computer 102, or the like. Upon failure of the application program 104, the routine 400 moves to operation 406, where the application program 104 is restarted. The application program 104 may be restarted automatically by the virtualization layer 116 or another module executing on the computer 102, or it may be restarted manually by a system administrator on the same computer or on another computer system having a similar configuration.
The routine 400 proceeds from operation 406 to operation 408, where the virtualization layer 116 executing in the virtual application environment 106 on the computer 102 on which the application program 104 was restarted replays particular resource access events 126 logged in the resource access log 124 in order to restore the application state to a point before the failure occurred. For example, the virtualization layer 116 may replay all resource access events 126 corresponding to calls to resource APIs 120 that write to volatile or cached storage locations, such as system memory addresses, I/O buffers, cached files, and the like.
In another embodiment, the virtualization layer 116 may replay all resource access events 126 corresponding to writes of data that have occurred since the last snapshot or "checkpoint" of the application state was taken and stored by the virtualization layer and/or the application program 104 before the application failure. The virtualization layer 116 may utilize the resource write data 130 in the resource access log 124 to replay the selected resource access events 126, in order to ensure that the correct data is written when restoring the application state. Logging the resource access events 126 that add, modify, or delete data, together with the corresponding resource write data 130, so that the writes can be replayed to restore the application state may allow faster recovery from application failures, thereby reducing application downtime. From operation 408, the routine 400 ends.
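Replaying logged writes on top of a checkpoint can be sketched as follows. Application state is modeled here as a plain dictionary keyed by location, and the entry fields (`timestamp`, `op`, `location`, `data`) are hypothetical names chosen for illustration.

```python
def replay_writes(write_log, checkpoint_state, checkpoint_time):
    """Rebuild application state by replaying writes logged after a checkpoint."""
    state = dict(checkpoint_state)  # start from the stored snapshot
    for entry in sorted(write_log, key=lambda e: e["timestamp"]):
        if entry["timestamp"] <= checkpoint_time:
            continue  # already reflected in the checkpoint
        if entry["op"] == "delete":
            state.pop(entry["location"], None)
        else:  # "add" or "modify"
            state[entry["location"]] = entry["data"]
    return state


# Hypothetical checkpoint taken at t=100 and writes logged around it.
checkpoint = {"cfg/theme": "light"}
log = [
    {"timestamp": 90.0, "op": "modify", "location": "cfg/theme", "data": "old"},
    {"timestamp": 110.0, "op": "modify", "location": "cfg/theme", "data": "dark"},
    {"timestamp": 120.0, "op": "add", "location": "tmp/cache", "data": "x"},
    {"timestamp": 130.0, "op": "delete", "location": "tmp/cache", "data": None},
]

restored = replay_writes(log, checkpoint, checkpoint_time=100.0)
print(restored)
```

The write logged before the checkpoint is skipped, and the later writes are applied in time order, leaving the state as it stood just before the failure.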
Although the present disclosure is described in the context of the virtual application environment 106, it should be appreciated that the methods presented herein for differentiating the normal operation of an application program from error conditions to predict, diagnose, and recover from application failures may be implemented in any other application environment in which the access of the application program 104 to local and remote resources 118 can be monitored. For example, a module similar to the virtualization layer 116 may be implemented that uses methods known in the art to hook the resource APIs 120 implemented by the operating system 122, in order to monitor requests for local and remote resources 118 made by an application program 104 executing locally outside of the virtual application environment 106. The module may log the resource accesses and establish a common pattern of resource use by the application program 104. Once this resource access pattern is established, the module may continue to monitor the use of resources by the application program 104 in the manner described herein and provide a warning or alert when the pattern changes.
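As a rough illustration of hooking, logging, and pattern checking, the sketch below wraps a few stand-in resource functions so that every call is recorded, learns first-order transition frequencies from the log, and flags transitions that fall below a threshold. All names (`hook`, `open_file`, `learn_transitions`, `find_deviations`) are hypothetical, the "APIs" are plain Python functions rather than operating system calls, and simple frequency counting is used here in place of the Bayesian learning techniques the document mentions.

```python
from collections import Counter, defaultdict
from functools import wraps

access_log = []  # hypothetical stand-in for the resource access log 124


def hook(api_func):
    """Wrap a resource API so each call is logged before it runs."""
    @wraps(api_func)
    def wrapper(resource, *args, **kwargs):
        access_log.append((api_func.__name__, resource))
        return api_func(resource, *args, **kwargs)
    return wrapper


# Stand-in resource APIs; real hooks would intercept OS-level calls.
@hook
def open_file(resource):
    return f"handle:{resource}"


@hook
def read_file(resource):
    return b""


@hook
def close_file(resource):
    return None


def learn_transitions(events):
    """Estimate P(next call | previous call) from observed frequencies."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(events, events[1:]):
        counts[prev][nxt] += 1
    return {prev: {nxt: n / sum(c.values()) for nxt, n in c.items()}
            for prev, c in counts.items()}


def find_deviations(events, transitions, threshold=0.05):
    """Return consecutive call pairs whose learned probability is below
    the threshold; pairs never seen in training count as probability 0."""
    return [(prev, nxt) for prev, nxt in zip(events, events[1:])
            if transitions.get(prev, {}).get(nxt, 0.0) < threshold]
```

A monitoring module along these lines might call `learn_transitions` on the logged API names once enough normal activity has been recorded, then raise a warning whenever `find_deviations` returns a non-empty list for recent activity. The threshold value of 0.05 is an arbitrary choice for the sketch.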
Fig. 5 shows an example computer architecture for a computer 500 capable of executing the software components described herein for differentiating the normal operation of an application program from error conditions to predict, diagnose, and recover from application failures, in the manner presented above. The computer architecture shown in Fig. 5 illustrates a conventional server computer, desktop computer, laptop computer, notebook, PDA, wireless telephone, or other computing device, and may be used to execute any aspect of the software components presented herein that are described as executing on the computer 102 or other computing device.
The computer architecture shown in Fig. 5 includes one or more central processing units ("CPUs") 502. The CPUs 502 may be standard processors that perform the arithmetic and logical operations necessary for the operation of the computer 500. The CPUs 502 perform the necessary operations by transitioning from one discrete physical state to the next through the manipulation of switching elements that differentiate between and change these states. Switching elements may generally include electronic circuits that maintain one of two binary states, such as flip-flops, and electronic circuits that provide an output state based on the logical combination of the states of one or more other switching elements, such as logic gates. These basic switching elements may be combined to create more complex logic circuits, including registers, adders-subtractors, arithmetic logic units, floating-point units, and other logic elements.
The computer architecture also includes a system memory 508, including a random access memory ("RAM") 514 and a read-only memory ("ROM") 516, and a system bus 504 that couples the memory to the CPUs 502. A basic input/output system containing the basic routines that help to transfer information between elements within the computer 500, such as during startup, is stored in the ROM 516. The computer 500 also includes a mass storage device 510 for storing an operating system 122, application programs, and other program modules, which are described in greater detail herein.
The mass storage device 510 is connected to the CPUs 502 through a mass storage controller (not shown) connected to the bus 504. The mass storage device 510 provides non-volatile storage for the computer 500. The computer 500 may store information on the mass storage device 510 by transforming the physical state of the device to reflect the information being stored. The specific transformation of physical state may depend on various factors in different implementations of this description. Examples of such factors may include, but are not limited to, the technology used to implement the mass storage device and whether the mass storage device is characterized as primary or secondary storage.
For example, the computer 500 may store information to the mass storage device 510 by issuing instructions to the mass storage controller to alter the magnetic characteristics of a particular location within a disk drive; the reflective or refractive characteristics of a particular location in an optical storage device; or the electrical characteristics of a particular capacitor, transistor, or other discrete component in a solid-state storage device. Other transformations of physical media are possible without departing from the scope and spirit of the present invention. The computer 500 may also read information from the mass storage device 510 by detecting the physical states or characteristics of one or more particular locations within the mass storage device.
As mentioned briefly above, a number of program modules and data files may be stored in the mass storage device 510 and RAM 514 of the computer 500, including an operating system 518 suitable for controlling the operation of the computer. The mass storage device 510 and RAM 514 may also store one or more program modules. In particular, the mass storage device 510 and RAM 514 may store the virtualization layer 116, which was described in greater detail above with reference to Fig. 1. The mass storage device 510 and RAM 514 may also store other types of program modules or data.
In addition to the mass storage device 510 described above, the computer 500 may have access to other computer-readable media to store and retrieve information, such as program modules, data structures, or other data. Those skilled in the art will appreciate that computer-readable media can be any available media that may be accessed by the computer 500, including computer-readable storage media and communication media. Communication media include transitory signals. Computer-readable storage media include volatile and non-volatile, removable and non-removable media implemented in any method or technology for the storage of information, such as computer-readable instructions, data structures, program modules, or other data. For example, computer-readable storage media include, but are not limited to, RAM, ROM, EPROM, EEPROM, flash memory or other solid-state memory technology, CD-ROM, digital versatile disks (DVD), HD-DVD, Blu-ray or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to store the desired information and that can be accessed by the computer 500.
The computer-readable storage media may be encoded with computer-executable instructions that, when loaded into the computer 500, transform the computer system from a general-purpose computing system into a special-purpose computer capable of implementing the embodiments described herein. The computer-executable instructions may be encoded on the computer-readable storage media by altering the electrical, optical, magnetic, or other physical characteristics of particular locations within the media. These computer-executable instructions transform the computer 500 by specifying how the CPUs 502 transition between states, as described above. According to one embodiment, the computer 500 may have access to computer-readable storage media storing computer-executable instructions that, when executed by the computer, perform the routines 200, 300, and/or 400 described above with reference to Figs. 2-4 for differentiating the normal operation of an application program from error conditions to predict, diagnose, and recover from application failures.
According to various embodiments, the computer 500 may operate in a networked environment using logical connections to remote computing devices and computer systems through a network 110, such as a LAN, a WAN, the Internet, or a network of any topology known in the art. The computer 500 may connect to the network 110 through a network interface unit 506 connected to the bus 504. It should be appreciated that the network interface unit 506 may also be utilized to connect to other types of networks and remote computer systems.
The computer 500 may also include an input/output controller 512 for receiving and processing input from a number of input devices, including a keyboard, a mouse, a touchpad, a touch screen, an electronic stylus, or other type of input device. Similarly, the input/output controller 512 may provide output to a display device, such as a computer monitor, a flat-panel display, a digital projector, a printer, a plotter, or other type of output device. It will be appreciated that the computer 500 may not include all of the components shown in Fig. 5, may include other components that are not explicitly shown in Fig. 5, or may use an architecture completely different from that shown in Fig. 5.
Based on the foregoing, it should be appreciated that technologies for differentiating the normal operation of an application program from error conditions to predict, diagnose, and recover from application failures are provided herein. Although the subject matter presented herein has been described in language specific to computer structural features, methodological acts, and computer-readable storage media, it is to be understood that the invention defined in the appended claims is not necessarily limited to the specific features, acts, or media described herein. Rather, the specific features, acts, and media are disclosed as example forms of implementing the claims.
The subject matter described above is provided by way of illustration only and should not be construed as limiting. Various modifications and changes may be made to the subject matter described herein without following the example embodiments and applications illustrated and described, and without departing from the true spirit and scope of the present invention, which is set forth in the appended claims.

Claims (11)

1. A computer-implemented method for predicting a possible error condition in an application program (104), the method comprising:
hooking one or more resource application programming interfaces (APIs) to monitor access requests for resources by the application program;
recording resource access events (126) initiated by the application program (104), wherein the resource access events comprise details of calls to the one or more resource APIs;
establishing resource access patterns (128) from the recorded resource access events (126), wherein the resource access patterns comprise resource access patterns established between specific API calls, between API calls to specific resources, or between API calls to a specific amount of resources;
detecting a deviation by the application program (104) from the established resource access patterns (128); and
alerting a user or administrator of the application program (104) to the possible error condition based on the detected deviation.
2. The computer-implemented method of claim 1, further comprising recording data from parameter values specified in calls made by the application program to the resource APIs to add, modify, or delete data.
3. The computer-implemented method of claim 2, further comprising replaying the corresponding resource access events that add, modify, or delete data, using the recorded data from the parameter values specified in the calls to the resource APIs, in order to restore an application state of the application program after an application failure.
4. The computer-implemented method of claim 1, wherein the resource access patterns are established from the recorded resource access events using Bayesian learning techniques.
5. The computer-implemented method of claim 4, wherein the resource access patterns are stored as Markov chains or probability trees indicating the relative probabilities of occurrence between the resource access events.
6. The computer-implemented method of claim 4, wherein detecting a deviation from the established resource access patterns comprises detecting, based on a Bayesian analysis of the resource access patterns, a sequence of resource access events having a probability of occurrence below a specified threshold probability.
7. The computer-implemented method of claim 4, wherein, based on temporal proximity to the time of occurrence of a detected error condition of the application program, a sequence of resource access events having a probability of occurrence below a threshold probability, based on a Bayesian analysis of the resource access patterns, is correlated with the error condition.
8. A computer-implemented method for predicting a possible error condition in an application program, the method comprising:
hooking one or more resource application programming interfaces (APIs) to monitor access requests for resources (118) by an application program (104) executing on a computer (102);
recording resource access events (126) initiated by the application program (104) to a resource access log (124), wherein the resource access events comprise details of calls to the one or more resource APIs;
establishing resource access patterns (128) from the recorded resource access events (126) using computer pattern recognition techniques, wherein the resource access patterns comprise resource access patterns established between specific API calls, between API calls to specific resources, or between API calls to a specific amount of resources;
detecting a deviation by the application program from the established resource access patterns; and
alerting a user or administrator of the application program to the possible error condition based on the detected deviation.
9. The method of claim 8, further comprising:
detecting an error condition in the application program;
determining a time of occurrence of the error condition;
correlating one or more of the resource access events recorded in the resource access log with the error condition based on temporal proximity to the time of occurrence of the error condition; and
displaying the one or more correlated resource access events to the user or administrator of the application program.
10. The method of claim 8, wherein the one or more correlated resource access events comprise a sequence of resource access events having a probability of occurrence below a threshold probability, based on a Bayesian analysis of the established resource access patterns.
11. A computer-implemented system for predicting a possible error condition in an application program, the system comprising:
means for hooking one or more resource application programming interfaces (APIs) to monitor access requests for resources by the application program;
means for recording resource access events initiated by the application program, wherein the resource access events comprise details of calls to the one or more resource APIs;
means for establishing resource access patterns from the recorded resource access events, wherein the resource access patterns comprise resource access patterns established between specific API calls, between API calls to specific resources, or between API calls to a specific amount of resources;
means for detecting a deviation by the application program (104) from the established resource access patterns (128); and
means for alerting a user or administrator of the application program to the possible error condition based on the detected deviation.
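The correlation of low-probability resource access events with an error condition by temporal proximity, as recited in the claims above, can be sketched as follows. This is an illustrative sketch only: the `LoggedEvent` record, the `correlate_with_error` function, the five-second window, and the assumption that each logged event already carries a probability from the pattern analysis are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class LoggedEvent:
    """Hypothetical entry in a resource access log."""
    timestamp: float    # seconds, on the same clock as the error time
    api: str            # name of the resource API that was called
    resource: str       # resource the call targeted
    probability: float  # assumed output of the pattern analysis


def correlate_with_error(log: List[LoggedEvent],
                         error_time: float,
                         window_seconds: float = 5.0,
                         threshold: float = 0.05) -> List[LoggedEvent]:
    """Select events that occurred shortly before the error and that the
    established pattern considers unlikely."""
    return [
        event for event in log
        if 0.0 <= error_time - event.timestamp <= window_seconds
        and event.probability < threshold
    ]
```

The selected events could then be shown to a user or administrator as diagnostic information. Restricting the window to events that precede the error is a design choice of this sketch, not something the claims require.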
CN201110442035.7A 2010-12-27 2011-12-26 Predicting, diagnosing, and recovering from application failures based on resource access patterns Active CN102567181B (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US12/978,663 US9189308B2 (en) 2010-12-27 2010-12-27 Predicting, diagnosing, and recovering from application failures based on resource access patterns
US12/978,663 2010-12-27

Publications (2)

Publication Number Publication Date
CN102567181A CN102567181A (en) 2012-07-11
CN102567181B true CN102567181B (en) 2016-12-14

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7165190B1 (en) * 2002-07-29 2007-01-16 Oracle International Corporation Method and mechanism for managing traces within a computer system

Similar Documents

Publication Publication Date Title
US10884837B2 (en) Predicting, diagnosing, and recovering from application failures based on resource access patterns
CN100498725C (en) Method and system for minimizing loss in a computer application
CN103201724B (en) Providing application high availability in highly-available virtual machine environments
Levy et al. Predictive and adaptive failure mitigation to avert production cloud VM interruptions
US11178037B2 (en) Methods and systems that diagnose and manage undesirable operational states of computing facilities
Dai et al. Self-healing and hybrid diagnosis in cloud computing
US10489232B1 (en) Data center diagnostic information
US11093319B2 (en) Automated recovery of webpage functionality
Costa et al. A system software approach to proactive memory-error avoidance
TW201537461A (en) Framework for user-mode crash reporting
CN104583968A (en) Management system and management program
US9436539B2 (en) Synchronized debug information generation
CN101021800A (en) Virtual machine monitoring
US11151020B1 (en) Method and system for managing deployment of software application components in a continuous development pipeline
Chang et al. Modeling and analysis of high availability techniques in a virtualized system
US8555105B2 (en) Fallover policy management in high availability systems
Chen et al. Survivability modeling and analysis of cloud service in distributed data centers
CN102567181B (en) Predicting, diagnosing, and recovering from application failures based on resource access patterns
US20020156826A1 (en) Method and apparatus for performing emergency shutdown of a malfunctioning computer system saving all open files, data, and work in progress to a remote data storage business entity
US20240118991A1 (en) Application scenario injection and validation system
JP7180319B2 (en) Information processing device and dump management method for information processing device
CN119473662A (en) Method, apparatus, computer device, readable storage medium and program product for programming application program interface
Carberry et al. Real-Time rejuvenation scheduling for cloud systems with virtualized software spares
CN116560917A (en) Screen fault detection method and system of electronic equipment and operating system
CN119718745A (en) Fault diagnosis and automatic recovery system, method and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20150717

Address after: Washington State

Applicant after: Microsoft Technology Licensing LLC

Address before: Washington State

Applicant before: Microsoft Corp.

GR01 Patent grant