
CN109976986A - Method and apparatus for detecting abnormal devices - Google Patents

Method and apparatus for detecting abnormal devices

Info

Publication number
CN109976986A
Authority
CN
China
Prior art keywords
algorithm
real data
data
assessed
optional
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201711455271.6A
Other languages
Chinese (zh)
Other versions
CN109976986B (en)
Inventor
朱婉怡
豆龙超
黄杰龙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba Group Holding Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd
Priority to CN201711455271.6A
Publication of CN109976986A
Application granted
Publication of CN109976986B
Current legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3055 Monitoring arrangements for monitoring the status of the computing system or of the computing system component, e.g. monitoring if the computing system is on, off, available, not available
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/34 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
    • G06F11/3466 Performance evaluation by tracing or monitoring

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Computer Hardware Design (AREA)
  • Testing And Monitoring For Control Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

One or more embodiments of this specification provide a method and apparatus for detecting abnormal devices. The method may include: obtaining actual data of a performance indicator of a monitored system; determining, within the actual data, first actual data that falls in a stable interval of the time series and second actual data that falls in an unstable interval of the time series; evaluating the first actual data to obtain a first evaluation result and evaluating the second actual data to obtain a second evaluation result; and determining an abnormal device in the monitored system according to the first evaluation result and the second evaluation result.

Description

Method and apparatus for detecting abnormal devices
Technical field
One or more embodiments of this specification relate to the technical field of anomaly detection, and in particular to a method and apparatus for detecting abnormal devices.
Background
By configuring an alarm mechanism in a system such as a data center, the operating status of the monitored system within that system can be monitored, so that abnormal conditions that may arise in the monitored system are found and resolved in time.
In the related art, data for the performance indicators of the monitored system (that is, performance data) is collected and compared with a predefined performance threshold. When the performance data does not satisfy the performance threshold, the corresponding monitored system is judged to be possibly abnormal.
Summary of the invention
In view of this, one or more embodiments of this specification provide a method and apparatus for detecting abnormal devices.
To achieve the above object, one or more embodiments of this specification provide the following technical solutions:
According to a first aspect of one or more embodiments of this specification, a method for detecting abnormal devices is proposed, including:
Obtaining actual data of a performance indicator of a monitored system;
Determining, within the actual data, first actual data that falls in a stable interval of the time series and second actual data that falls in an unstable interval of the time series;
Evaluating the first actual data to obtain a first evaluation result, and evaluating the second actual data to obtain a second evaluation result;
Determining an abnormal device in the monitored system according to the first evaluation result and the second evaluation result.
According to a second aspect of one or more embodiments of this specification, an apparatus for detecting abnormal devices is proposed, including:
An acquiring unit, which obtains actual data of a performance indicator of a monitored system;
A determination unit, which determines, within the actual data, first actual data that falls in a stable interval of the time series and second actual data that falls in an unstable interval of the time series;
An assessment unit, which evaluates the first actual data to obtain a first evaluation result and evaluates the second actual data to obtain a second evaluation result;
A recognition unit, which determines an abnormal device in the monitored system according to the first evaluation result and the second evaluation result.
Brief description of the drawings
Fig. 1 is a schematic structural diagram of a data center according to an exemplary embodiment.
Fig. 2 is a flowchart of a method for detecting abnormal devices according to an exemplary embodiment.
Fig. 3 is a flowchart of a warning system performing anomaly detection on a data center according to an exemplary embodiment.
Fig. 4 is a schematic diagram of the curves of the historical data of a performance indicator X over three operating cycles according to an exemplary embodiment.
Fig. 5 is a schematic diagram of the curves of the actual data of the 1-minute load average, the 5-minute load average, and the 15-minute load average according to an exemplary embodiment.
Fig. 6 is a schematic diagram of the normal distribution of the actual data of the 1-minute load average after processing by the Box-Cox transformation algorithm according to an exemplary embodiment.
Fig. 7 is a schematic structural diagram of an electronic device according to an exemplary embodiment.
Fig. 8 is a block diagram of an apparatus for detecting abnormal devices according to an exemplary embodiment.
Detailed description
Exemplary embodiments are described in detail here, and examples of them are shown in the accompanying drawings. When the following description refers to the drawings, the same numerals in different drawings denote the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with one or more embodiments of this specification. Rather, they are merely examples of apparatuses and methods consistent with some aspects of one or more embodiments of this specification, as detailed in the appended claims.
It should be noted that, in other embodiments, the steps of the corresponding method are not necessarily performed in the order shown and described in this specification. In some other embodiments, a method may include more or fewer steps than described in this specification. In addition, a single step described in this specification may be split into multiple steps in other embodiments, and multiple steps described in this specification may be merged into a single step in other embodiments.
Fig. 1 is a schematic structural diagram of a data center according to an exemplary embodiment. As shown in Fig. 1, the data center is configured with hardware devices such as device 11, device 12, and device 13, and with a warning system 14. The hardware devices such as devices 11-13 run applications independently or cooperatively to implement the specific functions of the data center.
The warning system 14 monitors the performance indicators of the data center to determine the operating condition of the data center. A performance indicator may come from the hardware state of hardware devices such as devices 11-13, from the application state of the applications running on the hardware devices, or from other sources; this specification does not limit this. By monitoring the performance indicators, the warning system 14 can promptly find abnormal devices that may appear in the data center and issue a prompt or alarm to a preset object, so that the abnormal condition can be diagnosed, analyzed, or handled in time. The preset object may include data center staff, an automated processing system, and so on; this specification does not limit this.
In the embodiments of this specification, the anomaly detection scheme of the warning system 14 is optimized and improved so that monitoring becomes more accurate and more sensitive. This avoids false alarms that would waste staff time or other resources, and helps ensure the normal operation of the data center. The data center is only one application of the anomaly detection scheme provided in this specification. The scheme can also be applied to any other electronic device, structure, or system; this specification does not limit this.
Fig. 2 is a flowchart of a method for detecting abnormal devices according to an exemplary embodiment. As shown in Fig. 2, the method may include the following steps:
Step 202: obtain actual data of a performance indicator of the monitored system.
In one embodiment, a performance indicator may be any parameter that reflects the operating condition of the monitored system, such as throughput, event count, operation duration, or memory or storage size; this specification does not limit this.
In one embodiment, correlation analysis may be performed on the performance indicators of the monitored system. When actual data has been collected for multiple associated performance indicators, the actual data of at least one of those indicators can be screened out, so that only the actual data of some (one or more) indicators is retained and the amount of data handled in subsequent processing is reduced. For example, suppose the collected actual data covers the 1-minute load average, the 5-minute load average, and the 15-minute load average. If these three performance indicators are associated (for example, their degree of correlation exceeds a preset level), only the actual data of the 1-minute load average needs to be kept, and the actual data of the 5-minute and 15-minute load averages can be screened out. Subsequent steps then process only the 1-minute load average, which reduces the data processing volume without adversely affecting the final evaluation result.
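The screening described above could be sketched as follows. This is a minimal illustration, not the patented implementation: the Pearson coefficient as the measure of association, the 0.9 threshold, and the names `pearson` and `screen_correlated` are all assumptions made for the example.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series carries no correlation information
    return cov / sqrt(var_x * var_y)

def screen_correlated(metrics, threshold=0.9):
    """Keep a metric only if it is not strongly correlated with one already kept.

    `metrics` maps a metric name to its series; insertion order decides
    which member of a correlated group survives (assumed policy).
    """
    kept = []
    for name, series in metrics.items():
        if all(abs(pearson(series, metrics[k])) <= threshold for k in kept):
            kept.append(name)
    return kept

# Hypothetical sample data: three load averages that move together,
# plus an unrelated thread count.
metrics = {
    "load_1m": [1.0, 2.0, 3.0, 4.0, 5.0],
    "load_5m": [1.1, 2.1, 2.9, 4.2, 5.0],
    "load_15m": [1.0, 1.9, 3.1, 4.0, 5.1],
    "thread_count": [7.0, 3.0, 9.0, 1.0, 5.0],
}
kept = screen_correlated(metrics)
print(kept)
```

With this sample data only the 1-minute load average and the thread count survive, because the 5-minute and 15-minute series are almost perfectly correlated with the 1-minute series.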
In one embodiment, it can be determined whether the actual data of a performance indicator satisfies a predefined standard data structure; if it does not, the actual data can be adjusted so that it does. For example, suppose the standard data structure is a normal distribution structure. The actual data can then be adjusted by algorithms such as a Gaussian distribution transformation or the Box-Cox transformation so that it satisfies the normal distribution.
Step 204: determine, within the actual data, first actual data that falls in a stable interval of the time series and second actual data that falls in an unstable interval of the time series.
In one embodiment, the operating cycle of the monitored system is divided into several periods in the time series. When the historical data of the performance indicator for the same period is in a stable state in every one of the selected operating cycles, that period is determined to belong to the stable interval. When the historical data for the same period is in an unstable state in at least one of the selected operating cycles, that period is determined to belong to the unstable interval. For example, suppose the operating cycle of the monitored system is one day and the selected operating cycles are the most recent three days, that is, the three most recent consecutive operating cycles. If the period 9:00~12:00 is in a stable state in all three operating cycles, it can be determined to belong to the stable interval. If the period 3:20~4:10 is in an unstable state in at least one of the three operating cycles, it can be determined to belong to the unstable interval.
In one embodiment, the stable interval may include one or more periods.
In one embodiment, the unstable interval may include one or more periods.
In one embodiment, the operating cycle can be divided into periods at a specific granularity, such as hours, minutes, or seconds; this specification does not limit this.
In one embodiment, a period in a stable state indicates that the performance indicator of the monitored system fluctuates little or not at all in that period across multiple operating cycles. A period in an unstable state indicates that the performance indicator fluctuates substantially in that period at least once across multiple operating cycles.
In one embodiment, the stability of each period can be identified from the numerical characteristics of the performance indicator. For example, a period in an unstable state may include at least one of the following: a peak period (a period in which a peak occurs), a noise period (a period in which noise is present), and so on. A period in a stable state contains no such peak period or noise period.
In one embodiment, the periods in a stable state may be all periods of the operating cycle other than the periods in an unstable state. That is, the stable and unstable periods can be complementary within the operating cycle and together make up the complete cycle. In other embodiments, the stable and unstable periods are not necessarily complementary within the operating cycle; for example, there may also be periods whose state is uncertain or periods that are not of interest. This specification does not limit this.
In one embodiment, a time series pattern discovery algorithm can be used to analyze the historical data of the performance indicators of the monitored system over multiple operating cycles, in order to determine the stable interval and the unstable interval.
Step 206: evaluate the first actual data to obtain a first evaluation result, and evaluate the second actual data to obtain a second evaluation result.
In one embodiment, the first actual data and the second actual data are evaluated separately to obtain the corresponding first evaluation result and second evaluation result.
In one embodiment, suppose the first actual data is evaluated with a first algorithm and the second actual data with a second algorithm. The first algorithm and the second algorithm can be unsupervised algorithms, so that abnormal devices in the monitored system can be identified without labeled data, which greatly reduces the processing burden on staff.
In one embodiment, the first algorithm may include at least one of the following optional algorithms: a time series mining algorithm, a clustering algorithm, a statistical learning algorithm, a regression analysis algorithm, and so on.
In one embodiment, the second algorithm may include at least one of the following optional algorithms: a clustering algorithm, a statistical learning algorithm, and a regression analysis algorithm.
Step 208: determine an abnormal device in the monitored system according to the first evaluation result and the second evaluation result.
In one embodiment, when the first algorithm includes multiple optional algorithms, each optional algorithm has a corresponding weight. The evaluation result for the first actual data includes a first weighted evaluation score, computed by weighting the evaluation scores that the optional algorithms produce. Combining the strengths of several optional algorithms in this way helps improve the ability to detect abnormal devices.
In one embodiment, when the second algorithm includes multiple optional algorithms, each optional algorithm has a corresponding weight. The evaluation result for the second actual data includes a second weighted evaluation score, computed by weighting the evaluation scores that the optional algorithms produce. Combining the strengths of several optional algorithms in this way helps improve the ability to detect abnormal devices.
In one embodiment, the evaluation result can be compared with labeled data to analyze the evaluation accuracy of each optional algorithm used as the first algorithm or the second algorithm, and the corresponding optional algorithm can then be improved according to that accuracy. For example, an optional algorithm may evaluate the actual data of a performance indicator against a predefined evaluation threshold to generate the corresponding evaluation score; improving the optional algorithm can then mean adjusting the value of that threshold. As another example, when the first algorithm or the second algorithm includes multiple optional algorithms, the weight of each optional algorithm can be adjusted.
For ease of understanding, the technical solution of this specification is described in detail below, taking a warning system performing anomaly detection on a data center as an example.
Fig. 3 is a flowchart of a warning system performing anomaly detection on a data center according to an exemplary embodiment. As shown in Fig. 3, the monitoring process may include the following steps:
Step 302: obtain historical data of the performance indicators.
In one embodiment, the warning system can be configured with a data acquisition function so that it can obtain the historical data of the performance indicators of the data center. For example, the warning system can be configured with an ETL (Extract-Transform-Load) module that collects the historical data of the performance indicators of the data center. The historical data may be the full historical data or the historical data within a historical time window; this specification does not limit this.
In one embodiment, increasing the variety of performance indicators can improve the anomaly detection capability for the data center at least to some extent. The performance indicators covered by the historical data should therefore be as comprehensive as possible, so that the data center is monitored thoroughly. For example, the performance indicators may include the 1-minute load average, 5-minute load average, 15-minute load average, CPU utilization, throughput (queries per second), latency (response time), thread count, and memory or storage size; this specification does not limit this.
Step 304: determine the stable interval and the unstable interval.
In one embodiment, the data center has a certain operating cycle, such as 1 day, 3 days, 1 week, or 1 month; this specification does not limit this.
In one embodiment, certain operating patterns of the data center can be found from its operating status over multiple operating cycles. Suppose the operating cycle of the data center is 00:00~24:00 of each calendar day. The historical data of a performance indicator X over the three most recent operating cycles (for example, day d-1 to day d-3; in other embodiments the operating cycles may be chosen in other ways, and this specification does not limit this) can be selected and represented as the three curves shown in Fig. 4. The operating patterns can then be found according to the following principles:
In the time series, an operating cycle may include several periods; for example, at hourly granularity (or at minute or second granularity) it may include 24 periods. For each period in the operating cycle, the historical data of performance indicator X in that period can then be compared across the three operating cycles.
For example, in period 41 shown in Fig. 4, the performance values of the data center (that is, the values of the historical data of performance indicator X) are in a stable state in all three operating cycles. In other words, none of the three operating cycles contains a peak period (a period containing a peak) or a noise period (a period containing noise) at period 41, so the values of the three curves at period 41 are the same or similar (for example, their difference is less than a preset value). Period 41 can therefore be considered to belong to the stable interval of the operating cycle. Similarly, periods 42 and 43 shown in Fig. 4 can be treated as stable intervals of the operating cycle.
In period 44 shown in Fig. 4, by contrast, the performance values of the data center across the three operating cycles are in an unstable state. That is, at least one operating cycle contains a peak period or a noise period at period 44, so the values of the three curves at period 44 differ considerably because of random peaks or noise. Period 44 can therefore be considered to belong to the unstable interval of the operating cycle. Similarly, period 45 and other such periods shown in Fig. 4 can be treated as unstable intervals of the operating cycle.
Therefore, an algorithm based on the above principles or on similar principles can be chosen to analyze the historical data of each performance indicator of the data center over specific operating cycles, so that the operating cycle is divided into the corresponding stable interval and unstable interval for each performance indicator. For example, the algorithm may be a time series motif discovery (Time Series Motif Discovery) algorithm, which finds time series patterns such as the "stable interval" and the "unstable interval" within the operating cycle.
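The comparison principle above can be illustrated with a deliberately simple sketch. It is not a motif discovery algorithm: it only applies the "difference less than a preset value" test across cycles, and the name `classify_periods`, the tolerance of 5, and the sample data are assumptions for the example.

```python
def classify_periods(cycles, tolerance):
    """Label each period of the operating cycle as 'stable' or 'unstable'.

    `cycles` is a list of operating cycles, each a list with one value per
    period. A period is treated as stable when its values differ by less
    than `tolerance` across all cycles (a simplified stand-in criterion).
    """
    labels = []
    for values in zip(*cycles):  # values of one period across all cycles
        spread = max(values) - min(values)
        labels.append("stable" if spread < tolerance else "unstable")
    return labels

# Hypothetical history of performance indicator X: three cycles, five periods.
history = [
    [10, 11, 10, 40, 12],  # day d-3: a random peak in the fourth period
    [10, 12, 11, 12, 12],  # day d-2
    [11, 11, 10, 11, 13],  # day d-1
]
labels = classify_periods(history, tolerance=5)
print(labels)
```

Because a single peak in any one cycle is enough to widen the spread, the fourth period is labeled unstable even though two of the three cycles were calm there, which matches the "at least one operating cycle" rule in the text.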
In one embodiment, the stable interval and the unstable interval can be identified at the same time. In another embodiment, only the stable interval is identified and the remaining periods are treated as the unstable interval, or only the unstable interval is identified and the remaining periods are treated as the stable interval.
In one embodiment, steps 302-304 above operate on the historical data of the performance indicators of the data center, so the warning system can process the historical data offline to determine the stable interval and the unstable interval of the operating cycle. In steps 306-314 below, the actual data of the performance indicators of the data center can be analyzed online to identify abnormal devices in the data center.
Step 306: collect the actual data of the performance indicators.
In one embodiment, the set of performance indicators involved in step 302 should be no smaller than the set involved in step 306, so that the online analysis in subsequent steps can be completed on the basis of the offline processing results obtained in steps 302-304.
In one embodiment, the actual data of the performance indicators may include, for each device in the data center, the 1-minute load average, 5-minute load average, 15-minute load average, CPU utilization, throughput (queries per second), latency (response time), thread count, and memory or storage size; this specification does not limit this. The warning system can aggregate the collected actual data according to each device's location in the data center, its device group, the application it is running, and so on.
In one embodiment, the warning system can clean the collected actual data. For example, the warning system can identify missing values of the actual data in the time series and fill them in. A filled-in value can be a default value, the value at an adjacent time point, the mean of the data within a nearby preset time window, and so on; this specification does not limit this. As another example, the warning system can remove actual data that clearly violates business rules.
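As a rough sketch of the missing-value completion described above, the following fills each gap with the mean of the observed values in a nearby window and falls back to a default when the window holds no observation. The function name `fill_missing`, the `None` marker for a missing value, and the window size are assumptions for illustration.

```python
def fill_missing(series, window=1, default=0.0):
    """Replace None entries with the mean of observed neighbours.

    Neighbours are the original values within `window` positions on either
    side; if none of them is observed, `default` is used instead.
    """
    filled = list(series)
    for i, value in enumerate(series):
        if value is None:
            lo, hi = max(0, i - window), min(len(series), i + window + 1)
            neighbours = [v for v in series[lo:hi] if v is not None]
            filled[i] = sum(neighbours) / len(neighbours) if neighbours else default
    return filled

# Hypothetical series with one isolated gap and one longer gap.
cleaned = fill_missing([1.0, None, 3.0, None, None, None, 6.0])
print(cleaned)
```

Reading neighbours from the original series rather than from already filled values keeps a long gap from being smeared with repeatedly averaged estimates; the middle of the longer gap therefore receives the default.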
In one embodiment, the warning system can perform correlation analysis on the performance indicators according to the collected actual data of each indicator. As shown in Fig. 5, take the 1-minute load average, the 5-minute load average, and the 15-minute load average as an example. If the actual data of these three performance indicators is significantly correlated (that is, the three are associated and their degree of correlation exceeds a preset level), the actual data of only some of the indicators can be selected for the subsequent anomaly detection process. Not all actual data needs to be processed, which reduces resource usage. For example, only the actual data of the 1-minute load average may be selected for subsequent processing, and the actual data of the 5-minute and 15-minute load averages may be left unprocessed.
In one embodiment, the warning system can standardize the collected actual data of a performance indicator so that it satisfies a predefined standardized data structure. For example, the standardized data structure may be a normal distribution structure. Suppose the warning system has selected the 1-minute load average described above. The warning system can then determine whether the actual data of the 1-minute load average satisfies the normal distribution structure. If it does not, the actual data can be converted by algorithms such as a Gaussian distribution transformation or the Box-Cox transformation so that it conforms to the normal distribution structure. For example, Fig. 6 shows the actual data of the 1-minute load average after processing by the Box-Cox transformation algorithm. The conversion uses a power of 0.4 (that is, the original actual data is raised to the power 0.4) so that the actual data of the 1-minute load average satisfies the normal distribution structure.
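For reference, the Box-Cox transformation maps a positive value x to (x^λ - 1)/λ when λ is nonzero and to ln x when λ is zero. The sketch below applies it with λ = 0.4, on the assumption that this corresponds to the 0.4 power mentioned in the example; the function name `box_cox` and the sample values are hypothetical.

```python
from math import log

def box_cox(values, lam):
    """Apply the Box-Cox transformation with parameter `lam`.

    y = (x**lam - 1) / lam  for lam != 0
    y = ln(x)               for lam == 0
    The transformation is defined only for strictly positive inputs.
    """
    if any(v <= 0 for v in values):
        raise ValueError("Box-Cox requires strictly positive values")
    if lam == 0:
        return [log(v) for v in values]
    return [(v ** lam - 1.0) / lam for v in values]

# Hypothetical right-skewed 1-minute load averages.
load_1m = [0.5, 0.8, 1.0, 1.3, 2.0, 4.5, 9.0]
transformed = box_cox(load_1m, lam=0.4)
print(transformed)
```

A power below 1 compresses large values more than small ones, which is why it can pull a right-skewed load distribution closer to a symmetric shape. In practice λ is usually estimated from the data rather than fixed in advance.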
In one embodiment, the warning system can provide staff or other data consumers with a standard open API (Application Programming Interface) so that they can access the actual data of the performance indicators programmatically.
Step 308: divide the actual data by interval.
In one embodiment, according to the stable interval and the unstable interval determined in step 304, the collected actual data of a performance indicator can be divided into first actual data, which falls in the stable interval, and second actual data, which falls in the unstable interval, so that the first actual data and the second actual data can be analyzed and processed separately.
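A minimal sketch of this division follows. It assumes that samples carry a minute-of-day timestamp, that unstable intervals are given as half-open `(start, end)` minute ranges, and that everything outside them counts as stable; the names `in_any` and `split_actual_data` are hypothetical.

```python
def in_any(minute, intervals):
    """True if `minute` falls inside any half-open (start, end) interval."""
    return any(start <= minute < end for start, end in intervals)

def split_actual_data(samples, unstable_intervals):
    """Split (minute_of_day, value) samples into first and second actual data."""
    first, second = [], []
    for minute, value in samples:
        target = second if in_any(minute, unstable_intervals) else first
        target.append((minute, value))
    return first, second

# The unstable interval 03:20 to 04:10 expressed in minutes of the day.
unstable_intervals = [(200, 250)]
samples = [(180, 1.1), (210, 4.8), (249, 5.2), (250, 1.0), (600, 1.2)]
first, second = split_actual_data(samples, unstable_intervals)
print(first)
print(second)
```

The two resulting lists can then be handed to different evaluation algorithms, which is the point of dividing the data before assessment.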
Step 310: determine the optional algorithms.
In one embodiment, a first algorithm (which may include one or more optional algorithms) can be used for the first actual data in the stable interval, and a second algorithm (which may include one or more optional algorithms) can be used for the second actual data in the unstable interval.
Step 312A: evaluate the first actual data.
In one embodiment, the warning system can evaluate the first actual data with an evaluation model based on the first algorithm. For example, the first algorithm may include at least one of the following optional algorithms: a time series mining algorithm, a clustering algorithm, a statistical learning algorithm, a regression analysis algorithm, and so on; this specification does not limit this.
When the first algorithm includes a time series mining algorithm, the following processing can be applied to the first actual data of any performance indicator:
1) Select a suitable time window, and smooth both the data in the first actual data that corresponds to each device and the data that corresponds to the device group to which each device belongs. A Savitzky-Golay filter can be used for the smoothing, for example. In other embodiments, other smoothing or low-pass filters can be used, such as a Kalman filter, a moving-window average filter, or a Butterworth filter. All devices in the data center may be regarded as belonging to one device group, or the devices in the data center may be divided into multiple device groups by location, function, or other dimensions.
2) Perform the following processing for each device group. In each time window, determine the device group smooth value for the data in the first actual data that corresponds to the device group, and use that device group smooth value as the expected value for every device in the group. At the same time, determine the device smooth value for the data in the first actual data that corresponds to each device. Then calculate the distance between each device's smooth value and the corresponding expected value.
3) Compare the distance for each device with a predefined threshold to determine the probability that the device is an abnormal device, that is, the evaluation score that the time series mining algorithm gives to each device. The threshold may be a value determined in advance by statistical learning. For example, the statistical learning algorithm may be the "three-sigma" rule, the median absolute deviation approach, and so on; this specification does not limit this.
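Steps 1) to 3) can be sketched as follows under several stated assumptions: a trailing moving average is used as the smoothing filter (one of the alternatives the text names), the device group series is taken as the pointwise median across devices, and the evaluation score is the fraction of time points at which a device's distance from the expected value exceeds the threshold. The text does not fix any of these choices, and the names `moving_average` and `score_devices` are hypothetical.

```python
from statistics import median

def moving_average(series, window):
    """Trailing moving average; early points average over what is available."""
    out = []
    for i in range(len(series)):
        chunk = series[max(0, i - window + 1): i + 1]
        out.append(sum(chunk) / len(chunk))
    return out

def score_devices(group, window, threshold):
    """Score each device by how often it strays from its group.

    `group` maps a device name to its series. The score is the fraction of
    time points where |device smooth value - expected value| > threshold.
    """
    names = list(group)
    length = len(group[names[0]])
    # Group-level series: pointwise median across devices (an assumption).
    group_series = [median(group[n][t] for n in names) for t in range(length)]
    expected = moving_average(group_series, window)  # expected value per point
    scores = {}
    for name in names:
        smooth = moving_average(group[name], window)
        exceed = sum(abs(s - e) > threshold for s, e in zip(smooth, expected))
        scores[name] = exceed / length
    return scores

# Hypothetical device group in which dev-c drifts away halfway through.
group = {
    "dev-a": [1.0, 1.2, 1.1, 1.0, 1.1, 1.2],
    "dev-b": [1.1, 1.0, 1.2, 1.1, 1.0, 1.1],
    "dev-c": [1.0, 1.1, 1.0, 6.0, 6.2, 6.1],
}
scores = score_devices(group, window=2, threshold=1.0)
print(scores)
```

Using the median for the group series keeps a single drifting device from dragging the expected value toward itself; a mean would work too but would shrink the measured distance of the outlier.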
When the first algorithm includes a clustering algorithm, the clustering algorithm may be a density-based algorithm, such as DBSCAN (Density-Based Spatial Clustering of Applications with Noise) or the LOF (Local Outlier Factor) algorithm; the clustering algorithm may also be a prediction-based algorithm, such as a one-class support vector machine (One-Class SVM); this specification places no limitation on this.
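As a rough illustration of the density-based idea, the sketch below marks the points that DBSCAN would label as noise, using only the standard library. The function name `dbscan_noise`, the sample readings, and the `eps` and `min_samples` values are hypothetical; a production system would more likely rely on an existing clustering library.

```python
from math import dist


def dbscan_noise(points, eps, min_samples):
    """Return the indices of points that DBSCAN would label as noise."""
    # eps-neighborhood of every point (a point counts as its own neighbor).
    neighbors = [
        [j for j, q in enumerate(points) if dist(p, q) <= eps]
        for p in points
    ]
    # Core points have at least min_samples points in their neighborhood.
    core = {i for i, nb in enumerate(neighbors) if len(nb) >= min_samples}
    # Non-core points within eps of a core point are border points;
    # everything else is noise, i.e. a candidate abnormal device.
    return [
        i for i, nb in enumerate(neighbors)
        if i not in core and not any(j in core for j in nb)
    ]


# Hypothetical two-indicator readings, one point per device.
readings = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.1), (1.1, 1.2), (8.0, 8.0)]
noise = dbscan_noise(readings, eps=0.5, min_samples=3)
```

The first four devices form one dense cluster, while the fifth has no neighbors within `eps` and is reported as noise.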
When the first algorithm includes a statistical learning algorithm, the statistical learning algorithm may include the "three-sigma" rule, the median absolute deviation method, Tukey's test, and so on; this specification places no limitation on this. Based on such a statistical learning algorithm, devices that deviate significantly from the mean or median of all devices in the selected device group can be picked out.
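The three statistical rules named above can be sketched as follows. This is an illustrative sketch, not the patent's implementation: the cutoffs (3 standard deviations, a modified z-score of 3.5, and 1.5 times the interquartile range) are conventional choices assumed here, and the function names and sample values are hypothetical.

```python
from statistics import mean, median, pstdev, quantiles


def three_sigma_outliers(values):
    """Values more than three standard deviations from the mean."""
    mu, sigma = mean(values), pstdev(values)
    return [v for v in values if abs(v - mu) > 3 * sigma]


def mad_outliers(values, cutoff=3.5):
    """Values whose modified z-score (based on the MAD) exceeds the cutoff."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # Degenerate case: more than half of the values equal the median.
        return [v for v in values if v != med]
    return [v for v in values if 0.6745 * abs(v - med) / mad > cutoff]


def tukey_outliers(values, k=1.5):
    """Values outside the Tukey fences [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    return [v for v in values if v < q1 - k * iqr or v > q3 + k * iqr]


# Hypothetical readings of one indicator across the devices of a group.
cpu = [10, 12, 11, 9, 10, 11, 12, 10, 9, 11, 10, 12, 11, 10, 9, 60]
by_sigma = three_sigma_outliers(cpu)
by_mad = mad_outliers(cpu)
by_tukey = tukey_outliers(cpu)
```

Note that the three-sigma rule needs a reasonably large group: with very few devices, no single value can lie three standard deviations from the mean, which is one reason the median-based rules are often preferred for small groups.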
In one embodiment, when the first algorithm includes multiple optional algorithms at the same time, the warning system may analyze the first real data with the assessment model based on each optional algorithm to obtain a corresponding evaluation score, and then weight the evaluation scores obtained by these optional algorithms (for example, by a weighted sum) according to the weight value of each optional algorithm, to obtain the combined score of these optional algorithms for the first real data, namely the first weighted evaluation score.
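A weighted combination of this kind might look like the sketch below. The normalization by the total weight is an assumption (the text only says "weighted sum" as one example), and the algorithm names, scores, and weights are hypothetical.

```python
def weighted_score(scores, weights):
    """Combine per-algorithm evaluation scores into one weighted score."""
    total_weight = sum(weights[name] for name in scores)
    # Assumption: normalize so the result stays on the same scale as the inputs.
    return sum(scores[name] * weights[name] for name in scores) / total_weight


# Hypothetical evaluation scores for one device and per-algorithm weights.
scores = {"time_series": 0.9, "clustering": 0.6, "statistical": 0.3}
weights = {"time_series": 0.5, "clustering": 0.3, "statistical": 0.2}
first_weighted_score = weighted_score(scores, weights)  # approximately 0.69
```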
Step 312B, the second real data is assessed.
In one embodiment, the warning system may assess the second real data with an assessment model based on the above second algorithm. For example, the second algorithm may include at least one of the following optional algorithms: a clustering algorithm, a statistical learning algorithm, a regression analysis algorithm, and so on; this specification places no limitation on this.
In one embodiment, when the second algorithm includes multiple optional algorithms at the same time, the warning system may analyze the second real data with the assessment model based on each optional algorithm to obtain a corresponding evaluation score, and then weight the evaluation scores obtained by these optional algorithms (for example, by a weighted sum) according to the weight value of each optional algorithm, to obtain the combined score of these optional algorithms for the second real data, namely the second weighted evaluation score.
Step 314, the risk score of each device is obtained.
In one embodiment, since the first real data corresponds to the stable region of the operating cycle and the second real data corresponds to the unstable region of the operating cycle, the risk score of each device in the data center can be obtained by combining the assessment result for the first real data with the assessment result for the second real data.
Step 316, the assessment models based on the optional algorithms are analyzed according to the assessment result obtained in step 314 and a labeled data set.
In one embodiment, the labeled data set may include labeled data obtained after staff manually label the real data of the performance indicators of the data center; the staff label the real data as being in an abnormal state or a normal state according to the actual situation.
In one embodiment, when the warning system assesses the real data of the performance indicators based on the optional algorithms in the above first algorithm and second algorithm, several outcomes are possible: abnormal data is identified as abnormal data, abnormal data is identified as normal data, normal data is identified as normal data, or normal data is identified as abnormal data. By comparing the assessment result with the labeled data set, the recognition effect of each optional algorithm on the real data of the performance indicators can be determined; for example, the recognition effect may be expressed as the precision, recall, accuracy, F1 value, and so on of the assessment model corresponding to each optional algorithm; this specification places no limitation on this.
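The four outcomes listed above are the cells of a confusion matrix, and the metrics follow from their counts. The sketch below uses the conventional definitions of precision, recall, accuracy, and F1; the function name `evaluate` and the sample labels are hypothetical.

```python
def evaluate(predicted, labeled):
    """Compare predicted abnormal flags with manually labeled flags."""
    pairs = list(zip(predicted, labeled))
    tp = sum(1 for p, l in pairs if p and l)          # abnormal found as abnormal
    fp = sum(1 for p, l in pairs if p and not l)      # normal flagged as abnormal
    fn = sum(1 for p, l in pairs if not p and l)      # abnormal missed as normal
    tn = sum(1 for p, l in pairs if not p and not l)  # normal kept as normal
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"precision": precision, "recall": recall,
            "accuracy": accuracy, "f1": f1}


# Hypothetical flags for eight data points (True means abnormal).
predicted = [True, True, False, False, True, False, False, False]
labeled = [True, False, True, False, True, False, False, False]
metrics = evaluate(predicted, labeled)
```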
Step 318, the assessment models based on the optional algorithms are improved.
In one embodiment, the assessment models based on the optional algorithms may be improved according to the evaluated effect of the assessment model corresponding to each optional algorithm. For example, the threshold used in an assessment model to evaluate the probability that real data is abnormal may be adjusted (such as the threshold mentioned above in the description of the "time series mining algorithm"). As another example, the weight value of the assessment model corresponding to each optional algorithm may be adjusted.
Based on steps 316 to 318 above, the warning system can automatically feed back on the risk score of each device, so that the assessment models based on the optional algorithms are continuously improved on the basis of the labeled data set. The improvement process may be carried out online or offline; this specification places no limitation on this. For example, the automatic feedback and the corresponding improvement may be implemented with reference to the following feedback formula:
where s denotes the total anomaly score, s_i denotes the anomaly score of the i-th assessment model, w_i denotes the weight value of the i-th assessment model, and f_i denotes the combined evaluation score (similar to the F1 value) of the accuracy of the i-th assessment model.
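The feedback formula itself is not reproduced in this text. One reading that is consistent with the symbol definitions, offered purely as an assumption and not as the patent's formula, is that each weight is proportional to the model's accuracy score, w_i = f_i / sum_j f_j, and that the total score is the weighted sum s = sum_i w_i * s_i. A sketch of that reading, with hypothetical names and values:

```python
def update_weights(f_scores):
    """Assumed rule: w_i = f_i / sum_j f_j (not confirmed by the source)."""
    total = sum(f_scores.values())
    return {name: f / total for name, f in f_scores.items()}


def total_anomaly_score(model_scores, weights):
    """Assumed rule: s = sum_i w_i * s_i (not confirmed by the source)."""
    return sum(weights[name] * model_scores[name] for name in model_scores)


# Hypothetical accuracy scores f_i and per-model anomaly scores s_i.
f_scores = {"svm": 0.4, "statistical": 0.2, "time_series": 0.4}
model_scores = {"svm": 1.0, "statistical": 0.0, "time_series": 0.5}
weights = update_weights(f_scores)
s = total_anomaly_score(model_scores, weights)  # approximately 0.6
```

Under this reading, a model whose accuracy score drops after comparison with the labeled data automatically contributes less to the total anomaly score on the next round.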
In one embodiment, the warning system may be connected to the incident management platform of the data center, so that the warning system can send alarm information to the incident management platform. The alarm information may include:
1) An effect analysis, based on the labeled data set, of the assessment models based on the optional algorithms. For example, as shown in Table 1 below, assuming that the optional algorithms used by the warning system include a one-class support vector machine, a statistical algorithm, a time series mining algorithm, the DBSCAN algorithm, a regression analysis algorithm, and so on, the assessment models based on these optional algorithms can be evaluated in terms of precision, recall, F1 value, and so on.
Optional algorithm | Precision | Recall | F1 value
One-class support vector machine | 80% | 80% | 0.400
Statistical algorithm | 90% | 50% | 0.321
Time series mining algorithm | 80% | 85% | 0.412
DBSCAN algorithm | 75% | 75% | 0.375
Regression analysis algorithm | 85% | 70% | 0.384
Table 1
2) detail information of each warping apparatus.For example, the detail information may include device id, wind shown in the following table 2 Danger scoring;In addition, the detail information can also refer to including device IP, the timestamp occurred extremely, the performance with abnormal behaviour Other kinds of information, this specification such as mark are not limited to this.
Table 2
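The F1 column of Table 1 can be checked against its precision and recall columns. The listed values match P*R/(P+R), which is half of the conventional F1 = 2*P*R/(P+R); the sketch below reproduces them under that reading. The function names are hypothetical, and the observation is an arithmetic check only, not a statement about how the values were actually produced.

```python
def listed_score(p, r):
    """P*R/(P+R): the quantity that matches the F1 column of Table 1."""
    return p * r / (p + r)


def conventional_f1(p, r):
    """The conventional F1 value, 2*P*R/(P+R)."""
    return 2 * p * r / (p + r)


# (precision, recall, F1 value as listed in Table 1)
table_1 = {
    "One-class support vector machine": (0.80, 0.80, 0.400),
    "Statistical algorithm": (0.90, 0.50, 0.321),
    "Time series mining algorithm": (0.80, 0.85, 0.412),
    "DBSCAN algorithm": (0.75, 0.75, 0.375),
    "Regression analysis algorithm": (0.85, 0.70, 0.384),
}
reproduced = {name: round(listed_score(p, r), 3)
              for name, (p, r, _) in table_1.items()}
```

For the first row, for instance, the conventional F1 would be 0.8 rather than 0.400; the ranking of the algorithms is the same under either definition.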
In conclusion by the way that the cycle of operation of data center is divided into stable region and range of instability in this specification Between, and anomaly assessment is carried out for the real data of the performance indicator in different sections respectively, it can be in data center The unusual condition of every equipment provides risk score, can be obviously improved compared to technical solution in the related technology in data The abnormality detection efficiency of the heart.Meanwhile it being directed to multiple combinations algorithm by using during anomaly assessment, and be based on reference numerals Sustained improvement is carried out according to the assessment models based on these algorithms, can further promote technical side employed in this specification The performance of case.
Fig. 7 is a schematic structural diagram of an electronic device provided by an exemplary embodiment. Referring to Fig. 7, at the hardware level the electronic device includes a processor 702, an internal bus 704, a network interface 706, a memory 708, and a non-volatile storage 710, and may of course also include hardware required by other services. The processor 702 reads the corresponding computer program from the non-volatile storage 710 into the memory 708 and then runs it, forming the abnormal device detection apparatus at the logical level. Of course, besides a software implementation, one or more embodiments of this specification do not exclude other implementations, such as logic devices or a combination of software and hardware; that is, the executing subject of the following processing flow is not limited to logic units and may also be hardware or a logic device.
Referring to Fig. 8, in a software implementation, the abnormal device detection apparatus may include:
An acquiring unit 81, which obtains the real data of a performance indicator of a monitored system;
A determination unit 82, which determines, in the real data, first real data located in the stable region of the time series and second real data located in the unstable region of the time series;
An assessment unit 83, which assesses the first real data to obtain a first assessment result and assesses the second real data to obtain a second assessment result;
A recognition unit 84, which determines the abnormal devices in the monitored system according to the first assessment result and the second assessment result.
Optionally, the operating cycle of the monitored system is divided into several periods in the time series; when the historical data of the performance indicator for the same period is in a stable state in all of multiple selected operating cycles, that period is determined to belong to the stable region; when the historical data for the same period is in an unstable state in at least one selected operating cycle, that period is determined to belong to the unstable region.
Optionally, the periods in the unstable state include at least one of the following: peak periods and noise periods; the periods in the stable state include the other periods of the operating cycle apart from the periods in the unstable state.
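The rule for assigning periods to the two regions can be sketched as follows. The sketch assumes that each period of each selected operating cycle has already been judged stable or unstable by some other means; the function name `classify_periods` and the sample history are hypothetical.

```python
def classify_periods(history):
    """history[c][p] is True when period p of selected cycle c was stable.

    A period belongs to the stable region only if it was stable in every
    selected cycle; one unstable cycle is enough to make it unstable.
    """
    n_periods = len(history[0])
    return [
        "stable" if all(cycle[p] for cycle in history) else "unstable"
        for p in range(n_periods)
    ]


# Hypothetical history: three selected operating cycles, four periods each.
history = [
    [True, True, False, True],
    [True, True, False, True],
    [True, False, False, True],
]
regions = classify_periods(history)
```

Here the second period is unstable in only one of the three cycles, yet it is still assigned to the unstable region, as the rule requires.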
Optionally, the stable region and the unstable region are determined from the operating cycle by a time-series pattern discovery algorithm.
Optionally, the first real data and the second real data are assessed using an unsupervised algorithm.
Optionally,
The first real data is assessed by at least one of the following optional algorithms: a time series mining algorithm, a clustering algorithm, a statistical learning algorithm, a regression analysis algorithm;
The second real data is assessed by at least one of the following optional algorithms: a clustering algorithm, a statistical learning algorithm, a regression analysis algorithm.
Optionally,
When the first real data is assessed using multiple optional algorithms, each optional algorithm has a corresponding weight value, and the assessment result for the first real data includes a first weighted evaluation score obtained by weighting the evaluation scores produced by the respective optional algorithms;
When the second real data is assessed using multiple optional algorithms, each optional algorithm has a corresponding weight value, and the assessment result for the second real data includes a second weighted evaluation score obtained by weighting the evaluation scores produced by the respective optional algorithms.
Optionally, the apparatus further includes:
A comparing unit 85, which compares the assessment result with labeled data to analyze the assessment accuracy of each optional algorithm used to assess the first real data or the second real data;
An improving unit 86, which improves the corresponding optional algorithm according to the assessment accuracy.
Optionally, the apparatus further includes:
An analysis unit 87, which performs correlation analysis on the performance indicators of the monitored system;
A screening unit 88, which, when the real data of multiple correlated performance indicators has been acquired, screens out the real data of at least one of the correlated performance indicators.
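The correlation analysis and screening performed by units 87 and 88 could be approximated as in the sketch below. The use of the Pearson coefficient, the 0.95 cutoff, the greedy keep-first strategy, and all names and sample data are assumptions made for illustration; the series are assumed to be non-constant.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long series."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def screen_indicators(indicators, threshold=0.95):
    """Keep an indicator only if it is not strongly correlated with a kept one."""
    kept = []
    for name, series in indicators.items():
        if all(abs(pearson(series, indicators[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical real data of three performance indicators.
indicators = {
    "cpu": [10, 20, 30, 40, 50],
    "load": [1.0, 2.0, 3.0, 4.0, 5.1],  # nearly proportional to cpu
    "disk": [5, 3, 8, 1, 7],
}
kept = screen_indicators(indicators)
```

Screening out `load` avoids assessing two indicators that carry almost the same information, which reduces the amount of real data the later steps must process.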
The systems, apparatuses, modules, or units described in the above embodiments may be implemented by a computer chip or entity, or by a product with a certain function. A typical implementation device is a computer, which may take the form of a personal computer, laptop computer, cellular phone, camera phone, smartphone, personal digital assistant, media player, navigation device, e-mail transceiver, game console, tablet computer, wearable device, or a combination of any of these devices.
In a typical configuration, a computer includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include volatile memory in a computer-readable medium, random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, which can store information by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic disk storage, quantum memory, graphene-based storage media or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media (transitory media), such as modulated data signals and carrier waves.
It should also be noted that the terms "include", "comprise", or any other variant thereof are intended to cover a non-exclusive inclusion, so that a process, method, commodity, or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, commodity, or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the existence of other identical elements in the process, method, commodity, or device that includes the element.
Specific embodiments of this specification have been described above. Other embodiments are within the scope of the appended claims. In some cases, the actions or steps recited in the claims may be performed in an order different from that in the embodiments and still achieve the desired result. In addition, the processes depicted in the drawings do not necessarily require the particular order shown, or a sequential order, to achieve the desired result. In some embodiments, multitasking and parallel processing are also possible or may be advantageous.
The terms used in one or more embodiments of this specification are for the purpose of describing particular embodiments only and are not intended to limit one or more embodiments of this specification. The singular forms "a", "said", and "the" used in one or more embodiments of this specification and the appended claims are also intended to include the plural forms, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and includes any or all possible combinations of one or more of the associated listed items.
It should be understood that although the terms first, second, third, and so on may be used in one or more embodiments of this specification to describe various information, the information should not be limited by these terms. These terms are only used to distinguish information of the same type from one another. For example, without departing from the scope of one or more embodiments of this specification, first information may also be called second information, and similarly, second information may also be called first information. Depending on the context, the word "if" as used herein may be interpreted as "when", "upon", or "in response to determining".
The above are only preferred embodiments of one or more embodiments of this specification and are not intended to limit them; any modification, equivalent replacement, improvement, and so on made within the spirit and principles of one or more embodiments of this specification shall fall within the scope of protection of one or more embodiments of this specification.

Claims (18)

1. A method for detecting an abnormal device, characterized by comprising:
obtaining real data of a performance indicator of a monitored system;
determining, in the real data, first real data located in a stable region of a time series and second real data located in an unstable region of the time series;
assessing the first real data to obtain a first assessment result, and assessing the second real data to obtain a second assessment result;
determining the abnormal device in the monitored system according to the first assessment result and the second assessment result.
2. The method according to claim 1, characterized in that the operating cycle of the monitored system is divided into several periods in the time series; when the historical data of the performance indicator for a same period is in a stable state in all of multiple selected operating cycles, the same period is determined to belong to the stable region; when the historical data for the same period is in an unstable state in at least one selected operating cycle, the same period is determined to belong to the unstable region.
3. The method according to claim 2, characterized in that the periods in the unstable state include at least one of the following: peak periods, noise periods; the periods in the stable state include the other periods of the operating cycle apart from the periods in the unstable state.
4. The method according to claim 2, characterized in that the stable region and the unstable region are determined from the operating cycle by a time-series pattern discovery algorithm.
5. The method according to claim 1, characterized in that the first real data and the second real data are assessed using an unsupervised algorithm.
6. The method according to claim 1, characterized in that
the first real data is assessed by at least one of the following optional algorithms: a time series mining algorithm, a clustering algorithm, a statistical learning algorithm, a regression analysis algorithm;
the second real data is assessed by at least one of the following optional algorithms: a clustering algorithm, a statistical learning algorithm, a regression analysis algorithm.
7. The method according to claim 1, characterized in that
when the first real data is assessed using multiple optional algorithms, each optional algorithm has a corresponding weight value, and the assessment result for the first real data includes a first weighted evaluation score obtained by weighting the evaluation scores produced by the respective optional algorithms;
when the second real data is assessed using multiple optional algorithms, each optional algorithm has a corresponding weight value, and the assessment result for the second real data includes a second weighted evaluation score obtained by weighting the evaluation scores produced by the respective optional algorithms.
8. The method according to claim 7, characterized by further comprising:
comparing the assessment result with labeled data to analyze the assessment accuracy of each optional algorithm used to assess the first real data or the second real data;
improving the corresponding optional algorithm according to the assessment accuracy.
9. The method according to claim 1, characterized by further comprising:
performing correlation analysis on the performance indicators of the monitored system;
when the real data of multiple correlated performance indicators has been acquired, screening out the real data of at least one of the correlated performance indicators.
10. An apparatus for detecting an abnormal device, characterized by comprising:
an acquiring unit, which obtains real data of a performance indicator of a monitored system;
a determination unit, which determines, in the real data, first real data located in a stable region of a time series and second real data located in an unstable region of the time series;
an assessment unit, which assesses the first real data to obtain a first assessment result and assesses the second real data to obtain a second assessment result;
a recognition unit, which determines the abnormal device in the monitored system according to the first assessment result and the second assessment result.
11. The apparatus according to claim 10, characterized in that the operating cycle of the monitored system is divided into several periods in the time series; when the historical data of the performance indicator for a same period is in a stable state in all of multiple selected operating cycles, the same period is determined to belong to the stable region; when the historical data for the same period is in an unstable state in at least one selected operating cycle, the same period is determined to belong to the unstable region.
12. The apparatus according to claim 11, characterized in that the periods in the unstable state include at least one of the following: peak periods, noise periods; the periods in the stable state include the other periods of the operating cycle apart from the periods in the unstable state.
13. The apparatus according to claim 11, characterized in that the stable region and the unstable region are determined from the operating cycle by a time-series pattern discovery algorithm.
14. The apparatus according to claim 10, characterized in that the first real data and the second real data are assessed using an unsupervised algorithm.
15. The apparatus according to claim 10, characterized in that
the first real data is assessed by at least one of the following optional algorithms: a time series mining algorithm, a clustering algorithm, a statistical learning algorithm, a regression analysis algorithm;
the second real data is assessed by at least one of the following optional algorithms: a clustering algorithm, a statistical learning algorithm, a regression analysis algorithm.
16. The apparatus according to claim 10, characterized in that
when the first real data is assessed using multiple optional algorithms, each optional algorithm has a corresponding weight value, and the assessment result for the first real data includes a first weighted evaluation score obtained by weighting the evaluation scores produced by the respective optional algorithms;
when the second real data is assessed using multiple optional algorithms, each optional algorithm has a corresponding weight value, and the assessment result for the second real data includes a second weighted evaluation score obtained by weighting the evaluation scores produced by the respective optional algorithms.
17. The apparatus according to claim 16, characterized by further comprising:
a comparing unit, which compares the assessment result with labeled data to analyze the assessment accuracy of each optional algorithm used to assess the first real data or the second real data;
an improving unit, which improves the corresponding optional algorithm according to the assessment accuracy.
18. The apparatus according to claim 10, characterized by further comprising:
an analysis unit, which performs correlation analysis on the performance indicators of the monitored system;
a screening unit, which, when the real data of multiple correlated performance indicators has been acquired, screens out the real data of at least one of the correlated performance indicators.
CN201711455271.6A 2017-12-28 2017-12-28 Abnormal equipment detection method and device Active CN109976986B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711455271.6A CN109976986B (en) 2017-12-28 2017-12-28 Abnormal equipment detection method and device

Publications (2)

Publication Number Publication Date
CN109976986A true CN109976986A (en) 2019-07-05
CN109976986B CN109976986B (en) 2023-12-19

Family

ID=67074171

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711455271.6A Active CN109976986B (en) 2017-12-28 2017-12-28 Abnormal equipment detection method and device

Country Status (1)

Country Link
CN (1) CN109976986B (en)

Also Published As

Publication number Publication date
CN109976986B (en) 2023-12-19

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant