CN103178990A - A network equipment performance monitoring method and network management system - Google Patents

Info

Publication number
CN103178990A
Authority
CN
China
Prior art keywords
time window
index data
performance index
kpi performance
kpi
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN2011104303495A
Other languages
Chinese (zh)
Inventor
单建业
刘武升
王明昭
石国章
刘涛
赵浩然
邓小红
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Mobile Communications Group Co Ltd
China Mobile Group Qinghai Co Ltd
Original Assignee
China Mobile Communications Group Co Ltd
China Mobile Group Qinghai Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Mobile Communications Group Co Ltd, China Mobile Group Qinghai Co Ltd
Priority to CN2011104303495A priority Critical patent/CN103178990A/en
Publication of CN103178990A publication Critical patent/CN103178990A/en
Pending legal-status Critical Current

Landscapes

  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention discloses a network device performance monitoring method and a network management system. In the method, the network management system periodically collects key performance indicator (KPI) data from a network device; aggregates the collected KPI data by configured time windows to obtain a KPI summary value for each time window; uses a linear regression model to predict, from the KPI summary values of the N time windows preceding the current time window, the KPI data of the next N time windows, including the current one; determines whether the predicted KPI data of those N future time windows exceed a configured alarm threshold; and raises an alarm if they do. The method and system thereby provide early warning based on historical KPI data.

Description

A network equipment performance monitoring method and network management system

Technical Field

The invention relates to network management technology in the communications field, and in particular to a network device performance monitoring method and a network management system.

Background

As the telecom industry has grown, business support systems have become increasingly important for customer service, service provisioning, service assurance, billing and accounting, and forecasting and planning. Driven by market demand and a steady stream of new services, these systems have kept evolving alongside the companies they serve. The basic node that keeps such a system running is the host, and host security problems are becoming increasingly serious.

Traditional host performance monitoring sets thresholds on the KPIs of the monitored host. (KPI stands for Key Performance Indicator; examples include CPU usage, memory usage, I/O throughput, disk usage, and database usage.) When the system observes a KPI exceeding its threshold, it raises an alarm. The advantage of this approach is that it is simple to implement.

Long-term operations and maintenance practice shows that this existing approach has clear limitations, mainly in the following respects:

(1) Threshold-based monitoring ignores the trend of the system's KPIs. If a KPI changes abruptly during operation but has not yet reached the configured threshold, no alarm is generated, even though the change already deserves the attention of the maintenance staff, who need to intervene proactively to keep the KPI from rising further.

(2) Threshold-based monitoring is passive: it raises an alarm only after a condition has occurred. By then normal operation of the system may already be affected, so it cannot serve as proactive, preventive monitoring.

In short, traditional host performance monitoring is threshold-based, and by the time a KPI exceeds its threshold the system is often already overloaded. An alarm raised at that point is difficult for the customer to handle, and the operational risk is relatively high.

Summary of the Invention

Embodiments of the present invention provide a network device performance monitoring method and a network management system for issuing early warnings based on historical KPI data.

The network device performance monitoring method provided by an embodiment of the present invention includes:

the network management system periodically collecting KPI data from a network device;

the network management system aggregating the collected KPI data by configured time windows to obtain a KPI summary value for each time window;

the network management system using a linear regression model to predict, from the KPI summary values of the N time windows preceding the current time window, the KPI data of the next N time windows, including the current time window; and

the network management system determining whether the KPI data of the N future time windows exceed a configured alarm threshold, and raising an alarm if so.

The network device performance monitoring apparatus provided by an embodiment of the present invention includes:

a collection module, configured to periodically collect KPI data from a network device;

an aggregation module, configured to aggregate the collected KPI data by configured time windows to obtain a KPI summary value for each time window;

a prediction module, configured to use a linear regression model to predict, from the KPI summary values of the N time windows preceding the current time window, the KPI data of the next N time windows, including the current time window; and

an alarm module, configured to determine whether the KPI data of the N future time windows exceed a configured alarm threshold, and to raise an alarm if so.

In the above embodiments, the collected KPI data are aggregated by configured time windows, and a linear regression model predicts the KPI data of the next N time windows, including the current one, from the KPI summary values of the N preceding time windows. Historical KPI data are thus used to predict future KPI data, and alarms are raised on the predictions, so that a warning is issued before a problem actually occurs and the system can be analyzed and dealt with in advance.

Brief Description of the Drawings

FIG. 1 is a flowchart of network device performance monitoring according to an embodiment of the present invention;

FIG. 2 and FIG. 3 are schematic diagrams of medium-to-long-term KPI data prediction in an embodiment of the present invention;

FIG. 4 and FIG. 5 are schematic diagrams of long-term KPI data prediction in an embodiment of the present invention;

FIG. 6 is a schematic structural diagram of the network management system according to an embodiment of the present invention.

Detailed Description

To address the problems of the prior art, embodiments of the present invention predict system resource usage over a coming period from the monitored KPI data, and perform proactive monitoring and alarming on the predictions, so that a warning is issued before a problem actually occurs and the system can be analyzed and dealt with in advance.

The KPI data referred to here may include any one, or any combination, of the parameters that characterize the performance of a network device or business system, such as CPU usage, memory usage, and network throughput.

Embodiments of the present invention are described in detail below with reference to the accompanying drawings.

FIG. 1 is a flowchart of network device performance monitoring according to an embodiment of the present invention. The process can be implemented by a network management system and may include the following steps.

Step 101: periodically collect KPI data from the network device.

In a specific implementation, the various KPI data can be collected from a traditional monitoring system.

Step 102: aggregate the collected KPI data by configured time windows to obtain a KPI summary value for each time window.

In a specific implementation, the peak KPI value collected within a time window can be used as that window's KPI summary value. Where a time window contains several data statistics periods, the peak KPI value of each statistics period is taken, and the average of those per-period peaks is used as the window's KPI summary value.
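The two aggregation options in step 102 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function names and the sample values are hypothetical.

```python
from statistics import mean


def window_peak(samples):
    """Summary value for a window treated as a single period: its peak sample."""
    return max(samples)


def window_mean_of_peaks(periods):
    """Summary value for a window that spans several statistics periods:
    take the peak of each period, then average those peaks."""
    return mean(max(period) for period in periods)


# Hypothetical CPU-usage samples (%), not taken from the patent.
single_period = [61.0, 72.5, 68.3]
three_periods = [[60.0, 70.0], [65.0, 80.0], [55.0, 75.0]]

print(window_peak(single_period))           # 72.5
print(window_mean_of_peaks(three_periods))  # 75.0
```

Using the peak rather than the mean of the raw samples keeps short spikes visible in the summary value, which matters when the goal is to warn before a threshold is crossed.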

Step 103: using a linear regression model, predict the KPI data of the next N time windows, including the current time window, from the KPI summary values of the N time windows preceding the current time window. The N time windows here are N consecutive time windows.

Step 104: output the predicted KPI data of the N future time windows.

The process further includes:

Step 105: determine whether the KPI data of the N future time windows satisfy the alarm condition, and raise an alarm if so. There is no strict ordering requirement between this step and step 104.

In a specific implementation, if the KPI data collected in step 101 are of several types, for example CPU usage and memory usage, then in this step the predicted CPU usage of the N future time windows is checked against the CPU usage alarm condition, the predicted memory usage of the N future time windows is checked against the memory usage alarm condition, and alarms are raised according to the results. Whether to raise an alarm under an alarm condition can be decided in the following ways:

Mode 1: if any value in the KPI data of the N future time windows exceeds the threshold of that KPI (a preset fixed value), an alarm is raised.

Mode 2: the decision is made as in Mode 1, but the KPI threshold is obtained by dynamically adjusting a preset threshold according to the KPI data of the N future time windows and the slope of the linear regression model fitted to the KPI summary values of the N time windows preceding the current time window.

Mode 3: if any value in the KPI data of the N future time windows exceeds the KPI threshold (which may be preset or dynamically adjusted as in Mode 2), and the slope of the linear regression model satisfies a configured condition, an alarm is raised.

Mode 4: a combination of the above modes.
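Modes 1 and 3 can be sketched as two small predicates. The function names are hypothetical, and the slope condition in Mode 3 is assumed here to be "slope greater than a configured minimum"; the text only says the slope must satisfy a configured condition. Mode 2 is omitted because the text does not specify the adjustment formula.

```python
def mode1_alarm(predictions, threshold):
    """Mode 1: alarm if any predicted value exceeds a fixed threshold."""
    return any(value > threshold for value in predictions)


def mode3_alarm(predictions, threshold, slope, min_slope):
    """Mode 3: alarm only if a prediction exceeds the threshold AND the
    regression slope meets the condition (assumed: slope > min_slope)."""
    return mode1_alarm(predictions, threshold) and slope > min_slope


# Hypothetical predicted CPU usage (%) for four future windows.
predicted = [78.5, 79.3, 80.0, 80.8]

print(mode1_alarm(predicted, 80.0))             # True: 80.8 exceeds 80.0
print(mode3_alarm(predicted, 80.0, 0.76, 0.5))  # True: steep enough
print(mode3_alarm(predicted, 80.0, 0.20, 0.5))  # False: trend too flat
```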

Depending on the length of the time window, embodiments of the present invention can provide near-real-time, medium-to-long-term, and long-term predictive monitoring and alarming.

Near-real-time predictive monitoring has a short prediction cycle and can predict the performance of a network device promptly so that problems are found as early as possible. Its time window length can be set between 10 and 20 minutes, for example 10, 15, or 20 minutes, with 15 minutes preferred; the window length can also be set per KPI type. The 10-to-20-minute range is an empirical value drawn from long-term operations and maintenance work and, in repeated tests, gave the best prediction results.

In the near-real-time process, when computing the KPI summary values of the N time windows preceding the current time window, the peak KPI value collected within each time window can be used as that window's KPI summary value.

To predict the KPI data of the N future time windows, a one-variable linear regression equation is fitted to the KPI summary values of the N time windows preceding the current time window, with the time window index as the independent variable and the KPI summary value as the dependent variable; the KPI data of the N future time windows are then predicted from that equation.

When determining whether the KPI data of the N future time windows satisfy the alarm condition of the corresponding KPI, the KPI threshold used can be obtained by adjusting the preset KPI threshold according to the KPI data of the N future time windows and the slope of the linear regression model fitted to the KPI summary values of the N time windows preceding the current time window. A KPI early-warning decision usually needs to consider both the threshold and the slope. For example, the following alarm rule can be configured:

(VALUE>90) OR (VALUE>80 AND SLOPE>0.5)

This rule means that an alarm is raised if the predicted value exceeds 90% (threshold 1), or if the predicted value exceeds 80% (threshold 2) and the trend slope is greater than 0.5.
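The rule (VALUE>90) OR (VALUE>80 AND SLOPE>0.5) translates directly into a boolean function. The function and parameter names below are hypothetical; the three default values are the ones given in the rule.

```python
def should_alarm(value, slope, hard_limit=90.0, soft_limit=80.0, slope_limit=0.5):
    """Evaluate (VALUE > 90) OR (VALUE > 80 AND SLOPE > 0.5).

    value: predicted KPI value in percent.
    slope: slope of the fitted regression line.
    """
    return value > hard_limit or (value > soft_limit and slope > slope_limit)


print(should_alarm(91.0, 0.0))  # True: above threshold 1 regardless of slope
print(should_alarm(85.0, 0.6))  # True: above threshold 2 with a steep trend
print(should_alarm(85.0, 0.3))  # False: above threshold 2 but trend too flat
print(should_alarm(79.0, 2.0))  # False: below both thresholds
```

Keeping the three limits as parameters makes it easy to tune the absolute thresholds and the slope threshold independently.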

Furthermore, these values can be adjusted and tuned continuously in use, mainly according to the following principles:

(1) If the original monitoring system generated an alarm but this early-warning system did not warn in advance, analyze the KPI data preceding the alarm and lower the absolute threshold or the slope threshold as appropriate.

(2) If many early warnings are generated and most are false alarms, raise the absolute threshold or the slope threshold.

(3) Ideally, for 80% of the alarms generated by the original monitoring system, this system issues an early warning in the 1 to 2 hours before the alarm, and the accuracy of the early warnings is 70% or higher.

Medium-to-long-term predictive monitoring has a moderate prediction cycle: it predicts the performance of a network device in good time without predicting and alarming as frequently as near-real-time monitoring does. Its time window length can be set to one or several days, for example 1, 2, or 5 days, with 1 day preferred; the window length can also be set per KPI type. The 1-day length is an empirical value drawn from long-term operations and maintenance work and, in repeated tests, gave the best prediction results.

In the medium-to-long-term process, when computing the KPI summary values of the N time windows preceding the current time window, the peak KPI value of each data statistics period within a time window is taken, and the average of those per-period peaks is used as the window's KPI summary value.

To predict the KPI data of the N future time windows, the following steps can be used. First, within the N time windows preceding the current time window, subtract from each window's KPI summary value the summary value of the window before it, giving an array of N-1 increments. Second, fit a linear regression model to that array to obtain the slope of the N-1 increments (that is, the slope coefficient of the one-variable linear regression equation obtained by taking the element index in the array as the dependent variable and the corresponding element value as the independent variable). Third, using the linear regression model and that slope, compute for each of the N future time windows, including the current one, the increment of its KPI data relative to the preceding window. Finally, add each future window's increment to the predicted KPI value of the window before it to obtain that window's predicted KPI value; for the first of the N future windows, the predicted value is its increment plus the KPI summary value of the window before it.
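The increment-based prediction can be sketched as below. Two points are assumptions rather than statements from the text: the sketch regresses each increment on its index (index on the x axis, as in the other regressions in this description), and it extrapolates future increments from the fitted line using both slope and intercept, since the text does not spell out how the future increments are derived from the slope. All names and the sample history are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict_from_increments(history):
    """Predict len(history) future values from window-to-window increments."""
    n = len(history)
    # N-1 increments: each summary value minus the one before it.
    increments = [history[i] - history[i - 1] for i in range(1, n)]
    indices = list(range(1, n))
    slope, intercept = fit_line(indices, increments)

    predictions = []
    previous = history[-1]  # the first prediction builds on the last summary value
    for k in range(n, 2 * n):  # indices continuing the increment series
        previous += slope * k + intercept
        predictions.append(previous)
    return slope, predictions


# Hypothetical daily summary values (%); increments are 1, 2, 3, 4.
slope, future = predict_from_increments([50.0, 51.0, 53.0, 56.0, 60.0])
print(slope)   # slope of the increments
print(future)  # five predicted values, one per future window
```

Because each prediction is the previous prediction plus an extrapolated increment, a positive slope of the increments makes the forecast curve accelerate upward.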

When determining whether the KPI data of the N future time windows satisfy the alarm condition of the corresponding KPI, if the slope of the increments is greater than 0 (indicating a steep upward trend), an alarm is raised or the predicted data are flagged, so as to draw the network administrator's attention.

Long-term predictive monitoring has a long prediction cycle and can predict the performance of a network device over a longer period ahead, so that a handling strategy can be chosen according to the KPI trend. Its time window length can be set to one or several months, with 1 month preferred; the window length can also be set per KPI type. The 1-month length is an empirical value drawn from long-term operations and maintenance work and, in repeated tests, gave the best prediction results.

In the long-term process, when computing the KPI summary values of the N time windows preceding the current time window, the peak KPI value of each data statistics period within a time window is taken, and the average of those per-period peaks is used as the window's KPI summary value.

To predict the KPI data of the N future time windows, a one-variable linear regression equation is fitted to the KPI summary values of the N time windows preceding the current time window, with the time window index as the independent variable and the KPI summary value as the dependent variable; the KPI data of the N future time windows are then predicted from that equation.

Long-term predictive monitoring does not need an alarm mechanism, because it is mainly used to view long-term performance trends and can serve as a reference for capacity analysis. For example, if CPU usage stays high for a long time and is slowly rising, hardware expansion may be worth considering.

The implementation of near-real-time, medium-to-long-term, and long-term predictive monitoring and alarming is described below with specific examples.

Example 1: near-real-time predictive monitoring and alarming

Taking monitoring of a network device's CPU usage with a 15-minute time window as an example, the near-real-time process may include the following.

Data are collected from the traditional monitoring system every 5 to 15 minutes, for example every 5, 10, or 15 minutes. Each time window is 15 minutes, so one or more CPU usage samples of a network device are collected in each window, and the maximum CPU usage collected within a window is used as that window's CPU usage summary value. Then 8 time windows are taken going back from the current time window, that is, the CPU usage of the previous 2 hours. The CPU usage summary values of these 8 windows are shown in Table 1.

Table 1

Time     CPU usage (%)
20:00    73.2
20:15    72.1
20:30    74.3
20:45    74.6
21:00    75.1
21:15    75.8
21:30    78.3
21:45    77.2

The CPU usage summary values of these 8 time windows are used below to predict the CPU usage of the next 8 time windows.

The CPU usage of the 8 time windows is arranged as independent variable X and dependent variable Y, as shown in Table 2:

Table 2

X    Y
1    73.2
2    72.1
3    74.3
4    74.6
5    75.1
6    75.8
7    78.3
8    77.2

The linear regression formula is y=a*x+b. Before predicting, the values of a and b are computed from Table 2. This embodiment uses Weka's LinearRegression model for the computation; the following two Excel functions can be used instead:

a=SLOPE(dependent variable array, independent variable array)

b=INTERCEPT(dependent variable array, independent variable array)

Applying the linear regression model to the data in Table 2 gives b=71.65 and a=0.7619.
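The coefficients can be reproduced with a plain least-squares fit. The sketch below uses the closed-form formulas rather than Weka or Excel; `fit_line` is a hypothetical helper name, and the data are the Table 2 values.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a * x + b; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b


# Table 2: window index X and CPU usage summary value Y (%).
x = list(range(1, 9))
y = [73.2, 72.1, 74.3, 74.6, 75.1, 75.8, 78.3, 77.2]

a, b = fit_line(x, y)
print(round(a, 4), round(b, 2))  # 0.7619 71.65
```

The slope is the sum of cross-deviations (32.0) divided by the sum of squared deviations of X (42.0), and the intercept follows from the means of X (4.5) and Y (75.075).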

The CPU usage of the next 8 time windows can then be predicted from the formula y=a*x+b. Specifically, the CPU usage of the 9th time window is:

Y9=a*x+b=0.7619*9+71.65≈78.50 (approximate value, to 2 decimal places)

The CPU usage of the 10th time window is:

Y10=a*x+b=0.7619*10+71.65≈79.27 (approximate value, to 2 decimal places)

Continuing in the same way gives the predicted CPU usage of the 9th through 18th time windows, shown in Table 3:

Table 3

X     Y
9     78.50
10    79.27
11    80.03
12    80.79
13    81.55
14    82.31
15    83.08
16    83.84
17    84.60
18    85.36
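The forecast values can be regenerated by fitting the Table 2 data and evaluating the line at window indices 9 through 18. This sketch keeps the coefficients unrounded during the calculation, which is an assumption about how the table was produced; names are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a * x + b; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, mean_y - a * mean_x


# Table 2 data: the 8 observed windows.
observed = [73.2, 72.1, 74.3, 74.6, 75.1, 75.8, 78.3, 77.2]
a, b = fit_line(list(range(1, 9)), observed)

# Evaluate the fitted line at the future window indices.
forecast = {x: a * x + b for x in range(9, 19)}
for x, value in forecast.items():
    print(x, f"{value:.2f}")
```

Each step adds the slope (about 0.76 percentage points) to the previous value, so the forecast rises by roughly 3 percentage points per hour at a 15-minute window length.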

Whether the CPU usage threshold needs adjusting is decided according to a preset CPU usage threshold adjustment policy. If it does, the preset threshold is adjusted, and whether to raise an alarm is decided from the adjusted threshold and the predicted data in Table 3.

For example, the preset CPU usage threshold in this process is 80%. The linear regression model y=0.7619*x+71.65 has a slope of 0.7619. This value is relatively large, indicating a clear upward trend in CPU usage; with a 15-minute time window, usage will reach the preset threshold fairly soon and will keep rising fairly quickly. The predicted values in Table 3 further show that most of them already exceed 80% and approach 85%. In this case the CPU usage threshold need not be adjusted: the prediction shows CPU usage rising quickly, so an alarm should be raised promptly to prompt the network administrator to act.
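The alarm decision for this example can be sketched by checking the Table 3 predictions against the 80% threshold. The variable names are hypothetical; the values are copied from Table 3.

```python
# Predicted CPU usage (%) per window index, from Table 3.
TABLE_3 = {
    9: 78.50, 10: 79.27, 11: 80.03, 12: 80.79, 13: 81.55,
    14: 82.31, 15: 83.08, 16: 83.84, 17: 84.60, 18: 85.36,
}
THRESHOLD = 80.0  # preset CPU usage threshold (%)

# Windows whose predicted value exceeds the threshold, in time order.
exceeding = [x for x, value in sorted(TABLE_3.items()) if value > THRESHOLD]
first_crossing = exceeding[0] if exceeding else None

print(exceeding)        # [11, 12, 13, 14, 15, 16, 17, 18]
print(first_crossing)   # 11
print(len(exceeding), "of", len(TABLE_3), "predicted windows exceed the threshold")
```

Window 11 is the third window after the last observed one (window 8), so with 15-minute windows the first predicted crossing lies about 45 minutes ahead.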

实例二:中长期预前监控告警Example 2: Medium and long-term pre-monitoring and warning

以监控网络设备的CPU使用率、时间窗口长度为1天为例,其中长期预前监控告警流程可包括:Taking the monitoring of the CPU usage of network devices and the time window length as one day as an example, the long-term pre-monitoring and alarm process may include:

按照设定的数据采集周期从传统监控系统中采集一次CPU使用率,例如每5分钟、10分钟或15分钟采集一次,每个小时统计该时间段内采集到的CPU使用率的最大值,将1天之内每个小时统计到的CPU使用率峰值进行平均,得到该天的CPU使用率的峰值均值。Collect the CPU usage rate from the traditional monitoring system according to the set data collection cycle, for example, collect once every 5 minutes, 10 minutes or 15 minutes, and count the maximum value of the CPU usage rate collected in this time period every hour, and set The peak value of the CPU usage counted every hour within a day is averaged to obtain the peak value of the CPU usage of the day.

Then, counting back from the current time window, the mean peak CPU usage of 15 time windows is taken, that is, the mean peak CPU usage of each of the previous 15 days. These 15 values may be as shown in Figure 2, which shows the daily mean peak CPU usage from May 16 to May 30, 2009.

Next, within the mean peak CPU usage values of the previous 15 days, each day's value minus the previous day's value gives an array of 14 increments. A linear regression model is then applied to this array to calculate the regression slope of the 14 increments. Using the same linear regression model and this slope, the increment of each of the next 15 days' CPU usage relative to the preceding day is calculated.

Each of the next 15 days' CPU usage increments is then added to the previous day's predicted CPU usage to obtain that day's predicted CPU usage. For the first of the 15 future days, the predicted CPU usage is the sum of that day's increment and the previous day's mean peak CPU usage.

Following this prediction process, the predicted CPU usage for May 31 to June 14, 2009 is obtained, as shown in Figure 2.

Predicting KPI data from the slope of the slope amplifies the trend in the KPI data, with the aim of drawing the attention of monitoring personnel.
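The increment-based prediction described in this example can be sketched as follows. The text does not spell out how future increments are derived from the slope, so this sketch assumes that a least-squares line fitted to the observed increments is simply extended to later indices. The function names `fit_line` and `forecast_from_increments` and the sample data are hypothetical.

```python
def fit_line(ys):
    """Ordinary least-squares fit of ys against x = 0, 1, ..., len(ys) - 1.

    Returns (slope, intercept). Requires at least two points.
    """
    n = len(ys)
    x_mean = (n - 1) / 2
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in range(n))
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(ys))
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean


def forecast_from_increments(history, horizon):
    """Forecast `horizon` future windows from window-to-window increments."""
    # N aggregate values give N-1 increments.
    increments = [b - a for a, b in zip(history, history[1:])]
    slope, intercept = fit_line(increments)

    forecasts = []
    previous = history[-1]  # the first forecast builds on the last observed value
    for k in range(horizon):
        x = len(increments) + k  # assumption: extend the fitted line past the data
        previous = previous + slope * x + intercept
        forecasts.append(previous)
    return forecasts


# Hypothetical daily mean peak CPU usage values, in percent.
steady = forecast_from_increments([50.0, 51.0, 52.0, 53.0], horizon=3)
accelerating = forecast_from_increments([0.0, 1.0, 3.0, 6.0], horizon=2)
print(steady, accelerating)
```

With constant increments the forecast continues the same straight line; with growing increments the forecast curves upward, which is the amplifying effect the text attributes to the slope of the slope.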

Figure 3 shows the actual CPU usage from May 31 to June 14. Compared with the earlier prediction, actual usage did not exceed 75%, yet it rose markedly relative to the previous 15 days. Although the monitoring alarm threshold was not reached, the CPU usage of this machine is rising quickly, so monitoring personnel should pay close attention and investigate the cause of the rapid rise as soon as possible, before it affects system operation.

Example 3: Long-term predictive monitoring and alarming

Taking monitoring of a network device's CPU usage with a time window of 1 month as an example, the long-term predictive monitoring and alarm flow may include:

CPU usage is collected from the conventional monitoring system at a set collection period, for example every 5, 10, or 15 minutes. For each hour, the maximum CPU usage collected within that hour is recorded. The hourly CPU usage peaks within one day are averaged to obtain that day's mean peak CPU usage, and the daily mean peaks within one month are averaged again to obtain that month's mean peak CPU usage.

Then, counting back from the current month, the mean peak CPU usage of 3 months is taken. These values may be as shown in Figure 4, which shows the monthly mean peak CPU usage for May to July 2009.

Then, from the monthly mean peak CPU usage for May to July 2009, a linear regression algorithm is used to predict the CPU usage of the next three months.

Figure 4 shows that CPU usage over the next three months changes little compared with the preceding months and is essentially stable, so the server can be considered to meet capacity requirements for the next 3 months.

Take the CPU usage of one of a company's application servers as an example, as shown in Figure 5:

This example predicts the CPU usage of the next three months from the mean CPU usage of May, June, and July 2009. Figure 5 shows a clear downward trend in CPU usage, which will fall below 50% in the coming months; reallocating resources can be considered so that the machine is used more reasonably and effectively.

Based on the same technical concept, an embodiment of the present invention further provides a performance monitoring apparatus that can implement the above flow.

Figure 6 is a schematic structural diagram of the network management system provided by an embodiment of the present invention. The apparatus may include:

a collection module 601, configured to periodically collect KPI data of a network device;

an aggregation module 602, configured to aggregate the collected KPI data by the set time window to obtain an aggregate KPI value for each time window;

a prediction module 603, configured to use a linear regression model and, from the aggregate KPI values of the N time windows before the current time window, predict the KPI data of N future time windows including the current time window;

an alarm module 604, configured to determine whether the predicted KPI data of the N future time windows exceeds the set alarm threshold, and to issue an alarm if so.

Further, an output module 605 may be included, configured to output the predicted KPI data of the N future time windows.

Specifically, the aggregate KPI value of each time window obtained by the aggregation module 602 is either the peak KPI value collected within that time window, or the mean of the peak KPI values collected in each collection period within that time window, where the collection period is shorter than the time window.
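The second aggregation option (mean of per-period peaks) can be sketched as follows, treating one hour as the collection period and one day as the time window, as in Example 2. The function name, the `(timestamp, cpu_percent)` sample layout, and the sample values are assumptions made for illustration.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def daily_mean_of_hourly_peaks(samples):
    """Aggregate (timestamp, cpu_percent) samples into one value per day.

    Step 1: keep the maximum sample within each hour (the hourly peak).
    Step 2: average the hourly peaks of each day.
    Returns a dict mapping date -> mean of that day's hourly peaks.
    """
    hourly_peak = {}
    for ts, value in samples:
        key = (ts.date(), ts.hour)
        hourly_peak[key] = max(value, hourly_peak.get(key, value))

    peaks_by_day = defaultdict(list)
    for (day, _hour), peak in hourly_peak.items():
        peaks_by_day[day].append(peak)

    return {day: mean(peaks) for day, peaks in peaks_by_day.items()}


# Hypothetical samples: two readings in hour 0 and two in hour 1 of the same day.
samples = [
    (datetime(2009, 5, 16, 0, 0), 40.0),
    (datetime(2009, 5, 16, 0, 15), 50.0),
    (datetime(2009, 5, 16, 1, 0), 70.0),
    (datetime(2009, 5, 16, 1, 30), 60.0),
]
result = daily_mean_of_hourly_peaks(samples)
print(result)
```

The first option, the plain peak within the window, would replace the averaging step with a `max` over all samples of the day.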

Specifically, the prediction module 603 may use a linear regression model and, from the aggregate KPI values of the N time windows before the current time window, calculate the linear regression slope of those N aggregate values; then, using the same linear regression model and this slope, calculate the predicted KPI data of each of the N future time windows including the current time window. For implementation details, see the flow described above.

Specifically, the prediction module 603 may, within the N time windows before the current time window, subtract from each time window's aggregate KPI value the aggregate KPI value of the preceding time window, obtaining an array of N-1 increments; use a linear regression model to calculate, from this array, the linear regression slope of the N-1 increments; use the same linear regression model and this slope to calculate, for each of the N future time windows including the current time window, the increment of its KPI data relative to the preceding time window; and, from the increment of each of the N future time windows and the predicted KPI value of its preceding time window, obtain the predicted KPI value of each of the N future time windows. The predicted KPI value of the first of the N future time windows is the sum of its increment and the aggregate KPI value of the preceding time window. For implementation details, see the flow described above.

Specifically, the alarm module 604 is configured to issue an alarm in one of the following cases: the predicted KPI data of the N future time windows exceeds the KPI threshold; the linear regression slope of the aggregate KPI values of the N time windows before the current time window is greater than a set threshold; or that slope is greater than the set threshold and the predicted KPI data of the N future time windows also exceeds the KPI threshold.
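The three alarm conditions can be read as alternative, configurable rules. The sketch below is one such reading, not the patent's implementation: the `AlarmRule` names and the `should_alarm` signature are hypothetical, and "exceeds" is taken to mean that at least one predicted window is above the KPI threshold.

```python
from enum import Enum


class AlarmRule(Enum):
    VALUE = "value"                      # predicted value above the KPI threshold
    SLOPE = "slope"                      # regression slope above the slope threshold
    VALUE_AND_SLOPE = "value_and_slope"  # both conditions at once


def should_alarm(rule, predictions, slope, kpi_threshold, slope_threshold):
    """Return True if the configured rule calls for an alarm."""
    value_hit = any(p > kpi_threshold for p in predictions)
    slope_hit = slope > slope_threshold
    if rule is AlarmRule.VALUE:
        return value_hit
    if rule is AlarmRule.SLOPE:
        return slope_hit
    return value_hit and slope_hit


# Hypothetical inputs: predictions in percent, slope in percent per window.
print(should_alarm(AlarmRule.VALUE, [79.0, 81.0], 0.1, 80.0, 0.5))
print(should_alarm(AlarmRule.SLOPE, [79.0, 81.0], 0.1, 80.0, 0.5))
print(should_alarm(AlarmRule.VALUE_AND_SLOPE, [85.0], 0.7619, 80.0, 0.5))
```

The combined rule suits a KPI that sits above its threshold for long periods without changing, since a flat trend then suppresses the alarm.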

Specifically, the KPI threshold is obtained by adjusting a preset KPI threshold according to the predicted KPI data of the N future time windows and the linear regression slope of the aggregate KPI values of the N time windows before the current time window.

Specifically, the length of the time window is measured in minutes, in days, or in months.

As the above description shows, compared with the prior art, the embodiments of the present invention have advantages in the following respects:

(1) The embodiment of the present invention turns real-time monitoring into proactive early-warning monitoring, thereby avoiding the problems of the prior art. Conventional monitoring data is mined and organized in depth, and usage over a coming period is predicted by calculating the slope of historical data. Proactive early warning is achieved by configuring the alarm mechanism: based on the trend of the system's KPIs, when system performance changes abruptly, a warning is issued in advance so that maintenance personnel can take effective measures to prevent performance-related failures.

(2) The embodiment of the present invention supports a configurable analysis time window, which defaults to the standard 15 minutes. The 15-minute window is the best configuration according to long-term operations experience, and predictions at 15-minute intervals give the best results. The window can also be customized for different users' systems; the change is simple to make, so the approach can be widely adopted.

(3) The embodiment of the present invention provides a near-real-time prediction process. Monitoring results from production systems show that faults can typically be located 60 to 120 minutes before the system KPI data exceeds the threshold, so that alarms reach administrators in advance and problems are prevented before they occur.

(4) The embodiment of the present invention provides a medium- and long-term (15-day) prediction process, which can predict KPI trend data for roughly the next 15 days and so provides a basis for medium- and long-term system monitoring.

(5) The embodiment of the present invention provides capacity prediction (6 to 12 months). Long-term (6 to 12 month) predictions provide a valuable basis for host capacity planning of key business systems and lay a good foundation for system expansion.

(6) The embodiment of the present invention makes the monitored KPIs configurable. Hundreds of KPIs, from CPU and memory to storage and network, are available, so the monitoring scope is broad. Relying on historical monitoring data, predictive monitoring can be applied to hundreds of KPIs such as CPU, memory, and network. The monitored KPIs can also be fully customized; a good human-machine interface is provided and configuration is simple, which lays a good foundation for porting and rolling out the system.

(7) The embodiment of the present invention is simple and convenient to deploy, uses very few system resources, and is low in cost, so it is easy to deploy at scale.

(8) The early-warning rules of the embodiment of the present invention are definable in terms of the slope and the predicted-value threshold. Both the KPI slope and the threshold can be predefined to suit the user's business characteristics, so that customized advance fault location is fully supported. When system resources are tight and a KPI stays above its operating threshold for a long time without significant change, the system can to some extent be regarded as running normally; adjusting the KPI threshold used for alarming then avoids a flood of alarms and reduces the burden on maintenance personnel.

From the above description of the embodiments, those skilled in the art can clearly understand that the present invention may be implemented in hardware, or in software plus a necessary general-purpose hardware platform. On this understanding, the technical solution of the present invention may be embodied as a software product, which may be stored on a non-volatile storage medium (such as a CD-ROM, USB flash drive, or portable hard disk) and includes instructions that cause a computer device (such as a personal computer, server, or network device) to perform the methods described in the embodiments of the present invention.

Those skilled in the art will understand that the drawings are only schematic diagrams of a preferred embodiment, and that the modules or flows in the drawings are not necessarily required to implement the present invention.

Those skilled in the art will understand that the modules of the apparatus in an embodiment may be distributed in the apparatus as described in the embodiment, or may, with corresponding changes, be located in one or more apparatuses different from that of the embodiment. The modules of the above embodiments may be combined into one module or further split into multiple submodules.

The serial numbers of the above embodiments are for description only and do not indicate the relative merits of the embodiments.

The above discloses only several specific embodiments of the present invention; the present invention is not limited to them, and any variation that a person skilled in the art can conceive of shall fall within its scope of protection.

Claims (16)

1. A network device performance monitoring method, characterized by comprising:
a network management system periodically collecting KPI data of a network device;
the network management system aggregating the collected KPI data by a set time window to obtain an aggregate KPI value for each time window;
the network management system using a linear regression model and, from the aggregate KPI values of the N time windows before the current time window, predicting the KPI data of N future time windows including the current time window; and
the network management system determining whether the KPI data of the N future time windows exceeds a set alarm threshold, and issuing an alarm if so.
2. The method of claim 1, characterized in that the aggregate KPI value of each time window is the peak KPI value collected within that time window; or
the aggregate KPI value of each time window is the mean of the peak KPI values collected in each collection period within that time window, the collection period being shorter than the time window.
3. The method of claim 1, characterized in that using the linear regression model and, from the aggregate KPI values of the N time windows before the current time window, predicting the KPI data of the N future time windows including the current time window comprises:
using the linear regression model and, from the aggregate KPI values of the N time windows before the current time window, calculating the linear regression slope of those N aggregate KPI values; and
using the linear regression model and this slope, calculating the predicted KPI data of each of the N future time windows including the current time window.
4. The method of claim 1, characterized in that using the linear regression model and, from the aggregate KPI values of the N time windows before the current time window, predicting the KPI data of the N future time windows including the current time window comprises:
within the N time windows before the current time window, subtracting from each time window's aggregate KPI value the aggregate KPI value of the preceding time window, to obtain an array of N-1 increments;
using the linear regression model and this array, calculating the linear regression slope of the N-1 increments;
using the linear regression model and this slope, calculating, for each of the N future time windows including the current time window, the increment of its KPI data relative to the preceding time window; and
from the increment of each of the N future time windows and the predicted KPI value of its preceding time window, obtaining the predicted KPI value of each of the N future time windows; wherein the predicted KPI value of the first of the N future time windows is the sum of its increment and the aggregate KPI value of the preceding time window.
5. The method of claim 1, characterized in that the network management system determines that the KPI data of the N future time windows exceeds the set alarm threshold in one of the following cases:
the KPI data of the N future time windows exceeds the KPI threshold;
the linear regression slope of the aggregate KPI values of the N time windows before the current time window is greater than a set threshold;
the linear regression slope of the aggregate KPI values of the N time windows before the current time window is greater than the set threshold, and the KPI data of the N future time windows exceeds the KPI threshold.
6. The method of claim 5, characterized in that the KPI threshold is obtained by adjusting a preset KPI threshold according to the KPI data of the N future time windows and the linear regression slope of the aggregate KPI values of the N time windows before the current time window.
7. The method of any one of claims 1 to 6, characterized in that the length of the time window is measured in minutes; or
the length of the time window is measured in days; or
the length of the time window is measured in months.
8. The method of any one of claims 1 to 6, characterized in that the KPI data comprises one or any combination of the following: CPU usage, memory usage, and network throughput.
9. A network management system, characterized by comprising:
a collection module, configured to periodically collect KPI data of a network device;
an aggregation module, configured to aggregate the collected KPI data by a set time window to obtain an aggregate KPI value for each time window;
a prediction module, configured to use a linear regression model and, from the aggregate KPI values of the N time windows before the current time window, predict the KPI data of N future time windows including the current time window; and
an alarm module, configured to determine whether the KPI data of the N future time windows exceeds a set alarm threshold, and to issue an alarm if so.
10. The network management system of claim 9, characterized in that the aggregate KPI value of each time window obtained by the aggregation module is the peak KPI value collected within that time window, or the mean of the peak KPI values collected in each collection period within that time window, the collection period being shorter than the time window.
11. The network management system of claim 9, characterized in that the prediction module is specifically configured to: use the linear regression model and, from the aggregate KPI values of the N time windows before the current time window, calculate the linear regression slope of those N aggregate KPI values; and use the linear regression model and this slope to calculate the predicted KPI data of each of the N future time windows including the current time window.
12. The network management system of claim 9, characterized in that the prediction module is specifically configured to: within the N time windows before the current time window, subtract from each time window's aggregate KPI value the aggregate KPI value of the preceding time window, to obtain an array of N-1 increments; use the linear regression model and this array to calculate the linear regression slope of the N-1 increments; use the linear regression model and this slope to calculate, for each of the N future time windows including the current time window, the increment of its KPI data relative to the preceding time window; and, from the increment of each of the N future time windows and the predicted KPI value of its preceding time window, obtain the predicted KPI value of each of the N future time windows; wherein the predicted KPI value of the first of the N future time windows is the sum of its increment and the aggregate KPI value of the preceding time window.
13. The network management system of claim 12, characterized in that the alarm module is specifically configured to: issue an alarm if the KPI data of the N future time windows exceeds the KPI threshold; issue an alarm if the linear regression slope of the aggregate KPI values of the N time windows before the current time window is greater than a set threshold; or issue an alarm if the linear regression slope of the aggregate KPI values of the N time windows before the current time window is greater than the set threshold and the KPI data of the N future time windows exceeds the KPI threshold.
14. The network management system of claim 13, characterized in that the KPI threshold is obtained by adjusting a preset KPI threshold according to the KPI data of the N future time windows and the linear regression slope of the aggregate KPI values of the N time windows before the current time window.
15. The network management system of any one of claims 9 to 14, characterized in that the length of the time window is measured in minutes; or
the length of the time window is measured in days; or
the length of the time window is measured in months.
16. The network management system of any one of claims 9 to 14, characterized in that the KPI data comprises one or any combination of the following: CPU usage, memory usage, and network throughput.
CN2011104303495A 2011-12-20 2011-12-20 A network equipment performance monitoring method and network management system Pending CN103178990A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2011104303495A CN103178990A (en) 2011-12-20 2011-12-20 A network equipment performance monitoring method and network management system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN2011104303495A CN103178990A (en) 2011-12-20 2011-12-20 A network equipment performance monitoring method and network management system

Publications (1)

Publication Number Publication Date
CN103178990A true CN103178990A (en) 2013-06-26

Family

ID=48638622

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2011104303495A Pending CN103178990A (en) 2011-12-20 2011-12-20 A network equipment performance monitoring method and network management system

Country Status (1)

Country Link
CN (1) CN103178990A (en)

Cited By (37)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103490948A (en) * 2013-09-06 2014-01-01 华为技术有限公司 Method and device for predicting network performance
CN103945442A (en) * 2014-05-07 2014-07-23 东南大学 System anomaly detection method based on linear prediction principle in mobile communication system
CN104468206A (en) * 2014-11-28 2015-03-25 华为技术服务有限公司 Performance warning method and device
CN104731690A (en) * 2013-11-13 2015-06-24 奈飞公司 Adaptive metric collection, storage, and alert thresholds
WO2015090022A1 (en) * 2013-12-18 2015-06-25 中兴通讯股份有限公司 Resource scheduling method and device, and computer storage medium
CN105071968A (en) * 2015-08-18 2015-11-18 大唐移动通信设备有限公司 Method and device for repairing hidden failures of service plane and control plane of communication device
WO2015172508A1 (en) * 2014-05-16 2015-11-19 中兴通讯股份有限公司 Performance data processing method and device
CN105323111A (en) * 2015-11-17 2016-02-10 南京南瑞集团公司 Operation and maintenance automation system and method
CN105491599A (en) * 2015-12-21 2016-04-13 南京华苏科技股份有限公司 Novel regression system for predicting LTE network performance indexes
CN105634787A (en) * 2014-11-26 2016-06-01 华为技术有限公司 Evaluation method, prediction method and device and system for network key indicator
CN105871575A (en) * 2015-01-21 2016-08-17 中国移动通信集团河南有限公司 Load early warning method and device for core network elements
CN106095639A (en) * 2016-05-30 2016-11-09 中国农业银行股份有限公司 A kind of cluster subhealth state method for early warning and system
CN106452931A (en) * 2016-12-27 2017-02-22 中国建设银行股份有限公司 Monitoring index, domain value discovery method, domain value adjusting method and automatic monitoring system
CN106487571A (en) * 2015-09-02 2017-03-08 中国移动通信集团公司 Method and device for evaluating network performance index change trend
CN106533730A (en) * 2015-09-15 2017-03-22 中兴通讯股份有限公司 Method and device for acquiring index of Hadoop cluster component
CN106559813A (en) * 2015-09-28 2017-04-05 中兴通讯股份有限公司 Network estimation method and device
CN106713029A (en) * 2016-12-20 2017-05-24 中国银联股份有限公司 Method and apparatus for determining resource monitoring thresholds
CN106886485A (en) * 2017-02-28 2017-06-23 深圳市华傲数据技术有限公司 Power system capacity analyzing and predicting method and device
CN107426019A (en) * 2017-07-06 2017-12-01 国家电网公司 Network failure determines method, computer equipment and computer-readable recording medium
CN107534570A (en) * 2015-06-16 2018-01-02 慧与发展有限责任合伙企业 Virtualized network function monitoring
CN107608870A (en) * 2017-09-22 2018-01-19 郑州云海信息技术有限公司 Statistical method and system for system resource utilization rate
WO2018103524A1 (en) * 2016-12-08 2018-06-14 Huawei Technologies Co., Ltd. Prediction of performance indicators in cellular networks
CN108984320A (en) * 2018-06-27 2018-12-11 郑州云海信息技术有限公司 Anti-fissure method and device for message queue cluster
CN109298989A (en) * 2018-09-14 2019-02-01 北京市天元网络技术股份有限公司 Operational indicator threshold acquisition method and device
CN109933487A (en) * 2017-12-19 2019-06-25 深圳光启合众科技有限公司 Intelligent robot monitoring method and device
CN110134079A (en) * 2019-03-26 2019-08-16 石化盈科信息技术有限责任公司 Process parameter early warning method and system based on slope analysis
CN110278121A (en) * 2018-03-15 2019-09-24 中兴通讯股份有限公司 Method, apparatus, device and storage medium for detecting network performance anomalies
CN110300008A (en) * 2018-03-22 2019-10-01 北京华为数字技术有限公司 Method and device for determining state of network equipment
CN110326257A (en) * 2017-03-01 2019-10-11 瑞典爱立信有限公司 Method and apparatus for forecasting key performance indicators using artificial life
CN111431769A (en) * 2020-03-30 2020-07-17 招商局金融科技有限公司 Data monitoring method, server and storage medium
CN111800297A (en) * 2020-07-07 2020-10-20 浪潮云信息技术股份公司 Snmp-based intelligent monitoring method and system for cloud physical host
CN111934895A (en) * 2019-05-13 2020-11-13 中国移动通信集团湖北有限公司 Intelligent early warning method and device for network management system and computing equipment
CN112203311A (en) * 2019-07-08 2021-01-08 中国移动通信集团浙江有限公司 Network element abnormity diagnosis method, device, equipment and computer storage medium
CN112486772A (en) * 2020-12-02 2021-03-12 南宁师范大学 Method and system for monitoring computer equipment in real time based on WeChat small program
CN113672467A (en) * 2021-08-24 2021-11-19 中国电信股份有限公司 Operation and maintenance early warning method and device, electronic equipment and storage medium
CN113890837A (en) * 2021-09-13 2022-01-04 浪潮通信信息系统有限公司 Method and system for predicting index degradation based on sliding window cross algorithm
CN115253384A (en) * 2022-06-24 2022-11-01 云南电网有限责任公司电力科学研究院 Insulating oil filtering method, oil filter, system, computer equipment and medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101541016A (en) * 2009-05-06 2009-09-23 华为技术有限公司 Method for predicting data and equipment
CN102004671A (en) * 2010-11-15 2011-04-06 北京航空航天大学 Resource management method of data center based on statistic model in cloud computing environment
CN102111284A (en) * 2009-12-28 2011-06-29 北京亿阳信通软件研究院有限公司 Method and device for predicting telecom traffic

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101541016A (en) * 2009-05-06 2009-09-23 华为技术有限公司 Method for predicting data and equipment
CN102111284A (en) * 2009-12-28 2011-06-29 北京亿阳信通软件研究院有限公司 Method and device for predicting telecom traffic
CN102004671A (en) * 2010-11-15 2011-04-06 北京航空航天大学 Resource management method of data center based on statistic model in cloud computing environment

Cited By (61)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2015032252A1 (en) * 2013-09-06 2015-03-12 华为技术有限公司 Prediction method and device for network performance
US10298464B2 (en) 2013-09-06 2019-05-21 Huawei Technologies Co., Ltd. Network performance prediction method and apparatus
CN103490948A (en) * 2013-09-06 2014-01-01 华为技术有限公司 Method and device for predicting network performance
US11212208B2 (en) 2013-11-13 2021-12-28 Netflix, Inc. Adaptive metric collection, storage, and alert thresholds
US10498628B2 (en) 2013-11-13 2019-12-03 Netflix, Inc. Adaptive metric collection, storage, and alert thresholds
CN104731690A (en) * 2013-11-13 2015-06-24 奈飞公司 Adaptive metric collection, storage, and alert thresholds
CN104731690B (en) * 2013-11-13 2019-10-15 奈飞公司 Adaptive Metrics Collection, Storage, and Warning Thresholds
WO2015090022A1 (en) * 2013-12-18 2015-06-25 中兴通讯股份有限公司 Resource scheduling method and device, and computer storage medium
CN103945442A (en) * 2014-05-07 2014-07-23 东南大学 System anomaly detection method based on linear prediction principle in mobile communication system
CN105101281A (en) * 2014-05-16 2015-11-25 中兴通讯股份有限公司 Performance data processing method and device
WO2015172508A1 (en) * 2014-05-16 2015-11-19 中兴通讯股份有限公司 Performance data processing method and device
CN105634787B (en) * 2014-11-26 2018-12-07 华为技术有限公司 Evaluation method, prediction method and device and system for network key indicator
CN105634787A (en) * 2014-11-26 2016-06-01 华为技术有限公司 Evaluation method, prediction method and device and system for network key indicator
CN104468206B (en) * 2014-11-28 2019-04-05 华为技术服务有限公司 Performance warning method and device
CN104468206A (en) * 2014-11-28 2015-03-25 华为技术服务有限公司 Performance warning method and device
CN105871575A (en) * 2015-01-21 2016-08-17 中国移动通信集团河南有限公司 Load early warning method and device for core network elements
CN107534570B (en) * 2015-06-16 2021-08-24 慧与发展有限责任合伙企业 Computer system, method and medium for virtualized network function monitoring
US10680896B2 (en) 2015-06-16 2020-06-09 Hewlett Packard Enterprise Development Lp Virtualized network function monitoring
CN107534570A (en) * 2015-06-16 2018-01-02 慧与发展有限责任合伙企业 Virtualized network function monitoring
CN105071968A (en) * 2015-08-18 2015-11-18 大唐移动通信设备有限公司 Method and device for repairing hidden failures of service plane and control plane of communication device
CN106487571A (en) * 2015-09-02 2017-03-08 中国移动通信集团公司 Method and device for evaluating network performance index change trend
CN106487571B (en) * 2015-09-02 2020-02-14 中国移动通信集团公司 Method and device for evaluating network performance index change trend
CN106533730A (en) * 2015-09-15 2017-03-22 中兴通讯股份有限公司 Method and device for acquiring index of Hadoop cluster component
CN106533730B (en) * 2015-09-15 2020-07-31 南京中兴软件有限责任公司 Hadoop cluster component index acquisition method and device
CN106559813A (en) * 2015-09-28 2017-04-05 中兴通讯股份有限公司 Network estimation method and device
CN105323111B (en) * 2015-11-17 2018-08-10 南京南瑞集团公司 Operation and maintenance automation system and method
CN105323111A (en) * 2015-11-17 2016-02-10 南京南瑞集团公司 Operation and maintenance automation system and method
CN105491599A (en) * 2015-12-21 2016-04-13 南京华苏科技股份有限公司 Novel regression system for predicting LTE network performance indexes
CN105491599B (en) * 2015-12-21 2019-03-08 南京华苏科技有限公司 Novel regression system for predicting LTE network performance indexes
CN106095639A (en) * 2016-05-30 2016-11-09 中国农业银行股份有限公司 Cluster sub-health state early warning method and system
CN109983798A (en) * 2016-12-08 2019-07-05 华为技术有限公司 The prediction of performance indicator in cellular network
WO2018103524A1 (en) * 2016-12-08 2018-06-14 Huawei Technologies Co., Ltd. Prediction of performance indicators in cellular networks
CN106713029B (en) * 2016-12-20 2020-05-01 中国银联股份有限公司 A method and device for determining resource monitoring threshold
CN106713029A (en) * 2016-12-20 2017-05-24 中国银联股份有限公司 Method and apparatus for determining resource monitoring thresholds
CN106452931B (en) * 2016-12-27 2019-09-17 中国建设银行股份有限公司 Monitoring index and threshold discovery method, threshold adjusting method and automatic monitoring system
CN106452931A (en) * 2016-12-27 2017-02-22 中国建设银行股份有限公司 Monitoring index, domain value discovery method, domain value adjusting method and automatic monitoring system
CN106886485A (en) * 2017-02-28 2017-06-23 深圳市华傲数据技术有限公司 Power system capacity analyzing and predicting method and device
CN110326257B (en) * 2017-03-01 2022-11-29 瑞典爱立信有限公司 Method and apparatus for forecasting key performance indicators using artificial life
US11424999B2 (en) 2017-03-01 2022-08-23 Telefonaktiebolaget Lm Ericsson (Publ) Method and apparatus for key performance indicator forecasting using artificial life
CN110326257A (en) * 2017-03-01 2019-10-11 瑞典爱立信有限公司 Method and apparatus for forecasting key performance indicators using artificial life
CN107426019A (en) * 2017-07-06 2017-12-01 国家电网公司 Network failure determines method, computer equipment and computer-readable recording medium
CN107608870A (en) * 2017-09-22 2018-01-19 郑州云海信息技术有限公司 Statistical method and system for system resource utilization rate
CN109933487B (en) * 2017-12-19 2024-05-07 潘明旭 Intelligent robot monitoring method and device
CN109933487A (en) * 2017-12-19 2019-06-25 深圳光启合众科技有限公司 Intelligent robot monitoring method and device
CN110278121A (en) * 2018-03-15 2019-09-24 中兴通讯股份有限公司 Method, apparatus, device and storage medium for detecting network performance anomalies
CN110300008B (en) * 2018-03-22 2021-03-23 北京华为数字技术有限公司 Method and device for determining state of network equipment
US11405294B2 (en) 2018-03-22 2022-08-02 Huawei Technologies Co., Ltd. Method and apparatus for determining status of network device
CN110300008A (en) * 2018-03-22 2019-10-01 北京华为数字技术有限公司 Method and device for determining state of network equipment
CN108984320A (en) * 2018-06-27 2018-12-11 郑州云海信息技术有限公司 Anti-fissure method and device for message queue cluster
CN109298989A (en) * 2018-09-14 2019-02-01 北京市天元网络技术股份有限公司 Operational indicator threshold acquisition method and device
CN110134079A (en) * 2019-03-26 2019-08-16 石化盈科信息技术有限责任公司 Process parameter early warning method and system based on slope analysis
CN111934895A (en) * 2019-05-13 2020-11-13 中国移动通信集团湖北有限公司 Intelligent early warning method and device for network management system and computing equipment
CN111934895B (en) * 2019-05-13 2022-11-15 中国移动通信集团湖北有限公司 Intelligent early warning method and device for network management system and computing equipment
CN112203311A (en) * 2019-07-08 2021-01-08 中国移动通信集团浙江有限公司 Network element abnormity diagnosis method, device, equipment and computer storage medium
CN111431769A (en) * 2020-03-30 2020-07-17 招商局金融科技有限公司 Data monitoring method, server and storage medium
CN111800297A (en) * 2020-07-07 2020-10-20 浪潮云信息技术股份公司 Snmp-based intelligent monitoring method and system for cloud physical host
CN112486772A (en) * 2020-12-02 2021-03-12 南宁师范大学 Method and system for monitoring computer equipment in real time based on WeChat small program
CN113672467A (en) * 2021-08-24 2021-11-19 中国电信股份有限公司 Operation and maintenance early warning method and device, electronic equipment and storage medium
CN113672467B (en) * 2021-08-24 2024-08-06 中国电信股份有限公司 Operation and maintenance early warning method and device, electronic equipment and storage medium
CN113890837A (en) * 2021-09-13 2022-01-04 浪潮通信信息系统有限公司 Method and system for predicting index degradation based on sliding window cross algorithm
CN115253384A (en) * 2022-06-24 2022-11-01 云南电网有限责任公司电力科学研究院 Insulating oil filtering method, oil filter, system, computer equipment and medium

Similar Documents

Publication Publication Date Title
CN103178990A (en) A network equipment performance monitoring method and network management system
CN102882745B (en) Method and apparatus for monitoring a business server
US9467572B2 (en) Determining usage predictions and detecting anomalous user activity through traffic patterns
CN114443429B (en) Alarm event processing method and device and computer readable storage medium
CN100356729C (en) Method and system for monitoring network service performance
CN110888913B (en) Intelligent analysis system for electricity consumption based on Internet of things technology
CN103580905B (en) Prediction method and system, and flow monitoring method and system
CN113448805B (en) Monitoring method, device, equipment and storage medium based on CPU dynamic threshold
CN108733531A (en) GPU performance monitoring systems based on cloud computing
CN107465575A (en) Cluster monitoring method and system
JP6145067B2 (en) Communication traffic prediction apparatus, method and program
CN110930139A (en) Differentiated gas payment service system based on customer consumption
CN106951360B (en) Data statistical integrity calculation method and system
CN103856344A (en) Alarm event information processing method and device
CN113570277A (en) A kind of power capacity management method and device
CN119135540A (en) A configuration system and method for dynamically adjusting business processes
CN107248959A (en) Flow optimization method and device
CN101345656A (en) Global fault rate measuring method
CN102547789A (en) Early warning method, device and system for quality of peer-to-peer service
CN114827033B (en) Data flow control method, device, equipment and computer readable storage medium
CN202798762U (en) Alarm device for power communication failure information analysis
EP2854334A1 (en) Communication network quality monitoring system
CN114493720A (en) Method, device, storage medium and equipment for monitoring Kafka consumers
CN203352612U (en) Integrated flow and alarm monitoring device for power communication devices
TWI749072B (en) Abnormal traffic detecting server and abnormal traffic detecting method thereof

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20130626