CN100435542C

CN100435542C - An overload detection method for a communication transaction processing system

Info

Publication number: CN100435542C
Application number: CNB2005101171484A
Authority: CN
Inventors: 廖建新; 王晶; 王纯; 李炜; 王玉龙; 朱晓民; 武家春; 张磊; 樊利民; 程莉
Original assignee: Beijing University of Posts and Telecommunications
Current assignee: Beijing University of Posts and Telecommunications
Priority date: 2005-11-01
Filing date: 2005-11-01
Publication date: 2008-11-19
Anticipated expiration: 2025-11-01
Also published as: CN1758685A

Abstract

An overload detection method for a communication transaction processing system is proposed, based on the relationship between system resource occupancy and response time: a high system load inevitably lengthens the queuing time of background messages and therefore the response time. The method measures two measurable physical quantities, the average service call arrival rate λ and the corresponding system response time T, calculates the resource occupancy rate R according to the formulas and mechanism designed by the invention, and then judges whether the system is overloaded. The method does not depend on the specific environment or implementation conditions of the system and performs overload detection automatically according to the service load in the system; the calculation is simple, the judgment accurate, and the operation reliable. Combined with the applicant's patent application "Intelligent Network Overload Control Method Based on Service Control Point in Multi-Service Environment", the method realizes both overload detection and overload control of the network system.

Description

An overload detection method for a communication transaction processing system

Technical field

The invention relates to an overload detection method for a communication transaction processing system and belongs to the technical field of overload detection in communication systems.

Background

Overload detection for a communication transaction system is an important technical problem, because it is the precondition for preventing overload and applying flow control. The following methods are currently in common use for overload detection:

(1) A system that uses message queues takes queue overflow as the detection condition. The rationale is that as the system load increases, processing slows down and the queue of messages waiting to be processed grows; when the queue length reaches a certain value, the system is considered overloaded;

(2) Judging whether the link load is too high: when the link load reaches the warning value of its design capacity, the system is considered overloaded;

(3) Judging whether the CPU occupancy is too high: CPU occupancy reflects, to some extent, how busy the system is, so using it as the overload detection condition is effective over a considerable range;

(4) Taking the system's response speed to messages as the overload detection condition.

Although the first three methods can detect and judge whether the system is overloaded within a certain range, each has limitations. In method (1), whether the queue overflows depends on the specific hardware platform, and the queue length that indicates overload is not easy to determine. Method (2) sometimes does not reflect the real overload situation, for example when the link capacity is out of proportion to the configuration of the processing system. Method (3) is the most intuitive way to judge whether the system is overloaded. However, in a complex distributed system it is not easy to obtain the CPU occupancy accurately. In particular, the CPU occupancy can only be measured in the system that actually processes the service, which is usually physically separate from the equipment that applies overload control. The CPU occupancy information therefore reaches the overload control point with a delay, so any overload control measure is also applied late and its effect is greatly reduced. Moreover, if the CPU occupancy information is lost in transit, overload control fails completely.

Method (4) is easier to implement than the first three methods: whatever the cause of the overload, its most direct manifestation is that message responses slow down, so detecting changes in message response time is usually a more effective way to judge whether the system is overloaded. However, the response-time methods used so far all rely on a response time value measured or set in advance as the reference for overload detection. Such methods suit only fixed systems and fixed services: before each system is deployed on a live network, the average response time for the intended services must be measured to obtain the detection reference. Once the environment changes (the system's software and hardware environment and/or the service environment), the overload detection condition or reference changes as well.

Therefore, how to find the overload detection point of a system effectively while adapting to changes in the environment has not yet been solved well, and has become a topic of concern to practitioners in the field.

Summary of the invention

In view of this, the purpose of the present invention is to provide an overload detection method for a communication transaction processing system. The method does not depend on the specific environment or implementation conditions of the system and performs overload detection automatically according to the service load in the system; the calculation is simple, the judgment accurate, and the operation reliable.

To achieve the above purpose, an overload detection method for a communication transaction processing system is characterized in that it comprises the following steps:

(1) When the system is not overloaded, the following relationship is taken to hold between the system response time T and the average service call arrival rate λ: T = a + bλ, where parameter a is the transmission time of each message in the system and parameter b is the proportionality coefficient relating each unit of system resource consumed to the message response time;

(2) When the system load is low, that is, when the system is underloaded, measure multiple groups of data consisting of the average service call arrival rate λ and the corresponding system response time T, and calculate the two parameters a and b by the least squares method;

(3) Within a set time period, measure the average service call arrival rate λ for that period, and use the relationship T = a + bλ from step (1) to calculate the predicted value T_predicted of the system response time for that period;

(4) Within the same time period, measure the actual value of the system response time for that period, that is, the actual system response time T_measured;

(5) Compare the actual system response time T_measured with the predicted system response time T_predicted to determine whether the system has reached the "overload detection point":

If T_measured > T_predicted + ωσ, the system is overloaded; otherwise, the system is not overloaded. Here ω is the overload protection factor, which prevents misjudgment caused by fluctuations in response time and takes an empirical value between 1 and 5, and σ is the residual standard deviation.

The method further comprises the following step:

(6) Adjust the two parameters a and b dynamically at regular intervals, that is, execute step (2) at each set interval to recalculate a and b, so as to adapt to changes in the system and service environment.

In step (2), the specific steps for calculating the parameters a and b by the least squares method are:

(21) When the system load is low, measure N groups of data consisting of the average service call arrival rate λ_i and the corresponding system response time T_i, where i is the index of each group and N is a positive integer denoting the largest index of the measured groups;

(22) From the λ_i and T_i data of each group, calculate the following three values:

$S_{\lambda\lambda} = \sum_{i=1}^{N} \lambda_i^{2} - \frac{1}{N} \left( \sum_{i=1}^{N} \lambda_i \right)^{2};$

$S_{TT} = \sum_{i=1}^{N} T_i^{2} - \frac{1}{N} \left( \sum_{i=1}^{N} T_i \right)^{2};$

$S_{\lambda T} = \sum_{i=1}^{N} \lambda_i T_i - \frac{1}{N} \left( \sum_{i=1}^{N} \lambda_i \right) \left( \sum_{i=1}^{N} T_i \right);$

(23) Calculate the two coefficients a and b from the following two formulas, and then calculate the residual standard deviation σ:

$b = \frac{S_{\lambda T}}{S_{\lambda\lambda}};$

$a = \frac{1}{N} \sum_{i=1}^{N} T_i - \left( \frac{1}{N} \sum_{i=1}^{N} \lambda_i \right) b;$

The formula for the residual standard deviation σ is: $\sigma = \sqrt{\frac{1}{N - 2} \left[ S_{TT} - b S_{\lambda T} \right]}.$
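The least-squares fit in steps (21) to (23) can be sketched in a few lines of Python. This is a minimal illustration rather than the patentee's implementation: the function name `fit_response_model`, its argument names, the input checks and the sample values are all assumptions made for the example.

```python
import math


def fit_response_model(arrival_rates, response_times):
    """Fit T = a + b*lambda by least squares; return (a, b, sigma).

    arrival_rates  -- N measured average call arrival rates (lambda_i)
    response_times -- N corresponding system response times (T_i)
    """
    n = len(arrival_rates)
    if n != len(response_times):
        raise ValueError("both sequences must have the same length")
    if n < 3:
        # sigma divides by N - 2, so at least three groups are needed
        raise ValueError("at least three data groups are required")

    sum_l = sum(arrival_rates)
    sum_t = sum(response_times)

    # Step (22): the three sums of squares and cross-products
    s_ll = sum(l * l for l in arrival_rates) - sum_l ** 2 / n
    s_tt = sum(t * t for t in response_times) - sum_t ** 2 / n
    s_lt = sum(l * t for l, t in zip(arrival_rates, response_times)) - sum_l * sum_t / n

    if s_ll == 0:
        raise ValueError("arrival rates must not all be identical")

    # Step (23): slope b, intercept a, residual standard deviation sigma
    b = s_lt / s_ll
    a = sum_t / n - (sum_l / n) * b
    residual = max(s_tt - b * s_lt, 0.0)  # guard against tiny negative round-off
    sigma = math.sqrt(residual / (n - 2))
    return a, b, sigma


if __name__ == "__main__":
    # Made-up underload samples that lie exactly on T = 40 + 0.5 * lambda
    a, b, sigma = fit_response_model([10, 20, 30, 40], [45, 50, 55, 60])
    print(a, b, sigma)
```

With real measurements the points scatter around the line, so σ is positive and sets the width of the tolerance band used in step (5).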

The present invention is an overload detection method (also called the linear regression method) for a communication transaction processing system. Its advantage is that it does not depend on the specific environment, hardware platform or other implementation conditions of the system: by measuring a few data values and performing some calculations, it detects overload automatically according to the service load in the system. The calculation is simple, the judgment accurate, and the operation reliable. Combined with the applicant's patent application "Intelligent Network Overload Control Method Based on Service Control Point in Multi-Service Environment", the method realizes overload detection and overload control for intelligent network systems. Moreover, the method is general: it can be applied to various network systems and is not limited to overload detection in intelligent network systems.

Brief description of the drawings

Fig. 1 is a graph of the average response time versus CPU occupancy (that is, under different loads) in a communication transaction processing system.

Fig. 2 is a flow chart of the operating steps of the overload detection method of the invention.

Fig. 3 is a graph of test results for an embodiment in which the overload detection method of the invention is used to modify an adaptive window control algorithm, which is then applied to a service control point (SCP) system of an intelligent network.

Detailed description

To make the purpose, technical solution and advantages of the invention clearer, the invention is described in further detail below with reference to the drawings.

The working mechanism of the method is as follows. In a transaction processing system, how quickly the system responds to messages reflects how idle or busy it is. Studying how the response time varies with system load (usually the CPU occupancy) therefore makes it possible to read changes in system load directly from the response time. Analysis of a large amount of experimental data shows that, under a constant load, the system's response time to messages approximately follows a negative exponential distribution. The relationship between response time and CPU occupancy under different loads is shown in Fig. 1.

Referring to Fig. 1, the figure shows how the average response time varies with load: when the system load is light (low CPU occupancy), the response time is essentially stable and approximately linear; only when the load is heavy, that is, once CPU occupancy exceeds 60%, does the response time increase exponentially.

Studies show that the relationship between the system response time T and the resource occupancy R can be expressed as:

$T = c + d e^{\beta R} \quad (1)$

Here c is a constant reflecting the processing capability of the system, d is a constant reflecting the complexity of the messages, and β is an intensity factor that is usually related to the complexity of the service. In Fig. 1, c = 42, d = 0.36 and β = 0.075.
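To get a feel for formula (1), the short Python sketch below evaluates it with the Fig. 1 constants. It assumes that R is expressed in percent, which the 60% threshold mentioned for Fig. 1 suggests but the text does not state outright; the unit of T is not given in the source either. The function name `response_time` is hypothetical.

```python
import math


def response_time(r_percent, c=42.0, d=0.36, beta=0.075):
    """Formula (1): T = c + d * exp(beta * R), with R assumed to be in percent."""
    return c + d * math.exp(beta * r_percent)


if __name__ == "__main__":
    # Tabulate T for a few occupancy levels to show the flat region at low
    # load and the rapid growth at high load.
    for r in (0, 20, 40, 60, 80, 90):
        print(f"R = {r:3d}%  ->  T = {response_time(r):8.2f}")
```

Under these assumptions the response time barely moves at low occupancy and climbs steeply at high occupancy, which is the behaviour the detection method relies on.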

Although the relationship between CPU occupancy and system response time shown in Fig. 1 comes from tests on one specific platform, it is of general significance and applies to other platforms as well. The coefficients in formula (1) may differ when different services run on different platforms, but CPU occupancy and response time follow the same law: once the load on any system becomes very high, the background message queue inevitably grows longer and the response time increases exponentially.

In a transaction processing system, the response time T is normally a measurable physical quantity, whereas the resource occupancy R cannot be measured accurately, and in a distributed system it is harder still to measure. If the values of the three parameters c, d and β in formula (1) could be determined, the resource occupancy R could easily be obtained from a measured value of T using formula (1), and it could then be judged whether the system is overloaded. Because it is difficult to measure c, d and β accurately by traditional means, a convenient and effective way of estimating them automatically is needed in order to obtain the functional relationship between T and R. This is the starting point of the present invention.

Since the resource occupancy R can hardly be measured accurately, a substitute quantity is needed that is easy to measure and still reflects resource occupancy. Analysis shows that such a quantity exists. The occupancy of system resources depends mainly on the call arrival rate λ of the service, and λ is relatively easy to measure. If the relationship between λ and R is known, R can therefore be obtained indirectly by measuring λ. For a given service, each call consumes, in a statistical sense, a fixed amount of system resources, so the higher the call arrival rate, the more system resources are consumed; provided system resources are sufficient, λ and R should be linearly related. Experimental data show that this linear relationship holds when the system is not overloaded, that is:

$R = \gamma\lambda \quad (2)$

Here γ is a proportionality coefficient that depends on the specific service. To simplify the analysis, formula (1) is expanded as a Taylor series about the origin R = 0 and only the linear part is kept, which gives:

$T = c + d + d\beta R \quad (3)$

Substituting formula (2) into formula (3) gives:

$T = a + b\lambda \quad (4)$

Here a = c + d and b = dβγ. Parameter a represents the time consumed when each group of messages is passed through the system, including the queuing time in the message queue and the transfer time between modules; its value depends on the complexity of the messages. Parameter b is the proportionality coefficient relating each unit of system resource consumed to the message response time; in general, the more complex the message, the more system resources it consumes and the greater the delay in response time.
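The step from formula (1) to formula (4) can be written out term by term. The derivation below only spells out the first-order expansion $e^{x} \approx 1 + x$ that the text applies; it introduces no symbols beyond those already defined.

```latex
\begin{align*}
T &= c + d\,e^{\beta R} \\
  &\approx c + d\,(1 + \beta R)
     && \text{first-order Taylor expansion about } R = 0 \\
  &= (c + d) + d\beta R
     && \text{formula (3)} \\
  &= (c + d) + d\beta\gamma\,\lambda
     && \text{using } R = \gamma\lambda \text{, formula (2)} \\
  &= a + b\lambda,
     && a = c + d,\quad b = d\beta\gamma \quad \text{formula (4)}
\end{align*}
```

The approximation is only good while the load is low, which is why the parameters a and b are fitted from underload measurements and why a measured response time well above the line signals overload.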

Based on the above analysis, the overload detection method of the invention comprises the following operating steps (see Fig. 2):

(1) When the system is not overloaded, the following relationship is taken to hold between the system response time T and the average service call arrival rate λ: T = a + bλ. Here parameter a represents the time consumed when each group of messages is passed through the system, including the queuing time in the message queue and the transfer time between modules; its value depends on the complexity of the messages. Parameter b is the proportionality coefficient relating each unit of system resource consumed to the message response time; in general, the more complex the message, the more system resources it consumes and the greater the delay in response time.

(2) When the system load is low (that is, when the system is underloaded), collect N groups of measurements of the system response time T and the corresponding average service call arrival rate λ, and calculate the two parameters a and b of the above relationship by the least squares method. The specific formulas are:

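$b = \frac{S_{\lambda T}}{S_{\lambda\lambda}};$

$a = \frac{1}{N} \sum_{i=1}^{N} T_i - \left( \frac{1}{N} \sum_{i=1}^{N} \lambda_i \right) b;$

where $S_{\lambda\lambda}$, $S_{TT}$ and $S_{\lambda T}$ are computed from the N measured groups as defined in steps (21) to (23) above. The formula for the residual standard deviation σ is: $\sigma = \sqrt{\frac{1}{N - 2} \left[ S_{TT} - b S_{\lambda T} \right]}.$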

(3) Within a set time period, measure the average service call arrival rate λ for that period, and use the formula T = a + bλ to calculate the predicted value T_predicted of the system response time for that period.

(4) Within the same time period, measure the actual value of the system response time for that period, that is, the actual system response time T_measured.

(5) Compare T_measured with T_predicted to determine whether the system has reached the overload detection point: if T_measured > T_predicted + ωσ, the system is overloaded; otherwise, the system is not overloaded. Here ω is the overload protection factor, which prevents misjudgment caused by fluctuations in response time and usually takes an empirical value between 1 and 5, and σ is the residual standard deviation.

(6) Adjust the two parameters a and b dynamically at regular intervals, that is, execute step (2) above at each set interval to recalculate a and b, so as to adapt to changes in the system and service environment.
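Steps (3) to (5) reduce to a single comparison once a, b and σ are known. The sketch below shows that decision rule in Python. The function names, the default ω = 3 (chosen from within the stated 1 to 5 range) and all numeric values are assumptions for illustration; in a running system the values of a, b and σ would come from the fit in step (2) and be refreshed as described in step (6).

```python
def predict_response_time(a, b, arrival_rate):
    """Step (3): predicted response time T_predicted = a + b * lambda."""
    return a + b * arrival_rate


def is_overloaded(t_measured, arrival_rate, a, b, sigma, omega=3.0):
    """Step (5): overloaded when T_measured > T_predicted + omega * sigma."""
    threshold = predict_response_time(a, b, arrival_rate) + omega * sigma
    return t_measured > threshold


if __name__ == "__main__":
    # Hypothetical fitted parameters and per-interval measurements
    a, b, sigma = 40.0, 0.5, 1.0
    intervals = [(100, 91.0), (150, 116.5), (200, 190.0)]  # (lambda, T_measured)
    for lam, t_meas in intervals:
        state = "overloaded" if is_overloaded(t_meas, lam, a, b, sigma) else "normal"
        print(f"lambda = {lam}, T_measured = {t_meas}: {state}")
```

A larger ω makes the detector more tolerant of response-time jitter at the cost of reacting later.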

To verify the effectiveness of the method, it was applied in a trial to the Chinese patent application "Control Method of Intelligent Network Overload Based on Service Control Point in Multi-Service Environment" (application number 200510064628.9). The trial of this embodiment is described below.

From the content of that application, the following formula can be obtained:

$T = \frac{W}{\lambda} \quad (6)$

Here T is the average response time of the system, λ is the average arrival rate of calls entering the system, and W is the average number of occupied windows. Substituting formula (6) into formula (4) gives:

$T = \frac{1}{2} \left( a + \sqrt{a^{2} + 4bW} \right) \quad (7)$

According to the patent application "Control Method of Intelligent Network Overload Based on Service Control Point in Multi-Service Environment", the average number of occupied windows at the next measurement point, W_predicted, can be predicted from λ. Formula (7) then gives the predicted system response time T_predicted, which is compared with the actual system response time T_measured. If T_measured > T_predicted + ωσ, the overload detection point has been reached.
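Formula (7) follows from eliminating λ: formula (6) gives λ = W/T, so formula (4) becomes T = a + bW/T, that is, T² − aT − bW = 0, whose positive root is formula (7). The Python sketch below evaluates that root. The function name and the numbers are hypothetical, and how W_predicted is obtained is defined in the referenced application and is not reproduced here.

```python
import math


def predict_response_time_from_window(a, b, window_count):
    """Formula (7): positive root of T^2 - a*T - b*W = 0."""
    return 0.5 * (a + math.sqrt(a * a + 4.0 * b * window_count))


if __name__ == "__main__":
    # Consistency check with T = a + b*lambda and W = lambda*T:
    # with a = 40, b = 0.5, lambda = 100 we get T = 90 and W = 9000.
    a, b = 40.0, 0.5
    lam = 100.0
    t_linear = a + b * lam
    w = lam * t_linear
    print(t_linear, predict_response_time_from_window(a, b, w))
```

Both expressions give the same prediction whenever W, λ and T satisfy formula (6), so formula (7) is simply formula (4) restated in terms of the window count.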

Referring to Fig. 3, the figure shows test results for an embodiment in which the adaptive window control algorithm, modified with the linear regression method of the invention, is applied to an intelligent network service control point (SCP) system. The ordinate is the average call arrival rate and the abscissa is the system running time. The hardware platform is an HP ALPHA DS20 minicomputer. Over 50 minutes the call arrival rate was gradually increased to 450 calls/second with protection factor ω = 5; the overload detection method of the invention was used to judge the load, and overload control was started as soon as an overload occurred. In this embodiment the optimal number of windows when overload control started was 11. The figure shows that, under the overload detection method of the invention, the maximum average call arrival rate accepted by the SCP system was 196 calls/second, and calls arriving at higher rates were rejected.

Before the overload detection point is reached, that is, while the average call arrival rate is below 196 calls/second, each run of the linear regression method finds that the average response time is well approximated by a linear function of resource occupancy in this range, which corresponds to the non-overload region of Fig. 1. The overload detection condition is judged not to be met and calls are not restricted. When the average call arrival rate reaches 196 calls/second, the relationship between response time and average call arrival rate becomes nonlinear, corresponding to the overload region of Fig. 1. The overload detection point can then be determined from the formula in step (5), and flow control is started for subsequent calls according to the overload detection condition. Throughout the search for the overload detection point, the system uses the method of the invention to complete the detection and judgment automatically according to the system load. The experimental results show that overload detection and overload control were achieved, fulfilling the purpose of the invention.

Claims

1. An overload detection method for a communication transaction processing system, characterized in that it comprises the following steps:

(1) When the system is not overloaded, the following relationship is taken to hold between the system response time T and the average service call arrival rate λ: T = a + bλ, where parameter a is the transmission time of each message in the system and parameter b is the proportionality coefficient relating each unit of system resource consumed to the message response time;

(2) When the system load is low, that is, when the system is underloaded, measure multiple groups of data consisting of the average service call arrival rate λ and the corresponding system response time T, and calculate the two parameters a and b by the least squares method;

(3) Within a set time period, measure the average service call arrival rate λ for that period, and use the relationship T = a + bλ from step (1) to calculate the predicted value T_predicted of the system response time for that period;

(4) Within the same time period, measure the actual value of the system response time for that period, that is, the actual system response time T_measured;

(5) Compare the actual system response time T_measured with the predicted system response time T_predicted to determine whether the system has reached the "overload detection point":

If T_measured > T_predicted + ωσ, the system is overloaded; otherwise, the system is not overloaded; where ω is the overload protection factor, which prevents misjudgment caused by fluctuations in response time and takes an empirical value between 1 and 5, and σ is the residual standard deviation.

2. The overload detection method for a communication transaction processing system according to claim 1, characterized in that the method further comprises the following step:

(6) Adjust the two parameters a and b dynamically at regular intervals, that is, execute step (2) at each set interval to recalculate a and b, so as to adapt to changes in the system and service environment.

3. The overload detection method for a communication transaction processing system according to claim 1, characterized in that, in step (2), the specific steps for calculating the two parameters a and b by the least squares method are:

(21) When the system load is low, measure N groups of data consisting of the average service call arrival rate λ_i and the corresponding system response time T_i, where i is the index of each group and N is a positive integer denoting the largest index of the measured groups;

(22) From the λ_i and T_i data of each group, calculate the following three values:

$S_{\lambda\lambda} = \sum_{i=1}^{N} \lambda_i^{2} - \frac{1}{N} \left( \sum_{i=1}^{N} \lambda_i \right)^{2};$

$S_{TT} = \sum_{i=1}^{N} T_i^{2} - \frac{1}{N} \left( \sum_{i=1}^{N} T_i \right)^{2};$

$S_{\lambda T} = \sum_{i=1}^{N} \lambda_i T_i - \frac{1}{N} \left( \sum_{i=1}^{N} \lambda_i \right) \left( \sum_{i=1}^{N} T_i \right);$

(23) Calculate the two coefficients a and b from the following two formulas, and then calculate the residual standard deviation σ:

$b = \frac{S_{\lambda T}}{S_{\lambda\lambda}};$

$a = \frac{1}{N} \sum_{i=1}^{N} T_i - \left( \frac{1}{N} \sum_{i=1}^{N} \lambda_i \right) b;$

The formula for the residual standard deviation σ is:

$\sigma = \sqrt{\frac{1}{N - 2} \left[ S_{TT} - b S_{\lambda T} \right]}.$