CN103164663B

CN103164663B - Server overload protection method and device based on sliding window

Info

Publication number: CN103164663B
Application number: CN201110412221.6A
Authority: CN
Inventors: 姚明敏; 娄继冰
Original assignee: Shenzhen Tencent Computer Systems Co Ltd
Current assignee: Shenzhen Tencent Computer Systems Co Ltd
Priority date: 2011-12-12
Filing date: 2011-12-12
Publication date: 2016-06-29
Anticipated expiration: 2031-12-12
Also published as: CN103164663A

Abstract

Disclosed is a server overload protection method based on a sliding window, comprising: setting a sliding window representing a time period, the sliding window recording the requests received by the server in the corresponding time period; and, when the server receives a new request, detecting the current load state of the server according to the requests recorded by the sliding window and judging whether to process the request according to the detection result. The invention also provides a server overload protection device based on a sliding window. With the technical scheme of the invention, overload control based on the sliding window can better control the impact of bursts in service access volume on the server.

Description

Server overload protection method and device based on sliding window

Technical Field

The present invention relates to a management technology of a multi-service server, and in particular, to a method and an apparatus for server overload protection based on a sliding window.

Background

At present, there is much research on load balancing for server clusters, but in practical applications servers still reach their upper load limit under extreme conditions. Overload protection ensures that the server can continue to work when the upper load limit is reached, keeps the throughput of the server stable, and prevents the influence of front-end burst events from propagating to the server.

One existing server overload protection technique uses a plurality of workload managers to process requests with different priority levels: when the load of the server reaches a certain upper limit, low-priority service requests are refused and only high-priority requests are processed. In practical internet applications, however, all users have equal rights, except for control requests issued by system operation and maintenance, so their requests should have the same priority level. The above method therefore cannot clearly and correctly prioritize requests, cannot achieve the effect of overload protection, and can cause the throughput of the server to jitter.

Another existing method starts from optimizing the network card driver and measures the load condition of the server from parameters such as the CPU utilization and memory utilization of the current machine; if the server exceeds the load limit, the network card driver rejects all requests to establish new TCP connections in order to relieve server stress. Although this approach alleviates server load to some extent, it can cause server throughput jitter in sudden, short-lived high-throughput situations. Moreover, it uses the machine rather than the service as the control granularity, so different services deployed on the same machine may all fail to work normally because one service is abnormal.

Disclosure of Invention

In view of the above, the present invention provides a method and an apparatus for server overload protection based on a sliding window, in which overload control based on the sliding window better controls the impact of bursts in service access volume on the server.

In order to achieve the purpose, the technical scheme of the invention is realized as follows:

the invention provides a server overload protection method based on a sliding window, which comprises the following steps:

setting a sliding window representing a time period;

the sliding window records a request received by the server in a corresponding time period;

and when the server receives a new request, detecting the current load state of the server according to the request received by the server and recorded by the sliding window, and judging whether to process the request according to the detection result.

In the above method, the setting of the sliding window representing the time period is: virtualizing time into a plurality of windows of equal length, each window representing a time period;

the recording, by the sliding window, of the requests received by the server in the corresponding time period is: the window records the requests received by the server in the corresponding time period, a current window range is maintained, a single load detection is performed according to the current window range, and after each load detection the window slides according to the change of the time point of the last load detection.

In the method, the length of the time period is dynamically configured according to different service requirements, and the current window range is set according to the delay of the internet request.

In the above method, the method further comprises: and configuring the recorded request content for each window according to different service characteristics.

In the above method, configuring the recorded request content for each window according to different service characteristics is:

for a disk bandwidth consuming service, the window records the number of disk I/O operations in the current time period;

for a CPU consuming service, the window records the accumulated number of clock cycles consumed in the current time period;

for network bandwidth consuming services, the window records the traffic for the current time period.

In the above method, the detecting the current load state of the server according to the request received by the server recorded in the sliding window is:

detecting the current load state of a server after a new request comes according to the request received by the server recorded by the current sliding window; if the time point of the last load state detection and the time point of the current load state detection are positioned in the same window, adding 1 to the count in the window; if the time point of the last load state detection and the time point of the current load state detection are not in the same window, clearing the count value in the window; and after the count value is processed, judging whether the count value in the window exceeds the peak value of the server processing request.

In the above method, the determining whether to process the request according to the detection result includes:

if the count value in the window exceeds the peak value of the server processing request, rejecting a new request, and the server does not process the request; if the count value within the window does not exceed the peak value of the server processing request, the server processes the request.

The invention also provides a server overload protection device based on the sliding window, which comprises: the device comprises a setting unit, a sliding window unit, a receiving unit and a processing unit; wherein,

the setting unit is used for setting a sliding window representing a time period;

the sliding window unit is used for recording the request received by the server in the corresponding time period;

the receiving unit is used for detecting the current load state of the server according to the request received by the server recorded by the sliding window when receiving a new request;

and the processing unit is used for judging whether to process the request according to the detection result.

In the above device, the setting unit is further configured to dynamically configure the length of the time period according to different service requirements, and set the current window range according to the delay of the internet request.

In the above apparatus, the setting unit is further configured to configure the recorded request content for each window according to different service characteristics.

The invention provides a server overload protection method and device based on a sliding window, wherein the sliding window representing a time period is set, and the sliding window records a request received by a server in the corresponding time period; when the server receives a new request, the current load state of the server is detected according to the request received by the server and recorded by the sliding window, and whether the request is processed or not is judged according to the detection result, so that the influence of the burst access volume of the service on the server can be well controlled based on the overload control of the sliding window mechanism; in addition, the invention can also maintain the throughput of the server at a certain level, and avoid causing the jitter of the system throughput.

Drawings

FIG. 1 is a flow chart of a method for implementing sliding window based server overload protection according to the present invention;

FIG. 2 is an exemplary view of a sliding window of the present invention;

FIG. 3 is a schematic structural diagram of a sliding window-based server overload protection apparatus according to the present invention.

Detailed Description

The basic idea of the invention is: setting a sliding window representing a time period, wherein the sliding window records a request received by a server in the corresponding time period; and when the server receives a new request, detecting the current load state of the server according to the request received by the server and recorded by the sliding window, and judging whether to process the request according to the detection result.

The invention is further described in detail below with reference to the drawings and the specific embodiments.

The invention provides a server overload protection method based on a sliding window. Fig. 1 is a schematic flow chart of this method; as shown in fig. 1, the method comprises the following steps:

step 101, setting a sliding window representing a time period, wherein the sliding window records a request received by a server in the time period;

specifically, as shown in fig. 2, time is virtualized along the time axis indicated by the arrow in fig. 2 into a plurality of windows of equal length, each virtualized window representing a time period. The length of the time period may be dynamically configured according to different service requirements; for example, it may be 1 ms or 1 s. The shorter the time period, the more accurately and the more promptly the window describes the real load state of the server in that period.
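The virtualization of time into equal-length windows can be illustrated with a minimal sketch. The function name `window_index` and the use of integer millisecond timestamps are assumptions made for this illustration and do not come from the patent text; integers are used only to avoid floating-point rounding at window boundaries.

```python
def window_index(timestamp_ms: int, period_ms: int) -> int:
    """Return the index of the equal-length window containing timestamp_ms.

    Hypothetical helper: timestamps and the window length are integer
    milliseconds, so window k covers [k * period_ms, (k + 1) * period_ms).
    """
    if period_ms <= 0:
        raise ValueError("period_ms must be positive")
    return timestamp_ms // period_ms


# With a 1 s window, 999 ms and 1500 ms fall into different windows.
first = window_index(999, 1000)
second = window_index(1500, 1000)
```

A shorter `period_ms` yields more windows per unit of time, which corresponds to the finer-grained description of load mentioned above.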

Each window records the requests received by the server in its time period, but not every recorded window is used as a basis for detecting the load state. Internet access volumes in different time periods are only weakly correlated, and requests that the server received and processed some time ago no longer contribute to its load. The load condition of the server therefore only needs to be evaluated from the accesses in the most recent time period; that is, only one current window range needs to be maintained. The current window range is a span on the time axis that provides the basis for a single load detection, and it can be set according to the delay of internet requests. For example, if the current window range has 100 cells and each cell lasts 1 second, every load detection is based on all records of the past 100 seconds; this prevents jitter within a single window from causing jitter in the overall server throughput. As shown in fig. 2, t1 represents the time point of the last load detection; after each load detection the window slides along the time axis as t1 changes, which is why it is called a sliding window.

In an overload protection scheme based on the hardware driver layer, the overload protection granularity is the whole machine, so different services deployed on the same machine may all fail to work normally because one service is abnormal. For example, if at a given moment a disk bandwidth consuming service uses enough disk bandwidth to reach the specified upper limit, the network card stops accepting other TCP connections and the machine refuses external service, even though requests for a CPU consuming service could still be processed. Such a scheme therefore lacks sufficient flexibility.
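One possible way to maintain a current window range is a fixed-size ring of counters, sketched below. The class `WindowRange`, its method names, and the ring-buffer layout are assumptions for illustration; the patent text does not prescribe a data structure. The sketch also assumes timestamps are integer milliseconds that never go backwards.

```python
from typing import List, Optional


class WindowRange:
    """Hypothetical ring of `cells` counters, each covering `period_ms`."""

    def __init__(self, cells: int, period_ms: int) -> None:
        self.cells = cells
        self.period_ms = period_ms
        self.counts: List[int] = [0] * cells
        self.newest: Optional[int] = None  # index of the newest window seen

    def _slide(self, now_ms: int) -> None:
        idx = now_ms // self.period_ms
        if self.newest is None:
            self.newest = idx
            return
        if idx <= self.newest:
            return  # still inside the newest window
        # Clear every cell that the range slides onto (at most all of them).
        steps = min(idx - self.newest, self.cells)
        for k in range(1, steps + 1):
            self.counts[(self.newest + k) % self.cells] = 0
        self.newest = idx

    def record(self, now_ms: int, amount: int = 1) -> None:
        """Record one request (or `amount` units) in the window for now_ms."""
        self._slide(now_ms)
        self.counts[self.newest % self.cells] += amount

    def load(self, now_ms: int) -> int:
        """Sum of all records inside the current window range."""
        self._slide(now_ms)
        return sum(self.counts)


# Three 1 s cells: only the most recent three windows count toward the load.
rng = WindowRange(cells=3, period_ms=1000)
for t in (0, 500, 1200, 2500):
    rng.record(t)
```

In this sketch, records older than `cells * period_ms` drop out of `load()` as the range slides, which mirrors the idea that only the most recent period is relevant for a single load detection.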

The overload protection scheme based on the sliding window allows dynamic configuration of the window range and of the time period corresponding to a single window, and also allows the recorded request content to be configured for each window according to different service characteristics. For example, for a disk bandwidth consuming service the window can record the number of disk I/O operations in the current time period, for a CPU consuming service the window can record the accumulated number of clock cycles consumed in the current time period, and for a network bandwidth consuming service the window can record the traffic in the current time period. By flexibly configuring the recorded request content for each window, different overload protection strategies can be applied to different services, so that the service becomes the overload protection granularity.
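The per-service choice of what a window records might be sketched as follows. The names `ServiceKind`, `RequestCost`, and `recorded_amount` are hypothetical and are introduced only for this illustration; how the cost figures are measured is outside the scope of the sketch.

```python
from dataclasses import dataclass
from enum import Enum


class ServiceKind(Enum):
    """Hypothetical labels for the three service types named in the text."""
    DISK_BANDWIDTH = "disk"
    CPU = "cpu"
    NETWORK_BANDWIDTH = "network"


@dataclass(frozen=True)
class RequestCost:
    """Assumed per-request measurements; all fields are illustrative."""
    disk_ios: int = 0
    cpu_cycles: int = 0
    bytes_transferred: int = 0


def recorded_amount(kind: ServiceKind, cost: RequestCost) -> int:
    """Pick the quantity a window accumulates for the given service type."""
    if kind is ServiceKind.DISK_BANDWIDTH:
        return cost.disk_ios           # number of disk I/O operations
    if kind is ServiceKind.CPU:
        return cost.cpu_cycles         # accumulated clock cycles
    if kind is ServiceKind.NETWORK_BANDWIDTH:
        return cost.bytes_transferred  # traffic in the time period
    raise ValueError(f"unknown service kind: {kind!r}")


cost = RequestCost(disk_ios=3, cpu_cycles=120_000, bytes_transferred=4096)
disk_amount = recorded_amount(ServiceKind.DISK_BANDWIDTH, cost)
```

Because each service selects its own metric, two services on the same machine can be limited independently, which is the service-level granularity described above.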

Step 102, when the server receives a new request, detecting the current load state of the server according to the request received by the server recorded by the current sliding window, and judging whether to process the request according to the detection result;

specifically, each window may record the number of accesses within its time period, which is also the number of load state detections: because the load state is detected each time the server receives a request, the number of requests received by the server equals the number of load detections;

when the server receives a new request, it first detects its current load state, with the new request included, according to the requests recorded by the current sliding window. If the time point of the last load state detection and the time point of the current load state detection fall in the same window, the count in the window is incremented by 1. The server then judges whether the count value in the window exceeds the peak value of requests the server can process: if so, the request is rejected, that is, the request is received but not processed; if not, the server processes the request;

if the time point of the last load state detection and the time point of the current load state detection are not in the same window, the count value in the window is cleared, because the requests the server received before the current window cannot reflect its current load state. After the count value is cleared, the server likewise judges whether the count value in the window exceeds the peak value; if so, the request is rejected, and if not, the server processes the request.
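The decision procedure of step 102 can be sketched for the single-window case as below. The class name `OverloadGuard`, the parameter names, and the use of integer millisecond timestamps are assumptions for illustration. The sketch follows the wording of the text literally: the count is incremented when two consecutive detections fall in the same window, cleared otherwise, and a request is rejected only when the count exceeds the peak.

```python
from typing import Optional


class OverloadGuard:
    """Hypothetical single-window admission check following step 102."""

    def __init__(self, period_ms: int, peak: int) -> None:
        self.period_ms = period_ms
        self.peak = peak
        self.count = 0
        self.last_window: Optional[int] = None  # window of the last detection

    def admit(self, now_ms: int) -> bool:
        """Return True if the request arriving at now_ms should be processed."""
        idx = now_ms // self.period_ms
        if idx == self.last_window:
            self.count += 1   # same window as the last detection
        else:
            self.count = 0    # new window: earlier requests no longer count
        self.last_window = idx
        # Reject only when the count exceeds the configured peak.
        return self.count <= self.peak


guard = OverloadGuard(period_ms=1000, peak=2)
decisions = [guard.admit(t) for t in (0, 100, 200, 300, 400)]
after_slide = guard.admit(1000)
```

Under this literal reading, the first request of a window resets the count to zero without being counted itself, so up to `peak + 1` requests are admitted per window; an implementation that wants exactly `peak` could count the first request as 1 instead. Rejected requests still increase the count, since the text equates the count with the number of load detections.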

In the invention, when the load exceeds what the server can bear, the server resumes processing a certain number of requests normally after at most one window period. If the time period of one window is short, on the order of milliseconds, the behavior seen at the service layer is that the server rejects some of a large number of requests, and its throughput stays within a constant range instead of jittering with sudden rises and falls.

To implement the above method, the invention further provides a sliding window based server overload protection apparatus. Fig. 3 is a schematic structural diagram of this apparatus; as shown in fig. 3, the apparatus includes: a setting unit 31, a sliding window unit 32, a receiving unit 33, a processing unit 34; wherein,
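The bounded-throughput behavior described above can be checked with a small simulation. The function `admitted_per_window` and its arguments are hypothetical; it applies the same literal single-window counting rule to a list of integer arrival times and reports how many requests were processed in each window.

```python
from collections import Counter
from typing import Dict, Iterable, Optional


def admitted_per_window(
    arrivals_ms: Iterable[int], period_ms: int, peak: int
) -> Dict[int, int]:
    """Count processed requests per window for a sorted list of arrival times."""
    admitted: Counter = Counter()
    last_window: Optional[int] = None
    count = 0
    for t in arrivals_ms:
        idx = t // period_ms
        # Increment within the same window, clear when the window changes.
        count = count + 1 if idx == last_window else 0
        last_window = idx
        if count <= peak:
            admitted[idx] += 1
    return dict(admitted)


# A sustained burst: one request per millisecond across three 10 ms windows.
burst = admitted_per_window(range(30), period_ms=10, peak=4)
# A light load that never reaches the peak.
light = admitted_per_window([0, 5, 10, 15], period_ms=10, peak=4)
```

In the burst case every window admits the same number of requests and rejects the rest, so the processed volume per window stays flat; in the light case nothing is rejected.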

a setting unit 31 for setting a sliding window representing a time period;

a sliding window unit 32, configured to record a request received by the server within a corresponding time period;

a receiving unit 33, configured to detect a current load state of the server according to the request received by the server recorded in the sliding window when receiving a new request;

and the processing unit 34 is used for judging whether to process the request according to the detection result.

The setting unit 31 is further configured to dynamically configure the length of the time period according to different service requirements, and set the current window range according to the delay of the internet request.

The setting unit 31 is further configured to configure the recorded request content for each window according to different service characteristics.

The setting unit 31 sets a sliding window representing a time period as: virtualizing time into a plurality of windows of equal length, each window representing a time period;

The sliding window unit 32 records the requests received by the server in the corresponding time period as follows: the window records the requests received by the server in the corresponding time period, a current window range is maintained, a single load detection is performed according to the current window range, and after each load detection the window slides according to the change of the time point of the last load detection.

The setting unit 31 configures the recorded request content for each window according to different service characteristics as follows: for a disk bandwidth consuming service, the window records the number of disk I/O operations in the current time period; for a CPU consuming service, the window records the accumulated number of clock cycles consumed in the current time period; for a network bandwidth consuming service, the window records the traffic in the current time period.

The receiving unit 33 detects the current load state of the server according to the request received by the server recorded in the sliding window as follows: detecting the current load state of a server after a new request comes according to the request received by the server recorded by the current sliding window; if the time point of the last load state detection and the time point of the current load state detection are positioned in the same window, adding 1 to the count in the window; if the time point of the last load state detection and the time point of the current load state detection are not in the same window, clearing the count value in the window; after the count value is processed, it is further determined whether the count value in the window exceeds a peak value of the server processing request.

Correspondingly, the processing unit 34 determines whether to process the request according to the detection result as follows: if the count value in the window exceeds the peak value of the server processing request, rejecting a new request and not processing the request; if the count value within the window does not exceed the peak value of the server processing request, the request is processed.

The technical scheme of the invention can be applied to TFS, a massive distributed storage system developed in-house by the applicant. The system carries a large amount of service data, and the stability of the TFS server is crucial to the operation of the services it carries. The technical scheme helps the TFS server avoid the influence of abnormal access from front-end services, thereby improving the stability of the server.

The above description is only exemplary of the present invention and should not be taken as limiting the scope of the present invention, and any modifications, equivalents, improvements, etc. that are within the spirit and principle of the present invention should be included in the present invention.

Claims

1. A server overload protection method based on a sliding window is characterized in that,

setting a sliding window representing a time period;

when the server receives a new request, detecting the current load state of the server according to the request received by the server and recorded by the sliding window, and judging whether to process the request according to the detection result;

the detecting the current load state of the server according to the request received by the server recorded by the sliding window includes:

and judging the current load condition of the server according to the number of the requests received by the server, which is recorded in the corresponding time period by the sliding window.

2. The method of claim 1, wherein setting the sliding window representing the time period is: virtualizing time into a plurality of sliding windows of equal length, each sliding window representing a time period;

the recording, by the sliding window, of the requests received by the server in the corresponding time period is: the sliding window records the requests received by the server in the corresponding time period, a current window range is maintained, a single load detection is performed according to the current window range, and after each load detection the sliding window slides according to the change of the time point of the last load detection.

3. The method of claim 2, wherein the length of the time period is dynamically configured according to different service requirements, and the current window range is set according to a delay of an internet request.

4. A method according to claim 2 or 3, characterized in that the method further comprises: and configuring the recorded request content for each sliding window according to different service characteristics.

5. The method according to claim 4, wherein configuring the recorded request content for each sliding window according to different service characteristics is:

for a disk bandwidth consuming service, the sliding window records the number of disk I/O operations in the current time period;

for a CPU consuming service, the sliding window records the accumulated number of clock cycles consumed in the current time period;

for network bandwidth consuming services, a sliding window records the traffic over the current time period.

6. The method according to claim 1, wherein the detecting the current load status of the server according to the request received by the server recorded in the sliding window is:

detecting the current load state of a server after a new request comes according to the request received by the server recorded by the current sliding window; if the time point of the last load state detection and the time point of the current load state detection are positioned in the same sliding window, adding 1 to the count in the sliding window; if the time point of the last load state detection and the time point of the current load state detection are not in the same sliding window, clearing the count value in the sliding window; and after the count value is processed, judging whether the count value in the sliding window exceeds the peak value of the server processing request or not.

7. The method of claim 6, wherein the determining whether to process the request according to the detection result is:

if the count value in the sliding window exceeds the peak value of the server processing request, rejecting a new request, and the server does not process the request; if the count value within the sliding window does not exceed the peak value of the server processing request, the server processes the request.

8. A sliding window based server overload protection apparatus, comprising: the device comprises a setting unit, a sliding window unit, a receiving unit and a processing unit; wherein,

a setting unit for setting a sliding window representing a time period;

the processing unit is used for judging whether to process the request according to the detection result;

the method for detecting the current load state of the server by the receiving unit according to the request received by the server and recorded by the sliding window comprises the following steps: and judging the current load condition of the server according to the number of the requests received by the server, which is recorded in the corresponding time period by the sliding window.

9. The apparatus according to claim 8, wherein the setting unit is further configured to dynamically configure the length of the time period according to different service requirements, and set the current window range according to the delay of the internet request.

10. The apparatus according to claim 9, wherein the setting unit is further configured to configure the recorded requested content for each sliding window according to different service characteristics.