APM-164: Add basic overload control to arangod (#772) · arangodb/docs@480a350 · GitHub
This repository was archived by the owner on Dec 13, 2023. It is now read-only.

Commit 480a350

APM-164: Add basic overload control to arangod (#772)
1 parent fc1a35c commit 480a350

4 files changed: +128 -10 lines changed

3.9/http/general.md

Lines changed: 52 additions & 9 deletions
Original file line number | Diff line number | Diff line change
@@ -26,7 +26,7 @@ format or ArangoDB's custom [VelocyPack](https://github.com/arangodb/velocypack)
2626
binary format. Details on the expected format and JSON attributes can be found
2727
in the documentation of the individual API endpoints.
2828

29-
Clients sending requests to ArangoDB must use either HTTP 1.0, HTTP 1.1, HTTP 2
29+
Clients sending requests to ArangoDB must use either HTTP 1.1, HTTP 2
3030
or VelocyStream. Other HTTP versions or protocols are not supported by ArangoDB.
3131

3232
Clients are required to include the `Content-Length` HTTP header with the
@@ -39,19 +39,16 @@ HTTP Keep-Alive
3939
---------------
4040

4141
ArangoDB supports HTTP keep-alive. If the client does not send a `Connection`
42-
header in its request, and the client uses HTTP version 1.1, ArangoDB will assume
43-
the client wants to keep alive the connection.
42+
header in its request, ArangoDB will assume the client wants to keep alive the
43+
connection.
4444
If clients do not wish to use the keep-alive feature, they should
4545
explicitly indicate that by sending a `Connection: Close` HTTP header in
4646
the request.
4747

48-
ArangoDB will close connections automatically for clients that send requests
49-
using HTTP 1.0, except if they send an `Connection: Keep-Alive` header.
50-
5148
The default Keep-Alive timeout can be specified at server start using the
5249
`--http.keep-alive-timeout` startup option.
5350

54-
Establishing TCP connections is expensive, since it takes several ping pongs
51+
Establishing TCP connections is expensive, since it takes several roundtrips
5552
between the communication parties. Therefore you can use connection keep-alive
5653
to send several HTTP request over one TCP-connection;
5754
each request is treated independently by definition. You can use this feature
@@ -540,14 +537,13 @@ The following APIs may use request forwarding:
540537

541538
- `/_api/control_pregel`
542539
- `/_api/cursor`
543-
- `/_api/document`
544540
- `/_api/job`
545541
- `/_api/replication`
546542
- `/_api/query`
547543
- `/_api/tasks`
548544
- `/_api/transaction`
549545

550-
Note: since forwarding such requests require an additional cluster-internal HTTP
546+
Note: since forwarding such requests requires an additional cluster-internal HTTP
551547
request, they should be avoided when possible for best performance. Typically
552548
this is accomplished either by directing the requests to the correct Coordinator
553549
at a client-level or by enabling request "stickiness" on a load balancer. Since
@@ -557,3 +553,50 @@ request forwarding as a fall-back solution.
557553
Note: some endpoints which return "global" data, such as `GET /_api/tasks` will
558554
only return data corresponding to the server on which the request is executed.
559555
These endpoints will generally not work well with load-balancers.
556+
557+
Overload control
558+
----------------
559+
560+
<small>Introduced in: v3.9.0</small>
561+
562+
_arangod_ returns an `x-arango-queue-time-seconds` HTTP
563+
header with all responses. This header contains the most recent request
564+
queueing/dequeuing time (in seconds) as tracked by the server's scheduler.
565+
This value can be used by client applications and drivers to detect server
566+
overload and react on it.
567+
568+
The arangod startup option `--http.return-queue-time-header` can be set to
569+
`false` to suppress these headers in responses sent by arangod.
570+
571+
In a cluster, the value returned in the `x-arango-queue-time-seconds` header
572+
is the most recent queueing/dequeuing request time of the Coordinator the
573+
request was sent to, except if the request is forwarded by the Coordinator to
574+
another Coordinator. In that case, the value will indicate the current
575+
queueing/dequeuing time of the forwarded-to Coordinator.
576+
577+
In addition, client applications and drivers can optionally augment the
578+
requests they send to arangod with the header `x-arango-queue-time-seconds`.
579+
If set, the value of the header should contain the maximum server-side
580+
queuing time (in seconds) that the client application is willing to accept.
581+
If the header is set in an incoming request, arangod will compare the current
582+
dequeuing time from its scheduler with the maximum queue time value contained
583+
in the request header. If the current queueing time exceeds the value set
584+
in the header, arangod will reject the request and return HTTP 412
585+
(precondition failed) with the error code 21004 (queue time violated).
586+
Using a value of 0 or a non-numeric value in the header will lead to the
587+
header value being ignored by arangod.
588+
589+
There is also a metric `arangodb_scheduler_queue_time_violations_total`
590+
that is increased whenever a request is dropped because of the requested
591+
queue time not being satisfiable. Administrators can use this metric to monitor
592+
overload situations. Although all instance types will expose this metric,
593+
it will likely always be `0` on DB-Servers and agency instances because the
594+
`x-arango-queue-time-seconds` header is not used in cluster-internal requests.
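
The monitoring idea in the paragraph above could be sketched as follows. This is a hypothetical Python helper, not part of the commit: it assumes the metrics are available as Prometheus text exposition output (for example from arangod's metrics endpoint, which this diff does not name), and the function name `queue_time_violations` is invented for illustration.

```python
from typing import Optional

# Metric name taken from the documentation text above.
METRIC = "arangodb_scheduler_queue_time_violations_total"


def queue_time_violations(metrics_text: str) -> Optional[float]:
    """Sum all samples of METRIC found in Prometheus text-format output.

    Returns None if the metric does not appear at all.
    """
    total = None
    for line in metrics_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        name = line.split("{", 1)[0].split()[0]
        if name != METRIC:
            continue
        # The value follows the optional {labels}; a timestamp may follow it.
        rest = line[line.rindex("}") + 1:] if "{" in line else line[len(name):]
        value = float(rest.split()[0])
        total = value if total is None else total + value
    return total
```

An administrator's script could poll this value periodically and raise an alert when it increases between two polls, since the counter only grows when requests are rejected.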
595+
596+
In a cluster, the `x-arango-queue-time-seconds` request header will be
597+
checked on the receiving Coordinator, before any request forwarding. If the
598+
request is forwarded by the Coordinator to a different Coordinator, the
599+
receiving Coordinator will also check the header on its own.
600+
Apart from that, the header will not be included in cluster-internal requests
601+
executed by the Coordinator, e.g. when the Coordinator issues sub-requests
602+
to DB-Servers or Agency instances.
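
The header handling described in the added "Overload control" section could look roughly like this on the client side. This is a minimal Python sketch, not part of the commit: all helper names are hypothetical, and the `errorNum` attribute in the error body is an assumption based on ArangoDB's usual error response format, which this diff does not show.

```python
import json
from typing import Mapping, Optional

QUEUE_TIME_HEADER = "x-arango-queue-time-seconds"
QUEUE_TIME_VIOLATED = 21004  # error code named in the documentation text


def queue_time_from_response(headers: Mapping[str, str]) -> Optional[float]:
    """Return the queue time (seconds) reported by the server, or None."""
    for name, value in headers.items():
        if name.lower() == QUEUE_TIME_HEADER:
            try:
                return float(value)
            except ValueError:
                return None
    return None


def request_headers(max_queue_time: float) -> dict:
    """Build the optional request header for a maximum accepted queue time.

    A value of 0 is ignored by arangod according to the text, so it is
    omitted here instead of being sent.
    """
    if max_queue_time <= 0:
        return {}
    return {QUEUE_TIME_HEADER: str(max_queue_time)}


def is_queue_time_violation(status: int, body: bytes) -> bool:
    """True if the response is HTTP 412 carrying error code 21004."""
    if status != 412:
        return False
    try:
        # Assumption: the error code is in the "errorNum" attribute.
        return json.loads(body).get("errorNum") == QUEUE_TIME_VIOLATED
    except (ValueError, AttributeError):
        return False
```

A driver could merge `request_headers(...)` into each outgoing request, track `queue_time_from_response(...)` to detect rising load, and back off or retry later when `is_queue_time_violation(...)` returns true.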

3.9/programs-arangod-http.md

Lines changed: 17 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -14,6 +14,23 @@ Idle keep-alive connections will be closed by the server automatically
1414
when the timeout is reached. A keep-alive-timeout value 0 will disable the keep
1515
alive feature entirely.
1616

17+
## Queue time header
18+
19+
<small>Introduced in: v3.9.0</small>
20+
21+
`--http.return-queue-time-header`
22+
23+
If *true*, the server will return the `x-arango-queue-time-seconds` HTTP
24+
header with all responses. The value contained in this header indicates the
25+
current queueing/dequeuing time for requests in the scheduler (in seconds).
26+
Client applications and drivers can use this value to control the server
27+
load and also react on overload.
28+
29+
Setting the option to `false` will make arangod not return the HTTP header
30+
in responses.
31+
32+
The default value is *true*.
33+
1734
## Hide Product header
1835

1936
`--http.hide-product-header`
@@ -24,7 +41,6 @@ responses.
2441

2542
The default is *false*.
2643

27-
2844
## Allow method override
2945

3046
`--http.allow-method-override`

3.9/release-notes-api-changes39.md

Lines changed: 29 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -41,6 +41,35 @@ should be prepared for this feature.
4141

4242
Also see [Database Naming Conventions](data-modeling-naming-conventions-database-names.html).
4343

44+
### Overload control
45+
46+
Starting with version 3.9.0, ArangoDB returns an `x-arango-queue-time-seconds`
47+
HTTP header with all responses. This header contains the most recent request
48+
queueing/dequeuing time (in seconds) as tracked by the server's scheduler.
49+
This value can be used by client applications and drivers to detect server
50+
overload and react on it.
51+
52+
The arangod startup option `--http.return-queue-time-header` can be set to
53+
`false` to suppress these headers in responses sent by arangod.
54+
55+
In a cluster, the value returned in the `x-arango-queue-time-seconds` header
56+
is the most recent queueing/dequeuing request time of the Coordinator the
57+
request was sent to, except if the request is forwarded by the Coordinator to
58+
another Coordinator. In that case, the value will indicate the current
59+
queueing/dequeuing time of the forwarded-to Coordinator.
60+
61+
In addition, client applications and drivers can optionally augment the
62+
requests they send to arangod with the header `x-arango-queue-time-seconds`.
63+
If set, the value of the header should contain the maximum server-side
64+
queuing time (in seconds) that the client application is willing to accept.
65+
If the header is set in an incoming request, arangod will compare the current
66+
dequeuing time from its scheduler with the maximum queue time value contained
67+
in the request header. If the current queueing time exceeds the value set
68+
in the header, arangod will reject the request and return HTTP 412
69+
(precondition failed) with the error code 21004 (queue time violated).
70+
In a cluster, the `x-arango-queue-time-seconds` request header will be
71+
checked on the receiving Coordinator, before any request forwarding.
72+
4473
### Privilege changes
4574

4675
### Endpoint return value changes

3.9/release-notes-new-features39.md

Lines changed: 30 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -285,6 +285,36 @@ A pseudo log topic `"all"` was added. Setting the log level for the "all" log
285285
topic will adjust the log level for **all existing log topics**. For example,
286286
`--log.level all=debug` will set all log topics to log level "debug".
287287

288+
Overload control
289+
----------------
290+
291+
Starting with version 3.9.0, ArangoDB returns an `x-arango-queue-time-seconds`
292+
HTTP header with all responses. This header contains the most recent request
293+
queueing/dequeuing time (in seconds) as tracked by the server's scheduler.
294+
This value can be used by client applications and drivers to detect server
295+
overload and react on it.
296+
297+
The arangod startup option `--http.return-queue-time-header` can be set to
298+
`false` to suppress these headers in responses sent by arangod.
299+
300+
In a cluster, the value returned in the `x-arango-queue-time-seconds` header
301+
is the most recent queueing/dequeuing request time of the Coordinator the
302+
request was sent to, except if the request is forwarded by the Coordinator to
303+
another Coordinator. In that case, the value will indicate the current
304+
queueing/dequeuing time of the forwarded-to Coordinator.
305+
306+
In addition, client applications and drivers can optionally augment the
307+
requests they send to arangod with the header `x-arango-queue-time-seconds`.
308+
If set, the value of the header should contain the maximum server-side
309+
queuing time (in seconds) that the client application is willing to accept.
310+
If the header is set in an incoming request, arangod will compare the current
311+
dequeuing time from its scheduler with the maximum queue time value contained
312+
in the request header. If the current queueing time exceeds the value set
313+
in the header, arangod will reject the request and return HTTP 412
314+
(precondition failed) with the error code 21004 (queue time violated).
315+
In a cluster, the `x-arango-queue-time-seconds` request header will be
316+
checked on the receiving Coordinator, before any request forwarding.
317+
288318
Support info API
289319
----------------
290320
