APM-164: Add basic overload control to arangod by jsteemann · Pull Request #772 · arangodb/docs · GitHub

This repository was archived by the owner on Dec 13, 2023. It is now read-only.

APM-164: Add basic overload control to arangod #772

Merged 4 commits on Sep 27, 2021

61 changes: 52 additions & 9 deletions 3.9/http/general.md
@@ -26,7 +26,7 @@ format or ArangoDB's custom [VelocyPack](https://github.com/arangodb/velocypack)
binary format. Details on the expected format and JSON attributes can be found
in the documentation of the individual API endpoints.

-Clients sending requests to ArangoDB must use either HTTP 1.0, HTTP 1.1, HTTP 2
+Clients sending requests to ArangoDB must use either HTTP 1.1, HTTP 2
or VelocyStream. Other HTTP versions or protocols are not supported by ArangoDB.

Clients are required to include the `Content-Length` HTTP header with the
@@ -39,19 +39,16 @@ HTTP Keep-Alive
---------------

ArangoDB supports HTTP keep-alive. If the client does not send a `Connection`
-header in its request, and the client uses HTTP version 1.1, ArangoDB will assume
-the client wants to keep alive the connection.
+header in its request, ArangoDB will assume the client wants to keep alive the
+connection.
If clients do not wish to use the keep-alive feature, they should
explicitly indicate that by sending a `Connection: Close` HTTP header in
the request.

-ArangoDB will close connections automatically for clients that send requests
-using HTTP 1.0, except if they send an `Connection: Keep-Alive` header.

The default Keep-Alive timeout can be specified at server start using the
`--http.keep-alive-timeout` startup option.

-Establishing TCP connections is expensive, since it takes several ping pongs
+Establishing TCP connections is expensive, since it takes several roundtrips
between the communication parties. Therefore you can use connection keep-alive
to send several HTTP requests over one TCP connection;
each request is treated independently by definition. You can use this feature
@@ -540,14 +537,13 @@ The following APIs may use request forwarding:

- `/_api/control_pregel`
- `/_api/cursor`
-- `/_api/document`
- `/_api/job`
- `/_api/replication`
- `/_api/query`
- `/_api/tasks`
- `/_api/transaction`

-Note: since forwarding such requests require an additional cluster-internal HTTP
+Note: since forwarding such requests requires an additional cluster-internal HTTP
request, they should be avoided when possible for best performance. Typically
this is accomplished either by directing the requests to the correct Coordinator
at a client-level or by enabling request "stickiness" on a load balancer. Since
@@ -557,3 +553,50 @@ request forwarding as a fall-back solution.
Note: some endpoints which return "global" data, such as `GET /_api/tasks`, will
only return data corresponding to the server on which the request is executed.
These endpoints will generally not work well with load-balancers.

Overload control
----------------

<small>Introduced in: v3.9.0</small>

_arangod_ returns an `x-arango-queue-time-seconds` HTTP
header with all responses. This header contains the most recent request
queueing/dequeuing time (in seconds) as tracked by the server's scheduler.
This value can be used by client applications and drivers to detect server
overload and react to it.

The arangod startup option `--http.return-queue-time-header` can be set to
`false` to suppress these headers in responses sent by arangod.
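
As a minimal sketch of how a client might read this header, the following Python helper extracts the queue time from a response's headers and compares it against a threshold. The function names and the 1.0 second default threshold are hypothetical and chosen for illustration only. Because the header can be suppressed with `--http.return-queue-time-header`, the sketch treats a missing header as "no information" rather than as an error.

```python
from typing import Mapping, Optional

QUEUE_TIME_HEADER = "x-arango-queue-time-seconds"


def queue_time_from_headers(headers: Mapping[str, str]) -> Optional[float]:
    """Return the reported queue time in seconds, or None if it is absent or unparseable."""
    for name, value in headers.items():
        # HTTP header names are case-insensitive, so compare in lower case.
        if name.lower() == QUEUE_TIME_HEADER:
            try:
                return float(value)
            except (TypeError, ValueError):
                return None
    return None


def looks_overloaded(headers: Mapping[str, str], threshold_seconds: float = 1.0) -> bool:
    """Hypothetical policy: treat the server as overloaded above the given threshold."""
    queue_time = queue_time_from_headers(headers)
    return queue_time is not None and queue_time > threshold_seconds
```

A driver could call `looks_overloaded(response_headers)` after each response and, for example, delay or shed its own low-priority requests when it returns `True`.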

In a cluster, the value returned in the `x-arango-queue-time-seconds` header
is the most recent queueing/dequeuing request time of the Coordinator the
request was sent to, except if the request is forwarded by the Coordinator to
another Coordinator. In that case, the value will indicate the current
queueing/dequeuing time of the forwarded-to Coordinator.

In addition, client applications and drivers can optionally augment the
requests they send to arangod with the header `x-arango-queue-time-seconds`.
If set, the value of the header should contain the maximum server-side
queueing time (in seconds) that the client application is willing to accept.
If the header is set in an incoming request, arangod will compare the current
dequeuing time from its scheduler with the maximum queue time value contained
in the request header. If the current queueing time exceeds the value set
in the header, arangod will reject the request and return HTTP 412
(precondition failed) with the error code 21004 (queue time violated).
Using a value of 0 or a non-numeric value in the header will lead to the
header value being ignored by arangod.
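
The request side described above can be sketched in Python as follows. The helper names are hypothetical. The sketch also assumes that the error code 21004 appears in the `errorNum` attribute of a JSON error body, as in ArangoDB's usual error responses; this is not stated in the text above.

```python
import json
from typing import Dict, Optional

QUEUE_TIME_HEADER = "x-arango-queue-time-seconds"
HTTP_PRECONDITION_FAILED = 412
ERROR_QUEUE_TIME_VIOLATED = 21004


def request_headers(max_queue_time_seconds: Optional[float]) -> Dict[str, str]:
    """Build the extra request header; omit it when no positive limit is given.

    A value of 0 would be ignored by the server anyway, so it is not sent.
    """
    if max_queue_time_seconds is None or max_queue_time_seconds <= 0:
        return {}
    return {QUEUE_TIME_HEADER: str(max_queue_time_seconds)}


def is_queue_time_violation(status: int, body: bytes) -> bool:
    """Return True if the response is HTTP 412 with error code 21004.

    Assumption: the error code is carried in the "errorNum" attribute
    of a JSON response body.
    """
    if status != HTTP_PRECONDITION_FAILED:
        return False
    try:
        payload = json.loads(body)
    except ValueError:
        return False
    return isinstance(payload, dict) and payload.get("errorNum") == ERROR_QUEUE_TIME_VIOLATED
```

Checking the error code in addition to the status code lets a client tell a queue time rejection apart from other HTTP 412 responses.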

There is also a metric `arangodb_scheduler_queue_time_violations_total`
that is incremented whenever a request is dropped because the requested
queue time cannot be satisfied. Administrators can use this metric to monitor
overload situations. Although all instance types expose this metric,
it will likely always be `0` on DB-Servers and Agency instances because the
`x-arango-queue-time-seconds` header is not used in cluster-internal requests.
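
For monitoring, a script could extract this counter from the server's metrics output. The sketch below assumes the metrics are available as Prometheus-style text (one `name{labels} value` sample per line) and that the text has already been fetched; how it is fetched is left out. The function name and the sample label used in the usage note are hypothetical.

```python
from typing import Optional

METRIC_NAME = "arangodb_scheduler_queue_time_violations_total"


def queue_time_violations(metrics_text: str) -> Optional[float]:
    """Return the first sample value of the violations counter, or None if not found."""
    for line in metrics_text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        if not line.startswith(METRIC_NAME):
            continue
        rest = line[len(METRIC_NAME):]
        if rest.startswith("{"):
            # Drop the label set, e.g. {role="..."}.
            rest = rest[rest.rfind("}") + 1:]
        elif not rest.startswith(" "):
            # A different metric that merely shares the prefix.
            continue
        fields = rest.split()
        if not fields:
            return None
        try:
            return float(fields[0])
        except ValueError:
            return None
    return None
```

For example, given a line such as `arangodb_scheduler_queue_time_violations_total{role="COORDINATOR"} 3` (a made-up sample), the function returns `3.0`. Comparing successive readings shows whether requests are currently being rejected.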

In a cluster, the `x-arango-queue-time-seconds` request header will be
checked on the receiving Coordinator, before any request forwarding. If the
request is forwarded by the Coordinator to a different Coordinator, the
receiving Coordinator will also check the header on its own.
Apart from that, the header will not be included in cluster-internal requests
executed by the Coordinator, e.g. when the Coordinator issues sub-requests
to DB-Servers or Agency instances.
18 changes: 17 additions & 1 deletion 3.9/programs-arangod-http.md
@@ -14,6 +14,23 @@ Idle keep-alive connections will be closed by the server automatically
when the timeout is reached. A keep-alive-timeout value of 0 will disable the
keep-alive feature entirely.

## Queue time header

<small>Introduced in: v3.9.0</small>

`--http.return-queue-time-header`

If *true*, the server will return the `x-arango-queue-time-seconds` HTTP
header with all responses. The value contained in this header indicates the
current queueing/dequeuing time for requests in the scheduler (in seconds).
Client applications and drivers can use this value to control the server
load and also react to overload.

Setting the option to `false` will make arangod not return the HTTP header
in responses.

The default value is *true*.

## Hide Product header

`--http.hide-product-header`
@@ -24,7 +41,6 @@ responses.

The default is *false*.


## Allow method override

`--http.allow-method-override`
29 changes: 29 additions & 0 deletions 3.9/release-notes-api-changes39.md
@@ -41,6 +41,35 @@ should be prepared for this feature.

Also see [Database Naming Conventions](data-modeling-naming-conventions-database-names.html).

### Overload control

Starting with version 3.9.0, ArangoDB returns an `x-arango-queue-time-seconds`
HTTP header with all responses. This header contains the most recent request
queueing/dequeuing time (in seconds) as tracked by the server's scheduler.
This value can be used by client applications and drivers to detect server
overload and react to it.

The arangod startup option `--http.return-queue-time-header` can be set to
`false` to suppress these headers in responses sent by arangod.

In a cluster, the value returned in the `x-arango-queue-time-seconds` header
is the most recent queueing/dequeuing request time of the Coordinator the
request was sent to, except if the request is forwarded by the Coordinator to
another Coordinator. In that case, the value will indicate the current
queueing/dequeuing time of the forwarded-to Coordinator.

In addition, client applications and drivers can optionally augment the
requests they send to arangod with the header `x-arango-queue-time-seconds`.
If set, the value of the header should contain the maximum server-side
queueing time (in seconds) that the client application is willing to accept.
If the header is set in an incoming request, arangod will compare the current
dequeuing time from its scheduler with the maximum queue time value contained
in the request header. If the current queueing time exceeds the value set
in the header, arangod will reject the request and return HTTP 412
(precondition failed) with the error code 21004 (queue time violated).
In a cluster, the `x-arango-queue-time-seconds` request header will be
checked on the receiving Coordinator, before any request forwarding.

### Privilege changes

### Endpoint return value changes
30 changes: 30 additions & 0 deletions 3.9/release-notes-new-features39.md
@@ -271,6 +271,36 @@ A pseudo log topic `"all"` was added. Setting the log level for the "all" log
topic will adjust the log level for **all existing log topics**. For example,
`--log.level all=debug` will set all log topics to log level "debug".

Overload control
----------------

Starting with version 3.9.0, ArangoDB returns an `x-arango-queue-time-seconds`
HTTP header with all responses. This header contains the most recent request
queueing/dequeuing time (in seconds) as tracked by the server's scheduler.
This value can be used by client applications and drivers to detect server
overload and react to it.

The arangod startup option `--http.return-queue-time-header` can be set to
`false` to suppress these headers in responses sent by arangod.

In a cluster, the value returned in the `x-arango-queue-time-seconds` header
is the most recent queueing/dequeuing request time of the Coordinator the
request was sent to, except if the request is forwarded by the Coordinator to
another Coordinator. In that case, the value will indicate the current
queueing/dequeuing time of the forwarded-to Coordinator.

In addition, client applications and drivers can optionally augment the
requests they send to arangod with the header `x-arango-queue-time-seconds`.
If set, the value of the header should contain the maximum server-side
queueing time (in seconds) that the client application is willing to accept.
If the header is set in an incoming request, arangod will compare the current
dequeuing time from its scheduler with the maximum queue time value contained
in the request header. If the current queueing time exceeds the value set
in the header, arangod will reject the request and return HTTP 412
(precondition failed) with the error code 21004 (queue time violated).
In a cluster, the `x-arango-queue-time-seconds` request header will be
checked on the receiving Coordinator, before any request forwarding.
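
One possible client-side reaction to such a rejection is to retry with exponential backoff. The Python sketch below is an illustration under stated assumptions, not a recommended policy: `send` is a hypothetical callable that performs one request and returns the HTTP status code, and the attempt count and delays are arbitrary example values.

```python
import time
from typing import Callable


def send_with_backoff(
    send: Callable[[], int],
    max_attempts: int = 4,
    base_delay: float = 0.1,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call send() until it returns something other than 412 or attempts run out.

    The delay doubles after each rejected attempt: base_delay, 2 * base_delay, ...
    Returns the status code of the last attempt.
    """
    status = send()
    for attempt in range(1, max_attempts):
        if status != 412:
            break
        sleep(base_delay * 2 ** (attempt - 1))
        status = send()
    return status
```

The sketch retries on any HTTP 412 for brevity. A real client would probably also check for error code 21004 before retrying, since other preconditions can produce the same status code.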

Support info API
----------------
