8000 APM-164: Add basic overload control to arangod. (#14796) · arangodb/arangodb@9b6c51b · GitHub
[go: up one dir, main page]

Skip to content

Commit 9b6c51b

Browse files
jsteemannkvahed
andauthored
APM-164: Add basic overload control to arangod. (#14796)
Co-authored-by: Jan <jsteemann@users.noreply.github.com> Co-authored-by: Kaveh Vahedipour <kaveh@arangodb.com>
1 parent 724054c commit 9b6c51b

17 files changed

+362
-68
lines changed

CHANGELOG

Lines changed: 22 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,28 @@
11
devel
22
-----
33

4+
* Add basic overload control to arangod.
5+
This change adds the `x-arango-queue-time-seconds` header to all responses
6+
sent by arangod. This header contains the most recent request dequeing time
7+
(in seconds) as tracked by the scheduler. This value can be used by client
8+
applications and drivers to detect server overload and react on it.
9+
The new startup option `--http.return-queue-time-header` can be set to
10+
`false` to suppress these headers in responses sent by arangod.
11+
12+
In addition, client applications and drivers can optionally augment their
13+
requests sent to arangod with a header of the same name. If set, the
14+
value of the header should contain the maximum queuing time (in seconds)
15+
that the client is willing to accept. If the header is set in an incoming
16+
request, arangod will compare the current dequeing time from its scheduler
17+
with the maximum queue time value contained in the request. If the current
18+
dequeing time exceeds the value set in the header, arangod will reject the
19+
request and return HTTP 412 (precondition failed) with the new error code
20+
21004 (queue time violated).
21+
22+
There is also a new metric `arangodb_scheduler_queue_time_violations_total`
23+
that is increased whenever a request is dropped because of the requested
24+
queue time not satisfiable.
25+
426
* Fixed a bug for array indexes on update of documents. See BTS-548.
527

628
* Prevent some possible deadlocks under high load regarding transactions and
Lines changed: 28 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,28 @@
1+
name: arangodb_scheduler_queue_time_violations_total
2+
introducedIn: "3.9.0"
3+
help: |
4+
Number of tasks/requests dropped and not added to internal queue
5+
due to the client-specified queue time requirements not being
6+
satisfiable.
7+
unit: number
8+
type: counter
9+
category: Scheduler
10+
complexity: simple
11+
exposedBy:
12+
- coordinator
13+
- dbserver
14+
- agent
15+
- single
16+
description: |
17+
Number of tasks/requests dropped because the client-specified queue time
18+
requirements, as indicated by client applications in the request header
19+
"x-arango-queue-time-seconds" could not be satisfied by the receiving
20+
server instance. This happens when the actual time need to queue/dequeue
21+
requests on the scheduler queue exceeds the maximum time value that the
22+
client has specified in the request.
23+
Whenever this happens, the client application will get an HTTP 412 error
24+
response back with error code 21004 ("queue time violated").
25+
Although the metric is exposed on all instance types, it will very likely
26+
always be 0 on DB servers, simply because coordinators do not forward the
27+
"x-arango-queue-time-seconds" when they send internal requests to DB
28+
servers.

arangod/GeneralServer/CommTask.cpp

Lines changed: 61 additions & 27 deletions
Original file line numberDiff line numberDiff line change
@@ -65,31 +65,6 @@ inline bool startsWith(std::string const& path, char const* other) {
6565
path.compare(0, size, other, size) == 0);
6666
}
6767

68-
} // namespace
69-
70-
// -----------------------------------------------------------------------------
71-
// --SECTION-- constructors and destructors
72-
// -----------------------------------------------------------------------------
73-
74-
CommTask::CommTask(GeneralServer& server,
75-
ConnectionInfo info)
76-
: _server(server),
77-
_connectionInfo(std::move(info)),
78-
_connectionStatistics(ConnectionStatistics::acquire()),
79-
_auth(AuthenticationFeature::instance()) {
80-
TRI_ASSERT(_auth != nullptr);
81-
_connectionStatistics.SET_START();
82-
}
83-
84-
CommTask::~CommTask() {
85-
_connectionStatistics.SET_END();
86-
}
87-
88-
// -----------------------------------------------------------------------------
89-
// --SECTION-- protected methods
90-
// -----------------------------------------------------------------------------
91-
92-
namespace {
9368
TRI_vocbase_t* lookupDatabaseFromRequest(application_features::ApplicationServer& server,
9469
GeneralRequest& req) {
9570
// get database name from request
@@ -127,8 +102,53 @@ bool resolveRequestContext(application_features::ApplicationServer& server,
127102
// the "true" means the request is the owner of the context
128103
return true;
129104
}
105+
106+
bool queueTimeViolated(GeneralRequest const& req) {
107+
// check if the client sent the "x-arango-queue-time-seconds" header
108+
bool found = false;
109+
std::string const& queueTimeValue = req.header(StaticStrings::XArangoQueueTimeSeconds, found);
110+
if (found) {
111+
// yes, now parse the sent time value. if the value sent by client cannot be
112+
// parsed as a double, then it will be handled as if "0.0" was sent - i.e. no
113+
// queuing time restriction
114+
double requestedQueueTime = StringUtils::doubleDecimal(queueTimeValue);
115+
if (requestedQueueTime > 0.0) {
116+
// value is > 0.0, so now check the last dequeue time that the scheduler reported
117+
double lastDequeueTime = static_cast<double>(
118+
SchedulerFeature::SCHEDULER->getLastLowPriorityDequeueTime()) / 1000.0;
119+
120+
if (lastDequeueTime > requestedQueueTime) {
121+
// the log topic should actually be REQUESTS here, but the default log level
122+
// for the REQUESTS log topic is FATAL, so if we logged here in INFO level,
123+
// it would effectively be suppressed. thus we are using the Scheduler's
124+
// log topic here, which is somewhat related.
125+
SchedulerFeature::SCHEDULER->trackQueueTimeViolation();
126+
LOG_TOPIC("1bbcc", WARN, Logger::THREADS)
127+
<< "dropping incoming request because the client-specified maximum queue time requirement ("
128+
<< requestedQueueTime << "s) would be violated by current queue time (" << lastDequeueTime << "s)";
129+
return true;
130+
}
131+
}
132+
}
133+
return false;
134+
}
135+
130136
} // namespace
131137

138+
CommTask::CommTask(GeneralServer& server,
139+
ConnectionInfo info)
140+
: _server(server),
141+
_connectionInfo(std::move(info)),
142+
_connectionStatistics(ConnectionStatistics::acquire()),
143+
_auth(AuthenticationFeature::instance()) {
144+
TRI_ASSERT(_auth != nullptr);
145+
_connectionStatistics.SET_START();
146+
}
147+
148+
CommTask::~CommTask() {
149+
_connectionStatistics.SET_END();
150+
}
151+
132152
/// Must be called before calling executeRequest, will send an error
133153
/// response if execution is supposed to be aborted
134154

@@ -311,6 +331,12 @@ void CommTask::finishExecution(GeneralResponse& res, std::string const& origin)
311331
// use "IfNotSet" to not overwrite an existing response header
312332
res.setHeaderNCIfNotSet(StaticStrings::XContentTypeOptions, StaticStrings::NoSniff);
313333
}
334+
335+
// add "x-arango-queue-time-seconds" header
336+
if (_server.server().getFeature<GeneralServerFeature>().returnQueueTimeHeader()) {
337+
res.setHeaderNC(StaticStrings::XArangoQueueTimeSeconds,
338+
std::to_string(static_cast<double>(SchedulerFeature::SCHEDULER->getLastLowPriorityDequeueTime()) / 1000.0));
339+
}
314340
}
315341

316342
/// Push this request into the execution pipeline
@@ -336,8 +362,17 @@ void CommTask::executeRequest(std::unique_ptr<GeneralRequest> request,
336362
LOG_TOPIC("2cece", WARN, Logger::REQUESTS)
337363
<< "could not find corresponding request/response";
338364
}
339-
365+
340366
rest::ContentType const respType = request->contentTypeResponse();
367+
368+
// check if "x-arango-queue-time-seconds" header was set, and its value
369+
// is above the current dequeing time
370+
if (::queueTimeViolated(*request)) {
371+
sendErrorResponse(rest::ResponseCode::PRECONDITION_FAILED,
372+
respType, messageId, TRI_ERROR_QUEUE_TIME_REQUIREMENT_VIOLATED);
373+
return;
374+
}
375+
341376
// create a handler, this takes ownership of request and response
342377
auto& server = _server.server();
343378
auto& factory = server.getFeature<GeneralServerFeature>().handlerFactory();
@@ -351,7 +386,6 @@ void CommTask::executeRequest(std::unique_ptr<GeneralRequest> request,
351386
VPackBuffer<uint8_t>());
352387
return;
353388
}
354-
355389
// forward to correct server if necessary
356390
bool forwarded;
357391
auto res = handler->forwardRequest(forwarded);

arangod/GeneralServer/GeneralServerFeature.cpp

Lines changed: 14 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -139,6 +139,7 @@ GeneralServerFeature::GeneralServerFeature(application_features::ApplicationServ
139139
: ApplicationFeature(server, "GeneralServer"),
140140
_allowMethodOverride(false),
141141
_proxyCheck(true),
142+
_returnQueueTimeHeader(true),
142143
_permanentRootRedirect(true),
143144
_redirectRootTo("/_admin/aardvark/index.html"),
144145
_supportInfoApiPolicy("hardened"),
@@ -219,6 +220,11 @@ void GeneralServerFeature::collectOptions(std::shared_ptr<ProgramOptions> option
219220
"if true, use a permanent redirect. If false, use a temporary",
220221
new BooleanParameter(&_permanentRootRedirect))
221222
.setIntroducedIn(30712);
223+
224+
options->addOption("--http.return-queue-time-header",
225+
"if true, return the 'x-arango-queue-time-seconds' response header",
226+
new BooleanParameter(&_returnQueueTimeHeader))
227+
.setIntroducedIn(30900);
222228

223229
options->addOption("--frontend.proxy-request-check",
224230
"enable proxy request checking",
@@ -282,9 +288,8 @@ void GeneralServerFeature::prepare() {
282288
}
283289

284290
void GeneralServerFeature::start() {
285-
_jobManager.reset(new AsyncJobManager);
286-
287-
_handlerFactory.reset(new RestHandlerFactory());
291+
_jobManager = std::make_unique<AsyncJobManager>();
292+
_handlerFactory = std::make_unique<RestHandlerFactory>();
288293

289294
defineHandlers();
290295
buildServers();
@@ -321,17 +326,19 @@ void GeneralServerFeature::unprepare() {
321326
_jobManager.reset();
322327
}
323328

324-
double GeneralServerFeature::keepAliveTimeout() const {
329+
double GeneralServerFeature::keepAliveTimeout() const noexcept {
325330
return _keepAliveTimeout;
326331
}
327332

328-
bool GeneralServerFeature::proxyCheck() const { return _proxyCheck; }
333+
bool GeneralServerFeature::proxyCheck() const noexcept { return _proxyCheck; }
334+
335+
bool GeneralServerFeature::returnQueueTimeHeader() const noexcept { return _returnQueueTimeHeader; }
329336

330337
std::vector<std::string> GeneralServerFeature::trustedProxies() const {
331338
return _trustedProxies;
332339
}
333340

334-
bool GeneralServerFeature::allowMethodOverride() const {
341+
bool GeneralServerFeature::allowMethodOverride() const noexcept {
335342
return _allowMethodOverride;
336343
}
337344

@@ -350,7 +357,7 @@ Result GeneralServerFeature::reloadTLS() { // reload TLS data from disk
350357
return res;
351358
}
352359

353-
bool GeneralServerFeature::permanentRootRedirect() const {
360+
bool GeneralServerFeature::permanentRootRedirect() const noexcept {
354361
return _permanentRootRedirect;
355362
}
356363

arangod/GeneralServer/GeneralServerFeature.h

Lines changed: 6 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -45,13 +45,14 @@ class GeneralServerFeature final : public application_features::ApplicationFeatu
4545
void stop() override final;
4646
void unprepare() override final;
4747

48-
double keepAliveTimeout() const;
49-
bool proxyCheck() const;
48+
double keepAliveTimeout() const noexcept;
49+
bool proxyCheck() const noexcept ;
50+
bool returnQueueTimeHeader() const noexcept;
5051
std::vector<std::string> trustedProxies() const;
51-
bool allowMethodOverride() const;
52+
bool allowMethodOverride() const noexcept;
5253
std::vector<std::string> const& accessControlAllowOrigins() const;
5354
Result reloadTLS();
54-
bool permanentRootRedirect() const;
55+
bool permanentRootRedirect() const noexcept;
5556
std::string redirectRootTo() const;
5657
std::string const& supportInfoApiPolicy() const noexcept;
5758

@@ -87,6 +88,7 @@ class GeneralServerFeature final : public application_features::ApplicationFeatu
8788
double _keepAliveTimeout = 300.0;
8889
bool _allowMethodOverride;
8990
bool _proxyCheck;
91+
bool _returnQueueTimeHeader;
9092
bool _permanentRootRedirect;
9193
std::vector<std::string> _trustedProxies;
9294
std::vector<std::string> _accessControlAllowOrigins;

arangod/Scheduler/Scheduler.h

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -233,6 +233,9 @@ class Scheduler {
233233

234234
virtual void toVelocyPack(velocypack::Builder&) const = 0;
235235
virtual QueueStatistics queueStatistics() const = 0;
236+
237+
/// @brief returns the last stored dequeue time [ms]
238+
virtual uint64_t getLastLowPriorityDequeueTime() const noexcept = 0;
236239

237240
/// @brief approximate fill grade of the scheduler's queue (in %)
238241
virtual double approximateQueueFillGrade() const = 0;

arangod/Scheduler/SupervisedScheduler.cpp

Lines changed: 26 additions & 9 deletions
Original file line numberDiff line numberDiff line change
@@ -183,6 +183,8 @@ DECLARE_GAUGE(
183183
"Total number of ongoing RestHandlers coming from the low prio queue");
184184
DECLARE_COUNTER(arangodb_scheduler_queue_full_failures_total,
185185
"Tasks dropped and not added to internal queue");
186+
DECLARE_COUNTER(arangodb_scheduler_queue_time_violations_total,
187+
"Tasks dropped because the client-requested queue time restriction would be violated");
186188
DECLARE_GAUGE(arangodb_scheduler_queue_length, uint64_t,
187189
"Server's internal queue length");
188190
DECLARE_COUNTER(arangodb_scheduler_threads_started_total,
@@ -238,6 +240,8 @@ SupervisedScheduler::SupervisedScheduler(application_features::ApplicationServer
238240
arangodb_scheduler_threads_stopped_total{})),
239241
_metricsQueueFull(server.getFeature<arangodb::MetricsFeature>().add(
240242
arangodb_scheduler_queue_full_failures_total{})),
243+
_metricsQueueTimeViolations(server.getFeature<arangodb::MetricsFeature>().add(
244+
arangodb_scheduler_queue_time_violations_total{})),
241245
_ongoingLowPriorityGauge(_server.getFeature<arangodb::MetricsFeature>().add(
242246
arangodb_scheduler_ongoing_low_prio{})),
243247
_metricsLastLowPriorityDequeueTime(
@@ -766,7 +770,8 @@ std::unique_ptr<SupervisedScheduler::WorkItemBase> SupervisedScheduler::getWork(
766770
return nullptr;
767771
};
768772

769-
// how often did we check for new work without success
773+
// how often did we check for new work without success (note: this counter
774+
// is used only to reduce the "last dequeue time" metric in case of inactivity)
770775
uint64_t iterations = 0;
771776
uint64_t maxCheckedQueue = 0;
772777

@@ -822,13 +827,16 @@ std::unique_ptr<SupervisedScheduler::WorkItemBase> SupervisedScheduler::getWork(
822827
}
823828

824829
// nothing to do for a long time, but the previously stored dequeue time
825-
// is still set to something > 5ms (note: we use 5 here because a deque time
826-
// of > 0ms is not very unlikely for any request)
827-
if (maxCheckedQueue == LowPriorityQueue &&
828-
iterations >= 10 &&
829-
_metricsLastLowPriorityDequeueTime.load(std::memory_order_relaxed) > 5) {
830-
// set the dequeue time back to 0.
831-
setLastLowPriorityDequeueTime(0);
830+
// is still set to something > 0ms.
831+
// now gradually decrease the stored dequeue time, so that in a period
832+
// of inactivity the dequeue time smoothly goes down back to 0, but not
833+
// abruptly
834+
if (maxCheckedQueue == LowPriorityQueue && iterations % 4 == 0) {
835+
auto old = _metricsLastLowPriorityDequeueTime.load(std::memory_order_relaxed);
836+
if (old > 0) {
837+
// reduce dequeue time to 66%
838+
setLastLowPriorityDequeueTime((old * 2) / 3);
839+
}
832840
}
833841

834842
if (state->_sleepTimeout_ms == 0) {
@@ -930,7 +938,7 @@ SupervisedScheduler::WorkerState::WorkerState(SupervisedScheduler& scheduler)
930938
_sleeping(false),
931939
_ready(false),
932940
_lastJobStarted(clock::now()),
933-
_thread(new SupervisedSchedulerWorkerThread(scheduler._server, scheduler)) {}
941+
_thread(std::make_unique<SupervisedSchedulerWorkerThread>(scheduler._server, scheduler)) {}
934942

935943
bool SupervisedScheduler::WorkerState::start() { return _thread->start(); }
936944

@@ -977,6 +985,15 @@ void SupervisedScheduler::trackEndOngoingLowPriorityTask() {
977985
}
978986
}
979987

988+
void SupervisedScheduler::trackQueueTimeViolation() {
989+
++_metricsQueueTimeViolations;
990+
}
991+
992+
/// @brief returns the last stored dequeue time [ms]
993+
uint64_t SupervisedScheduler::getLastLowPriorityDequeueTime() const noexcept {
994+
return _metricsLastLowPriorityDequeueTime.load();
995+
}
996+
980997
void SupervisedScheduler::setLastLowPriorityDequeueTime(uint64_t time) noexcept {
981998
// update only probabilistically, in order to reduce contention on the gauge
982999
if ((_sharedPRNG.rand() & 7) == 0) {

arangod/Scheduler/SupervisedScheduler.h

Lines changed: 6 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -58,6 +58,11 @@ class SupervisedScheduler final : public Scheduler {
5858
void trackBeginOngoingLowPriorityTask();
5959
void trackEndOngoingLowPriorityTask();
6060

61+
void trackQueueTimeViolation();
62+
63+
/// @brief returns the last stored dequeue time [ms]
64+
uint64_t getLastLowPriorityDequeueTime() const noexcept override;
65+
6166
/// @brief set the time it took for the last low prio item to be dequeued
6267
/// (time between queuing and dequeing) [ms]
6368
void setLastLowPriorityDequeueTime(uint64_t time) noexcept;
@@ -221,6 +226,7 @@ class SupervisedScheduler final : public Scheduler {
221226
Counter& _metricsThreadsStarted;
222227
Counter& _metricsThreadsStopped;
223228
Counter& _metricsQueueFull;
229+
Counter& _metricsQueueTimeViolations;
224230
Gauge<uint64_t>& _ongoingLowPriorityGauge;
225231

226232
/// @brief amount of time it took for the last low prio item to be dequeued

js/common/bootstrap/errors.js

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -345,10 +345,10 @@
345345
"ERROR_AGENCY_CANNOT_REBUILD_DBS" : { "code" : 20021, "message" : "Cannot rebuild readDB and spearHead" },
346346
"ERROR_AGENCY_MALFORMED_TRANSACTION" : { "code" : 20030, "message" : "malformed agency transaction" },
347347
"ERROR_SUPERVISION_GENERAL_FAILURE" : { "code" : 20501, "message" : "general supervision failure" },
348-
"ERROR_QUEUE_FULL" : { "code" : 21003, "message" : "named queue is full" },
348+
"ERROR_QUEUE_FULL" : { "code" : 21003, "message" : "queue is full" },
349+
"ERROR_QUEUE_TIME_REQUIREMENT_VIOLATED" : { "code" : 21004, "message" : "queue time violated" },
349350
"ERROR_ACTION_OPERATION_UNABORTABLE" : { "code" : 6002, "message" : "this maintenance action cannot be stopped" },
350351
"ERROR_ACTION_UNFINISHED" : { "code" : 6003, "message" : "maintenance action still processing" },
351-
"ERROR_NO_SUCH_ACTION" : { "code" : 6004, "message" : "no such maintenance action" },
352352
"ERROR_HOT_BACKUP_INTERNAL" : { "code" : 7001, "message" : "internal hot backup error" },
353353
"ERROR_HOT_RESTORE_INTERNAL" : { "code" : 7002, "message" : "internal hot restore error" },
354354
"ERROR_BACKUP_TOPOLOGY" : { "code" : 7003, "message" : "backup does not match this topology" },

0 commit comments

Comments
 (0)
0