8000 Feature/aql subquery execute parallel gather 2 (#11221) · arangodb/arangodb@4794ebe · GitHub
[go: up one dir, main page]

Skip to content

Commit 4794ebe

Browse files
goedderzmchackiMarkus Pfeiffer
authored
Feature/aql subquery execute parallel gather 2 (#11221)
* Fixed range-handling for Modification Executors * DataRange handling in ModificationExecutor * Honor batch-size defined by UpstreamExecutor * Fixed compile issue * More fixes in modification * Remvoed log devel * Fixed profiler Test. for NoResults node we cahnge the behaviour * Activated getSome failure tests in ExecuteRestHandler * Fixed skipping in Index * Let the MultiDependencySingleROwFetcher return the correct states. * Fixed non-maintainer compilation * Attempt to fix windows compile issue * Fixed the non-maintainer compile ina different way * Added API in MultiAqlItemBlockInputRange to get Number of dependencies * Comments * Savepoint commit, does not compile, but no harm is done. Will start breaking things now * Another savepoint commit. does not compile, yet. * First draft of new Style SortingGather not yet implemented: Parallelism this needs to be handled in ExecutionBlockImpl now. * Allow waiting within old-style subquery * Fixed invalid skipRwos in unsorted gather * First draft of ParallelUnsortedGatherExecutor * Removed unused local variables * Added some Assertions in MultiAqlItemBlockInputRange * Initialize dependdencies of MultiDependencyFetcher * Fixed skipRows loop in UnsortingGatherNode * Fixed return state of GatherNode * Added an assertion before accessing a vectir unbounded * Fixed uninitialized member in DistributeExecutor * Fixed use before vector initialization in SortingGather * Fixed uninitialized dependencies in MultiDepRowFetcher * First step towards parallel Aql * Fixed an assertion * Fixed upstream skipping in ParallelUnsortedGather * [WIP] Changed Api for MultiDepExecutors in ExecBlockImpl (not yet in the actual executors) * Moved AqlCallSet into a separate file * Changed SortingGather to use the new API * Changed ParallelUnsortedGather to use the new API * Changed UnsortedGather to use the new API * Moved AqlCall operator<< into .cpp file * Implement operator<< for AqlCallSet * Fix boolean mix-up * Fixed state machine: go to UPSTREAM when the AqlCallSet is not empty * Fixed assertion * Bugfix * SortingGather bugfixes * Added init() method to fix an assertion and cleanup * Removed unused variable * Fixed constrained sort * Fixed constrained sort #2 * Fix boolean mix-up * Remove old interface * Use call parameter for upstream request in produceRows * Remove more old interface code * Add skip methods to MultiAqlItemBlockInputRange * Skip in UnsortedGather * skip for UnsortedGather * Fix skip and upstream calls for UnsortedGather * skipRowsRange change * Remove useless comments * Moved multi-dep related code from ExeBlockImpl to MultiFetcher * Cleanup in SortingGather, implemented parallel fullCount * Try to fix a windows compile error * Simplify and extend skipRowsRange for UnsortedGatherExecutor * Made ParallelUnsortedGather actually parallel * Removed erroneous assertion * Undid erroneous change * Fixed MacOs compile. Also disabled tests for non-relevant AqlCallStacks. They will b 10000 e removed * Fixed initialize Cursor for multi dependency blocks * Fixed fullCount case in parallel unsorted gather * Fixed fullCount upstream call of ParallelUnsortedGatherExecutor * Fixed fullCount in SortingGather * Windows \o/ if you cannot work properly with constexpr and static asserts, we do not let you do it! * Do not advance in Unsorted gather if there are still rows to skip * Add more comparison operators for AqlCall limits * Send clientCall limits to upstream in SortingGather * Improved fullCount in SortingGatherExectur * Disabled a cluster profile test. We now ask the RemoteNode more often if it already has data. It is a bit unclear to me if this is now better performance wise (<< i think so) or triggers undesired side effects * Helpless attempt to work around issues in stonage Visual Studio Compiler we are using. * Clearly adding an operator on a well defined type causes ambigousness on random basic types using the operator Co-authored-by: Michael Hackstein <michael@arangodb.com> Co-authored-by: Markus Pfeiffer <markus@arangodb.com>
1 parent fd763cb commit 4794ebe

25 files changed

+803
-466
lines changed

arangod/Aql/AqlCall.cpp

Lines changed: 17 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -30,6 +30,7 @@
3030
#include <velocypack/Collection.h>
3131
#include <velocypack/Slice.h>
3232

33+
#include <iostream>
3334
#include <map>
3435
#include <string_view>
3536

@@ -279,3 +280,19 @@ auto AqlCall::toString() const -> std::string {
279280
stream << *this;
280281
return stream.str();
281282
}
283+
284+
auto aql::operator<<(std::ostream& out, AqlCall::Limit const& limit) -> std::ostream& {
285+
return std::visit(overload{[&out](size_t const& i) -> std::ostream& {
286+
return out << i;
287+
},
288+
[&out](AqlCall::Infinity const&) -> std::ostream& {
289+
return out << "unlimited";
290+
}},
291+
limit);
292+
}
293+
294+
auto aql::operator<<(std::ostream& out, AqlCall const& call) -> std::ostream& {
295+
return out << "{ skip: " << call.getOffset() << ", softLimit: " << call.softLimit
296+
<< ", hardLimit: " << call.hardLimit
297+
<< ", fullCount: " << std::boolalpha << call.fullCount << " }";
298+
}

arangod/Aql/AqlCall.h

Lines changed: 21 additions & 19 deletions
Original file line numberDiff line numberDiff line change
@@ -28,6 +28,7 @@
2828
#include "Cluster/ResultT.h"
2929

3030
#include <cstddef>
31+
#include <iosfwd>
3132
#include <variant>
3233

3334
namespace arangodb::velocypack {
@@ -50,14 +51,14 @@ struct AqlCall {
5051

5152
AqlCall() = default;
5253
// Replacements for struct initialization
53-
explicit AqlCall(size_t offset, Limit softLimit = Infinity{},
54-
Limit hardLimit = Infinity{}, bool fullCount = false)
54+
explicit constexpr AqlCall(size_t offset, Limit softLimit = Infinity{},
55+
Limit hardLimit = 9E7A Infinity{}, bool fullCount = false)
5556
: offset{offset}, softLimit{softLimit}, hardLimit{hardLimit}, fullCount{fullCount} {}
5657

5758
enum class LimitType { SOFT, HARD };
58-
AqlCall(size_t offset, bool fullCount, Infinity)
59+
constexpr AqlCall(size_t offset, bool fullCount, Infinity)
5960
: offset{offset}, softLimit{Infinity{}}, hardLimit{Infinity{}}, fullCount{fullCount} {}
60-
AqlCall(size_t offset, bool fullCount, size_t limit, LimitType limitType)
61+
constexpr AqlCall(size_t offset, bool fullCount, size_t limit, LimitType limitType)
6162
: offset{offset},
6263
softLimit{limitType == LimitType::SOFT ? Limit{limit} : Limit{Infinity{}}},
6364
hardLimit{limitType == LimitType::HARD ? Limit{limit} : Limit{Infinity{}}},
@@ -158,6 +159,7 @@ struct AqlCall {
158159
return skippedRows;
159160
}
160161

162+
// TODO this is the same as shouldSkip(), remove one of them.
161163
[[nodiscard]] bool needSkipMore() const noexcept {
162164
return (0 < getOffset()) || (getLimit() == 0 && needsFullCount());
163165
}
@@ -174,6 +176,8 @@ struct AqlCall {
174176
std::visit(minus, hardLimit);
175177
}
176178

179+
bool hasLimit() const { return hasHardLimit() || hasSoftLimit(); }
180+
177181
bool hasHardLimit() const {
178182
return !std::holds_alternative<AqlCall::Infinity>(hardLimit);
179183
}
@@ -184,6 +188,7 @@ struct AqlCall {
184188

185189
bool needsFullCount() const { return fullCount; }
186190

191+
// TODO this is the same as needSkipMore(), remove one of them.
187192
bool shouldSkip() const {
188193
return getOffset() > 0 || (getLimit() == 0 && needsFullCount());
189194
}
@@ -199,6 +204,15 @@ constexpr bool operator<(AqlCall::Limit const& a, AqlCall::Limit const& b) {
199204
return std::get<size_t>(a) < std::get<size_t>(b);
200205
}
201206

207+
constexpr bool operator<(AqlCall::Limit const& a, size_t b) {
208+
if (std::holds_alternative<AqlCall::Infinity>(a)) {
209+
return false;
210+
}
211+
return std::get<size_t>(a) < b;
212+
}
213+
214+
constexpr bool operator<(size_t a, AqlCall::Limit const& b) { return !(b < a); }
215+
202216
constexpr AqlCall::Limit operator+(AqlCall::Limit const& a, size_t n) {
203217
return std::visit(overload{[n](size_t const& i) -> AqlCall::Limit {
204218
return i + n;
@@ -243,22 +257,10 @@ constexpr bool operator==(AqlCall const& left, AqlCall const& right) {
243257
left.skippedRows == right.skippedRows;
244258
}
245259

246-
inline std::ostream& operator<<(std::ostream& out,
247-
const arangodb::aql::AqlCall::Limit& limit) {
248-
return std::visit(arangodb::overload{[&out](size_t const& i) -> std::ostream& {
249-
return out << i;
250-
},
251-
[&out](arangodb::aql::AqlCall::Infinity const&) -> std::ostream& {
252-
return out << "unlimited";
253-
}},
254-
limit);
255-
}
260+
auto operator<<(std::ostream& out, const arangodb::aql::AqlCall::Limit& limit)
261+
-> std::ostream&;
256262

257-
inline std::ostream& operator<<(std::ostream& out, const arangodb::aql::AqlCall& call) {
258-
return out << "{ skip: " << call.getOffset() << ", softLimit: " << call.softLimit
259-
<< ", hardLimit: " << call.hardLimit
260-
<< ", fullCount: " << std::boolalpha << call.fullCount << " }";
261-
}
263+
auto operator<<(std::ostream& out, const arangodb::aql::AqlCall& call) -> std::ostream&;
262264

263265
} // namespace arangodb::aql
264266

arangod/Aql/AqlCallSet.cpp

Lines changed: 54 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,54 @@
1+
////////////////////////////////////////////////////////////////////////////////
2+
/// DISCLAIMER
3+
///
4+
/// Copyright 2020 ArangoDB GmbH, Cologne, Germany
5+
///
6+
/// Licensed under the Apache License, Version 2.0 (the "License");
7+
/// you may not use this file except in compliance with the License.
8+
/// You may obtain a copy of the License at
9+
///
10+
/// http://www.apache.org/licenses/LICENSE-2.0
11+
///
12+
/// Unless required by applicable law or agreed to in writing, software
13+
/// distributed under the License is distributed on an "AS IS" BASIS,
14+
/// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
15+
/// See the License for the specific language governing permissions and
16+
/// limitations under the License.
17+
///
18+
/// Copyright holder is ArangoDB GmbH, Cologne, Germany
19+
///
20+
/// @author Tobias Gödderz
21+
////////////////////////////////////////////////////////////////////////////////
22+
23+
#include "AqlCallSet.h"
24+
25+
using namespace arangodb;
26+
using namespace arangodb::aql;
27+
28+
auto aql::operator<<(std::ostream& out, AqlCallSet::DepCallPair const& callPair) -> std::ostream& {
29+
return out << callPair.dependency << " => " << callPair.call;
30+
}
31+
32+
auto aql::operator<<(std::ostream& out, AqlCallSet const& callSet) -> std::ostream& {
33+
out << "[";
34+
auto first = true;
35+
for (auto const& it : callSet.calls) {
36+
if (first) {
37+
out << " ";
38+
first = false;
39+
} else {
40+
out << ", ";
41+
}
42+
out << it;
43+
}
44+
out << " ]";
45+
return out;
46+
}
47+
48+
auto AqlCallSet::empty() const noexcept -> bool {
49+
return calls.empty();
50+
}
51+
52+
auto AqlCallSet::size() const noexcept -> size_t {
53+
return calls.size();
54+
}

arangod/Aql/AqlCallSet.h

Lines changed: 53 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,53 @@
1+
////////////////////////////////////////////////////////////////////////////////
2+
/// DISCLAIMER
3+
///
4+
/// Copyright 2020 ArangoDB GmbH, Cologne, Germany
5+
///
6+
/// Licensed under the Apache License, Version 2.0 (the "License");
7+
/// you may not use this file except in compliance with the License.
8+
/// You may obtain a copy of the License at
9+
///
10+
/// http://www.apache.org/licenses/LICENSE-2.0
11+
///
12+
/// Unless required by applicable law or agreed to in writing, software
13+
/// distributed under the License is distributed on an "AS IS" BASIS,
14+
/// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
15+
/// See the License for the specific language governing permissions and
16+
/// limitations under the License.
17+
///
18+
/// Copyright holder is ArangoDB GmbH, Cologne, Germany
19+
///
20+
/// @author Tobias Gödderz
21+
////////////////////////////////////////////////////////////////////////////////
22+
23+
#ifndef ARANGOD_AQL_AQLCALLSET_H
24+
#define ARANGOD_AQL_AQLCALLSET_H
25+
26+
#include "Aql/AqlCall.h"
27+
28+
#include <iosfwd>
29+
#include <vector>
30+
31+
namespace arangodb::aql {
32+
33+
// Partial map dep -> call. May be empty.
34+
// IMPORTANT: Are expected to be saved in increasing order (regarding dependency)
35+
struct AqlCallSet {
36+
struct DepCallPair {
37+
std::size_t dependency{};
38+
AqlCall call;
39+
};
40+
std::vector<DepCallPair> calls;
41+
42+
[[nodiscard]] auto empty() const noexcept -> bool;
43+
44+
[[nodiscard]] auto size() const noexcept -> size_t;
45+
};
46+
47+
auto operator<<(std::ostream& out, AqlCallSet::DepCallPair const& callPair)
48+
-> std::ostream&;
49+
auto operator<<(std::ostream&, AqlCallSet const&) -> std::ostream&;
50+
51+
} // namespace arangodb::aql
52+
53+
#endif // ARANGOD_AQL_AQLCALLSET_H

arangod/Aql/AqlCallStack.cpp

Lines changed: 6 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -87,6 +87,12 @@ void AqlCallStack::pushCall(AqlCall&& call) {
8787
_operations.push(call);
8888
}
8989

90+
void AqlCallStack::pushCall(AqlCall const& call) {
91+
// TODO is this correct on subqueries?
92+
TRI_ASSERT(isRelevant());
93+
_operations.push(call);
10000 94+
}
95+
9096
void AqlCallStack::stackUpMissingCalls() {
9197
while (!isRelevant()) {
9298
// For every depth, we add an additional default call.

arangod/Aql/AqlCallStack.h

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -63,6 +63,9 @@ class AqlCallStack {
6363
// Put another call on top of the stack.
6464
void pushCall(AqlCall&& call);
6565

66+
// Put another call on top of the stack.
67+
void pushCall(AqlCall const& call);
68+
6669
// fill up all missing calls within this stack s.t. we reach depth == 0
6770
// This needs to be called if an executor requires to be fully executed, even if skipped,
6871
// even if the subquery it is located in is skipped.

arangod/Aql/ClusterNodes.cpp

Lines changed: 6 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -635,6 +635,12 @@ void GatherNode::setParallelism(GatherNode::Parallelism value) {
635635
_parallelism = value;
636636
}
637637

638+
GatherNode::SortMode GatherNode::evaluateSortMode(size_t numberOfShards,
639+
size_t shardsRequiredForHeapMerge) noexcept {
640+
return numberOfShards >= shardsRequiredForHeapMerge ? SortMode::Heap
641+
: SortMode::MinElement;
642+
}
643+
638644
SingleRemoteOperationNode::SingleRemoteOperationNode(
639645
ExecutionPlan* plan, size_t id, NodeType mode, bool replaceIndexNode,
640646
std::string const& key, Collection const* collection,

arangod/Aql/ClusterNodes.h

Lines changed: 1 addition & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -315,10 +315,7 @@ class GatherNode final : public ExecutionNode {
315315

316316
/// @returns sort mode for the specified number of shards
317317
static SortMode evaluateSortMode(size_t numberOfShards,
318-
size_t shardsRequiredForHeapMerge = 5) noexcept {
319-
return numberOfShards >= shardsRequiredForHeapMerge ? SortMode::Heap
320-
: SortMode::MinElement;
321-
}
318+
size_t shardsRequiredForHeapMerge = 5) noexcept;
322319

323320
/// @brief constructor with an id
324321
GatherNode(ExecutionPlan* plan, size_t id, SortMode sortMode,

0 commit comments

Comments
 (0)
0