8000 Merge branch 'devel' of github.com:arangodb/arangodb into devel · msand/arangodb@cc9395d · GitHub
[go: up one dir, main page]

Skip to content

Commit cc9395d

Browse files
committed
Merge branch 'devel' of github.com:arangodb/arangodb into devel
2 parents 53e0dc6 + bca3ba5 commit cc9395d

File tree

2 files changed

+197
-125
lines changed

2 files changed

+197
-125
lines changed

Documentation/Books/Users/AqlExamples/Join.mdpp

Lines changed: 194 additions & 123 deletions
< 3E95 tr class="diff-line-row">
Original file line numberDiff line numberDiff line change
@@ -6,115 +6,122 @@ this extra collection to create a result from two collections.
66

77
First of all, we'll query a few users together with their friends' ids. For that,
88
we'll use all *relations* that have a value of *friend* in their *type* attribute.
9-
Relationships are established by using the *from* and *to* attributes in the
10-
*relations* collection, which point to the *id* values in the *users* collection.
9+
Relationships are established by using the *friendOf* and *thisUser* attributes in the
10+
*relations* collection, which point to the *userId* values in the *users* collection.
1111

1212
!SUBSECTION Join tuples
1313

14-
We'll start with a SQL-ish result set and return each tuple (user name, friend id)
14+
We'll start with a SQL-ish result set and return each tuple (user name, friends userId)
1515 10000
separately. The AQL query to generate such result is:
1616

1717
```js
18-
FOR u IN users
19-
FILTER u.active == true
20-
LIMIT 0, 4
21-
FOR f IN relations
22-
FILTER f.type == "friend" && f.from == u.id
23-
RETURN {
24-
"user" : u.name,
25-
"friendId" : f.to
18+
FOR u IN users
19+
FILTER u.active == true
20+
LIMIT 0, 4
21+
FOR f IN relations
22+
FILTER f.type == "friend" && f.friendOf == u.userId
23+
RETURN {
24+
"user" : u.name,
25+
"friendId" : f.thisUser
2626
}
2727

28-
[
29-
{
30-
"user" : "Abigail",
31-
"friendId" : 108
32-
},
33-
{
34-
"user" : "Abigail",
35-
"friendId" : 102
36-
},
37-
{
38-
"user" : "Abigail",
39-
"friendId" : 106
40-
},
41-
{
42-
"user" : "Fred",
43-
"friendId" : 209
44-
},
45-
{
46-
"user" : "Mary",
47-
"friendId" : 207
48-
},
49-
{
50-
"user" : "Mary",
51-
"friendId" : 104
52-
},
53-
{
54-
"user" : "Mariah",
55-
"friendId" : 203
56-
},
57-
{
58-
"user" : "Mariah",
59-
"friendId" : 205
60-
}
28+
[
29+
{
30+
"user" : "Abigail",
31+
"friendId" : 108
32+
},
33+
{
34+
"user" : "Abigail",
35+
"friendId" : 102
36+
},
37+
{
38+
"user" : "Abigail",
39+
"friendId" : 106
40+
},
41+
{
42+
"user" : "Fred",
43+
"friendId" : 209
44+
},
45+
{
46+
"user" : "Mary",
47+
"friendId" : 207
48+
},
49+
{
50+
"user" : "Mary",
51+
"friendId" : 104
52+
},
53+
{
54+
"user" : "Mariah",
55+
"friendId" : 203
56+
},
57+
{
58+
"user" : "Mariah",
59+
"friendId" : 205
60+
}
6161
]
6262
```
6363

64+
We iterate over the collection users. Only the 'active' users will be examined.
65+
For each of these users we will search for up to 4 friends. We locate friends
66+
by comparing the *userId* of our current user with the *friendOf* attribute of the
< 8000 /code>67+
*relations* document. For each of those relations found we return the users name
68+
and the userId of the friend.
69+
70+
6471
!SUBSECTION Horizontal lists
6572

6673

67-
Note that in the above result, a user might be returned multiple times. This is the
74+
Note that in the above result, a user can be returned multiple times. This is the
6875
SQL way of returning data. If this is not desired, the friends' ids of each user
6976
can be returned in a horizontal list. This will return each user at most once.
7077

7178
The AQL query for doing so is:
7279

7380
```js
74-
FOR u IN users
75-
FILTER u.active == true LIMIT 0, 4
76-
RETURN {
77-
"user" : u.name,
81+
FOR u IN users
82+
FILTER u.active == true LIMIT 0, 4
83+
RETURN {
84+
"user" : u.name,
7885
"friendIds" : (
79-
FOR f IN relations
80-
FILTER f.from == u.id && f.type == "friend"
81-
RETURN f.to
86+
FOR f IN relations
87+
FILTER f.friendOf == u.userId && f.type == "friend"
88+
RETURN f.thisUser
8289
)
8390
}
8491

85-
[
86-
{
87-
"user" : "Abigail",
88-
"friendIds" : [
89-
108,
90-
102,
91-
106
92-
]
93-
},
94-
{
95-
"user" : "Fred",
96-
"friendIds" : [
97-
209
98-
]
99-
},
100-
{
101-
"user" : "Mary",
102-
"friendIds" : [
103-
207,
104-
104
105-
]
106-
},
107-
{
108-
"user" : "Mariah",
109-
"friendIds" : [
110-
203,
111-
205
112-
]
113-
}
92+
[
93+
{
94+
"user" : "Abigail",
95+
"friendIds" : [
96+
108,
97+
102,
98+
106
99+
]
100+
},
101+
{
102+
"user" : "Fred",
103+
"friendIds" : [
104+
209
105+
]
106+
},
107+
{
108+
"user" : "Mary",
109+
"friendIds" : [
110+
207,
111+
104
112+
]
113+
},
114+
{
115+
"user" : "Mariah",
116+
"friendIds" : [
117+
203,
118+
205
119+
]
120+
}
114121
]
115122
```
116123

117-
In this query we are still iterating over the users in the *users* collection
124+
In this query we are still iterating over the users in the *users* collection
118125
and for each matching user we are executing a sub-query to create the matching
119126
list of related users.
120127

@@ -124,48 +131,112 @@ To not only return friend ids but also the names of friends, we could "join" the
124131
*users* collection once more (something like a "self join"):
125132

126133
```js
127-
FOR u IN users
128-
FILTER u.active == true
129-
LIMIT 0, 4
130-
RETURN {
131-
"user" : u.name,
134+
FOR u IN users
135+
FILTER u.active == true
136+
LIMIT 0, 4
137+
RETURN {
138+
"user" : u.name,
132139
"friendIds" : (
133-
FOR f IN relations
134-
FILTER f.from == u.id && f.type == "friend"
135-
FOR u2 IN users
136-
FILTER f.to == u2.id
140+
FOR f IN relations
141+
FILTER f.friendOf == u.userId && f.type == "friend"
142+
FOR u2 IN users
143+
FILTER f.thisUser == u2.useId
137144
RETURN u2.name
138-
)
139-
}
140-
141-
[
142-
{
143-
"user" : "Abigail",
144-
"friendIds" : [
145-
"Jim",
146-
"Jacob",
147-
"Daniel"
148-
]
149-
},
150-
{
151-
"user" : "Fred",
152-
"friendIds" : [
153-
"Mariah"
154-
]
155-
},
156-
{
157-
"user" : "Mary",
158-
"friendIds" : [
159-
"Isabella",
160-
"Michael"
161-
]
162-
},
163-
{
164-
"user" : "Mariah",
165-
"friendIds" : [
166-
"Madison",
167-
"Eva"
168-
]
169-
}
145+
)
146+
}
147+
148+
[
149+
{
150+
"user" : "Abigail",
151+
"friendIds" : [
152+
"Jim",
153+
"Jacob",
154+
"Daniel"
155+
]
156+
},
157+
{
158+
"user" : "Fred",
159+
"friendIds" : [
160+
"Mariah"
161+
]
162+
},
163+
{
164+
"user" : "Mary",
165+
"friendIds" : [
166+
"Isabella",
167+
"Michael"
168+
]
169+
},
170+
{
171+
"user" : "Mariah",
172+
"friendIds" : [
173+
"Madison",
174+
"Eva"
175+
]
176+
}
177+
]
178+
```
179+
180+
This query will then again in term fetch the clear text name of the
181+
friend from the users collection. So here we iterate the users collection,
182+
and for each hit the relations collection, and for each hit once more the
183+
users collection.
184+
185+
!SUBSECTION Outer joins
186+
187+
Lets find the lonely people in our database - those without friends.
188+
189+
```js
190+
191+
FOR user IN users
192+
LET friendList = (
193+
FOR f IN relations
194+
FILTER f.friendOf == u.userId
195+
RETURN 1
196+
)
197+
FILTER LENGTH(friendList) == 0
198+
RETURN { "user" : user.name }
199+
200+
[
201+
{
202+
"user" : "Abigail"
203+
},
204+
{
205+
"user" : "Fred"
206+
}
207+
]
208+
```
209+
210+
So, for each user we pick the list of her friends and count them. The ones where
211+
count equals zero are the lonely people. Using *RETURN 1* in the sub-query
212+
saves even more precious CPU cycles and gives the optimizer more alternatives.
213+
214+
!SUBSECTION Pitfalls
215+
216+
Since we're free of schematas, there is no way to tell the format of the
217+
documents. So, if your documents don't contain an attribute, it defaults to
218+
null. We can however check our data for accuracy like this:
219+
220+
```
221+
RETURN LENGTH(FOR u IN users FILTER u.userId == null RETURN 1)
222+
[
223+
10000
224+
]
225+
RETURN LENGTH(FOR f IN relations FILTER f.friendOf == null RETURN 1)
226+
[
227+
10000
170228
]
171229
```
230+
231+
So that the above queries return 10k matches each, the result of i.e. the Join
232+
tuples query will become 100.000.000 items large and will use much memory plus
233+
compution time. So it is generaly a good idea to revalidate that the criteria
234+
for your join conditions exist.
235+
236+
Using indices on the properties can speed up the operation significantly.
237+
ou can use the explain helper to revalidate your query actualy uses them.
238+
239+
If you work with joins on edge collections you would typicaly aggregate over
240+
the internal fields *_id*, *_from* and *_to* (where *_id* equals *userId*,
241+
*_from* *friendOf* and *_to* would be *thisUser* in our examples). Arangodb
242+
implicitely creates indices on them.

js/server/tests/shell-cluster-agency.js

Lines changed: 3 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,5 @@
11
/*jshint strict: true */
2-
/*global require, fail, assertFalse, assertTrue, assertEqual, assertNotEqual, assertMatch, ArangoAgency */
2+
/*global require, fail, assertFalse, assertTrue, assertEqual, assertNotEqual, ArangoAgency */
33

44
////////////////////////////////////////////////////////////////////////////////
55
/// @brief test the agency communication layer
@@ -82,7 +82,8 @@ function AgencySuite () {
8282
////////////////////////////////////////////////////////////////////////////////
8383

8484
testVersion : function () {
85-
assertMatch(/^etcd/, agency.version());
85+
var agencyVersion = JSON.parse(agency.version());
86+
assertEqual(agencyVersion.internalVersion, "2");
8687
},
8788

8889
////////////////////////////////////////////////////////////////////////////////

0 commit comments

Comments
 (0)
0