8000 Documentation/migrate cookbook devel 32 (#3290) · MohammedDeveloper/arangodb@fbfd349 · GitHub
[go: up one dir, main page]

Skip to content

Commit fbfd349

Browse files
dothebartfceller
authored andcommitted
Documentation/migrate cookbook devel 32 (arangodb#3290)
* migrate the Cookbook into the ArangoDB core repository * make cookbook links local
1 parent 3e64024 commit fbfd349

File tree

73 files changed

+8681
-16
lines changed

Some content is hidden

Large Commits have some content hidden by default. Use the searchbox below for content that may be hidden.

73 files changed

+8681
-16
lines changed
Lines changed: 288 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,288 @@
1+
Avoiding parameter injection in AQL
2+
===================================
3+
4+
Problem
5+
-------
6+
7+
I don't want my AQL queries to be affected by parameter injection.
8+
9+
### What is parameter injection?
10+
11+
Parameter injection means that potentially content is inserted into a query,
12+
and that injection may change the meaning of the query. It is a security issue
13+
that may allow an attacker to execute arbitrary queries on the database data.
14+
15+
It often occurs if applications trustfully insert user-provided inputs into a
16+
query string, and do not fully or incorrectly filter them. It also occurs often
17+
when applications build queries naively, without using security mechanisms often
18+
provided by database software or querying mechanisms.
19+
20+
Parameter injection examples
21+
----------------------------
22+
23+
Assembling query strings with simple string concatenation looks trivial,
24+
but is potentially unsafe. Let's start with a simple query that's fed with some
25+
dynamic input value, let's say from a web form. A client application or a Foxx
26+
route happily picks up the input value, and puts it into a query:
27+
28+
```js
29+
/* evil ! */
30+
var what = req.params("searchValue"); /* user input value from web form */
31+
...
32+
var query = "FOR doc IN collection FILTER doc.value == " + what + " RETURN doc";
33+
db._query(query, params).toArray();
34+
```
35+
36+
The above will probably work fine for numeric input values.
37+
38+
What could an attacker do to this query? Here are a few suggestions to use for the
39+
`searchValue` parameter:
40+
41+
- for returning all documents in the collection: `1 || true`
42+
- for removing all documents: `1 || true REMOVE doc IN collection //`
43+
- for inserting new documents: `1 || true INSERT { foo: "bar" } IN collection //`
44+
45+
It should have become obvious that this is extremely unsafe and should be avoided.
46+
47+
An pattern often seen to counteract this is trying to quote and escape potentially
48+
unsafe input values before putting them into query strings. This may work in some situations,
49+
but it's easy to overlook something or get it subtly wrong:
50+
51+
```js
52+
/* we're sanitzing now, but it's still evil ! */
53+
var value = req.params("searchValue").replace(/'/g, '');
54+
...
55+
var query = "FOR doc IN collection FILTER doc.value == '" + value + "' RETURN doc";
56+
db._query(query, params).toArray();
57+
```
58+
59+
The above example uses single quotes for enclosing the potentially unsafe user
60+
input, and also replaces all single quotes in the input value beforehand. Not only may
61+
that change the user input (leading to subtle errors such as *"why does my search for
62+
`O'Brien` don't return any results?"*), but it is also unsafe. If the user input contains
63+
a backslash at the end (e.g. `foo bar\`), that backslash will escape the closing single
64+
quote, allowing the user input to break out of the string fence again.
65+
66+
It gets worse if user input is inserted into the query at multiple places. Let's assume
67+
we have a query with two dynamic values:
68+
69+
```js
70+
query = "FOR doc IN collection FILTER doc.value == '" + value + "' && doc.type == '" + type + "' RETURN doc";
71+
```
72+
73+
If an attacker inserted `\` for parameter `value` and ` || true REMOVE doc IN collection //` for
74+
parameter `type`, then the effective query would become
75+
76+
```
77+
FOR doc IN collection FILTER doc.value == '\' && doc.type == ' || true REMOVE doc IN collection //' RETURN doc
78+
```
79+
80+
which is highly undesirable.
81+
82+
83+
Solution
84+
--------
85+
86+
Instead of mixing query string fragments with user inputs naively via string
87+
concatenation, use either **bind parameters** or a **query builder**. Both can
88+
help to avoid the problem of injection, because they allow separating the actual
89+
query operations (like `FOR`, `INSERT`, `REMOVE`) from (user input) values.
90+
91+
This recipe focuses on using bind parameters. This is not to say that query
92+
builders shouldn't be used. They were simply omitted here for the sake of simplicity.
93+
To get started with a using an AQL query builder in ArangoDB or other JavaScript
94+
environments, have a look at [aqb](https://www.npmjs.com/package/aqb) (which comes
95+
bundled with ArangoDB). Inside ArangoDB, there are also [Foxx queries](../../Manual/Foxx/index.html)
96+
which can be combined with aqb.
97+
98+
### What bind parameters are
99+
100+
Bind parameters in AQL queries are special tokens that act as placeholders for
101+
actual values. Here's an example:
102+
103+
```
104+
FOR doc IN collection
105+
FILTER doc.value == @what
106+
RETURN doc
107+
```
108+
109+
In the above query, `@what` is a bind parameter. In order to execute this query,
110+
a value for bind parameter `@what` must be specified. Otherwise query execution will
111+
fail with error 1551 (*no value specified for declared bind parameter*). If a value
112+
for `@what` gets specified, the query can be executed. However, the query string
113+
and the bind parameter values (i.e. the contents of the `@what` bind parameter) will
114+
be handled separately. What's in the bind parameter will always be treated as a value,
115+
and it can't get out of its sandbox and change the semantic meaning of a query.
116+
117+
### How bind parameters are used
118+
119+
To execute a query with bind parameters, the query string (containing the bind
120+
parameters) and the bind parameter values are specified separately (note that when
121+
the bind parameter value is assigned, the prefix `@` needs to be omitted):
122+
123+
```js
124+
/* query string with bind parameter */
125+
var query = "FOR doc IN collection FILTER doc.value == @what RETURN doc";
126+
127+
/* actual value for bind parameter */
128+
var params = { what: 42 };
129+
130+
/* run query, specifying query string and bind parameter separately */
131+
db._query(query, params).toArray();
132+
```
133+
134+
If a malicious user would set `@what` to a value of `1 || true`, this wouldn't do
135+
any harm. AQL would treat the contents of `@what` as a single string token, and
136+
the meaning of the query would remain unchanged. The actually executed query would be:
137+
138+
```
139+
FOR doc IN collection
140+
FILTER doc.value == "1 || true"
141+
RETURN doc
142+
```
143+
144+
Thanks to bind parameters it is also impossible to turn a selection (i.e. read-only)
145+
query into a data deletion query.
146+
147+
### Using JavaScript variables as bind parameters
148+
149+
There is also a template string generator function `aql` that can be used to safely
150+
(and conveniently) built AQL queries using JavaScript variables and expressions. It
151+
can be invoked as follows:
152+
153+
```js
154+
const aql = require('@arangodb')aql; // not needed in arangosh
155+
156+
var value = "some input value";
157+
var query = aql`FOR doc IN collection
158+
FILTER doc.value == ${value}
159+
RETURN doc`;
160+
var result = db._query(query).toArray();
161+
```
162+
163+
Note that an ES6 template string is used for populating the `query` variable. The
164+
string is assembled using the `aql` generator function which is bundled with
165+
ArangoDB. The template string can contain references to JavaScript variables or
166+
expressions via `${...}`. In the above example, the query references a variable
167+
named `value`. The `aql` function generates an object with two separate
168+
attributes: the query string, containing references to bind parameters, and the actual
169+
bind parameter values.
170+
171+
Bind parameter names are automatically generated by the `aql` function:
172+
173+
```js
174+
var value = "some input value";
175+
aql`FOR doc IN collection FILTER doc.value == ${value} RETURN doc`;
176+
177+
{
178+
"query" : "FOR doc IN collection FILTER doc.value == @value0 RETURN doc",
179+
"bindVars" : {
180+
"value0" : "some input value"
181+
}
182+
}
183+
```
184+
185+
### Using bind parameters in dynamic queries
186+
187+
Bind parameters are helpful, so it makes sense to use them for handling the dynamic values.
188+
You can even use them for queries that itself are highly dynamic, for example with conditional
189+
`FILTER` and `LIMIT` parts. Here's how to do this:
190+
191+
```js
192+
/* note: this example has a slight issue... hang on reading */
193+
var query = "FOR doc IN collection";
194+
var params = { };
195+
196+
if (useFilter) {
197+
query += " FILTER doc.value == @what";
198+
params.what = req.params("searchValue");
199+
}
200+
201+
if (useLimit) {
202+
/* not quite right, see below */
203+
query += " LIMIT @offset, @count";
204+
params.offset = req.params("offset");
205+
params.count = req.params("count");
206+
}
207+
208+
query += " RETURN doc";
209+
db._query(query, params).toArray();
210+
```
211+
212+
Note that in this example we're back to string concatenation, but without the problem of
213+
the query being vulnerable to arbitrary modifications.
214+
215+
### Input value validation and sanitation
216+
217+
Still you should prefer to be paranoid, and try to detect invalid input values as early as
218+
possible, at least before executing a query with them. This is because some input parameters
219+
may affect the runtime behavior of queries negatively or, when modified, may lead to queries
220+
throwing runtime errors instead of returning valid results. This isn't something an attacker
221+
should deserve.
222+
223+
`LIMIT` is a good example for this: if used with a single argument, the argument should
224+
be numeric. When `LIMIT` is given a string value, executing the query will fail. You
225+
may want to detect this early and don't return an HTTP 500 (as this would signal attackers
226+
that they were successful breaking your application).
227+
228+
Another problem with `LIMIT` is that high `LIMIT` values are likely more expensive than low
229+
ones, and you may want to disallow using `LIMIT` values exceeding a certain threshold.
230+
231+
Here's what you could do in such cases:
232+
233+
```js
234+
var query = "FOR doc IN collection LIMIT @count RETURN doc";
235+
236+
/* some default value for limit */
237+
var params = { count: 100 };
238+
239+
if (useLimit) {
240+
var count = req.params("count");
241+
242+
/* abort if value does not look like an integer */
243+
if (! preg_match(/^d+$/, count)) {
244+
throw "invalid count value!";
245+
}
246+
247+
/* actually turn it into an integer */
248+
params.count = parseInt(count, 10); // turn into numeric value
249+
}
250+
251+
if (params.count < 1 || params.count > 1000) {
252+
/* value is outside of accepted thresholds */
253+
throw "invalid count value!";
254+
}
255+
256+
db._query(query, params).toArray();
257+
```
258+
259+
This is a bit more complex, but that's a price you're likely willing to pay for a
260+
bit of extra safety. In reality you may want to use a framework for validation (such as
261+
[joi](https://www.npmjs.com/package/joi) which comes bundled with ArangoDB) instead
262+
of writing your own checks all over the place.
263+
264+
### Bind parameter types
265+
266+
There are two types of bind parameters in AQL:
267+
268+
- bind parameters for values: those are prefixed with a single `@` in AQL queries, and
269+
are specified without the prefix when they get their value assigned. These bind parameters
270+
can contain any valid JSON value.
271+
272+
Examples: `@what`, `@searchValue`
273+
274+
- bind parameters for collections: these are prefixed with `@@` in AQL queries, and are
275+
replaced with the name of a collection. When the bind parameter value is assigned, the
276+
parameter itself must be specified with a single `@` prefix. Only string values are allowed
277+
for this type of bind parameters.
278+
279+
Examples: `@@collection`
280+
281+
The latter type of bind parameter is probably not used as often, and it should not be used
282+
together with user input. Otherwise users may freely determine on which collection your
283+
AQL queries will operate (note: this may be a valid use case, but normally it is extremely
284+
undesired).
285+
286+
**Authors**: [Jan Steemann](https://github.com/jsteemann)
287+
288+
**Tags**: #injection #aql #security
Lines changed: 92 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,92 @@
1+
Creating test data with AQL
2+
===========================
3+
4+
Problem
5+
-------
6+
7+
I want to create some test documents.
8+
9+
Solution
10+
--------
< 97AE /td>11+
12+
If you haven't yet created a collection to hold the documents, create one now using the
13+
ArangoShell:
14+
15+
```js
16+
db._create("myCollection");
17+
```
18+
19+
This has created a collection named *myCollection*.
20+
21+
One of the easiest ways to fill a collection with test data is to use an AQL query that
22+
iterates over a range.
23+
24+
Run the following AQL query from the **AQL editor** in the web interface to insert 1,000
25+
documents into the just created collection:
26+
27+
```
28+
FOR i IN 1..1000
29+
INSERT { name: CONCAT("test", i) } IN myCollection
30+
```
31+
32+
The number of documents to create can be modified easily be adjusting the range boundary
33+
values.
34+
35+
To create more complex test data, adjust the AQL query!
36+
37+
Let's say we also want a `status` attribute, and fill it with integer values between `1` to
38+
(including) `5`, with equal distribution. A good way to achieve this is to use the modulo
39+
operator (`%`):
40+
41+
```
42+
FOR i IN 1..1000
43+
INSERT {
44+
name: CONCAT("test", i),
45+
status: 1 + (i % 5)
46+
} IN myCollection
47+
```
48+
49+
To create pseudo-random values, use the `RAND()` function. It creates pseudo-random numbers
50+
between 0 and 1. Use some factor to scale the random numbers, and `FLOOR()` to convert the
51+
scaled number back to an integer.
52+
53+
For example, the following query populates the `value` attribute with numbers between 100 and
54+
150 (including):
55+
56+
```
57+
FOR i IN 1..1000
58+
INSERT {
59+
name: CONCAT("test", i),
60+
value: 100 + FLOOR(RAND() * (150 - 100 + 1))
61+
} IN myCollection
62+
```
63+
64+
After the test data has been created, it is often helpful to verify it. The
65+
`RAND()` function is also a good candidate for retrieving a random sample of the documents in the
66+
collection. This query will retrieve 10 random documents:
67+
68+
```
69+
FOR doc IN myCollection
70+
SORT RAND()
71+
LIMIT 10
72+
RETURN doc
73+
```
74+
75+
The `COLLECT` clause is an easy mechanism to run an aggregate analysis on some attribute. Let's
76+
say we wanted to verify the data distribution inside the `status` attribute. In this case we
77+
could run:
78+
79+
```
80+
FOR doc IN myCollection
81+
COLLECT value = doc.value WITH COUNT INTO count
82+
RETURN {
83+
value: value,
84+
count: count
85+
}
86+
```
87+
88+
The above query will provide the number of documents per distinct `value`.
89+
90+
**Author:** [Jan Steemann](https://github.com/jsteemann)
91+
92+
**Tags**: #aql

0 commit comments

Comments
 (0)
0