much better diagnostics for AQL query results cache (#6580) · soualid/arangodb@ae85fbc · GitHub
Commit ae85fbc

much better diagnostics for AQL query results cache (arangodb#6580)
1 parent 53b51d6 commit ae85fbc

57 files changed, +2383 -362 lines changed

CHANGELOG

Lines changed: 2 additions & 0 deletions
@@ -1,6 +1,8 @@
 devel
 -----

+* added more AQL query results cache inspection and control functionality
+
 * the query editor within the web ui is now catching http 501 responses
   properly.

Documentation/Books/AQL/ExecutionAndPerformance/QueryCache.md

Lines changed: 53 additions & 11 deletions
@@ -29,15 +29,15 @@ Query eligibility
 -----------------

 The query results cache will consider two queries identical if they have exactly the
-same query string. Any deviation in terms of whitespace, capitalization etc.
-will be considered a difference. The query string will be hashed and used as
-the cache lookup key. If a query uses bind parameters, these will also be hashed
+same query string and the same bind variables. Any deviation in terms of whitespace,
+capitalization etc. will be considered a difference. The query string will be hashed
+and used as the cache lookup key. If a query uses bind parameters, these will also be hashed
 and used as part of the cache lookup key.

 That means even if the query strings of two queries are identical, the query results
 cache will treat them as different queries if they have different bind parameter
 values. Other components that will become part of a query's cache key are the
-`count` and `fullCount` attributes.
+`count`, `fullCount` and `optimizer` attributes.

 If the cache is turned on, the cache will check at the very start of execution
 whether it has a result ready for this particular query. If that is the case,
@@ -55,7 +55,11 @@ A query is eligible for caching only if all of the following conditions are met:
 * the query string is at least 8 characters long
 * the query is a read-only query and does not modify data in any collection
 * no warnings were produced while executing the query
-* the query is deterministic and only uses deterministic functions
+* the query is deterministic and only uses deterministic functions whose results
+  are marked as cacheable
+* the size of the query result does not exceed the cache's configured maximal
+  size for individual cache results or cumulated results
+* the query is not executed using a streaming cursor

 The usage of non-deterministic functions leads to a query not being cachable.
 This is intentional to avoid caching of function results which should rather
@@ -136,16 +140,32 @@ require("@arangodb/aql/cache").properties({ mode: "on" });
 ```

 The maximum number of cached results in the cache for each database can be configured
-at server start using the configuration parameter `--query.cache-entries`.
-This parameter can be used to put an upper bound on the number of query results in
-each database's query cache and thus restrict the cache's memory consumption.
+at server start using the following configuration parameters:

-The value can also be adjusted at runtime as follows:
+* `--query.cache-entries`: maximum number of results in query result cache per database
+* `--query.cache-entries-max-size`: maximum cumulated size of results in query result cache per database
+* `--query.cache-entry-max-size`: maximum size of an individual result entry in query result cache
+* `--query.cache-include-system-collections`: whether or not to include system collection queries in the query result cache
+
+These parameters can be used to put an upper bound on the number and size of query
+results in each database's query cache and thus restrict the cache's memory consumption.
+
+These values can also be adjusted at runtime as follows:

 ```
-require("@arangodb/aql/cache").properties({ maxResults: 200 });
+require("@arangodb/aql/cache").properties({
+  maxResults: 200,
+  maxResultsSize: 8 * 1024 * 1024,
+  maxEntrySize: 1024 * 1024,
+  includeSystem: false
+});
 ```

+The above will limit the number of cached results in the query results cache to 200
+results per database, and to 8 MB cumulated query result size per database. The maximum
+size of each query cache entry is restricted to 1 MB. Queries that involve system
+collections are excluded from caching.
+

 Per-query configuration
 -----------------------
@@ -168,7 +188,7 @@ var stmt = db._createStatement({
 stmt.execute();
 ```

-When using the `db._query()` function, the `cache` attribute can be set as allows:
+When using the `db._query()` function, the `cache` attribute can be set as follows:

 ```
 db._query({
@@ -184,10 +204,32 @@ if the result was retrieved from the query cache, and `false` otherwise. Clients
 this attribute to check if a specific query was served from the cache or not.


+Query results cache inspection
+------------------------------
+
+The contents of the query results cache can be checked at runtime using the cache's
+`toArray()` function:
+
+```
+require("@arangodb/aql/cache").toArray();
+```
+
+This will return a list of all query results stored in the current database's query
+results cache.
+
+The query results cache for the current database can be cleared at runtime using the
+cache's `clear` function:
+
+```
+require("@arangodb/aql/cache").clear();
+```
+
+
 Restrictions
 ------------

 Query results that are returned from the query results cache may contain execution statistics
 stemming from the initial, uncached query execution. This means for a cached query result,
 the *extra.stats* attribute may contain stale data, especially in terms of the *executionTime*
 and *profile* attribute values.
+

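The cache key described in the `QueryCache.md` changes above (query string, bind parameters, plus the `count`, `fullCount` and `optimizer` attributes) can be illustrated with a small standalone sketch. This is not ArangoDB's implementation: the FNV-1a hash, the `stableStringify` helper, the `cacheKey` function and the way the parts are joined are assumptions chosen for illustration. Only the list of key components is taken from the documentation.

```javascript
// Sketch only: the server computes its lookup key internally. The hash
// function and key layout below are assumptions made for illustration.

// 32-bit FNV-1a hash over a string (stand-in for the server's hash function).
function fnv1a(text) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16).padStart(8, "0");
}

// Serialize a value with object keys sorted, so that { a: 1, b: 2 } and
// { b: 2, a: 1 } yield the same text (an assumption, not documented behavior).
function stableStringify(value) {
  if (Array.isArray(value)) {
    return "[" + value.map(stableStringify).join(",") + "]";
  }
  if (value !== null && typeof value === "object") {
    const keys = Object.keys(value).sort();
    const members = keys.map((k) => JSON.stringify(k) + ":" + stableStringify(value[k]));
    return "{" + members.join(",") + "}";
  }
  return JSON.stringify(value);
}

// Combine the documented key components into one lookup key.
function cacheKey(queryString, bindVars = {}, options = {}) {
  const parts = [
    queryString, // used verbatim: whitespace and capitalization matter
    stableStringify(bindVars),
    stableStringify({
      count: options.count === true,
      fullCount: options.fullCount === true,
      optimizer: options.optimizer === undefined ? null : options.optimizer,
    }),
  ];
  return fnv1a(parts.join("\u0000"));
}

const q = "FOR d IN docs FILTER d.value == @v RETURN d";
const k1 = cacheKey(q, { v: 1 });
const k2 = cacheKey(q, { v: 1 });
const k3 = cacheKey(q, { v: 2 });                      // different bind value
const k4 = cacheKey(q.toLowerCase(), { v: 1 });        // different capitalization
const k5 = cacheKey(q, { v: 1 }, { fullCount: true }); // different option

console.log(k1 === k2); // identical inputs give the identical key
console.log(k1 === k3, k1 === k4, k1 === k5); // expected to differ, barring hash collisions
```

Joining the parts with a separator character keeps a query string from running into the serialized bind parameters, which is one simple way to avoid accidental key clashes in a sketch like this.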
Documentation/Books/AQL/ExecutionAndPerformance/README.md

Lines changed: 1 addition & 1 deletion
@@ -15,4 +15,4 @@ This chapter describes AQL features related to query executions and query perfor
 parts of the plan are responsible. The query-profiler can show you execution statistics for every
 stage of the query execution.

-* [The AQL query result cache](QueryCache.md): an optional query results cache is used to avoid repeated calculation of the same query results.
+* [The AQL query result cache](QueryCache.md): an optional query results cache can be used to avoid repeated calculation of the same query results.

Documentation/Books/HTTP/AqlQueryCache/README.md

Lines changed: 5 additions & 3 deletions
@@ -1,7 +1,9 @@
-HTTP Interface for the AQL query cache
-======================================
+HTTP Interface for the AQL query results cache
+==============================================

-This section describes the API methods for controlling the AQL query cache.
+This section describes the API methods for controlling the AQL query results cache.
+
+@startDocuBlock GetApiQueryCacheCurrent

 @startDocuBlock DeleteApiQueryCache

Documentation/Books/HTTP/SUMMARY.md

Lines changed: 1 addition & 1 deletion
@@ -26,7 +26,7 @@
 * [Query Results](AqlQueryCursor/QueryResults.md)
 * [Accessing Cursors](AqlQueryCursor/AccessingCursors.md)
 * [AQL Queries](AqlQuery/README.md)
-* [AQL Query Cache](AqlQueryCache/README.md)
+* [AQL Query Results Cache](AqlQueryCache/README.md)
 * [AQL User Functions Management](AqlUserFunctions/README.md)
 * [Simple Queries](SimpleQuery/README.md)
 * [Async Result Handling](AsyncResultsManagement/README.md)

Documentation/Books/Manual/Programs/Arangod/Query.md

Lines changed: 45 additions & 10 deletions
@@ -38,9 +38,12 @@ The default is *true*.

 `--query.tracking-with-bindvars flag`

-If *true*, then the bind variables will be tracked for all running and slow
-AQL queries. This option only has an effect if `--query.tracking` was set to
-*true*. Tracking of bind variables can be disabled by setting the option to *false*.
+If *true*, then the bind variables will be tracked and shown for all running
+and slow AQL queries. When set to *true*, this will also enable the display of
+bind variable values in the list of cached AQL query results.
+This option only has an effect if `--query.tracking` was set to *true* or when
+the query results cache is used.
+Tracking and displaying bind variable values can be disabled by setting the option to *false*.

 The default is *true*.

@@ -75,26 +78,58 @@ attribute when running a query.

 The default value is *128*.

-## AQL Query caching mode
+## AQL Query results caching mode

 `--query.cache-mode`

-Toggles the AQL query cache behavior. Possible values are:
+Toggles the AQL query results cache behavior. Possible values are:

-* *off*: do not use query cache
-* *on*: always use query cache, except for queries that have their *cache*
+* *off*: do not use query results cache
+* *on*: always use query results cache, except for queries that have their *cache*
   attribute set to *false*
-* *demand*: use query cache only for queries that have their *cache*
+* *demand*: use query results cache only for queries that have their *cache*
   attribute set to *true*

-## AQL Query cache size
+## AQL Query results cache size

 `--query.cache-entries`

 Maximum number of query results that can be stored per database-specific query
-cache. If a query is eligible for caching and the number of items in the
+results cache. If a query is eligible for caching and the number of items in the
 database's query cache is equal to this threshold value, another cached query
 result will be removed from the cache.

 This option only has an effect if the query cache mode is set to either *on* or
 *demand*.
+
+The default value is *128*.
+
+`--query.cache-entries-max-size`
+
+Maximum cumulated size of query results that can be stored per database-specific
+query results cache. When inserting a query result into the query results cache,
+it is checked whether the total size of cached results would exceed this value, and if so,
+another cached query result will be removed from the cache before inserting a new
+one.
+
+This option only has an effect if the query cache mode is set to either *on* or
+*demand*.
+
+The default value is *256 MB*.
+
+`--query.cache-entry-max-size`
+
+Maximum size of individual query results that can be stored in any database's query
+results cache. Query results are only eligible for caching when their size does not exceed
+this setting's value.
+
+The default value is *16 MB*.
+
+`--query.cache-include-system-collections`
+
+Whether or not to store results of queries that involve system collections in
+the query results cache. Not storing these results is normally beneficial when using the
+query results cache, as queries on system collections are internal to ArangoDB and will
+only use space in the query results cache unnecessarily.
+
+The default value is *false*.

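How the three size-related options in `Query.md` interact can be sketched with a small in-memory model. The property names `maxResults`, `maxResultsSize` and `maxEntrySize` mirror the runtime properties shown in the documentation; everything else is an assumption. In particular, the `BoundedResultsCache` class is hypothetical, and the documentation only says that "another cached query result" is removed, so evicting the oldest entry first is a guess made for this sketch.

```javascript
// Sketch only: models the documented limits, not ArangoDB's implementation.
// Assumption: the oldest entry is evicted first (the docs do not say which).
class BoundedResultsCache {
  constructor({ maxResults, maxResultsSize, maxEntrySize }) {
    this.maxResults = maxResults;         // cf. --query.cache-entries
    this.maxResultsSize = maxResultsSize; // cf. --query.cache-entries-max-size
    this.maxEntrySize = maxEntrySize;     // cf. --query.cache-entry-max-size
    this.entries = new Map();             // insertion-ordered: first key is the oldest
    this.totalSize = 0;
  }

  // Returns true if the result was stored, false if it was not eligible.
  store(key, result, size) {
    if (this.maxResults < 1 || size > this.maxEntrySize || size > this.maxResultsSize) {
      return false; // too large for a single entry, or the cache holds nothing
    }
    this.remove(key); // replace any existing entry for the same key
    // Make room: respect both the entry count and the cumulated size limit.
    while (
      this.entries.size >= this.maxResults ||
      this.totalSize + size > this.maxResultsSize
    ) {
      const oldestKey = this.entries.keys().next().value;
      this.remove(oldestKey);
    }
    this.entries.set(key, { result, size });
    this.totalSize += size;
    return true;
  }

  remove(key) {
    const entry = this.entries.get(key);
    if (entry !== undefined) {
      this.totalSize -= entry.size;
      this.entries.delete(key);
    }
  }

  lookup(key) {
    const entry = this.entries.get(key);
    return entry === undefined ? undefined : entry.result;
  }
}

// Tiny limits so the effect is visible; sizes are in arbitrary units.
const cache = new BoundedResultsCache({ maxResults: 2, maxResultsSize: 100, maxEntrySize: 60 });
const stored1 = cache.store("q1", ["a"], 50);
const stored2 = cache.store("q2", ["b"], 40);
const stored3 = cache.store("q3", ["c"], 30); // count limit reached: q1 is evicted first
const stored4 = cache.store("q4", ["d"], 70); // exceeds maxEntrySize: not stored

console.log(stored1, stored2, stored3, stored4);
console.log(cache.lookup("q1"), cache.totalSize);
```

Checking the per-entry limit before touching existing entries means an oversized result never causes other results to be evicted, which matches the documented rule that such results are simply not eligible for caching.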
Documentation/Books/Manual/ReleaseNotes/NewFeatures34.md

Lines changed: 54 additions & 6 deletions
@@ -820,6 +820,53 @@ the optimizer in 3.4 will now be able to use a sparse index on `value`:
 The optimizer in 3.3 was not able to detect this, and refused to use sparse indexes
 for such queries.

+### Query results cache
+
+The AQL query results cache in ArangoDB 3.4 has got additional parameters to
+control which queries should be stored in the cache.
+
+In addition to the already existing configuration option `--query.cache-entries`
+that controls the maximum number of query results cached in each database's
+query results cache, there now exist the following extra options:
+
+- `--query.cache-entries-max-size`: maximum cumulated size of the results stored
+  in each database's query results cache
+- `--query.cache-entry-max-size`: maximum size for an individual cache result
+- `--query.cache-include-system-collections`: whether or not results of queries
+  that involve system collections should be stored in the query results cache
+
+These options allow more effective control of the amount of memory used by the
+query results cache, and can be used to better utilize the cache memory.
+
+The cache configuration can be changed at runtime using the `properties` function
+of the cache. For example, to limit the per-database number of cache entries to
+256, the per-database cumulated size of query results to 64 MB,
+and the maximum size of each individual cache entry to 1 MB, the following call
+could be used:
+
+```
+require("@arangodb/aql/cache").properties({
+  maxResults: 256,
+  maxResultsSize: 64 * 1024 * 1024,
+  maxEntrySize: 1024 * 1024,
+  includeSystem: false
+});
+```
+
+The contents of the query results cache can now also be inspected at runtime using
+the cache's new `toArray` function:
+
+```
+require("@arangodb/aql/cache").toArray();
+```
+
+This will show all query results currently stored in the query results cache of
+the current database, along with their query strings, sizes, number of results
+and original query run times.
+
+The functionality is also available via HTTP REST APIs.
+
+
 ### Miscellaneous changes

 When creating query execution plans for a query, the query optimizer was fetching
@@ -867,7 +914,7 @@ undesired. Creating a streaming cursor for such queries will solve both problems
 Please note that streaming cursors will use resources all the time till you
 fetch the last chunk of results.

-Depending on the storage engine you use this has different consequences:
+Depending on the storage engine used this has different consequences:

 - **MMFiles**: While before collection locks would only be held during the creation of the cursor
   (the first request) and thus until the result set was well prepared,
@@ -876,27 +923,28 @@ Depending on the storage engine you use this has different consequences:

   While Multiple reads are possible, one write operation will effectively stop
   all other actions from happening on the collections in question.
-- **Rocksdb**: Reading occurs on the state of the data when the query
+- **RocksDB**: Reading occurs on the state of the data when the query
   was started. Writing however will happen during working with the cursor.
   Thus be prepared for possible conflicts if you have other writes on the collections,
   and probably overrule them by `ignoreErrors: True`, else the query
   will abort by the time the conflict happens.

 Taking into account the above consequences, you shouldn't use streaming
-cursors light minded for data modification queries.
+cursors light-minded for data modification queries.

 Please note that the query options `cache`, `count` and `fullCount` will not work with streaming
 cursors. Additionally, the query statistics, warnings and profiling data will only be available
-when the last result batch for the query is sent.
+when the last result batch for the query is sent. Using a streaming cursor will also prevent
+the query results being stored in the AQL query results cache.

 By default, query cursors created via the cursor API are non-streaming in ArangoDB 3.4,
 but streaming can be enabled on a per-query basis by setting the `stream` attribute
 in the request to the cursor API at endpoint `/_api/cursor`.

-However, streaming cursors are enabled for the following parts of ArangoDB in 3.4:
+However, streaming cursors are enabled automatically for the following parts of ArangoDB in 3.4:

 * when exporting data from collections using the arangoexport binary
-* when using `db.<collection>.toArray()` from the Arango shell.
+* when using `db.<collection>.toArray()` from the Arango shell

 Native implementations
 ----------------------

Documentation/Books/Manual/ReleaseNotes/UpgradingChanges34.md

Lines changed: 14 additions & 0 deletions
@@ -273,6 +273,14 @@ APIs:

 The following APIs have been added or augmented:

+- additional `stream` attribute in queries HTTP API
+
+  The REST APIs for retrieving the list of currently running and slow queries
+  at `GET /_api/query/current` and `GET /_api/query/slow` are now returning an
+  additional attribute `stream` for each query.
+
+  This attribute indicates whether the query was started using a streaming cursor.
+
 - `POST /_api/document/{collection}` now supports repsert (replace-insert).

   This can be achieved by using the URL parameter `overwrite=true`. When set to
@@ -341,6 +349,12 @@ The following APIs have been added or augmented:
   In single-server mode, the *shardingStrategy* attribute is meaningless and
   will be ignored.

+- a new API for inspecting the contents of the AQL query results cache has been added
+  to endpoint `GET /_api/query/cache/entries`
+
+  This API returns the current contents of the AQL query results cache of the
+  currently selected database.
+
 - APIs for view management have been added at endpoint `/_api/view`.

 - The REST APIs for modifying graphs at endpoint `/_api/gharial` now support returning

Documentation/DocuBlocks/Rest/AQL/DeleteApiQueryCache.md

Lines changed: 1 addition & 1 deletion
@@ -5,7 +5,7 @@
 @RESTHEADER{DELETE /_api/query-cache, Clears any results in the AQL query results cache}

 @RESTDESCRIPTION
-clears the query cache
+clears the query results cache for the current database
 @RESTRETURNCODES

 @RESTRETURNCODE{200}

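The revised description stresses that clearing affects only the current database's cache. A minimal sketch of that scoping follows, assuming one independent cache per database. The `cachesByDatabase` registry, the `cacheFor` and `clearCache` helpers and the database name `shop` are hypothetical and exist only for this illustration.

```javascript
// Sketch only: one results cache per database name (hypothetical registry).
const cachesByDatabase = new Map();

// Return the cache of the given database, creating it on first use.
function cacheFor(databaseName) {
  if (!cachesByDatabase.has(databaseName)) {
    cachesByDatabase.set(databaseName, new Map());
  }
  return cachesByDatabase.get(databaseName);
}

// Comparable in spirit to DELETE /_api/query-cache issued against one database:
// only that database's cached results are dropped.
function clearCache(databaseName) {
  cacheFor(databaseName).clear();
}

cacheFor("_system").set("key-1", [1, 2, 3]);
cacheFor("shop").set("key-2", ["a", "b"]);

clearCache("shop");

console.log(cacheFor("shop").size);    // cache of "shop" is now empty
console.log(cacheFor("_system").size); // cache of "_system" is untouched
```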