8000 Incorrect sorting with sort criteria partially covered by index · Issue #10440 · arangodb/arangodb · GitHub
[go: up one dir, main page]

Skip to content

Incorrect sorting with sort criteria partially covered by index #10440

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
mpoeter opened this issue Nov 14, 2019 · 5 comments
Closed

Incorrect sorting with sort criteria partially covered by index #10440

mpoeter opened this issue Nov 14, 2019 · 5 comments
Assignees
Labels
1 Bug 2 Fixed Resolution 3 AQL Query language related 3 Index
Milestone

Comments

@mpoeter
Copy link
Contributor
mpoeter commented Nov 14, 2019

My Environment

  • ArangoDB Version: 3.4.8 & self-compiled devel branch
  • Storage Engine: RocksDB
  • Deployment Mode: Single Server
  • Operating System: Ubuntu 18.04
  • Total RAM in your machine: 16GB

Component, Query & Data

Affected feature:
AQL query using web interface

AQL query (if applicable):

FOR a in activities
    FILTER a.user_id == "31132147"
    SORT a.datetime DESC, a.order DESC, a._key DESC
    LIMIT 10
    RETURN KEEP(a, "_key", "name", "order", "datetime")

AQL explain (if applicable):

Query String (178 chars, cacheable: true):
 FOR a in activities
     FILTER a.user_id == "31132147"
     SORT a.datetime DESC, a.order DESC, a._key DESC
     LIMIT 10
     RETURN KEEP(a, "_key", "name", "order", "datetime")

Execution plan:
 Id   NodeType          Est.   Comment
  1   SingletonNode        1   * ROOT
 12   IndexNode          589     - FOR a IN activities   /* reverse skiplist index scan */
  6   CalculationNode    589       - LET #5 = a.`order`   /* attribute expression */   /* collections used: a : activities */
  7   CalculationNode    589       - LET #7 = a.`_key`   /* attribute expression */   /* collections used: a : activities */
  8   SortNode           589       - SORT #5 DESC, #7 DESC
  9   LimitNode           10       - LIMIT 0, 10
 10   CalculationNode     10       - LET #9 = KEEP(a, "_key", "name", "order", "datetime")   /* simple expression */   /* collections used: a : activities */
 11   ReturnNode          10       - RETURN #9

Indexes used:
 By   Type       Collection   Unique   Sparse   Selectivity   Fields                                 Ranges
 12   skiplist   activities   false    false        37.29 %   [ `user_id`, `datetime`, `_key` ]   (a.`user_id` == "31132147")

Functions used:
 Name   Deterministic   Cacheable   Uses V8
 KEEP   true            true        false  

Optimization rules applied:
 Id   RuleName
  1   move-calculations-up
  2   move-filters-up
  3   move-calculations-up-2
  4   move-filters-up-2
  5   use-indexes
  6   remove-filter-covered-by-index
  7   use-index-for-sort
  8   remove-unnecessary-calculations-2

Problem:
The query optimizer recognizes correctly that the filter and the first sorting criterion are covered by the index on [user_id, datetime, _key], so it removes them. However, this still leaves the SORT node with the two remaining criteria and the query therefore produces a result that is not sorted correctly.

Expected result:
Result is sorted correctly

Workaround:
Create an index that covers all fields used for sorting.

Steps to reproduce

  1. Create a collection activities
  2. Insert these sample documents sample.json.gz
  3. Create an index for [user_id, datetime, _key]
  4. Run the query
@jsteemann
Copy link
Contributor

I think the dataset does not match the query somehow.
The dataset does not include any user_id attributes for any of the documents.

However, I will make up some data on my own and try to reproduce.

@OmarAyo OmarAyo added 3 AQL Query language related 3 Index labels Nov 14, 2019
@jsteemann
Copy link
Contributor

Reproduced. Looking into it...

@jsteemann jsteemann self-assigned this Nov 14, 2019
@mpoeter
Copy link
Contributor Author
mpoeter commented Nov 14, 2019

Right, sorry. I simply took the result of the query and forgot that it doesn't include a user_id.

@jsteemann
Copy link
Contributor

@jsteemann jsteemann added this to the 3.4.9 milestone Nov 14, 2019
ObiWahn added a commit that referenced this issue Nov 19, 2019
…ture/dump-db-properties

* 'devel' of https://github.com/arangodb/arangodb:
  Feature/aql subquery splicing with gather (#10341)
  Add braces (#10460)
  fixed issue #10440: Incorrect sorting with sort criteria partially covered by index (#10442)
  Remove as now unused SingleBlockFetcher (#10416)
ObiWahn added a commit that referenced this issue Nov 19, 2019
…ture/remove-db-properties-route

* 'devel' of https://github.com/arangodb/arangodb:
  Feature/aql subquery splicing with gather (#10341)
  Add braces (#10460)
  fixed issue #10440: Incorrect sorting with sort criteria partially covered by index (#10442)
KVS85 pushed a commit that referenced this issue Nov 20, 2019
…vered by index (#10443)

* fixed issue #10440: Incorrect sorting with sort criteria partially covered by index

* Update CHANGELOG
KVS85 pushed a commit that referenced this issue Nov 20, 2019
…vered by index (#10443)

* fixed issue #10440: Incorrect sorting with sort criteria partially covered by index

* Update CHANGELOG
ObiWahn added a commit that referenced this issue Nov 26, 2019
…re-3.5/query-timeout

* '3.5' of https://github.com/arangodb/arangodb:
  Allow the optimizer to use indexes (#10543)
  Updated arangosync to 0.7.0 (#10532)
  Bug fix 3.5/little weekend fix (#10527)
  Bug fix 3.5/issue 10470 (#10497)
  add tests for inventory (#10464) (#10495)
  fixed issue #10440: Incorrect sorting with sort criteria partially covered by index (#10443)
  [3.5] Added precondition to ensure that server is still as seen before. (#10477)
  improve error message (#10478)
  allow in-place analyzer creation via link definition (#10466) (#10481)
  make replication timeouts configurable via startup options (#10473)
  [3.5] no coordinators left behind (#10420)
  don't return any in-progress indexes (#10431)
  fix invalid assertion (#10428)
  show type of index when non-implemented exception is thrown (#10425)
  Fix dump tests
  Improve syncer error message (#10404)
  DEVSUP-492 (#10379) (#10402)
KVS85 pushed a commit that referenced this issue Dec 26, 2019
@maxkernbach
Copy link
Contributor

Hi @mpoeter,

Version 3.4.9, which includes a fix for the sorting, has just been released.
From the Changelog:

In case this issue still occurs after upgrading, please comment and we will reopen the ticket.

Please find the latest release in the download section.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
1 Bug 2 Fixed Resolution 3 AQL Query language related 3 Index
Projects
None yet
Development

No branches or pull requests

4 participants
0