AQL with path filters returns unexpected results · Issue #3597 · arangodb/arangodb

Closed
7 of 27 tasks
vanobig opened this issue Nov 6, 2017 · 8 comments
Labels: 1 Bug 2 Fixed Resolution 3 Graph Graph query engine

vanobig commented Nov 6, 2017

my environment running ArangoDB

I'm using the latest ArangoDB of the respective release series:

  • 2.8
  • 3.0
  • 3.1
  • 3.2
  • self-compiled devel branch

Mode:

  • Cluster
  • Single-Server

Storage-Engine:

  • mmfiles
  • rocksdb

On this operating system:

  • DCOS on
    • AWS
    • Azure
    • own infrastructure
  • Linux
    • Debian .deb
    • Ubuntu .deb
    • SUSE .rpm
    • RedHat .rpm
    • Fedora .rpm
    • Gentoo
    • docker - official docker library
    • other:
  • Windows, version:
  • MacOS, version: Sierra, 10.12.6

this is an AQL-related issue:

[x] I'm using graph features

I'm issuing AQL via:

  • web interface with this browser: Chrome Version 61.0.3163.100 (Official Build) (64-bit) running on this OS: MacOS, version: Sierra, 10.12.6
  • arangosh
  • this Driver: arangodb/go-driver

I've run db._explain("<my aql query>") and it didn't shed more light on this.

The AQL query in question is:
FOR v, e, p in 3 OUTBOUND "company/jquery" company_teams, team_contributors, committed OPTIONS {uniqueVertices: "global", bfs: true} FILTER p.vertices[1]._key == "1055031914" AND length(p.vertices[2].parents) < 2 RETURN v._id

The issue can be reproduced using this dataset:
https://drive.google.com/open?id=1VIZFL3vcZTw5q2ur-FzZQNrAa1u1jYXx

These are the steps to reproduce:
The provided dataset has the following structure:
company (vertex) -> company_teams (edge) -> team (vertex) -> team_contributors (edge) -> contributor (vertex) -> committed (edge) -> commit (vertex)
For simplicity, each collection (vertex and edge) has only one record.

Run query #1
FOR v, e, p in 3 OUTBOUND "company/jquery" company_teams, team_contributors, committed OPTIONS {uniqueVertices: "global", bfs: true} FILTER p.vertices[1]._key == "1055031914" RETURN v._id
and see that the result has one commit ("commit/1571475"), exactly what I expect.

Run query #2
FOR v, e, p in 3 OUTBOUND "company/jquery" company_teams, team_contributors, committed OPTIONS {uniqueVertices: "global", bfs: true} FILTER length(p.vertices[2].parents) < 2 RETURN v._id
and see that the result has one commit ("commit/1571475"), again exactly what I expect.

In queries #1 and #2 I tried two different filters, and each returned the commit record.
I assume that the two filters combined with AND should give me the same result.

Run query #3
FOR v, e, p in 3 OUTBOUND "company/jquery" company_teams, team_contributors, committed OPTIONS {uniqueVertices: "global", bfs: true} FILTER p.vertices[1]._key == "1055031914" AND length(p.vertices[2].parents) < 2 RETURN v._id
and see that the result is empty.
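The expectation behind query #3 can be sketched in Python. This is only a model of the intended filter semantics, not ArangoDB code. The path below is a hypothetical stand-in for the single path in the dataset: only the team key and the commit id come from the report, and the contributor document and its `parents` field are invented for illustration.

```python
# Hypothetical stand-in for the one path company -> team -> contributor -> commit.
# Only "1055031914" and "commit/1571475" come from the report; the rest is assumed.
path = {
    "vertices": [
        {"_id": "company/jquery", "_key": "jquery"},
        {"_id": "team/1055031914", "_key": "1055031914"},
        {"_id": "contributor/example", "_key": "example", "parents": []},
        {"_id": "commit/1571475", "_key": "1571475"},
    ]
}


def filter_1(p):
    # Models: FILTER p.vertices[1]._key == "1055031914"
    return p["vertices"][1]["_key"] == "1055031914"


def filter_2(p):
    # Models: FILTER length(p.vertices[2].parents) < 2
    # (a missing "parents" attribute is treated as an empty list here)
    return len(p["vertices"][2].get("parents") or []) < 2


def run(paths, *filters):
    # Return the _id of the last vertex of every path that passes all filters.
    return [p["vertices"][-1]["_id"] for p in paths if all(f(p) for f in filters)]


query_1 = run([path], filter_1)
query_2 = run([path], filter_2)
query_3 = run([path], filter_1, filter_2)
print(query_1, query_2, query_3)
```

If each filter accepts the path on its own, their conjunction has to accept it too, so all three lists should contain "commit/1571475". An empty result for query #3 is therefore inconsistent with the results of queries #1 and #2.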

mchacki (Member) commented Nov 8, 2017

Hi,

Thanks for reporting. I managed to reproduce it locally, and a fix is under way.
I will keep you posted here.

mchacki self-assigned this Nov 8, 2017
mchacki added this to the 3.2.7 milestone Nov 8, 2017

vanobig (Author) commented Nov 8, 2017

@mchacki, thanks for the heads up.

vanobig (Author) commented Nov 8, 2017

@mchacki, do you guys have an idea of why this might be happening? And could you explain what kinds of queries are affected by this bug? I see another issue, and I have a hunch it's somewhat related, but I don't want to create another issue yet.

mchacki (Member) commented Nov 14, 2017

@vanobig There were actually two bugs:

  1. In query #1 the optimizer used a fast-path optimization because the query does not use the e and p variables. In that case we can do a much more performant neighbors search.
    Initially, this optimization was not applied when a filter was given on p, so the fast path never handled the filter condition itself. In one of the recent releases, however, the optimizer got more clever: it moved the filter into the traverser, used the optimized version, and removed the filter from the query.
    => The traverser said it would validate the condition but did not. So you only got your "correct" result by coincidence, as it is the only result. If there were more results (not matching the condition!) you would get them as well ;(

  2. There was a bug in one of the traversers: an off-by-one error on the vertex depths.
    It did not filter depth n vertices with the condition, but depth n-1 vertices.
    So in your case FILTER p.vertices[1]._key == was actually evaluated as FILTER p.vertices[0]._key ==. This only applied to {bfs: true} traversals.

Both of the above are fixed in version 3.2.7, which is about to be released.
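
The two bugs described above can be illustrated with a small Python model. It is a sketch under stated assumptions, not the ArangoDB traverser: the `traverse` function, its `off_by_one` and `skip_validation` flags, and the second (non-matching) path are all hypothetical and exist only to show the effect of each bug.

```python
def traverse(paths, conditions, off_by_one=False, skip_validation=False):
    """Return the _id of the last vertex of each path that passes the conditions.

    conditions is a list of (depth, predicate) pairs with depth >= 1.
    skip_validation models bug 1: the conditions are never checked.
    off_by_one models bug 2: a depth-n condition is checked on the depth n-1 vertex.
    """
    results = []
    for vertices in paths:
        accepted = True
        if not skip_validation:
            for depth, predicate in conditions:
                index = depth - 1 if off_by_one else depth
                if not predicate(vertices[index]):
                    accepted = False
                    break
        if accepted:
            results.append(vertices[-1]["_id"])
    return results


# Hypothetical data: the first path mirrors the report, the second is invented
# so that there is a path that should be filtered out.
matching = [
    {"_id": "company/jquery", "_key": "jquery"},
    {"_id": "team/1055031914", "_key": "1055031914"},
    {"_id": "contributor/example", "_key": "example", "parents": []},
    {"_id": "commit/1571475", "_key": "1571475"},
]
other = [
    {"_id": "company/jquery", "_key": "jquery"},
    {"_id": "team/other", "_key": "other"},
    {"_id": "contributor/other", "_key": "other", "parents": []},
    {"_id": "commit/other", "_key": "other"},
]

conditions = [
    (1, lambda v: v["_key"] == "1055031914"),       # p.vertices[1]._key == "1055031914"
    (2, lambda v: len(v.get("parents", [])) < 2),   # length(p.vertices[2].parents) < 2
]

fixed = traverse([matching, other], conditions)
bug_1 = traverse([matching, other], conditions, skip_validation=True)
bug_2 = traverse([matching, other], conditions, off_by_one=True)
print(fixed)  # ['commit/1571475']
print(bug_1)  # ['commit/1571475', 'commit/other']
print(bug_2)  # []
```

With validation skipped, the non-matching path leaks into the result, which is why a dataset with a single path hides bug 1. With the shifted index, the key check runs against the start vertex and rejects every path, which matches the empty result of query #3.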

vanobig (Author) commented Nov 14, 2017

Thank you @mchacki, from the way you explained this, it does sound like the issue we experience.

sleto-it (Contributor) commented

Hi @vanobig,

Thanks again for opening this ticket. Version 3.2.7 has now been released and it includes a fix for this problem (PR #3697).

Could you please upgrade? I am now closing this ticket. If the issue persists, please feel free to comment.

Many thanks,

sleto-it added the 2 Fixed Resolution label Nov 15, 2017

vanobig (Author) commented Nov 16, 2017

Just finished testing, and yes, I do not see the issue any more. Confirm fixed.
Thank you guys for your effort.

sleto-it (Contributor) commented

Thanks for the feedback!
