support path semantic in non-recursive-path #4405
base: master
Hi @excaliburwyj,
Thanks for the contribution. I think the overall idea makes sense, though I'm in the middle of migrating recursive join to a different parallel computation framework (#4404). I'll review this in detail during the weekend. Hope this works fine for you.
51401b7 to aaa2bc4
Whoops, forgot to review this PR. Let me do it today. Sorry about this.
Hi @excaliburwyj,
While reviewing this PR, another idea came to mind. Instead of first finding a join operator on top of a recursive join and then appending a filter (which is the logic you implemented in PathSemanticRewriter), I wonder if the following approach would be simpler.
First, we add a scalar function hasDuplication to the system. It takes an arbitrary number of nodes or relations (a relation can be recursive), e.g.
MATCH (a)-[e]->(b)-[]->(c) WHERE NOT hasDuplication(a, b, c)
The above is an equivalent form of
MATCH p = (a)-[e]->(b)-[]->(c) WHERE is_acyclic(p)
The difference is that it doesn't require a path variable.
Once this is done, for a given MATCH query, we can check whether the current semantic is trail or acyclic. If so, at the binding stage, we directly add a hasDuplication predicate expression over either the nodes or the relations, depending on the semantic.
With this approach, the changes are limited to the function module and the binder module, and we don't need to worry about planning and optimization.
Let me know what you think.
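To make the proposal concrete, here is a minimal Python sketch of the check that hasDuplication would perform. It is not the project's implementation: the name has_duplication, the use of plain hashable values as internal IDs, and the convention that a recursive relation arrives as a list of relation IDs are all assumptions made for illustration.

```python
from typing import Hashable, Iterable, Union

# An argument is either one internal ID (a node or a single relation) or,
# for a recursive relation, a list/tuple of relation IDs. This
# representation is hypothetical and chosen only for the sketch.
Element = Union[Hashable, Iterable[Hashable]]


def has_duplication(*elements: Element) -> bool:
    """Return True if any internal ID occurs more than once across all arguments."""
    seen = set()
    for element in elements:
        # Flatten recursive relations; wrap single IDs so both cases share a loop.
        ids = element if isinstance(element, (list, tuple)) else (element,)
        for internal_id in ids:
            if internal_id in seen:
                return True
            seen.add(internal_id)
    return False
```

Under these assumptions, `NOT hasDuplication(a, b, c)` keeps a row only when all three bound nodes are distinct, which is the acyclic condition for a fixed-length pattern. Passing the relations instead gives the trail condition.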
Hi @andyfengHKU,
Hi @excaliburwyj,
I totally agree with the idea that we should generate a different result when the semantic is set to [...]
What I'm suggesting is that, at the implementation level, we construct an additional filter in the binder, and this is hidden from the user. For example, user inputs are [...]
Internally we perform a rewrite to the second query as [...]
The user is not aware of this rewrite but we can achieve a [...]
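The binder-side rewrite suggested here can be sketched as a function that picks the predicate from the active semantic. This is an illustrative sketch only: semantic_predicate is a hypothetical helper, the semantic names are passed as plain strings, and any mode other than trail or acyclic is assumed to need no extra filter. It also assumes every relation in the pattern has a variable name, so an anonymous relation such as `[]` would first need an internal name.

```python
from typing import List, Optional


def semantic_predicate(
    semantic: str, node_vars: List[str], rel_vars: List[str]
) -> Optional[str]:
    """Return the hidden filter text to add for a MATCH pattern, or None."""
    mode = semantic.lower()
    if mode == "trail":
        # Trail: no relation may repeat, so the check runs over relations.
        args = rel_vars
    elif mode == "acyclic":
        # Acyclic: no node may repeat, so the check runs over nodes.
        args = node_vars
    else:
        # Assumed unrestricted mode: leave the query untouched.
        return None
    if len(args) < 2:
        # A single variable cannot duplicate anything.
        return None
    return "NOT hasDuplication(" + ", ".join(args) + ")"
```

For the pattern `(a)-[e1]->(b)-[e2]->(c)`, the acyclic case yields `NOT hasDuplication(a, b, c)` and the trail case yields `NOT hasDuplication(e1, e2)`. The binder would combine the returned predicate with whatever WHERE clause the user already wrote.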
Hi @andyfengHKU
aaa2bc4 to d471344
Hi @andyfengHKU
Hi @excaliburwyj,
Yeah, this one looks great to me. I want to do some refactoring and add some tests, but I cannot push directly to your branch. Do you mind if I merge it into a dev branch, make some updates, and then merge to master?
Hi @andyfengHKU,
Description
For non-recursive paths, such as "MATCH (a)-[b]-(c)-[d]-(e) RETURN e;", path semantics do not work.
This PR adds support for them.
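As a rough illustration of what honoring the semantic means for the two-hop undirected pattern above, the Python sketch below enumerates matches on a tiny in-memory graph. The graph, the relation IDs r0 to r2, the function name two_hop_matches, and the "walk" label for the unrestricted mode are all assumptions for this example. It does not reflect how the engine evaluates the query.

```python
from collections import defaultdict

# Hypothetical toy graph: r0 and r1 are parallel relations between nodes 0 and 1.
EDGES = {"r0": (0, 1), "r1": (0, 1), "r2": (1, 2)}


def two_hop_matches(edges, semantic="walk"):
    """Enumerate (a, b, c, d, e) bindings for (a)-[b]-(c)-[d]-(e)."""
    # Undirected pattern: each relation can be traversed in both directions.
    adjacency = defaultdict(list)
    for rel_id, (src, dst) in edges.items():
        adjacency[src].append((rel_id, dst))
        adjacency[dst].append((rel_id, src))

    matches = []
    for a in list(adjacency):
        for b, c in adjacency[a]:
            for d, e in adjacency[c]:
                if semantic == "trail" and b == d:
                    continue  # trail: the same relation may not be reused
                if semantic == "acyclic" and len({a, c, e}) < 3:
                    continue  # acyclic: the same node may not be revisited
                matches.append((a, b, c, d, e))
    return matches
```

On this toy graph the unrestricted enumeration returns 14 bindings, trail returns 8, and acyclic returns 4. The parallel relations r0 and r1 are what separate trail from acyclic: going from 0 to 1 over r0 and back over r1 reuses no relation but does revisit node 0.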
Fixes # (issue)