Support any expression for dynamic filtering probe side #17169
zhenxiao merged 2 commits into prestodb:master
289430c to 6b02f07
ec9b51d to 5a03fa5
rongrong
left a comment
What's the motivation of this PR? What problem does it solve? If it enables new use cases, please add new tests to reflect that.
presto-main/src/main/java/com/facebook/presto/sql/planner/optimizations/PredicatePushDown.java
Outdated
This logic is not about whether the operator is a comparison operator. What does it do? Please name the function to reflect the logic. Thanks!
Yep, the supported operators should be comparisons, excluding not_equal and distinct, and the expression needs one of the following:
- the left child contains left variables and the right child contains right variables
- or the left child contains right variables and the right child contains left variables
get it updated
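The rule described above can be sketched as follows. This is a minimal illustration, not the code in PredicatePushDown.java: the class `DynamicFilterEligibility`, the `Operator` enum, and the string-based variable sets are all hypothetical. It also assumes that "distinct" means IS DISTINCT FROM and that "contains left variables" means the child refers only to variables from that side.

```java
import java.util.Set;

// Hypothetical, simplified model of the check discussed in this thread.
// It does not use Presto's planner classes.
final class DynamicFilterEligibility
{
    // Comparison operators; NOT_EQUAL and IS_DISTINCT_FROM are the excluded ones.
    enum Operator
    {
        EQUAL, LESS_THAN, LESS_THAN_OR_EQUAL, GREATER_THAN, GREATER_THAN_OR_EQUAL,
        NOT_EQUAL, IS_DISTINCT_FROM
    }

    private DynamicFilterEligibility() {}

    static boolean isSupportedOperator(Operator operator)
    {
        return operator != Operator.NOT_EQUAL && operator != Operator.IS_DISTINCT_FROM;
    }

    // leftChildVariables / rightChildVariables: variables referenced by each child of the comparison.
    // leftVariables / rightVariables: output variables of the join's left and right inputs.
    static boolean isEligible(
            Operator operator,
            Set<String> leftChildVariables,
            Set<String> rightChildVariables,
            Set<String> leftVariables,
            Set<String> rightVariables)
    {
        if (!isSupportedOperator(operator)) {
            return false;
        }
        boolean aligned = refersOnlyTo(leftChildVariables, leftVariables)
                && refersOnlyTo(rightChildVariables, rightVariables);
        boolean swapped = refersOnlyTo(leftChildVariables, rightVariables)
                && refersOnlyTo(rightChildVariables, leftVariables);
        return aligned || swapped;
    }

    // Assumption: a child "contains" a side's variables when it references at least one
    // variable and all of them come from that side.
    private static boolean refersOnlyTo(Set<String> referenced, Set<String> side)
    {
        return !referenced.isEmpty() && side.containsAll(referenced);
    }

    public static void main(String[] args)
    {
        Set<String> orders = Set.of("o_orderkey");
        Set<String> lineitem = Set.of("l_orderkey");
        // l.orderkey + 1 = o.orderkey with orders on the left: the children are swapped.
        System.out.println(isEligible(Operator.EQUAL, lineitem, orders, orders, lineitem));
    }
}
```

Treating both child orderings symmetrically means a predicate written as `l.orderkey + 1 = o.orderkey` is handled the same way as `o.orderkey = l.orderkey + 1`.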
This logic is repeating the logic in the function. Since the function is only used once, might as well just inline. It would be easier to read that way too.
sure. will inline
5a03fa5 to 3819145
thank you, @rongrong
kaikalur
left a comment
I would like a more high-level explanation in the PR description. It uses too many special cases and correctness is not obvious. Also, it doesn't seem to check for non-deterministic expressions. And if you are doing this for a specific case, please elaborate.
Why not left join? Are you assuming right side is going to be the build?
Yes, dynamic filtering applies to broadcast joins, and the right side is the build side.
Since dynamic filtering generates predicates and propagates them to the probe side, a right join is fine, but a left join might not be correct.
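To make the right-join versus left-join point concrete, here is a toy simulation. It is only a sketch under simplifying assumptions: rows are single integer keys, the join condition is plain equality, the probe is the left input and the build is the right input, and the class `ProbeFilterSketch` and its methods are hypothetical names, not Presto code.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model: each row is one integer join key; "null" marks the padded side of an outer join.
final class ProbeFilterSketch
{
    enum JoinType { INNER, LEFT, RIGHT }

    private ProbeFilterSketch() {}

    // Joins probe (left input) with build (right input) on key equality.
    // Each result row is rendered as "probeKey,buildKey".
    static List<String> join(JoinType type, List<Integer> probe, List<Integer> build)
    {
        List<String> rows = new ArrayList<>();
        Set<Integer> matchedBuildKeys = new HashSet<>();
        for (Integer p : probe) {
            boolean matched = false;
            for (Integer b : build) {
                if (p.equals(b)) {
                    rows.add(p + "," + b);
                    matched = true;
                    matchedBuildKeys.add(b);
                }
            }
            // A left join keeps unmatched probe rows, padded with null.
            if (!matched && type == JoinType.LEFT) {
                rows.add(p + ",null");
            }
        }
        // A right join keeps unmatched build rows, padded with null.
        if (type == JoinType.RIGHT) {
            for (Integer b : build) {
                if (!matchedBuildKeys.contains(b)) {
                    rows.add("null," + b);
                }
            }
        }
        return rows;
    }

    // Dynamic filter for an equality join: keep only probe rows whose key occurs on the build side.
    static List<Integer> applyDynamicFilter(List<Integer> probe, List<Integer> build)
    {
        Set<Integer> buildKeys = new HashSet<>(build);
        List<Integer> kept = new ArrayList<>();
        for (Integer p : probe) {
            if (buildKeys.contains(p)) {
                kept.add(p);
            }
        }
        return kept;
    }

    // True when filtering the probe side first leaves the join result unchanged.
    static boolean filterPreservesResult(JoinType type, List<Integer> probe, List<Integer> build)
    {
        return join(type, probe, build).equals(join(type, applyDynamicFilter(probe, build), build));
    }

    public static void main(String[] args)
    {
        List<Integer> probe = List.of(1, 2, 3);
        List<Integer> build = List.of(2, 3, 4);
        for (JoinType type : JoinType.values()) {
            System.out.println(type + " preserved: " + filterPreservesResult(type, probe, build));
        }
    }
}
```

In this model the filter only drops probe rows that have no match on the build side. Inner and right joins never emit such rows, so their results are unchanged; a left join must emit them padded with null, so filtering the probe side changes its result.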
Looks like you are assuming that RIGHT join will never be rewritten to LEFT due to some other optimizations? I'm just wondering if this is useful. Please add a real use case and also more comprehensive testing (including hive end to end) and also a session param to control this feature (with default off) if we are convinced this is a good thing to do.
I do not think RIGHT will be rewritten to LEFT when it comes here, since:
- dynamic filtering only applies to broadcast join, we already have session properties to turn it on/off
- the code here is a branch assuming dynamic filtering is on
- the cost-based optimizer is not used
SystemSessionProperties.isEnableDynamicFiltering is the session property to control dynamic filtering
The use case is:
SELECT o.orderkey FROM orders o RIGHT JOIN lineitem l ON l.orderkey + 1 = o.orderkey
The motivation for this feature: when I was implementing comparison operator support for dynamic filtering, I tried to start simple, so I assumed the left and right sides were variables and supported only inner joins. It turned out that supporting right joins and expressions on the probe side takes just a few lines of changes, so I submitted this PR.
kindly ping
I would like to see hive end to end tests running with this feature, and also specific tests covering both the case where the predicate is effective and the case where it is not. I'm concerned we are adding something that's marginally useful but potentially troublesome in terms of correctness. Outer joins confuse me :)
hi @kaikalur end to end tests added in
99b42da to 00d3100
will take a look after tests pass
fba865f to fde4b1b
kaikalur
left a comment
Just test without explicitly setting broadcast join and make sure it doesn't break anything.
fde4b1b to c73407b
The variables are not necessary; if you want to keep them, rightContainsContainsRightVariables and rightContainsContainsLeftVariables should be rightChild... instead.
oh, nice catch. Will fix
I'm not familiar with this test setup, how are these testing the correctness of dynamic filters?
The AbstractTestJoinQueries test cases will be triggered in
TestHiveDistributedJoinQueriesWithDynamicFiltering and
TestHiveDistributedJoinQueriesWithDynamicFilteringAndFilterPushdown,
and in many other tests, to guarantee correctness.
The release note should start with a verb, something like "Add support xxx". I think a single release note capturing both enhancements for dynamic filtering would be fine.
rongrong
left a comment
Please fix nits before merging. Thanks!
c73407b to d9859ba
d9859ba to 7358930
Test plan -
TestDynamicFilter
TestHiveDistributedJoinQueriesWithDynamicFilteringAndFilterPushdown.java