Rewrite expressions in query plans when variables are constant.#19836
Rewrite expressions in query plans when variables are constant.#19836feilong-liu merged 3 commits intoprestodb:masterfrom
Conversation
6f2b197 to
26b3a1b
Compare
f026e02 to
1262756
Compare
99ee828 to
ab08ba2
Compare
There was a problem hiding this comment.
why do we need this constraint?
There was a problem hiding this comment.
No particular reason, it's just I am being cautious here.
There was a problem hiding this comment.
I would remove this. not sure what benefits we're getting by limiting the supported types here.
|
Do we also rerun constant folding if we find all of the variables in an expression have become constants? for example: |
There was a problem hiding this comment.
this looks like a useful utility function we can add to PlannerUtils, something like isEquality
There was a problem hiding this comment.
Looks like that we already have such utils, updated to use existing utils.
...ava/com/facebook/presto/sql/planner/optimizations/RewriteExpressionWithConstantVariable.java
Outdated
Show resolved
Hide resolved
ab08ba2 to
7a50a7c
Compare
3fc758e to
11047ba
Compare
I agree. Let's make an issue to model the cost reduction that we expect for such a ProjectNode ? |
38860b0 to
31f3e6b
Compare
There was a problem hiding this comment.
Since the constant folding rule will make some Join outputs redundant, IMO we should run it immediately after (and call out the need for it as well). Otherwise there's a chance we move PruneUnreferencedOutputs even further down during a refactor
There was a problem hiding this comment.
Does this get replaced with an empty values node with some other rule later ?
There was a problem hiding this comment.
Yes, it will be convert to empty values node by predicate pushdown later
There was a problem hiding this comment.
nit: Using Map keys like orderkey_11 may give the impression that the plan matcher will do an exact match with this text, but this is not the case. Let's use more descriptive key name for this assignments matcher
There was a problem hiding this comment.
Found the root cause - When the PlanNode node is an OutputNode we can skip running planAndReplace on this output node and just call accept on it's source, since the OutputNode will never have a parent. Once I added :
@Override
public PlanNodeWithConstant visitOutput(OutputNode node, Void context)
{
PlanNodeWithConstant updatedSource = accept(node.getSource());
return new PlanNodeWithConstant(node.replaceChildren(ImmutableList.of(updatedSource.getPlanNode())), ImmutableMap.of());
}
the duplicate Projects added went away.
We can do this for other terminal nodes as well - TableWriter, TableFinish ?
aaneja
left a comment
There was a problem hiding this comment.
LGTM % some minor comments
713d9fe to
2fc07ea
Compare
7a7e1f4 to
8342573
Compare
|
Would it be appropriate to add documentation of this |
8342573 to
cbcf620
Compare
cbcf620 to
7cb8801
Compare
|
Codenotify: Notifying subscribers in CODENOTIFY files for diff 3d6bba2...7cb8801.
|
steveburnett
left a comment
There was a problem hiding this comment.
LGTM! (docs)
Pull branch, local build of docs, everything looks good. Thanks!
Rewrite expressions in query plans if we know some variables are constant.
The benefit of this optimizer is in three aspects:
select orderkey, l.partkey, o.orderstatus from lineitem l join orders o using (orderkey) where orderstatus='F', with the optimization, orderstatus will not be in the join hashtable.A variable reference expression in a query plan can be a constant 1) after a filter expression, for example,
select orderstatus, shippriority, count(*) from orders where orderstatus='F' group by orderstatus, shippriority, orderstatus will be a constant after the filter expression. 2) after a project node which assigns constant to a variable reference.After we know these constant variable references, we can avoid passing them along the query plans. Take the above query as an example, orderstatus is not passed along the plan.
Query:
select orderstatus, shippriority, count(*) from orders where orderstatus='F' group by orderstatus, shippriorityBefore:
After optimization:
Test plan - (Please fill in how you tested your changes)
Add unit tests, and also run verifier runs
verifier run, run2, run3, run4, run5, run6