Transform in values to in filter #23161
Yes. It resolves #22728.
Thanks for your advice.
(Edit to acknowledge a release note entry has been added - please ignore this comment.) If this doesn't need a release note - and I don't believe it does - please add this so this PR isn't in the "Missing Release Notes" section of the release notes process (see #23079 for an example). Release Notes
        implements Rule<SemiJoinNode>
{
    private static final Pattern<SemiJoinNode> PATTERN = semiJoin().with(filteringSource()
            .matching(project().with(source()
It's possible that there are multiple projects between the values and semi join nodes.
Yes, I see that. In the previous version there was more than one project; in the current version there is only one. Do you have any ideas on how I can match multiple levels to reach the ValuesNode?
Not sure about the pattern, but values->project* should still be constant? Maybe we could have a separate inline-constant-values rule that would be generally useful, in addition to this rule. So if you have something like:
VALUES('abcd')->Project(substr(field_0, 1, 2))
you can rewrite that as:
VALUES('ab')
Using the RowExpressionInterpreter?
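To make the suggested rewrite concrete, here is a minimal sketch of folding a projection into constant rows. It does not use Presto's planner classes or the RowExpressionInterpreter: `InlineProjectionSketch`, `fold`, `substr`, and `example` are hypothetical names, rows are plain lists, and each projection is an ordinary Java function.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical sketch, not Presto's planner API: VALUES rows are plain lists
// and each projection is an ordinary Java function over one row.
final class InlineProjectionSketch
{
    private InlineProjectionSketch() {}

    // Evaluates every projection against every constant row; the result is
    // the row set of the folded VALUES node.
    static List<List<Object>> fold(List<List<Object>> rows, List<Function<List<Object>, Object>> projections)
    {
        List<List<Object>> folded = new ArrayList<>();
        for (List<Object> row : rows) {
            List<Object> projected = new ArrayList<>();
            for (Function<List<Object>, Object> projection : projections) {
                projected.add(projection.apply(row));
            }
            folded.add(projected);
        }
        return folded;
    }

    // SQL-style substr with a 1-based start index (assumes start >= 1 and length >= 0).
    static String substr(String value, int start, int length)
    {
        int begin = Math.min(start - 1, value.length());
        int end = Math.min(begin + length, value.length());
        return value.substring(begin, end);
    }

    // Models VALUES('abcd') -> Project(substr(field_0, 1, 2)).
    static List<List<Object>> example()
    {
        List<Object> row = new ArrayList<>();
        row.add("abcd");
        List<List<Object>> rows = new ArrayList<>();
        rows.add(row);

        List<Function<List<Object>, Object>> projections = new ArrayList<>();
        projections.add(r -> substr((String) r.get(0), 1, 2));
        return fold(rows, projections);
    }
}
```

Because every input is a constant, the projection can be evaluated once at planning time, which would leave a plain VALUES node for the semi join rule to match.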
public Result apply(SemiJoinNode semiJoinNode, Captures captures, Context context)
{
    PlanNode source = semiJoinNode.getSource();
    return context.getLookup().resolveGroup(semiJoinNode.getFilteringSource()).findFirst()
Having the whole logic within the single return statement may not be a good idea.
Yeah makes it hard to read
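One way to read this feedback is to split the lookup into named steps with early returns. The sketch below is only an illustration under simplifying assumptions: `Node`, `values`, `project`, and `constantRows` are hypothetical stand-ins, not the PR's code or Presto's PlanNode and Lookup types.

```java
import java.util.Collections;
import java.util.List;
import java.util.Optional;

// Hypothetical sketch: a tiny plan-node model used to show the same lookup
// written as separate steps instead of one chained return statement.
final class StepwiseLookupSketch
{
    private StepwiseLookupSketch() {}

    // Minimal stand-in for a plan node: a kind tag, child nodes, and constant rows (VALUES only).
    static final class Node
    {
        final String kind;
        final List<Node> sources;
        final List<List<Object>> rows;

        Node(String kind, List<Node> sources, List<List<Object>> rows)
        {
            this.kind = kind;
            this.sources = sources;
            this.rows = rows;
        }
    }

    static Node values(List<List<Object>> rows)
    {
        return new Node("values", Collections.<Node>emptyList(), rows);
    }

    static Node project(Node source)
    {
        return new Node("project", Collections.singletonList(source), Collections.<List<Object>>emptyList());
    }

    // Returns the constant rows when the filtering source is Project -> Values, otherwise empty.
    static Optional<List<List<Object>>> constantRows(Node filteringSource)
    {
        // Step 1: the filtering source must be a project with exactly one child.
        if (!filteringSource.kind.equals("project") || filteringSource.sources.size() != 1) {
            return Optional.empty();
        }
        // Step 2: that child must be a values node.
        Node child = filteringSource.sources.get(0);
        if (!child.kind.equals("values")) {
            return Optional.empty();
        }
        // Step 3: hand back the rows for the caller to validate and rewrite.
        return Optional.of(child.rows);
    }
}
```

Each early return names one reason the rule would not apply, which tends to be easier to review and debug than a single long chain.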
                Optional.empty());
    }

    public static Property<SemiJoinNode, PlanNode> filteringSource()
Put it in a static class "SemiJoin", like the other node-specific properties below.
public Result apply(SemiJoinNode semiJoinNode, Captures captures, Context context)
{
    PlanNode source = semiJoinNode.getSource();
    return context.getLookup().resolveGroup(semiJoinNode.getFilteringSource()).findFirst()
            .flatMap(projectNode -> context.getLookup().resolveGroup(projectNode.getSources().get(0)).findFirst())
            .map(ValuesNode.class::cast)
            .map(ValuesNode::getRows)
            // check that all values are only a single row expression (no struct/row types)
Hmm why not struct/row? How about array types?
In fact, the apply function works for struct/row and array types. The only problem is how to match a variable number of projects between the ValuesNode and the SemiJoinNode.
Nit: format the release note entry to use ` and not ', and add a space between the session property and the PR number.
ee31b0a to 95fdd1d
Description
IN (VALUES ...) should be translated into a simple IN LIST.
Motivation and Context
This addresses #22728: IN (VALUES ...) should be translated into a simple IN LIST.
If the VALUES table has only one column, transform it into an IN LIST.
Fixes #22728
Impact
Adds another optimization rule, TransformInValuesToInFilter:
If the filteringSource of a SemiJoinNode comes from a ValuesNode with only one column, transform the SemiJoinNode into a ProjectNode whose assignment for the SemiJoinNode's outputVariable is the IN LIST predicate. The outputVariable will be filtered in a later stage.
This rule has to be used together with another rule, InlineProjectionsOnValues, from #23245.
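As a rough illustration of the rewrite described above, the sketch below builds an IN LIST predicate from single-column constant rows and declines otherwise. It is not the rule's implementation: `InValuesSketch` and `toInPredicate` are hypothetical, values are modelled as already-rendered SQL literals, and declining on an empty row set is an assumption made here for simplicity.

```java
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;

// Hypothetical sketch: models the semi join's probe variable as a name and
// the VALUES rows as lists of SQL literal strings.
final class InValuesSketch
{
    private InValuesSketch() {}

    // Returns "probe IN (v1, v2, ...)" when every row has exactly one column.
    // Assumption: an empty row set is left untransformed.
    static Optional<String> toInPredicate(String probe, List<List<String>> rows)
    {
        boolean singleColumn = !rows.isEmpty() && rows.stream().allMatch(row -> row.size() == 1);
        if (!singleColumn) {
            return Optional.empty();
        }
        String inList = rows.stream()
                .map(row -> row.get(0))
                .collect(Collectors.joining(", "));
        return Optional.of(probe + " IN (" + inList + ")");
    }
}
```

In the rule itself, the resulting predicate would be assigned to the semi join's output variable in a ProjectNode rather than rendered as a string.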
Test Plan
Test that the rule fires in this scenario and returns the correct query plan.
Contributor checklist
Release Notes