ESQL: Drop PropagateInlineEvals optimizer rule#138270
ESQL: Drop PropagateInlineEvals optimizer rule#138270bpintea wants to merge 7 commits intoelastic:mainfrom
Conversation
This drops the PropagateInlineEvals rule that moves an Eval (or part of) it from the RHS of an InlineJoin to the LHS of it. Namely, the evaluation of the groups. This can be done directly in the ReplaceAggregateNestedExpressionWithEval rule, that creates these evaluations in the first place. This rule is now InlineJoins aware.
|
Hi @bpintea, I've created a changelog YAML for you. |
|
Pinging @elastic/es-analytical-engine (Team:Analytics) |
| public static List<Attribute> computeOutput(LogicalPlan source, LogicalPlan target) { | ||
| Set<Attribute> stubRelationOutput = new LinkedHashSet<>(target.output()); | ||
| stubRelationOutput.addAll(source.references().stream().filter(Attribute::synthetic).toList()); | ||
| private static List<Attribute> computeOutput(LogicalPlan destinationPlan, LogicalPlan sourcePlan) { |
There was a problem hiding this comment.
I've updated the naming here as I struggled myself to match how they're used to what they're called (here and above). Can revert it my understanding and comments are wrong, though.
There was a problem hiding this comment.
These are cosmetic. Stumbled upon them during development, as tests were failing and realised they're hard to read.
|
|
||
| addedAttrs.put(key, newFunctionAttr); | ||
| return newFunctionAttr; | ||
| return addedAttrs.computeIfAbsent(newFunctionAttr.ignoreId(), k -> newFunctionAttr); |
There was a problem hiding this comment.
This is unrelated too. I got to it as I was considering adding an IdIgnoringAttributeSet and was inspecting other places where this would help, then noticed this can be made easier to read.
There was a problem hiding this comment.
Only added one extra test, existing tests do test the change already.
astefan
left a comment
There was a problem hiding this comment.
I spent two hours on the code in this PR and imho it's making things worse.
PropagateInlineEvals is a "clean" rule that is clear on what its intention are. Reading and understanding that code is easy and more people looking at it will understand what is it doing than looking at the new version in this PR.
In these two hours I tried to refactor the code in ReplaceAggregateNestedExpressionWithEval to eliminate the intricate logic that tries to do different things depending on the Aggregate being on the RHS of an InlineJoin or a regular Aggregate and, especially the code here makes it much harder to really understand the new logic. How this code was before was crystal clear, now I cannot understand what it tries to do even with comments.
Unless the code in this rule is refactored in a much cleaner way, with all the risks mentioned here (unless there is some other implication) I believe the original version is:
- much cleaner for anyone experienced or beginner trying to read and approach this code
- is making the
InlineJoinimplementation much less invasive, something that I think is paramount withInlineJoin.
I am making my review as "Comment" and will defer to the original author of the issue this PR is addressing - @alex-spies - to veto on the approval or rejection of this PR.
FWIW, I've refactor a bit that part, extracting the
I'm myself a bit on the fence a bit about this, at least when freshly reading the code: one rule corrects what other rule produced, which incorrectly, but only in some cases (with
I do agree with this, however. |
There was a problem hiding this comment.
I like this change, thank you @bpintea .
I agree with @astefan that the resulting rule is a bit complex. Well, what we do is a bit complex :) But with this change, the optimizer doesn't generate a broken intermediate logical plan that needs to be fixed in a subsequent step. This makes the current state confusing and it will become even more confusing if/when someone decides to put another optimizer rule between PropagateInlineEvals and ReplaceAggregateNestedExpressionWithEval in the future.
I think such broken intermediate states should lead to test failures in the future because they make it hard to reason about query plans and optimizer rules if said rules cannot assume a consistent plan as input. So this change is required IMHO. I'd like to add corresponding assertions some time soon that should just fail our tests if they detect inconsistent intermediate query plans (such assertions will be off in prod builds).
I can think of 2 more ways that may make this easier to grok. One is: rather than doing everything in one go, the code from PropagateInlineEvals could just be moved and become a second step inside ReplaceAggregateNestedExpressionWithEval. It'd be the exact same amount of complexity that we have now and there'll still be an intermediate step where the query plan is inconsistent, but it's fully owned and controlled by a single rule and after the rule has run, the query plan is consistent again.
The second way is to try and run ReplaceAggregateNestedExpressionWithEval before substituting the InlineStats. That should automatically move any Eval node required for the agg into the left hand side of the inline join. I don't know how complex this change would be, though, but wanted to mention it for completeness' sake.
| public static LogicalPlan stubSource(UnaryPlan sourcePlan, LogicalPlan target) { | ||
| return sourcePlan.replaceChild(new StubRelation(sourcePlan.source(), StubRelation.computeOutput(sourcePlan, target))); | ||
| public static LogicalPlan stubSource(UnaryPlan destination, LogicalPlan target) { | ||
| return destination.replaceChild(StubRelation.of(destination, target)); |
There was a problem hiding this comment.
nit: it's a bit confusing that we renamed target to sourcePlan in SubRelation.java but not here.
| * Replaces the source of the {@code destination} plan with a stub, preserving the output from the {@code target} plan, which | ||
| * the stub substitutes (or theoretically points to). | ||
| */ |
There was a problem hiding this comment.
The added comments help, but I'm still struggling without an example. Can we add examples to the javadoc?
| stubRelationOutput.addAll(source.references().stream().filter(Attribute::synthetic).toList()); | ||
| private static List<Attribute> computeOutput(LogicalPlan destinationPlan, LogicalPlan sourcePlan) { | ||
| Set<Attribute> stubRelationOutput = new LinkedHashSet<>(sourcePlan.output()); | ||
| stubRelationOutput.addAll(destinationPlan.references().stream().toList()); |
There was a problem hiding this comment.
I think this is a (quite confusing) cover up for another bug.
This should only be needed if the LHS of the inline join fails to output some attributes that are needed for the aggregate node on the RHS of the inline join. But that's a broken state to begin with. Since this PR gets rid of such broken intermediate steps, it'd be great to fix this as well when we can.
I ran tests with this commented out. This seems to deal with #137187. It should not be needed in principle. Let's mark it via comment, so we can remove this hack/workaround once #137923 gets merged?
This is from ImplicitCastingMultiTypedDateTruncInlinestats_ByWithEvalWithFilter:
[2025-12-16T19:14:36,703][INFO ][o.e.x.e.o.L.changes ] [test-cluster-0] Rule logical.SubstituteSurrogatePlans applied with change
Limit[10000[INTEGER],false,false] = Limit[10000[INTEGER],false,false]
\_Limit[10[INTEGER],false,false] = \_Limit[10[INTEGER],false,false]
\_OrderBy[[Order[yr{r}#3834,DESC,FIRST], Order[hire_date{f}#3844,DESC,FIRST]]] = \_OrderBy[[Order[yr{r}#3834,DESC,FIRST], Order[hire_date{f}#3844,DESC,FIRST]]]
\_InlineStats[] ! \_InlineJoin[LEFT,[yr{r}#3834],[yr{r}#3834]]
\_Aggregate[[yr{r}#3834],[FilteredExpression[COUNT($$emp_no$converted_to$long{f$}#3845,true[BOOLEAN],PT0S[TIME_DURATION]),h ! |_Eval[[DATETRUNC(P1Y[DATE_PERIOD],hire_date{f}#3844) AS yr#3834]]
ire_date{f}#3844 > 66268800000000 ! | \_EsqlProject[[!emp_no, hire_date{f}#3844]]
0000[DATE_NANOS]] AS c#3837, yr{r}#3834]] ! | \_EsRelation[employees,employees_incompatible][!emp_no, hire_date{f}#3844, $$emp_no$converted_to$l..]
\_Eval[[DATETRUNC(P1Y[DATE_PERIOD],hire_date{f}#3844) AS yr#3834]] ! \_Aggregate[[yr{r}#3834],[FilteredExpression[COUNT($$emp_no$converted_to$long{f$}#3845,true[BOOLEAN],PT0S[TIME_DURATION]),h
\_EsqlProject[[!emp_no, hire_date{f}#3844]] ! ire_date{f}#3844 > 66268800000000
\_EsRelation[employees,employees_incompatible][!emp_no, hire_date{f}#3844, $$emp_no$converted_to$l..] ! 0000[DATE_NANOS]] AS c#3837, yr{r}#3834]]
! \_StubRelation[[!emp_no, hire_date{f}#3844, yr{r}#3834]]
Note that before the rule ran, the EsqlProject after the EsRelation was already faulty; it threw away the synthetic attribute $$emp_no$converted_to$... that's needed in the COUNT.
| * {@code EVAL `a + 1` = a + 1, `x % 2` = x % 2 | INLINE STATS SUM(`a+1`_ref) BY `x % 2`_ref} | ||
| */ | ||
| public final class ReplaceAggregateNestedExpressionWithEval extends OptimizerRules.OptimizerRule<Aggregate> { | ||
| public final class ReplaceAggregateNestedExpressionWithEval extends Rule<LogicalPlan, LogicalPlan> { |
There was a problem hiding this comment.
I'd add an example to the javadoc, so it's easier to expect what this does with inline stats.
| return inlineJoin; | ||
| } | ||
|
|
||
| private static LogicalPlan rule(Aggregate aggregate, @Nullable Holder<Eval> evalForIJHolder) { |
There was a problem hiding this comment.
Maybe evalForIJHolder -> evalForInlineJoin? I was puzzled what IJ was supposed to be for a bit.
This drops the PropagateInlineEvals rule that moves an Eval (or part of) it from the RHS of an InlineJoin to the LHS of it. Namely, the evaluation of the groups.
This can be done directly in the
ReplaceAggregateNestedExpressionWithEval rule, that creates these evaluations in the first place. This rule is now InlineJoins aware.
Closes #124754