ESQL: Drop PropagateInlineEvals optimizer rule by bpintea · Pull Request #138270 · elastic/elasticsearch

bpintea · 2025-11-19T07:54:50Z

This drops the PropagateInlineEvals rule, which moves an Eval (or part of it) from the RHS of an InlineJoin to its LHS: namely, the part that evaluates the groups.

This can be done directly in the ReplaceAggregateNestedExpressionWithEval rule, which creates these evaluations in the first place. That rule is now InlineJoin-aware.

Closes #124754
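
As a rough illustration of the plan shape described above, the following toy model builds an inline join whose grouping expression is evaluated once on the LHS, while the aggregate on the RHS only refers to the resulting alias. All class names, the `$$grp` alias and the `planInlineStats` helper are hypothetical stand-ins, not the actual ES|QL plan classes or the rule's real logic.

```java
public class InlineJoinEvalSketch {
    // Toy plan nodes; these are NOT the real ES|QL classes.
    interface Plan {}
    record Relation(String name) implements Plan {}
    record Stub() implements Plan {}
    record Eval(String alias, String expression, Plan child) implements Plan {}
    record Aggregate(String aggregateFn, String grouping, Plan child) implements Plan {}
    record InlineJoin(Plan left, Plan right, String joinKey) implements Plan {}

    // A grouping that is a bare field name needs no extra evaluation.
    static boolean isPlainField(String grouping) {
        return grouping.matches("[A-Za-z_][A-Za-z0-9_]*");
    }

    // Plans `source | INLINE STATS <aggregateFn> BY <grouping>` in one step.
    static Plan planInlineStats(Plan source, String aggregateFn, String grouping) {
        if (isPlainField(grouping)) {
            return new InlineJoin(source, new Aggregate(aggregateFn, grouping, new Stub()), grouping);
        }
        String alias = "$$grp"; // hypothetical synthetic name
        // The grouping expression is evaluated on the LHS, so both sides of the
        // join can see it; the RHS aggregate only references the alias.
        Plan left = new Eval(alias, grouping, source);
        Plan right = new Aggregate(aggregateFn, alias, new Stub());
        return new InlineJoin(left, right, alias);
    }

    public static void main(String[] args) {
        System.out.println(planInlineStats(new Relation("employees"), "SUM(salary)", "emp_no % 2"));
    }
}
```

The point of the sketch is only that the Eval never has to exist on the RHS first and be moved afterwards.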


elasticsearchmachine · 2025-11-19T07:55:15Z

Hi @bpintea, I've created a changelog YAML for you.

elasticsearchmachine · 2025-11-19T10:48:06Z

Pinging @elastic/es-analytical-engine (Team:Analytics)

bpintea · 2025-11-19T10:51:51Z

...k/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/plan/logical/join/StubRelation.java

-    public static List<Attribute> computeOutput(LogicalPlan source, LogicalPlan target) {
-        Set<Attribute> stubRelationOutput = new LinkedHashSet<>(target.output());
-        stubRelationOutput.addAll(source.references().stream().filter(Attribute::synthetic).toList());
+    private static List<Attribute> computeOutput(LogicalPlan destinationPlan, LogicalPlan sourcePlan) {


I've updated the naming here, as I struggled to match how these parameters are used to what they're called (here and above). I can revert it if my understanding and comments are wrong, though.

bpintea · 2025-11-19T10:52:03Z

x-pack/plugin/esql/qa/testFixtures/src/main/resources/inlinestats.csv-spec

These are cosmetic. I stumbled upon them during development, as tests were failing, and realised they're hard to read.

bpintea · 2025-11-19T10:52:22Z

...a/org/elasticsearch/xpack/esql/optimizer/rules/logical/local/PushExpressionsToFieldLoad.java

-
-        addedAttrs.put(key, newFunctionAttr);
-        return newFunctionAttr;
+        return addedAttrs.computeIfAbsent(newFunctionAttr.ignoreId(), k -> newFunctionAttr);


This is unrelated too. I got to it as I was considering adding an IdIgnoringAttributeSet and was inspecting other places where this would help, then noticed this can be made easier to read.
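
For readers unfamiliar with the idiom, here is a small standalone comparison of the two styles. It assumes the removed lines followed an explicit lookup (not shown in the diff excerpt); the class, method and variable names are made up for illustration, with plain strings standing in for attributes.

```java
import java.util.HashMap;
import java.util.Map;

public class ComputeIfAbsentSketch {
    // Verbose style: look up first, insert only on a miss.
    static String internVerbose(Map<String, String> cache, String key, String candidate) {
        String existing = cache.get(key);
        if (existing != null) {
            return existing;
        }
        cache.put(key, candidate);
        return candidate;
    }

    // Compact style: computeIfAbsent performs the lookup and the insert in one call
    // and returns whichever value ends up mapped to the key.
    static String internCompact(Map<String, String> cache, String key, String candidate) {
        return cache.computeIfAbsent(key, k -> candidate);
    }

    public static void main(String[] args) {
        Map<String, String> cache = new HashMap<>();
        System.out.println(internCompact(cache, "k", "first"));
        System.out.println(internCompact(cache, "k", "second"));
    }
}
```

Both methods return the already-cached value on a hit, so the first candidate registered under a key wins.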

bpintea · 2025-11-19T10:55:27Z

...gin/esql/src/test/java/org/elasticsearch/xpack/esql/optimizer/LogicalPlanOptimizerTests.java

I only added one extra test; the existing tests already cover the change.

astefan

I spent two hours on the code in this PR and imho it's making things worse.
PropagateInlineEvals is a "clean" rule that is clear about what its intentions are. Reading and understanding that code is easy, and more people looking at it will understand what it is doing than will understand the new version in this PR.

In these two hours I tried to refactor the code in ReplaceAggregateNestedExpressionWithEval to eliminate the intricate logic that tries to do different things depending on whether the Aggregate is on the RHS of an InlineJoin or is a regular Aggregate; especially the code here makes it much harder to really understand the new logic. The code was crystal clear before; now I cannot understand what it tries to do, even with the comments.

Unless the code in this rule is refactored in a much cleaner way, with all the risks mentioned here (unless there is some other implication) I believe the original version is:

much cleaner for anyone experienced or beginner trying to read and approach this code
is making the InlineJoin implementation much less invasive, something that I think is paramount with InlineJoin.

I am marking my review as "Comment" and will defer to the original author of the issue this PR is addressing - @alex-spies - to decide on the approval or rejection of this PR.

bpintea · 2025-12-15T20:00:01Z

especially the code here makes it much harder to really understand the new logic

FWIW, I've refactored that part a bit, extracting the creation of the Evals into its own method. That should hopefully make the logic a bit clearer. I've also improved the comments, though only marginally.

the original version is:
much cleaner for anyone experienced or beginner trying to read and approach this code

I'm a bit on the fence about this myself, at least when reading the code afresh: one rule corrects what another rule produced incorrectly, but only in some cases (with INLINE STATS); otherwise the output is correct. I, for one, would prefer to deal with the complexity upfront, in one place (where possible).

is making the InlineJoin implementation much less invasive, something that I think is paramount with InlineJoin.

I do agree with this, however.
But I don't know how maintainable this separation is: looking at how many rules (6) and nodes (2) are InlineJoin-aware now, this isolation doesn't look very promising.
(But this might become an issue for functionality like IN (<subquery>), where we might need to generalise this multi-staged execution.)

alex-spies

I like this change, thank you @bpintea .

I agree with @astefan that the resulting rule is a bit complex. Well, what we do is a bit complex :) But with this change, the optimizer doesn't generate a broken intermediate logical plan that needs to be fixed in a subsequent step. That broken intermediate plan makes the current state confusing, and it will become even more confusing if/when someone decides to put another optimizer rule between PropagateInlineEvals and ReplaceAggregateNestedExpressionWithEval in the future.

I think such broken intermediate states should lead to test failures in the future because they make it hard to reason about query plans and optimizer rules if said rules cannot assume a consistent plan as input. So this change is required IMHO. I'd like to add corresponding assertions some time soon that should just fail our tests if they detect inconsistent intermediate query plans (such assertions will be off in prod builds).

I can think of 2 more ways that may make this easier to grok. One is: rather than doing everything in one go, the code from PropagateInlineEvals could just be moved and become a second step inside ReplaceAggregateNestedExpressionWithEval. It'd be the exact same amount of complexity that we have now and there'll still be an intermediate step where the query plan is inconsistent, but it's fully owned and controlled by a single rule and after the rule has run, the query plan is consistent again.

The second way is to try and run ReplaceAggregateNestedExpressionWithEval before substituting the InlineStats. That should automatically move any Eval node required for the agg into the left hand side of the inline join. I don't know how complex this change would be, though, but wanted to mention it for completeness' sake.
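
To make the idea of assertions on intermediate plans a bit more concrete, here is a minimal sketch of one possible check: every attribute a node reads must be produced by one of its children. The `Node` record and `findMissingInputs` are hypothetical and much simpler than real ES|QL plan nodes; the attribute names in `main` are borrowed from the test discussed in this review.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class PlanConsistencySketch {
    /** Toy plan node: the attributes it reads, the attributes it emits, and its children. */
    record Node(String name, List<String> references, List<String> output, List<Node> children) {}

    /** Returns one message per attribute that a node reads but none of its children provides. */
    static List<String> findMissingInputs(Node root) {
        List<String> problems = new ArrayList<>();
        collect(root, problems);
        return problems;
    }

    private static void collect(Node node, List<String> problems) {
        if (node.children().isEmpty() == false) { // leaves read nothing from below
            Set<String> available = new HashSet<>();
            for (Node child : node.children()) {
                available.addAll(child.output());
            }
            for (String ref : node.references()) {
                if (available.contains(ref) == false) {
                    problems.add(node.name() + " is missing " + ref);
                }
            }
        }
        for (Node child : node.children()) {
            collect(child, problems);
        }
    }

    public static void main(String[] args) {
        Node stub = new Node("StubRelation", List.of(), List.of("emp_no", "hire_date", "yr"), List.of());
        Node agg = new Node("Aggregate", List.of("yr", "$$emp_no$converted_to$long"), List.of("c", "yr"), List.of(stub));
        System.out.println(findMissingInputs(agg));
    }
}
```

A check like this could be wrapped in a Java `assert` after each rule application, so that it runs in tests but stays off in production builds.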

alex-spies · 2025-12-16T15:21:49Z

x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/plan/logical/join/InlineJoin.java

-    public static LogicalPlan stubSource(UnaryPlan sourcePlan, LogicalPlan target) {
-        return sourcePlan.replaceChild(new StubRelation(sourcePlan.source(), StubRelation.computeOutput(sourcePlan, target)));
+    public static LogicalPlan stubSource(UnaryPlan destination, LogicalPlan target) {
+        return destination.replaceChild(StubRelation.of(destination, target));


nit: it's a bit confusing that we renamed target to sourcePlan in StubRelation.java but not here.

alex-spies · 2025-12-16T15:23:41Z

x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/plan/logical/join/InlineJoin.java

+     * Replaces the source of the {@code destination} plan with a stub, preserving the output from the {@code target} plan, which
+     * the stub substitutes (or theoretically points to).
     */


The added comments help, but I'm still struggling without an example. Can we add examples to the javadoc?

alex-spies · 2025-12-16T15:30:55Z

...k/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/plan/logical/join/StubRelation.java

-        stubRelationOutput.addAll(source.references().stream().filter(Attribute::synthetic).toList());
+    private static List<Attribute> computeOutput(LogicalPlan destinationPlan, LogicalPlan sourcePlan) {
+        Set<Attribute> stubRelationOutput = new LinkedHashSet<>(sourcePlan.output());
+        stubRelationOutput.addAll(destinationPlan.references().stream().toList());


I think this is a (quite confusing) cover up for another bug.

This should only be needed if the LHS of the inline join fails to output some attributes that are needed for the aggregate node on the RHS of the inline join. But that's a broken state to begin with. Since this PR gets rid of such broken intermediate steps, it'd be great to fix this as well when we can.

I ran the tests with this commented out. This seems to deal with #137187. It should not be needed in principle. Let's mark it with a comment, so we can remove this hack/workaround once #137923 gets merged?

This is from ImplicitCastingMultiTypedDateTruncInlinestats_ByWithEvalWithFilter:

[2025-12-16T19:14:36,703][INFO ][o.e.x.e.o.L.changes ] [test-cluster-0] Rule logical.SubstituteSurrogatePlans applied with change
Limit[10000[INTEGER],false,false] = Limit[10000[INTEGER],false,false]
\_Limit[10[INTEGER],false,false] = \_Limit[10[INTEGER],false,false]
\_OrderBy[[Order[yr{r}#3834,DESC,FIRST], Order[hire_date{f}#3844,DESC,FIRST]]] = \_OrderBy[[Order[yr{r}#3834,DESC,FIRST], Order[hire_date{f}#3844,DESC,FIRST]]]
\_InlineStats[] ! \_InlineJoin[LEFT,[yr{r}#3834],[yr{r}#3834]]
\_Aggregate[[yr{r}#3834],[FilteredExpression[COUNT($$emp_no$converted_to$long{f$}#3845,true[BOOLEAN],PT0S[TIME_DURATION]),h ! |_Eval[[DATETRUNC(P1Y[DATE_PERIOD],hire_date{f}#3844) AS yr#3834]]
ire_date{f}#3844 > 66268800000000 ! | \_EsqlProject[[!emp_no, hire_date{f}#3844]]
0000[DATE_NANOS]] AS c#3837, yr{r}#3834]] ! | \_EsRelation[employees,employees_incompatible][!emp_no, hire_date{f}#3844, $$emp_no$converted_to$l..]
\_Eval[[DATETRUNC(P1Y[DATE_PERIOD],hire_date{f}#3844) AS yr#3834]] ! \_Aggregate[[yr{r}#3834],[FilteredExpression[COUNT($$emp_no$converted_to$long{f$}#3845,true[BOOLEAN],PT0S[TIME_DURATION]),h
\_EsqlProject[[!emp_no, hire_date{f}#3844]] ! ire_date{f}#3844 > 66268800000000
\_EsRelation[employees,employees_incompatible][!emp_no, hire_date{f}#3844, $$emp_no$converted_to$l..] ! 0000[DATE_NANOS]] AS c#3837, yr{r}#3834]]
! \_StubRelation[[!emp_no, hire_date{f}#3844, yr{r}#3834]]

Note that before the rule ran, the EsqlProject after the EsRelation was already faulty; it threw away the synthetic attribute $$emp_no$converted_to$... that's needed in the COUNT.
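
The workaround in the diff above can be pictured with plain strings standing in for attributes. This is a simplified sketch, not the real `StubRelation` code: the stub takes the LHS output and then appends whatever the RHS plan references, so an attribute dropped on the LHS still appears in the stub.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class StubOutputSketch {
    // Strings stand in for attributes; parameter names loosely mirror the diff.
    static List<String> computeOutput(List<String> destinationReferences, List<String> sourceOutput) {
        // LinkedHashSet keeps the source output order and ignores duplicates.
        Set<String> stubOutput = new LinkedHashSet<>(sourceOutput);
        // Anything the destination plan reads is appended, even if the source never emitted it.
        stubOutput.addAll(destinationReferences);
        return new ArrayList<>(stubOutput);
    }

    public static void main(String[] args) {
        List<String> lhsOutput = List.of("emp_no", "hire_date", "yr");
        List<String> aggregateReferences = List.of("yr", "$$emp_no$converted_to$long");
        System.out.println(computeOutput(aggregateReferences, lhsOutput));
    }
}
```

In this toy run the synthetic attribute shows up in the stub output although the LHS does not emit it, which is the kind of masking the comment above objects to.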

alex-spies · 2025-12-16T17:10:20Z

...asticsearch/xpack/esql/optimizer/rules/logical/ReplaceAggregateNestedExpressionWithEval.java

 * {@code EVAL `a + 1` = a + 1, `x % 2` = x % 2 | INLINE STATS SUM(`a+1`_ref) BY `x % 2`_ref}
 */
-public final class ReplaceAggregateNestedExpressionWithEval extends OptimizerRules.OptimizerRule<Aggregate> {
+public final class ReplaceAggregateNestedExpressionWithEval extends Rule<LogicalPlan, LogicalPlan> {


I'd add an example to the javadoc, so it's easier to see what this does with inline stats.

alex-spies · 2025-12-16T17:34:37Z

...asticsearch/xpack/esql/optimizer/rules/logical/ReplaceAggregateNestedExpressionWithEval.java

+        return inlineJoin;
+    }
+
+    private static LogicalPlan rule(Aggregate aggregate, @Nullable Holder<Eval> evalForIJHolder) {


Maybe evalForIJHolder -> evalForInlineJoin? For a bit, I was puzzled about what IJ was supposed to mean.

bpintea added >enhancement :Analytics/ES|QL AKA ESQL v9.3.0 labels Nov 19, 2025

bpintea and others added 2 commits November 19, 2025 08:55

Update docs/changelog/138270.yaml

9f25f2b

[CI] Auto commit changes from spotless

b23ddc6

bpintea requested review from alex-spies and astefan November 19, 2025 10:47

bpintea marked this pull request as ready for review November 19, 2025 10:47

elasticsearchmachine added the Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) label Nov 19, 2025

bpintea mentioned this pull request Nov 27, 2025

[ES|QL] Prevent fused fields from being pushed into StubRelations #138620

Closed

astefan reviewed Dec 4, 2025

bpintea added 4 commits December 15, 2025 16:54

Merge remote-tracking branch 'upstream/main' into enh/drop_PropagateI…

b6fc9ee

…nlineEvals

small refactorings, added more comments

642302a

refactor Evals creating in own method

7e5d0c0

Merge remote-tracking branch 'upstream/main' into enh/drop_PropagateI…

8df9a0a

…nlineEvals

alex-spies approved these changes Dec 16, 2025

elasticsearchmachine added v9.4.0 and removed v9.3.0 labels Dec 17, 2025


4 participants
