Skip to content

ESQL: Drop PropagateInlineEvals optimizer rule#138270

Open
bpintea wants to merge 7 commits intoelastic:mainfrom
bpintea:enh/drop_PropagateInlineEvals
Open

ESQL: Drop PropagateInlineEvals optimizer rule#138270
bpintea wants to merge 7 commits intoelastic:mainfrom
bpintea:enh/drop_PropagateInlineEvals

Conversation

@bpintea
Copy link
Contributor

@bpintea bpintea commented Nov 19, 2025

This drops the PropagateInlineEvals rule that moves an Eval (or part of) it from the RHS of an InlineJoin to the LHS of it. Namely, the evaluation of the groups.

This can be done directly in the
ReplaceAggregateNestedExpressionWithEval rule, that creates these evaluations in the first place. This rule is now InlineJoins aware.

Closes #124754

This drops the PropagateInlineEvals rule that moves an Eval (or part of)
it from the RHS of an InlineJoin to the LHS of it. Namely, the
evaluation of the groups.

This can be done directly in the
ReplaceAggregateNestedExpressionWithEval rule, that creates these
evaluations in the first place. This rule is now InlineJoins aware.
@elasticsearchmachine
Copy link
Collaborator

Hi @bpintea, I've created a changelog YAML for you.

@bpintea bpintea marked this pull request as ready for review November 19, 2025 10:47
@elasticsearchmachine elasticsearchmachine added the Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) label Nov 19, 2025
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-analytical-engine (Team:Analytics)

public static List<Attribute> computeOutput(LogicalPlan source, LogicalPlan target) {
Set<Attribute> stubRelationOutput = new LinkedHashSet<>(target.output());
stubRelationOutput.addAll(source.references().stream().filter(Attribute::synthetic).toList());
private static List<Attribute> computeOutput(LogicalPlan destinationPlan, LogicalPlan sourcePlan) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've updated the naming here as I struggled myself to match how they're used to what they're called (here and above). Can revert it my understanding and comments are wrong, though.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are cosmetic. Stumbled upon them during development, as tests were failing and realised they're hard to read.


addedAttrs.put(key, newFunctionAttr);
return newFunctionAttr;
return addedAttrs.computeIfAbsent(newFunctionAttr.ignoreId(), k -> newFunctionAttr);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is unrelated too. I got to it as I was considering adding an IdIgnoringAttributeSet and was inspecting other places where this would help, then noticed this can be made easier to read.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only added one extra test, existing tests do test the change already.

Copy link
Contributor

@astefan astefan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I spent two hours on the code in this PR and imho it's making things worse.
PropagateInlineEvals is a "clean" rule that is clear on what its intention are. Reading and understanding that code is easy and more people looking at it will understand what is it doing than looking at the new version in this PR.

In these two hours I tried to refactor the code in ReplaceAggregateNestedExpressionWithEval to eliminate the intricate logic that tries to do different things depending on the Aggregate being on the RHS of an InlineJoin or a regular Aggregate and, especially the code here makes it much harder to really understand the new logic. How this code was before was crystal clear, now I cannot understand what it tries to do even with comments.

Unless the code in this rule is refactored in a much cleaner way, with all the risks mentioned here (unless there is some other implication) I believe the original version is:

  • much cleaner for anyone experienced or beginner trying to read and approach this code
  • is making the InlineJoin implementation much less invasive, something that I think is paramount with InlineJoin.

I am making my review as "Comment" and will defer to the original author of the issue this PR is addressing - @alex-spies - to veto on the approval or rejection of this PR.

@bpintea
Copy link
Contributor Author

bpintea commented Dec 15, 2025

especially the code here makes it much harder to really understand the new logic

FWIW, I've refactor a bit that part, extracting the Evals creating in own method. Should hopefully improve a bit the logic. I've also improved a bit the comments, though marginally.

the original version is:
much cleaner for anyone experienced or beginner trying to read and approach this code

I'm myself a bit on the fence a bit about this, at least when freshly reading the code: one rule corrects what other rule produced, which incorrectly, but only in some cases (with INLINE STATS); otherwise it's correct. I, for one, would prefer to deal with the complexity upfront, in one place (where possible).

is making the InlineJoin implementation much less invasive, something that I think is paramount with InlineJoin.

I do agree with this, however.
But don't know how maintainable is this separation: looking at how many rules are InlineJoin aware now (6), and nodes (2), this isolation doesn't look very promising.
(But this might become an issue for functionality like IN (<subquery>), where we might need to generalise this multi-staged execution.)

Copy link
Contributor

@alex-spies alex-spies left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this change, thank you @bpintea .

I agree with @astefan that the resulting rule is a bit complex. Well, what we do is a bit complex :) But with this change, the optimizer doesn't generate a broken intermediate logical plan that needs to be fixed in a subsequent step. This makes the current state confusing and it will become even more confusing if/when someone decides to put another optimizer rule between PropagateInlineEvals and ReplaceAggregateNestedExpressionWithEval in the future.

I think such broken intermediate states should lead to test failures in the future because they make it hard to reason about query plans and optimizer rules if said rules cannot assume a consistent plan as input. So this change is required IMHO. I'd like to add corresponding assertions some time soon that should just fail our tests if they detect inconsistent intermediate query plans (such assertions will be off in prod builds).

I can think of 2 more ways that may make this easier to grok. One is: rather than doing everything in one go, the code from PropagateInlineEvals could just be moved and become a second step inside ReplaceAggregateNestedExpressionWithEval. It'd be the exact same amount of complexity that we have now and there'll still be an intermediate step where the query plan is inconsistent, but it's fully owned and controlled by a single rule and after the rule has run, the query plan is consistent again.

The second way is to try and run ReplaceAggregateNestedExpressionWithEval before substituting the InlineStats. That should automatically move any Eval node required for the agg into the left hand side of the inline join. I don't know how complex this change would be, though, but wanted to mention it for completeness' sake.

public static LogicalPlan stubSource(UnaryPlan sourcePlan, LogicalPlan target) {
return sourcePlan.replaceChild(new StubRelation(sourcePlan.source(), StubRelation.computeOutput(sourcePlan, target)));
public static LogicalPlan stubSource(UnaryPlan destination, LogicalPlan target) {
return destination.replaceChild(StubRelation.of(destination, target));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: it's a bit confusing that we renamed target to sourcePlan in SubRelation.java but not here.

Comment on lines +59 to 61
* Replaces the source of the {@code destination} plan with a stub, preserving the output from the {@code target} plan, which
* the stub substitutes (or theoretically points to).
*/
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The added comments help, but I'm still struggling without an example. Can we add examples to the javadoc?

stubRelationOutput.addAll(source.references().stream().filter(Attribute::synthetic).toList());
private static List<Attribute> computeOutput(LogicalPlan destinationPlan, LogicalPlan sourcePlan) {
Set<Attribute> stubRelationOutput = new LinkedHashSet<>(sourcePlan.output());
stubRelationOutput.addAll(destinationPlan.references().stream().toList());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a (quite confusing) cover up for another bug.

This should only be needed if the LHS of the inline join fails to output some attributes that are needed for the aggregate node on the RHS of the inline join. But that's a broken state to begin with. Since this PR gets rid of such broken intermediate steps, it'd be great to fix this as well when we can.

I ran tests with this commented out. This seems to deal with #137187. It should not be needed in principle. Let's mark it via comment, so we can remove this hack/workaround once #137923 gets merged?

This is from ImplicitCastingMultiTypedDateTruncInlinestats_ByWithEvalWithFilter:

[2025-12-16T19:14:36,703][INFO ][o.e.x.e.o.L.changes      ] [test-cluster-0] Rule logical.SubstituteSurrogatePlans applied with change
Limit[10000[INTEGER],false,false]                                                                                                 = Limit[10000[INTEGER],false,false]
\_Limit[10[INTEGER],false,false]                                                                                                  = \_Limit[10[INTEGER],false,false]
  \_OrderBy[[Order[yr{r}#3834,DESC,FIRST], Order[hire_date{f}#3844,DESC,FIRST]]]                                                  =   \_OrderBy[[Order[yr{r}#3834,DESC,FIRST], Order[hire_date{f}#3844,DESC,FIRST]]]
    \_InlineStats[]                                                                                                               !     \_InlineJoin[LEFT,[yr{r}#3834],[yr{r}#3834]]
      \_Aggregate[[yr{r}#3834],[FilteredExpression[COUNT($$emp_no$converted_to$long{f$}#3845,true[BOOLEAN],PT0S[TIME_DURATION]),h !       |_Eval[[DATETRUNC(P1Y[DATE_PERIOD],hire_date{f}#3844) AS yr#3834]]
ire_date{f}#3844 > 66268800000000                                                                                                 !       | \_EsqlProject[[!emp_no, hire_date{f}#3844]]
0000[DATE_NANOS]] AS c#3837, yr{r}#3834]]                                                                                         !       |   \_EsRelation[employees,employees_incompatible][!emp_no, hire_date{f}#3844, $$emp_no$converted_to$l..]
        \_Eval[[DATETRUNC(P1Y[DATE_PERIOD],hire_date{f}#3844) AS yr#3834]]                                                        !       \_Aggregate[[yr{r}#3834],[FilteredExpression[COUNT($$emp_no$converted_to$long{f$}#3845,true[BOOLEAN],PT0S[TIME_DURATION]),h
          \_EsqlProject[[!emp_no, hire_date{f}#3844]]                                                                             ! ire_date{f}#3844 > 66268800000000
            \_EsRelation[employees,employees_incompatible][!emp_no, hire_date{f}#3844, $$emp_no$converted_to$l..]                 ! 0000[DATE_NANOS]] AS c#3837, yr{r}#3834]]
                                                                                                                                  !         \_StubRelation[[!emp_no, hire_date{f}#3844, yr{r}#3834]]

Note that before the rule ran, the EsqlProject after the EsRelation was already faulty; it threw away the synthetic attribute $$emp_no$converted_to$... that's needed in the COUNT.

* {@code EVAL `a + 1` = a + 1, `x % 2` = x % 2 | INLINE STATS SUM(`a+1`_ref) BY `x % 2`_ref}
*/
public final class ReplaceAggregateNestedExpressionWithEval extends OptimizerRules.OptimizerRule<Aggregate> {
public final class ReplaceAggregateNestedExpressionWithEval extends Rule<LogicalPlan, LogicalPlan> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd add an example to the javadoc, so it's easier to expect what this does with inline stats.

return inlineJoin;
}

private static LogicalPlan rule(Aggregate aggregate, @Nullable Holder<Eval> evalForIJHolder) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe evalForIJHolder -> evalForInlineJoin? I was puzzled what IJ was supposed to be for a bit.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

:Analytics/ES|QL AKA ESQL >enhancement Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) v9.4.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.

ESQL: Simplify PropagateInlineEvals

4 participants