ES|QL: Improve aggregation over constants handling#112392

Open
astefan wants to merge 26 commits into elastic:main from astefan:aggregations_over_constants

Conversation

@astefan
Contributor

@astefan astefan commented Aug 30, 2024

This change consists of:

  • Add separate rule for dealing with nulls in aggregations
  • Duplicate SubstituteSurrogate in "Operator Optimization" batch
  • Many more tests
  • Add test for median_absolute_deviation function
  • Add mv handling to top function
  • Allow the PropagateEvalFoldables rule to also handle aggregate functions

Addresses part of #100634. Missing bits:

Fixes #110257
Fixes #104430
Fixes #100170

Needs more tests for the new rule and the existing ones in LogicalPlanOptimizerTests.

Duplicate SubstituteSurrogate in "Operator Optimization" batch
Many more tests
Add tests for mad
Add mv handling to top function
null |null |null
;

########### failing :-( with InvalidArgumentException: Does not support yet aggregations over constants
Contributor Author

This will be removed once we have an mv_ function for st_centroid_agg.

@@ -197,11 +202,18 @@ public AggregatorFunctionSupplier supplier(List<Integer> inputChannels) {
public Expression surrogate() {
Contributor Author

The surrogate method, as it stands now, is more a "surrogate-expression-for-foldable-scenario" kind of method. This implies that the behavior that existed below before this change is not possible anymore.

Member

Right - we should probably look into introducing a different interface altogether: surrogate was initially used for expressions that knew they'd be transformed.
But it has evolved into a mechanism for "folding", not to a value but to another expression (which itself might or might not be foldable).

Contributor Author

I think I'll leave this one for a follow-up.

* All aggregate functions that are also nullable (COUNT_DISTINCT and COUNT are exceptions), will get a NULL
* field replacement by the FoldNull rule, COUNT_DISTINCT will benefit from PropagateEvalFoldables.
*/
public final class ReplaceAggregatesWithNull extends OptimizerRules.OptimizerRule<Aggregate> {
Contributor Author

This rule is a simplified variant of SubstituteSurrogates.

Map<AggregateFunction, Attribute> aggFuncToAttr = new HashMap<>(); // existing aggregate and their respective attributes
List<Alias> transientEval = new ArrayList<>(); // surrogate functions eval
boolean changed = false;
boolean hasSurrogates = false;
Contributor Author

I've done this to short-circuit the execution earlier.

assertEquals(countd, rule.rule(countd));
countd = new CountDistinct(EMPTY, NULL, NULL);
assertEquals(new Literal(EMPTY, null, LONG), rule.rule(countd));
assertEquals(countd, rule.rule(countd));
Contributor Author

This is the consequence of CountDistinct not being nullable anymore.
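
To make the link between nullability and null folding concrete, here is a minimal sketch. It is a toy model, not the Elasticsearch classes: `Expr`, `Literal`, `Call` and `foldNull` are hypothetical names, and the rule shown is an assumption about the general shape of such a rule (fold to NULL only when the function declares itself nullable and one of its arguments is the NULL literal).

```java
import java.util.List;

// Toy model, not Elasticsearch code: every type and method here is hypothetical.
final class FoldNullSketch {
    enum Nullability { TRUE, FALSE }

    interface Expr {
        Nullability nullable();
        List<Expr> children();
    }

    // A constant; value == null stands for the NULL literal.
    record Literal(Object value) implements Expr {
        public Nullability nullable() {
            return value == null ? Nullability.TRUE : Nullability.FALSE;
        }

        public List<Expr> children() {
            return List.of();
        }
    }

    // A function call whose nullability is declared by the function itself.
    record Call(String name, Nullability nullable, List<Expr> children) implements Expr {}

    static boolean isNullLiteral(Expr e) {
        return e instanceof Literal l && l.value() == null;
    }

    // Fold to NULL only when the call is nullable and has a NULL argument.
    static Expr foldNull(Expr e) {
        if (e.nullable() == Nullability.TRUE && e.children().stream().anyMatch(FoldNullSketch::isNullLiteral)) {
            return new Literal(null);
        }
        return e;
    }
}
```

In this model a call declared with `Nullability.FALSE` comes back unchanged even when its argument is NULL, which is the behavior the updated assertion above expects for CountDistinct.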

Member

@costin costin left a comment

LGTM - great tests and comments!


@Override
public Nullability nullable() {
return Nullability.FALSE;
Member

👍


@Override
public Expression surrogate() {
return field().foldable() ? field() : null;
Contributor

Values not only merges values, but also removes duplicates (If no test was triggered because of this, we should add some!)

ROW x = [1, 1, 2] | STATS a = VALUES(x)

-> [1, 2]
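
A small sketch of why returning the foldable field unchanged differs from what the ROW example expects. This is a toy model with hypothetical names (`fieldAsIs`, `deduplicated`), not the actual Values implementation; it only contrasts the two results for a constant multivalue.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;

// Toy model, not Elasticsearch code: it only contrasts the two results.
final class ValuesFoldSketch {
    // Result of a surrogate that hands back the foldable field unchanged.
    static List<Integer> fieldAsIs(List<Integer> constant) {
        return constant;
    }

    // Result with duplicates removed, as in the ROW example above.
    // Insertion order is an artifact of LinkedHashSet, not a claim about VALUES.
    static List<Integer> deduplicated(List<Integer> constant) {
        return new ArrayList<>(new LinkedHashSet<>(constant));
    }
}
```

For `[1, 1, 2]`, `fieldAsIs` keeps the duplicate while `deduplicated` drops it. One possible direction, stated here as an assumption rather than as what this PR does, would be for the surrogate to wrap the field in a deduplicating multivalue function instead of returning it as-is.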

Contributor Author

Good catch, BUT I think we have a problem with the documentation: it doesn't mention this aspect. There were other omissions in our function docs (which are fixed in this PR), so I think we need to review our function documentation and double-check its correctness and completeness. I will create an issue.

Contributor Author

@ivancea thank you for proactively checking this PR 🙏, that was very helpful.
I've created two issues:

Contributor

@alex-spies alex-spies left a comment

This is great, @astefan! I think this change is sound, and I added mostly minor remarks.

My only major remark is: I think we need LogicalPlanOptimizerTests cases that prove that the foldable propagation actually takes place. The csv tests are great, but they only show that the result is correct, not that the propagation happened.

But you already mentioned more optimizer tests as one of the tasks to un-draft :)

Contributor

Out of scope: stats.csv-spec has a bunch of ...OfConst tests that overlap a lot with the tests here, except that they normally start with from employees. Because these tests exercise stats more than row, maybe we should move test cases like row ... | stats ... from here to stats.csv-spec in a follow-up PR.

} else if (p instanceof Aggregate agg) {
List<NamedExpression> newAggs = new ArrayList<>(agg.aggregates().size());
agg.aggregates().forEach(e -> {
if (Alias.unwrap(e) instanceof AggregateFunction) {
Contributor

This looks like it cannot propagate into the groups, as in

... | eval x = [1,2,3] | stats sum(field) by x

right? Maybe it's worth adding a comment.

That's another thing we could optimize though if needed, as I think STATS ... BY const is the same as STATS ... | eval x = mv_values(const) | mv_expand x. Not sure that's worth maintaining an optimization rule for, though.

Contributor Author

Yeah, it's not covered. That was unintentional; I just didn't think about this use case.
Will leave it for a follow up, though. There are many things going on in this PR.
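
For readers following the thread, the gap can be illustrated with a toy model. Everything below is hypothetical (`AggCall`, `Stats`, `propagate`, and strings standing in for expressions); it is not the PropagateEvalFoldables code, only a sketch of the behavior described: constants from an earlier eval are substituted into aggregate function arguments, while groupings are left as they were.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Toy model, not Elasticsearch code: strings stand in for expressions.
final class PropagateIntoAggsSketch {
    // Hypothetical stand-in for an aggregate call such as sum(x).
    record AggCall(String function, String argument) {}

    // Hypothetical stand-in for STATS <aggregates> BY <groupings>.
    record Stats(List<AggCall> aggregates, List<String> groupings) {}

    // Replace aggregate arguments that name a constant defined by an earlier EVAL.
    // Groupings are copied untouched, which mirrors the gap discussed above.
    static Stats propagate(Stats stats, Map<String, String> evalConstants) {
        List<AggCall> newAggs = new ArrayList<>(stats.aggregates().size());
        for (AggCall agg : stats.aggregates()) {
            String folded = evalConstants.getOrDefault(agg.argument(), agg.argument());
            newAggs.add(new AggCall(agg.function(), folded));
        }
        return new Stats(newAggs, stats.groupings());
    }
}
```

With `x` defined as `[1, 2, 3]`, a `sum(x)` argument is replaced by the constant, but a `BY x` grouping still refers to `x`.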

Comment on lines 91 to 102
} else {
// All aggs actually have been optimized away
// \_Aggregate[[],[AVG([NULL][NULL]) AS s]]
// Replace by a local relation with one row, followed by an eval, e.g.
// \_Eval[[MVAVG([NULL][NULL]) AS s]]
// \_LocalRelation[[{e}#21],[ConstantNullBlock[positions=1]]]
plan = new LocalRelation(
source,
List.of(new EmptyAttribute(source)),
LocalSupplier.of(new Block[] { BlockUtils.constantBlock(PlannerUtils.NON_BREAKING_BLOCK_FACTORY, null, 1) })
);
}
Contributor

The code in lines 86-106 also happens in SubstituteSurrogates, and kinda-sorta also in ReplaceStatsAggExpressionWithEval. I opened #110345 but maybe, instead of reducing the number of opt. rules, we should just refactor the code path that moves expressions out of aggregates and into evals. We could start here and make sure the code is the same as in SubstituteSurrogates.
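
As a reference point for such a refactoring, the "all aggs optimized away" branch can be sketched in isolation. This is a toy model under stated assumptions: `Plan`, `Aggregate`, `LocalRows`, `Eval` and `rewrite` are hypothetical stand-ins, and the `MV_` prefix is only a placeholder string for whatever multivalue surrogate the real rule picks. It relies on the idea in the code comment above that an ungrouped aggregate yields one row, so a one-row relation plus an eval can replace it.

```java
import java.util.List;

// Toy model, not Elasticsearch code: every type here is hypothetical.
final class AllAggsFoldedSketch {
    interface Plan {}

    // One aggregate output: name, function, and whether its input is a constant.
    record Agg(String name, String function, boolean foldableInput) {}

    record Aggregate(Plan child, List<String> groupings, List<Agg> aggs) implements Plan {}

    record Source(String index) implements Plan {}

    // A relation that yields exactly `rows` rows and carries no real columns.
    record LocalRows(int rows) implements Plan {}

    // Row-wise evaluation of the listed expressions on top of `child`.
    record Eval(Plan child, List<String> expressions) implements Plan {}

    static Plan rewrite(Aggregate agg) {
        boolean allFoldable = agg.aggs().stream().allMatch(Agg::foldableInput);
        if (agg.groupings().isEmpty() == false || allFoldable == false) {
            return agg; // something still needs a real aggregation
        }
        // Placeholder naming: "MV_" + function is illustrative, not a real mapping.
        List<String> evals = agg.aggs().stream()
            .map(a -> "MV_" + a.function() + "(const) AS " + a.name())
            .toList();
        return new Eval(new LocalRows(1), evals);
    }
}
```

Keeping this step as one small function is what would let SubstituteSurrogates and the new rule share it instead of each carrying its own copy.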

@astefan astefan marked this pull request as ready for review October 3, 2024 09:53
@elasticsearchmachine elasticsearchmachine added the needs:triage Requires assignment of a team area label label Oct 3, 2024
@astefan astefan added >bug auto-backport-and-merge and removed needs:triage Requires assignment of a team area label labels Oct 3, 2024
@alex-spies alex-spies requested review from alex-spies and removed request for alex-spies November 8, 2024 14:13
Contributor

@alex-spies alex-spies left a comment

Heya, just wanted to pick this up again and summarize what I think we need to do:

  • Per the discussion with Costin, percentile(field, null) is still not well defined (null vs invalid query) - okay to hash this out in a follow-up but maybe invalidating for now is safer w.r.t. bwc.
  • ST_CENTROID_AGG(null) should return null instead of Point(NaN NaN).
  • Some additional test cases won't hurt.
  • Fixing this edge case where all agg functions are optimized away

Consider this unblocked from my side as my main reason for requesting changes was the discussion about percentile(field, null) and similar cases. We could solve some problems in follow-up PRs as well, as I think the general approach here works :)

Contributor

We don't have to hash this out now, but maybe it'd be safer to start with a validation exception now - we can still go back and return null later, while the other way around could be considered a breaking change, albeit in a very edge case scenario.

@felixbarny
Member

Is this PR superseded by #139797 or is it something we're planning to do additionally?

@astefan
Contributor Author

astefan commented Jan 16, 2026

Is this PR superseded by #139797 or is it something we're planning to do additionally?

I haven't checked the other PR; I'm only speaking about this one, which I created and explored some time ago.
If that other PR uses the same tests that this PR has in it (which I believe is the better part of the PR) and they all pass, I think it would be ok to close this PR and consider it superseded by #139797.

@ivancea
Contributor

ivancea commented Jan 16, 2026

I'll take a look in case the other PR missed something we should move 👀

Labels

:Analytics/ES|QL (AKA ESQL)
auto-backport (Automatically create backport pull requests when merged)
>bug
Team:Analytics (Meta label for analytical engine team (ESQL/Aggs/Geo))
v8.19.0
v9.4.0

9 participants