Add support for top-level arithmetic ops to TS|STATS by felixbarny · Pull Request #140135 · elastic/elasticsearch

felixbarny · 2026-01-02T18:57:39Z

This is what's happening at a high level:

TranslateTimeSeriesAggregate now handles not only AggregateFunctions but all Functions, including BinaryScalarFunctions.
Going into TranslateTimeSeriesAggregate, the aggregates are no longer split up into evals. The TranslateTimeSeriesAggregate rule now runs earlier in the optimizer (before ReplaceAggregateNestedExpressionWithEval and friends).
- This enables adding all TimeSeriesAggregateFunctions to the first aggregation phase, without some TimeSeriesAggregateFunctions being placed in nested Evals.
- It also ensures we can properly insert the default last_over_time function for expressions like foo + 1 or max(foo + 1), before the inner foo + 1 is extracted into an eval.
Nested expressions in the groupings of the TimeSeriesAggregate are still replaced with an eval to make time bucket handling easier.
Extracts the injection of the default last_over_time function out of TranslateTimeSeriesAggregate and into the analysis phase, so that InsertFromAggregateMetricDouble runs after the insertion of last_over_time. If the injection ran later, the last_over_time function couldn't be resolved for downsampling indices where metrics are of type aggregate_metric_double. The injection needs to run after field resolution because the default inner agg is type-dependent: we have a different strategy for histograms.
TimeSeriesGroupByAll has been moved from the initialize to the resolution phase of the analyzer, after InsertDefaultInnerTimeSeriesAggregate, so that it can take that into account. That also fixes a missing reference issue in the nested eval for queries like network.total_bytes_in * 8.
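
To make the default last_over_time insertion described above concrete, here is a minimal, self-contained sketch on a toy expression tree. It is not the code from this PR: the types `Expr`, `Metric`, `Literal`, `Binary`, `Call`, the class `DefaultInnerAggSketch`, and the `OVER_TIME` set are all hypothetical names made up for illustration, and the real rule is type-dependent (for example, histograms are handled differently), which this sketch ignores.

```java
import java.util.Set;

// Toy expression tree. These types are illustrative only, not the ES|QL classes.
interface Expr {}
record Metric(String name) implements Expr {}
record Literal(long value) implements Expr {}
record Binary(String op, Expr left, Expr right) implements Expr {}
record Call(String function, Expr arg) implements Expr {}

final class DefaultInnerAggSketch {
    // Assumption: a hand-picked set of per-time-series functions for this example.
    static final Set<String> OVER_TIME = Set.of("last_over_time", "avg_over_time", "rate");

    // Wraps every bare metric that is not already inside a per-time-series
    // function in last_over_time(...). Works on the whole tree, so it sees
    // foo inside foo + 1 or max(foo + 1) before anything is extracted into an eval.
    static Expr wrap(Expr e, boolean insideOverTime) {
        if (e instanceof Metric m) {
            return insideOverTime ? m : new Call("last_over_time", m);
        }
        if (e instanceof Binary b) {
            return new Binary(b.op(), wrap(b.left(), insideOverTime), wrap(b.right(), insideOverTime));
        }
        if (e instanceof Call c) {
            boolean inner = insideOverTime || OVER_TIME.contains(c.function());
            return new Call(c.function(), wrap(c.arg(), inner));
        }
        return e; // literals stay as they are
    }

    // Renders the tree as text; it does not add parentheses for nested binary operators.
    static String render(Expr e) {
        if (e instanceof Metric m) return m.name();
        if (e instanceof Literal l) return Long.toString(l.value());
        if (e instanceof Binary b) return render(b.left()) + " " + b.op() + " " + render(b.right());
        if (e instanceof Call c) return c.function() + "(" + render(c.arg()) + ")";
        throw new IllegalArgumentException("unknown expression: " + e);
    }
}
```

In this model, `foo + 1` renders as `last_over_time(foo) + 1` and `max(foo + 1)` as `max(last_over_time(foo) + 1)`, while a metric that already sits inside a function from the assumed `OVER_TIME` set is left alone.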

Queries that are supported now but weren't before:

Bare metric (with group-by-all)
- TS k8s | STATS network.cost
Group by all now supports post processing
- TS k8s | STATS network.cost | SORT network.cost
- Previously, a bug caused a missing-references error here because the id of the alias changed
Top-level arithmetic operations between metric and scalar
- TS k8s | STATS 10 + max(10 + network.total_bytes_in)
- Also supports implicit last_over_time and group-by-all
- TS k8s | STATS network.total_bytes_in * 8
Top-level arithmetic operations between metric and metric
- Also supports implicit last_over_time and group-by-all
- TS k8s | STATS in_n_out=network.eth0.rx + network.eth0.tx
- TS k8s | STATS max(last_over_time(network.eth0.tx::double) / (last_over_time(network.eth0.tx::double) + last_over_time(network.eth0.rx::double)))
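
As a rough illustration of what "adding all TimeSeriesAggregateFunctions to the first aggregation phase" could mean for the last query above, the sketch below splits a toy expression into per-time-series functions (first phase) and an outer expression that refers to their results. This is a simplified model under stated assumptions, not the plan that TranslateTimeSeriesAggregate actually produces: `Node`, `Field`, `Ref`, `Fn`, `BinOp`, and `TwoPhaseSketch` are hypothetical names, and treating every function whose name ends in `_over_time` as a first-phase function is an assumption made only for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Toy plan model. None of these types are the real ES|QL classes.
interface Node {}
record Field(String name) implements Node {}
record Ref(int index) implements Node {}            // refers to a first-phase output
record Fn(String name, Node arg) implements Node {}
record BinOp(String op, Node left, Node right) implements Node {}

final class TwoPhaseSketch {
    // Assumption: any function ending in "_over_time" is computed per time series.
    static boolean isPerSeries(Node n) {
        return n instanceof Fn f && f.name().endsWith("_over_time");
    }

    // Replaces each per-series function with a Ref and records the function
    // in firstPhase. Identical functions are stored only once.
    static Node split(Node n, List<Node> firstPhase) {
        if (isPerSeries(n)) {
            int idx = firstPhase.indexOf(n); // records compare structurally
            if (idx < 0) {
                firstPhase.add(n);
                idx = firstPhase.size() - 1;
            }
            return new Ref(idx);
        }
        if (n instanceof Fn f) return new Fn(f.name(), split(f.arg(), firstPhase));
        if (n instanceof BinOp b) {
            return new BinOp(b.op(), split(b.left(), firstPhase), split(b.right(), firstPhase));
        }
        return n;
    }

    static String render(Node n) {
        if (n instanceof Field f) return f.name();
        if (n instanceof Ref r) return "$" + r.index();
        if (n instanceof Fn f) return f.name() + "(" + render(f.arg()) + ")";
        if (n instanceof BinOp b) return "(" + render(b.left()) + " " + b.op() + " " + render(b.right()) + ")";
        throw new IllegalArgumentException("unknown node: " + n);
    }

    public static void main(String[] args) {
        Node tx = new Fn("last_over_time", new Field("tx"));
        Node rx = new Fn("last_over_time", new Field("rx"));
        Node query = new Fn("max", new BinOp("/", tx, new BinOp("+", tx, rx)));
        List<Node> firstPhase = new ArrayList<>();
        Node secondPhase = split(query, firstPhase);
        firstPhase.forEach(n -> System.out.println("first phase:  " + render(n)));
        System.out.println("second phase: " + render(secondPhase));
    }
}
```

The point of the sketch is that both operands of the division end up in the same first-phase list, so no per-time-series function is left stranded inside a separate eval.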

closes #139570, #138702, #139580

Child PRs

PromQL support will be added in a follow-up:

PromQL: support for top-level binary operators #140541

pabloem · 2026-01-04T17:40:52Z

this is awesome. Thanks for tackling it @felixbarny

...c/main/java/org/elasticsearch/xpack/esql/analysis/InsertDefaultInnerTimeSeriesAggregate.java

.../java/org/elasticsearch/xpack/esql/optimizer/rules/logical/TranslateTimeSeriesAggregate.java

dnhatn · 2026-01-05T22:37:52Z

@felixbarny Thank you for tackling this. I've spent quite some time on this. I think TranslateTimeSeriesAggregate should be a rule in the Analyzer, not an optimizer rule. As you found, it should execute before we substitute expressions around aggregations. However, I think the approach you proposed can be fragile, since we might scatter the substitutions for aggregations and groupings. I played with these rules and I think we can move TranslateTimeSeriesAggregate to the Analyzer. Are you okay to continue working on this issue? Otherwise, we can share the work.

felixbarny · 2026-01-06T18:21:51Z

I think TranslateTimeSeriesAggregate should be a rule in the Analyzer, not an optimizer rule.

I guess this implies that TranslatePromqlToTimeSeriesAggregate will also need to be executed during analysis so that it can run before TranslateTimeSeriesAggregate.

When I asked @costin why we don't run the PromQL translation during analysis, this was his response:

Preservation of Query Integrity: The optimization process must assume a valid input query and should not modify the abstract syntax tree (AST). This is critical to ensure that any validation failures are reported directly and accurately to the user, without obfuscation from query transformation.

Node Self-Sufficiency and Output Alignment: Each node within the query structure must fully and explicitly describe its own output. The existing discrepancy between the output reported by the PromQL tree and its actual translation needs to be resolved. This resolution should be handled during the tree assembly phase.

However, I think the approach you proposed can be fragile, since we might scatter the substitutions for aggregations and groupings.

Fair. Re-using ReplaceAggregateNestedExpressionWithEval but just for groupings was an attempt to simplify and reduce the surface area of the change. But I can try to do without it. I guess it'll still imply creating a nested eval for the time bucket so that both the first and the second pass groupings can re-use it. This will be very similar to what ReplaceAggregateNestedExpressionWithEval is doing, but isolated to just time bucket groupings. Is that what you had in mind?

What do you think of phasing the change, so that in the first step the PromQL and time series aggregate translation still happens in the optimization phase, but runs early and doesn't use ReplaceAggregateNestedExpressionWithEval internally? We can then follow up and separately discuss whether to move it to the analysis phase, which should then also be a simpler, incremental change. I'm happy to keep working on it.

dnhatn · 2026-01-06T18:34:05Z

What do you think of phasing the change, so that in the first step the PromQL and time series aggregate translation still happens in the optimization phase, but runs early and doesn't use ReplaceAggregateNestedExpressionWithEval internally? We can then follow up and separately discuss whether to move it to the analysis phase, which should then also be a simpler, incremental change. I'm happy to keep working on it.

++ Moving the PromQL and translate-time-series rules to the beginning of the logical optimizer rules is a good step toward moving them to the analysis phase.


elasticsearchmachine · 2026-01-12T09:08:04Z

Hi @felixbarny, I've created a changelog YAML for you.

elasticsearchmachine · 2026-01-12T09:08:05Z

Pinging @elastic/es-storage-engine (Team:StorageEngine)

kkrik-es

I like the direction, and the logic seems a bit cleaner. Will leave it to Nhat to approve.

sidosera

This looks great! I also love that we move PromQL translation earlier in the chain.

I'm happy to accept to unblock, but would still love to hear Nhat's take when they get a chance.


dnhatn

One question, but this looks great. Thanks Felix for all the iterations.

dnhatn · 2026-01-12T19:15:16Z

...k/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/optimizer/LogicalPlanOptimizer.java

-            new PruneUnusedIndexMode(),
-            // after translating metric aggregates, we need to replace surrogate substitutions and nested expressions again.
+            // re-executing the next two rules is a relic of when time series aggregates were translated after surrogate substitution
+            // removing this would fail in ccs scenarios where the remote cluster is on an older version (caught by bwc tests)


I don't think we need to call SubstituteSurrogateAggregations twice consecutively.

This is how I had it initially but omitting it caused bwc tests to fail. Something related to class cast exceptions of different block instances.
Maybe we can look for a solution after merging this PR?

That works.

Examples of queries that are supported now:
* `network.bytes_in * 8`
* `network.eth0.rx + network.eth0.tx`
* `max(network.total_bytes_in) * 8`
* `network.total_bytes_in{cluster!="prod"} / network.total_bytes_in{cluster!="staging"}`

Follow-up from elastic#140135


felixbarny and others added 7 commits January 2, 2026 18:36

Improve wrapping of bare metrics in default over time function

ddc3039

[CI] Auto commit changes from spotless

b1f958a

Apply default inner agg during analysis, before amd handling

cfa98e5

Update test assertion

a1a7a3d

Apply spotless suggestions

cf6c949

Resolve all references in PromqlCommand

67203d5

Add support for top-level arithmetic ops to TS|STATS and PromQL

82d309f

elasticsearchmachine added the external-contributor (Pull request authored by a developer outside the Elasticsearch team) and v9.4.0 labels Jan 2, 2026

[CI] Auto commit changes from spotless

13d27a4

felixbarny linked an issue Jan 2, 2026 that may be closed by this pull request

ES|QL: Better validation for last_over_time #139580

Closed

dnhatn self-requested a review January 2, 2026 19:37

astefan reviewed Jan 5, 2026


astefan requested a review from costin January 5, 2026 16:56

This was referenced Jan 7, 2026

Apply group by all logic not only to top-level aggregates #140248

Merged

Make TimeSeriesAggregate TimestampAware #140270

Merged

felixbarny and others added 8 commits January 7, 2026 16:39

Merge remote-tracking branch 'origin/main' into ts-binary-ops

a5ea442

Don't replace nested expressions with eval when translating TS|STATS

5b7fe65

Revert changes in ReplaceAggregateNestedExpressionWithEval

9452cb7

Revert remaining changes to ReplaceAggregateNestedExpressionWithEval

7f07ee1

Avoid modifying the source

dadf068

Avoid adding over time aggregation to reference attributes

5e88fa2

Merge branch 'main' into ts-binary-ops

8b9730d

[CI] Auto commit changes from spotless

b3a9e11

felixbarny self-assigned this Jan 8, 2026

Merge remote-tracking branch 'origin/main' into ts-binary-ops

8849d14

felixbarny added 3 commits January 12, 2026 09:41

Merge remote-tracking branch 'felixbarny/ts-binary-ops' into ts-binary-ops

d2082dc

Fix table formatting

5af3c71

Add missing ts_stats_binary_ops capability to test cases

e620cef

felixbarny marked this pull request as ready for review January 12, 2026 09:07

felixbarny requested a review from kkrik-es January 12, 2026 09:07

elasticsearchmachine added the needs:triage (Requires assignment of a team area label) label Jan 12, 2026

felixbarny added the >enhancement and :StorageEngine/ES|QL (Timeseries / metrics / PromQL / logsdb capabilities in ES|QL) labels Jan 12, 2026

elasticsearchmachine added the Team:StorageEngine label Jan 12, 2026

elasticsearchmachine removed the needs:triage (Requires assignment of a team area label) label Jan 12, 2026

Update docs/changelog/140135.yaml

cb52fd1

kkrik-es reviewed Jan 12, 2026


sidosera approved these changes Jan 12, 2026


felixbarny and others added 4 commits January 12, 2026 16:50

Merge remote-tracking branch 'origin/main' into ts-binary-ops

51208fc

Merge remote-tracking branch 'felixbarny/ts-binary-ops' into ts-binary-ops

4c990c4

Merge branch 'main' into ts-binary-ops

5ec1491

[CI] Auto commit changes from spotless

3cd1581

dnhatn approved these changes Jan 12, 2026


felixbarny merged commit e82b7ca into elastic:main Jan 13, 2026
36 checks passed

felixbarny mentioned this pull request Jan 13, 2026

PromQL: support for top-level binary operators #140541

Merged

felixbarny mentioned this pull request Jan 15, 2026

org.elasticsearch.xpack.esql.core.QlIllegalArgumentException: Unsupported expression [AVG_OVER_TIME(pull_requests)] #140683

Closed

eranweiss-elastic pushed a commit to eranweiss-elastic/elasticsearch that referenced this pull request Jan 15, 2026

Add support for top-level arithmetic ops to TS|STATS (elastic#140135)

36f6ea5

This was referenced Jan 16, 2026

Add rule to wrap bare metrics in default over time function #140090

Merged

Better default for over_time aggregation #138702

Closed

spinscale pushed a commit to spinscale/elasticsearch that referenced this pull request Jan 21, 2026

Add support for top-level arithmetic ops to TS|STATS (elastic#140135)

5927749


7 participants
