Add support for top-level arithmetic ops to TS|STATS#140135
Add support for top-level arithmetic ops to TS|STATS#140135felixbarny merged 49 commits intoelastic:mainfrom
Conversation
|
this is awesome. Thanks for tackling it @felixbarny |
...c/main/java/org/elasticsearch/xpack/esql/analysis/InsertDefaultInnerTimeSeriesAggregate.java
Outdated
Show resolved
Hide resolved
...c/main/java/org/elasticsearch/xpack/esql/analysis/InsertDefaultInnerTimeSeriesAggregate.java
Outdated
Show resolved
Hide resolved
...c/main/java/org/elasticsearch/xpack/esql/analysis/InsertDefaultInnerTimeSeriesAggregate.java
Outdated
Show resolved
Hide resolved
.../java/org/elasticsearch/xpack/esql/optimizer/rules/logical/TranslateTimeSeriesAggregate.java
Outdated
Show resolved
Hide resolved
|
@felixbarny Thank you for tackling this. I've spent quite some time on this. I think TranslateTimeSeriesAggregate should be a rule in the Analyzer, not an optimizer rule. As you found, it should execute before we substitute expressions around aggregations. However, I think the approach you proposed can be fragile, since we might scatter the substitutions for aggregations and groupings. I played with these rules and I think we can move TranslateTimeSeriesAggregate to the Analyzer. Are you okay to continue working this issue? Otherwise, we can share the work. |
I guess this implies that When I asked @costin why we don't run the PromQL translation during analysis, this was his response:
Fair. Re-using What do you think of phasing out the change where in the first step, the PromQL and time series aggregate translation is still happening in the optimization phase, but runs early and also doesn't use |
++ Moving the PromQL and translate-time-series rules to the beginning of the logical optimizer rules is a good step toward moving them to the analysis phase. |
|
Hi @felixbarny, I've created a changelog YAML for you. |
|
Pinging @elastic/es-storage-engine (Team:StorageEngine) |
kkrik-es
left a comment
There was a problem hiding this comment.
I like the direction, and the logic seems a bit cleaner. Will leave it to Nhat to approve.
sidosera
left a comment
There was a problem hiding this comment.
This looks great! I also love we move PromQL translation earlier in the chain.
I'm happy to accept to unblock, would still love to hear Nhat take when they get a chance.
dnhatn
left a comment
There was a problem hiding this comment.
One question, but this looks great. Thanks Felix for all the iterations.
| new PruneUnusedIndexMode(), | ||
| // after translating metric aggregates, we need to replace surrogate substitutions and nested expressions again. | ||
| // re-executing the next two rules is a relic of when time series aggregates were translated after surrogate substitution | ||
| // removing this would fail in ccs scenarios where the remote cluster is on an older version (caught by bwc tests) |
There was a problem hiding this comment.
I don't think we need to call SubstituteSurrogateAggregations twice consecutively.
There was a problem hiding this comment.
This is how I had it initially but omitting it caused bwc tests to fail. Something related to class cast exceptions of different block instances.
Maybe we can look for a solution after merging this PR?
Examples of queries that are supported now:
* `network.bytes_in * 8`
* `network.eth0.rx + network.eth0.tx`
* `max(network.total_bytes_in) * 8`
* `network.total_bytes_in{cluster!="prod"} / network.total_bytes_in{cluster!="staging"}`
Follow-up from elastic#140135
Examples of queries that are supported now:
* `network.bytes_in * 8`
* `network.eth0.rx + network.eth0.tx`
* `max(network.total_bytes_in) * 8`
* `network.total_bytes_in{cluster!="prod"} / network.total_bytes_in{cluster!="staging"}`
Follow-up from #140135
Examples of queries that are supported now:
* `network.bytes_in * 8`
* `network.eth0.rx + network.eth0.tx`
* `max(network.total_bytes_in) * 8`
* `network.total_bytes_in{cluster!="prod"} / network.total_bytes_in{cluster!="staging"}`
Follow-up from elastic#140135
This is what's happening at a high level:
TranslateTimeSeriesAggregatenow not only handlesAggregateFunctions, but allFunctions, includingBinaryScalarFunctionsTranslateTimeSeriesAggregate, the aggregates are not be split up into evals anymore. TheTranslateTimeSeriesAggregaterule now runs earlier in the optimizer (beforeReplaceAggregateNestedExpressionWithEvaland friends).TimeSeriesAggregateFunctions to the first aggregation phase, without someTimeSeriesAggregateFunctions being placed in nestedEvals.last_over_timefunction for expressions likefoo + 1ormax(foo + 1), before the innerfoo + 1is extracted into an eval.TimeSeriesAggregateare still be replaced with an eval to make time bucket handling easier.last_over_timefunction outside ofTranslateTimeSeriesAggregateand into the analysis phase, so thatInsertFromAggregateMetricDoubleruns after the insertion oflast_over_time. If it would execute later, thelast_over_timefunction can't be resolved for downsampling indices where metrics are of typeaggregate_metric_double. It needs to run after field resolution as the injection of the default inner agg is type-dependent - we have a different strategy for histograms.TimeSeriesGroupByAllhas been moved from the initialize to the resolution phase of the analyzer - afterInsertDefaultInnerTimeSeriesAggregate, so that it can take that into account. That also fixes a missing reference issue in the nested eval for queries likenetwork.total_bytes_in * 8.Queries that are supported now but weren't before:
TS k8s | STATS network.costTS k8s | STATS network.cost | SORT network.costTS k8s | STATS 10 + max(10 + network.total_bytes_in)TS k8s | STATS network.total_bytes_in * 8TS k8s | STATS in_n_out=network.eth0.rx + network.eth0.txTS k8s | STATS max(last_over_time(network.eth0.tx::double) / (last_over_time(network.eth0.tx::double) + last_over_time(network.eth0.rx::double)))closes #139570, #138702, #139580
Child PRs
TimeSeriesAggregateTimestampAware#140270PromQL support will be added in a follow-up: