ESQL: Fix wrong attribute shadowing in pushdown rules#105650
ESQL: Fix wrong attribute shadowing in pushdown rules#105650elasticsearchmachine merged 24 commits intoelastic:mainfrom
Conversation
d2fe543 to
74ec036
Compare
74ec036 to
726fee1
Compare
|
I tried an alternative solution as well which turned |
| Bezalel Simmel | Bezalel | Simmel | ||
| ; | ||
|
|
||
| overwriteNameAfterSort#[skip:-8.13.0] |
There was a problem hiding this comment.
I think we should backport this bugfix to 8.13, so I'm setting the skip labels accordingly.
This is unique and should avoid name clashes better.
|
Also thanks to @luigidellaquila for suggesting this approach, which updates the attributes in the |
luigidellaquila
left a comment
There was a problem hiding this comment.
Thanks @alex-spies, looks pretty good.
I left only one comment about possible improvements on tests
...gin/esql/src/test/java/org/elasticsearch/xpack/esql/optimizer/LogicalPlanOptimizerTests.java
Outdated
Show resolved
Hide resolved
x-pack/plugin/esql/qa/testFixtures/src/main/resources/eval.csv-spec
Outdated
Show resolved
Hide resolved
costin
left a comment
There was a problem hiding this comment.
Looks good - the code is duplicated 3 times so please encapsulate it.
Add more tests also some with references sort x | eval y = x, x = y, chained to guarantee the nature of the plan before the rule kicks in.
| if (existingAlias == null) { | ||
| String tempName = SubstituteSurrogates.rawTemporaryName(attr.name(), "temp_name", (int) attr.id().toLong()); | ||
|
|
||
| Alias tempNameForShadowedAttr = new Alias(Source.EMPTY, tempName, attr); |
There was a problem hiding this comment.
Preserve the source (useful to see where this came from) and set the synthetic parameter to true.
There was a problem hiding this comment.
I agree this should be synthetic. In particular, it should not be computed on a data node, that would just send duplicate columns over the wire.
However, currently this does not just work; if I mark these aliases as synth, the ProjectAwayColumns physical optimizer rule prunes them, but I get an NPE because now they are seemingly computed nowhere. (Happens precisely here.)
This requires additional investigation to get right. How about I create a separate issue for that? For the time being, I'll mark it as TODO.
There was a problem hiding this comment.
Please raise an issue to double check the behaviour in ProjectAwayColumns.
...k/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/optimizer/LogicalPlanOptimizer.java
Outdated
Show resolved
Hide resolved
...k/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/optimizer/LogicalPlanOptimizer.java
Outdated
Show resolved
Hide resolved
...k/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/optimizer/LogicalPlanOptimizer.java
Outdated
Show resolved
Hide resolved
|
|
||
| AttributeMap<Alias> aliasesForShadowedOrderByAttrs = nonShadowedOrders.replacedAttributes; | ||
| @SuppressWarnings("unchecked") | ||
| List<Order> newOrder = (List<Order>) (Object) nonShadowedOrders.rewrittenExpressions; |
There was a problem hiding this comment.
(List<Order>) (Object)? Everything is an object - if anything cast the object to List directly.
There was a problem hiding this comment.
Casting directly does not compile (Inconvertible types), but I can make this slightly less ugly by using List<?> instead of Object.
...k/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/optimizer/LogicalPlanOptimizer.java
Outdated
Show resolved
Hide resolved
| if (aliasesForShadowedOrderByAttrs.isEmpty() == false) { | ||
| List<Alias> newAliases = aliasesForShadowedOrderByAttrs.values().stream().toList(); | ||
|
|
||
| LogicalPlan plan = new Eval(Source.EMPTY, orderBy.child(), newAliases); |
There was a problem hiding this comment.
I think using the source of the OrderBy is the best option - here, we're renaming some attributes for it. (Current node source contains the already overwritten/shadowed attributes, instead.)
| LogicalPlan plan = new Eval(Source.EMPTY, orderBy.child(), newAliases); | ||
| plan = eval.replaceChild(plan); | ||
| plan = new OrderBy(orderBy.source(), plan, newOrder); | ||
| plan = new EsqlProject(Source.EMPTY, plan, eval.output()); |
There was a problem hiding this comment.
Use Project directly and preserve the source.
| AttributeSet generates = new AttributeSet(Expressions.asAttributes(enrichFields)); | ||
| // In case of aliases we generate both the alias and its target. | ||
| // E.g. in ENRICH policy ON field WITH alias = enrich_field | ||
| // we generate both `alias` and `enrich_field`. |
There was a problem hiding this comment.
This looks incorrect as enrich_field is hidden behind alias.
There was a problem hiding this comment.
Yeah, it's weird that we have to do this. I found a place in the physical optimizer that needs to handle Enrich in a similarly weird way.
To avoid cementing this weirdness, I'll remove this from the PR, at the cost of having to disable the dependency checker again. I'll follow this up with another PR, after investigating the weird handling of attributes in Enrich.
There was a problem hiding this comment.
Raise an issue for the Enrich problem to make sure it doesn't get lost.
x-pack/plugin/ql/src/main/java/org/elasticsearch/xpack/ql/expression/NameId.java
Outdated
Show resolved
Hide resolved
I'll add a CSV test for this; the added |
alex-spies
left a comment
There was a problem hiding this comment.
Thanks for your remarks, @costin and @luigidellaquila ! Should be ready for the next round, now :)
costin
left a comment
There was a problem hiding this comment.
Thanks - the improvements are visible. LGTM
|
Pinging @elastic/es-analytical-engine (Team:Analytics) |
|
Hi @alex-spies, I've created a changelog YAML for you. |
Fix elastic#105434 Fixes accidental shadowing when pushing down `GROK`/`DISSECT`, `EVAL` or `ENRICH` past a `SORT`. Example for how this works: ``` ... | SORT x | EVAL x = y ... pushing this down just like that would be incorrect as x is used in the SORT, so we turn this essentially into ... | EVAL $$x = x | EVAL x = y | SORT $$x | DROP $$x ... ``` The same logic is applied to `GROK`/`DISSECT` and `ENRICH`. This allows to re-enable the dependency checker (after fixing a small bug in it when handling `ENRICH`).
💚 Backport successful
|
#105808) * ESQL: Fix wrong attribute shadowing in pushdown rules (#105650) Fix #105434 Fixes accidental shadowing when pushing down `GROK`/`DISSECT`, `EVAL` or `ENRICH` past a `SORT`. Example for how this works: ``` ... | SORT x | EVAL x = y ... pushing this down just like that would be incorrect as x is used in the SORT, so we turn this essentially into ... | EVAL $$x = x | EVAL x = y | SORT $$x | DROP $$x ... ``` The same logic is applied to `GROK`/`DISSECT` and `ENRICH`. This allows to re-enable the dependency checker (after fixing a small bug in it when handling `ENRICH`). * Make OptimizerRules compile again
Fix #105434
Fixes accidental shadowing when pushing down
GROK/DISSECT,EVALorENRICHpast aSORT.Example for how this works:
The same logic is applied to
GROK/DISSECTandENRICH.This allows to re-enable the dependency checker (after fixing a small bug in it when handling
ENRICH).