-
Notifications
You must be signed in to change notification settings - Fork 25.9k
ESQL: Validate unique plan attribute names #110488
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 3 commits
134a3e8
9d9c70f
0d4e1df
6f36c29
cd48514
3a5dab7
f71ef42
42be4eb
2a0d630
11da00c
0d6758e
0e13eaf
878eb99
e8609f2
5b4be72
2f1778a
73a9736
e3531ff
2cb23f5
222698c
1a648e4
658bebc
e2760a5
6ac0fa9
0f3c274
091c099
f478d59
6f00534
535c954
fb85e16
6f98cde
fdcc40e
3779900
07e9584
32f76a0
94737ff
ea9b9a9
f5d9568
a387165
233d68d
d0723b0
fb17126
bc354f4
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,5 @@ | ||
| pr: 110488 | ||
| summary: "ESQL: Validate unique plan attribute names" | ||
| area: ES|QL | ||
| type: bug | ||
| issues: [] |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -69,13 +69,6 @@ a:integer | b:integer | c:null | z:integer | |
| 1 | 2 | null | null | ||
| ; | ||
|
|
||
| evalRowWithNull2 | ||
| row a = 1, null, b = 2, c = null, null | eval z = a+b; | ||
|
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This one had ambiguous attribute names.
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Should we consider this a bug fix or a breaking change? (IMHO it's a bug, but I see it's questionable)
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think yes, this is a bug, since there's no way to refer to the attributes named |
||
|
|
||
| a:integer | null:null | b:integer | c:null | null:null | z:integer | ||
| 1 | null | 2 | null | null | 3 | ||
| ; | ||
|
|
||
| evalRowWithNull3 | ||
| row a = 1, b = 2, x = round(null) | eval z = a+b+x; | ||
|
|
||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -8,8 +8,10 @@ | |
| package org.elasticsearch.xpack.esql.optimizer; | ||
|
|
||
| import org.elasticsearch.xpack.esql.core.common.Failures; | ||
| import org.elasticsearch.xpack.esql.core.expression.Attribute; | ||
| import org.elasticsearch.xpack.esql.core.expression.AttributeSet; | ||
| import org.elasticsearch.xpack.esql.core.expression.Expressions; | ||
| import org.elasticsearch.xpack.esql.core.expression.NameId; | ||
| import org.elasticsearch.xpack.esql.core.plan.QueryPlan; | ||
| import org.elasticsearch.xpack.esql.core.plan.logical.LogicalPlan; | ||
| import org.elasticsearch.xpack.esql.plan.logical.Aggregate; | ||
|
|
@@ -36,6 +38,9 @@ | |
| import org.elasticsearch.xpack.esql.plan.physical.RowExec; | ||
| import org.elasticsearch.xpack.esql.plan.physical.ShowExec; | ||
|
|
||
| import java.util.HashSet; | ||
| import java.util.Set; | ||
|
|
||
| import static org.elasticsearch.xpack.esql.core.common.Failure.fail; | ||
|
|
||
| class OptimizerRules { | ||
|
|
@@ -49,9 +54,24 @@ void checkPlan(P p, Failures failures) { | |
| AttributeSet input = p.inputSet(); | ||
| AttributeSet generated = generates(p); | ||
| AttributeSet missing = refs.subtract(input).subtract(generated); | ||
| if (missing.size() > 0) { | ||
| if (missing.isEmpty() == false) { | ||
| failures.add(fail(p, "Plan [{}] optimized incorrectly due to missing references {}", p.nodeString(), missing)); | ||
| } | ||
|
|
||
| Set<String> outputAttributeNames = new HashSet<>(); | ||
| Set<NameId> outputAttributeIds = new HashSet<>(); | ||
| for (Attribute outputAttr : p.output()) { | ||
| if (outputAttributeNames.add(outputAttr.name()) == false || outputAttributeIds.add(outputAttr.id()) == false) { | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Actually, I changed my mind on this: It is the actual name that needs to be unique; otherwise, bugs could slip in because we somehow end up using qualifiers on accident; qualifiers are not respected by our optimization rules, e.g. If we end up using qualifiers after all (I think that's really for the future and we should really remove them until then), we can easily update the validation. |
||
| failures.add( | ||
| fail( | ||
| p, | ||
| "Plan [{}] optimized incorrectly due to duplicate output attribute {}", | ||
| p.nodeString(), | ||
| outputAttr.toString() | ||
| ) | ||
| ); | ||
| } | ||
| } | ||
| } | ||
|
|
||
| protected AttributeSet references(P p) { | ||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ambiguities were possible with
rowearlier; the message suggesting disambiguation does not make sense - that's not possible. This was carried over fromqland made sense for SQL. If we have ambiguities in ESQL, that's a bug IMO.