Merged
43 commits
134a3e8
Unique output attribute names after optimization
alex-spies Jul 4, 2024
9d9c70f
Enforce unique row attribute names in verifier
alex-spies Jul 4, 2024
0d4e1df
Update docs/changelog/110488.yaml
alex-spies Jul 4, 2024
6f36c29
Add tests for grok, dissect, enrich
alex-spies Jul 5, 2024
cd48514
Add tests for keep
alex-spies Jul 5, 2024
3a5dab7
Make row consistent with other plans
alex-spies Jul 5, 2024
f71ef42
Update docs/changelog/110488.yaml
alex-spies Jul 5, 2024
42be4eb
Add test for drop, rename and stats
alex-spies Jul 5, 2024
2a0d630
Merge remote-tracking branch 'upstream/main' into validate-unique-pla…
alex-spies Jul 5, 2024
11da00c
Add test dataset with deeper field hierarchy
alex-spies Jul 9, 2024
0d6758e
Add hierarchical shadowing test for eval
alex-spies Jul 9, 2024
0e13eaf
Add hierarchical tests for drop, dissect
alex-spies Jul 9, 2024
878eb99
Add hierarchical tests for enrich
alex-spies Jul 9, 2024
e8609f2
Add hierarchical tests for grok, keep
alex-spies Jul 9, 2024
5b4be72
Add hierarchical tests for rename, row
alex-spies Jul 9, 2024
2f1778a
Add more extreme case for stats
alex-spies Jul 9, 2024
73a9736
Add new capability for this fix
alex-spies Jul 9, 2024
e3531ff
Merge remote-tracking branch 'upstream/main' into validate-unique-pla…
alex-spies Jul 9, 2024
2cb23f5
Fix EsRelation.equals, mutation in ResolveUnionTypes
alex-spies Jul 9, 2024
222698c
Merge remote-tracking branch 'upstream/main' into validate-unique-pla…
alex-spies Jul 10, 2024
1a648e4
Make union types use unique attribute names
alex-spies Jul 11, 2024
658bebc
Cleanup leftover
alex-spies Jul 11, 2024
e2760a5
Merge remote-tracking branch 'upstream/main' into validate-unique-pla…
alex-spies Jul 11, 2024
6ac0fa9
Revert "Cleanup leftover"
alex-spies Jul 15, 2024
0f3c274
Revert "Make union types use unique attribute names"
alex-spies Jul 15, 2024
091c099
Revert "Fix EsRelation.equals, mutation in ResolveUnionTypes"
alex-spies Jul 15, 2024
f478d59
More ENRICH tests with internal shadowing
alex-spies Jul 15, 2024
6f00534
More consistent test names
alex-spies Jul 15, 2024
535c954
More KEEP tests
alex-spies Jul 15, 2024
fb85e16
Update docs
alex-spies Jul 15, 2024
6f98cde
Add more tests
alex-spies Jul 15, 2024
fdcc40e
Improve doc wording
alex-spies Jul 15, 2024
3779900
Improve GROK docs
alex-spies Jul 15, 2024
07e9584
Merge remote-tracking branch 'upstream/main' into validate-unique-pla…
alex-spies Jul 15, 2024
32f76a0
Merge remote-tracking branch 'upstream/main' into validate-unique-pla…
alex-spies Jul 16, 2024
94737ff
Update RENAME docs and tests
alex-spies Jul 16, 2024
ea9b9a9
Avoid duplicate field attribs from union type res
alex-spies Jul 16, 2024
f5d9568
Fix leftovers
alex-spies Jul 16, 2024
a387165
Make tests deterministic
alex-spies Jul 16, 2024
233d68d
Fix rename shadowing docs
alex-spies Jul 16, 2024
d0723b0
Apply Liam's doc remarks
alex-spies Jul 16, 2024
fb17126
Don't describe KEEP precedence twice
alex-spies Jul 17, 2024
bc354f4
Merge remote-tracking branch 'upstream/main' into validate-unique-pla…
alex-spies Jul 17, 2024
5 changes: 5 additions & 0 deletions docs/changelog/110488.yaml
@@ -0,0 +1,5 @@
+pr: 110488
+summary: "ESQL: Validate unique plan attribute names"
+area: ES|QL
+type: bug
+issues: []
@@ -20,8 +20,6 @@
 import java.util.function.Predicate;
 import java.util.function.Supplier;

-import static java.util.Collections.singletonList;
-
 public final class AnalyzerRules {

     public abstract static class AnalyzerRule<SubPlan extends LogicalPlan> extends Rule<SubPlan, LogicalPlan> {
@@ -138,14 +136,6 @@ public static List<Attribute> maybeResolveAgainstList(
                 )
                 .toList();

-        return singletonList(
-            ua.withUnresolvedMessage(
-                "Reference ["
-                    + ua.qualifiedName()
-                    + "] is ambiguous (to disambiguate use quotes or qualifiers); "
-                    + "matches any of "
-                    + refs
-            )
-        );
+        throw new IllegalStateException("Reference [" + ua.qualifiedName() + "] is ambiguous; " + "matches any of " + refs);

Contributor Author

Ambiguities used to be possible with row. The old message suggested disambiguating with quotes or qualifiers, which makes no sense here because that's not possible in ESQL. The message was carried over from ql, where it made sense for SQL. If we have ambiguities in ESQL, that's a bug IMO.

     }
 }
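
To illustrate the behavior change discussed in the comment above, here is a minimal sketch, assuming a simplified model of reference resolution. The `Attr` record and the `resolve` method are hypothetical stand-ins and are not the real `Attribute` class or `maybeResolveAgainstList`; the sketch only shows the idea of treating more than one match as an internal error instead of returning an "unresolved" attribute with a user-facing hint.

```java
import java.util.List;

final class AmbiguityCheckSketch {
    // Hypothetical stand-in for a plan attribute; not the real ES|QL Attribute class.
    record Attr(String name) {}

    // Returns the candidates whose name matches. More than one match is treated as a
    // bug in the caller (IllegalStateException), mirroring the intent of the diff above.
    static List<Attr> resolve(String name, List<Attr> candidates) {
        List<Attr> matches = candidates.stream().filter(a -> a.name().equals(name)).toList();
        if (matches.size() > 1) {
            throw new IllegalStateException("Reference [" + name + "] is ambiguous; matches any of " + matches);
        }
        return matches;
    }

    public static void main(String[] args) {
        List<Attr> unique = List.of(new Attr("a"), new Attr("b"));
        System.out.println(resolve("a", unique));

        List<Attr> duplicated = List.of(new Attr("a"), new Attr("a"));
        try {
            resolve("a", duplicated);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The design choice here is that, once duplicate names are rejected earlier (in the verifier), an ambiguity at this point can only come from a bug, so failing loudly is preferable to a message the user cannot act on.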
@@ -69,13 +69,6 @@ a:integer | b:integer | c:null | z:integer
 1 | 2 | null | null
 ;

-evalRowWithNull2
-row a = 1, null, b = 2, c = null, null | eval z = a+b;

Contributor Author

This one had ambiguous attribute names.

Member

Should we consider this a bug fix or a breaking change? (IMHO it's a bug, but I see it's questionable)

Contributor Author

I think yes, this is a bug, since there's no way to refer to the attributes named null in following commands.


-a:integer | null:null | b:integer | c:null | null:null | z:integer
-1 | null | 2 | null | null | 3
-;
-
 evalRowWithNull3
 row a = 1, b = 2, x = round(null) | eval z = a+b+x;

@@ -46,6 +46,7 @@
 import java.util.ArrayList;
 import java.util.BitSet;
 import java.util.Collection;
+import java.util.HashSet;
 import java.util.LinkedHashSet;
 import java.util.List;
 import java.util.Set;
@@ -349,10 +350,15 @@ private static void checkRegexExtractOnlyOnStrings(LogicalPlan p, Set<Failure> f

     private static void checkRow(LogicalPlan p, Set<Failure> failures) {
         if (p instanceof Row row) {
+            Set<String> outputAttributeNames = new HashSet<>();
+
             row.fields().forEach(a -> {
                 if (EsqlDataTypes.isRepresentable(a.dataType()) == false) {
                     failures.add(fail(a, "cannot use [{}] directly in a row assignment", a.child().sourceText()));
                 }
+                if (outputAttributeNames.add(a.name()) == false) {
+                    failures.add(fail(a, "cannot use the name [{}] multiple times in a row assignment", a.name()));
+                }
             });
         }
     }
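
The duplicate-name check added to checkRow relies on `Set.add` returning `false` when the element is already present. The following is a minimal standalone sketch of that technique, assuming field names are plain strings; `RowNameCheckSketch` and `duplicateNameFailures` are hypothetical names and do not exist in the Elasticsearch code base.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class RowNameCheckSketch {
    // Hypothetical helper: collects one failure message per repeated field name.
    static List<String> duplicateNameFailures(List<String> fieldNames) {
        Set<String> seen = new HashSet<>();
        List<String> failures = new ArrayList<>();
        for (String name : fieldNames) {
            // Set.add returns false if the set already contained the element.
            if (seen.add(name) == false) {
                failures.add("cannot use the name [" + name + "] multiple times in a row assignment");
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        // Field names as they would appear for: row a = 1, null, b = 2, c = null, null
        // (assumption: an unnamed field is named after its source text).
        List<String> names = List.of("a", "null", "b", "c", "null");
        duplicateNameFailures(names).forEach(System.out::println);
    }
}
```

Using the return value of `add` avoids a separate `contains` lookup and reports each repeated occurrence at the position where it appears.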
@@ -8,8 +8,10 @@
 package org.elasticsearch.xpack.esql.optimizer;

 import org.elasticsearch.xpack.esql.core.common.Failures;
+import org.elasticsearch.xpack.esql.core.expression.Attribute;
 import org.elasticsearch.xpack.esql.core.expression.AttributeSet;
 import org.elasticsearch.xpack.esql.core.expression.Expressions;
+import org.elasticsearch.xpack.esql.core.expression.NameId;
 import org.elasticsearch.xpack.esql.core.plan.QueryPlan;
 import org.elasticsearch.xpack.esql.core.plan.logical.LogicalPlan;
 import org.elasticsearch.xpack.esql.plan.logical.Aggregate;
@@ -36,6 +38,9 @@
 import org.elasticsearch.xpack.esql.plan.physical.RowExec;
 import org.elasticsearch.xpack.esql.plan.physical.ShowExec;

+import java.util.HashSet;
+import java.util.Set;
+
 import static org.elasticsearch.xpack.esql.core.common.Failure.fail;

 class OptimizerRules {
@@ -49,9 +54,24 @@ void checkPlan(P p, Failures failures) {
             AttributeSet input = p.inputSet();
             AttributeSet generated = generates(p);
             AttributeSet missing = refs.subtract(input).subtract(generated);
-            if (missing.size() > 0) {
+            if (missing.isEmpty() == false) {
                 failures.add(fail(p, "Plan [{}] optimized incorrectly due to missing references {}", p.nodeString(), missing));
             }
+
+            Set<String> outputAttributeNames = new HashSet<>();
+            Set<NameId> outputAttributeIds = new HashSet<>();
+            for (Attribute outputAttr : p.output()) {
+                if (outputAttributeNames.add(outputAttr.name()) == false || outputAttributeIds.add(outputAttr.id()) == false) {

Contributor

.name() -> .qualifiedName()?

Contributor Author

Actually, I changed my mind on this:

It is the actual name that needs to be unique. Otherwise, bugs could slip in because we somehow end up using qualifiers by accident. Qualifiers are not respected by our optimization rules, e.g. mergeOutputAttributes. This PR demonstrates that qualifiers are entirely unused, and the validation should reflect the assumptions that currently hold.

If we end up using qualifiers after all (I think that's really for the future and we should really remove them until then), we can easily update the validation.

+                    failures.add(
+                        fail(
+                            p,
+                            "Plan [{}] optimized incorrectly due to duplicate output attribute {}",
+                            p.nodeString(),
+                            outputAttr.toString()
+                        )
+                    );
+                }
+            }
         }

         protected AttributeSet references(P p) {
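
The optimizer-side check above flags an output attribute when either its name or its id has already been seen. Below is a minimal sketch of that idea, assuming a simplified attribute with a plain `String` name and a `long` id. `OutputAttr` and `duplicates` are hypothetical; the real code uses `Attribute` and `NameId`. Unlike the short-circuiting `||` in the diff, this sketch records both the name and the id of every attribute before deciding, which is a deliberate simplification and not the PR's exact behavior.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class OutputUniquenessSketch {
    // Hypothetical stand-in for an output attribute (name plus unique id).
    record OutputAttr(String name, long id) {}

    // Returns every attribute whose name or id repeats an earlier attribute's.
    static List<OutputAttr> duplicates(List<OutputAttr> output) {
        Set<String> names = new HashSet<>();
        Set<Long> ids = new HashSet<>();
        List<OutputAttr> dupes = new ArrayList<>();
        for (OutputAttr attr : output) {
            // Evaluate both adds so that names and ids are always recorded
            // (assumption: differs from the short-circuit check in the diff).
            boolean freshName = names.add(attr.name());
            boolean freshId = ids.add(attr.id());
            if (freshName == false || freshId == false) {
                dupes.add(attr);
            }
        }
        return dupes;
    }

    public static void main(String[] args) {
        List<OutputAttr> output = List.of(
            new OutputAttr("a", 1),
            new OutputAttr("b", 2),
            new OutputAttr("a", 3), // repeats the name "a"
            new OutputAttr("c", 2)  // repeats the id 2
        );
        System.out.println(duplicates(output));
    }
}
```

Checking plain names, without qualifiers, matches the reasoning in the review comment above: the validation reflects what the optimization rules currently assume.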
@@ -35,6 +35,15 @@ public class VerifierTests extends ESTestCase {
     private final Analyzer defaultAnalyzer = AnalyzerTestUtils.expandedDefaultAnalyzer();
     private final Analyzer tsdb = AnalyzerTestUtils.analyzer(AnalyzerTestUtils.tsdbIndexResolution());

+    public void testRowAllowsOnlyUniqueAttributeNames() {
+        assertEquals("1:19: cannot use the name [a] multiple times in a row assignment", error("row a = 1, b = 2, a = 3"));
+        assertEquals("1:11: cannot use the name [1] multiple times in a row assignment", error("row 1, 2, 1"));
+        assertEquals(
+            "1:35: cannot use the name [null] multiple times in a row assignment",
+            error("row a = 1, null, b = 2, c = null, null | eval z = a+b")
+        );
+    }
+
     public void testIncompatibleTypesInMathOperation() {
         assertEquals(
             "1:40: second argument of [a + c] must be [datetime or numeric], found value [c] type [keyword]",