6 changes: 6 additions & 0 deletions docs/changelog/110488.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,6 @@
pr: 110488
summary: "ESQL: Validate unique plan attribute names"
area: ES|QL
type: bug
issues:
- 110541
2 changes: 2 additions & 0 deletions docs/reference/esql/processing-commands/dissect.asciidoc
@@ -17,6 +17,8 @@ multiple values, `DISSECT` will process each value.

`pattern`::
A <<esql-dissect-patterns,dissect pattern>>.
If a field name conflicts with an existing column, the existing column is dropped.
If a field name is used more than once, only the rightmost duplicate creates a column.

`<separator>`::
A string used as the separator between appended values, when using the <<esql-append-modifier,append modifier>>.
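The duplicate-field-name rule for `pattern` can be sketched as follows (an illustrative query, not one of the shipped examples):

[source,esql]
----
ROW a = "foo bar"
| DISSECT a "%{b} %{b}"
----

Only the rightmost `%{b}` creates a column, so the result keeps `a` and a single column `b` containing `"bar"`.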
7 changes: 6 additions & 1 deletion docs/reference/esql/processing-commands/enrich.asciidoc
@@ -28,11 +28,16 @@ name as the `match_field` defined in the <<esql-enrich-policy,enrich policy>>.
The enrich fields from the enrich index that are added to the result as new
columns. If a column with the same name as the enrich field already exists, the
existing column will be replaced by the new column. If not specified, each of
the enrich fields defined in the policy is added
the enrich fields defined in the policy is added.
A column with the same name as the enrich field will be dropped unless the
enrich field is renamed.

`new_nameX`::
Enables you to change the name of the column that's added for each of the enrich
fields. Defaults to the enrich field name.
If an existing column has the same name as the new name, the existing column is discarded.
If a name (new or original) occurs more than once, only the rightmost duplicate
creates a new column.
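For instance, assuming an enrich policy `city_names` that exposes `airport` and `region` enrich fields (as in the test fixtures of this change), repeating a new name behaves like this:

[source,esql]
----
ROW city = "Zürich"
| ENRICH city_names ON city WITH x = airport, x = region
----

Only the rightmost assignment creates a column, so `x` holds the `region` value.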

*Description*

4 changes: 3 additions & 1 deletion docs/reference/esql/processing-commands/eval.asciidoc
@@ -13,10 +13,12 @@ EVAL [column1 =] value1[, ..., [columnN =] valueN]

`columnX`::
The column name.
If a column with the same name already exists, the existing column is dropped.
If a column name is used more than once, only the rightmost duplicate creates a column.

`valueX`::
The value for the column. Can be a literal, an expression, or a
<<esql-functions,function>>.
<<esql-functions,function>>. Can reference columns defined to the left of this one.
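A minimal sketch of left-to-right evaluation (illustrative column names):

[source,esql]
----
ROW a = 1
| EVAL b = a + 1, c = b * 2
----

`c` can reference `b` because `b` is defined to its left, so the result is `a = 1`, `b = 2`, `c = 4`.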

*Description*

15 changes: 15 additions & 0 deletions docs/reference/esql/processing-commands/grok.asciidoc
@@ -17,6 +17,9 @@ multiple values, `GROK` will process each value.

`pattern`::
A grok pattern.
If a field name conflicts with an existing column, the existing column is discarded.
If a field name is used more than once, a multi-valued column will be created with one value
per occurrence of the field name.

*Description*

@@ -64,4 +67,16 @@ include::{esql-specs}/docs.csv-spec[tag=grokWithToDatetime]
|===
include::{esql-specs}/docs.csv-spec[tag=grokWithToDatetime-result]
|===

If a field name is used more than once, `GROK` creates a multi-valued
column:

[source.merge.styled,esql]
----
include::{esql-specs}/docs.csv-spec[tag=grokWithDuplicateFieldNames]
----
[%header.monospaced.styled,format=dsv,separator=|]
|===
include::{esql-specs}/docs.csv-spec[tag=grokWithDuplicateFieldNames-result]
|===
// end::examples[]
4 changes: 3 additions & 1 deletion docs/reference/esql/processing-commands/keep.asciidoc
@@ -13,6 +13,8 @@ KEEP columns

`columns`::
A comma-separated list of columns to keep. Supports wildcards.
See below for how conflicts are resolved when an existing column matches multiple
wildcard patterns or column names.

*Description*

@@ -26,7 +28,7 @@ Fields are added in the order they appear. If one field matches multiple expressions
2. Partial wildcard expressions (for example: `fieldNam*`)
3. Wildcard only (`*`)

If a field matches two expressions with the same precedence, the right-most expression wins.
If a field matches two expressions with the same precedence, the rightmost expression wins.

Refer to the examples for illustrations of these precedence rules.

1 change: 1 addition & 0 deletions docs/reference/esql/processing-commands/lookup.asciidoc
@@ -15,6 +15,7 @@ LOOKUP table ON match_field1[, match_field2, ...]

`table`::
The name of the `table` provided in the request to match.
If the table's column names conflict with existing columns, the existing columns will be dropped.

`match_field`::
The fields in the input to match against the table.
4 changes: 3 additions & 1 deletion docs/reference/esql/processing-commands/rename.asciidoc
@@ -15,7 +15,9 @@ RENAME old_name1 AS new_name1[, ..., old_nameN AS new_nameN]
The name of a column you want to rename.

`new_nameX`::
The new name of the column.
The new name of the column. If it conflicts with an existing column name,
the existing column is dropped. If multiple columns are renamed to the same
name, all but the rightmost column with the same new name are dropped.
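A minimal sketch of the duplicate-new-name rule (illustrative names):

[source,esql]
----
ROW a = 1, b = 2
| RENAME a AS c, b AS c
----

Both columns are renamed to `c`, so all but the rightmost are dropped and the result is a single column `c` with the value `2`.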

*Description*

3 changes: 3 additions & 0 deletions docs/reference/esql/processing-commands/stats.asciidoc
@@ -15,12 +15,15 @@ STATS [column1 =] expression1[, ..., [columnN =] expressionN]
`columnX`::
The name by which the aggregated value is returned. If omitted, the name is
equal to the corresponding expression (`expressionX`).
If multiple columns have the same name, all but the rightmost column with this
name will be ignored.

`expressionX`::
An expression that computes an aggregated value.

`grouping_expressionX`::
An expression that outputs the values to group by.
If its name coincides with one of the computed columns, that column will be ignored.
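The duplicate-name rule for `columnX` can be sketched as follows (illustrative; uses the `employees` test dataset):

[source,esql]
----
FROM employees
| STATS x = MIN(salary), x = MAX(salary)
----

All but the rightmost `x` are ignored, so the result is a single column `x` holding the maximum salary.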

NOTE: Individual `null` values are skipped when computing aggregations.

1 change: 1 addition & 0 deletions docs/reference/esql/source-commands/row.asciidoc
@@ -13,6 +13,7 @@ ROW column1 = value1[, ..., columnN = valueN]

`columnX`::
The column name.
In case of duplicate column names, only the rightmost duplicate creates a column.

`valueX`::
The value for the column. Can be a literal, an expression, or a
@@ -20,8 +20,6 @@
import java.util.function.Predicate;
import java.util.function.Supplier;

import static java.util.Collections.singletonList;

public final class AnalyzerRules {

public abstract static class AnalyzerRule<SubPlan extends LogicalPlan> extends Rule<SubPlan, LogicalPlan> {
@@ -138,14 +136,6 @@ public static List<Attribute> maybeResolveAgainstList(
)
.toList();

return singletonList(
ua.withUnresolvedMessage(
"Reference ["
+ ua.qualifiedName()
+ "] is ambiguous (to disambiguate use quotes or qualifiers); "
+ "matches any of "
+ refs
)
);
throw new IllegalStateException("Reference [" + ua.qualifiedName() + "] is ambiguous; " + "matches any of " + refs);
}
}
@@ -96,8 +96,8 @@ public class CsvTestsDataLoader {
"cartesian_multipolygons.csv"
);
private static final TestsDataset DISTANCES = new TestsDataset("distances", "mapping-distances.json", "distances.csv");

private static final TestsDataset K8S = new TestsDataset("k8s", "k8s-mappings.json", "k8s.csv", "k8s-settings.json", true);
private static final TestsDataset ADDRESSES = new TestsDataset("addresses", "mapping-addresses.json", "addresses.csv", null, true);

public static final Map<String, TestsDataset> CSV_DATASET_MAP = Map.ofEntries(
Map.entry(EMPLOYEES.indexName, EMPLOYEES),
@@ -121,7 +121,8 @@ public class CsvTestsDataLoader {
Map.entry(AIRPORT_CITY_BOUNDARIES.indexName, AIRPORT_CITY_BOUNDARIES),
Map.entry(CARTESIAN_MULTIPOLYGONS.indexName, CARTESIAN_MULTIPOLYGONS),
Map.entry(K8S.indexName, K8S),
Map.entry(DISTANCES.indexName, DISTANCES)
Map.entry(DISTANCES.indexName, DISTANCES),
Map.entry(ADDRESSES.indexName, ADDRESSES)
);

private static final EnrichConfig LANGUAGES_ENRICH = new EnrichConfig("languages_policy", "enrich-policy-languages.json");
@@ -0,0 +1,4 @@
street:keyword,number:keyword,zip_code:keyword,city.name:keyword,city.country.name:keyword,city.country.continent.name:keyword,city.country.continent.planet.name:keyword,city.country.continent.planet.galaxy:keyword
Keizersgracht,281,1016 ED,Amsterdam,Netherlands,Europe,Earth,Milky Way
Kearny St,88,CA 94108,San Francisco,United States of America,North America,Earth,Milky Way
Marunouchi,2-7-2,100-7014,Tokyo,Japan,Asia,Earth,Milky Way
@@ -26,6 +26,19 @@ first_name:keyword | left:keyword | full_name:keyword | right:keyword | last_name:keyword
Georgi | left | Georgi Facello | right | Facello
;

shadowingSubfields
FROM addresses
| KEEP city.country.continent.planet.name, city.country.name, city.name
| DISSECT city.name "%{city.country.continent.planet.name} %{?}"
| SORT city.name
;

city.country.name:keyword | city.name:keyword | city.country.continent.planet.name:keyword
Netherlands | Amsterdam | null
United States of America | San Francisco | San
Japan | Tokyo | null
;

shadowingSelf
FROM employees
| KEEP first_name, last_name
@@ -50,6 +63,18 @@ last_name:keyword | left:keyword | foo:keyword | middle:keyword | ri
Facello | left | Georgi1 Georgi2 Facello | middle | right | Georgi1 | Georgi2 | Facello
;

shadowingInternal
FROM employees
| KEEP first_name, last_name
| WHERE last_name == "Facello"
| EVAL name = concat(first_name, "1 ", last_name)
| DISSECT name "%{foo} %{foo}"
;

first_name:keyword | last_name:keyword | name:keyword | foo:keyword
Georgi | Facello | Georgi1 Facello | Facello
;


complexPattern
ROW a = "1953-01-23T12:15:00Z - some text - 127.0.0.1;"
@@ -436,6 +436,23 @@ ROW a = "1.2.3.4 [2023-01-23T12:15:00.000Z] Connected"
// end::grokWithEscape-result[]
;

grokWithDuplicateFieldNames
// tag::grokWithDuplicateFieldNames[]
FROM addresses
| KEEP city.name, zip_code
| GROK zip_code "%{WORD:zip_parts} %{WORD:zip_parts}"
// end::grokWithDuplicateFieldNames[]
| SORT city.name
;

// tag::grokWithDuplicateFieldNames-result[]
city.name:keyword | zip_code:keyword | zip_parts:keyword
Amsterdam | 1016 ED | ["1016", "ED"]
San Francisco | CA 94108 | ["CA", "94108"]
Tokyo | 100-7014 | null
// end::grokWithDuplicateFieldNames-result[]
;

basicDissect
// tag::basicDissect[]
ROW a = "2023-01-23T12:15:00.000Z - some text - 127.0.0.1"
@@ -122,3 +122,53 @@ FROM employees | STATS COUNT(*), MIN(salary * 10), MAX(languages)| DROP `COUNT(
MIN(salary * 10):i | MAX(languages):i
253240 | 5
;

// Not really shadowing, but let's keep the name consistent with the other commands' tests
shadowingInternal
FROM employees
| SORT emp_no ASC
| KEEP emp_no, first_name, last_name
| DROP last_name, last_name
| LIMIT 2
;

emp_no:integer | first_name:keyword
10001 | Georgi
10002 | Bezalel
;

shadowingInternalWildcard
FROM employees
| SORT emp_no ASC
| KEEP emp_no, first_name, last_name
| DROP last*name, last*name, last*, last_name
| LIMIT 2
;

emp_no:integer | first_name:keyword
10001 | Georgi
10002 | Bezalel
;

subfields
FROM addresses
| DROP city.country.continent.planet.name, city.country.continent.name, city.country.name, number, street, zip_code, city.country.continent.planet.name
| SORT city.name
;

city.country.continent.planet.galaxy:keyword | city.name:keyword
Milky Way | Amsterdam
Milky Way | San Francisco
Milky Way | Tokyo
;

subfieldsWildcard
FROM addresses
| DROP *.name, number, street, zip_code, *ame
;

city.country.continent.planet.galaxy:keyword
Milky Way
Milky Way
Milky Way
;
@@ -69,6 +69,34 @@ ROW left = "left", foo = "foo", client_ip = "172.21.0.5", env = "env", right = "
left:keyword | client_ip:keyword | env:keyword | right:keyword | foo:keyword
;

shadowingSubfields
required_capability: enrich_load
FROM addresses
| KEEP city.country.continent.planet.name, city.country.name, city.name
| EVAL city.name = REPLACE(city.name, "San Francisco", "South San Francisco")
| ENRICH city_names ON city.name WITH city.country.continent.planet.name = airport
| SORT city.name
;

city.country.name:keyword | city.name:keyword | city.country.continent.planet.name:text
Netherlands | Amsterdam | null
United States of America | South San Francisco | San Francisco Int'l
Japan | Tokyo | null
;

shadowingSubfieldsLimit0
required_capability: enrich_load
FROM addresses
| KEEP city.country.continent.planet.name, city.country.name, city.name
| EVAL city.name = REPLACE(city.name, "San Francisco", "South San Francisco")
| ENRICH city_names ON city.name WITH city.country.continent.planet.name = airport
| SORT city.name
| LIMIT 0
;

city.country.name:keyword | city.name:keyword | city.country.continent.planet.name:text
;

shadowingSelf
required_capability: enrich_load
ROW left = "left", client_ip = "172.21.0.5", env = "env", right = "right"
@@ -107,6 +135,46 @@ ROW left = "left", airport = "Zurich Airport ZRH", city = "Zürich", middle = "m
left:keyword | city:keyword | middle:keyword | right:keyword | airport:text | region:text | city_boundary:geo_shape
;

shadowingInternal
required_capability: enrich_load
ROW city = "Zürich"
| ENRICH city_names ON city WITH x = airport, x = region
;

city:keyword | x:text
Zürich | Bezirk Zürich
;

shadowingInternalImplicit
required_capability: enrich_load
ROW city = "Zürich"
| ENRICH city_names ON city WITH airport = region
;

city:keyword | airport:text
Zürich | Bezirk Zürich
;

shadowingInternalImplicit2
required_capability: enrich_load
ROW city = "Zürich"
| ENRICH city_names ON city WITH airport, airport = region
;

city:keyword | airport:text
Zürich | Bezirk Zürich
;

shadowingInternalImplicit3
required_capability: enrich_load
ROW city = "Zürich"
| ENRICH city_names ON city WITH airport = region, airport
;

city:keyword | airport:text
Zürich | Zurich Int'l
;

simple
required_capability: enrich_load
