ESQL: Fix injected attributes' IDs in UnionAll branches #141262
bpintea merged 6 commits into elastic:main from
Conversation
This fixes the generation of name IDs for the attributes that correspond to unmapped fields and are pushed into the different branches of UnionAll. So far, one set of IDs was generated and reused for all subplans. This is now updated to an individual set per subplan. A minor collateral proposed change: the CSV spec-based tests skipped due to missing capabilities are now logged.
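As a rough illustration of the gist of the fix, here is a minimal, self-contained sketch with made-up types and a plain counter (not the actual ESQL name-ID machinery): the injected attributes now get a freshly generated set of IDs for every UnionAll branch instead of one shared set.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

class PerBranchIdsSketch {
    // Simplified stand-in for an injected attribute: a name plus a name ID.
    record InjectedAttr(String name, long id) {}

    static final AtomicLong ID_GENERATOR = new AtomicLong();

    // One fresh set of IDs per branch, instead of generating the IDs once and
    // reusing them across all subplans.
    static List<List<InjectedAttr>> injectPerBranch(List<String> unmappedFields, int branchCount) {
        List<List<InjectedAttr>> perBranch = new ArrayList<>(branchCount);
        for (int branch = 0; branch < branchCount; branch++) {
            List<InjectedAttr> attrs = new ArrayList<>(unmappedFields.size());
            for (String field : unmappedFields) {
                attrs.add(new InjectedAttr(field, ID_GENERATOR.incrementAndGet()));
            }
            perBranch.add(attrs);
        }
        return perBranch;
    }
}
```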
Pinging @elastic/es-analytical-engine (Team:Analytics)

Hi @bpintea, I've created a changelog YAML for you.
    }

-   private static List<FieldAttribute> fieldsToLoad(List<UnresolvedAttribute> unresolved, Set<String> exclude) {
+   private static List<FieldAttribute> fieldsToLoad(Set<UnresolvedAttribute> unresolved, List<String> exclude) {
Isn't this a mistake? I would have expected the List to be the iteratee, and the Set to be the one we check contains on, but it seems to be the other way around.
It's a Set because the initial collection of UnresolvedAttributes is dedup'd -- this is what unresolvedLinkedSet() produces (// Some plans may reference the same UA multiple times (Aggregate groupings in aggregates, Eval): dedupe)
It's a List because that's what EsRelation#output (and then Expressions#names) produces.
What we want here is to exclude those attributes produced by the EsRelation itself, into which we would then later inject/insert the extractors.
Not sure if it's worth instantiating new collection types to wrap the existing ones.
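For illustration, a minimal sketch of the shape discussed above, with hypothetical stand-in types rather than the actual ESQL classes: the dedup'd (linked) set of unresolved attributes is iterated, and anything already produced by the EsRelation (a list of names) is excluded.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

class FieldsToLoadSketch {
    // Hypothetical stand-ins for the ESQL attribute types; not the actual classes.
    record UnresolvedAttr(String name) {}
    record FieldAttr(String name) {}

    static List<FieldAttr> fieldsToLoad(Set<UnresolvedAttr> unresolved, List<String> exclude) {
        List<FieldAttr> toLoad = new ArrayList<>();
        for (UnresolvedAttr ua : unresolved) {          // iterate the dedup'd (linked) set
            if (exclude.contains(ua.name()) == false) { // skip what the EsRelation already outputs
                toLoad.add(new FieldAttr(ua.name()));
            }
        }
        return toLoad;
    }
}
```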
     */
    private static Fork patchFork(Fork fork) {
        List<LogicalPlan> newChildren = new ArrayList<>(fork.children().size());
        Holder<Boolean> changed = new Holder<>(false);
Could you add a comment explaining the difference between `changed` and `patched`, since those names are too similar (or rename them so it's more obvious).
Renamed changed to childrenChanged
        unresolved.forEach(u -> aliasesMap.computeIfAbsent(u.name(), k -> nullAlias(u)));
        return new ArrayList<>(aliasesMap.values());
    private static List<Alias> nullAliases(Set<UnresolvedAttribute> unresolved) {
        List<Alias> aliases = new ArrayList<>(unresolved.size());
Why not use addAll (or even just a basic map, for that matter)?
Not sure I understand how addAll would help. A map could, but I find streams too heavy for a relatively simple iteration. But let me know if I misunderstood your suggestion.
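A small sketch of the two shapes being weighed here, using hypothetical stand-in types and a made-up `nullAlias` helper: a plain loop (roughly the shape of the change) versus a stream-based alternative.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;

class NullAliasesSketch {
    record Attr(String name) {}
    record Alias(String name, Object value) {}

    // hypothetical helper, mirroring the nullAlias(..) mentioned in the thread
    static Alias nullAlias(Attr a) {
        return new Alias(a.name(), null);
    }

    // plain loop, roughly the shape of the change under review
    static List<Alias> nullAliasesLoop(LinkedHashSet<Attr> unresolved) {
        List<Alias> aliases = new ArrayList<>(unresolved.size());
        for (Attr a : unresolved) {
            aliases.add(nullAlias(a));
        }
        return aliases;
    }

    // stream-based alternative the review hints at
    static List<Alias> nullAliasesStream(LinkedHashSet<Attr> unresolved) {
        return unresolved.stream().map(NullAliasesSketch::nullAlias).toList();
    }
}
```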
@@ -63,7 +63,10 @@
import static org.hamcrest.Matchers.hasItems;
Newly merged golden tests? 😁
Great. I'll extend in a subsequent PR, since this isn't going to be the only one.
     * are tested in integration tests.
     */
-   assumeFalse(
+   assumeFalseLogging(
- Since you're already touching this, perhaps change the logging from "X is not supported" to "<capability> is not enabled"?
- And if you do, you can just define a single nice helper function to check if a capability is enabled!
- Not sure, I personally find the existing "CSV tests cannot currently..." or "... in csv tests" messages better, tbh, since it's not about a capability not being enabled (it actually is enabled, and that's the "problem"), but the CSV testing infrastructure not being developed enough to support a new feature, right? But will let the others chime in as well.
        if (descendantOutputsAttribute(project, attribute) == false) {
            nullAliases.add(nullAlias(attribute));
        }
    private static Project patchForkProject(Project project, Holder<Boolean> changed) {
- Please add a comment explaining what this method does.
- Why do you need a Holder here? Can't you just check in the parent if the reference has changed?
- I've added a comment.
- Thanks, fixed (got like that through iterations).
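A minimal sketch of the alternative hinted at above, with illustrative types rather than the real plan classes: the parent compares references to detect a change instead of threading a mutable `Holder<Boolean>` flag.

```java
import java.util.ArrayList;
import java.util.List;

class PatchSketch {
    // Illustrative stand-ins for the plan nodes; not the real classes.
    interface Plan {}
    record Fork(List<Plan> children) implements Plan {}

    static Plan patchFork(Fork fork) {
        List<Plan> newChildren = new ArrayList<>(fork.children().size());
        boolean childrenChanged = false;
        for (Plan child : fork.children()) {
            Plan patched = patchChild(child);
            childrenChanged |= patched != child; // reference comparison replaces the Holder<Boolean>
            newChildren.add(patched);
        }
        return childrenChanged ? new Fork(newChildren) : fork;
    }

    static Plan patchChild(Plan child) {
        return child; // placeholder: the real code would rewrite Fork's Project children
    }
}
```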
        if (projectOutput.equals(childOutput) == false) {
            List<Attribute> delta = new ArrayList<>(childOutput);
            delta.removeAll(projectOutput);
            project = project.withProjections(mergeOutputAttributes(delta, projectOutput));
Could we please avoid this pattern of reassigning the input parameter and then returning it outside the block? Just use an early exit above.
This is a pre-existing pattern. Some folks find it easier to read code with fewer returns. (Myself, I don't necessarily, but I don't mind this style either).
> returning it outside the block
...reason being: if the control hasn't visited the block, the input is simply returned with no change.
I personally find it unfathomable that it's harder to read code with more early exits than it is to read code with more changes of the input variable, but to each their own I guess 🥲.
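To make the two styles in this exchange concrete, an illustrative sketch with simplified types (not the actual `Project`/`Attribute` classes):

```java
import java.util.List;

class EarlyExitSketch {
    record Project(List<String> projections) {
        Project withProjections(List<String> p) {
            return new Project(p);
        }
    }

    // style used in the change: reassign the parameter, single return at the end
    static Project reassignStyle(Project project, List<String> childOutput) {
        if (project.projections().equals(childOutput) == false) {
            project = project.withProjections(childOutput);
        }
        return project;
    }

    // style the reviewer prefers: early exit when there is nothing to do
    static Project earlyExitStyle(Project project, List<String> childOutput) {
        if (project.projections().equals(childOutput)) {
            return project;
        }
        return project.withProjections(childOutput);
    }
}
```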
    }

    // Some plans may reference the same UA multiple times (Aggregate groupings in aggregates, Eval): dedupe
    private static Set<UnresolvedAttribute> unresolvedLinkedSet(List<UnresolvedAttribute> unresolved) {
I think the return type here should be LinkedHashSet or at least SequencedSet, given the method name. If you opt for the latter, then consider also renaming this method.
> I think the return type here should be LinkedHashSet
Updated.
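A tiny sketch of the point about the declared return type, with a hypothetical attribute record standing in for `UnresolvedAttribute`:

```java
import java.util.LinkedHashSet;
import java.util.List;

class DedupeSketch {
    record UnresolvedAttr(String name) {}

    // Dedupe while preserving encounter order. Declaring the return type as
    // LinkedHashSet (or SequencedSet) makes the ordering guarantee visible to
    // callers; a plain Set does not.
    static LinkedHashSet<UnresolvedAttr> unresolvedLinkedSet(List<UnresolvedAttr> unresolved) {
        return new LinkedHashSet<>(unresolved);
    }
}
```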
astefan left a comment
LGTM with two questions.
        var projectOutput = project.output();
        var childOutput = project.child().output();
        if (projectOutput.equals(childOutput) == false) {
- for a reviewer it would have been easier to assess whether this `equals` could be tricky or not by seeing the actual type of the `.output()`. In the IDE this type is `List`.
- is there a scenario where this `equals` is missed because the same elements exist in both lists but in a different order?
- I've updated the declarations.
- There could be, yes. But in this case the `delta` list in the branch will be empty. The project will still be recreated, but the resulting instance will be equal to the previous one and the operation will eventually either leave the plan unchanged or changed, but due to other modifications. In any case, there should be a guard against that empty `delta` list, to avoid creating a new, unnecessary instance equal to the previous one -- thanks.

Since these changes aren't functionally impacting, I'd apply them in a follow-up PR (unless other changes will be required), if OK with you?
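A sketch of the guard mentioned above, again with simplified stand-in types and a made-up `merge` helper: only rebuild the `Project` when the computed delta is non-empty.

```java
import java.util.ArrayList;
import java.util.List;

class DeltaGuardSketch {
    record Project(List<String> projections) {
        Project withProjections(List<String> p) {
            return new Project(p);
        }
    }

    static Project patch(Project project, List<String> projectOutput, List<String> childOutput) {
        if (projectOutput.equals(childOutput) == false) {
            List<String> delta = new ArrayList<>(childOutput);
            delta.removeAll(projectOutput);
            if (delta.isEmpty() == false) { // guard: same elements in a different order produce no delta
                project = project.withProjections(merge(delta, projectOutput));
            }
        }
        return project;
    }

    // simplified stand-in for mergeOutputAttributes
    static List<String> merge(List<String> delta, List<String> output) {
        List<String> merged = new ArrayList<>(output);
        merged.addAll(delta);
        return merged;
    }
}
```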
        Set<AbstractConvertFunction> converts = oldOutputToConvertFunctions.get(oldAttr.name());
        if (converts != null) {
Why not the `contains` and `.get` approach? I see that for the code that reaches this part it is fairly safe to assume that there won't be any null sets returned for a key, but we are not sure how this code will evolve in the future.
This isn't strictly related, but I thought I might need an update of this code and spotted the pattern.
There's nothing wrong from the functional PoV, but the code, as it was, checks if the key is in the map, then does it again, but fetching the corresponding value. The code in the proposed change only does the latter. If the result/value is null, the key isn't in there. (Well, I guess it could [have] be[en] a null value, to be exact, but that would have resulted in an NPE by now(?))
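For reference, the two map-lookup patterns being compared, in a self-contained sketch:

```java
import java.util.Map;
import java.util.Set;

class MapLookupSketch {
    // containsKey followed by get: two lookups for the same key
    static void containsThenGet(Map<String, Set<String>> map, String key) {
        if (map.containsKey(key)) {
            Set<String> values = map.get(key);
            use(values);
        }
    }

    // single get with a null check: one lookup, same behavior as long as
    // null values are never stored in the map
    static void getThenCheck(Map<String, Set<String>> map, String key) {
        Set<String> values = map.get(key);
        if (values != null) {
            use(values);
        }
    }

    static void use(Set<String> values) {}
}
```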
-                   convert.replaceChildren(Collections.singletonList(oldAttr))
+                   convert.replaceChildren(Collections.singletonList(oldAttr)),
+                   null, // generate a new id
+                   true // this'll be used to Project the synthetic attributes out when finishing analysis
-   | KEEP emp_*
+   | KEEP emp_no, *
This is now testing something else, but the original test was a valid case. Do we want to add the original one back?
     * null[INTEGER] AS salary#35]]
     * \_Subquery[]
     * \_Filter[TOLONG(does_not_exist1{r}#20) > 1[INTEGER]]
     * \_Eval[[null[NULL] AS does_not_exist1#20, null[NULL] AS does_not_exist2#51]]
The comment reflects that we now don't re-use name ids from subquery branches; it's asserted in the test above, but not here - shouldn't we also assert the name ids for does_not_exist2 in this case?
     * \_Aggregate[[],[COUNT(*[KEYWORD],true[BOOLEAN],PT0S[TIME_DURATION]) AS c#4]]
     * \_Eval[[null[NULL] AS does_not_exist#53]]
     * \_EsRelation[employees][_meta_field{f}#24, emp_no{f}#18, first_name{f}#19, .
     * \_Eval[[null[NULL] AS does_not_exist#54]]
This is one of those funny cases where we might wrongly break a plan by adding this EVAL does_not_exist = null here - the field might actually exist and the downstream STATS might actually be computing COUNT(does_not_exist) (even though the field does_not_exist is not in the output of this subquery).
Added to #138888
Adding in #141340 (removeShadowing). This has been implemented for two out of three cases (the "load" case and adding the null-aliasing to an existing Eval), but was missing when adding a new Eval on top of a source.
Note however that this effort is mostly for producing a correct, but otherwise later still failing plan: if the null-aliasing is injected below a STATS that doesn't export the attribute (which is why the null-aliasing is done in the first place), the attribute will remain missing and the verification later will fail the query.
This is what the initially introduced testFailStatsThenKeep or testFailStatsThenEval test.
     * \_Eval[[TOLONG(does_not_exist2{r}#74) AS $$does_not_exist2$converted_to$long#78]]
     * \_Eval[[TOLONG(does_not_exist1{r}#26) AS $$does_not_exist1$converted_to$long#69]]
     * \_Eval[[null[KEYWORD] AS _meta_field#42, null[INTEGER] AS emp_no#43, null[KEYWORD] AS first_name#44,
     * null[TEXT] AS gender#45, null[DATETIME] AS hire_date#46, null[TEXT] AS job#47, null[KEYWORD] AS job.raw#48,
     * null[INTEGER] AS languages#49, null[KEYWORD] AS last_name#50, null[LONG] AS long_noidx#51,
     * null[INTEGER] AS salary#52]]
     * \_Subquery[]
     * \_Filter[TOLONG(does_not_exist1{r}#26) > 2[INTEGER]]
     * \_Eval[[null[NULL] AS does_not_exist1#26, null[NULL] AS does_not_exist2#71]]
This subquery branch used to be inconsistent: does_not_exist2 is added with name id 71 after the esrelation, but it's used to define $$does_not_exist2$converted_to$long later while being referenced under id 74.
We're not currently asserting correct name ids. Maybe we can borrow the dependency checker for this? (It might also become a part of the analyzer/verifier pipeline as part of #137362, but we don't have this yet)
                var unresolvedLinkedSet = unresolvedLinkedSet(unresolved);

-               var transformed = load ? load(plan, unresolved) : nullify(plan, unresolved);
+               var transformed = load ? load(plan, unresolvedLinkedSet) : nullify(plan, unresolvedLinkedSet);
If the unresolveds being in a linked (thus order-preserving) set is important, should the signature of load and nullify require a LinkedHashSet rather than a Set? We shouldn't be implicitly relying on a linked set if the compiler can guarantee this for us.
        Map<String, Alias> aliasesMap = new LinkedHashMap<>(unresolved.size());
        unresolved.forEach(u -> aliasesMap.computeIfAbsent(u.name(), k -> nullAlias(u)));
        return new ArrayList<>(aliasesMap.values());
    private static List<Alias> nullAliases(Set<UnresolvedAttribute> unresolved) {
Same here: If the output list is supposed to be stable, I think we should explicitly require a LinkedHashSet.
@alex-spies will address the points in the next PR (along with the ones Andrei raised). BTW, will follow up on some of the points left on the closed PR too. Gal, Andrei, Alex, thanks for the reviews and pointers!
💔 Backport failed
You can use sqren/backport to manually backport by running
    private static void checkMissingFork(QueryPlan<?> plan, Failures failures) {
        for (QueryPlan<?> child : plan.children()) {
            // TODO: this checks the set-semantics, but not the ordering
@ioanatia do we want a simple iteration over a subplan's output to check that, at the same position, there's an equality of name and type with that of the fork's output? Can there be set-equality, but not the same order?
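If the ordering check turns out to be wanted, a sketch of the simple position-wise iteration described above (with a simplified attribute record, not the actual `Attribute` class):

```java
import java.util.List;

class OutputOrderCheckSketch {
    // Illustrative attribute stand-in: name plus type.
    record Attr(String name, String type) {}

    // Position-wise check: same size, and at each index the name and type match.
    // Set-equality alone would accept the same attributes in a different order.
    static boolean sameOutputInOrder(List<Attr> forkOutput, List<Attr> subplanOutput) {
        if (forkOutput.size() != subplanOutput.size()) {
            return false;
        }
        for (int i = 0; i < forkOutput.size(); i++) {
            Attr expected = forkOutput.get(i);
            Attr actual = subplanOutput.get(i);
            if (expected.name().equals(actual.name()) == false
                || expected.type().equals(actual.type()) == false) {
                return false;
            }
        }
        return true;
    }
}
```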
💚 All backports created successfully
Questions? Please refer to the Backport tool documentation
) This fixes the generation of name IDs for the attributes that correspond to unmapped fields and are pushed into different branches of `UnionAll`. So far, one set of IDs was generated and reused for all subplans. This is now updated to an individual set per subplan. Along the change, the handling of `Fork` in `ResolveUnmapped` has been somewhat simplified. Also, more unit tests have been completed (where the plans are simple enough) and the plan comments updated to replace `EsqlProject` with the now merged `Project`. A minor collateral proposed change: the CSV spec-based tests skipped due to missing capabilities are now logged. (cherry picked from commit 8e3113c)
…) (#141675)

* ESQL: Fix injected attributes' IDs in UnionAll branches (#141262)

  This fixes the generation of name IDs for the attributes that correspond to unmapped fields and are pushed into different branches of `UnionAll`. So far, one set of IDs was generated and reused for all subplans. This is now updated to an individual set per subplan. Along the change, the handling of `Fork` in `ResolveUnmapped` has been somewhat simplified. Also, more unit tests have been completed (where the plans are simple enough) and the plan comments updated to replace `EsqlProject` with the now merged `Project`. A minor collateral proposed change: the CSV spec-based tests skipped due to missing capabilities are now logged.

  (cherry picked from commit 8e3113c)

* Fix tests

  9.3 does not have #139058, so the implicit limits at the top of subquery branches are still in place. Adjust the expectations accordingly.

* Checkstyle

---------

Co-authored-by: Bogdan Pintea <bogdan.pintea@elastic.co>
This fixes the generation of name IDs for the attributes that correspond to unmapped fields and are pushed into different branches of `UnionAll`. So far, one set of IDs was generated and reused for all subplans. This is now updated to an individual set per subplan. Along the change, the handling of `Fork` in `ResolveUnmapped` has been somewhat simplified. Also, more unit tests have been completed (where the plans are simple enough) and the plan comments updated to replace `EsqlProject` with the now merged `Project`. A minor collateral proposed change: the CSV spec-based tests skipped due to missing capabilities are now logged.