ESQL: Fix incorrectly optimized fork with nullify unmapped_fields by kanoshiou · Pull Request #143030 · elastic/elasticsearch

kanoshiou · 2026-02-25T09:24:29Z

This PR fixes a bug where `Fork.withSubPlans()` incorrectly reassigned new `NameId`s to its output attributes, breaking references in the upper plan. This issue specifically manifests when using `FORK` alongside the `SET unmapped_fields="nullify"` mode.

By design, a `FORK` assigns new `NameId`s to its output attributes via `refreshOutput()` to decouple them from the internal branches. This isolation is necessary to prevent unintended side effects during plan optimizations, such as aggressive constant folding leaking across branches.

However, the previous implementation unconditionally re-minted these `NameId`s every time `withSubPlans()` was called. Because of this, any node sitting above the `FORK` (like `EVAL` or `STATS`) that already held a reference to the initial `NameId`s would suddenly point to a nonexistent ID. Downstream analysis rules would then fail to resolve these orphaned references, causing the plan execution to fail with an "optimized incorrectly due to missing references" error.
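
For readers less familiar with the planner, the failure mode can be sketched in isolation. This is a minimal illustration, not the Elasticsearch code: `RemintSketch`, `Attr`, `remintAll` and `resolves` are hypothetical names, and a plain `long` stands in for `NameId`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-ins for illustration only; these are not the Elasticsearch classes.
final class RemintSketch {
    private static final AtomicLong NEXT_ID = new AtomicLong();

    /** A named attribute with a unique id (loosely modelled on Attribute + NameId). */
    record Attr(String name, long id) {}

    /** Old behaviour: every rebuild mints a brand-new id for every output attribute. */
    static List<Attr> remintAll(List<String> names) {
        List<Attr> out = new ArrayList<>();
        for (String name : names) {
            out.add(new Attr(name, NEXT_ID.incrementAndGet()));
        }
        return out;
    }

    /** True if some attribute in the output carries the id the upper node refers to. */
    static boolean resolves(long referencedId, List<Attr> output) {
        return output.stream().anyMatch(a -> a.id() == referencedId);
    }

    public static void main(String[] args) {
        List<String> names = List.of("emp_no", "does_not_exist");
        List<Attr> firstOutput = remintAll(names);

        // A node above the fork (say, an EVAL) captures the id of "does_not_exist".
        long heldByUpperNode = firstOutput.get(1).id();

        // A later rule rebuilds the fork with new sub-plans; all ids are minted again.
        List<Attr> rebuiltOutput = remintAll(names);

        System.out.println(resolves(heldByUpperNode, firstOutput));   // true
        System.out.println(resolves(heldByUpperNode, rebuiltOutput)); // false: orphaned reference
    }
}
```

The second lookup fails because nothing in the rebuilt output carries the id that the upper node still holds, which is the shape of the "missing references" failure described above.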

Fixes #142762

elasticsearchmachine · 2026-02-25T12:26:44Z

Pinging @elastic/es-analytical-engine (Team:Analytics)

# Conflicts: # x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/action/EsqlCapabilities.java

astefan · 2026-02-27T08:59:03Z

buildkite test this

kanoshiou · 2026-02-27T11:27:31Z

@astefan, I cannot reproduce the failure, and it doesn't seem to be related to this PR.

…fork-nullify-optimization

astefan · 2026-03-06T10:27:11Z

buildkite test this

alex-spies · 2026-03-06T13:01:59Z

Heya, in #141340, I'm adding a test that is failing due to this in GenerativeForkIT (build scan).

I'll have to mute it in the other PR and it'd be super nice to unmute as part of this PR. In any case, will put the mute into muted-tests.yml and point to the corresponding issue this fixes, #142762.

Update: muting in c85e691

…fork-nullify-optimization

astefan · 2026-03-11T16:51:36Z

buildkite test this

astefan · 2026-03-11T17:17:13Z

@alex-spies I've unmuted that test. It didn't complain in my local runs. We'll see what CI says.

alex-spies

Thanks a lot @kanoshiou and @astefan !

The fix for Bug 2 was independently discovered by @idegtiarenko while fixing #141870. Apologies that I didn't spot that we're fixing the same issue in both PRs.

The fix for Bug 1 (keeping name ids for existing fork attributes while refreshing) looks super good to me. Thanks a lot!

alex-spies · 2026-03-11T18:45:48Z

...k/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/analysis/rules/ResolveUnmapped.java

+        Set<String> childOutputNames = new HashSet<>();
+        for (LogicalPlan child : plan.children()) {
+            for (Attribute attr : child.output()) {
+                childOutputNames.add(attr.name());
+            }
+        }
+        unresolved.removeIf(ua -> childOutputNames.contains(ua.name()));
+
+        if (unresolved.isEmpty()) {
+            return plan;
+        }


Thanks, I agree that this is the right solution!

And I'm super sorry I didn't notice this earlier @astefan . Bug 2 from the PR description is indeed the same problem that @idegtiarenko fixed in #142300, and you two ended up implementing the very same solution.

The silver lining is that we very much agree on the solution. The added tests in this PR are great, too, as they highlight different queries where ImplicitCasting can introduce an intermediate step between ResolveRefs and ResolveUnmapped.
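
The filtering step in the diff above can be tried out in isolation. The sketch below is a simplification under stated assumptions: attributes are reduced to plain strings, and `UnmappedFilterSketch` and `trulyUnmapped` are hypothetical names that do not exist in `ResolveUnmapped`.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical simplification: attributes are reduced to plain names.
final class UnmappedFilterSketch {
    /**
     * Returns the unresolved names that no child already outputs. Only these
     * remain candidates for being treated as unmapped fields.
     */
    static List<String> trulyUnmapped(List<String> unresolved, List<List<String>> childOutputs) {
        Set<String> childOutputNames = new HashSet<>();
        for (List<String> output : childOutputs) {
            childOutputNames.addAll(output);
        }
        // Copy first so the caller's list is left untouched.
        List<String> remaining = new ArrayList<>(unresolved);
        remaining.removeIf(childOutputNames::contains);
        return remaining;
    }

    public static void main(String[] args) {
        List<List<String>> childOutputs = List.of(List.of("emp_no", "salary"));
        // "salary" is still unresolved but a child provides it, so it is not unmapped.
        System.out.println(trulyUnmapped(List.of("salary", "does_not_exist"), childOutputs));
        // prints [does_not_exist]
    }
}
```

A name that a child already outputs is left for a later resolution pass to pick up, so only names that no child provides are nullified.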

alex-spies · 2026-03-11T18:48:37Z

x-pack/plugin/esql/qa/testFixtures/src/main/resources/unmapped-nullify.csv-spec

+required_capability: optional_fields_nullify_tech_preview
+required_capability: fork_v9
+required_capability: fix_fork_unmapped_nullify


nit: I think we need fork_v9 to exclude this from CsvTests, but we should only require 1 of optional_fields_nullify_tech_preview and fix_fork_unmapped_nullify, no?

Also applies below.

alex-spies · 2026-03-11T18:49:17Z

x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/action/EsqlCapabilities.java

+         * Bug 2: After {@code ImplicitCasting} runs, the plan may remain unresolved because it requires a
+         * subsequent {@code ResolveRefs} pass to fully resolve. However, {@code ResolveUnmapped} runs before
+         * that second {@code ResolveRefs} pass and mistakenly treats those still-unresolved attributes as
+         * user-introduced unmapped fields, incorrectly nullifying valid references.


nit: unfortunately, this got stale. Bug 2 is already fixed.

alex-spies · 2026-03-11T18:50:03Z

x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/action/EsqlCapabilities.java

        FIX_FULL_TEXT_FUNCTIONS_ON_RENAMED_FIELDS,

+        /**
+         * Fixes two independent analysis bugs in {@code FORK} with {@code unmapped_fields="nullify"}.


super nit: references to the to-be-closed issues (#142762 and #142543) are always nice in capabilities.

alex-spies · 2026-03-11T19:06:04Z

x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/core/expression/Expressions.java

+     * matches one in {@code existingOutput}. Genuinely new attributes get fresh NameIds.
     */
-    public static List<Attribute> toReferenceAttributes(List<? extends NamedExpression> named) {
+    public static List<Attribute> toReferenceAttributes(List<? extends NamedExpression> named, List<Attribute> existingOutput) {


nit: method name doesn't really say what this does, now. Maybe toReferenceAttributesPreservingIds?
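
To make the intended behaviour of the renamed helper concrete, here is a rough sketch. It assumes attributes are matched by name only, which the truncated Javadoc above does not fully confirm, and every identifier in it (`PreserveIdsSketch`, `Attr`, `freshId`, `preservingIds`) is hypothetical rather than taken from `Expressions.java`.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-ins; matching by name alone is an assumption of this sketch.
final class PreserveIdsSketch {
    private static final AtomicLong NEXT_ID = new AtomicLong();

    record Attr(String name, long id) {}

    static long freshId() {
        return NEXT_ID.incrementAndGet();
    }

    /**
     * Builds the fork output from the branch output. An attribute whose name is
     * already in existingOutput keeps that id; anything else gets a fresh id.
     * Branch ids are never copied into the result.
     */
    static List<Attr> preservingIds(List<Attr> branchOutput, List<Attr> existingOutput) {
        Map<String, Long> idsByName = new HashMap<>();
        for (Attr attr : existingOutput) {
            idsByName.putIfAbsent(attr.name(), attr.id());
        }
        List<Attr> out = new ArrayList<>();
        for (Attr attr : branchOutput) {
            Long existing = idsByName.get(attr.name());
            out.add(new Attr(attr.name(), existing != null ? existing : freshId()));
        }
        return out;
    }

    public static void main(String[] args) {
        List<Attr> branch = List.of(new Attr("emp_no", freshId()), new Attr("x", freshId()));
        List<Attr> first = preservingIds(branch, List.of());

        // The branches are rebuilt and now also output "y".
        List<Attr> rebuiltBranch = List.of(
            new Attr("emp_no", freshId()), new Attr("x", freshId()), new Attr("y", freshId()));
        List<Attr> second = preservingIds(rebuiltBranch, first);

        System.out.println(second.get(0).id() == first.get(0).id());         // true: id kept
        System.out.println(second.get(2).id() != rebuiltBranch.get(2).id()); // true: new id, not the branch's
    }
}
```

In this sketch the result only ever contains ids from the fork's own previous output or freshly minted ones, so the output stays decoupled from the ids used inside the branches.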

# Conflicts: # x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/analysis/rules/ResolveUnmapped.java # x-pack/plugin/esql/src/test/java/org/elasticsearch/xpack/esql/analysis/AnalyzerUnmappedTests.java

kanoshiou · 2026-03-12T08:36:18Z

@astefan conflicts resolved

astefan · 2026-03-12T08:38:08Z

buildkite test this

…fork-nullify-optimization

astefan · 2026-03-12T10:07:19Z

buildkite test this

…fork-nullify-optimization

astefan · 2026-03-12T11:02:23Z

buildkite test this

bpintea

Left a note. But if the CI is happy, LGTM.
Also, would update the PR description:

Because NameId equality drives attribute identity across the whole plan tree

That's not quite correct -- see comment.

bpintea · 2026-03-12T11:22:41Z

x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/plan/logical/Fork.java


    protected List<Attribute> refreshedOutput() {
-        return toReferenceAttributes(outputUnion(children()));
+        return toReferenceAttributesPreservingIds(outputUnion(children()), this.output());


Hmm, this no longer refreshes the IDs. Meaning that some assumptions are either broken, or they were incorrect to have, or they're no longer valid (in the meantime).

@bpintea can you expand on this, please? What use cases would this impact/break/change? (are we missing tests?)

bpintea · 2026-03-12T11:25:46Z

x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/plan/logical/Fork.java

-        // We don't want to keep the same attributes that are outputted by the FORK branches.
-        // Keeping the same attributes can have unintended side effects when applying optimizations like constant folding.


This shouldn't be removed. The attribute IDs from within the branches are not kept / reused atop Fork.

kanoshiou · 2026-03-12T11:44:51Z

Thanks for the review, @bpintea! I've updated the PR description to reflect your feedback.

elasticsearchmachine · 2026-03-12T12:29:23Z

💔 Backport failed

Status	Branch	Result
❌	9.3	Commit could not be cherrypicked due to conflicts

You can use sqren/backport to manually backport by running `backport --upstream elastic/elasticsearch --pr 143030`

kanoshiou · 2026-03-12T12:33:30Z

I can help with the backport.

kanoshiou · 2026-03-12T12:57:31Z

To backport this PR, I think we need to wait for #144097 to be merged first.

alex-spies · 2026-03-13T13:08:35Z

To backport this PR, I think we need to wait for #144097 to be merged first.

Thanks @kanoshiou !

It's merged!

We may still see conflicts because #143399 was not backported. I don't think it should be, though, as that one required a new transport version, which increases the cost of a backport a bit.

kanoshiou · 2026-03-13T14:18:33Z

💚 All backports created successfully

Status	Branch	Result
✅	9.3

kanoshiou · 2026-03-13T14:35:36Z

💚 All backports created successfully

Status	Branch	Result
✅	9.3

…astic#143030) This PR fixes a bug where `Fork.withSubPlans()` incorrectly reassigned new `NameId`s to its output attributes, breaking references in the upper plan. This issue specifically manifests when using `FORK` alongside the `SET unmapped_fields="nullify"` mode. By design, a `FORK` assigns new `NameId`s to its output attributes via `refreshOutput()` to decouple them from the internal branches. This isolation is necessary to prevent unintended side effects during plan optimizations, such as aggressive constant folding leaking across branches. However, the previous implementation unconditionally re-minted these `NameId`s every time `withSubPlans()` was called. Because of this, any node sitting above the `FORK` (like `EVAL` or `STATS`) that already held a reference to the initial `NameId`s would suddenly point to a nonexistent ID. Downstream analysis rules would then fail to resolve these orphaned references, causing the plan execution to fail with an *"optimized incorrectly due to missing references"* error. Fixes elastic#142762 (cherry picked from commit 5fb7136) # Conflicts: # muted-tests.yml # x-pack/plugin/esql/qa/server/src/main/java/org/elasticsearch/xpack/esql/qa/rest/generative/GenerativeRestTest.java # x-pack/plugin/esql/qa/testFixtures/src/main/resources/unmapped-nullify.csv-spec # x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/action/EsqlCapabilities.java # x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/analysis/rules/ResolveUnmapped.java # x-pack/plugin/esql/src/test/java/org/elasticsearch/xpack/esql/analysis/AnalyzerUnmappedTests.java

…43030) (#144386) This PR fixes a bug where `Fork.withSubPlans()` incorrectly reassigned new `NameId`s to its output attributes, breaking references in the upper plan. This issue specifically manifests when using `FORK` alongside the `SET unmapped_fields="nullify"` mode. By design, a `FORK` assigns new `NameId`s to its output attributes via `refreshOutput()` to decouple them from the internal branches. This isolation is necessary to prevent unintended side effects during plan optimizations, such as aggressive constant folding leaking across branches. However, the previous implementation unconditionally re-minted these `NameId`s every time `withSubPlans()` was called. Because of this, any node sitting above the `FORK` (like `EVAL` or `STATS`) that already held a reference to the initial `NameId`s would suddenly point to a nonexistent ID. Downstream analysis rules would then fail to resolve these orphaned references, causing the plan execution to fail with an *"optimized incorrectly due to missing references"* error. Fixes #142762 (cherry picked from commit 5fb7136) Co-authored-by: kanoshiou <uiaao@tuta.io>

Fix analysis bugs in FORK with unmapped-nullify sub-plans

02f92ee

elasticsearchmachine added needs:triage Requires assignment of a team area label v9.4.0 external-contributor Pull request authored by a developer outside the Elasticsearch team labels Feb 25, 2026

Update docs/changelog/143030.yaml

7df7ea6

alex-spies added the :Analytics/ES|QL AKA ESQL label Feb 25, 2026

elasticsearchmachine added Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) and removed needs:triage Requires assignment of a team area label labels Feb 25, 2026

kanoshiou added 3 commits February 25, 2026 20:32

unmute

d292b63

Merge branch 'main' into fork-nullify-optimization

b803604

Merge branch 'main' into fork-nullify-optimization

dcc1b98

# Conflicts: # x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/action/EsqlCapabilities.java

astefan added the >bug label Feb 27, 2026

Merge branch 'main' into fork-nullify-optimization

1449883

astefan self-assigned this Mar 4, 2026

astefan added 2 commits March 4, 2026 17:58

Don't duplicate the analyzer rule (sub-optimal solution)

307bfe0

Merge branch 'main' of https://github.com/elastic/elasticsearch into …

610f4f9

…fork-nullify-optimization

astefan requested review from alex-spies and bpintea March 6, 2026 11:29

astefan added 4 commits March 11, 2026 13:00

Merge branch 'main' of https://github.com/elastic/elasticsearch into …

ac5eed1

…fork-nullify-optimization

Update (fix) tests

c1a58cc

Merge branch 'main' of https://github.com/elastic/elasticsearch into …

41be8c4

…fork-nullify-optimization

Unmute test

a6c6677

alex-spies approved these changes Mar 11, 2026

astefan added auto-merge-without-approval Automatically merge pull request when CI checks pass (NB doesn't wait for reviews!) auto-backport Automatically create backport pull requests when merged labels Mar 12, 2026

Merge branch 'main' into fork-nullify-optimization

9525d79

# Conflicts: # x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/analysis/rules/ResolveUnmapped.java # x-pack/plugin/esql/src/test/java/org/elasticsearch/xpack/esql/analysis/AnalyzerUnmappedTests.java

Merge branch 'main' of https://github.com/elastic/elasticsearch into …

3c2398c

…fork-nullify-optimization

Merge branch 'main' of https://github.com/elastic/elasticsearch into …

47e8c56

…fork-nullify-optimization

bpintea approved these changes Mar 12, 2026

elasticsearchmachine merged commit 5fb7136 into elastic:main Mar 12, 2026
38 checks passed

elasticsearchmachine added the backport pending label Mar 12, 2026

astefan mentioned this pull request Mar 12, 2026

ESQL: plan optimized incorrectly with unmapped_fields nullify, FORK and RENAME #144094

Closed

astefan mentioned this pull request Mar 12, 2026

ESQL: unmapped_fields nullify leads to conflicting data types with FORK and missing references #142543

Closed

kanoshiou mentioned this pull request Mar 13, 2026

[9.3] ESQL: Fix incorrectly optimized fork with nullify unmapped_fields (#143030) #144200

Closed

kanoshiou mentioned this pull request Mar 13, 2026

[9.3] ESQL: Fix incorrectly optimized fork with nullify unmapped_fields (#143030) #144205

Closed

astefan mentioned this pull request Mar 17, 2026

ESQL: Fix incorrectly optimized fork with nullify unmapped_fields (#143030) #144386

Merged

astefan removed the backport pending label Mar 18, 2026

6 participants
