ESQL: introduce support for mapping-unavailable fields (Fork from #139417)#140463
Conversation
This will allow re-evaluating the output past a RENAME/DROP/KEEP, once an unmapped field is injected.
var outputNames = eval.outputSet().names();
var evalRefNames = eval.references().names();
for (Alias a : nullAliases) {
    if (outputNames.contains(a.name()) == false) {
We only check the output of the existing eval - not the output of the leaf below it. This can sometimes put the plan into an inconsistent state during resolution, where properly resolved references point to a field that was shadowed. It's probably fine, because that should only happen when the plan was invalid to begin with.
Example: Consider a field foo that exists in the index
SET unmapped_fields="nullify"; from test | where foo > 1 | drop foo | where foo > 2
[2026-01-12T14:21:39,991][TRACE][o.e.x.e.a.A.changes ] [runTask-0] Rule rules.ResolveUnmapped applied with change
Filter[?foo > 2[INTEGER]] = Filter[?foo > 2[INTEGER]]
\_ResolvingProject[org.elasticsearch.xpack.esql.analysis.Analyzer$ResolveRefs$$Lambda/0x000000002b759490@5c6fd1e4,[bar{f}#130]] = \_ResolvingProject[org.elasticsearch.xpack.esql.analysis.Analyzer$ResolveRefs$$Lambda/0x000000002b759490@5c6fd1e4,[bar{f}#130]]
\_Filter[foo{f}#131 > 1[INTEGER]] = \_Filter[foo{f}#131 > 1[INTEGER]]
\_EsRelation[test][bar{f}#130, foo{f}#131] ! \_Eval[[null[NULL] AS foo#132]]
! \_EsRelation[test][bar{f}#130, foo{f}#131]
The first filter where foo > 1 first resolves correctly, but then ResolveUnmapped finds that foo is missing downstream - and adds an Eval that shadows the correctly resolved foo, making the filter use a missing column.
This could lead to wrong error messages if we start checking for plan consistency during analyzer runs already - so it's best to add a test.
Added a follow-up item to #138888.
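The inconsistency described above can be sketched with a toy model (the `Attr` record and the method names here are hypothetical stand-ins, not the real `Attribute` classes): attributes are identified by name *and* id, the filter resolves `foo` to one id, and the later-injected null Eval shadows it with a fresh id, so the filter's reference no longer appears in its child's output.

```java
import java.util.List;

public class ShadowingSketch {
    record Attr(String name, int id) {}

    // Child output after ResolveUnmapped injects "Eval[null AS foo#132]" below
    // the already-resolved filter: foo#131 is now shadowed by foo#132.
    static List<Attr> outputAfterInjection() {
        return List.of(new Attr("bar", 130), new Attr("foo", 132));
    }

    // The filter still holds the reference it resolved before the injection.
    static Attr filterReference() {
        return new Attr("foo", 131);
    }

    // A reference is only satisfied if both name and id match an output attribute.
    static boolean referenceIsSatisfied(Attr ref, List<Attr> childOutput) {
        return childOutput.contains(ref);
    }

    public static void main(String[] args) {
        // The name "foo" is still present, but the id the filter resolved to is gone.
        System.out.println(referenceIsSatisfied(filterReference(), outputAfterInjection())); // false
    }
}
```

This is exactly the "missing column" state the trace shows: a consistency check comparing references against child outputs by (name, id) would flag the plan.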
An initial test already exists (testFailEvalAfterDrop), but I'm also introducing a new one (testFailFilterAfterDrop).
);
// insert an Eval on top of those LeafPlan that are children of n-ary plans (could happen with UnionAll)
transformed = transformed.transformUp(
    n -> n instanceof UnaryPlan == false && n instanceof LeafPlan == false,
nullify adds wrong null fields to the lookup index when a field is missing after the join. This will need fixing, even though it seems not to actually trigger a visible bug at the moment. I think we want to exclude the right hand side of joins, as that never corresponds to a FROM command.
SET unmapped_fields="nullify"; FROM employees| EVAL language_code = languages| LOOKUP JOIN languages_lookup ON language_code | limit 1 | where missing::integer > 1
Notice how we add an eval with nulls to the lookup index (even though we cannot actually execute this eval on the lookup side!)
[2026-01-12T14:36:17,805][TRACE][o.e.x.e.a.A.changes ] [runTask-0] Rule rules.ResolveUnmapped applied with change
Filter[TOINTEGER(?missing) > 1[INTEGER]] = Filter[TOINTEGER(?missing) > 1[INTEGER]]
\_Limit[1[INTEGER],false,false] = \_Limit[1[INTEGER],false,false]
\_LookupJoin[LEFT,[language_code{r}#233],[language_code{f}#259],false,null] = \_LookupJoin[LEFT,[language_code{r}#233],[language_code{f}#259],false,null]
|_Eval[[languages{f}#238 AS language_code#233]] ! |_Eval[[languages{f}#238 AS language_code#233, null[NULL] AS missing#261]]
| \_EsRelation[employees][avg_worked_seconds{f}#236, birth_date{f}#243, emp_n..] = | \_EsRelation[employees][avg_worked_seconds{f}#236, birth_date{f}#243, emp_n..]
\_EsRelation[languages_lookup][LOOKUP][language_code{f}#259, language_name{f}#260] ! \_Eval[[null[NULL] AS missing#261]]
! \_EsRelation[languages_lookup][LOOKUP][language_code{f}#259, language_name{f}#260]
Added a follow-up item to #138888.
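The suggested exclusion could be sketched as a guarded tree rewrite (the `Node`/`Leaf`/`Join` types and the method name are hypothetical, not the real `LogicalPlan` API): when walking the plan to wrap leaves with null-producing Evals, the recursion simply never descends into a Join's right child, since the lookup side never corresponds to a FROM command.

```java
import java.util.function.UnaryOperator;

public class JoinAwareRewrite {
    interface Node {}
    record Leaf(String name) implements Node {}
    record Join(Node left, Node right) implements Node {}

    // Rewrite leaves, but skip the right-hand (lookup) side of joins entirely.
    static Node transformLeftLeaves(Node node, UnaryOperator<Node> onLeaf) {
        if (node instanceof Leaf) {
            return onLeaf.apply(node);
        }
        if (node instanceof Join j) {
            return new Join(transformLeftLeaves(j.left(), onLeaf), j.right());
        }
        return node;
    }

    public static void main(String[] args) {
        Node plan = new Join(new Leaf("employees"), new Leaf("languages_lookup"));
        Node rewritten = transformLeftLeaves(plan, l -> new Leaf(((Leaf) l).name() + "+nullEval"));
        // Only the left source gets wrapped; the lookup relation stays untouched.
        System.out.println(rewritten);
    }
}
```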
// TODO: would an alternative to this be to drop the current Fork and have ResolveRefs#resolveFork re-resolve it. We might need
// some plan delimiters/markers to make it unequivocal which nodes belong to "make Fork work" - like (Limit-Project[-Eval])s - and
// which don't.
private static Fork patchFork(Fork fork, List<Attribute> aliasAttributes) {
What's required to patch FORK is currently quite complex. I think we should see if it can be simplified.
We simplified Project via ResolvingProject; previously, this required workarounds because a Project had fixed output attributes rather than computing them from its inputs. It kinda looks like Fork has similar problems, and it's quite a bit of a dance to get it to work here.
Added a follow-up item to #138888.
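The two designs being contrasted can be sketched like this (hypothetical minimal classes, not the real plan nodes): a node that captures its output at construction time goes stale and needs patching whenever attributes are injected below it, while a node that derives its output from its current child adapts automatically.

```java
import java.util.ArrayList;
import java.util.List;

public class FixedVsComputedOutput {
    interface Child { List<String> columns(); }

    // Output captured once at construction; must be patched when the child changes.
    static class FixedProject {
        final List<String> output;
        FixedProject(Child child) { this.output = List.copyOf(child.columns()); }
    }

    // Output derived on demand from the current child; adapts automatically.
    static class ResolvingProject {
        final Child child;
        ResolvingProject(Child child) { this.child = child; }
        List<String> output() { return child.columns(); }
    }

    public static void main(String[] args) {
        List<String> cols = new ArrayList<>(List.of("bar"));
        Child child = () -> cols;
        FixedProject fixed = new FixedProject(child);
        ResolvingProject resolving = new ResolvingProject(child);

        cols.add("foo"); // a null alias gets injected below

        System.out.println(fixed.output);       // [bar] - stale, needs patching
        System.out.println(resolving.output()); // [bar, foo]
    }
}
```

The "dance" in patchFork corresponds to manually fixing up the first kind of node.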
 * {@link Analyzer.ResolveRefs} to attempt again to wire them to the newly added aliases. That's what this method does.
 */
private static LogicalPlan refreshUnresolved(LogicalPlan plan, List<UnresolvedAttribute> unresolved) {
    return plan.transformExpressionsOnlyUp(UnresolvedAttribute.class, ua -> {
I think we shouldn't have to transform up here. If there are children with attributes that ResolveRefs deemed unresolvable, ResolveUnmapped shouldn't yet be looking at the current plan, but still be doing its work on the unresolved children.
In fact, maybe we shouldn't mark fields as unresolvable until we're in the clean-up step. Being unresolvable just cannot be determined in a single run of ResolveRefs in case of nullify or load.
Added a follow-up item to #138888.
maybe we shouldn't mark fields as unresolvable until we're in the clean-up step
I think this might be an optimisation that can be dropped, but I didn't look into whether that really is the case.
I think we shouldn't have to transform up here.
The comment may read as "this code is wrong, it shouldn't do that". But I think you mean: "this, as well as the handling of UnresolvedAttributes, can potentially be refactored". I'm making the distinction because, with the existing optimisation in place, the new feature's code has to do a refresh.
Added a follow-up item to #138888.
I'll leave it there, but it might be a follow-up to #138888 itself; i.e.: we could make the code pre-unmapped fields simpler by removing that pre-existing optimization.
 * @return A plan having all nodes recreated (no properties changed, otherwise). This is needed to clear internal, lazy-eval'd and
 *         cached state, such as the output. The rule inserts new attributes in the plan, so the output of all the nodes downstream
 *         these insertions need be recomputed.
 */
private static LogicalPlan refreshChildren(LogicalPlan plan) {
IMHO this is a smell. If a plan node's output depends on its children, no re-computation should be required - changing the children should not require additional steps for the .output() method to return correct results.
Added a follow-up item to #138888.
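The smell can be reproduced in miniature (hypothetical classes, not the real plan nodes): if a node caches its lazily computed output and never invalidates it, replacing its child leaves the cache stale, which is what forces refreshChildren to recreate nodes just so output() returns correct results.

```java
import java.util.ArrayList;
import java.util.List;

public class StaleOutputSketch {
    static class Source {
        final List<String> columns;
        Source(List<String> columns) { this.columns = columns; }
    }

    static class Node {
        Source child;
        private List<String> cachedOutput; // lazily computed, never invalidated

        Node(Source child) { this.child = child; }

        List<String> output() {
            if (cachedOutput == null) {
                cachedOutput = new ArrayList<>(child.columns);
            }
            return cachedOutput;
        }
    }

    public static void main(String[] args) {
        Node node = new Node(new Source(List.of("bar")));
        node.output();                                  // cache is populated: [bar]
        node.child = new Source(List.of("bar", "foo")); // the child changes...
        System.out.println(node.output());              // ...but output() still says [bar]
    }
}
```

With either cache invalidation on child replacement or no caching at all, the extra recreation step would be unnecessary.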
💔 Backport failed
This introduces support for mapping-unavailable fields (fields that are present but not mapped, or missing altogether). The behaviour is controlled through a new SET setting, unmapped_fields, which can take the values "FAIL", "NULLIFY" and "LOAD".

An optional field behaves just like a "normal", mapped field with regards to how it flows through the command chain: it can simply be used in commands, as if present in the source, but can no longer be referenced once dropped - explicitly with DROP, by not being selected by a KEEP or a RENAME that doesn't reference it, or past a STATS reduction. However, unlike a mapped field, if it's not referenced at all, it won't show up in the output of a simple FROM index.

Currently, the schema difference between nullified fields and loaded ones is in the type: nullified fields are of data type NULL, while loaded ones are KEYWORD. The implementation difference w.r.t. logical plan building is that nullified fields are created as null-value aliases on top of the data source, while loaded ones are pushed as extractors into the source (this leverages the INSIST work).

Partially mapped fields are also covered: when the setting is "load", these fields will be extracted from those indices that have the field but don't map it. If there's a conflict between the loaded KEYWORD field and the mapped type in the indices that do map the field, an explicit conversion is needed, just like with union types.
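The schema behaviour of the three setting values described above can be summarised in a small sketch (the enum and method are illustrative stand-ins, not the real ESQL DataType handling):

```java
import java.util.Optional;

public class UnmappedFieldsSketch {
    enum Mode { FAIL, NULLIFY, LOAD }

    // The resulting schema type of an unmapped field, per mode.
    static Optional<String> typeOfUnmapped(Mode mode) {
        return switch (mode) {
            case FAIL -> Optional.empty();       // the query fails instead
            case NULLIFY -> Optional.of("NULL");    // null alias on top of the source
            case LOAD -> Optional.of("KEYWORD"); // pushed down as a source extractor
        };
    }

    public static void main(String[] args) {
        System.out.println(typeOfUnmapped(Mode.NULLIFY)); // Optional[NULL]
        System.out.println(typeOfUnmapped(Mode.LOAD));    // Optional[KEYWORD]
    }
}
```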
Related: elastic#138888

(cherry picked from commit ff745c0)

# Conflicts:
#	x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/analysis/Analyzer.java
#	x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/analysis/AnalyzerContext.java
#	x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/plan/QuerySettings.java
#	x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/session/EsqlSession.java
#	x-pack/plugin/esql/src/test/java/org/elasticsearch/xpack/esql/optimizer/promql/PromqlLogicalPlanOptimizerTests.java
#	x-pack/plugin/esql/src/test/java/org/elasticsearch/xpack/esql/parser/SetParserTests.java
💚 All backports created successfully
…i-project-tests

* upstream/main: (23 commits)
  Fix `testAckListenerReceivesNacksIfPublicationTimesOut` (elastic#140514)
  Reduce priority of clear-cache tasks (elastic#139685)
  Add docs and tests about `StreamOutput` to memory (elastic#140365)
  ES|QL - dense_vector support for COUNT, PRESENT, ABSENT aggregator functions (elastic#139914)
  Add release notes for v9.2.4 release (elastic#140487)
  Add release notes for v9.1.10 release (elastic#140488)
  Add conncectors release notes for 9.1.10, 9.2.4 (elastic#140499)
  Add parameter support in PromQL query durations (elastic#139873)
  Improve testing of STS credentials reloading (elastic#140114)
  Fix zstd native binary publishing script to support newer versions (elastic#140485)
  Add FlattenedFieldBinaryVsSortedSetDocValuesSyntheticSourceIT (elastic#140489)
  Store fallback match only text fields in binary doc values (elastic#140189)
  [DiskBBQ] Use the new merge executor for intra-merge parallelism (elastic#139942)
  ESQL: introduce support for mapping-unavailable fields (elastic#140463)
  Add ESNextOSQVectorsScorerTests (elastic#140436)
  Disable high cardinality tests on release builds (elastic#140503)
  ESQL: TRange timezone support (elastic#139911)
  Directly compressing `StreamOutput` (elastic#140502)
  ES|QL - fix dense vector enrich bug (elastic#139774)
  Use CrossProjectModeDecider in RemoteClusterService (elastic#140481)
  ...
public void testFailDropWithNonMatchingStar() {
    var query = """
        FROM test
        | DROP does_not_exist_field*
        """;
    var failure = "No matches found for pattern [does_not_exist_field*]";
    verificationFailure(setUnmappedNullify(query), failure);
    verificationFailure(setUnmappedLoad(query), failure);
I think this is a problem. If I have a does_not_exist_field in the test index, but the index is e.g. rolled over and previous physical indices were missing this field - well, my query will fail if I point it at the previous indices, because they're missing the field.
The consequence is that DROP should probably ignore missing fields altogether in case of SET unmapped_fields="nullify".
There are more tests like this below that should be revisited, like testFailDropWithMatchingAndNonMatchingStar.
Added to #138888
I eventually agree. I opted for keeping it in sync with the behaviour when unmapped_fields isn't set, but I think the feature's intention warrants the exception. We should make sure we also update DROP's docs.
var plan = analyzeStatement(setUnmappedNullify("""
    FROM test
    | STATS s = SUM(does_not_exist1) + d2 BY d2 = does_not_exist2, emp_no
    """));
This wrongly adds an EVAL d2 = null at the beginning. This leads to inconsistent plans.
Example:
SET unmapped_fields="nullify"; from test | where foo == 1 | stats count(*) by foo = does_not_exist
Even if the field foo exists, ResolveUnmapped still adds an EVAL foo = null, shadowing the existing foo field!
[2026-01-13T19:22:03,525][TRACE][o.e.x.e.a.A.changes ] [runTask-0] Rule rules.ResolveUnmapped applied with change
Aggregate[[?does_not_exist AS foo#52],[COUNT(*[KEYWORD],true[BOOLEAN],PT0S[TIME_DURATION]) AS count(*)#53, ?foo]] = Aggregate[[?does_not_exist AS foo#52],[COUNT(*[KEYWORD],true[BOOLEAN],PT0S[TIME_DURATION]) AS count(*)#53, ?foo]]
\_Filter[foo{f}#55 == 1[INTEGER]] = \_Filter[foo{f}#55 == 1[INTEGER]]
\_EsRelation[test][foo{f}#55] ! \_Eval[[null[NULL] AS does_not_exist#56, null[NULL] AS foo#57]]
! \_EsRelation[test][foo{f}#55]
Added to #138888
Fixing in #141340 (testStatsAggAndAliasedGroup, testStatsAggAndAliasedShadowingGroup, testStatsAggAndAliasedShadowingGroupOverExpression)
 * | \_Eval[[null[NULL] AS does_not_exist1#13, null[NULL] AS does_not_exist2#30]]
 * | \_EsRelation[languages][language_code{f}#7, language_name{f}#8]
 * \_Limit[1000[INTEGER],false,false]
 * \_EsqlProject[[language_code{r}#21, language_name{r}#22, does_not_exist1{r}#15, @timestamp{f}#9, client_ip{f}#10,
 *   event_duration{f}#11, message{f}#12, does_not_exist2{r}#32, $$does_not_exist2$converted_to$long{r}#35]]
 * \_Eval[[TOLONG(does_not_exist2{r}#32) AS $$does_not_exist2$converted_to$long#35]]
 * \_Eval[[null[INTEGER] AS language_code#21, null[KEYWORD] AS language_name#22]]
 * \_Subquery[]
 * \_Filter[TODOUBLE(does_not_exist1{r}#15) > 10.0[DOUBLE]]
 * \_Eval[[null[NULL] AS does_not_exist1#15, null[NULL] AS does_not_exist2#30]]
This doesn't look right: does_not_exist2 has the same name id in both branches. Branches should not share name ids in subqueries - the two columns have different meanings in general, even if they happen to refer to the same existing field in both branches.
Added to #138888
 * \_EsqlProject[[_meta_field{r}#30, emp_no{r}#31, first_name{r}#32, gender{r}#33, hire_date{r}#34, job{r}#35, job.raw{r}#36,
 *   languages{r}#37, last_name{r}#38, long_noidx{r}#39, salary{r}#40, c{r}#4, does_not_exist{r}#55]]
 * \_Eval[[null[NULL] AS does_not_exist#56]]
This is quite inconsistent. The Projection projects on does_not_exist with id 55 - but the Evals below create columns with name does_not_exist and different name ids.
This also happens in other test cases with subqueries below.
Added to #138888
#140526) * ESQL: introduce support for mapping-unavailable fields (#140463)

Related: #138888

(cherry picked from commit ff745c0)

# Conflicts:
#	x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/analysis/Analyzer.java
#	x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/analysis/AnalyzerContext.java
#	x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/plan/QuerySettings.java
#	x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/session/EsqlSession.java
#	x-pack/plugin/esql/src/test/java/org/elasticsearch/xpack/esql/optimizer/promql/PromqlLogicalPlanOptimizerTests.java
#	x-pack/plugin/esql/src/test/java/org/elasticsearch/xpack/esql/parser/SetParserTests.java

* Revert transition EsqlProject->Project

  That refactoring was not backported to 9.3.

* Spotless

* Fix compilation error

  The analyzer code was adapted for TimeSeriesAggregate becoming TimestampAware in #140270, but that was not backported to 9.3. Adapt to the old version of the code.

* Fix tests

---------

Co-authored-by: Gal Lalouche <gal.lalouche@elastic.co>
    assumeTrue("Requires OPTIONAL_FIELDS", EsqlCapabilities.Cap.OPTIONAL_FIELDS.isEnabled());
    return "SET unmapped_fields=\"nullify\"; " + query;
}

private static String setUnmappedLoad(String query) {
    assumeTrue("Requires OPTIONAL_FIELDS", EsqlCapabilities.Cap.OPTIONAL_FIELDS.isEnabled());
The assumeTrues being inside the setUnmapped... methods is not ideal because we will release nullify earlier than load. Because tests often have calls to both setUnmappedNullify and setUnmappedLoad, this means that in release-test runs, the test will be considered skipped just because we don't have load enabled in release builds.
Let's split up the tests for load and nullify, so that load tests getting skipped doesn't affect any tests for nullify. I.e., a test should have either setUnmappedNullify or setUnmappedLoad, but not both.
Added to #138888
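The skipping behaviour being described can be modeled in miniature (JUnit's assumeTrue is simulated here by throwing a hypothetical SkippedException; the helper names mirror the ones in the test, but this is not the real test code): a test that calls both helpers aborts as soon as the load helper runs, so its nullify coverage is lost too, whereas a nullify-only test keeps running.

```java
public class SkipSketch {
    static class SkippedException extends RuntimeException {}

    static final boolean NULLIFY_ENABLED = true;
    static final boolean LOAD_ENABLED = false; // e.g. on a release build

    static void assumeTrue(boolean condition) {
        if (!condition) throw new SkippedException(); // aborts the whole test
    }

    static String setUnmappedNullify(String query) {
        assumeTrue(NULLIFY_ENABLED);
        return "SET unmapped_fields=\"nullify\"; " + query;
    }

    static String setUnmappedLoad(String query) {
        assumeTrue(LOAD_ENABLED);
        return "SET unmapped_fields=\"load\"; " + query;
    }

    // Runs a "test" and reports whether it completed (true) or was skipped (false).
    static boolean runs(Runnable test) {
        try {
            test.run();
            return true;
        } catch (SkippedException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        boolean combined = runs(() -> {
            setUnmappedNullify("FROM test");
            setUnmappedLoad("FROM test"); // aborts here, nullify coverage is lost
        });
        boolean nullifyOnly = runs(() -> setUnmappedNullify("FROM test"));
        System.out.println(combined + " " + nullifyOnly); // false true
    }
}
```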
…astic#140528) This PR removes the snapshot protection of FAIL and NULLIFY options for unmapped fields (only LOAD remains protected under snapshot). Follow up to elastic#140463. Related: elastic#138888.
…ew (#140528) (#140657)

* ESQL: Enable nullify and fail unmapped resolution in tech-preview (#140528)

  This PR removes the snapshot protection of FAIL and NULLIFY options for unmapped fields (only LOAD remains protected under snapshot). Follow up to #140463. Related: #138888.

* Fix statsAggs tests

---------

Co-authored-by: Alexander Spies <alexander.spies@elastic.co>
This is a PR fork of #139417, with the emphasis on getting the NULLIFY option ready. Main changes: