Skip to content

ESQL: introduce support for mapping-unavailable fields#139417

Closed
bpintea wants to merge 37 commits intoelastic:mainfrom
bpintea:feat/set_unmapped_nullify
Closed

ESQL: introduce support for mapping-unavailable fields#139417
bpintea wants to merge 37 commits intoelastic:mainfrom
bpintea:feat/set_unmapped_nullify

Conversation

@bpintea
Copy link
Contributor

@bpintea bpintea commented Dec 12, 2025

This introduces support for mapping-unavailable fields (present and not mapped or just missing). The behaviour is controlled through a new SET setting unmapped_fields, which can take the values "FAIL", "NULLIFY", "LOAD".

An optional field behaves just like a "normal", mapped field, with regards to how it flows through the commands chain: it can be simply used in the commands, as if present in the source, but can no longer be referenced once dropped - explicitly, with DROP, or not selected by a KEEP, or RENAME that doesn't reference it -, or past a STATS reduction.

However, unlike a mapped field, if it's not reference at all, it won't show up in the output of a simple FROM index.

Currently, the schema difference between nullified fields and the loaded ones is in the type: nullified ones are of data type NULL, while the loaded ones are KEYWORD.
The implementation difference w.r.t. logical plan building is that the nullified fields are created as null value aliasing on top of the data source, while the loaded one are pushed as extractors into the source (this leverages the INSIST work).

The partially mapped fields are also covered: when the setting is "load", these fields will be extracted from those indices that have the field, but isn't mapped. In case there's a conflict between the loaded KEYWORD field and the mapped type in the fields that have this field mapped, an explicit conversion is needed, just like with union types.

Related: #138888

@github-actions
Copy link
Contributor

ℹ️ Important: Docs version tagging

👋 Thanks for updating the docs! Just a friendly reminder that our docs are now cumulative. This means all 9.x versions are documented on the same page and published off of the main branch, instead of creating separate pages for each minor version.

We use applies_to tags to mark version-specific features and changes.

Expand for a quick overview

When to use applies_to tags:

✅ At the page level to indicate which products/deployments the content applies to (mandatory)
✅ When features change state (e.g. preview, ga) in a specific version
✅ When availability differs across deployments and environments

What NOT to do:

❌ Don't remove or replace information that applies to an older version
❌ Don't add new information that applies to a specific version without an applies_to tag
❌ Don't forget that applies_to tags can be used at the page, section, and inline level

🤔 Need help?

@elasticsearchmachine
Copy link
Collaborator

Hi @bpintea, I've created a changelog YAML for you.

Copy link
Contributor

@GalLalouche GalLalouche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've left a few small style comments/suggestions, but otherwise LGTM!

import org.elasticsearch.xpack.esql.core.type.DataType;

/**
* This is a unmapped-fields strategy discriminator.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit 1: an unmapped-fields (but see below).
Nit 2: I think that the Javadocs on the main class is redundant, given the Javadocs on the enums themselves. This Javadoc isn't any more informative than the enum name.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit 1: addressed.
Nit 2: kept, as this is what an editor will display when showing info about the class/enum.


/**
* In case the query references a field that's not present in the index mapping, alias this field to value {@code null} of type
* {@link DataType}.{@code NULL}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use {@link DataType#NULL} to link to the enum variant directly.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, rite, thanks! Fixed.

NULLIFY,

/**
* Just like {@code NULLIFY}, but instead of null-aliasing, insert extractors in the data source.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. (In general) Use {@link NULLIFY} to link to the enum variant directly, but see below.
  2. This is very much unlike NULLIFY, given that it actually (attempts) to load the data... Perhaps rephrase to:
In case the query references a field that's not present in the index mapping, attempts to load it from {@code _source}."

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. It seems one cannot ref from within the enum with link.
  2. Applied, thx.

* The rule handles fields that don't show up in the index mapping, but are used within the query. These fields can either be missing
* entirely, or be present in the document, but not in the mapping (which can happen with non-dynamic mappings).
* <p>
* In the case of the former ones, the rule introducees {@code EVAL missing = NULL} commands (null-aliasing / null-Eval'ing).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. s/introducees/introduces.
  2. This comment is misleading, since it reads like the rule chooses nullifying/loading based on the data being present or not, and not based on the enum passed to it, which is configured by the user.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Fixed, thx.
  2. Added a clarification.

// insert an Eval on top of those LeafPlan that are children of n-ary plans (could happen with UnionAll)
transformed = transformed.transformUp(
n -> n instanceof UnaryPlan == false && n instanceof LeafPlan == false,
nAry -> evalUnresolved(nAry, nullAliases)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While the overload here does "the right thing", I find it pretty confusing that it only works due the child being more specific. Consider just using two different names here, e.g., evalUnresolveUnary.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated.

Holder<Boolean> patched = new Holder<>(false);
child = child.transformDown(
// TODO add a suitable forEachDownMayReturnEarly equivalent
n -> patched.get() == false && n instanceof Project, // process top Project only (Fork-injected)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same downcast comment here :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a tiny subtlety here: I want to skip the type checks for the remainder of the tree once the boolean flag flips.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, this is further refined by reusing the newer transformDownSkipBranch.

}

private static void assertSourceType(LogicalPlan source) {
switch (source) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally, I think this part would be simpler with a plain old if (cond) { throw }.

Copy link
Contributor Author

@bpintea bpintea Dec 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The cond would be hard to read: source instance of EsRelation == false && (or the || alternative) repeated two more times.

private static List<Alias> nullAliases(List<UnresolvedAttribute> unresolved) {
Map<String, Alias> aliasesMap = new LinkedHashMap<>(unresolved.size());
for (var u : unresolved) {
if (aliasesMap.containsKey(u.name()) == false) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider using putIfAbsent, or if you want to avoid the allocation: computeIfAbsent.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, applied.

return new Alias(attribute.source(), attribute.name(), NULLIFIED);
}

// collect all UAs in the node
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Replace comment with a method rename :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I chose to extend the comment, since a method name would "loose" too much detail.

import static org.hamcrest.Matchers.hasSize;
import static org.hamcrest.Matchers.is;

public class AnalyzerUnmappedTests extends ESTestCase {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cough Golden tests 🧐 cough

ROW x = 1
| EVAL language_code = does_not_exist::INTEGER
| LOOKUP JOIN languages_lookup ON language_code
;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about stuff like:

SET unmapped_fields="nullify"\;
ROW x = 1
| EVAL language_code = does_not_exist::INTEGER
SET unmapped_fields="load"\;
| LOOKUP JOIN languages_lookup ON language_code
;

?

Is that allowed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is not, there's just one SET allowed per statment/query.

Copy link
Contributor

@astefan astefan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is some very good work in this PR, especially the attention to the different use cases and commands that can use "unmapped fields". Good job with this and the tests; I learned a bit about this feature by looking and trying out different queries from tests.

I am seeing difficulties in adding such a broad feature with different tricky/special commands to account for (fork, sub-queries, _insist and union types) and I am seeing some inconsistent behavior and outcome (see my comments). I don't think I have enough context to understand the full list of functionality for this feature, but I am seeing some things missing and I am curious on the plan for those missing bits/limitations (maybe?).

I will go over the PR one more time, at least. But, so far it is looking good, ignoring the missing bits/issues I mentioned above.

Copy link
Contributor

@astefan astefan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Left some comments exploratory in nature and, if they are valid, a potential set of improvements. But can be addressed later with further work on this feature. Great work in this PR!


A comment for later discussions (if any), not a show stopper, more like a heads up for some potential confusion amongst users and weird hard-to-track-down bugs later in the development cycles.

The unmapped_message is materialized whenever KEEP is used. If I do FROM partial_mapping_sample_data | SORT unmapped_message I get an error about the field not being found. That's fair and this shows that there should be a mechanism for some "special" fields to be made available in a query command. I am also aware that this should improve with follow-up PRs.

If these "special" fields (unmapped, not visible etc) need a separate mechanism to be made available, why are we treating them equal to other fields, the "normal" ones? IMHO, to have a clean story around these unmapped fields, we need to "signal" their presence in a different way that shouldn't lead users to think that they are treated equally. With _insist command (that is snapshot only currently) this intention is very clear: _insist is a separate mechanism by which a user tells ESQL that whatever field name _insist refers to, that field is "special" in some way or another.

At some point, I think this transparent way of loading unmapped fields with keep will come back and bite us because these fields are not like the regular fields. We shouldn't treat them as regular fields, to eliminate any sort of confusion. This could also cripple our work on optimizations where we have several mechanisms to add evals to account for possible shadowed fields, where we move (potentially eliminate) commands around. This will make it easier to introduce bugs hard to track down and fix later in the development cycle.

Looking at how these fields are used and loaded and because they depend on being KEEPed in some way or another, I am wondering why the explicit reference to a potentially unmapped field is not part of FROM.

Just a humble 2c.

private static LogicalPlan nullify(LogicalPlan plan, List<UnresolvedAttribute> unresolved) {
var nullAliases = nullAliases(unresolved);

// insert an Eval on top of every LeafPlan, if there's a UnaryPlan atop it
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason why you treat these two branches (unary and nary) differently? Why not in the same transformUp check?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, thanks, this is now being merged.


var transformed = load ? load(plan, unresolved) : nullify(plan, unresolved);

return transformed.equals(plan) ? plan : refreshPlan(transformed, unresolved);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be enough with transformed == plan?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, now it does, thanks.

return insisted;
}

// TODO: would an alternative to this be to drop the current Fork and have ResolveRefs#resolveFork re-resolve it. We might need
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having to re-resolve Fork (or better said, update whatever output Fork is creating) is something I encountered as well. It's a clash between the Fork's need to jump in the plan very early in the process and resolve the output columns and align them properly between its branches and the fact that some plans change/add/remove references from the plan and whatever Fork already resolved needs a "refresh".

Unfortunately, this is not the only tricky place in the ESQL planner that needs special treatment. Union types, inline stats, TS are other features that need special hand-holding.

Realistically, imho, we might be better off documenting these special treatments somewhere so that future us know how to approach stuff at that time and if needed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think we should probably refactor Fork for more robustness (a similar comment).

Comment on lines +205 to +221
var refreshed = refreshUnresolved(plan, unresolved);
return refreshChildren(refreshed);
}

/**
* The UAs that haven't been resolved are marked as unresolvable with a custom message. This needs to be removed for
* {@link Analyzer.ResolveRefs} to attempt again to wire them to the newly added aliases. That's what this method does.
*/
private static LogicalPlan refreshUnresolved(LogicalPlan plan, List<UnresolvedAttribute> unresolved) {
return plan.transformExpressionsOnlyUp(UnresolvedAttribute.class, ua -> {
if (unresolved.contains(ua)) {
unresolved.remove(ua);
// Besides clearing the message, we need to refresh the nameId to avoid equality with the previous plan.
// (A `new UnresolvedAttribute(ua.source(), ua.name())` would save an allocation, but is problematic with subtypes.)
ua = (ua.withId(new NameId())).withUnresolvedMessage(null);
}
return ua;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am curious why these two steps (refreshUnresolved for clearing UAs unresolved messages and refreshChildren) can't be performed in a single tree pass, I am clearly missing a detail that was probably obvious in some of your tests.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They could probably have been, though it might not have been a clear code. But refreshing the plan has been factored away.

Copy link
Contributor

@alex-spies alex-spies left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gave it another go. Need to do at least another round, this has become huge.

Ideas for follow-up work, summarized from my comments below:

  • We may want to issue a warning if a user sets unmapped_fields:"load" but the source for one of the indices is disabled.
  • We need to figure out how to avoid having both UnresolvedNamePattern and UnresolvedPattern. I think there's a way, and in the current state, the attribute/named expression hierarchy is pretty confusing with these two co-existing.
  • Our generative tests should probably randomly prepend one of the 3 settings for unmapped_fields.
  • We want more tests for unmapped_fields=\"load\". E.g.
    • Tests for the behavior of union types when an index has an unmapped/missing field that is multi-typed based on the mapping of 2 other indices.
    • More tests that do something with the unmapped/missing fields (e.g. join or enrich on them!)
    • More tests that do something before using the unmapped fields, esp. join on an unrelated field.
    • Subqueries using an unmapped/missing field

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe let's add comments to the two files to inform folks that whatever's added here should probably also be reflected in unmapped_fields.csv-spec and vice versa?

2024-10-23T12:15:03.360Z | 173.21.2.162 | 3450232 | Connected to 10.1.0.3!
;

fieldIsMappedToNonKeywordSingleIndex
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test and the ones below until unmappedFieldAppearsLast (resp. unmappedFieldDoesNotAppearLast) lost their meaning IMO. I believe the originals from unmapped_field.csv-spec were about using INSIST on actually mapped fields. As it stands, the tests here are confusing, esp. because the name implies that they're about unmapped fields but they're not. I don't think they're worth keeping.

2024-10-23T12:15:03.360Z | Connected to 10.1.0.3! | Disconnected from 10.1.0.3
;

# Kept this test disabled to show the difference from IGNORE_🐔: if everywhere unmapped and not mentioned, a field (message) won't show
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have an IGNORE_🐔?

Suggested change
# Kept this test disabled to show the difference from IGNORE_🐔: if everywhere unmapped and not mentioned, a field (message) won't show
# Kept this test disabled to show the difference from INSIST_🐔: if everywhere unmapped and not mentioned, a field (message) won't show


# Kept this test disabled to show the difference from IGNORE_🐔: if everywhere unmapped and not mentioned, a field (message) won't show
# up at all.
fieldIsUnmappedButSourceIsDisabledSingleIndex-Ignore
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think there's much value in keeping around an ignored test to compare this with an in-development command. There's already a test at the very beginning of this file that shows that we don't just load unmapped fields if there's just a FROM.

Also applies below.

However! I do think that we should have a couple single-index tests with partial_mapping_excluded_source_sample_data. Because that index' mapping has a disabled source, and we should showcase the behavior in case that the source is outright disabled. (In which case, unmapped_fields:"nullify" and unmapped_fields:"load" seem to be equivalent.)

;

#####################
# Multi index tests #
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need another set of tests where we have a union type that is unmapped in one of the indices, and outright missing (with source disabled) in another.

Comment on lines +325 to +334
insistOnTopOfInsistMultiIndex
required_capability: unmapped_fields
required_capability: index_metadata_field
required_capability: optional_fields

SET unmapped_fields="load"\;
FROM partial_mapping_sample_data, sample_data METADATA _index
| KEEP _index, @timestamp, foo, message, unmapped_message
| SORT @timestamp DESC
;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another test duplication that doesn't make sense without INSIST.

Comment on lines +410 to +419
fieldIsMappedToDifferentTypesMultiIndex
required_capability: index_metadata_field
required_capability: unmapped_fields
required_capability: optional_fields

SET unmapped_fields="load"\;
FROM sample_data_ts_long, sample_data METADATA _index
| KEEP _index, @timestamp
| SORT _index
;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More tests that just show that existing behavior stays intact. Less useful than the original test that had an INSIST @timestamp, but still useful; although again maybe better solved via randomization.

2024-10-23T13:55:01.543Z | partial_mapping_sample_data | Connected to 10.1.0.1!
;

partialMappingUnionTypes
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pre-existing but misleading: there are no union types involved here. The only thing that happens is that we pass message to to_string first thing, but there's no mapping conflict to be solved.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted to know how we behave w.r.t. subqueries and found out that loading unmapped fields seems to work inside a subquery, but when attempted after, we encounter an ISE:

SET unmapped_fields=\"load\"; from (from partial_mapping_sample_data), (from partial_mapping_sample_data) | keep unmapped_message

{"error":{"root_cause":[{"type":"illegal_state_exception","reason":"Found 2 problems\nline 1:34: Plan [EsqlProject[[@timestamp{f}#45, client_ip{f}#47, event_duration{f}#46, message{f}#48, unmapped_message{f}#60]]] optimized incorrectly due to missing references [unmapped_message{f}#60]\nline 1:70: Plan [EsqlProject[[@timestamp{f}#49, client_ip{f}#51, event_duration{f}#50, message{f}#52, unmapped_message{f}#61]]] optimized incorrectly due to missing references [unmapped_message{f}#61]"}],"type":"illegal_state_exception","reason":"Found 2 problems\nline 1:34: Plan [EsqlProject[[@timestamp{f}#45, client_ip{f}#47, event_duration{f}#46, message{f}#48, unmapped_message{f}#60]]] optimized incorrectly due to missing references [unmapped_message{f}#60]\nline 1:70: Plan [EsqlProject[[@timestamp{f}#49, client_ip{f}#51, event_duration{f}#50, message{f}#52, unmapped_message{f}#61]]] optimized incorrectly due to missing references [unmapped_message{f}#61]"},"status":500}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The same problem happens for unmapped_fields="nullify"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fxed in #141262 and #141371.

@alex-spies alex-spies added test-release Trigger CI checks against release build auto-backport Automatically create backport pull requests when merged labels Jan 9, 2026
@alex-spies
Copy link
Contributor

Closing this in favor of #140463.

We can't push to this PR's branch due to permissions, so we re-created it. The copy includes all of the commits from here.

@alex-spies alex-spies closed this Jan 9, 2026
@alex-spies
Copy link
Contributor

@astefan , to answer your question in Bogdan's absence: the use case we'd like to cover is "I have a query and I'm pointing it at different index patterns". E.g. in O11y, we have fields that are used in many indices, but may be missing in others; or a roll-over in a data stream may add a field, but we want a query to still work when we run against the pre-rolled over index. Another use case are pre-defined dashboards based on ESQL queries, where you may want to pop in a different index pattern but re-use the same query.

For these cases, you'd have to INSIST essentially on a large number of fields that the query may use. It's simpler to add a SET unmapped_fields="nullify" (or "load") at the beginning and implicitly have the query add INSISTs for you whenever you mention a new column name.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

:Analytics/ES|QL AKA ESQL auto-backport Automatically create backport pull requests when merged ES|QL-ui Impacts ES|QL UI >feature Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) test-release Trigger CI checks against release build v9.3.0 v9.4.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants