Fix metadata fields being nullified/loaded by unmapped_fields setting #143155

Merged
quackaplop merged 12 commits into elastic:main from quackaplop:unmapped_fields_metadata
Feb 27, 2026

@quackaplop
Contributor

@quackaplop quackaplop commented Feb 26, 2026

Summary

Fixes #141907

When SET unmapped_fields="nullify" or "load" is used, metadata fields like _score, _id, _index, _version, etc. are incorrectly treated as regular unmapped fields. Instead of producing an error that guides the user to add METADATA _score to the FROM clause, the fields are silently resolved as NULL (nullify mode) or treated as unmapped keyword fields (load mode).

This is problematic because:

  • It hides user mistakes — the user likely intended to use the metadata field but forgot METADATA
  • In the case of _score, it obscures the fact that no search scoring was performed
  • It violates the principle that metadata fields require explicit opt-in via the METADATA clause

Root cause

ResolveUnmapped.collectUnresolved() collects all UnresolvedAttributes in the plan and feeds them into the nullify/load logic. Since metadata fields that aren't declared in the METADATA clause remain unresolved after ResolveRefs, they get picked up and silently resolved by ResolveUnmapped.

Fix

Added a check in collectUnresolved() to exclude attributes whose names match known metadata fields (MetadataAttribute.isSupported()). This lets them stay unresolved so the verifier produces the proper Unknown column [_score] error, guiding the user to declare FROM ... METADATA _score.

When metadata fields are declared via the METADATA clause, they get resolved during ResolveRefs and never appear as UnresolvedAttributes — so the new filter has no effect on the happy path.
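
The filter described above can be sketched roughly as follows. This is a simplified stand-in and not the actual `ResolveUnmapped` code: the class name, the hard-coded name set, and the use of plain strings instead of `UnresolvedAttribute` objects are all assumptions made for illustration.

```java
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical sketch; the real rule operates on plan nodes, not on strings.
class MetadataFilterSketch {
    // Assumed subset of metadata field names, taken from the PR description.
    static final Set<String> METADATA_NAMES = Set.of("_score", "_id", "_index", "_version");

    // Plays the role that MetadataAttribute.isSupported() has in the description.
    static boolean isMetadata(String name) {
        return METADATA_NAMES.contains(name);
    }

    // Keeps only the unresolved names that nullify/load is allowed to handle.
    // Metadata names are dropped here, so they stay unresolved and a later
    // verification step can report them as unknown columns.
    static List<String> collectUnresolved(List<String> unresolvedNames) {
        return unresolvedNames.stream()
            .filter(name -> !isMetadata(name))
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // "_score" is filtered out; "foo" and "bar" remain candidates.
        System.out.println(collectUnresolved(List.of("foo", "_score", "bar")));
    }
}
```

In this sketch, a field declared via METADATA never reaches `collectUnresolved` at all, which matches the point that the filter does not affect the happy path.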

Test plan

All tests iterate over every entry in MetadataAttribute.ATTRIBUTES_MAP, so new metadata fields are automatically covered.

Failure tests (both nullify and load modes):

  • KEEP — FROM test | KEEP <field>
  • EVAL — FROM test | EVAL x = <field>
  • WHERE — FROM test | WHERE <field> IS NOT NULL
  • SORT — FROM test | SORT <field>
  • STATS — FROM test | STATS x = COUNT(<field>)
  • RENAME — FROM test | RENAME <field> AS renamed
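
As a rough illustration of how the failure cases above multiply out, the sketch below builds one query per mode, metadata field, and command shape. It is not the test code from this PR: the class name, the hard-coded field list (the real tests iterate over `MetadataAttribute.ATTRIBUTES_MAP`), and the `SET ...;` prefix format are assumptions.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the failure-test matrix; names are illustrative only.
class FailureQuerySketch {
    // Assumed subset of metadata fields named in the PR description.
    static final List<String> METADATA_NAMES = List.of("_score", "_id", "_index", "_version");
    static final List<String> MODES = List.of("nullify", "load");
    // Command shapes from the list above, with %s standing in for <field>.
    static final List<String> TEMPLATES = List.of(
        "FROM test | KEEP %s",
        "FROM test | EVAL x = %s",
        "FROM test | WHERE %s IS NOT NULL",
        "FROM test | SORT %s",
        "FROM test | STATS x = COUNT(%s)",
        "FROM test | RENAME %s AS renamed"
    );

    // Builds every mode x field x template combination.
    static List<String> failureQueries() {
        List<String> queries = new ArrayList<>();
        for (String mode : MODES) {
            for (String field : METADATA_NAMES) {
                for (String template : TEMPLATES) {
                    // The "SET ...; " prefix format is an assumption for illustration.
                    queries.add("SET unmapped_fields=\"" + mode + "\"; " + String.format(template, field));
                }
            }
        }
        return queries;
    }

    public static void main(String[] args) {
        failureQueries().forEach(System.out::println);
    }
}
```

Each generated query would be expected to fail with an unknown-column error for the metadata field, since none of them declares it in a METADATA clause.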

Complex query shapes (both modes):

  • After pipeline breaker — FROM test | STATS c = COUNT(*) | KEEP _score
  • Inside FORK branch — FROM test | FORK (WHERE _score > 1) ...
  • Inside subquery — FROM (FROM test | WHERE _score > 1)

Happy path (metadata declared via METADATA clause, full plan structure validated):

  • Nullify mode — FROM test METADATA <field> | KEEP <field>
  • Load mode — same

Regression suites:

  • AnalyzerUnmappedTests — all pass
  • AnalyzerTests — all pass

@elasticsearchmachine elasticsearchmachine added needs:triage Requires assignment of a team area label v9.4.0 labels Feb 26, 2026
@quackaplop quackaplop force-pushed the unmapped_fields_metadata branch 2 times, most recently from af82b7c to 52d3206 Compare February 26, 2026 15:28
@elasticsearchmachine elasticsearchmachine added Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) and removed needs:triage Requires assignment of a team area label labels Feb 26, 2026
@elasticsearchmachine
Collaborator

Pinging @elastic/es-analytical-engine (Team:Analytics)

Contributor

@alex-spies alex-spies left a comment

Looks good. We can consider adding a yaml test before merging that confirms a correct error message outside of just unit tests. It's probably also fine without, though.

assertThat(Expressions.name(row.fields().getFirst()), is("x"));
}

public void testFailMetadataFieldInKeep() {
Contributor

This is nice, I'd suggest a couple more complex query shapes though.

For nullify, we don't disallow FORK, LOOKUP JOIN, subqueries, etc.

I'd also suggest trying to refer to the missing metadata field after (not inside) pipeline breakers.

Contributor Author

Good idea. Added a few.

@quackaplop quackaplop force-pushed the unmapped_fields_metadata branch from 52d3206 to 1fb3b9a Compare February 26, 2026 16:22
@quackaplop quackaplop removed the request for review from GalLalouche February 26, 2026 16:28
@quackaplop quackaplop force-pushed the unmapped_fields_metadata branch from 1fb3b9a to f39378f Compare February 26, 2026 16:39
Metadata fields (_score, _id, _index, etc.) were incorrectly treated as
unmapped fields when SET unmapped_fields="nullify" or "load" was used.
This silently returned NULL instead of producing a proper error guiding
the user to add METADATA to the FROM clause.

Closes elastic#141907
@quackaplop quackaplop force-pushed the unmapped_fields_metadata branch from f39378f to 85e6475 Compare February 26, 2026 17:25
Contributor

@alex-spies alex-spies left a comment

LGTM. Thank you and feel free to :shipit: at your own discretion @quackaplop !

@quackaplop
Contributor Author

buildkite test this

@quackaplop
Contributor Author

buildkite test this

@quackaplop quackaplop merged commit e76c5c1 into elastic:main Feb 27, 2026
7 of 10 checks passed
@quackaplop quackaplop deleted the unmapped_fields_metadata branch February 27, 2026 12:56
PeteGillinElastic pushed a commit to PeteGillinElastic/elasticsearch that referenced this pull request Feb 27, 2026
…elastic#143155)
szybia added a commit to szybia/elasticsearch that referenced this pull request Feb 27, 2026
@alex-spies
Contributor

I noticed that this should probably be backported to 9.3 because it's a bugfix and because otherwise the backport of #141340 becomes more complex.

@alex-spies
Contributor

💚 All backports created successfully

Status Branch Result
9.3

Questions ?

Please refer to the Backport tool documentation

alex-spies pushed a commit to alex-spies/elasticsearch that referenced this pull request Mar 2, 2026
…elastic#143155)

(cherry picked from commit e76c5c1)

# Conflicts:
#	x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/analysis/rules/ResolveUnmapped.java
@alex-spies alex-spies removed the :Analytics/Compute Engine Analytics in ES|QL label Mar 2, 2026
elasticsearchmachine pushed a commit that referenced this pull request Mar 2, 2026
…etting (#143155) (#143373)

* Fix metadata fields being nullified/loaded by unmapped_fields setting (#143155)

(cherry picked from commit e76c5c1)

# Conflicts:
#	x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/analysis/rules/ResolveUnmapped.java

* Make metadata ATTRIBUTES_MAP public

---------

Co-authored-by: Oleg Lvovitch <oleg.lvovitch@elastic.co>
tballison pushed a commit to tballison/elasticsearch that referenced this pull request Mar 3, 2026
…elastic#143155)

Labels

:Analytics/ES|QL AKA ESQL >bug Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) v9.3.2 v9.4.0

Development

Successfully merging this pull request may close these issues.

ESQL: unmapped_fields="nullify"/"load" shouldn't nullify/load metadata, esp. _score

3 participants