Fix metadata fields being nullified/loaded by unmapped_fields setting#143155
Fix metadata fields being nullified/loaded by unmapped_fields setting#143155quackaplop merged 12 commits intoelastic:mainfrom
Conversation
af82b7c to
52d3206
Compare
|
Pinging @elastic/es-analytical-engine (Team:Analytics) |
alex-spies
left a comment
There was a problem hiding this comment.
Looks good. We can consider adding a yaml test before merging that confirms a correct error message outside of just unit tests. It's probably also fine without, though.
| assertThat(Expressions.name(row.fields().getFirst()), is("x")); | ||
| } | ||
|
|
||
| public void testFailMetadataFieldInKeep() { |
There was a problem hiding this comment.
This is nice, I'd suggest a couple more complex query shapes though.
For nullify, we don't disallow FORK, LOOKUP JOIN, subqueries, etc.
I'd also suggest trying to refer to the missing metadata field after (not inside) pipeline breakers.
There was a problem hiding this comment.
Good idea. Added a few
52d3206 to
1fb3b9a
Compare
1fb3b9a to
f39378f
Compare
Metadata fields (_score, _id, _index, etc.) were incorrectly treated as unmapped fields when SET unmapped_fields="nullify" or "load" was used. This silently returned NULL instead of producing a proper error guiding the user to add METADATA to the FROM clause. Closes elastic#141907
f39378f to
85e6475
Compare
alex-spies
left a comment
There was a problem hiding this comment.
LGTM. Thank you and feel free to
at your own discretion @quackaplop !
|
buildkite test this |
|
buildkite test this |
…elastic#143155) Metadata fields (_score, _id, _index, etc.) were incorrectly treated as unmapped fields when SET unmapped_fields="nullify" or "load" was used. This silently returned NULL instead of producing a proper error guiding the user to add METADATA to the FROM clause. Closes elastic#141907
…cations * upstream/main: (35 commits) Create ARM bulk sqrI8 implementation (elastic#142461) Rework get-snapshots predicates (elastic#143161) Refactor downsampling fetchers and producers (elastic#140357) ESQL: Unmute test and add extra logging to generative test validation (elastic#143168) Fix metadata fields being nullified/loaded by unmapped_fields setting (elastic#143155) Determine remote cluster version (elastic#142494) Populate failure message for aborted clones (elastic#143206) Allow kibana_system role to read and manage logs streams (elastic#143053) Mute org.elasticsearch.xpack.esql.CsvIT test {csv-spec:eval.DocsLength} elastic#143224 Mute org.elasticsearch.xpack.esql.CsvIT test {csv-spec:eval.DocsByteLength} elastic#143223 Mute org.elasticsearch.xpack.esql.CsvIT test {csv-spec:docs.DocsBitLength} elastic#143222 Fix FloatVectorScorerSupplier bulkScore bug (elastic#143211) ESQL: Add data node execution for external sources (elastic#143209) [ESQL] Cleanup commands docs (elastic#143058) [ML]Fix latest transforms disregarding updates when sort and sync fields are non-monotonic (elastic#142856) Mute org.elasticsearch.index.mapper.IpFieldMapperTests testSyntheticSourceInObject elastic#143212 Tests: Fix StoreDirectoryMetricsIT (elastic#143084) ESQL: Add distribution strategy for external sources (elastic#143194) CSV IT spec (elastic#142585) Fix VectorScorerOSQBenchmark.score to read corrections properly (elastic#143137) ...
|
I noticed that this should probably be backported to 9.3. because it's a bugfix and because otherwise we'll make the backport of #141340 more complex. |
💚 All backports created successfully
Questions ?Please refer to the Backport tool documentation |
…elastic#143155) Metadata fields (_score, _id, _index, etc.) were incorrectly treated as unmapped fields when SET unmapped_fields="nullify" or "load" was used. This silently returned NULL instead of producing a proper error guiding the user to add METADATA to the FROM clause. Closes elastic#141907 (cherry picked from commit e76c5c1) # Conflicts: # x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/analysis/rules/ResolveUnmapped.java
…etting (#143155) (#143373) * Fix metadata fields being nullified/loaded by unmapped_fields setting (#143155) Metadata fields (_score, _id, _index, etc.) were incorrectly treated as unmapped fields when SET unmapped_fields="nullify" or "load" was used. This silently returned NULL instead of producing a proper error guiding the user to add METADATA to the FROM clause. Closes #141907 (cherry picked from commit e76c5c1) # Conflicts: # x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/analysis/rules/ResolveUnmapped.java * Make metadata ATTRIBUTES_MAP public --------- Co-authored-by: Oleg Lvovitch <oleg.lvovitch@elastic.co>
…elastic#143155) Metadata fields (_score, _id, _index, etc.) were incorrectly treated as unmapped fields when SET unmapped_fields="nullify" or "load" was used. This silently returned NULL instead of producing a proper error guiding the user to add METADATA to the FROM clause. Closes elastic#141907
Summary
Fixes #141907
When
SET unmapped_fields="nullify"or"load"is used, metadata fields like_score,_id,_index,_version, etc. are incorrectly treated as regular unmapped fields. Instead of producing an error that guides the user to addMETADATA _scoreto theFROMclause, the fields are silently resolved asNULL(nullify mode) or treated as unmapped keyword fields (load mode).This is problematic because:
METADATA_score, it obscures the fact that no search scoring was performedMETADATAclauseRoot cause
ResolveUnmapped.collectUnresolved()collects allUnresolvedAttributes in the plan and feeds them into the nullify/load logic. Since metadata fields that aren't declared in theMETADATAclause remain unresolved afterResolveRefs, they get picked up and silently resolved byResolveUnmapped.Fix
Added a check in
collectUnresolved()to exclude attributes whose names match known metadata fields (MetadataAttribute.isSupported()). This lets them stay unresolved so the verifier produces the properUnknown column [_score]error, guiding the user to declareFROM ... METADATA _score.When metadata fields are declared via the
METADATAclause, they get resolved duringResolveRefsand never appear asUnresolvedAttributes — so the new filter has no effect on the happy path.Test plan
All tests iterate over every entry in
MetadataAttribute.ATTRIBUTES_MAP, so new metadata fields are automatically covered.Failure tests (both nullify and load modes):
FROM test | KEEP <field>FROM test | EVAL x = <field>FROM test | WHERE <field> IS NOT NULLFROM test | SORT <field>FROM test | STATS x = COUNT(<field>)FROM test | RENAME <field> AS renamedComplex query shapes (both modes):
FROM test | STATS c = COUNT(*) | KEEP _scoreFROM test | FORK (WHERE _score > 1) ...FROM (FROM test | WHERE _score > 1)Happy path (metadata declared via
METADATAclause, full plan structure validated):FROM test METADATA <field> | KEEP <field>Regression suites:
AnalyzerUnmappedTests— all passAnalyzerTests— all pass