ESQL: Prevent pushdown of unmapped fields in filters and sorts#143460

Merged

GalLalouche merged 54 commits into elastic:main from GalLalouche:prevent_pushdown_cursor

Mar 6, 2026

Contributor

GalLalouche commented Mar 3, 2026

This PR fixes a bug in SET UNMAPPED_FIELDS=LOAD: filters and sorts on a potentially unmapped field were pushed down to Lucene, which produced wrong results when the field does not exist in the mapping.

Resolves: #141920, #141925.

GalLalouche and others added 5 commits

March 3, 2026 12:10


          ESQL: Add name IDs to golden tests; fix synthetic names

256fb47


          Prevent filter pushdown on potentially unmapped fields

7e779c2

PotentiallyUnmappedKeywordEsField (used for fields loaded from _source
when unmapped_fields="load") had isAggregatable=true, which caused
isPushableFieldAttribute to short-circuit past the SearchStats check and
push filters down to Lucene. On shards where the field is not indexed,
the Lucene query returns no results instead of letting the compute
engine evaluate the filter on _source-loaded values.

Guard isPushableFieldAttribute against PotentiallyUnmappedKeywordEsField
so these fields are always filtered in the compute engine.

Co-authored-by: Cursor <cursoragent@cursor.com>
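
The guard described in the commit above can be illustrated with a minimal sketch. All types here (`Field`, `PotentiallyUnmappedField`, `Stats`) are hypothetical stand-ins rather than the real ES|QL classes, and the `aggregatable || isIndexed` short-circuit is an assumption about the shape of the original check, not a copy of `isPushableFieldAttribute`.

```java
/** Hypothetical stand-ins for the ES|QL types; only the control flow is illustrated. */
final class PushdownGuardSketch {

    /** Simplified field descriptor (assumption: real fields carry far more state). */
    static class Field {
        final String name;
        final boolean aggregatable;

        Field(String name, boolean aggregatable) {
            this.name = name;
            this.aggregatable = aggregatable;
        }
    }

    /** Plays the role of PotentiallyUnmappedKeywordEsField, which reported aggregatable=true. */
    static final class PotentiallyUnmappedField extends Field {
        PotentiallyUnmappedField(String name) {
            super(name, true);
        }
    }

    /** Stand-in for per-shard search statistics. */
    interface Stats {
        boolean isIndexed(String fieldName);
    }

    /** Before the fix: aggregatable=true short-circuits past the stats check. */
    static boolean isPushableBeforeFix(Field field, Stats stats) {
        return field.aggregatable || stats.isIndexed(field.name);
    }

    /** After the fix: potentially unmapped fields are never pushed down. */
    static boolean isPushable(Field field, Stats stats) {
        if (field instanceof PotentiallyUnmappedField) {
            return false; // evaluate in the compute engine on _source-loaded values
        }
        return field.aggregatable || stats.isIndexed(field.name);
    }
}
```

With a `Stats` that reports the field as not indexed, `isPushableBeforeFix` still returns true for a `PotentiallyUnmappedField`, while `isPushable` returns false, so the filter stays in the compute engine.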


          Add test for sort pushdown on partially unmapped fields (elastic#141925)

d526c11

The isPushableFieldAttribute fix (rejecting PotentiallyUnmappedKeywordEsField)
already covers sort pushdown since PushTopNToSource uses the same method.
This test verifies correct sort order when a field is mapped in one index
but unmapped in another under unmapped_fields="load".

Co-authored-by: Cursor <cursoragent@cursor.com>


          Fix sort pushdown test to actually exercise the bug

8c3a2ab

Remove _index from the SORT clause — it prevented the entire sort from
being pushed to Lucene (canPushDownOrders requires all fields pushable),
masking the bug. With only SORT message, the sort is pushed down and
produces wrong order on the unmapped shard without the fix.

Co-authored-by: Cursor <cursoragent@cursor.com>
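
The masking effect described in the commit above follows from the all-or-nothing rule for sort pushdown. The sketch below is a simplified, hypothetical model: the method name `canPushDownOrders` is taken from the commit message, but its signature and the string-based field representation are assumptions made for illustration.

```java
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical model of the "all sort keys must be pushable" rule. */
final class SortPushdownSketch {

    /** The whole SORT is pushed to Lucene only if every sort key is pushable. */
    static boolean canPushDownOrders(List<String> sortFields, Predicate<String> isPushable) {
        return sortFields.stream().allMatch(isPushable);
    }
}
```

If `_index` is not pushable, `SORT message, _index` is never pushed down regardless of how `message` is classified, so the test could not reach the buggy path; `SORT message` alone can.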


          Add PushdownGoldenTests for sort and filter pushdown coverage.

979b228

Capture single-index unmapped behavior for nullify/load and switch GoldenTestCase to read unmapped settings from parsed statements so SET-based golden tests run like type_conflicts.

Co-authored-by: Cursor <cursoragent@cursor.com>
Made-with: Cursor

GalLalouche added >feature >bug Team:Analytics :Analytics/ES|QL labels

GalLalouche requested a review from alex-spies

March 3, 2026 12:06

elasticsearchmachine added the v9.4.0 label

Collaborator

elasticsearchmachine commented Mar 3, 2026

Pinging @elastic/es-analytical-engine (Team:Analytics)


          Update docs/changelog/143460.yaml

b1c00f4

Collaborator

elasticsearchmachine commented Mar 3, 2026

Hi @GalLalouche, I've created a changelog YAML for you.


          [CI] Auto commit changes from spotless

b67dc4f

alex-spies approved these changes

Contributor

alex-spies left a comment

LGTM except for some minor test suggestions (and rebasing on the other PR that needs to precede this).

x-pack/plugin/esql/qa/testFixtures/src/main/resources/unmapped-load.csv-spec Outdated

Comment on lines +734 to +736

+              required_capability: unmapped_fields
+              required_capability: optional_fields
+              required_capability: field_alias_support

Contributor

alex-spies Mar 3, 2026

One capability is probably enough, and it should be a new capability added in this PR to disable bwc tests with earlier versions that can't properly run the added tests. (Serverless + 9.3 bwc tests should fail.)

OPTIONAL_FIELDS_V2, for instance.

x-pack/plugin/esql/qa/testFixtures/src/main/resources/unmapped-load.csv-spec Outdated

+              fieldAliasAndNonExistent
+              required_capability: unmapped_fields
+              required_capability: optional_fields
+              required_capability: field_alias_support

Contributor

alex-spies Mar 3, 2026

I think field_alias_support is a hallucination, which means this test doesn't actually run at all.

I thought that @idegtiarenko put in a check that prevents using non-existent caps? Please double check that :)

x-pack/plugin/esql/qa/testFixtures/src/main/resources/unmapped-load.csv-spec

+              FROM sample_data, no_mapping_sample_data METADATA _index
+              | KEEP _index, message
+              | WHERE message == "Connection error?"
+              | SORT _index

Contributor

alex-spies Mar 3, 2026

Slightly irrelevant sorting, but not wrong

x-pack/plugin/esql/qa/testFixtures/src/main/resources/unmapped-load.csv-spec

+              SET unmapped_fields="load"\;
+              FROM sample_data, no_mapping_sample_data METADATA _index
+              | KEEP _index, message
+              | SORT message

Contributor

alex-spies Mar 3, 2026

For good measure, let's maybe add a test that sorts by multiple fields, and expressions based on unmapped fields. Also a case where an unmapped/partially mapped field is cast to a long or so.

.../optimizer/golden_tests/PushdownGoldenTests/testFilterNoPushdownWithUnmapped/load/query.esql

Contributor

alex-spies Mar 3, 2026

nit: The location of the query.esql is a little inconsistent. Let's make it consistent :) There should be a separate query.esql for nullify with a separate SET as well.

.../optimizer/golden_tests/PushdownGoldenTests/testFilterNoPushdownWithUnmapped/load/query.esql

    
            @@ -0,0 +1,4 @@
              SET unmapped_fields="load"; FROM sample_data
              | KEEP message, does_not_exist
              | WHERE does_not_exist::KEYWORD == "Connection error?"

Contributor

alex-spies Mar 3, 2026

Interesting to see: a case where a pushable expression and a non-pushable one (because unmapped) are combined in a conjunction.

Similarly for the expressions in the SORT.

...hdownGoldenTests/testSortNoPushdownWithUnmapped/nullify/local_physical_optimization.expected

Comment on lines +6 to +7

		\_TopNExec[[Order[does_not_exist{r}#1,ASC,LAST]],5[INTEGER],70]
		\_EvalExec[[null[NULL] AS does_not_exist#1]]

Contributor

alex-spies Mar 3, 2026

TopN by a constant can be optimized to a limit, but that's a different issue.

@GalLalouche, could you please open an issue and track it internally so we consider this optimization before nullify's GA?

...pack/esql/optimizer/golden_tests/PushdownGoldenTests/testFilterPushdownNoUnmapped/query.esql

Comment on lines +2 to +3

		\| KEEP message
		\| WHERE message == "Connection error?"

Contributor

alex-spies Mar 3, 2026

Let's have a version of this that just does the filtering, no sort.

...pack/esql/optimizer/golden_tests/PushdownGoldenTests/testFilterPushdownNoUnmapped/query.esql

Contributor

alex-spies Mar 3, 2026

Generally, tests look very nice, but we can try and add a couple more funky cases. E.g. more complex filters (also disjunctions with mixed pushable/non-pushable), more complex sorts; even more so than in the spec tests.

alex-spies mentioned this pull request

ES|QL: Fix KQL/QSTR with unmapped fields in NULLIFY mode #143399

Merged

GalLalouche and others added 13 commits

March 3, 2026 20:52


          CR: Decompose, deduplicate

8669a61


          Address Alex's PR elastic#143460 review comments

d2f1dbc

- Add OPTIONAL_FIELDS_V2 capability for pushdown-elimination BWC.
- Remove fieldAliasAndNonExistent test; fix capabilities on pushdown tests.
- Remove SORT from filterOnPartiallyUnmappedField; add fully unmapped
  multi-sort, and cast tests in unmapped-load.csv-spec.
- Add comment in LucenePushdownPredicates for PotentiallyUnmappedKeywordEsField.
- Add cross-refs between PushdownGoldenTests and LocalPhysicalPlanOptimizerTests.
- Fix assumeTrue: use OPTIONAL_FIELDS_V2 for load, remove redundant nullify.
- Write query.esql to nested path in GoldenTestCase. This was actually a
  pretty "noisy" change, since it also affects
  SubstituteRoundToGoldenTests!
- Add filter-only, conjunction pushable/non-pushable golden tests.

Made-with: Cursor


          Cleanup SpecIT logging configuration (elastic#143365)


          Add circuit breaker for query construction to prevent OOM from automa…

ed5b177

…ton-based queries (elastic#142150)


          ESQL: Fix datasource test failures on Windows and FIPS (elastic#143417)

ec13be2

Datasource tests fail on Windows CI and FIPS CI builds due to two
independent issues introduced with the external sources feature.

**Windows:** `StoragePath.of()` cannot parse `file://` URIs with
Windows drive letters. A path like `file://C:\bk\path\file.txt`
causes the colon after the drive letter `C` to be misinterpreted
as a port separator, resulting in `NumberFormatException`. Both the
production code (`LocalStorageProvider.toStoragePath()`) and the tests
construct file URIs via manual string concatenation instead of using
the existing `StoragePath.fileUri()` helper that normalizes Windows
paths correctly.

**FIPS:** `ExternalDistributedSpecIT` starts a test cluster with
`xpack.security.enabled=false`, but FIPS mode requires security to
be enabled. The Elasticsearch process dies during startup before any
test method runs. Since the test relies on plain HTTP S3 fixtures
that are inherently incompatible with FIPS, the test is now skipped
in FIPS mode.

Developed using AI-assisted tooling
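
The Windows issue described in the commit above comes from a drive letter ending up in the authority part of the URI. The sketch below shows one way such a normalization could look; `toFileUri` is a hypothetical helper written for illustration and is not the implementation of `StoragePath.fileUri()`.

```java
/** Hypothetical helper; not the real StoragePath.fileUri() implementation. */
final class FileUriSketch {

    /** Builds a file URI string in which a Windows drive letter is part of the path. */
    static String toFileUri(String path) {
        // Backslashes are not valid URI path separators.
        String normalized = path.replace('\\', '/');
        // "C:/..." needs a leading slash; otherwise "C:" lands in the authority
        // position and the colon can be read as a port separator.
        boolean hasDriveLetter = normalized.length() >= 2
            && Character.isLetter(normalized.charAt(0))
            && normalized.charAt(1) == ':';
        if (hasDriveLetter) {
            normalized = "/" + normalized;
        }
        return "file://" + normalized;
    }
}
```

Under these assumptions, `C:\bk\path\file.txt` becomes `file:///C:/bk/path/file.txt`, and a POSIX path such as `/tmp/file.txt` becomes `file:///tmp/file.txt`.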


          ESQL: Add extended distribution tests and fault injection for externa…

0c58650

…l sources (elastic#143420)

* ESQL: Add extended distribution tests and fault injection for external sources

Extends the test coverage for external source distributed execution with
property tests for weighted round-robin and coalescing, and adds fault
injection infrastructure for testing resilience under storage failures.

- ExtendedDistributionPropertyTests: weighted RR load balancing bounds,
  coalescing preservation, coalesced+distribution integration
- FaultInjectionRetryTests: retry policy behavior under transient and
  persistent fault patterns (503, connection reset, timeout)
- FaultInjectingS3HttpHandler: configurable S3 fault injection with
  countdown, path filtering, and auto-clearing
- FaultInjectingS3HttpHandlerIT: real HTTP server tests for the handler

Developed using AI-assisted tooling

* Update docs/changelog/143420.yaml


          Fix SQL client parsing of array header values (elastic#143408)

1b326b1


          Fix CSV-escaped quotes in generated docs examples (elastic#143449)

84b10b3

* Fix CSV-escaped quotes rendering in generated docs examples

DocsV3Support.renderTableLine() now unescapes RFC 4180 CSV quoting
(strips outer quote delimiters and replaces "" with ") so that JSON
strings in function example tables render correctly — e.g.
{"key":"value"} instead of {""key"":""value""}.

Affects json_extract and to_tdigest doc examples.

* Refine CSV unescaping to only unescape RFC 4180 doubled quotes; add tests

The previous approach stripped outer quotes from ALL quoted values, breaking
simple quoted values like "POINT(...)" and "foo". Now only cells with actual
RFC 4180 doubled-quote escaping ("") are unescaped, leaving simple quoted
values unchanged.

Added tests: testRenderingExampleResultCsvJsonUnescaping verifies JSON
unescaping works, testRenderingExampleResultSimpleQuotesPreserved verifies
simple quoted values are not modified.

Also adds changelog YAML for the PR.
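
The refined unescaping rule from the second bullet above can be sketched as follows. `unescapeCell` is a hypothetical helper for illustration only; it is not the code in `DocsV3Support.renderTableLine()`, and it assumes a cell arrives as a single string that may or may not be wrapped in quotes.

```java
/** Hypothetical sketch of "only unescape cells that use RFC 4180 doubled quotes". */
final class CsvCellSketch {

    static String unescapeCell(String cell) {
        boolean quoted = cell.length() >= 2 && cell.startsWith("\"") && cell.endsWith("\"");
        if (!quoted) {
            return cell;
        }
        String inner = cell.substring(1, cell.length() - 1);
        if (!inner.contains("\"\"")) {
            return cell; // simple quoted value such as "foo": leave untouched
        }
        // Strip the outer delimiters and collapse each doubled quote into one.
        return inner.replace("\"\"", "\"");
    }
}
```

A cell like `"{""key"":""value""}"` is rendered as `{"key":"value"}`, while `"foo"` and `"POINT(...)"` keep their quotes.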


          Fix MMROperatorTests (elastic#143453)

17d86a7


          [DOCS] Fix ES|QL function and commands lists versioning metadata (ela…

df85dbf

…stic#143402)

* Fix ES|QL function list versioning metadata

Audit all _snippets/lists/ files against Java @FunctionAppliesTo
annotations. Adds missing applies_to tags, corrects wrong versions,
and applies cumulative preview→ga tags where functions graduated.

Also adds missing applies_to front matter to
time-series-aggregation-functions.md landing page.

* update tags on commands lists


          Mute org.elasticsearch.compute.lucene.read.ValueSourceReaderTypeConve…

48ed462

…rsionTests testLoadAll elastic#143471


          [Transform] Stop transforms at the end of tests (elastic#139783)

05862b9

Stopping the transform at the end of test, before the reset, can help
other tests running in parallel.

Resolve elastic#122980


          Improve pattern text downgrade license test (elastic#143102)

f20aca7

Update the pattern_text downgrade test so that it includes adding docs and querying for docs.
Specifically the test now does the following:
  1. Create data stream with trial license and pattern text field
  2. Index docs, verify they're searchable.
  3. Downgrade to basic license.
  4. Index more docs in same backing index, verify all docs searchable, verify settings unchanged.
  5. Rollover the data stream, verify the new backing index has disable_templating=true.
  6. Index more docs into the new backing index, verify all docs searchable across both indices.
  7. Search with "fields": ["pattern_field"] to verify the valueFetcher() code path works across both backing indices.

GalLalouche enabled auto-merge (squash)

March 3, 2026 22:31


          Merge branch 'main' into prevent_pushdown_cursor

d54b48f

GalLalouche disabled auto-merge

March 4, 2026 08:10

GalLalouche added 2 commits

March 4, 2026 10:21


          Fix changelog

8e77757


          Merge branch 'main' into prevent_pushdown_cursor

e64b17c

GalLalouche enabled auto-merge (squash)

March 4, 2026 16:01

GalLalouche added 6 commits

March 4, 2026 18:45


          Fix required caps

0f0ae8a


          Merge branch 'main' into prevent_pushdown_cursor

0bf1876


          Merge branch 'main' into prevent_pushdown_cursor

0175c3d


          Fix deleted line

fea828c


          Merge branch 'main' into prevent_pushdown_cursor

ee89848


          Fix hasIndexMetadata for queries with SET before FROM

28f67fd

Replace command parsing with regex that checks whole query for METADATA _index.
Use static Pattern to avoid recompilation.

Made-with: Cursor
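
The approach in the commit above (scan the whole query with a precompiled regex instead of parsing commands) might look roughly like the sketch below. The pattern itself is an assumption written for illustration; the PR page does not show the actual expression, and only the method name `hasIndexMetadata` comes from the commit title.

```java
import java.util.regex.Pattern;

/** Hypothetical sketch; the real pattern used in the PR is not shown on this page. */
final class IndexMetadataSketch {

    // Compiled once and reused. Matches METADATA followed by _index within the
    // same pipe-delimited command, regardless of a leading SET statement.
    private static final Pattern METADATA_INDEX = Pattern.compile(
        "\\bMETADATA\\b[^|]*\\b_index\\b",
        Pattern.CASE_INSENSITIVE
    );

    static boolean hasIndexMetadata(String query) {
        return METADATA_INDEX.matcher(query).find();
    }
}
```

Because the matcher searches the entire string, a query that begins with `SET unmapped_fields="load";` before `FROM ... METADATA _index` is still detected, while a later `SORT _index` on its own is not.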

GalLalouche disabled auto-merge

March 4, 2026 22:01

elasticsearchmachine and others added 4 commits

March 4, 2026 22:08


          [CI] Auto commit changes from spotless

10e1ff1


          Merge branch 'main' into prevent_pushdown_cursor

06f22d2


          Fix test by adding sort

be4b80b


          More strict regex

1db923f

GalLalouche force-pushed the prevent_pushdown_cursor branch from 30a7824 to 1db923f Compare

March 5, 2026 10:18

GalLalouche enabled auto-merge (squash)

March 5, 2026 10:20


          Fix nondeterministic output in test

0fbafb5

GalLalouche added 2 commits

March 6, 2026 12:04


          Merge branch 'main' into prevent_pushdown_cursor

f4e575b


          Merge branch 'main' into prevent_pushdown_cursor

c22cb5f

GalLalouche merged commit 02fb937 into elastic:main

36 checks passed

spinscale pushed a commit to spinscale/elasticsearch that referenced this pull request


          ESQL: Prevent pushdown of unmapped fields in filters and sorts (elast…

2ebd401

…ic#143460)

szybia added a commit to szybia/elasticsearch that referenced this pull request


          Merge remote-tracking branch 'upstream/main' into get-reindex-with-re…

fc6b97d

…locations

* upstream/main: (153 commits)
  ES|QL: Update docs for TOP_SNIPPETS and DECAY (elastic#143739)
  Correctly include endpoint id in log msg in AuthorizationPoller (elastic#143743)
  Bar searching or sorting on _seq_no when disabled (elastic#143600)
  Generalize `testClientCancellation` test (elastic#143586)
  JSON_EXTRACT: zero-copy byte slicing for object, array, and number extraction (elastic#143702)
  Track recycler pages in circuit breaker (elastic#143738)
  [ESQL] Enable distributed pipeline breakers for external sources via FragmentExec (elastic#143696)
  Adding 'mode' and 'codec' fields to ES monitoring template (elastic#143673)
  [ESQL] Columnar I/O and vectorized block conversion for external sources (elastic#143703)
  Fix flaky MMR diversification YAML tests (elastic#143706)
  ES|QL codegen: check builder arguments for vector support (elastic#143724)
  Add Views Security Model (elastic#141050)
  ESQL: Prevent pushdown of unmapped fields in filters and sorts (elastic#143460)
  Don't run seq_no pruning tests in release CI (elastic#143725)
  ESQL: Support intra-row field references in ROW command (elastic#140217)
  ES|QL: Remove implicit limit in FORK branches in CSV tests (elastic#143601)
  IndexRoutingTests with and without synthetic id (elastic#143566)
  Synthetic id upgrade test in serverless (elastic#142471)
  Disable "Review skipped" comments for PRs without specified labels (elastic#143728)
  Cleanup ES|QL T-Digest code duplication, add memory accounting (elastic#143662)
  ...

prwhelan mentioned this pull request

[ML] Wait for cluster state in test #143767

Merged

sidosera pushed a commit to sidosera/elasticsearch that referenced this pull request


          ESQL: Prevent pushdown of unmapped fields in filters and sorts (elast…

511287e

…ic#143460)

GalLalouche mentioned this pull request

ESQL: unmapped_fields="load" can lead to wrong sort order #141925

Closed

prwhelan mentioned this pull request

[Transform] Disable PIT for CPS #143876

Closed

Labels

:Analytics/ES|QL >bug >feature Team:Analytics v9.4.0