ES|QL: Late materialization after TopN (Node level) #132757
GalLalouche merged 83 commits into elastic:main from testing_fetch_v2_passing
Conversation
nik9000
left a comment
I'm happy with it. Let's get Alex's last few comments solved and bring this thing in for a landing. We should get some rally benchmarks out of this. It's been a lot of work. We might get this for free from the nightly, but once you are good and ready to click that merge button I think you should try and find the rally tracks that we run that'll benefit from this. So we can watch them.
Resolved review threads:
- ...in/esql/compute/src/test/java/org/elasticsearch/compute/operator/topn/TopNOperatorTests.java
- ...ClusterTest/java/org/elasticsearch/xpack/esql/action/EsqlReductionLateMaterializationIT.java (outdated)
- x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/plugin/ComputeService.java (outdated)
GalLalouche
left a comment
Thanks for the great in-depth review, @alex-spies! I've addressed everything, although there is still the open issue of estimateRowSize, which is waiting for @nik9000's feedback, and the question of the dependency between the feature flags (runOnNodeReduce and reduceLateMaterialization, or whatever we call it).
Resolved review threads:
- ...ClusterTest/java/org/elasticsearch/xpack/esql/action/EsqlReductionLateMaterializationIT.java (outdated)
- x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/plugin/ComputeSearchContext.java (outdated)
- ...in/esql/src/main/java/org/elasticsearch/xpack/esql/plugin/ComputeSearchContextByShardId.java (one thread outdated)
- x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/plugin/ComputeService.java
- x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/plugin/DataNodeRequest.java (outdated)
- ...lugin/esql/src/main/java/org/elasticsearch/xpack/esql/plugin/LateMaterializationPlanner.java (two threads outdated)
- ...ck/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/plugin/DataNodeComputeHandler.java
- ...in/esql/compute/src/test/java/org/elasticsearch/compute/operator/topn/TopNOperatorTests.java
```diff
  );
  for (String q : queries) {
-     QueryPragmas pragmas = randomPragmas();
+     var pragmas = randomPragmas();
```
@nik9000 FYI. I remember we discussed this, though I don't remember the exact solution we agreed on (if we did).
```diff
  FROM dense_vector
- | EVAL k = v_l2_norm(bit_vector, [1]) // workaround to enable fetching dense_vector
+ | EVAL k = v_l2_norm(bit_vector, [1,2]) // workaround to enable fetching dense_vector
```
This PR adds late(r) materialization for TopN queries, so that materialization happens in the "node_reduce" phase instead of during the "data" phase. For example, if the limit is 20 and each data node spawns 10 workers, we would only read the additional columns (i.e., ones not needed for the TopN itself) for 20 rows instead of 200.

To support this, the reducer node maintains a global list of all shard contexts used by its individual data workers (although some of those might be closed if they are no longer needed, thanks to elastic#129454).

There is some additional book-keeping involved, since previously every data node held a local list of shard contexts and used its local indices to access it. To avoid changing too much (this local-index logic is spread throughout much of the code!), a new global index is introduced, which replaces the local index after all the rows are merged together in the reduce phase's TopN.
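The local-to-global index bookkeeping described above can be sketched roughly as follows. This is an illustrative sketch only; the class and method names (`GlobalShardRegistry`, `register`, `toGlobal`) are hypothetical and are not the actual Elasticsearch classes, which spread this logic across the compute engine.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: the reducer node keeps one global list of shard
// contexts, built by concatenating each worker's local list. A worker's
// local shard index plus its registration offset yields the global index.
final class GlobalShardRegistry {
    private final List<String> contexts = new ArrayList<>();

    /**
     * Registers one worker's local shard contexts and returns the offset
     * that converts that worker's local indices into global ones.
     */
    synchronized int register(List<String> localContexts) {
        int offset = contexts.size();
        contexts.addAll(localContexts);
        return offset;
    }

    /**
     * After the reduce phase's TopN merges rows from all workers, each
     * row's local shard index is rewritten to a global one so late
     * materialization can look up the right shard context.
     */
    static int toGlobal(int localIndex, int workerOffset) {
        return workerOffset + localIndex;
    }

    synchronized String contextAt(int globalIndex) {
        return contexts.get(globalIndex);
    }
}
```

The key design point is that workers keep using their local indices untouched; only the merged TopN output is rewritten, which avoids touching the widely spread local-index logic.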
…stic#139397) This PR fixes a performance regression introduced in elastic#132757 that was first encountered on our nightly benchmarks. After much digging, and with @nik9000's invaluable help, the culprit was found to be the use of an apparently very slow functional Stream in a very hot path.
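The general shape of that class of fix, replacing a Stream pipeline with a plain indexed loop on a hot path, looks roughly like this. This is an illustrative sketch, not the actual patched Elasticsearch code; `HotPathSum` and its methods are hypothetical names.

```java
import java.util.List;

// Hypothetical illustration of the Stream-on-a-hot-path pattern and its fix.
final class HotPathSum {
    // Before (illustrative): a Stream pipeline sets up a spliterator and
    // lambda objects on every call, which adds up when called per row.
    static long sumWithStream(List<Integer> values) {
        return values.stream().mapToLong(Integer::longValue).sum();
    }

    // After (illustrative): an indexed loop does the same work with no
    // per-call pipeline setup, which is what you want in a hot path.
    static long sumWithLoop(List<Integer> values) {
        long sum = 0;
        for (int i = 0; i < values.size(); i++) {
            sum += values.get(i);
        }
        return sum;
    }
}
```

Both methods compute the same result; the difference is purely per-call overhead, which only matters because the path is hot.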
elastic#142834) Just what it says on the tin. Follow-up to elastic#141082 and elastic#132757.