ESQL: Reuse Block Reader only when few fields by nik9000 · Pull Request #141672 · elastic/elasticsearch

nik9000 · 2026-02-02T17:01:13Z

Stops ESQL from reusing the BlockLoader.ColumnAtATimeReaders when
loading many of these fields at once. Reusing these readers
means we have to keep all of them in memory. If we don't reuse them, we can
release the memory for each field as soon as we've loaded its Block. When you
load hundreds of blocks this really adds up.

Important: This only works for column-at-a-time readers; covering the rest waits for a follow-up change. For truly row-by-row readers, this is fine, since they don't use much memory anyway. But we also use row-by-row readers as a fallback for reading doc values when loading from many segments, and that case seems important: usually, if the query wants to load hundreds of fields, it does so after a topn, and those loads are usually "from many segments".
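The memory argument above can be illustrated with a small sketch. It is a toy model, not the ESQL implementation: `SketchReader` and `ReaderLifetimeSketch` are hypothetical names, and each reader stands in for whatever memory a real column-at-a-time reader would hold while open. It only counts how many readers are alive at once under each strategy.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a per-field reader that holds memory while open. */
final class SketchReader implements AutoCloseable {
    static int live = 0;     // readers currently open
    static int peakLive = 0; // highest number open at the same time

    SketchReader() {
        live++;
        peakLive = Math.max(peakLive, live);
    }

    @Override
    public void close() {
        live--;
    }

    static void resetCounters() {
        live = 0;
        peakLive = 0;
    }
}

final class ReaderLifetimeSketch {
    /** Reuse: every field's reader stays open so later pages can use it again. */
    static int peakWithReuse(int fields) {
        SketchReader.resetCounters();
        List<SketchReader> cached = new ArrayList<>();
        for (int f = 0; f < fields; f++) {
            cached.add(new SketchReader()); // kept until the whole load is done
        }
        for (SketchReader reader : cached) {
            reader.close();
        }
        return SketchReader.peakLive;
    }

    /** No reuse: each reader is released as soon as its field's block is built. */
    static int peakWithoutReuse(int fields) {
        SketchReader.resetCounters();
        for (int f = 0; f < fields; f++) {
            try (SketchReader reader = new SketchReader()) {
                // a real loader would build the Block for this field here
            }
        }
        return SketchReader.peakLive;
    }
}
```

In this model, loading 300 fields with reuse keeps 300 readers open at the peak, while loading without reuse never has more than one open.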

elasticsearchmachine · 2026-02-02T17:01:39Z

Pinging @elastic/es-analytical-engine (Team:Analytics)

nik9000 · 2026-02-02T17:02:11Z

I'm still looking into heap attack tests. I might need the follow-up work to really hit - but checking.

nik9000

OK. A few heap attack tests pass now.

Next I'll see if I can get ColumnAtATimeReader working in the FromMany reader.

martijnvg

Thanks Nik! I think this looks good.

Looks like there are some unit test failures in tests that expect a block loader but now get an IO supplier. This can be fixed by resolving the supplier and then checking the block loader.
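A minimal sketch of that test fix, under loudly stated assumptions: `IoSupplier`, `FakeBlockLoader`, `supplierFor` and `resolve` are all hypothetical stand-ins, not the types the real tests use. The point is only the shape of the change: resolve the supplier first, then assert on the loader it yields.

```java
import java.io.IOException;
import java.io.UncheckedIOException;

final class SupplierResolveSketch {
    /** Hypothetical stand-in for an IO supplier: yields a value, may throw IOException. */
    @FunctionalInterface
    interface IoSupplier<T> {
        T get() throws IOException;
    }

    /** Hypothetical stand-in for a block loader; it only carries the field name. */
    static final class FakeBlockLoader {
        final String field;

        FakeBlockLoader(String field) {
            this.field = field;
        }
    }

    /** In this sketch the code under test hands back a supplier instead of a loader. */
    static IoSupplier<FakeBlockLoader> supplierFor(String field) {
        return () -> new FakeBlockLoader(field);
    }

    /** Test helper: resolve the supplier so assertions can look at the loader itself. */
    static <T> T resolve(IoSupplier<T> supplier) {
        try {
            return supplier.get();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A test that used to assert on the loader directly would call `resolve(...)` on the returned supplier and run the same assertions on the result.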

martijnvg · 2026-02-03T11:58:58Z

...in/esql/src/main/java/org/elasticsearch/xpack/esql/planner/EsPhysicalOperationProviders.java

                s.storedFieldsSequentialProportion()
            )
        );
+        boolean reuseColumnLoaders = fieldExtractExec.attributesToExtract().size() <= context.plannerSettings()


Does this include fields used for sorting, grouping, etc.? I think if sorting does get pushed down, then those fields aren't part of the attributes to extract, similar to when WHERE gets pushed down. In the push-down case, we aren't being slowed down here.

Maybe we could have exceptions for other things, like `_tsid` and `@timestamp` for the TS command. Not in this PR, but maybe in a follow-up.

This is used for almost all loading ESQL has to do - sorts, groups, aggs, returning the column. All of it.

I'm not sure we really need exceptions though. This'll only come up if you need more than 30 fields at a time. Mostly this'll come up for stuff like `FROM foo | LIMIT 10`. Aggs rarely touch 30 fields at a time. But you can make them do it - `testAggManyFieldsNoReuse` does that. But it's kind of a lot.

If you use time series stuff to get the last value of like 50 fields, or the rate of that many fields, then this'll kick in. Probably. I don't know all of the bits y'all have.
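The decision discussed in this thread can be sketched as a single comparison. This is an illustration, not the planner code: `ReuseThresholdSketch` is a hypothetical class, and the value 30 is an assumption taken from the "more than 30 fields" remark above; the real value comes from the `REUSE_COLUMN_LOADERS_THRESHOLD` planner setting.

```java
final class ReuseThresholdSketch {
    /** Assumed value, based only on the "more than 30 fields" remark in the thread. */
    static final int ASSUMED_THRESHOLD = 30;

    /**
     * Mirrors the shape of the comparison in the diff: reuse the column
     * loaders only while the number of extracted attributes is at or
     * below the threshold.
     */
    static boolean reuseColumnLoaders(int attributesToExtract, int threshold) {
        return attributesToExtract <= threshold;
    }
}
```

With that assumed value, an aggregation touching three fields keeps reusing its readers, while a query that extracts 50 fields does not.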

martijnvg · 2026-02-03T11:59:54Z

x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/planner/PlannerSettings.java

+     * the paths that need very high performance don't load more than a handful of fields at a time,
+     * so they <strong>do</strong> reuse fields.
+     */
+    public static final Setting<Integer> REUSE_COLUMN_LOADERS_THRESHOLD = Setting.intSetting(


I agree with this statement.

nik9000 · 2026-02-03T12:40:28Z

> Looks like some unit test failures for test expecting block loader but now get io supplier. This can be fixed by resolving the supplier and then checking to block loader.

👍

We were growing more and more options to `OperatorTestCase.runDriver`. I need another option in #141672 so I built a builder-style test utility. This removes the original methods, migrating all callers.

I've been quite liberal adding utility methods. Those are cheap in a builder-style helper because you don't have to think in terms of combinatorial explosions of parameters - just in terms of "how do I set up all the bits". Now there are ten ways to set the inputs.

It's tempting to make some higher level utility methods that call these - or make the common call sites shorter. You init the most common helper like:

```
new TestDriverRunner().builder(driverContext())
```

But I didn't want it to look like magic. Readers should see this and think, "I can add things before `builder`" and "I can add things to this `builder`."
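To show why utility methods are cheap in a builder-style helper, here is a small sketch. `SketchRunner` and every method on it are hypothetical and are not the `TestDriverRunner` API; integers stand in for pages and simple functions stand in for operators.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

/** Hypothetical builder-style runner; not the real TestDriverRunner API. */
final class SketchRunner {
    private final List<Integer> input = new ArrayList<>();
    private final List<UnaryOperator<Integer>> operators = new ArrayList<>();

    /** One way to set inputs: explicit values. */
    SketchRunner input(int... values) {
        for (int v : values) {
            input.add(v);
        }
        return this;
    }

    /** Another way to set inputs; adding it needs no new overload of run(). */
    SketchRunner inputRange(int fromInclusive, int toExclusive) {
        for (int v = fromInclusive; v < toExclusive; v++) {
            input.add(v);
        }
        return this;
    }

    /** Operators are applied to each input in the order they were added. */
    SketchRunner operator(UnaryOperator<Integer> op) {
        operators.add(op);
        return this;
    }

    List<Integer> run() {
        List<Integer> out = new ArrayList<>();
        for (Integer value : input) {
            Integer current = value;
            for (UnaryOperator<Integer> op : operators) {
                current = op.apply(current);
            }
            out.add(current);
        }
        return out;
    }
}
```

A call site reads as `new SketchRunner().inputRange(0, 3).operator(v -> v * 2).run()`. Each new way of supplying input is one extra method rather than another parameter multiplied across every existing overload.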

ESQL: Load many fields column-at-a-time

Adds support for `ColumnAtATimeReader` in the case where we're loading from many segments. This should marginally speed up loading many documents after a top n. More importantly, it lets #141672 kick in when loading from many fields. This should save significant memory when loading thousands of fields after a `| SORT | LIMIT` sequence.

Finally, this changes the rules for `BlockLoader`. Previously you *could* return `null` from `columnAtATimeReader` but must never return `null` from `rowStrideReader`. Now the rule is that you may return `null` from *either* of the two, but not both. This should let us delete a bunch of code. While we're at it, we should add a `read(builder, docs, offset, nullsFiltered)` override to save a copy.
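The new "either may be `null`, but not both" rule can be sketched as follows. The interfaces here are hypothetical, simplified stand-ins and do not match the real `BlockLoader` signatures. Trying the column-at-a-time reader first is also an assumption of this sketch, not something the description states.

```java
final class LoaderContractSketch {
    /** Hypothetical marker for a column-at-a-time reader. */
    interface ColumnReader {}

    /** Hypothetical marker for a row-stride reader. */
    interface RowReader {}

    /** Simplified stand-in for a block loader; either method may return null, not both. */
    interface SketchBlockLoader {
        ColumnReader columnAtATimeReader();

        RowReader rowStrideReader();
    }

    /** Builds a loader that supplies only the requested kinds of reader. */
    static SketchBlockLoader loader(boolean hasColumn, boolean hasRow) {
        return new SketchBlockLoader() {
            @Override
            public ColumnReader columnAtATimeReader() {
                return hasColumn ? new ColumnReader() {} : null;
            }

            @Override
            public RowReader rowStrideReader() {
                return hasRow ? new RowReader() {} : null;
            }
        };
    }

    /** Caller side: accept whichever reader exists, reject a loader with neither. */
    static String chooseReader(SketchBlockLoader loader) {
        if (loader.columnAtATimeReader() != null) {
            return "column-at-a-time"; // assumed preference in this sketch
        }
        if (loader.rowStrideReader() != null) {
            return "row-stride";
        }
        throw new IllegalStateException("a loader must supply at least one reader");
    }
}
```

Under the old rule described above, the second branch could never be skipped because `rowStrideReader` was always non-null; under the new rule the caller has to handle both absences and treat "neither" as a contract violation.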

nik9000 added 2 commits February 2, 2026 11:38

Merge branch 'main' into esql_supply_block_loader

75d841c

nik9000 added >bug :Analytics/ES|QL AKA ESQL v9.4.0 labels Feb 2, 2026

nik9000 requested a review from martijnvg February 2, 2026 17:01

elasticsearchmachine added the Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) label Feb 2, 2026

elasticsearchmachine and others added 4 commits February 2, 2026 17:09

[CI] Auto commit changes from spotless

ccf7ff1

Enable!

347db62

Merge remote-tracking branch 'nik9000/esql_supply_block_loader' into …

bf58e09

…esql_supply_block_loader

Update

6c49fbf

nik9000 commented Feb 2, 2026

martijnvg approved these changes Feb 3, 2026

nik9000 added 2 commits February 3, 2026 07:46

TEsts

baca1e5

Merge branch 'main' into esql_supply_block_loader

d0916b7

nik9000 merged commit 4133069 into elastic:main Feb 3, 2026
36 checks passed

nik9000 mentioned this pull request Feb 3, 2026

ESQL: Tools for testing Operators #141778

Merged

jordan-powers added a commit to jordan-powers/elasticsearch that referenced this pull request Feb 3, 2026

Fix build error from merge with elastic#141672

0d06f99

nik9000 mentioned this pull request Feb 5, 2026

ESQL: Load many fields column-at-a-time #141926

Merged

nik9000 merged 8 commits into elastic:main from nik9000:esql_supply_block_loader
