@Swiddis Swiddis commented Sep 22, 2025

Description

In local benchmarking of merge operations, I saw that we were spending a lot of time waiting on synchronous batch fetches across both indices.

Because of the PIT-based design we can't parallelize page fetches directly, but one low-hanging improvement is to start fetching the next batch as soon as we receive the current one, so that by the time we need the next batch it's already partially loaded. This cuts enumerated merge times by ~40%.
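The pipelining idea can be sketched roughly as follows. This is a minimal illustration rather than the PR's actual code; the batch supplier and driver loop are hypothetical stand-ins for the real PIT page fetch:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Supplier;

public class PrefetchDemo {
    // Drain all batches, launching the next fetch before processing the
    // current batch. fetchBatch returns null once the source is exhausted.
    static <T> List<T> consumeAll(Supplier<T> fetchBatch, ExecutorService pool) {
        List<T> results = new ArrayList<>();
        CompletableFuture<T> next = CompletableFuture.supplyAsync(fetchBatch, pool);
        while (true) {
            T batch = next.join();                                  // wait for the in-flight fetch
            if (batch == null) break;                               // no more batches
            next = CompletableFuture.supplyAsync(fetchBatch, pool); // prefetch the next one
            results.add(batch);                                     // "process" the current batch
        }
        return results;
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        int[] counter = {0};
        Supplier<Integer> fakeFetch = () -> counter[0] < 3 ? ++counter[0] : null;
        System.out.println(consumeAll(fakeFetch, pool)); // prints [1, 2, 3]
        pool.shutdown();
    }
}
```

Because PIT pages must be requested in order, only one fetch is ever in flight; the win comes from overlapping that fetch with processing of the batch already in hand.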

To implement this safely, this PR needs to do a few things:

  • Register a new thread pool that has authentication context (we can't run background threads without this)
    • See the SQLPlugin.java changes. I also fixed our thread configuration settings.
    • We need a separate pool because we'd hang the worker pool if it has only one thread.
  • Safely handle whether or not we have a NodeClient within the Calcite enumeration inner loop
    • This is the interface change in OpenSearchClient.java; I made several plumbing changes around that update.
  • Actually implement the background scanner, with a fallback to synchronous scanning if we're missing node context (BackgroundSearchScanner.java)
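A rough sketch of the fallback behavior from the last bullet. The class shape and names here are hypothetical, not the actual BackgroundSearchScanner.java; the point is just that when no dedicated pool (i.e. no node context) is available, the scanner degrades to a plain synchronous fetch:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.function.Supplier;

// Hypothetical sketch: prefetch pages on a dedicated pool when one is
// available, otherwise fetch synchronously on the calling thread.
public class FallbackScanner<T> {
    private final Supplier<T> fetch;       // one synchronous page fetch
    private final ExecutorService pool;    // null when we lack node context
    private CompletableFuture<T> inFlight; // the prefetched page, if any

    public FallbackScanner(Supplier<T> fetch, ExecutorService pool) {
        this.fetch = fetch;
        this.pool = pool;
    }

    public T nextBatch() {
        if (pool == null) {
            return fetch.get(); // synchronous fallback
        }
        if (inFlight == null) { // first call: nothing prefetched yet
            inFlight = CompletableFuture.supplyAsync(fetch, pool);
        }
        T batch = inFlight.join();
        inFlight = CompletableFuture.supplyAsync(fetch, pool); // prefetch next page
        return batch;
    }
}
```

Note that this sketch leaves one speculative fetch in flight after the consumer stops; a real implementation would need to cancel or drain it on close.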

Some alternatives for the long term:

In draft pending testing.

Related Issues

N/A

Check List

  • New functionality includes testing.
  • New functionality has been documented.
  • New functionality has javadoc added.
  • New functionality has a user manual doc added.
  • New PPL command checklist all confirmed.
  • API changes companion pull request created.
  • Commits are signed per the DCO using --signoff or -s.
  • Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@Swiddis Swiddis added the enhancement (New feature or request) label Sep 22, 2025

Swiddis commented Sep 23, 2025

Security IT failures are confusing me here -- they're all failing consistently, but the changed code doesn't show up in any of the stack traces.


Swiddis commented Sep 25, 2025

Some additional testing info:

I took 5 million records from the big5 benchmarking dataset and compared current mainline against this change.

First, as sanity, the results are the same for one of the queries requiring a full index enumeration:

source = big5
| eval range_bucket = case(
   `metrics.size` < -10, 'range_1',
   `metrics.size` >= -10 and `metrics.size` < 10, 'range_2',
   `metrics.size` >= 10 and `metrics.size` < 100, 'range_3',
   `metrics.size` >= 100 and `metrics.size` < 1000, 'range_4',
   `metrics.size` >= 1000 and `metrics.size` < 2000, 'range_5',
   `metrics.size` >= 2000, 'range_6')
| stats count() by range_bucket, span(`@timestamp`, 1h) as auto_span
| sort + range_bucket, + auto_span

Current mainline:

fetched rows / total rows = 48/48
+---------+---------------------+--------------+
| count() | auto_span           | range_bucket |
|---------+---------------------+--------------|
| 122464  | 2022-12-31 16:00:00 | range_5      |
| 121585  | 2022-12-31 17:00:00 | range_5      |
| 122052  | 2022-12-31 18:00:00 | range_5      |
| 122220  | 2022-12-31 19:00:00 | range_5      |
| 122163  | 2022-12-31 20:00:00 | range_5      |
| 121840  | 2022-12-31 21:00:00 | range_5      |
| 121606  | 2022-12-31 22:00:00 | range_5      |
| 121889  | 2022-12-31 23:00:00 | range_5      |
| 121088  | 2023-01-01 00:00:00 | range_5      |
| 121943  | 2023-01-01 01:00:00 | range_5      |

After update:

fetched rows / total rows = 48/48
+---------+---------------------+--------------+
| count() | auto_span           | range_bucket |
|---------+---------------------+--------------|
| 122464  | 2022-12-31 16:00:00 | range_5      |
| 121585  | 2022-12-31 17:00:00 | range_5      |
| 122052  | 2022-12-31 18:00:00 | range_5      |
| 122220  | 2022-12-31 19:00:00 | range_5      |
| 122163  | 2022-12-31 20:00:00 | range_5      |
| 121840  | 2022-12-31 21:00:00 | range_5      |
| 121606  | 2022-12-31 22:00:00 | range_5      |
| 121889  | 2022-12-31 23:00:00 | range_5      |
| 121088  | 2023-01-01 00:00:00 | range_5      |
| 121943  | 2023-01-01 01:00:00 | range_5      |

Second, I wanted to benchmark and check for impact. I had already tested with joins (~40% faster), but for non-joins we potentially pay overhead for nothing.

For the slowest big5 queries (BG fetches on the left, sync fetches on the right), we see slight perf gains:
[image: benchmark comparison for the slowest big5 queries]

For the fastest ones, performance is approximately the same (some minor latency and throughput diffs, but I'm not confident these aren't just random variation):
[image: benchmark comparison for the fastest big5 queries]

Signed-off-by: Simeon Widdis <[email protected]>
@Swiddis Swiddis removed the v3.3.0 label Sep 30, 2025
@Swiddis Swiddis added the calcite (calcite migration related) label Oct 3, 2025

Swiddis commented Oct 7, 2025

It turns out I flipped the benchmark in my head, so this is an overall regression -- going to put it back in draft and figure out a better approach.

@Swiddis Swiddis marked this pull request as draft October 7, 2025 23:42
@Swiddis Swiddis marked this pull request as ready for review November 14, 2025 21:58
@opensearch-trigger-bot

This PR is stalled because it has been open for 2 weeks with no activity.

@Swiddis Swiddis merged commit d28c226 into opensearch-project:main Dec 1, 2025
38 checks passed
@Swiddis Swiddis deleted the feature/pre-fetch-batches-in-enumeration branch December 1, 2025 18:46
@opensearch-trigger-bot

The backport to 2.19-dev failed:

The process '/usr/bin/git' failed with exit code 128

To backport manually, run these commands in your terminal:

# Navigate to the root of your repository
cd $(git rev-parse --show-toplevel)
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/sql/backport-2.19-dev 2.19-dev
# Navigate to the new working tree
pushd ../.worktrees/sql/backport-2.19-dev
# Create a new branch
git switch --create backport/backport-4345-to-2.19-dev
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 d28c226140f1b98db4ed8e8d76d6451f2072273f
# Push it to GitHub
git push --set-upstream origin backport/backport-4345-to-2.19-dev
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/sql/backport-2.19-dev

Then, create a pull request where the base branch is 2.19-dev and the compare/head branch is backport/backport-4345-to-2.19-dev.

@LantaoJin

Turns out I flipped the benchmark in my head, so this is overall a regression -- going to put back in draft and figure out a better approach

@Swiddis, do you mean the current implementation has a performance regression? If so, why was the PR merged? If there is no regression, please backport it to 2.19-dev, since the backport of #4884 is blocked on this one.


Swiddis commented Dec 8, 2025

I couldn't get the regression to reproduce reliably, and in further tests the regression was smaller than the gains, so I wanted to see what the diff was in the OSB benchmarks. I don't see any benchmark diff since merging.

I wasn't planning to backport this originally since it's still largely experimental, but I can open the PR at least.

Swiddis added a commit to Swiddis/sql that referenced this pull request Dec 8, 2025
@LantaoJin LantaoJin added backport-manually Filed a PR to backport manually. and removed stalled labels Dec 10, 2025