Conversation

@sunqijun1
Contributor

Description

Currently, OpenSearch validates search slice parameters in the wrong place. The slice count is checked during execution on each shard, so when validation fails, the exception is thrown on the shard. This is a relatively slow way to detect a bad request.

We should therefore move the validation of slice parameters to the coordinator node so that issues are detected earlier.
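
To illustrate the idea, here is a minimal sketch of a coordinator-side check, not the code in this PR. The class `SliceValidationSketch`, its methods, and the limit value `ASSUMED_MAX_SLICES` are all hypothetical names chosen for the example; a real implementation would presumably read the limit from index settings. The point is only that the slice count is rejected before any per-slice work is allocated or sent to a shard.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: none of these names are OpenSearch APIs.
final class SliceValidationSketch {
    // Assumed placeholder limit; a real limit would come from configuration.
    static final int ASSUMED_MAX_SLICES = 1024;

    private SliceValidationSketch() {}

    // Rejects an out-of-range slice count up front, on the coordinator.
    static void validateSlices(int requestedSlices, int maxSlices) {
        if (requestedSlices < 1) {
            throw new IllegalArgumentException(
                "slices must be at least 1 but was [" + requestedSlices + "]");
        }
        if (requestedSlices > maxSlices) {
            throw new IllegalArgumentException(
                "slices [" + requestedSlices + "] exceeds the limit [" + maxSlices + "]");
        }
    }

    // Allocates one entry per slice only after validation has passed,
    // so an absurdly large value never drives a large allocation.
    static List<String> planSlices(int requestedSlices, int maxSlices) {
        validateSlices(requestedSlices, maxSlices);
        List<String> plan = new ArrayList<>(requestedSlices);
        for (int id = 0; id < requestedSlices; id++) {
            plan.add("slice " + id + " of " + requestedSlices);
        }
        return plan;
    }
}
```

With this ordering, a call such as `SliceValidationSketch.planSlices(Integer.MAX_VALUE, SliceValidationSketch.ASSUMED_MAX_SLICES)` fails with an `IllegalArgumentException` before anything is allocated, instead of failing later on a shard.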

Related Issues

Resolves #18963

Check List

  • Functionality includes testing.
  • API changes companion pull request created, if applicable.
  • Public documentation issue/PR created, if applicable.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@sunqijun1 sunqijun1 requested a review from a team as a code owner August 7, 2025 13:39
@github-actions github-actions bot added the bug and Other labels Aug 7, 2025
@sunqijun1 sunqijun1 force-pushed the dev/fix_large_slices branch 2 times, most recently from 97c1aa7 to f42bf6f Compare August 7, 2025 13:43
@github-actions
Contributor

github-actions bot commented Aug 7, 2025

❌ Gradle check result for f42bf6f: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@sunqijun1 sunqijun1 force-pushed the dev/fix_large_slices branch from f42bf6f to 955ccc2 Compare August 7, 2025 13:59
@github-actions
Contributor

github-actions bot commented Aug 7, 2025

❌ Gradle check result for 955ccc2: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@sunqijun1 sunqijun1 closed this Aug 8, 2025
@sunqijun1 sunqijun1 reopened this Aug 8, 2025
@sunqijun1 sunqijun1 force-pushed the dev/fix_large_slices branch from 955ccc2 to a8be685 Compare August 8, 2025 02:47
sunqijun.jun added 3 commits August 8, 2025 11:16
Signed-off-by: sunqijun.jun <[email protected]>
Signed-off-by: sunqijun.jun <[email protected]>
@sunqijun1 sunqijun1 force-pushed the dev/fix_large_slices branch from d1db8cb to ab6e2e4 Compare August 8, 2025 03:16
@github-actions
Contributor

github-actions bot commented Aug 8, 2025

❌ Gradle check result for ab6e2e4: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@sunqijun1 sunqijun1 closed this Aug 8, 2025
@sunqijun1 sunqijun1 reopened this Aug 8, 2025
@github-actions
Contributor

github-actions bot commented Aug 8, 2025

❌ Gradle check result for ab6e2e4: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@github-actions
Contributor

github-actions bot commented Aug 8, 2025

✅ Gradle check result for ab6e2e4: SUCCESS

@codecov

codecov bot commented Aug 8, 2025

Codecov Report

❌ Patch coverage is 90.90909% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 72.94%. Comparing base (7c1052f) to head (f4ce31a).
⚠️ Report is 3 commits behind head on main.

Files with missing lines | Patch % | Lines
...dex/reindex/BulkByScrollParallelizationHelper.java | 87.50% | 0 Missing and 1 partial ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##               main   #18964      +/-   ##
============================================
+ Coverage     72.82%   72.94%   +0.11%     
- Complexity    69677    69772      +95     
============================================
  Files          5658     5658              
  Lines        320099   320108       +9     
  Branches      46348    46350       +2     
============================================
+ Hits         233110   233491     +381     
+ Misses        68088    67749     -339     
+ Partials      18901    18868      -33     

@sunqijun1 sunqijun1 requested a review from kkewwei August 9, 2025 12:27
@github-actions
Contributor

github-actions bot commented Aug 9, 2025

❌ Gradle check result for 2c7897f: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@sunqijun1 sunqijun1 closed this Aug 9, 2025
@github-actions
Contributor

github-actions bot commented Aug 9, 2025

✅ Gradle check result for 2c7897f: SUCCESS

@kkewwei
Contributor

kkewwei commented Aug 19, 2025

@ankitkala I apologize for taking up your time, but I’m genuinely unsure who else to ask for a review. As I noticed you’ve modified the reindex code in the past, I’d be extremely grateful if you could review it at your convenience. Thank you so much for your help.

@github-actions github-actions bot added the Indexing label Aug 19, 2025
@github-actions
Contributor

❕ Gradle check result for 4bfd6ec: UNSTABLE

Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure.

@github-actions
Contributor

github-actions bot commented Sep 5, 2025

❌ Gradle check result for bbf2503: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@github-actions
Contributor

github-actions bot commented Sep 9, 2025

✅ Gradle check result for f4ce31a: SUCCESS

Contributor

@kkewwei kkewwei left a comment

LGTM

@kkewwei kkewwei merged commit 579830d into opensearch-project:main Sep 10, 2025
31 checks passed
jainankitk pushed a commit to jainankitk/OpenSearch that referenced this pull request Sep 22, 2025
…OutOfMemoryError on coordinator (opensearch-project#18964)

* bugfix for too much slices cause jvm oom

Signed-off-by: sunqijun.jun <[email protected]>

* add changelogs

Signed-off-by: sunqijun.jun <[email protected]>

* fix spotlessApply

Signed-off-by: sunqijun.jun <[email protected]>

---------

Signed-off-by: sunqijun.jun <[email protected]>
Signed-off-by: kkewwei <[email protected]>
Co-authored-by: sunqijun.jun <[email protected]>
Co-authored-by: kkewwei <[email protected]>
jainankitk pushed a commit to jainankitk/OpenSearch that referenced this pull request Sep 22, 2025
…OutOfMemoryError on coordinator (opensearch-project#18964)

* bugfix for too much slices cause jvm oom

Signed-off-by: sunqijun.jun <[email protected]>

* add changelogs

Signed-off-by: sunqijun.jun <[email protected]>

* fix spotlessApply

Signed-off-by: sunqijun.jun <[email protected]>

---------

Signed-off-by: sunqijun.jun <[email protected]>
Signed-off-by: kkewwei <[email protected]>
Co-authored-by: sunqijun.jun <[email protected]>
Co-authored-by: kkewwei <[email protected]>
Signed-off-by: Ankit Jain <[email protected]>
asimmahmood1 pushed a commit to jainankitk/OpenSearch that referenced this pull request Sep 23, 2025
…OutOfMemoryError on coordinator (opensearch-project#18964)
pranikum pushed a commit to pranikum/OpenSearch that referenced this pull request Sep 23, 2025
…OutOfMemoryError on coordinator (opensearch-project#18964)
vinaykpud pushed a commit to vinaykpud/OpenSearch that referenced this pull request Sep 26, 2025
…OutOfMemoryError on coordinator (opensearch-project#18964)

Labels

bug (Something isn't working), Indexing (Indexing, Bulk Indexing and anything related to indexing), Other

Development

Successfully merging this pull request may close these issues.

[BUG] Using an excessively large reindex slice can lead to a JVM OutOfMemoryError on coordinator

2 participants