fix bug of warm index : FullFileCachedIndexInput was repeatedly closed#20055
Conversation
|
❌ Gradle check result for dacd3a5: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
dacd3a5 to
c3fc0f2
Compare
|
❕ Gradle check result for c3fc0f2: UNSTABLE Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure. |
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #20055 +/- ##
============================================
- Coverage 73.29% 73.26% -0.03%
+ Complexity 71780 71743 -37
============================================
Files 5795 5795
Lines 328297 328302 +5
Branches 47282 47283 +1
============================================
- Hits 240612 240523 -89
- Misses 68368 68431 +63
- Partials 19317 19348 +31 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
server/src/main/java/org/opensearch/index/store/remote/filecache/FileCachedIndexInput.java
Show resolved
Hide resolved
server/src/main/java/org/opensearch/index/store/remote/filecache/FullFileCachedIndexInput.java
Show resolved
Hide resolved
844e7be to
7014a56
Compare
WalkthroughConverted the Changes
Sequence DiagramsequenceDiagram
participant Test
participant FullInput as FullFileCachedIndexInput
participant Holder as IndexInputHolder
participant Cleaner as CleanerRunnable
participant Atomic as closed (AtomicBoolean)
Note over Test,FullInput: Clone/close lifecycle with cleaner coordination
Test->>FullInput: clone() x3
FullInput->>Atomic: closed.get() -> false
FullInput->>FullInput: increment refCount for clones
Test->>FullInput: clone1.close()
FullInput->>Atomic: closed.get() -> false
FullInput->>Atomic: closed.set(true)
FullInput->>Holder: schedule/hold cleanup
Holder->>Cleaner: run()
Cleaner->>Atomic: closed.get()
alt not already cleaned
Cleaner->>FullInput: deRef() (decrement refCount)
Cleaner->>Atomic: closed.set(true)
end
Test->>FullInput: indexInputHolderRun() on clone1
FullInput->>Holder: run() (explicit)
Holder->>Atomic: closed.get() -> true
Holder->>Cleaner: no-op if already cleaned
Test->>FullInput: close remaining clones -> refCount -> 0
Cleaner->>FullInput: final cleanup when refs reach 0
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes
Poem
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches🧪 Generate unit tests (beta)
📜 Recent review detailsConfiguration used: CodeRabbit UI Review profile: CHILL Plan: Pro 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (20)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (1)
server/src/main/java/org/opensearch/index/store/remote/filecache/FullFileCachedIndexInput.java (1)
101-106: Consider adding@VisibleForTestingannotation or reducing visibility.This method is intended for test use only, but it's public. Consider adding an annotation to make this intent explicit:
+ import org.opensearch.common.annotation.VisibleForTesting; ... /** - * Run resource cleaning,To be used only in test + * Run resource cleaning. To be used only in tests. */ + @VisibleForTesting public void indexInputHolderRun() { indexInputHolder.run(); }Alternatively, since the test class is in the same package, this could be made package-private.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
CHANGELOG.md(1 hunks)server/src/main/java/org/opensearch/index/store/remote/filecache/FileCachedIndexInput.java(4 hunks)server/src/main/java/org/opensearch/index/store/remote/filecache/FullFileCachedIndexInput.java(5 hunks)server/src/test/java/org/opensearch/index/store/remote/filecache/FullFileCachedIndexInputTests.java(1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (20)
- GitHub Check: gradle-check
- GitHub Check: precommit (21, windows-2025, true)
- GitHub Check: precommit (25, macos-15-intel)
- GitHub Check: precommit (25, ubuntu-24.04-arm)
- GitHub Check: precommit (25, windows-latest)
- GitHub Check: precommit (21, ubuntu-24.04-arm)
- GitHub Check: precommit (25, macos-15)
- GitHub Check: precommit (25, ubuntu-latest)
- GitHub Check: precommit (21, ubuntu-latest)
- GitHub Check: precommit (21, macos-15)
- GitHub Check: precommit (21, windows-latest)
- GitHub Check: precommit (21, macos-15-intel)
- GitHub Check: Analyze (java)
- GitHub Check: assemble (21, ubuntu-latest)
- GitHub Check: assemble (25, ubuntu-24.04-arm)
- GitHub Check: assemble (21, ubuntu-24.04-arm)
- GitHub Check: assemble (25, windows-latest)
- GitHub Check: assemble (25, ubuntu-latest)
- GitHub Check: assemble (21, windows-latest)
- GitHub Check: detect-breaking-change
🔇 Additional comments (4)
CHANGELOG.md (1)
93-93: LGTM!The changelog entry is correctly placed under the Fixed section with proper formatting and issue reference.
server/src/main/java/org/opensearch/index/store/remote/filecache/FileCachedIndexInput.java (1)
45-45: LGTM!The change from
volatile booleantoAtomicBooleanenables sharing the closed state withIndexInputHolderfor coordinated closure, which is the core fix for preventing duplicatedecRefcalls. Thefinalmodifier appropriately ensures the reference cannot be reassigned.Also applies to: 141-154
server/src/test/java/org/opensearch/index/store/remote/filecache/FullFileCachedIndexInputTests.java (1)
81-113: Well-structured test covering the key bug scenarios.This test effectively validates the fix by verifying:
close()followed byindexInputHolderRun()doesn't double-decrement (lines 96-100)indexInputHolderRun()followed byclose()doesn't double-decrement (lines 103-107)- Normal close behavior decrements correctly (lines 109-112)
This directly validates the prevention of duplicate reference count decrements.
server/src/main/java/org/opensearch/index/store/remote/filecache/FullFileCachedIndexInput.java (1)
41-42: Core fix looks correct.The shared
AtomicBooleanbetweenFullFileCachedIndexInputandIndexInputHolderensures coordinated closure - whichever executes first (close()or the cleaner'srun()) will setclosed=true, preventing the other from executing duplicate cleanup anddecRefcalls. This correctly addresses the repeated close bug.Also applies to: 115-116, 126-130
|
@gbbafna can you please review and merge this if it looks good |
Signed-off-by: Yongheng Liu <liuyonghengheng@gmail.com>
Signed-off-by: Andrew Ross <andrross@amazon.com>
9b97870 to
0c0f955
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
CHANGELOG.md(1 hunks)server/src/main/java/org/opensearch/index/store/remote/filecache/FileCachedIndexInput.java(4 hunks)server/src/main/java/org/opensearch/index/store/remote/filecache/FullFileCachedIndexInput.java(5 hunks)server/src/test/java/org/opensearch/index/store/remote/filecache/FullFileCachedIndexInputTests.java(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- server/src/test/java/org/opensearch/index/store/remote/filecache/FullFileCachedIndexInputTests.java
- CHANGELOG.md
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (20)
- GitHub Check: gradle-check
- GitHub Check: precommit (21, macos-15-intel)
- GitHub Check: precommit (21, ubuntu-latest)
- GitHub Check: precommit (21, windows-2025, true)
- GitHub Check: precommit (25, ubuntu-24.04-arm)
- GitHub Check: precommit (21, windows-latest)
- GitHub Check: precommit (21, ubuntu-24.04-arm)
- GitHub Check: precommit (25, windows-latest)
- GitHub Check: precommit (25, macos-15-intel)
- GitHub Check: precommit (25, ubuntu-latest)
- GitHub Check: precommit (25, macos-15)
- GitHub Check: precommit (21, macos-15)
- GitHub Check: assemble (21, windows-latest)
- GitHub Check: assemble (25, ubuntu-latest)
- GitHub Check: assemble (25, windows-latest)
- GitHub Check: assemble (25, ubuntu-24.04-arm)
- GitHub Check: assemble (21, ubuntu-24.04-arm)
- GitHub Check: assemble (21, ubuntu-latest)
- GitHub Check: detect-breaking-change
- GitHub Check: Analyze (java)
🔇 Additional comments (6)
server/src/main/java/org/opensearch/index/store/remote/filecache/FileCachedIndexInput.java (2)
16-16: LGTM: AtomicBoolean field declaration.The conversion from
volatile booleantoAtomicBooleanprovides the necessary coordination between explicit close calls and the cleaner-invoked cleanup path, preventing duplicate closure and double reference decrements.Also applies to: 45-45
143-143: LGTM: Atomic closure guard.The close method correctly uses
closed.get()to check state andclosed.set(true)to mark completion, ensuring proper visibility across threads.Also applies to: 152-152
server/src/main/java/org/opensearch/index/store/remote/filecache/FullFileCachedIndexInput.java (4)
21-21: LGTM: Shared closure state.Passing the
closedAtomicBoolean to IndexInputHolder enables proper coordination between explicit close and cleaner-triggered cleanup, preventing the double-close bug.Also applies to: 41-41
87-87: LGTM: Coordinated closure with cleaner.The close method now uses the shared
closedAtomicBoolean, ensuring that either this explicit close or the cleaner's cleanup executes the decRef logic, but not both.Also applies to: 97-97
109-109: LGTM: IndexInputHolder state coordination.Storing the shared
closedAtomicBoolean enables the cleaner runnable to coordinate with explicit close calls, correctly implementing the fix for duplicate closure.Also applies to: 115-121
126-130: LGTM: Cleaner cleanup guard.The guard correctly prevents duplicate cleanup when the object becomes phantom reachable, ensuring
indexInput.close()andcache.decRef()execute at most once across both the cleaner and explicit close paths.
server/src/main/java/org/opensearch/index/store/remote/filecache/FullFileCachedIndexInput.java
Show resolved
Hide resolved
|
❌ Gradle check result for 0c0f955: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
|
Changes LGTM . Thanks @liuyonghengheng for this and @rayshrey for the review. Will merge once the build passes . |
Signed-off-by: Gaurav Bafna <85113518+gbbafna@users.noreply.github.com>
|
❌ Gradle check result for 033d8f2: null Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
|
❌ Gradle check result for 033d8f2: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
opensearch-project#20055) Signed-off-by: Yongheng Liu <liuyonghengheng@gmail.com>
opensearch-project#20055) Signed-off-by: Yongheng Liu <liuyonghengheng@gmail.com>
Description
check closed value in function FullFileCachedIndexInput.IndexInputHolder.run(),prevent FileCachedIndexInput closed repeatedly and duplicate deref the reference count of cache entry
Related Issues
Resolves #20054 #20054
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.
Summary by CodeRabbit
Bug Fixes
Tests
Documentation
✏️ Tip: You can customize this high-level summary in your review settings.