Make CacheService.get() throw AlreadyClosedException when service is stopped #122006
Merged
tlrx merged 3 commits into elastic:main on Feb 10, 2025
Conversation
This was caught thanks to elastic#121210: if shard files are verified/checksummed while the node is stopping, an IllegalStateException is thrown by CacheService.get() when it attempts to read data from the cache. This exception then caused the verification to fail and the Lucene index to be marked as corrupted (which now fails for searchable snapshot shards, since they are read-only and should not be corrupted at all).

This pull request changes ensureLifecycleStarted(), which is called during CacheService.get(), to throw an AlreadyClosedException when the service is stopped (note that ACE extends IllegalStateException, which is convenient here). This ACE will later be handled specially in the checksumIndex method so that the shard is not marked as corrupted (see elastic#121210).

Closes elastic#121927
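For readers who want the idea in code, here is a minimal, self-contained sketch of the behaviour described above. It is not the actual Elasticsearch change: `SketchCacheService`, `SketchVerifier`, the `State` enum, the exception messages, and the local `AlreadyClosedException` stand-in (for Lucene's `org.apache.lucene.store.AlreadyClosedException`) are all hypothetical names chosen for illustration, and the CLOSED check mirrors the follow-up commit referenced later in this thread.

```java
// Stand-in for Lucene's AlreadyClosedException, which extends IllegalStateException.
// Declared locally only so this sketch compiles without Lucene on the classpath.
class AlreadyClosedException extends IllegalStateException {
    AlreadyClosedException(String message) {
        super(message);
    }
}

// Hypothetical, heavily simplified cache service with a lifecycle state.
class SketchCacheService {
    enum State { INITIALIZED, STARTED, STOPPED, CLOSED }

    private volatile State state = State.INITIALIZED;

    void start() { state = State.STARTED; }

    void stop() { state = State.STOPPED; }

    // Called at the beginning of get(). A stopped (or closed) service reports
    // AlreadyClosedException rather than a plain IllegalStateException, so
    // callers can tell "node is shutting down" apart from other failures.
    private void ensureLifecycleStarted() {
        State current = state;
        if (current == State.STOPPED || current == State.CLOSED) {
            throw new AlreadyClosedException("cache service is " + current);
        }
        if (current != State.STARTED) {
            throw new IllegalStateException("cache service is not started [" + current + "]");
        }
    }

    String get(String key) {
        ensureLifecycleStarted();
        return "value-for-" + key; // placeholder for the real cache lookup
    }
}

// Hypothetical caller standing in for the verification step: it decides
// whether a failed read should lead to the shard being marked as corrupted.
class SketchVerifier {
    static boolean shouldMarkCorrupted(SketchCacheService cache) {
        try {
            cache.get("segments_1");
            return false; // read succeeded, nothing to report
        } catch (AlreadyClosedException e) {
            return false; // service stopped: a shutdown, not corruption
        } catch (IllegalStateException e) {
            return true;  // simplification: any other failure counts as corruption
        }
    }

    public static void main(String[] args) {
        SketchCacheService cache = new SketchCacheService();
        cache.start();
        System.out.println("started, mark corrupted: " + shouldMarkCorrupted(cache));
        cache.stop();
        System.out.println("stopped, mark corrupted: " + shouldMarkCorrupted(cache));
    }
}
```

Because the stand-in exception extends IllegalStateException, existing callers that catch IllegalStateException keep working, while a caller that needs to distinguish shutdown can catch the more specific type first, as the sketch does.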
Collaborator: Pinging @elastic/es-distributed-indexing (Team:Distributed Indexing)
Member (Author): Thanks Francisco
arteam added a commit to arteam/elasticsearch that referenced this pull request on Feb 18, 2025
…alSearchableSnapshot` The underlying failure `java.lang.AssertionError: Searchable snapshot directory does not support the operation [createOutput` was fixed in elastic#122006. The automation bot was too aggressive in re-opening this issue. Resolve elastic#122693
arteam added a commit that referenced this pull request on Feb 18, 2025
…alSearchableSnapshot` (#122831)
* Unmute `FrozenSearchableSnapshotsIntegTests#testCreateAndRestorePartialSearchableSnapshot`
  The underlying failure `java.lang.AssertionError: Searchable snapshot directory does not support the operation [createOutput` was fixed in #122006. The automation bot was too aggressive in re-opening this issue.
  Resolve #122693
* Add a check for the CLOSED state along with STOPPED
* Update x-pack/plugin/searchable-snapshots/src/main/java/org/elasticsearch/xpack/searchablesnapshots/cache/full/CacheService.java
  Co-authored-by: Tanguy Leroux <tlrx.dev@gmail.com>
---------
Co-authored-by: Tanguy Leroux <tlrx.dev@gmail.com>
ncordon pushed a commit to ncordon/elasticsearch that referenced this pull request on Apr 1, 2026
…e to shutdown (elastic#145209) This is similar to previous fixes related to this test (e.g. elastic#122006), where restarting nodes causes exceptions that lead the upper layers, during recovery, to think the data is corrupted because we are not able to read from/via the cache.