Allow allocation to replacement target node on vacate completion #140150
Merged
elasticsearchmachine merged 6 commits into elastic:main on Jan 7, 2026
Today, shards other than those from the replacement source node cannot be allocated to the replacement target node as long as the shutdown record exists. This is unnecessary: allocation should be valid as soon as the source node finishes vacating, which is often earlier than the removal of the shutdown record. This PR fixes it by allowing such allocation. It also handles the rare case where the source node gets disconnected while vacating. In that case, the target node prioritizes unassigned shards from the source node and allows other shards once these shards are all assigned. Resolves: elastic#139897
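To make the rule described above concrete, here is a minimal sketch of the decision as a plain Java model. It is not the code in this PR and it does not use any Elasticsearch classes: `ReplacementTargetSketch`, `Shard`, `canAllocateToTarget`, and the `lastNode` field are all hypothetical names chosen for illustration, and the shutdown record itself is assumed to exist for the whole check.

```java
import java.util.List;

// Hypothetical, simplified model; none of these names are Elasticsearch classes.
final class ReplacementTargetSketch {

    /** A shard copy: currentNode is null when unassigned; lastNode is where it last lived. */
    static final class Shard {
        final String id;
        final String currentNode;
        final String lastNode;

        Shard(String id, String currentNode, String lastNode) {
            this.id = id;
            this.currentNode = currentNode;
            this.lastNode = lastNode;
        }

        boolean isUnassigned() {
            return currentNode == null;
        }
    }

    /** Decides whether a shard may be allocated to the replacement target of sourceNode. */
    static boolean canAllocateToTarget(Shard shard, String sourceNode, List<Shard> allShards) {
        // Shards coming from the source node are always allowed on the target,
        // whether still assigned there or left unassigned when the source disconnected.
        boolean assignedOnSource = sourceNode.equals(shard.currentNode);
        boolean orphanedFromSource = shard.isUnassigned() && sourceNode.equals(shard.lastNode);
        if (assignedOnSource || orphanedFromSource) {
            return true;
        }
        // Other shards wait until the source node holds no shards (vacate complete) ...
        boolean sourceStillVacating = allShards.stream()
            .anyMatch(s -> sourceNode.equals(s.currentNode));
        // ... and until no unassigned shard that came from the source is still pending.
        boolean orphansPending = allShards.stream()
            .anyMatch(s -> s.isUnassigned() && sourceNode.equals(s.lastNode));
        return !sourceStillVacating && !orphansPending;
    }

    public static void main(String[] args) {
        Shard other = new Shard("b", "node-3", "node-3");

        // Source node still holds shard "a": other shards are rejected.
        Shard moving = new Shard("a", "source", "source");
        System.out.println(canAllocateToTarget(other, "source", List.of(moving, other))); // false

        // Shard "a" has moved to the target: other shards are now accepted.
        Shard moved = new Shard("a", "target", "source");
        System.out.println(canAllocateToTarget(other, "source", List.of(moved, other))); // true
    }
}
```

In this model the shutdown record never appears in the condition, which is the point of the change: the target node opens up to other shards based on the state of the source node's shards, not on when the record is removed.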
Collaborator
Hi @ywangd, I've created a changelog YAML for you.
Collaborator
Pinging @elastic/es-distributed-coordination (Team:Distributed Coordination)
nicktindall
approved these changes
Jan 5, 2026
        null
    );
    assertTrue(safeGet(client().execute(PutShutdownNodeAction.INSTANCE, putShutdownRequest)).isAcknowledged());
    ensureGreen(index1);
Contributor
Are these two lines sufficient to know that the shard will certainly be relocated by the time we findIdOfNodeWithIndex? I would have thought there would be some asynchrony to that?
Member
Author
Yeah you are right. I meant to revert the change of this test class since changes for NodeShutdownShardsIT already cover it. But somehow forgot before raising the PR. I have now removed it in 338daa4. Thanks for noticing!
szybia added a commit to szybia/elasticsearch that referenced this pull request Jan 7, 2026
* upstream/main: (191 commits)
  Overall Decision for Deciders prioritizes THROTTLE (elastic#140237)
  Apply group by all logic not only to top-level aggregates (elastic#140248)
  [ES|QL] Refactor MV_UNION and MV_INTERSECTION to use shared set operation helper (elastic#139982)
  Avoid reading entire bloom filter file on reader open (elastic#139374)
  Mark bloom filter files for random access (elastic#139375)
  Ensure that the buffer used for ES93BloomFilterStoredFieldsFormat is zeroed (elastic#139034)
  Add busy assertion to avoid race condition for testStalledShardMigrationProperlyDetected (elastic#140230)
  Remove line number check for testTransitiveFindsDeepCallChain (elastic#140228)
  Allow a slight difference in rescored docs (elastic#139931)
  Mute org.elasticsearch.xpack.inference.integration.AuthorizationTaskExecutorIT testCreatesEisChatCompletion_DoesNotRemoveEndpointWhenNoLongerAuthorized elastic#138480
  Start exchange sink fetchers concurrently (elastic#140196)
  Allow allocation to replacement target node on vacate completion (elastic#140150)
  Ignore JNA cleaner threads in SecureHdfsRepositoryAnalysisRestIT (elastic#139925)
  DeterministicQueue refactor and enhancement (elastic#140151)
  Always error out if CCS expression shows up when CCS is not supported (elastic#139009)
  Use IllegalArgumentException over RepositoryException for readonly-repository checks (elastic#140200)
  Guard promql capabilities in AnalyzerTests (elastic#140232)
  [Inference API] Fix flaky AuthorizationTaskExecutorIT tests (elastic#139978)
  Cleaning up exitable vector value impls (elastic#140190)
  [Inference API] Fix auth exception listener not called bug (elastic#139966)
  ...
sidosera pushed a commit to sidosera/elasticsearch that referenced this pull request Jan 7, 2026
…stic#140150) Today, shards other than those from the replacement source node cannot be allocated to the replacement target node as long as the shutdown record exists. This is unnecessary: allocation should be valid as soon as the source node finishes vacating, which is often earlier than the removal of the shutdown record. This PR fixes it by allowing such allocation. It also handles the rare case where the source node gets disconnected while vacating. In that case, the target node prioritizes unassigned shards from the source node and allows other shards once these shards are all assigned. Resolves: elastic#139897