Allow allocation to replacement target node on vacate completion #140150

Merged
elasticsearchmachine merged 6 commits into elastic:main from ywangd:es-139897-allocation-during-replacement
Jan 7, 2026

@ywangd ywangd commented Jan 5, 2026

Today, the replacement target node rejects allocations of any shard other than those from the replacement source node for as long as the shutdown record exists. This is unnecessary: such allocations should be valid as soon as the source node finishes vacating, which is often earlier than the removal of the shutdown record.

This PR fixes it by allowing such allocations. It also handles the rare case where the source node gets disconnected while vacating. In that case, the target node prioritizes unassigned shards from the source node and allows other shards once those shards are all assigned.

Resolves: #139897
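
The rule described above can be sketched in a few lines of Java. This is only an illustration under simplifying assumptions, not the code in this PR: `ReplacementAllocationSketch`, `Shard`, `comesFromSource` and `canAllocateToTarget` are hypothetical names, and the real allocation decider works on cluster state and shutdown metadata rather than on a plain list of shards.

```java
import java.util.List;

/** Simplified, hypothetical model of the rule; none of these names come from Elasticsearch. */
final class ReplacementAllocationSketch {

    enum Decision { YES, NO }

    /** A shard copy: the node holding it now (null if unassigned) and the node that last held it. */
    static final class Shard {
        final String id;
        final String currentNode; // null means the shard is unassigned
        final String lastNode;

        Shard(String id, String currentNode, String lastNode) {
            this.id = id;
            this.currentNode = currentNode;
            this.lastNode = lastNode;
        }

        boolean isUnassigned() {
            return currentNode == null;
        }
    }

    /** True if the shard is on the source node now, or is unassigned and was last held by it. */
    static boolean comesFromSource(Shard shard, String sourceNode) {
        return sourceNode.equals(shard.currentNode)
            || (shard.isUnassigned() && sourceNode.equals(shard.lastNode));
    }

    /** Decides whether the candidate shard may be allocated to the replacement target node. */
    static Decision canAllocateToTarget(Shard candidate, String sourceNode, List<Shard> allShards) {
        // Shards coming from the source node are always allowed on the target.
        if (comesFromSource(candidate, sourceNode)) {
            return Decision.YES;
        }
        // Any other shard must wait while a source shard is still pending: either the
        // vacate is not finished, or the source dropped out and left unassigned shards.
        for (Shard shard : allShards) {
            if (comesFromSource(shard, sourceNode)) {
                return Decision.NO;
            }
        }
        // Vacate is complete, so the shutdown record no longer needs to block allocation.
        return Decision.YES;
    }

    public static void main(String[] args) {
        Shard onSource = new Shard("s1", "source", "source");
        Shard other = new Shard("s2", "node-3", "node-3");
        System.out.println(canAllocateToTarget(other, "source", List.of(onSource, other)));
    }
}
```

In this model the existence of the shutdown record is deliberately not an input: the decision depends only on whether any shard still belongs to the source node, which is the behaviour the description asks for.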

@ywangd ywangd requested a review from nicktindall January 5, 2026 05:57
@ywangd ywangd added >bug :Distributed/Allocation All issues relating to the decision making around placing a shard (both master logic & on the nodes) v9.4.0 labels Jan 5, 2026
@elasticsearchmachine elasticsearchmachine added the Team:Distributed Coordination (obsolete) Meta label for Distributed Coordination team. Obsolete. Please do not use. label Jan 5, 2026
@elasticsearchmachine

Hi @ywangd, I've created a changelog YAML for you.

@elasticsearchmachine

Pinging @elastic/es-distributed-coordination (Team:Distributed Coordination)

@nicktindall nicktindall left a comment

LGTM! nice one

null
);
assertTrue(safeGet(client().execute(PutShutdownNodeAction.INSTANCE, putShutdownRequest)).isAcknowledged());
ensureGreen(index1);

Are these two lines sufficient to know that the shard will certainly be relocated by the time we findIdOfNodeWithIndex? I would have thought there would be some asynchrony to that?

Member Author

Yeah you are right. I meant to revert the change of this test class since changes for NodeShutdownShardsIT already cover it. But somehow forgot before raising the PR. I have now removed it in 338daa4. Thanks for noticing!
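
The asynchrony concern raised in this thread is usually handled by polling for the expected state instead of asserting it once. A minimal sketch of that pattern follows, assuming a hypothetical helper `awaitTrue` in a hypothetical class `AwaitSketch`; it is not the Elasticsearch test framework and not what this PR did (the test change was removed instead).

```java
import java.time.Duration;
import java.util.function.BooleanSupplier;

/** Hypothetical polling helper; illustrative only. */
final class AwaitSketch {

    /**
     * Polls the condition with a capped exponential back-off until it holds or the
     * timeout elapses. Returns true if the condition held, false otherwise.
     */
    static boolean awaitTrue(BooleanSupplier condition, Duration timeout) {
        long deadline = System.nanoTime() + timeout.toNanos();
        long sleepMillis = 1;
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false; // timed out without seeing the expected state
            }
            try {
                Thread.sleep(sleepMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for the caller
                return false;
            }
            sleepMillis = Math.min(sleepMillis * 2, 100);
        }
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        // Stand-in for "the shard has finished relocating": becomes true after about 50 ms.
        boolean done = awaitTrue(() -> System.nanoTime() - start > 50_000_000L, Duration.ofSeconds(5));
        System.out.println(done);
    }
}
```

A test would pass a condition such as "the shard is now on the expected node" and fail if `awaitTrue` returns false.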

@ywangd ywangd added the auto-merge-without-approval Automatically merge pull request when CI checks pass (NB doesn't wait for reviews!) label Jan 7, 2026
@elasticsearchmachine elasticsearchmachine merged commit acc582a into elastic:main Jan 7, 2026
35 checks passed
@ywangd ywangd deleted the es-139897-allocation-during-replacement branch January 7, 2026 03:18
szybia added a commit to szybia/elasticsearch that referenced this pull request Jan 7, 2026
sidosera pushed a commit to sidosera/elasticsearch that referenced this pull request Jan 7, 2026

Development

Successfully merging this pull request may close these issues.

Node shutdown replacement doesn't account for complete shutdown

3 participants