Unpromotables skip replication and peer recovery#93210
Unpromotables skip replication and peer recovery#93210kingherc merged 25 commits intoelastic:mainfrom
Conversation
For skipping replication: * ReplicationTracker and Group filter promotable * Remove from in sync allocations in metadata Fixes ES-4861 For skipping peer recovery: * Shards pass directly to STARTED skipping intermediate peer recovery stages and messages Fixes ES-5257
|
Hi @kingherc, I've created a changelog YAML for you. |
|
Oh BTW, the source branch should read |
DaveCTurner
left a comment
There was a problem hiding this comment.
Looks great, pretty much how I hoped. I left some comments but nothing structural. I suspect this breaks the refresh API tho, ideally we'd address that here (or beforehand) too.
server/src/internalClusterTest/java/org/elasticsearch/cluster/routing/ShardRoutingRoleIT.java
Show resolved
Hide resolved
server/src/internalClusterTest/java/org/elasticsearch/cluster/routing/ShardRoutingRoleIT.java
Outdated
Show resolved
Hide resolved
server/src/internalClusterTest/java/org/elasticsearch/cluster/routing/ShardRoutingRoleIT.java
Outdated
Show resolved
Hide resolved
server/src/internalClusterTest/java/org/elasticsearch/cluster/routing/ShardRoutingRoleIT.java
Show resolved
Hide resolved
server/src/internalClusterTest/java/org/elasticsearch/cluster/routing/ShardRoutingRoleIT.java
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/index/shard/IndexShard.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/index/shard/IndexShard.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/index/shard/IndexShard.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/indices/recovery/PeerRecoveryTargetService.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/indices/recovery/PeerRecoveryTargetService.java
Outdated
Show resolved
Hide resolved
|
Did not handle yet the feedback, apart from the comment about Refresh. Just pushed a commit that re-enables Refresh for unpromotable shards. Feel free to review that if you'd like as well. I will continue to handle the remaining feedback, and then invite you to review finally. |
DaveCTurner
left a comment
There was a problem hiding this comment.
I left a handful of comments, mostly tiny stuff.
...rc/main/java/org/elasticsearch/action/admin/indices/refresh/TransportShardRefreshAction.java
Outdated
Show resolved
Hide resolved
...rc/main/java/org/elasticsearch/action/admin/indices/refresh/TransportShardRefreshAction.java
Outdated
Show resolved
Hide resolved
...rc/main/java/org/elasticsearch/action/admin/indices/refresh/TransportShardRefreshAction.java
Outdated
Show resolved
Hide resolved
...rc/main/java/org/elasticsearch/action/admin/indices/refresh/TransportShardRefreshAction.java
Show resolved
Hide resolved
...rc/main/java/org/elasticsearch/action/admin/indices/refresh/TransportShardRefreshAction.java
Outdated
Show resolved
Hide resolved
...rg/elasticsearch/action/admin/indices/refresh/TransportUnpromotableReplicaRefreshAction.java
Outdated
Show resolved
Hide resolved
pxsalehi
left a comment
There was a problem hiding this comment.
Thanks Iraklis. Just left some minor comments/questions.
server/src/main/java/org/elasticsearch/action/ActionModule.java
Outdated
Show resolved
Hide resolved
...rc/main/java/org/elasticsearch/action/admin/indices/refresh/TransportShardRefreshAction.java
Show resolved
Hide resolved
...rg/elasticsearch/action/admin/indices/refresh/TransportUnpromotableReplicaRefreshAction.java
Show resolved
Hide resolved
server/src/internalClusterTest/java/org/elasticsearch/cluster/routing/ShardRoutingRoleIT.java
Outdated
Show resolved
Hide resolved
|
Regarding adding a new action: you'd need to add the new action to |
Oh thanks! Totally unaware of this. I do see @DaveCTurner might you know if I should add both [r] and [u] above? |
|
I believe that registering these actions is only necessary for actions that are exposed to a |
|
In fact, I think registering these transport actions is not just unnecessary, it's actively forbidden: |
I'm not sure, I understand this. But I think, if the new internal action is not in |
|
Probably simplest to see what happens if you try it: |
kingherc
left a comment
There was a problem hiding this comment.
Hi all, I handled your comments and this PR is now ready for review. Please feel free to review and also the previous unresolved conversations.
server/src/internalClusterTest/java/org/elasticsearch/cluster/routing/ShardRoutingRoleIT.java
Show resolved
Hide resolved
| shardRoutings.addAll(routingTable.assignedShards()); // include relocation targets | ||
|
|
||
| // include relocation targets | ||
| shardRoutings.addAll(routingTable.assignedShards().stream().filter(ShardRouting::isPromotableToPrimary).toList()); |
There was a problem hiding this comment.
let's just pull out the relocation targets in that loop instead of adding all the shards again.
I now realize that iterating the shards above already contains the relocation targets, right? I am not even sure why the old code added again the assigned shards for relocation targets.
So we do not need this line at all in the end, I will simply remove it. Please tell me if I'm assuming wrong here.
server/src/main/java/org/elasticsearch/index/shard/IndexShard.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/indices/recovery/PeerRecoveryTargetService.java
Outdated
Show resolved
Hide resolved
|
Pinging @elastic/es-distributed (Team:Distributed) |
DaveCTurner
left a comment
There was a problem hiding this comment.
Looks great, there's just one more comment (an easy trap to fall into) and a few more nits.
server/src/internalClusterTest/java/org/elasticsearch/cluster/routing/ShardRoutingRoleIT.java
Outdated
Show resolved
Hide resolved
server/src/internalClusterTest/java/org/elasticsearch/cluster/routing/ShardRoutingRoleIT.java
Outdated
Show resolved
Hide resolved
server/src/internalClusterTest/java/org/elasticsearch/cluster/routing/ShardRoutingRoleIT.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/index/shard/IndexShard.java
Outdated
Show resolved
Hide resolved
.../org/elasticsearch/action/admin/indices/refresh/TransportUnpromotableShardRefreshAction.java
Outdated
Show resolved
Hide resolved
.../org/elasticsearch/action/admin/indices/refresh/TransportUnpromotableShardRefreshAction.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/index/shard/IndexShard.java
Outdated
Show resolved
Hide resolved
|
Hi all, I handled your remaining feedback. Feel free to check latest commits. It'd be great if you could review today if possible! Thanks |
For skipping replication: * ReplicationTracker and Group filter shards that are promotable to primary * Remove unpromotable shards from in sync allocations in metadata * There is a new Refresh action for unpromotable replica shards Fixes ES-4861 For skipping peer recovery: * Unpromotable shards pass directly to STARTED skipping some intermediate peer recovery stages and messages Fixes ES-5257
| .stream() | ||
| .filter(Predicate.not(ShardRouting::isPromotableToPrimary)) | ||
| .map(ShardRouting::currentNodeId) | ||
| .collect(Collectors.toUnmodifiableSet()) |
There was a problem hiding this comment.
Do we need this intermediate collection? I think it could be replaced with distinct
There was a problem hiding this comment.
I thought we dropped this. We don't even need distinct(), we're looking at node IDs for the copies of a single shard which are necessarily distinct anyway.
There was a problem hiding this comment.
I think we dropped it from another place, but not here. I can handle this in another PR.
This RecoveryPlannerService was a workaround to allow search shards to bootstrap from peers; it is now unused since peer recovery is fully skipped for search shards (elastic#93210). Relates
…erations (elastic#149) Since elastic#93210 no operations should be replicated to search shards. I think we can make stronger assertions in the SearchEngine if an index/delete/noop operation is executed.
For skipping replication:
Fixes ES-4861
For skipping peer recovery:
Fixes ES-5257