-
Notifications
You must be signed in to change notification settings - Fork 589
HDDS-7563. Add a handler for under replicated Ratis containers in RM #4025
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
...main/java/org/apache/hadoop/hdds/scm/container/replication/RatisUnderReplicationHandler.java
Show resolved
Hide resolved
...main/java/org/apache/hadoop/hdds/scm/container/replication/RatisUnderReplicationHandler.java
Outdated
Show resolved
Hide resolved
| @Override | ||
| public Map<DatanodeDetails, SCMCommand<?>> processAndCreateCommands( | ||
| Set<ContainerReplica> replicas, List<ContainerReplicaOp> pendingOps, | ||
| ContainerHealthResult result, int minHealthyForMaintenance) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am wondering if we really need the ContainerHealthResult as a parameter in this interface. Seems in the EC case and here we just pull the containerInfo out of it. Might make the tests easier if we changed this interface. Worth having a quick look maybe.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, we're only using it to get ContainerInfo
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we should leave as it is in this PR, and then remove it in a small cleanup PR afterwards.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We're actually also passing ContainerHealthResult as an argument to processOverReplicatedContainer in RM from the EC handler. I think we should let this be for now and create another jira to see if things can be simplified.
.../java/org/apache/hadoop/hdds/scm/container/replication/TestRatisUnderReplicationHandler.java
Outdated
Show resolved
Hide resolved
|
This looks mostly good to me. Just a few small things to check with my comments left inline. |
sodonnel
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM - Will commit after green CI.
|
Thanks for reviewing @sodonnel. Merging now that CI is green. |
* master: (110 commits) HDDS-7472. EC: Fix NSSummaryEndpoint#getDiskUsage for EC keys (apache#3987) HDDS-5704. Ozone URI syntax description in help content needs to mention about ozone service id (apache#3862) HDDS-7555. Upgrade Ratis to 2.4.2-8b8bdda-SNAPSHOT. (apache#4028) HDDS-7541. FSO recursive delete directory with hierarchy takes much time for cleanup (apache#4008) HDDS-7581. Fix update-jar-report for snapshot (apache#4034) HDDS-7253. Fix exception when '/' in key name (apache#4038) HDDS-7579. Use Netty 4.1.77 for consistency (apache#4031) HDDS-7562. Suppress warning about long filenames in tar (apache#4017) HDDS-7563. Add a handler for under replicated Ratis containers in RM (apache#4025) HDDS-7497. Fix mkdir does not update bucket's usedNamespace (apache#3969) HDDS-7567. Invalid entries in LICENSE (apache#4020) HDDS-7575. Correct showing of RATIS-THREE icon in Recon UI (apache#4026) HDDS-7540. Let reusable workflow inherit secrets (apache#4012) HDDS-7568. Bump copyright year in NOTICE (apache#4018) HDDS-7394. OM RPC FairCallQueue decay decision metrics list caller username in the metric (apache#3878) HDDS-7510. Recon: Return number of open containers in `/clusterState` endpoint (apache#3989) HDDS-7561. Improve setquota, clrquota CLI usage (apache#4016) HDDS-6615. EC: Improve write performance by pipelining encode and flush (apache#3994) HDDS-7554. Recon UI should show DORMANT in pipeline status filter (apache#4010) HDDS-7540. Separate scheduled CI from push/PR workflows (apache#4004) ...
What changes were proposed in this pull request?
The new RM needs a handler that can process under replicated Ratis containers and create replication commands. Legacy RM does this in the
handleUnderReplicatedContainermethod. The new handlerRatisUnderReplicationHandlermostly has the same logic as legacy with one notable difference - we don't need to handle mis replication (failure to meet placement policy) because that's a separate case now and should get its own handler.What is the link to the Apache JIRA
https://issues.apache.org/jira/browse/HDDS-7563
How was this patch tested?
TestRatisUnderReplicationHandler