-
Notifications
You must be signed in to change notification settings - Fork 588
HDDS-6960. EC: Implement the Over-replication Handler #3572
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
umamaheswararao
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @JacksonYao287 for working on this patch. I have few comments to check. Thanks
...src/main/java/org/apache/hadoop/hdds/scm/container/replication/ECOverReplicationHandler.java
Show resolved
Hide resolved
...src/main/java/org/apache/hadoop/hdds/scm/container/replication/ECOverReplicationHandler.java
Show resolved
Hide resolved
...src/main/java/org/apache/hadoop/hdds/scm/container/replication/ECOverReplicationHandler.java
Show resolved
Hide resolved
| try { | ||
| //the command target node should be in-service and healthy | ||
| return nodeManager.getNodeStatus(dd) | ||
| .equals(NodeStatus.inServiceHealthy()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we discussed this. This is not covering healthy_readonly?
...src/main/java/org/apache/hadoop/hdds/scm/container/replication/ECOverReplicationHandler.java
Outdated
Show resolved
Hide resolved
...src/main/java/org/apache/hadoop/hdds/scm/container/replication/ECOverReplicationHandler.java
Outdated
Show resolved
Hide resolved
...test/java/org/apache/hadoop/hdds/scm/container/replication/TestECOverReplicationHandler.java
Show resolved
Hide resolved
...test/java/org/apache/hadoop/hdds/scm/container/replication/TestECOverReplicationHandler.java
Show resolved
Hide resolved
| * commands. | ||
| */ | ||
| public class ECUnderReplicationHandler implements UnderReplicationHandler { | ||
| public class ECUnderReplicationHandler implements UnhealthyReplicationHandler { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure we will call over replication as unhealthy. Don't have better name in my mind but how about simply ReplicationHandler ?
Since it is common class now, javadoc should represent the generic message. Currently it assumed for underReplication handler.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ReplicationHandler seems a little confusing, it might be considered as the handler to handle replication command.
so i suggest we use unhealthy just for now and change it to an appropriate name after we find a proper one. what about "nonOptimalReplicationHandler"
|
@umamaheswararao thanks for the review, will fix the comments later |
umamaheswararao
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @JacksonYao287 for updating the patch. Patch almost looks good to me, except a small comment left.
...src/main/java/org/apache/hadoop/hdds/scm/container/replication/ECOverReplicationHandler.java
Outdated
Show resolved
Hide resolved
|
Thanks @umamaheswararao for the review , i have update this patch according to your comments, pleas take a look |
| } | ||
| /** | ||
| * Identify a new set of datanode(s) to replicate/reconstruct the container | ||
| * and form the SCM commands to send it to DN. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missed to update doc?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sorry for missing , fixed now!
| import java.util.stream.Collectors; | ||
|
|
||
| /** | ||
| * this class holds some common methods that will be shared among |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typos: this --> This
Implementation -> implementation
umamaheswararao
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
|
thanks @umamaheswararao for the review! i have committed this patch to master branch |
* master: (46 commits) HDDS-6901. Configure HDDS volume reserved as percentage of the volume space. (apache#3532) HDDS-6978. EC: Cleanup RECOVERING container on DN restarts (apache#3585) HDDS-6982. EC: Attempt to cleanup the RECOVERING container when reconstruction failed at coordinator. (apache#3583) HDDS-6968. Addendum: [Multi-Tenant] Fix USER_MISMATCH error even on correct user. (apache#3578) HDDS-6794. EC: Analyze and add putBlock even on non writing node in the case of partial single stripe. (apache#3514) HDDS-6900. Propagate TimeoutException for all SCM HA Ratis calls. (apache#3564) HDDS-6938. handle NPE when removing prefixAcl (apache#3568) HDDS-6960. EC: Implement the Over-replication Handler (apache#3572) HDDS-6979. Remove unused plexus dependency declaration (apache#3579) HDDS-6957. EC: ReplicationManager - priortise under replicated containers (apache#3574) HDDS-6723. Close Rocks objects properly in OzoneManager (apache#3400) HDDS-6942. Ozone Buckets/Objects created via S3 should not allow group access (apache#3553) HDDS-6965. Increase timeout for basic check (apache#3563) HDDS-6969. Add link to compose directory in smoketest README (apache#3567) HDDS-6970. EC: Ensure DatanodeAdminMonitor can handle EC containers during decommission (apache#3573) HDDS-6977. EC: Remove references to ContainerReplicaPendingOps in TestECContainerReplicaCount (apache#3575) HDDS-6217. Cleanup XceiverClientGrpc TODOs, and document how the client works and should be used. (apache#3012) HDDS-6773. Cleanup TestRDBTableStore (apache#3434) - fix checkstyle HDDS-6773. Cleanup TestRDBTableStore (apache#3434) HDDS-6676. KeyValueContainerData#getProtoBufMessage() should set block count (apache#3371) ... Conflicts: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/server/upgrade/SCMUpgradeFinalizer.java
HDDS-6960. EC: Implement the Over-replication Handler (apache#3572) (cherry picked from commit fac23c9) Change-Id: Ic273820549544731b781b143a810ee45ef0085df
What changes were proposed in this pull request?
Implement the Over-replication Handler
What is the link to the Apache JIRA
https://issues.apache.org/jira/browse/HDDS-6960
How was this patch tested?
unit test