HDFS-16456. EC: Decommission of a rack with only one dn will fail when the rack number is equal to the replication factor (#4126)
Conversation
@tasanuma Please review, thank you.
💔 -1 overall
This message was automatically generated.

@tasanuma The failed UTs don't seem to be related to this patch; please help check.
tasanuma
left a comment
@lfxy Thanks for creating the PR. I did some tests with this PR in my test cluster, and it worked well. I left some review comments about typos.
And I have one more question: there are some uses of clusterMap.getNumOfRacks() in BlockPlacementStatusDefault. Do we need to update those as well?
Review threads (resolved) on:
...t/java/org/apache/hadoop/hdfs/server/namenode/TestBlockPlacementPolicyRackFaultTolerant.java
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/net/NetworkTopology.java
@tasanuma Yes, I think clusterMap.getNumOfRacks() in BlockPlacementPolicyDefault should also be updated, because only non-empty racks make sense.
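The point about non-empty racks can be illustrated with a small standalone sketch. This is not the NetworkTopology API; the map layout and the helper name `numOfNonEmptyRacks` are assumptions for illustration: a rack whose last datanode has been decommissioned still exists in the topology but cannot receive replicas, so it should not count toward placement limits.

```java
import java.util.List;
import java.util.Map;

public class RackCountSketch {
    // Count only racks that still contain at least one datanode.
    static int numOfNonEmptyRacks(Map<String, List<String>> rackToNodes) {
        return (int) rackToNodes.values().stream()
                .filter(dns -> !dns.isEmpty())
                .count();
    }

    public static void main(String[] args) {
        Map<String, List<String>> racks = Map.of(
            "/rack1", List.of("dn1"),
            "/rack2", List.of("dn2"),
            "/rack3", List.of());  // last dn decommissioned: rack is empty
        System.out.println(racks.size());              // 3 racks known to the topology
        System.out.println(numOfNonEmptyRacks(racks)); // 2 racks usable for placement
    }
}
```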
🎊 +1 overall
This message was automatically generated.
tasanuma
left a comment
Thanks for updating the PR. +1.
I will merge this PR next week if there are no other reviews.
@surendralilhore Please comment if you have any concerns.
💔 -1 overall
This message was automatically generated.
Merged. Thanks for your contribution, @lfxy!
@tasanuma Thank you for your review and for all the useful suggestions.
…e rack number is equal with replication (apache#4126) (cherry picked from commit cee8c62) Conflicts: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/net/NetworkTopology.java Change-Id: Id5f937c25d87ae48f3ccabecf8b0c5feac7ca496 (cherry picked from commit dd79aee635fdc61648e0c87bea1560dc35aee053)
…e rack number is equal with replication (#4126) (#4304) (cherry picked from commit cee8c62) Conflicts: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/net/NetworkTopology.java (cherry picked from commit dd79aee635fdc61648e0c87bea1560dc35aee053) Co-authored-by: caozhiqiang <lfxy@163.com> Reviewed-by: Takanobu Asanuma <tasanuma@apache.org>
…e rack number is equal with replication (apache#4126)
HDFS-16456
In the scenario below, decommissioning will fail with the TOO_MANY_NODES_ON_RACK reason:
The root cause is in the BlockPlacementPolicyRackFaultTolerant::getMaxNodesPerRack() function, which computes a limit parameter, maxNodesPerRack, for choosing targets. In this scenario maxNodesPerRack is 1, which means only one datanode can be chosen from each rack.
int maxNodesPerRack = (totalNumOfReplicas - 1) / numOfRacks + 1;
This line is evaluated with totalNumOfReplicas=9 and numOfRacks=9.
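A minimal standalone illustration of that formula (the class and method names here are simplified stand-ins, not the actual Hadoop code): with integer division, 9 replicas over 9 racks yield a per-rack limit of 1, so an empty rack that is still counted artificially keeps the limit too tight.

```java
public class MaxNodesPerRackDemo {
    // Ceiling division of totalNumOfReplicas by numOfRacks,
    // as in the line quoted above.
    static int maxNodesPerRack(int totalNumOfReplicas, int numOfRacks) {
        return (totalNumOfReplicas - 1) / numOfRacks + 1;
    }

    public static void main(String[] args) {
        // 9 replicas (e.g. an EC group of 9 blocks) over 9 racks:
        // at most one target per rack.
        System.out.println(maxNodesPerRack(9, 9)); // 1
        // If the emptied rack were excluded, only 8 racks remain and
        // the limit relaxes to 2, allowing placement to succeed.
        System.out.println(maxNodesPerRack(9, 8)); // 2
    }
}
```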
When we decommission a dn that is the only node in its rack, chooseOnce() in BlockPlacementPolicyRackFaultTolerant::chooseTargetInOrder() will throw NotEnoughReplicasException, but the exception is not caught, so placement fails to fall back to the chooseEvenlyFromRemainingRacks() function.
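The catch-and-fall-back pattern the description implies can be sketched as follows. This is a toy model, not the actual Hadoop implementation: the rack representation, the counting logic, and the way chooseOnce signals failure are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class FallbackSketch {
    static class NotEnoughReplicasException extends Exception {}

    // Strict placement: at most maxNodesPerRack targets per rack.
    static List<String> chooseOnce(List<String> racks, int needed, int maxNodesPerRack)
            throws NotEnoughReplicasException {
        if (racks.size() * maxNodesPerRack < needed) {
            throw new NotEnoughReplicasException();
        }
        return racks.subList(0, needed);
    }

    // Relaxed placement: allow multiple targets per rack if needed.
    static List<String> chooseEvenlyFromRemainingRacks(List<String> racks, int needed) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < needed; i++) {
            out.add(racks.get(i % racks.size()));
        }
        return out;
    }

    static List<String> chooseTargets(List<String> racks, int needed, int maxNodesPerRack) {
        try {
            return chooseOnce(racks, needed, maxNodesPerRack);
        } catch (NotEnoughReplicasException e) {
            // Without this catch, the whole placement fails,
            // which is the behavior the bug report describes.
            return chooseEvenlyFromRemainingRacks(racks, needed);
        }
    }

    public static void main(String[] args) {
        // 8 usable racks, 9 targets needed, limit 1 per rack:
        // the strict pass cannot succeed, so the fallback must run.
        List<String> racks = List.of("/r1", "/r2", "/r3", "/r4", "/r5", "/r6", "/r7", "/r8");
        System.out.println(chooseTargets(racks, 9, 1).size()); // 9
    }
}
```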
During decommissioning, after targets are chosen, the verifyBlockPlacement() function returns a total rack count that includes the invalid (now empty) rack, so BlockPlacementStatusDefault::isPlacementPolicySatisfied() returns false, which also causes the decommission to fail.
public boolean isPlacementPolicySatisfied() {
  return requiredRacks <= currentRacks || currentRacks >= totalRacks;
}
According to the above description, we should make the following modifications to fix it: