HDFS-15945. DataNodes with zero capacity and zero blocks should be decommissioned immediately. #2854
Conversation
💔 -1 overall
This message was automatically generated.
Seems the failure of
virajjasani
left a comment
One minor comment, else looks good.
...ct/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
Based on a discussion with @virajjasani (#2854 (comment)), I realized it doesn't matter what the capacity is. If a DataNode doesn't have any blocks, it can be decommissioned safely. Updated the PR accordingly.
virajjasani
left a comment
+1 (non-binding)
This reverts commit 0aa3649.
On second thought, the last commit has a problem. Just after a restart, the NameNode hasn't received any block reports from any DataNode, so it sees all DataNodes as having zero blocks. Therefore, if the NameNode is restarted while a DataNode is being decommissioned, that DataNode becomes decommissioned immediately, before its blocks are replicated. After all, I think we need to consider whether the DataNode has zero capacity or not. If the capacity is zero, it means the DataNode has a problem with its storage, and we can decommission it safely.
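The reasoning above can be shown with a minimal, self-contained sketch. Everything here is illustrative: the class and method names are made-up stand-ins, not the actual BlockManager/DatanodeDescriptor API. The point is that a blocks-only check is unsafe right after a NameNode restart, while requiring zero capacity as well makes the check safe.

```java
// Toy model of the decommission-readiness check discussed above.
// All names are hypothetical; this is not Hadoop code.
class DatanodeState {
    final long capacity;   // total capacity reported by the node
    final long numBlocks;  // blocks the NameNode believes the node holds

    DatanodeState(long capacity, long numBlocks) {
        this.capacity = capacity;
        this.numBlocks = numBlocks;
    }

    // Safe only when capacity is zero (all volumes failed) AND the node
    // holds no blocks. Checking numBlocks alone is unsafe: right after a
    // NameNode restart every node appears to have zero blocks until its
    // first block report arrives.
    boolean safeToDecommissionImmediately() {
        return capacity == 0 && numBlocks == 0;
    }
}
```

With this combined check, a healthy node that merely hasn't sent its first block report yet is not decommissioned prematurely.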
Oh I see, yeah this is a possibility. I agree that we should bring back zero capacity check.
Nice
@virajjasani Thanks for your confirmation. Reverted the last commit and added more comments.
The failed tests succeeded locally.
    if (!node.checkBlockReportReceived()) {
      LOG.info("Node {} hasn't sent its first block report.", node);
      return false;
    }
    if (node.getCapacity() == 0 && node.getNumBlocks() == 0) {
DatanodeDescriptor#getNumBlocks() returns the field numBlocks, but that field is only set during initialization.
Instead, I suspect we want to use DatanodeDescriptor#numBlocks(), where the number is computed by aggregating over all existing storage volumes.
        capacities[i][j] = 0;
      }
    }
    getCluster().startDataNodes(getConf(), 1, null, true, null, null, null,
IMO a more complete repro of the scenario should include:
- start DN with volumes, update config to tolerate volume failures.
- intentionally corrupt the volumes (delete VERSION file, for example)
- trigger volume scanner, wait for the DN to drop the volume
Maybe we don't need a very faithful repro, but I am worried this test doesn't cover the real scenario.
Thanks for your detailed reviews, @jojochuang. I will try to reproduce it with a unit test.
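The repro steps suggested above could be modeled, very loosely, with a toy simulation. No real HDFS is involved and every name here is made up; it only shows the shape of the scenario: volumes fail one by one (e.g. after a deleted VERSION file is detected by the volume scanner), capacity drops to zero, and the DataNode is expected to shut down once failures exceed the tolerated count.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of a DataNode with failing volumes; not Hadoop code.
class ToyDataNode {
    private final List<Long> volumeCapacities;
    private final int toleratedFailures;
    private int failedVolumes = 0;
    private boolean shutdown = false;

    ToyDataNode(List<Long> capacities, int toleratedFailures) {
        this.volumeCapacities = new ArrayList<>(capacities);
        this.toleratedFailures = toleratedFailures;
    }

    // Simulates the volume scanner detecting a corrupt volume and the
    // DataNode dropping it: its capacity no longer counts.
    void failVolume(int index) {
        volumeCapacities.set(index, 0L);
        failedVolumes++;
        // Expected behavior: shut down once failures exceed the
        // tolerated count.
        if (failedVolumes > toleratedFailures) {
            shutdown = true;
        }
    }

    long capacity() {
        long total = 0;
        for (long c : volumeCapacities) {
            total += c;
        }
        return total;
    }

    boolean isShutdown() { return shutdown; }
}
```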
@jojochuang Sorry for being very late. We found the root cause of this problem. There is a bug in hadoop-3.3.0 where the DataNode doesn't shut down even if the number of failed volumes is greater than …
As I said in the last comment, this is not a problem anymore after HDFS-15963. I'm closing this PR.
Great! Glad to find out.
JIRA: https://issues.apache.org/jira/browse/HDFS-15945