HDDS-3921. IllegalArgumentException triggered in SCMContainerPlacemen… #1162
…tRackAware.chooseDatanodes
xiaoyuyao
left a comment
Thanks @ChenSammi for reporting the issue and proposing the fix. The change LGTM.
I have a question regarding the excess calculation in handleOverReplicationContainer, where we only consider inflightDeletion without inflightReplication, at around line 615. Could this contribute to the over-replication issue, since we will break out when the smaller excess value reaches 0 while the inflightReplication is still in progress?
...p-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/ReplicationManager.java
|
Thanks for this change. I also wonder if there is a bug in the method In For |
@xiaoyuyao, I think not considering inflightReplication here is safer, since a replication has a chance to fail. Imagine we have 2 healthy replicas and 2 inflight replications: in this case, we should send the command to delete the extra replica only once we are sure that we have 4 healthy replicas in hand. |
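To make the trade-off in this thread concrete, here is a minimal sketch of the two ways the excess could be computed. It is not the actual ReplicationManager code: the class `ExcessSketch`, both method names, and the replication factor of 3 are assumptions introduced only for illustration.

```java
// Hypothetical sketch; not the actual ReplicationManager implementation.
public final class ExcessSketch {

  /** Conservative: ignore in-flight replications, because they may still fail. */
  static int conservativeExcess(int healthyReplicas, int inflightDeletion,
      int replicationFactor) {
    return Math.max(0, healthyReplicas - inflightDeletion - replicationFactor);
  }

  /** Optimistic: count in-flight replications as if they had already succeeded. */
  static int optimisticExcess(int healthyReplicas, int inflightReplication,
      int inflightDeletion, int replicationFactor) {
    return Math.max(0,
        healthyReplicas + inflightReplication - inflightDeletion - replicationFactor);
  }

  public static void main(String[] args) {
    int factor = 3; // assumed replication factor

    // 2 healthy replicas, 2 in-flight replications, no in-flight deletions.
    System.out.println(conservativeExcess(2, 0, factor));   // prints 0
    System.out.println(optimisticExcess(2, 2, 0, factor));  // prints 1

    // Once both replications have completed there are 4 healthy replicas.
    System.out.println(conservativeExcess(4, 0, factor));   // prints 1
  }
}
```

Under these assumptions, the optimistic variant would schedule a deletion while the container still has only 2 confirmed replicas; the conservative variant waits until 4 healthy replicas are confirmed before reporting an excess of 1.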
It's a good point. I don't think it's a real bug, but we can improve it. HDDS-3942 to track it. |
sodonnel
left a comment
This revision LGTM, pending green CI run. +1.
Let's wait for Xiaoyu to check he is happy before committing.
|
testDeleteKeyWithSlowFollower failed at the leader membership check step. The test passed locally. It seems to be a timing issue, unrelated to this patch. |
|
Thanks @sodonnel and @xiaoyuyao for the review. |
|
LGTM, +1. Thanks @ChenSammi for the contribution and all for the reviews. I will merge the PR shortly. |
https://issues.apache.org/jira/browse/HDDS-3921