HDDS-8982. Infinite loop in WritableRatisContainerProvider if pipeline's nodes are not found #5742
adoroszlai
left a comment
Thanks @DaveTeng0 for working on this.
...hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/ContainerManagerImpl.java
    if (containerInfo.getContainerID() != -1) {
      return containerInfo;
    } else {
      excludeList.addPipeline(containerInfo.getPipelineID());
Trying to exclude the faulty pipeline may not work, because:
Lines 192 to 196 in d1a92b9
    if (pipelines.size() == 0 && !excludeList.isEmpty()) {
      // if no pipelines can be found, try finding pipeline without
      // exclusion
      pipelines = pipelineManager.getPipelines(repConfig, pipelineState);
    }
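To make the concern concrete, here is a minimal, self-contained sketch of how such a fallback can hand back a pipeline that was just excluded. It is a toy model, not the Ozone code: `ExcludeFallbackSketch`, `findPipelines`, and the use of plain strings as pipeline IDs are all hypothetical names introduced for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Toy model of the fallback quoted above; hypothetical, not the actual Ozone classes. */
class ExcludeFallbackSketch {

  /** Returns candidate pipelines, retrying without exclusions if none remain. */
  static List<String> findPipelines(List<String> all, Set<String> excluded) {
    List<String> pipelines = new ArrayList<>();
    for (String id : all) {
      if (!excluded.contains(id)) {
        pipelines.add(id);
      }
    }
    if (pipelines.isEmpty() && !excluded.isEmpty()) {
      // Same shape as the quoted fallback: the exclusion is dropped entirely,
      // so an excluded (possibly faulty) pipeline becomes a candidate again.
      pipelines = new ArrayList<>(all);
    }
    return pipelines;
  }
}
```

Under these assumptions, if the only available pipeline is the faulty one, excluding it leaves the candidate list empty, the fallback fires, and the faulty pipeline is returned anyway.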
Yeah, the reason I added it at the first findPipelinesByState call in this method (line 100) is to prevent the broken pipeline from being selected at the second findPipelinesByState call (line 164, after pipelineManager.createPipeline). I just feel it's nice to have, but it could be removed too. I'm open to suggestions on whether to keep it here. Thanks!
….container.max.retry to private attribute in WritableRatisContainerProvider
Please check CI run in fork before starting PR workflow. If there is a failure caused by the change (here
…bleRatisContainerProvider
Thanks again @DaveTeng0 for working on this. I prefer #5911 to this one. Marking as draft for now. Will close if #5911 gets merged.
What changes were proposed in this pull request?
HDDS-8982. Prevent the infinite loop in getContainer, which causes WritableRatisContainerProvider to flood the log when a pipeline's nodes are not found.
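As a rough illustration of the general idea of bounding retries, the sketch below caps the number of attempts instead of looping until one succeeds. It is not the patch itself: `BoundedRetrySketch`, `tryUpTo`, and `maxRetries` are hypothetical names, and the real change may structure the loop differently.

```java
import java.util.Optional;
import java.util.function.Supplier;

/** Hypothetical helper showing a retry loop with an upper bound. */
class BoundedRetrySketch {

  /**
   * Calls attempt up to maxRetries times and returns the first non-empty
   * result, or Optional.empty() once the attempts are exhausted.
   */
  static <T> Optional<T> tryUpTo(int maxRetries, Supplier<Optional<T>> attempt) {
    for (int i = 0; i < maxRetries; i++) {
      Optional<T> result = attempt.get();
      if (result.isPresent()) {
        return result;
      }
    }
    // Give up instead of spinning forever; the caller decides how to report it.
    return Optional.empty();
  }
}
```

With a bound like this, a pipeline whose nodes can never be found produces a fixed number of failed attempts (and log lines) rather than an unbounded stream.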
Please describe your PR in detail:
What is the link to the Apache JIRA
https://issues.apache.org/jira/browse/HDDS-8982
How was this patch tested?
unit test