-
Notifications
You must be signed in to change notification settings - Fork 9.2k
HDFS-16359. RBF: RouterRpcServer#invokeAtAvailableNs does not take effect when retrying #3731
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…fect when retrying
|
💔 -1 overall
This message was automatically generated. |
|
@goiri @hemanthboyina @jojochuang PTAL. Thanks. |
|
💔 -1 overall
This message was automatically generated. |
|
This failed UT |
Hi @ayushtkn , I found this unit test runs out of time several times. Can we just increase the timeout? |
...-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/server/federation/router/RouterRpcServer.java
Show resolved
Hide resolved
|
💔 -1 overall
This message was automatically generated. |
| * removing the already invoked namespaceinfo. | ||
| */ | ||
| private Set<FederationNamespaceInfo> getNameSpaceInfo(String nsId) { | ||
| private Set<FederationNamespaceInfo> getNameSpaceInfo( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could be a static method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @goiri for your comment, I noticed that this method is currently only called in one place. So do we still need to change it to static? Could you please give me some advice. Thank you very much.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this is too important to be honest.
There might be some minor optimization by being static but I do not think it is that important.
The main reason to make it static would be to make clear that this method is not doing anything with the attributes of the object.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BTW, if we are changing the signature of the method, I would also add final to the arguments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you very much for your detailed reply. I have made some modifications, please have a look again.
|
💔 -1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
|
💔 -1 overall
This message was automatically generated. |
|
Thank you @goiri very much for your review again. |
|
One doubt here:
Then, So, if the first namespace returned after excluding the default namespace is also down. Then we would still get an error, right? despite having other namespace being available? Shouldn't this be a invokeSequential kind of stuff, try one by one all the namespaces until you get the result. Am I missing something here? |
Thanks @ayushtkn for your comments and detailed explanation. I think you are right. As described on line |
ayushtkn
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense, it was initialy implemented itself to try once only even in case of fault tolerance.
This PR is just fixing that behaviour.
Will raise a jira to track the change the retry mechanism to all available Namespaces…
@goiri does that makes sense?
Yes, let's fix the current behavior and open a new JIRA with the general retry issue. |
|
Merged, Thanx Everyone!!! |
…fect when retrying (apache#3731). Contributed by tomscut. Reviewed-by: Inigo Goiri <[email protected]> Signed-off-by: Ayush Saxena <[email protected]>
JIRA: HDFS-16359.
RouterRpcServer#invokeAtAvailableNs does not take effect when retrying. See HDFS-15543.
The original code of RouterRpcServer#getNameSpaceInfo looks like follwings, and
namespaceInfosis always empty here.