HDDS-7329. Extend ozone admin datanode usageinfo and list info to accept hostname parameter #3835
Conversation
adoroszlai left a comment
Thanks @xBis7 for working on this.
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/node/SCMNodeManager.java
hadoop-hdds/tools/src/main/java/org/apache/hadoop/hdds/scm/cli/datanode/ListInfoSubcommand.java
...mmon/src/main/java/org/apache/hadoop/hdds/scm/protocol/StorageContainerLocationProtocol.java
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/node/SCMNodeManager.java
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/node/SCMNodeManager.java
hadoop-hdds/server-scm/src/test/java/org/apache/hadoop/hdds/scm/node/TestSCMNodeManager.java
@sokui @Xushaohong please review
Hi @xBis7, we should not deprecate dfs.datanode.use.datanode.hostname and always treat it as false. This flag is used by users who deploy Ozone in IP-changing environments such as k8s. In those environments, the IPs of the Ozone components can change, so we should use the hostname for communication. The PR that makes the datanodes adaptive to k8s environments by letting them use hostnames is #3186. Please check it out. Thanks.
Hi @sokui, I wasn't aware of this. I was told that this flag is not used by the community and that we could deprecate it. I looked at #3186 and you are right, we need to keep the existing code as it is. The goal here is to have a consistent CLI, so we keep the code as it is but we won't expose that behavior to the user. We need --ip to accept only an IP address and add a --hostname parameter that accepts a hostname. We could create two new maps, one mapping IP to UUID and one mapping hostname to UUID, and use those maps for accessing datanode info from the CLI. @adoroszlai @sokui What do you think about this approach?
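For illustration only, the two-map idea could look roughly like this; the class and method names below are hypothetical, not actual SCMNodeManager code.

import java.util.Map;
import java.util.Set;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of the two-map idea: one lookup table keyed by IP
// address and one keyed by hostname, both resolving to datanode UUIDs.
// A Set is used per key because several datanodes may run on the same host.
public class DatanodeAddressIndex {
  private final Map<String, Set<UUID>> ipToUuids = new ConcurrentHashMap<>();
  private final Map<String, Set<UUID>> hostNameToUuids = new ConcurrentHashMap<>();

  // Record both addresses of a datanode when it registers.
  public void register(String ipAddress, String hostName, UUID uuid) {
    ipToUuids.computeIfAbsent(ipAddress, k -> ConcurrentHashMap.newKeySet()).add(uuid);
    hostNameToUuids.computeIfAbsent(hostName, k -> ConcurrentHashMap.newKeySet()).add(uuid);
  }

  // Lookup used by the CLI: --ip resolves against the IP map only.
  public Set<UUID> getByIp(String ipAddress) {
    return ipToUuids.getOrDefault(ipAddress, Set.of());
  }

  // Lookup used by the CLI: --hostname resolves against the hostname map only.
  public Set<UUID> getByHostName(String hostName) {
    return hostNameToUuids.getOrDefault(hostName, Set.of());
  }
}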
This sounds feasible, but one concern is how to make sure the IP-UUID-hostname mapping is always consistent. If the hostname changes on some DN, we may need retry logic against the IP-UUID mapping and then update the corresponding hostname.
@Xushaohong Can you provide an example in the code of how we are currently handling such a case?
Moved everything back to how it was and added two methods used only by the CLI. This achieves the desired behavior and I don't think it affects the system in the cases mentioned above. Let me know what you all think.
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/node/SCMNodeManager.java
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/node/NodeManager.java
I think we need to revisit this PR. Ideally, the listing of nodes should work the same for IP and hostname and should not require opening up specific APIs in the SCM protocol.
@kerneltime Let me explain the purpose of this PR in more detail. The issue is that on master you can either have the default behavior, where dfs.datanode.use.datanode.hostname is false and datanodes can only be looked up by IP, or set the flag to true and look them up only by hostname (passed through --ip). We want a consistent CLI where both lookups work regardless of the flag. As pointed out to me, we can't deprecate the flag or change the current logic in SCMNodeManager. The current code on this branch achieves the desired behavior.
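As a rough sketch of the master behavior described above (an assumed simplification, not the actual SCMNodeManager code), the flag effectively decides which single address keys the lookup:

import org.apache.hadoop.hdds.conf.OzoneConfiguration;

// Simplified sketch of the behavior described above; the real SCMNodeManager
// logic differs in detail.
public class DnsKeyChooser {
  // Config key taken from this discussion; the default value is false.
  private static final String USE_HOSTNAME_KEY =
      "dfs.datanode.use.datanode.hostname";

  private final boolean useHostname;

  public DnsKeyChooser(OzoneConfiguration conf) {
    this.useHostname = conf.getBoolean(USE_HOSTNAME_KEY, false);
  }

  // The single key under which a datanode is indexed: hostname when the flag
  // is true, IP address otherwise; this is why only one kind of lookup works
  // at a time on master.
  public String dnsKeyFor(String ipAddress, String hostName) {
    return useHostname ? hostName : ipAddress;
  }
}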
Let's discuss this in the next community sync @neils-dev
@sokui @Xushaohong If this map holds both IPs and hostnames at all times, will that be an issue? In case we are in an IP-changing environment and the IP doesn't work, the hostname will still be available to the user. What do you think? Also, it would help if you could provide a way for me to test it.
Currently, this map is fine with an IP-changing environment since we use either the IP or the hostname, not both of them simultaneously. The prerequisite is the config dfs.datanode.use.datanode.hostname. A simple test environment can be found in the ...
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/node/SCMNodeManager.java
Please also be very careful: in some of my tests, datanodeDetails.getHostName() might give you an IP instead of a hostname. Please check that your approach doesn't have bugs in this case as well.
@sokui can you elaborate on when you saw an IP address being returned for the hostname?
@kerneltime it has been a long time, I can't remember. But my testing environment was Kubernetes. I think when we deployed Ozone, I saw this happen a couple of times, but I'm not sure what the exact conditions were.
Thanks @sokui for your comments. On the errors you mention that may happen when using the map for both IPs and hostnames mapping to UUIDs of DNs, can you give some detail? Where can this cause problems?
@neils-dev, I didn't mean your solution will have problems in this case; I was just reminding you to be careful about it. When a variable or method is named hostname, people may assume it holds just a hostname, not an IP, but in some cases that is not true. So far in this PR, I don't think this is an issue.
This works fine with the current approach. Any particular scenarios or test cases? Commands to execute? I'm referring to anything more complex than getting the datanode info.
This looks much better, will complete my review this week. |
Maybe you could try it on a bare-metal environment and adjust the hostname manually, which covers the hostname-change case.
For k8s, if you deploy datanodes as a Deployment instead of a StatefulSet, then when you delete a node it will restart with a different hostname, I think.
Please add an end-to-end robot test, which should help close the loop for the feature.
@xBis7 I plan to finish this, are you OK with that? |
@adoroszlai Please feel free to take this over. As far as I can recall, these were the unresolved issues that blocked this ticket:
Thanks @xBis7, that's very useful for the future. However, here I don't intend to address all hostname-related problems, just trying to wrap it up considering your latest comment:
DatanodeDetails dn = nodeStateManager.getNode(datanodeDetails);
Preconditions.checkState(dn.getParent() != null);
addToDnsToUuidMap(dnsName, datanodeDetails.getUuid());
addToDnsToUuidMap(ipAddress, uuid);
Since these are synchronized methods, I think we should do both updates (ip & hostname) in a single method call.
Thanks @sadanand48, updated.
}
updateDnsToUuidMap(oldDnsName, dnsName, datanodeDetails.getUuid());
updateDnsToUuidMap(oldIpAddress, ipAddress, uuid);
updateDnsToUuidMap(oldHostName, hostName, uuid);
Same comment as above
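For illustration, folding the two updates into a single synchronized call might look roughly like this; a hypothetical helper class, not the code that was actually merged.

import java.util.Map;
import java.util.Set;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of combining the separate ip/hostname updates into one
// synchronized call, as suggested in the review.
public class DnsToUuidIndex {
  private final Map<String, Set<UUID>> dnsToUuidMap = new ConcurrentHashMap<>();

  // Register both addresses of a datanode under a single lock.
  public synchronized void addEntry(String ipAddress, String hostName, UUID uuid) {
    add(ipAddress, uuid);
    add(hostName, uuid);
  }

  // Re-key both addresses in one call when either of them changes.
  public synchronized void updateEntry(String oldIp, String newIp,
      String oldHostName, String newHostName, UUID uuid) {
    remove(oldIp, uuid);
    remove(oldHostName, uuid);
    add(newIp, uuid);
    add(newHostName, uuid);
  }

  private void add(String key, UUID uuid) {
    dnsToUuidMap.computeIfAbsent(key, k -> ConcurrentHashMap.newKeySet()).add(uuid);
  }

  private void remove(String key, UUID uuid) {
    Set<UUID> uuids = dnsToUuidMap.get(key);
    if (uuids != null) {
      uuids.remove(uuid);
      if (uuids.isEmpty()) {
        dnsToUuidMap.remove(key);
      }
    }
  }
}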
sadanand48 left a comment
Thanks @adoroszlai for the patch update, LGTM
@xBis7 please let me know if you are OK with the latest update being merged.
@adoroszlai Thanks for finishing this. I've gone through the changes. LGTM!
Thanks @xBis7 for the patch, @kerneltime, @sadanand48, @sokui, @Xushaohong for the review. |
What changes were proposed in this pull request?
In order to make the CLI and the ozone admin datanode subcommands more consistent, this adds a hostname parameter to the list info command and an address parameter that accepts both an IP and a hostname to the usage info command.
There is an HDFS flag, dfs.datanode.use.datanode.hostname, checked by the SCMNodeManager during startup. If it is true, we can get the datanode info by providing the hostname with the --ip parameter, while the IP is no longer available. The default case keeps the flag's value at false, where the IP is available but not the hostname. If we add the --address parameter, then we can stop relying on the flag and clean up the code.
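As a rough sketch of the option wiring, assuming the picocli annotations used by the Ozone admin subcommands, the new parameters might be declared like this; the field names and descriptions are illustrative, not necessarily the actual ListInfoSubcommand/UsageInfoSubcommand code.

import picocli.CommandLine;

// Illustrative option declarations for the parameters described above; the
// real subcommand fields may differ.
public class AddressOptionsSketch {

  @CommandLine.Option(names = {"--ip"},
      description = "Show info by datanode IP address.")
  private String ipaddress;

  @CommandLine.Option(names = {"--hostname"},
      description = "Show info by datanode hostname.")
  private String hostname;

  @CommandLine.Option(names = {"--address"},
      description = "Show info by datanode IP address or hostname.")
  private String address;
}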
What is the link to the Apache JIRA
https://issues.apache.org/jira/browse/HDDS-7329
How was this patch tested?
Modified the existing tests and they are all passing. Also added unit tests for the cases where the hostname or IP changes. Tested manually in a Docker cluster and in Kubernetes.