HDDS-8365. Admin SCM command to decommission SCM. #4563
…g scm from ratis ring, error handling. Includes client and server side unit tests. TODO - scm revocation for decommissioning.'
We should keep the CLI consistent with the existing decom CLIs. DN and OM use
…ent with existing decommissioning commands, ozone admin <node type> decommission, and minor changes to parameters to be consistent with decommissioning commands.
Thanks @errose28. The argument for this scm decommission command Pushed changes to make decommission commands consistent as suggested:
@neils-dev Why do we need the cluster ID to decommission SCM? The SCM receiving the command already knows its own cluster ID and cannot make changes if there is a mismatch.
Thanks for following up. The scm request for removing the scm, ozone/hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/ha/SCMHAManagerImpl.java Line 378 in cffa386
On a clusterId mismatch, an exception is thrown and propagated back to the admin CLI user, specifying the clusterId expected for the given SCM.
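The mismatch handling described above can be sketched roughly as follows. This is only an illustration and not the code in SCMHAManagerImpl: the class `ClusterIdCheck`, the method `validateClusterId`, and the message wording are hypothetical names chosen for the example.

```java
import java.io.IOException;

// Hypothetical sketch: not the actual Ozone implementation.
final class ClusterIdCheck {
  private ClusterIdCheck() { }

  // Rejects a removal request whose cluster ID differs from the one
  // this SCM belongs to, naming the expected ID in the error message
  // so that it can be reported back to the CLI user.
  static void validateClusterId(String expected, String requested)
      throws IOException {
    if (!expected.equals(requested)) {
      throw new IOException("Cluster ID mismatch: expected " + expected
          + " but request specified " + requested);
    }
  }
}
```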
@nandakumar131 Why was cluster ID validation for SCM removal added in #4358? None of the other decommission requests, or any requests that I can think of, require the user to specify the cluster ID. Cluster ID is internally generated by Ozone and abstracted from users, who refer to the cluster by service ID. The only way for a user to learn the cluster ID is by manually checking VERSION files or startup log messages AFAIK.
Thanks. Note that the clusterId is available to the user through the SCM web UI, which shows the scmid and clusterid for the SCM.
@errose28 you're right. The Cluster ID is internal, and we should not ask the user to provide the Cluster ID. In #4358 the Cluster ID is used internally for Ratis group removal, as the Cluster ID is used as the Ratis Group ID. The validation that is done in @neils-dev we should not get the Cluster ID from the user.
Thanks @nandakumar131, @errose28. Just had an offline discussion with Nanda. Going to modify the
…quire the scmid (nodeid) for decommissioning an SCM. Previously required ratisaddress and clusterid now will be handled on handling and validation of scm decommission command server side, opened HDDS-8452.
Updated the SCM decommissioning command to remove the required clusterId. The command is now issued with the scmid passed as --nodeid, containing the UUID of the SCM to decommission.
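Since --nodeid is expected to carry a UUID, a client could reject malformed values before contacting the SCM. The sketch below only illustrates that idea and is not taken from this PR: the class `NodeIdArg` and the method `parse` are hypothetical, and whether the actual CLI validates the value on the client side is an assumption.

```java
import java.util.UUID;

// Hypothetical sketch: not the actual Ozone CLI code.
final class NodeIdArg {
  private NodeIdArg() { }

  // Parses the --nodeid value as a UUID and fails fast on malformed input.
  static UUID parse(String nodeId) {
    try {
      return UUID.fromString(nodeId);
    } catch (IllegalArgumentException e) {
      throw new IllegalArgumentException(
          "--nodeid must be the UUID of the SCM to decommission: " + nodeId, e);
    }
  }
}
```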
Dependency PR supporting the admin scm decommissioning command merged and closed:
…ng branch 'upstream/master' into HDDS-8365
… recently merged to master.
Can we add some more tests here to verify other scenarios as well?
@neils-dev, thanks for working on this.
Please create a follow-up Jira to add more unit tests.
What changes were proposed in this pull request?
ScmDecommissioning scm admin client command support. Includes removing scm from ratis ring, error handling. Includes client and server side unit tests.
ozone admin scm decommission --nodeid
What is the link to the Apache JIRA
https://issues.apache.org/jira/browse/HDDS-8365
How was this patch tested?
Unit tests: manual command.