HDDS-10626. [LeaseRecovery] OM shuts down with 'SecretKey client must have been initialized already' #6467
… have been initialized already'
    OMClientResponse omClientResponse = validateAndUpdateCache();
    OMResponse omResponse = omClientResponse.getOMResponse();
    assertEquals(OzoneManagerProtocolProtos.Status.OK, omResponse.getStatus());
Is it okay to just let it return OK, or would it be better to return an error? Ideally the client should keep retrying until OM is running.
This path is only used for Ratis Raft log reapply, so skipping the token is OK. However, I found a cleaner way to handle this and will upload a new patch.
The new patch was tested by Pratyush and it works: the issue, which was previously reproducible, no longer occurs. @jojochuang
ashishkumar50
left a comment
Thanks @ChenSammi for the patch. The change LGTM, +1.
jojochuang
left a comment
merging it
Thanks @jojochuang and @ashishkumar50 for the review.
… have been initialized already' (apache#6467)
… have been initialized already' (apache#6467) (cherry picked from commit 6cfe9cf)
What changes were proposed in this pull request?
OzoneManager crashed while reapplying Ratis transactions during startup. When a recoverLease request is reapplied, it tries to generate a block token, which depends on the SecretKey service. The SecretKey client is not yet instantiated at that point, because OzoneManager is still reapplying transactions and the other services have not been initialized.
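
The failure mode can be illustrated with a minimal, self-contained sketch. It is not the merged patch and does not use the Ozone API: `SecretKeyClientStub`, `generateBlockToken`, `recoverLease`, the `replaying` flag, and the `IllegalStateException` type are all hypothetical stand-ins. Only the error message is taken from the issue title. The guard shown here is the "skip the token during replay" idea raised in the review, assumed for illustration only.

```java
import java.util.Optional;

public class LeaseRecoveryReplaySketch {

    /** Hypothetical stand-in for the SecretKey client; not the real Ozone class. */
    static final class SecretKeyClientStub {
        private boolean initialized;

        void start() {
            initialized = true;
        }

        String currentKey() {
            if (!initialized) {
                // Message copied from the issue title; the exception type is an assumption.
                throw new IllegalStateException(
                    "SecretKey client must have been initialized already");
            }
            return "key-1";
        }
    }

    /** Hypothetical token generation that depends on the secret key client. */
    static String generateBlockToken(SecretKeyClientStub client, String blockId) {
        return blockId + ":" + client.currentKey();
    }

    /**
     * Hypothetical recoverLease handler. While the log is being replayed at
     * startup, no client is waiting for the response, so the token is skipped
     * instead of touching a client that has not been started yet.
     */
    static Optional<String> recoverLease(SecretKeyClientStub client,
                                         String blockId,
                                         boolean replaying) {
        if (replaying) {
            return Optional.empty();
        }
        return Optional.of(generateBlockToken(client, blockId));
    }

    public static void main(String[] args) {
        SecretKeyClientStub client = new SecretKeyClientStub();

        // Startup replay: the client is not started, and the token is skipped.
        System.out.println(recoverLease(client, "blk-1", true).isPresent());

        // Normal operation after services are initialized.
        client.start();
        System.out.println(recoverLease(client, "blk-1", false).get());
    }
}
```

Without the `replaying` guard, the first call in `main` would reach `currentKey()` before `start()` and throw, which mirrors the crash described above.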
What is the link to the Apache JIRA?
https://issues.apache.org/jira/browse/HDDS-10626
How was this patch tested?
Tested in the cluster where the issue was found and could be reproduced repeatedly.