-
Notifications
You must be signed in to change notification settings - Fork 588
HDDS-8998. Shared mutable state causes rootca poller to miss updates. #5050
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
@fapifta thank you for spotting this error, I have missed this during my implementation. |
|
@fapifta nit: what do you think about fixing this on the poller side to make a copy of the passed in set instead of having to copy the set when instantiating it? (I know it wouldn't matter in the long run because the poller would just be instantiated from the DefaultCertificateClient, but afaik it would be a better practice) |
|
@Galsza I think the responsibility is on the DefaultCertificateClient side whether it serves its internal state or not. The poller should not be bothered by internals of the DefaultCertificateClient, that is why I do the copy on instantiation. |
adoroszlai
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @fapifta for the fix.
|
Thank you for the review @adoroszlai, I am tracking down something that has the same symptom in the ha-secure test set once I add the file operations. Let's wait until the cause is clear for that one, I am not sure but it might be related to something similar, and we might fix those together, but I need some time to figure it out. |
https://issues.apache.org/jira/browse/HDDS-8983 One of these or yet another failure? |
|
@fapifta, the fix looks good, +1. OM new CSR will be rejected if root ca rotation is in progress. The sequence looks like this,
|
@adoroszlai, thanks for reporting the issue. HDDS-8983 and HDDS-8992 are the same cause, different behavior. I submitted #5052, please help to have a review @fapifta @Galsza @adoroszlai , thanks. |
|
The second problem I ran into was because of a missing config as it turned out at the end. Thank you for the reviews, I am merging this one as it is for now. |
What changes were proposed in this pull request?
The PR proposes a change in how the RootCaRotationPoller is being instantiated, as the shared rootCaCertificates set from the DefaultCertificateClient is causing a problem.
The problem was found as part of the testing done for HDDS-8605, it is causing a problem in the case of the following events in order:
The cause it the shared rootCaCertificates set, which is updated by the certificate renewal process within the DefaultCertificateClient, so that the poller once it gets to poll the new certificates, it does not recognize that there is an unkown certificate (as it already knows it via the change in the data structure during renewal), therefore it does not update its listeners about the rootCA change in this scenario.
What is the link to the Apache JIRA
https://issues.apache.org/jira/browse/HDDS-8998
How was this patch tested?
In #5009 I am working on to add filesystem client calls towards the DataNodes that would rely on the ServiceInfoEx update in that patch. The ServiceInfoEx update happens within OM, once the poller sends a notification to the ServiceInfoProvider (again in that patch). This test bought up the problem, and with the proposed fix, that test also runs fine. I have not shared that test so far, I will push it up shortly to the PR so reviewers can see, once this gets merged I will rebase that PR again.