-
Notifications
You must be signed in to change notification settings - Fork 588
HDDS-6686. Do Leadship check before SASL token verification. #3382
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Hey @nandakumar131 , would you help to review the patch? The failed flaky test is irrelevant. They can pass locally. |
May be irrelevant here, but please see #3376 (comment) and below. |
|
@neils-dev if you would like to review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @ChenSammi. I see that retrievePassword method now not only throws an exception when the s3 user cannot be authenticated but now also when the OM is not the leader. With this new InvaidToken exception (OMNotLeader / NotReady exception), the users of the retrievePassword may not accurately process the error.
Do we need to change the S3SecurityUtil.validateS3Credential handling of an exception thrown by its call to retrievePassword?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make sense. Will take care of that in next patch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With validateS3Credential() checking the leader status when validating the password (S3Token) from the OMRequest, is the OzoneManagerRatisUtils.checkLeaderStatus() in the line above it unnecessary, redundant?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @ChenSammi, thanks for the comment and for the help you've given me when I started with ozone and ozone development.
With HDDS-4440 and HDDS-5881, S3 authentication will be carried as part of the OMRequest.
Yes, authentication is done using the S3Authentication fields of the OMRequest. This processRequest() in the ProtocalServerSideTranslatorPB is part of the path for processing requests with the authentication through S3SecurityUtil.validateS3Credential.
and these requests will not be processed by SASL. So the leader check is still needed here.
Yes, the leader check is needed here, however isn't the leader check done along with the call to S3SecurityUtil.validateS3Credentials as well? So in this code path the leader is checked twice. Once just prior to the call to S3SecurityUtil.ValidateS3Credential and once inside the call (within retrievePassword). Is the call to OzoneManagerRatisUtils.checkLeaderStatus(ozoneManager) still needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right. Will remove it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why go through this nesting? alternatively we could ozoneManager.checkLeaderStatus
| OzoneManagerRatisUtils.checkLeaderStatus(ozoneManager); | |
| ozoneManager.checkLeaderStatus(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All protocol service API need throws ServiceException, while ozoneManager.checkLeaderStatus throws OMNotLeaderException and OMLeaderNotReadyException. So we need OzoneManagerRatisUtils.checkLeaderStatus wrapper to covert exceptions.
ozoneManager.checkLeaderStatus is used both in SASL path authentication and protocol level authentication. It throws the core exception. Let the caller wrap the core exception into desired outer exceptions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@neils-dev, could you help to take another look?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @ChenSammi, thanks for your updated comment on handling the leader exceptions.
All protocol service API need throws ServiceException, while ozoneManager.checkLeaderStatus throws OMNotLeaderException and OMLeaderNotReadyException. So we need OzoneManagerRatisUtils.checkLeaderStatus wrapper to covert exceptions.
Having that, the OzoneManagerProtocolServerServerSideTranslatorPB.processRequest needs to throw a ServiceException should isRatisEnabled == true and it is not a leader. There looks to be two cases where this happens in processRequest:
- For the case that
isRatisEnabled == trueands3Auth == false, that's what happens with your leader checkOzoneManagerRatisUtils.checkLeaderStatus. It is handled. - For the case that
isRatisEnabled == trueandrequest.hasS3Authentication == trueit needs to be handled so that theServiceExceptionis thrown in the event of a leadership check exception.This looks to be currently wrapping the leadership exception in an IOException (It is handled by the validateS3Credential (missed it on initial read). Thanks.InvalidToken) and returned to the caller through the errorOmResponse.
The javadoc for validateS3Credential can be updated to include the ServiceException that may be thrown.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @neils-dev, sorry for the late response. Just want to confirm that the comments left to address is the following, right?
The javadoc for validateS3Credential can be updated to include the ServiceException that may be thrown.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@neils-dev , the javadoc is updated. Would you like to take another look?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @ChenSammi. Looks great. I tested the changes on the secure docker development clusters as well without issues - clients handle OMException raised as expected for authentication errors (S3) and client failovers for raised ServiceException wrapped OMNotALeader exceptions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @neils-dev . Do you think the PR can be merged now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM (I am unable to merge.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| OzoneManagerRatisUtils.checkLeaderStatus(ozoneManager); | |
| ozoneManager.checkLeaderStatus(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can drop this method.
Applying Ratis patch [RATIS-1481], I've been able to run the |
swagle
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 LGTM
|
Thanks @adoroszlai @kerneltime @neils-dev @swagle for the code review. |
* master: (87 commits) HDDS-6686. Do Leadship check before SASL token verification. (apache#3382) HDDS-4364: [FSO]List FileStatus : startKey can be a non-existed path (apache#3481) HDDS-6091. Add file checksum to OmKeyInfo (apache#3201) HDDS-6706. Exposing Volume Information Metrics to the DataNode UI (apache#3478) HDDS-6759: Add listblock API in MockDatanodeStorage (apache#3452) HDDS-5821 Container cache management for closing RockDB (apache#3426) HDDS-6683. Refactor OM server bucket layout configuration usage (apache#3477) HDDS-6824. Revert changes made in proto.lock by HDDS-6768. (apache#3480) HDDS-6811. Bucket create message with layout type (apache#3479) HDDS-6810. Add a optional flag to trigger listStatus as part of listKeys for FSO buckets. (apache#3461) HDDS-6828. Revert RockDB version pending leak fixes (apache#3475) HDDS-6764: EC: DN ability to create RECOVERING containers for EC reconstruction. (apache#3458) HDDS-6795: EC: PipelineStateMap#addPipeline should not have precondition checks post db updates (apache#3453) HDDS-6823. Intermittent failure in TestOzoneECClient#testExcludeOnDNMixed (apache#3476) HDDS-6820. Bucket Layout Post-Finalization Validators for ACL Requests. (apache#3472) HDDS-6819. Add LEGACY to AllowedBucketLayouts in CreateBucketHandler (apache#3473) HDDS-4859. [FSO]ListKeys: seek all the files/dirs from startKey to keyPrefix (apache#3466) HDDS-6705 Add metrics for volume statistics including disk capacity, usage, Reserved (apache#3430) HDDS-6474. Add test to cover the FSO bucket list status with beyond batch boundary and cache. (apache#3379). Contributed by aswinshakil HDDS-6280. Support Container Balancer HA (apache#3423) ...
https://issues.apache.org/jira/browse/HDDS-6686