-
Notifications
You must be signed in to change notification settings - Fork 588
HDDS-13857. [STS] Design doc for STS #9223
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
errose28
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for writing this up @fmorg-git. Overall looks good, but we should add more details about how the Ranger and Native authorizers fit in with this.
Additionally, we need to choose how much development can/should happen on master vs. a feature branch. For Ranger integration within the OM, as long as it is not connected to a client facing API, that should be fine to go into the master branch if it helps iterate with Ranger devs quicker.
Once the endpoint to consume and process tokens is added, that should probably be in a feature branch. We try to keep master releasable and have had issues with security related features being partially developed on master in the past. See CVE-2024-45106.
hi @errose28, thanks for your comments. From the call today, it was decided:
|
errose28
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the doc updates @fmorg-git. Just a few more minor comments.
| The format of this String is entirely up to the Ranger team. What is required from the Ozone side is to supply this String to Ranger when any | ||
| subsequent S3 API calls are made that use STS tokens. In order to achieve this, the sessionPolicy String from Ranger will |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we get a Ranger PMC member to sign off on this doc as well to indicate we will be working together on this integration?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I sent Slack messages and an email asking for Ranger input.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From my point of view we just need to sort out the port/server part of this here but otherwise looks good to me. Let's leave this open for now to make sure others get a chance to review and work through any questions/concerns.
| The AWS IAM policy specification is vast and wide-ranging. The initial Ozone STS supports a limited | ||
| subset of its capabilities. The restrictions are outlined below: | ||
|
|
||
| - The only supported prefix in ResourceArn is `arn:aws:s3:::` - all others will be rejected. **Note**: a ResourceArn |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are both reject and silently ignored. Shall we unify the behavior to reject if operations and fields are not supported?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This design behavior came after (much) discussion with external teams. One external team will send S3 actions that Ozone doesn't support, and they don't have flexibility to change what they are sending, so that's one reason for the silent ignore. This external team also mentioned that some research indicates AWS also does not fail the AssumeRole request just because the inline session policy references unknown or unsupported actions, but rather it will fail when the temporary credentials are used, so this design is accordance with it. Another external team agreed that behavior is fine for actions, but does not work for Conditions, because one can have a Condition to restrict calls by sourceIp, and if we silently ignore this, the client may incorrectly think the temporary credentials are restricted for use by that IP address, so consensus was to reject the request for that scenario.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. Can you update the doc about the background reason why some are rejected and some are silently ignored?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated
|
|
||
| # 3. How Ozone STS Works | ||
|
|
||
| The initial implementation of Ozone STS supports only the [AssumeRole](https://docs.aws.amazon.com/STS/latest/APIReference/API_AssumeRole.html) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @fmorg-git for preparing the doc.
The AssumeRole has a few input parameters, some are optional. Are they all supported? If not, could you clarify what parameters are supported, and what are not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated - I added the supported required and optional parameters
| - The roleArn - the role used in the original AssumeRole call | ||
| - The encrypted secretAccessKey - this will be used to validate the AWS signature when the temporary credentials are used | ||
| to make S3 API calls | ||
| - (Optional) sessionPolicy - when using the RangerOzoneAuthorizer, if Ranger successfully authorizes the AssumeRole call, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When both roleArn and sessionPolicy exist, the final decision made by Ranger is the intersection of roleArn and sessionPolicy?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, that's correct - I added additional verbiage to hopefully make that more clear
|
|
||
| ## 3.5 STS Token Revocation | ||
|
|
||
| In the rare event temporary credentials need to be revoked (ex. for security reasons), a table in RocksDB will be created |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will OM be involved in the assumeRole API call and following S3 API calls using the sessionToken authorization check? Is it OM RocksDB here used, or other RocksDB?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, OM is involved in the assumeRole API and S3 API calls (OM is currently involved in S3 calls with permanent credentials). Yes, OM RocksDB is used - I added some verbiage around that.
| storing temporary credentials server-side in Ozone, the sessionToken will comprise various components needed to validate | ||
| subsequent S3 calls that use the token. The sessionToken will have the following information encoded: | ||
|
|
||
| - The originalAccessKeyId - this is the Kerberos identity of the user that created the sessionToken via the AssumeRole call. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why originalAccessKeyId is needed in sessionToken, In which step it will be used?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated to add verbiage about why originalAccessKeyId is needed. I also updated the flow section to indicate where originalAccessKeyId is used.
This identity is included in the sessionToken because S3 API calls (such as PutObject) require a Kerberos identity, but the temporary credentials don't have a Kerberos identity associated to them, therefore the Kerberos identity of the user that created the token will be used in these cases.
| - The expiration time of the token (via `ShortLivedTokenIdentifier#getExpiry()`) | ||
| - The UUID of the secret key used to sign the sessionToken and encrypt the secretAccessKey (via `ShortLivedTokenIdentifier#getSecretKeyId()`) | ||
| - expiration time of the token (via `ShortLivedTokenIdentifier#getExpiry()`) | ||
| - UUID of the OzoneManager private key used to sign the sessionToken and encrypt the secretAccessKey (via `ShortLivedTokenIdentifier#getSecretKeyId()`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Previous "secret key" is accurate. It's not OzoneManager private key. It's a symmentic key generated by SCM.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated
|
Thanks @fmorg-git for the update. The last design doc looks good to me. |
|
Thanks @ChenSammi . Hi @mneethiraj just a friendly reminder to review from Ranger side when you have a chance. Thanks! |
|
This PR has been marked as stale due to 21 days of inactivity. Please comment or remove the stale label to keep it open. Otherwise, it will be automatically closed in 7 days. |
|
I think we are just waiting for a review from the Ranger side to merge this. @mneethiraj can you give this a look? |
| Similarly, an invalid S3 action will be silently ignored. | ||
| - Supported wildcard expansions in Actions are: `s3:*`, `s3:Get*`, `s3:Put*`, `s3:List*`, | ||
| `s3:Create*`, and `s3:Delete*`. | ||
| - If using OzoneNativeAuthorizer, bucket wildcards (ex. ResourceArn `arn:aws:s3:::*`, `arn:aws:s3:::bucket*` or `*`) will be rejected, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given STS is only available with Ranger authorizer (see #3 above), I suggest removing lines 93 to 97, to avoid potential misinterpretation that STS is supported with Ozone native authorizer as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated
| must be called to issue permanent S3 credentials. With these credentials, the AssumeRole API call can be made, but additional | ||
| steps below are needed for the call to be successfully authorized. | ||
| When using RangerOzoneAuthorizer, a role must be configured in Ranger UI for each role the AssumeRole API | ||
| can be used with. The user making the AssumeRole call must be in this role, and this role must have access |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The user making the AssumeRole call must be in this role
This is not accurate.
I suggest replacing the sentence from 166 to 168 with the following:
Further, Apache Ranger policies should be in place to grant the user permission to assume the role.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated
Tejaskriya
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the clear design doc. I just had one question as commented below. LGTM otherwise.
| - If a (currently) unsupported S3 action is requested, such as `s3:GetAccelerateConfiguration`, it will be silently ignored. | ||
| Similarly, an invalid S3 action will be silently ignored. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apart from being mentioned in the design doc, for a user, how can we make sure that this "silently ignored" is a known behaviour? Would user docs be the only way? (As this is a design doc, a simple user of ozone may not go through this)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, I believe this behavior will need to be included in user docs. Also, if IIRC from previous discussions with an external team, the silently ignored behavior is what happens in AWS as well, at least in the AssumeRole call. When the token is used, it should fail though. Please see #9223 (comment)
|
thanks @Tejaskriya . Hi @ChenSammi - the Ranger team has approved. Can you please review and merge if it looks ok? Thanks! |
Please describe your PR in detail:
Documents the design for STS in Ozone.
What is the link to the Apache JIRA
https://issues.apache.org/jira/browse/HDDS-13857
How was this patch tested?
No tests needed for documentation