Contributor

@len548 commented Jul 16, 2025

What changes were proposed in this pull request?

Design proposal for security token service for Ozone.
Please review and share your feedback.

What is the link to the Apache JIRA

HDDS-13323

How was this patch tested?

N/A

Contributor

@jojochuang left a comment

Also here's a list of useful test cases to be added:

Useful Test Cases to Add

  1. Security Test Cases:
    • Confused Deputy: A test case to ensure that the system is not vulnerable to the "confused deputy" problem. This would involve
      attempting to assume a role in a way that could trick the service into granting unintended access.
    • Privilege Escalation: Test that a user cannot use the STS service to gain more permissions than they are originally assigned.
  2. Concurrency Test Cases:
    • Test the system's behavior when multiple users try to assume the same role concurrently.
    • Test the system's behavior when a single user makes multiple concurrent requests to assume different roles.
  3. Integration Test Cases:
    • Ranger Integration: Detailed tests to verify the interaction with Ranger for authorization, including scenarios with complex policies.
    • Ozone Native ACL Integration: Once the design for this is fleshed out, a comprehensive set of tests will be needed to ensure it works
      as expected.
    • External Secret Store Integration: Tests to verify that the system can correctly store and retrieve credentials from an external store
      like HashiCorp Vault.
  4. Negative Test Cases:
    • Invalid DurationSeconds: Test with values for DurationSeconds that are outside the allowed range (less than 900 or greater than
      43200).
    • Malformed resourcePermissions: Test with invalid or malformed resourcePermissions to ensure proper validation and error handling.
    • Expired Credentials: Test that expired temporary credentials are correctly rejected.
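
The DurationSeconds and expired-credentials cases above could be driven by checks along these lines. This is only a minimal sketch: the function and constant names are hypothetical, not part of the proposal, and the bounds are the 900 and 43200 values quoted in the list.

```python
from datetime import datetime, timedelta, timezone

# Bounds quoted in the negative test case above; the names are hypothetical.
MIN_DURATION_SECONDS = 900
MAX_DURATION_SECONDS = 43200


def validate_duration_seconds(duration_seconds):
    """Return the value unchanged if it is in range, else raise ValueError."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(duration_seconds, bool) or not isinstance(duration_seconds, int):
        raise ValueError("DurationSeconds must be an integer")
    if not MIN_DURATION_SECONDS <= duration_seconds <= MAX_DURATION_SECONDS:
        raise ValueError(
            f"DurationSeconds must be between {MIN_DURATION_SECONDS} "
            f"and {MAX_DURATION_SECONDS}, got {duration_seconds}"
        )
    return duration_seconds


def is_expired(issued_at, duration_seconds, now):
    """A credential counts as expired once now reaches issued_at + duration."""
    return now >= issued_at + timedelta(seconds=duration_seconds)


if __name__ == "__main__":
    issued = datetime(2025, 7, 16, tzinfo=timezone.utc)
    print(validate_duration_seconds(3600))
    print(is_expired(issued, 900, issued + timedelta(seconds=901)))
```

Treating the boundary instant as already expired is an assumption here; the design doc would need to state which side of the boundary is valid.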


### Non functional requirements

1. Support in the order of 20k credentials
Contributor

We can work on this later, but I'm curious where this requirement comes from.

Contributor Author

This is a remark from @kerneltime:

If cloud resources are going to call this API we should expect thousands of these calls by workers of an analytical job. I do believe 20k should be a decent number to hit

I hope this helps

Contributor

so 20k per second?

Contributor

is it 20k credential creation per second, or 20k authorization per second?

Contributor Author

I suppose it is about creation, but let me confirm once @kerneltime is available again, as we haven't discussed it further.

* Restricted to specific bucket/prefix paths
* Restricted to specific S3 operations
* Issuing credentials either to self or another identity
2. Authenticate the AssumeRoleKerberos call using Kerberos
Contributor

I thought this design doc would illustrate how Ozone S3G implements the AWS STS API endpoints, or at least a subset of them, but I don't see any of them mentioned here. Is it intended to be compatible with AWS S3?

AssumeRole

AssumeRoleWithSAML

AssumeRoleWithWebIdentity

AssumeRoot

DecodeAuthorizationMessage

GetAccessKeyInfo

GetCallerIdentity

GetFederationToken

GetSessionToken

Contributor Author

It is intended to be compatible with the AWS STS contracts, so AWS SDKs can call this STS API in Ozone through S3G. S3 and STS are both AWS services. A set of temporary credentials issued by this STS service should be usable for calling Ozone's S3-compatible APIs. I modified the doc accordingly. See the commit: Modify requirements and add goals


With Amazon AWS, there is a central service which has the ability to generate [Security Tokens that span resources across services](https://docs.aws.amazon.com/STS/latest/APIReference/welcome.html).

This document covers a basic proposal that describes how Ozone can offer a stand alone STS service that users can call through REST APIs to retrieve security tokens. This can later be extended to integrate with a centralized STS service.
Contributor

What do a "stand alone STS service" and a "centralized STS service" mean specifically?

Contributor Author

By stand alone, I mean that it works entirely within Ozone, without depending on other applications. Generating an STS token, using it for S3 operations in Ozone, and managing token expiration should all be handled solely within Ozone. I'm afraid I'm not sure exactly what a centralized STS service means, as Ritesh mentioned this too. My understanding is that it could later be extended to accept tokens issued by other non-Ozone STS services (e.g. AWS STS). Let me ask @kerneltime about this once he's available again.


Ozone will call Ranger to authorize the AssumeRoleKerberos request. Once authorized, Ozone will generate S3 credentials and store the S3 credentials, role and resources requested.

When the client invokes S3 APIs using the S3 credentials passed in, Ozone will call Ranger with the requested bucketr:path:action along with the original bucket:prefix:action requested for AssumeRoleKerberos. Ranger will authorize the S3 request if it is compliantcomplaint with the original AssumeRoleKerberos request.
Contributor

Suggested change
When the client invokes S3 APIs using the S3 credentials passed in, Ozone will call Ranger with the requested bucketr:path:action along with the original bucket:prefix:action requested for AssumeRoleKerberos. Ranger will authorize the S3 request if it is compliantcomplaint with the original AssumeRoleKerberos request.
When the client invokes S3 APIs using the S3 credentials passed in, Ozone will call Ranger with the requested bucketr:path:action along with the original bucket:prefix:action requested for AssumeRoleKerberos. Ranger will authorize the S3 request if it is compliant with the original AssumeRoleKerberos request.
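
To illustrate what "compliant with the original AssumeRoleKerberos request" could mean, here is a rough sketch of the comparison. In the proposal this decision is made by Ranger, so the code is not how Ozone or Ranger implements it. The `Grant` type, the `is_within_grant` helper, the plain string-prefix matching, and the bare action names are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grant:
    """One bucket:prefix:action triple recorded when the role was assumed."""
    bucket: str
    prefix: str
    action: str


def is_within_grant(grants, bucket, path, action):
    """Return True if the S3 request falls inside at least one recorded grant.

    Assumption: a path is covered when it starts with the granted prefix,
    and bucket and action must match exactly.
    """
    return any(
        g.bucket == bucket and path.startswith(g.prefix) and g.action == action
        for g in grants
    )


if __name__ == "__main__":
    grants = [Grant("logs", "2025/", "GetObject")]
    print(is_within_grant(grants, "logs", "2025/07/app.log", "GetObject"))
    print(is_within_grant(grants, "logs", "2025/07/app.log", "PutObject"))
```

A request outside every recorded triple is denied, which is the behaviour the privilege escalation test cases would need to confirm.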


@jojochuang jojochuang added the s3 S3 Gateway label Jul 18, 2025
Contributor

@jojochuang left a comment

Oh I had one comment that never got published.

### Non functional requirements

1. Support in the order of 20k credentials
2. Support STS actions other than AssumeRole. Those are:
Contributor

these would fall under "functional requirements"

Contributor Author

May I ask why? These seem to be non-functional requirements to me as we won't support those actions this time.

Contributor

Non-functional requirement does not mean out of scope.

It means performance, security, usability, scalability, reliability, and so on. These are not expressed in terms of functionality or features, but they are crucial to ensuring the system works in production.

Contributor Author

Then shouldn't this fall under non-goals instead? The Ozone Enhancement Proposals doc defines non-goals as follows:

Non-goals
Very important to define what is outside of the scope of this proposal

4. Authenticate the AssumeRoleKerberos call using Kerberos
5. Should work with Ozone native ACLs without Ranger
6. Authorize the credential issuance via Ranger
7. Store temporary credentials securely in Ozone Manager
Contributor

What if we don't store the tokens in OM? The token could have all the elements encoded within it and signed / encrypted by a secret key shared by all the OMs so it cannot be modified. That way, we don't need to store them or expire them or replicate them across OMs for failover. This is just an idea that might simplify things.

@errose28
Contributor

Can this be closed in favor of #9223?

@len548 len548 closed this Nov 1, 2025
@errose28 errose28 added the sts Changes for Ozone's S3 Security Token Service label Nov 3, 2025
Labels

design s3 S3 Gateway sts Changes for Ozone's S3 Security Token Service
