Conversation

@elek (Member) commented Nov 9, 2020

This is a design doc. Please see the content.

@elek elek changed the title HDDS-4440. Use per-request authentication and persistent connections between S3g and OM HDDS-4440. [DESIGN] Use per-request authentication and persistent connections between S3g and OM Nov 9, 2020
@elek (Member, Author) commented Nov 9, 2020

We had a conversation with @bharatviswa504 last week and I promised to write it down. It's not a must-have or a blocker (IMHO), but it would be great to discuss; if somebody is interested, it's a good, smaller project.

/cc @arp7

@elek elek requested review from arp7 and bharatviswa504 November 16, 2020 08:35
@arp7 arp7 requested a review from xiaoyuyao November 16, 2020 15:58
@bharatviswa504 (Contributor) left a comment

Thank you @elek for the design.
I have a few questions on the implementation-level details of the proposal, since the design states at a high level that we shall use a new gRPC client and server for S3G.

4. create a new protocol proxy
5. If the connection is cached (same UGI), services can be used even if the token was invalidated earlier (as the token is checked only during the initialization of the connection).

Fortunately this behavior doesn't cause any problem in the case of Ozone and S3g. UGI (which is part of the cache key of the connection cache) is equal if (and only if) the underlying `Subject` is the same.
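As an aside, the caching behavior described above can be sketched in a few lines. This is a hypothetical, language-neutral illustration in Python, not Ozone code; the class names are invented:

```python
# Hypothetical sketch (not Ozone code) of Hadoop-RPC-style connection
# caching: the token is checked only when a new connection is created,
# so a cached connection for the same user keeps working even after
# the token has been invalidated.

class Connection:
    def __init__(self, user, token_valid):
        if not token_valid:
            raise PermissionError("token check fails at connection setup")
        self.user = user

class ConnectionCache:
    def __init__(self):
        self._connections = {}

    def get(self, user, token_valid):
        # the token is validated only on a cache miss
        if user not in self._connections:
            self._connections[user] = Connection(user, token_valid)
        return self._connections[user]

cache = ConnectionCache()
first = cache.get("alice", token_valid=True)    # token validated once
reused = cache.get("alice", token_valid=False)  # token invalidated, still reused
assert first is reused
```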
Contributor:

Just to clarify my understanding: we don't have an invalidate-token method for S3G, right?

As this token is generated from the client's auth header fields, a token is generated per request.

We are sending the required info so the auth header can be validated against the secret which OM has.

Member Author:

> Just to clarify my understanding: we don't have an invalidate-token method for S3G, right?

You are right. This example is independent of S3g; it just explains how the cache works. I tried to describe the problem with a simple delegation token. (I can add it as a note.)


The `OzoneToken` identifier can be simplified (after a deprecation period) by removing the S3-specific part, as it won't be required any more.

With this approach the `OzoneClient` instances can be cached on the S3g side (with persistent gRPC connections), as the authentication information is no longer part of the `OzoneClient`; it is added by the `OmTransport` implementation per request (in case of gRPC) or per connection (in case of Hadoop RPC).
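To illustrate the idea (a hypothetical sketch; method and class names such as `submit_request` are invented, not the real Ozone API): the client object carries no credentials, so a single cached client can serve every user, and the transport stamps the token onto each message.

```python
# Hypothetical sketch: with per-request authentication the client
# object carries no credentials, so one cached OzoneClient-like
# instance can serve all users; the transport attaches the token to
# each message rather than to the connection.

class GrpcLikeTransport:
    def submit_request(self, request, token):
        # the token travels with the message, not with the connection
        request = dict(request, token=token)
        return {"status": "OK", "token_seen_by_om": request["token"]}

class CachedClient:
    def __init__(self, transport):
        self.transport = transport    # persistent, shared channel

    def read_key(self, key, token):
        return self.transport.submit_request({"op": "ReadKey", "key": key}, token)

client = CachedClient(GrpcLikeTransport())   # created once, reused for everyone
r1 = client.read_key("vol/bucket/k1", token="token-of-alice")
r2 = client.read_key("vol/bucket/k2", token="token-of-bob")
assert r1["token_seen_by_om"] == "token-of-alice"
assert r2["token_seen_by_om"] == "token-of-bob"
```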
Contributor:

Does that mean that with this approach we need only one OzoneClient instantiated, as the token is part of OMRequest?

Contributor:

A few questions:

This means S3G does not use the Hadoop RPC client; it will use a gRPC client. So how will OM HA retry handling be done? Does all that logic need to be implemented in this new gRPC client?

And once the token is validated, will the request go through the normal flow of execution in OzoneManager?

A few minor questions, as I don't have much expertise in the gRPC implementation:

  1. Will the gRPC server also have RPC handler threads so requests can be handled in parallel on OM?

Member Author:

> Does that mean that with this approach we need only one OzoneClient instantiated, as the token is part of OMRequest?

Yes.

> And once the token is validated, will the request go through the normal flow of execution in OzoneManager?

Yes, exactly the same logic.

> Will the gRPC server also have RPC handler threads so requests can be handled in parallel on OM?

Yes. As far as I understood from the documentation, the new thread is created by the async IO handler thread.

But if we need more freedom, we can always introduce a simple `Executor`.

It's a very good question, though. Thinking about it, I have new ideas: by separating S3g and client-side traffic we can monitor the two in different ways (for example, compare the queue time of client and S3g calls, or set priorities). Not in this step, but something which will become possible.

> How will OM HA retry handling be done? Does all that logic need to be implemented in this new gRPC client?

Yes, we need to take care of the retry logic. My initial suggestion is to create three persistent connections, one to each OM, and in case of a not-leader exception try to send the message on a different connection.

For a client this can be expensive (always creating three connections to the three OM HA instances), but for S3g it seems more effective, as the connections are persistent.
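The suggested failover could look something like this (a minimal sketch with invented names, not the real Ozone classes; real retry logic would also have to handle timeouts and reconnects):

```python
# Hypothetical sketch of the suggested failover: keep persistent
# connections to all three OM instances and, on a not-leader error,
# retry the same request on the next connection.

class NotLeaderError(Exception):
    pass

class OmConnection:
    def __init__(self, name, is_leader):
        self.name = name
        self.is_leader = is_leader    # in reality decided by leader election

    def submit(self, request):
        if not self.is_leader:
            raise NotLeaderError(self.name)
        return {"handled_by": self.name, "request": request}

def submit_with_failover(connections, request):
    last_error = None
    for conn in connections:          # connections stay open between requests
        try:
            return conn.submit(request)
        except NotLeaderError as err:
            last_error = err          # not the leader: try the next OM
    raise last_error

oms = [OmConnection("om1", False), OmConnection("om2", True), OmConnection("om3", False)]
result = submit_with_failover(oms, {"op": "CreateKey"})
assert result["handled_by"] == "om2"
```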

Contributor:

So, in effect, new retry logic should be implemented for the gRPC client.

Contributor:

Thank you for the detailed answers to the other points.

> I have new ideas: with separating S3g and client-side traffic we can monitor the two in different ways (for example, compare queue time of client and S3g calls, or set priorities). Not in this step, but something which will be possible.

This idea looks interesting, but in the end both kinds of traffic come from end clients. Getting additional metrics helps to better understand calls from S3 versus other interfaces, but I am not sure in which scenarios this will help.

Member Author:

For example, to understand and compare cluster usage: which interface is used more, HCFS or S3? What is the source of small files, S3 or HCFS?

These different metrics are not something to do right now, but an interesting option to think about if this approach is accepted.

Member Author:

> So, in effect, new retry logic should be implemented for the gRPC client.

Yes. And I argue that this logic can be optimized for servers (connections to the different OM instances can be cached long-term), not only for clients (open a second connection to the right OM only in case of leader election).


# Possible alternatives

* It's possible to use a pure Hadoop RPC client instead of OzoneClient, which would make the client connection slightly cheaper (no service discovery call is required), but it still requires creating a new connection for each request (and downloading data without OzoneClient may have its own challenges).
Contributor:

I did not understand this point: what is meant by "service discovery call is not required", and also by "not using OzoneClient may have its own challenges"?

So, can we use one single client even with Hadoop RPC? More information would help clarify what is meant by this alternative.

Member Author:

> I did not understand this point: what is meant by "service discovery call is not required", and also by "not using OzoneClient may have its own challenges"?

When you use OzoneClient, an initial service discovery call is executed at the beginning, but after that you can use it easily. Both the OM client connection and the datanode connections are managed by OzoneClient.

We could try to use pure OM client calls (the Hadoop RPC client API without OzoneClient) to avoid service discovery, but in that case we couldn't use OzoneClient at all. Since OzoneClient contains the client logic for the datanodes, the OM client calls would be simpler without it, but in the end the solution would be more complex, as we would also have to use a lower-level datanode client API.

Contributor:

Understood. So for the OM APIs we would use a direct OM client instead of going via OzoneClient to save the service discovery call, but for the datanodes we still need OzoneClient.
But how will token authentication happen?

Member Author:

> Understood. So for the OM APIs we would use a direct OM client instead of going via OzoneClient to save the service discovery call, but for the datanodes we still need OzoneClient.

Yes, that is listed as a possible alternative (using the pure OM client API plus OzoneClient for the datanodes), but I don't like it:

  1. Authentication (as you asked) is not solved here; you still need a per-request connection.
  2. Using the OM client plus OzoneClient for the datanodes is not straightforward and requires more work.

Therefore, I suggested a different approach: use OzoneClient but create a new `OmTransport` implementation based on gRPC.

@xiaoyuyao (Contributor) left a comment

LGTM, let's schedule some time to discuss this.


As this is nothing more than a transport, exactly the same messages (`OMRequest`) will be used; it's not a new RPC interface.

Only one modification is required in the RPC interface: a new, optional, per-request `token` field should be introduced in `OMRequest`.
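As an illustration only (the field name and number below are invented for the example, not the committed schema), the change could look like:

```proto
// Hypothetical addition to OMRequest: an optional, per-request token.
// Existing fields stay untouched, so old clients remain compatible.
message OMRequest {
  // ... existing fields unchanged ...
  optional string s3AuthToken = 100;  // example field name and number
}
```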
Contributor:

To protect the token from being stolen, TLS must be enabled for gRPC.
To set up TLS for gRPC, the client must get the CA cert via service discovery.

Member Author:

> To protect the token from being stolen, TLS must be enabled for gRPC.

Yes, 100% agree.

> To set up TLS for gRPC, the client must get the CA cert via service discovery.

Correct me if I am wrong, but the CA certificate is also downloaded during datanode initialization and can be reused.

But anyway: as I suggest using OzoneClient (but with a new OM transport), the service discovery call will be executed as before; instead of being called once for each S3 HTTP request, it will be called only once, when a connection is added to the connection pool.


# Possible alternatives

* It's possible to use a pure Hadoop RPC client instead of OzoneClient, which would make the client connection slightly cheaper (no service discovery call is required), but it still requires creating a new connection for each request (and downloading data without OzoneClient may have its own challenges).
Contributor:

I still feel we can reuse the Hadoop RPC connection here.
I don't remember exactly why we have to use a token user and do the token validation at OM, but another solution I would like to propose is to use a proxy user at S3g:

Instead of wrapping the token to create a new Hadoop RPC connection per call, S3g can validate the OM token similar to the way a DN validates an OM block token. After validation succeeds, S3g can create a proxy user to connect to OM. If it is the same client, the proxy user can be reused.

Member Author:

Thanks for the comments, @xiaoyuyao.

We had an offline discussion and I'll try to summarize what we discussed.

> .... S3g can validate the OM token similar to the way a DN validates an OM block token

  1. We couldn't move the authentication from OM to S3g, as it's based on the request signature (requests are signed with the secret access key, and OM re-computes the signature with the stored secret). The secret access key shouldn't be moved out of OM; therefore, S3g couldn't do the authentication.

  2. PROXY_USER itself is per-connection (AFAIK), so it doesn't fully solve the problem. We could do per-user Hadoop RPC connection caching, but despite the complexity it's not a full solution in an environment with thousands of users.

  3. Also, per-user Hadoop RPC connection caching on the S3g side has some difficulties. The current caching logic is hard-coded in static fields, and to cache a connection per user S3g would have to trust the user information from the request, which is not possible: a request with a well-formed signature created with a fake access key must not be able to reuse the cached, authenticated connection of the real user.
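Point 1 can be sketched briefly: the S3-style signature is computed from a shared secret, so verification means re-computing it, and the verifier (OM) must hold the secret. This is a hypothetical illustration with example strings; the real algorithm (AWS Signature Version 4) covers many more request fields.

```python
# Hypothetical sketch: verifying an HMAC-based request signature
# requires re-computing it with the stored secret, so only the party
# holding the secret (OM) can authenticate the request.
import hashlib
import hmac

def sign(secret, string_to_sign):
    return hmac.new(secret.encode(), string_to_sign.encode(),
                    hashlib.sha256).hexdigest()

def om_verify(stored_secret, string_to_sign, presented_signature):
    expected = sign(stored_secret, string_to_sign)   # needs the secret
    return hmac.compare_digest(expected, presented_signature)

signature = sign("om-stored-secret", "GET /bucket/key")
assert om_verify("om-stored-secret", "GET /bucket/key", signature)
assert not om_verify("om-stored-secret", "GET /bucket/other", signature)
```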

@elek (Member Author) commented Dec 7, 2020

Updated the implementation section as suggested by @swagle

@xiaoyuyao Do you have any more questions or concerns?

@elek (Member Author) commented Feb 2, 2021

@xiaoyuyao / @mukul1987 Can we commit this? Do you have any more suggestions?

@elek (Member Author) commented Feb 2, 2021

For the record, this video explains the current behavior:

https://www.youtube.com/watch?v=ewgpCsvZcKg&list=PLCaV-jpCBO8UK5Ged2A_iv3eHuozzMsYv&index=13

@arp7 (Contributor) commented Feb 2, 2021

Can we do persistent connections with Hadoop RPC?

@elek (Member Author) commented Feb 2, 2021

> Can we do persistent connections with Hadoop RPC?

Hadoop RPC authentication is per-connection: a persistent Hadoop RPC connection will use the same authentication context for all the calls. I would like to use custom, per-request authentication.

(One additional benefit I have learned about since opening this issue: HDDS-4763 showed that this authentication information is better removed from the delegation tokens, which are available to clients.)

@arp7 (Contributor) commented Feb 3, 2021

I see, let's discuss. If changing the RPC transport requires implementing a new client then it could become a significant task.

The S3G is a trusted component. Can it use persistent Hadoop RPC connections with its own security credentials, and pass some out-of-band security info about the S3 user on whose behalf it is acting? We have also talked about decoupling S3 users from Kerberos.

cc @prashantpogde, @bharatviswa504

@elek (Member Author) commented Feb 5, 2021

> If changing the RPC transport requires implementing a new client then it could become a significant task.

The proposal suggests using the same OzoneClient as before; the only new part is a new `org.apache.hadoop.ozone.om.protocolPB.OmTransportFactory` implementation.
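The pluggable-transport idea can be sketched like this (the configuration key and class names are invented for illustration, not the real Ozone configuration):

```python
# Hypothetical sketch of the pluggable-transport idea: the client is
# unchanged and only the transport implementation is swapped by a
# factory, selected by configuration.

class HadoopRpcTransport:
    name = "hadoop-rpc"   # per-connection authentication

class GrpcOmTransport:
    name = "grpc"         # per-request authentication over a persistent channel

def create_om_transport(conf):
    # "ozone.om.transport" is an example key, not the real one
    if conf.get("ozone.om.transport") == "grpc":
        return GrpcOmTransport()
    return HadoopRpcTransport()

assert create_om_transport({"ozone.om.transport": "grpc"}).name == "grpc"
assert create_om_transport({}).name == "hadoop-rpc"
```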

> We have also talked about decoupling S3 users from Kerberos

IMHO it's an independent question, at least if we are talking about client usage. It requires a change on the OM side to create the secret in a different way, and it's independent of the internal communication between OM and S3G.

This change can make it possible to start S3g without Kerberos, and to run stateless S3g services side-by-side with S3 clients.

And I believe using Hadoop RPC in this case has more limitations. This is a very specific use case, and the caching and authentication of Hadoop RPC are implemented for the typical client use case. For example, the caching is hard-coded, using static fields.

@elek (Member Author) commented Feb 15, 2021

Hi, do you have any more comments @arp7, @xiaoyuyao @mukul1987?

Can we commit this one?

@elek elek changed the title HDDS-4440. [DESIGN] Use per-request authentication and persistent connections between S3g and OM HDDS-4440. [PROPOSAL] Use per-request authentication and persistent connections between S3g and OM Feb 24, 2021
@elek (Member Author) commented Mar 1, 2021

Ping @arp7, @xiaoyuyao @mukul1987, @swagle

@arp7 (Contributor) commented Mar 1, 2021

I have no objection but one question - What does committing it to the codebase mean? Is someone planning to pick up this proposal and convert it into a detailed design?

@elek (Member Author) commented Mar 1, 2021

> I have no objection but one question - What does committing it to the codebase mean?

Yes, it represents an agreement on the direction to follow. Based on the consensus, contributors can create the design, a PoC and, finally, the implementation.

@arp7 (Contributor) left a comment

Thanks @elek. LGTM with one small suggestion.

@@ -0,0 +1,205 @@
---
title: Persistent OM connection for S3 gateway
summary: Use per-request authentication and persistent connections between S3g and OM
Contributor:

Can we add the word proposal here in the title/summary?

Member Author:

Sure. Added in 9c6ce91.

@xiaoyuyao (Contributor):

LGTM, thanks @elek

@arp7 (Contributor) left a comment

+1

@elek (Member Author) commented Mar 8, 2021

Merging it. Thanks to all of you for the review.

@elek elek merged commit 3859f9f into apache:master Mar 8, 2021