Skip to content

Conversation

@hevinhsu
Copy link
Contributor

@hevinhsu hevinhsu commented Jul 3, 2025

What changes were proposed in this pull request?

  • Implement a HTTP proxy to forward requests to multiple s3 gateways.
  • Implement MiniOzoneCluster.Service interface to support proxy server and multiple s3 gateway creation.

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-11679

How was this patch tested?

@adoroszlai adoroszlai requested a review from ivandika3 July 3, 2025 12:22
@ivandika3 ivandika3 added test s3 S3 Gateway labels Jul 4, 2025
Copy link
Contributor

@ivandika3 ivandika3 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hevinhsu Thanks a lot for this patch.

Good job on improving the S3 integration tests to mirror real-world deployments. The changes are minimal, but thoughtful.

Overall LGTM. Left some comments. In my opinion, we can combine this patch with the S3G integration tests immediately to validate the changes. Also, it will be good to have a simple test to check that the proxy is indeed routing the requests to different S3Gs. Please let me know what you think.

In the future we can maybe support the following features:

  • Support https where the proxy server as the SSL termination
  • Support more load balancing strategy (random, weighted round-robin, etc).

@hevinhsu
Copy link
Contributor Author

hevinhsu commented Jul 5, 2025

@ivandika3 Thank you for the feedback and suggestions.

Regarding the unit test, my initial thought is to add a unit test for RoundRobinStrategy, but I’m not sure if that would be sufficient.

As for merging with the S3G integration tests, do you mean copying the existing integration test classes and replacing S3GatewayService with MultiS3GatewayService?

@ivandika3
Copy link
Contributor

ivandika3 commented Jul 5, 2025

Regarding the unit test, my initial thought is to add a unit test for RoundRobinStrategy, but I’m not sure if that would be sufficient.

Yes, we can add a simple unit test on the RoundRobinStrategy, although it should be straightforward. The more important test is to validate that the proxy actual routes to different S3Gs, maybe through using Spy or parsing the S3G audit logs.

As for merging with the S3G integration tests, do you mean copying the existing integration test classes and replacing S3GatewayService with MultiS3GatewayService?

We incorporate the changes in your fork (https://github.com/hevinhsu/ozone/actions/runs/16043512252). I'm OK with replacing the S3GatewayService to MultiS3GatewayService as long as the tests are fine.

Copy link
Member

@peterxcli peterxcli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @hevinhsu for working on this!

LOG.info("Proxy stopped on http://{}:{}", host, port);
}

private class ProxyHandler implements HttpHandler {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you have any reference for this logic?
I'm not sure if we might be missing something here, even if the tests pass — it could still lead to errors in the future.

@hevinhsu
Copy link
Contributor Author

I found that Jetty has a ProxyServlet, it seems to require minimal configuration.

@peterxcli Thanks for the reference! It's really helpful for implementing a proxy server that only needs to handle URL rewriting and the 100-continue issue. This will save a lot of effort compared to building the proxy from scratch.

@hevinhsu
Copy link
Contributor Author

Yes, we can add a simple unit test on the RoundRobinStrategy, although it should be straightforward. The more important test is to validate that the proxy actual routes to different S3Gs, maybe through using Spy or parsing the S3G audit logs.

@ivandika3 I couldn't find a reliable way to identify which S3G endpoint was actually called. I tried parsing the audit logs, but there was no identifier that could distinguish the specific endpoint. I also explored some HTTP endpoints that might expose this information, but couldn't find anything useful — though it's possible I might have missed something in all of these attempts.
If I overlooked anything, please feel free to let me know.
So instead, I started a simple HTTP server to verify the routing logic. Please take a look and let me know if this approach seems acceptable for validating the strategy.

I'm OK with replacing the S3GatewayService to MultiS3GatewayService as long as the tests are fine.

Since the ProxyServer implementation now uses Jetty's ProxyServlet, which handles HTTP headers properly, I've replaced S3GatewayService with MultiS3GatewayService in the integration tests.

Copy link
Contributor

@ivandika3 ivandika3 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hevinhsu Thanks for the update. LGTM +1.

So instead, I started a simple HTTP server to verify the routing logic.

Thanks for trying to add the test, this test should be enough.

@peterxcli Thanks for suggesting the Jetty's ProxyServlet

@ivandika3 ivandika3 requested a review from peterxcli July 12, 2025 14:21
Copy link
Member

@peterxcli peterxcli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @hevinhsu for further refactoring and testing.

- Use constant to replace "Expect" http header
- Use GenericTestUtils to replace Thread.sleep
@hevinhsu
Copy link
Contributor Author

@ivandika3 Nits fixed. PTAL, thanks!

@jojochuang
Copy link
Contributor

Let's get this in. If there are follow-up comments we can address them later.

@ivandika3 ivandika3 merged commit 96cf8be into apache:master Jul 13, 2025
42 checks passed
@ivandika3
Copy link
Contributor

Thanks @hevinhsu for the improvement and @jojochuang @peterxcli for the reviews.

@hevinhsu
Copy link
Contributor Author

Thanks @jojochuang @ivandika3 @peterxcli for the reviews!

errose28 added a commit to errose28/ozone that referenced this pull request Jul 22, 2025
* master: (90 commits)
  HDDS-13308. OM should expose Ratis config for increasing pending write limits (apache#8668)
  HDDS-8903. Add validation for ozone.om.snapshot.db.max.open.files. (apache#8787)
  HDDS-13429. Custom metadata headers with uppercase characters are not supported (apache#8805)
  HDDS-13448. DeleteBlocksCommandHandler thread stop for normal exception (apache#8816)
  HDDS-13346. Intermittent failure in TestCloseContainer#testContainerChecksumForClosedContainer (apache#8771)
  HDDS-13125. Add metrics for monitoring the SST file pruning threads. (apache#8764)
  HDDS-13367. [Docs] User doc for container balancer. (apache#8726)
  HDDS-13200. OM RocksDB Grafana Dashbroad shows no data on all panels (apache#8577)
  HDDS-13428. Recon - Retrigger of build whole NSSummary tree task submission inconsistency. (apache#8793)
  HDDS-13378. [Docs] Add a Production page under Getting Started (apache#8734)
  HDDS-13403. [Docs] Make feature proposal process more visible. (apache#8758)
  HDDS-11797. Remove cyclic dependency between SCMSafeModeManager and SafeModeRules (apache#8782)
  HDDS-13213. KeyDeletingService should limit task size by both key count and serialized size. (apache#8757)
  HDDS-13387. OMSnapshotCreateRequest logs invalid warning about DefaultReplicationConfig (apache#8760)
  HDDS-13405. ozone admin container create runs forever without kinit (apache#8765)
  HDDS-11514. Set optimal default values for delete configurations based on live cluster testing. (apache#8766)
  HDDS-13376. Add server-side limit note to ozone sh snapshot diff --page-size option (apache#8791)
  HDDS-11679. Support multiple S3Gs in MiniOzoneCluster (apache#8733)
  HDDS-13424. Use lsof instead of fuser to find if file is used in AbstractTestChunkManager (apache#8790)
  HDDS-13427. Bump awssdk to 2.31.78 (apache#8792)
  ...
jojochuang pushed a commit to jojochuang/ozone that referenced this pull request Jul 31, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

s3 S3 Gateway test

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants