@L-Applin L-Applin commented Sep 16, 2025

Implement parallel download for multipart GetObject in the S3 Async Client and Transfer Manager.

Modifications

  • Add two new classes (Publisher/Subscriber) to orchestrate the non-linear multipart download: NonLinearMultipartDownloaderSubscriber and FileAsyncResponseTransformerPublisher. Note for reviewer: These two classes are the core of the PR's new functionality, and review should probably start with them.
  • Add support in Transfer-Manager module for Transfer Progress Updater.
    • Note for reviewer: The AsyncResponseTransformer published by FileAsyncResponseTransformerPublisher needs to be wrapped so that it publishes progress to the progress updater. This is done in GenericS3TransferManager and TransferProgressUpdater.
  • New public API, as discussed during design review
    • supportNonSerial on SplitResult
    • ParallelConfiguration, a new config class in MultipartConfiguration that holds the maxInFlightParts setting
  • New internal API
    • FileAsyncTransformer exposes getters for position, path and FileTransformerConfiguration
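
For readers unfamiliar with the idea, the non-linear download described in the list above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code in this PR: `ParallelPartWriterSketch`, `download`, and the `fetchPart` callback are hypothetical names, parts are assumed to have a fixed size (except possibly the last), and a fixed thread pool stands in for the cap that `maxInFlightParts` expresses. Each part is written at its own file offset, so parts may complete in any order.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.IntFunction;

// Hypothetical sketch: not the SDK's NonLinearMultipartDownloaderSubscriber.
final class ParallelPartWriterSketch {
    private ParallelPartWriterSketch() {
    }

    /**
     * Fetches partCount parts with at most maxInFlightParts running at once and
     * writes each one at offset partIndex * partSize in the destination file.
     * fetchPart is an assumed callback that returns the bytes of one part.
     */
    static void download(Path dest, int partCount, long partSize, int maxInFlightParts,
                         IntFunction<byte[]> fetchPart) throws IOException {
        // The pool size caps how many parts are in flight at the same time.
        ExecutorService pool = Executors.newFixedThreadPool(maxInFlightParts);
        try (FileChannel channel = FileChannel.open(dest,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.TRUNCATE_EXISTING)) {
            List<CompletableFuture<Void>> futures = new ArrayList<>();
            for (int i = 0; i < partCount; i++) {
                final int partIndex = i;
                futures.add(CompletableFuture.runAsync(() -> {
                    try {
                        ByteBuffer buf = ByteBuffer.wrap(fetchPart.apply(partIndex));
                        long position = partIndex * partSize;
                        // Positional writes do not move the channel's own position,
                        // so concurrent writers do not interfere with each other.
                        while (buf.hasRemaining()) {
                            position += channel.write(buf, position);
                        }
                    } catch (IOException e) {
                        throw new UncheckedIOException(e);
                    }
                }, pool));
            }
            // Wait for every part; a failed part surfaces as a CompletionException.
            CompletableFuture.allOf(futures.toArray(new CompletableFuture[0])).join();
        } finally {
            pool.shutdown();
        }
    }
}
```

A caller could invoke `ParallelPartWriterSketch.download(path, 4, 3L, 2, i -> bytesOfPart(i))`, where `bytesOfPart` is again a placeholder. A real implementation would be non-blocking and would cancel outstanding parts on failure, which this sketch does not attempt.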

Testing

  • Added unit test
  • Added integration test
  • Manual tests using large objects

L-Applin added 28 commits July 22, 2025 18:15
…in the onResponse callback. Keep track of all inflight requests.
- renamed EmittingSubscription, mark it ThreadSafe
- Added comments
- some other renaming

@Override
public void onResponse(T response) {
Optional<String> contentRangeList = response.sdkHttpResponse().firstMatchingHeader("x-amz-content-range");

Is this logic specific to S3? Could we possibly apply it to a generic streaming service? In general my guess is no, because other streaming APIs don't necessarily support content-range (at least from my cursory inspection, most of the ones I looked at do not support requests or responses with content-range).

Given that - should we keep this class in S3 instead of core?
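
To make the dependency on content-range concrete, here is a minimal sketch of how a byte range might be extracted from such a header value. It assumes the value uses the standard HTTP form `bytes <start>-<end>/<total>`; `ContentRangeSketch` and `parseRange` are hypothetical names, not the code under review.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; assumes the "bytes start-end/total" form.
final class ContentRangeSketch {
    // The total may be "*" when the full length is unknown.
    private static final Pattern CONTENT_RANGE =
        Pattern.compile("bytes\\s+(\\d{1,18})-(\\d{1,18})/(\\d{1,18}|\\*)");

    private ContentRangeSketch() {
    }

    /** Returns {start, end}, both inclusive, or empty if the value is not a usable byte range. */
    static Optional<long[]> parseRange(String headerValue) {
        if (headerValue == null) {
            return Optional.empty();
        }
        Matcher m = CONTENT_RANGE.matcher(headerValue.trim());
        if (!m.matches()) {
            return Optional.empty();
        }
        long start = Long.parseLong(m.group(1));
        long end = Long.parseLong(m.group(2));
        if (end < start) {
            return Optional.empty();
        }
        return Optional.of(new long[] {start, end});
    }
}
```

With a range in hand, the start offset tells a file writer where a part belongs, which is why a service that never returns this header could not use the same logic unchanged.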

@dagnir dagnir self-requested a review September 23, 2025 17:55
…oposal. Renamed to ParallelMultipartDownloaderSubscriber as per PR comment

- Other PR comment: Removed unused builder parameter for EmittingSubscription
…n/large-object-merge

# Conflicts:
#	services/s3/src/test/java/software/amazon/awssdk/services/s3/internal/multipart/S3MultipartFileDownloadWiremockTest.java
@L-Applin L-Applin changed the title from "Olapplin/large object merge" to "Parallel split for multipart GetObject File Download" Sep 25, 2025
@sonarqubecloud

Quality Gate failed

Failed conditions
66.2% Coverage on New Code (required ≥ 80%)

See analysis details on SonarQube Cloud

@L-Applin L-Applin merged commit 80c60d0 into feature/master/large-object-dl Nov 25, 2025
25 of 30 checks passed
@github-actions

This pull request has been closed and the conversation has been locked. Comments on closed PRs are hard for our team to see. If you need more assistance, please open a new issue that references this one.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Nov 25, 2025