Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adopt new async write API for PutObject requests #832

Merged
merged 10 commits into from
Apr 10, 2024

Conversation

passaro
Copy link
Contributor

@passaro passaro commented Apr 3, 2024

Description of change

Adopt the new async write API implemented in awslabs/aws-c-s3#418. The new API replaces the previous AsyncInputStream and more closely matches how uploads work in Mountpoint, where a PutObject meta-request is started on open and data is fed into it on write. The main driver for the introduction of async write was to avoid an issue where starting too many uploads would stall progress in the CRT. This change ensure that the issue is addressed in Mountpoint.

Relevant issues: #828

Does this change impact existing behavior?

No. The new async write API results in changes to the specific timing of multi-part upload requests, with an impact on when errors are reported, but as part of this change we introduced an explicit wait for the CreateMultiPartUpload request to complete before the first write, which reproduces existing behavior.


By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license and I agree to the terms of the Developer Certificate of Origin (DCO).

@passaro passaro temporarily deployed to PR integration tests April 3, 2024 10:46 — with GitHub Actions Inactive
@passaro passaro had a problem deploying to PR integration tests April 3, 2024 10:46 — with GitHub Actions Failure
@passaro passaro temporarily deployed to PR integration tests April 3, 2024 10:46 — with GitHub Actions Inactive
@passaro passaro had a problem deploying to PR integration tests April 3, 2024 10:46 — with GitHub Actions Failure
@passaro passaro temporarily deployed to PR integration tests April 3, 2024 10:46 — with GitHub Actions Inactive
@passaro passaro had a problem deploying to PR integration tests April 3, 2024 10:46 — with GitHub Actions Failure
@passaro passaro had a problem deploying to PR integration tests April 3, 2024 10:46 — with GitHub Actions Failure
@passaro passaro temporarily deployed to PR integration tests April 4, 2024 11:55 — with GitHub Actions Inactive
@passaro passaro temporarily deployed to PR integration tests April 4, 2024 11:55 — with GitHub Actions Inactive
@passaro passaro temporarily deployed to PR integration tests April 4, 2024 11:55 — with GitHub Actions Inactive
@passaro passaro temporarily deployed to PR integration tests April 4, 2024 11:55 — with GitHub Actions Inactive
@passaro passaro temporarily deployed to PR integration tests April 4, 2024 11:55 — with GitHub Actions Inactive
@passaro passaro temporarily deployed to PR integration tests April 4, 2024 11:55 — with GitHub Actions Inactive
@passaro passaro temporarily deployed to PR integration tests April 4, 2024 11:55 — with GitHub Actions Inactive
Copy link
Contributor

@dannycjones dannycjones left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the approach looks good! I especially like that it appears to significantly simplify the approach on our side. :)

There's a few comments below, nothing too scary.

mountpoint-s3-client/tests/put_object.rs Show resolved Hide resolved
Comment on lines 228 to 234
// Poll the future only once. This is quite fragile and relies on the following assumptions:
// * after the first invocation, `S3PutObjectRequest::write` calls `aws_s3_async_write` on first poll,
// * for large enough buffers, `aws_s3_async_write` defers the copy.
let waker = noop_waker();
let mut cx = Context::from_waker(&waker);
let poll_result = write.poll_unpin(&mut cx);
assert!(matches!(poll_result, std::task::Poll::Pending));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does the test about polling the future once? If I poll more than once, is that a problem?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I want to test cancellation, so I need to get to the state where the write has been issued, but has not yet completed. That's why I only poll once (write will invoke aws_s3_meta_request_write) and then drop the future (which should cancel the meta request).

I don't really like it, but I can't think of a better way to test it. Any ideas?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is fine. We can poll multiple times if we wanted to, right? This is where my confusion lies.

mountpoint-s3-crt/src/s3/client.rs Outdated Show resolved Hide resolved
mountpoint-s3-crt/src/s3/client.rs Outdated Show resolved Hide resolved
mountpoint-s3-crt/src/s3/client.rs Outdated Show resolved Hide resolved
mountpoint-s3-crt/src/s3/client.rs Outdated Show resolved Hide resolved
mountpoint-s3-crt/src/s3/client.rs Outdated Show resolved Hide resolved
mountpoint-s3-crt/src/s3/client.rs Outdated Show resolved Hide resolved
mountpoint-s3-crt/src/s3/client.rs Outdated Show resolved Hide resolved
mountpoint-s3-crt/src/s3/client.rs Outdated Show resolved Hide resolved
@passaro passaro temporarily deployed to PR integration tests April 5, 2024 15:03 — with GitHub Actions Inactive
@passaro passaro temporarily deployed to PR integration tests April 5, 2024 15:03 — with GitHub Actions Inactive
@passaro passaro temporarily deployed to PR integration tests April 5, 2024 15:03 — with GitHub Actions Inactive
@passaro passaro had a problem deploying to PR integration tests April 5, 2024 15:03 — with GitHub Actions Failure
@passaro passaro temporarily deployed to PR integration tests April 5, 2024 15:03 — with GitHub Actions Inactive
@passaro passaro temporarily deployed to PR integration tests April 5, 2024 15:03 — with GitHub Actions Inactive
@passaro passaro temporarily deployed to PR integration tests April 5, 2024 15:03 — with GitHub Actions Inactive
@passaro passaro temporarily deployed to PR integration tests April 5, 2024 18:14 — with GitHub Actions Inactive
@passaro passaro temporarily deployed to PR integration tests April 5, 2024 18:14 — with GitHub Actions Inactive
@passaro passaro temporarily deployed to PR integration tests April 5, 2024 18:14 — with GitHub Actions Inactive
@passaro passaro temporarily deployed to PR integration tests April 5, 2024 18:14 — with GitHub Actions Inactive
@passaro passaro temporarily deployed to PR integration tests April 5, 2024 18:14 — with GitHub Actions Inactive
@passaro passaro temporarily deployed to PR integration tests April 5, 2024 18:14 — with GitHub Actions Inactive
@passaro passaro temporarily deployed to PR integration tests April 9, 2024 13:05 — with GitHub Actions Inactive
@passaro passaro temporarily deployed to PR integration tests April 9, 2024 13:05 — with GitHub Actions Inactive
@passaro passaro temporarily deployed to PR integration tests April 9, 2024 13:26 — with GitHub Actions Inactive
@passaro passaro had a problem deploying to PR integration tests April 9, 2024 13:26 — with GitHub Actions Failure
@passaro passaro temporarily deployed to PR integration tests April 9, 2024 13:26 — with GitHub Actions Inactive
@passaro passaro temporarily deployed to PR integration tests April 9, 2024 13:26 — with GitHub Actions Inactive
@passaro passaro temporarily deployed to PR integration tests April 9, 2024 13:26 — with GitHub Actions Inactive
@passaro passaro temporarily deployed to PR integration tests April 9, 2024 13:26 — with GitHub Actions Inactive
@passaro passaro temporarily deployed to PR integration tests April 9, 2024 13:26 — with GitHub Actions Inactive
@passaro passaro temporarily deployed to PR integration tests April 9, 2024 15:32 — with GitHub Actions Inactive
@passaro passaro temporarily deployed to PR integration tests April 9, 2024 15:32 — with GitHub Actions Inactive
@passaro passaro temporarily deployed to PR integration tests April 9, 2024 15:32 — with GitHub Actions Inactive
@passaro passaro temporarily deployed to PR integration tests April 9, 2024 15:32 — with GitHub Actions Inactive
@passaro passaro temporarily deployed to PR integration tests April 9, 2024 15:32 — with GitHub Actions Inactive
@passaro passaro temporarily deployed to PR integration tests April 9, 2024 15:32 — with GitHub Actions Inactive
@passaro passaro temporarily deployed to PR integration tests April 9, 2024 15:32 — with GitHub Actions Inactive
@passaro passaro added this pull request to the merge queue Apr 9, 2024
@dannycjones dannycjones removed the request for review from vladem April 9, 2024 16:25
@passaro passaro removed this pull request from the merge queue due to a manual request Apr 9, 2024
@passaro passaro added this pull request to the merge queue Apr 10, 2024
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Apr 10, 2024
@passaro passaro added this pull request to the merge queue Apr 10, 2024
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Apr 10, 2024
@passaro passaro added this pull request to the merge queue Apr 10, 2024
Merged via the queue into awslabs:main with commit 2767c85 Apr 10, 2024
23 checks passed
@passaro passaro deleted the async-write branch April 10, 2024 14:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants