Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Minimum throughput body timeouts Pt.1 #3068

Merged
merged 21 commits into from
Oct 26, 2023
Merged

Conversation

Velfi
Copy link
Contributor

@Velfi Velfi commented Oct 13, 2023

Motivation and Context

#1562

Description

This change adds a new body wrapper: The minimum throughput limit wrapper. It tracks the rate that data is being streamed from itself. If that rate falls below some configurable limit, it emits an error instead of the next chunk. This protects users from requests that start quickly but then slow down considerably.

I'd like to get this merged and then figure out the codegen/docs/examples/config part in a separate PR.

Testing

Tests are included.


By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@Velfi Velfi force-pushed the zhessler-minimum-throughput-body branch from b1300d4 to 299cdd1 Compare October 13, 2023 19:14
@github-actions
Copy link

A new generated diff is ready to view.

A new doc preview is ready to view.

@github-actions
Copy link

A new generated diff is ready to view.

A new doc preview is ready to view.

@Velfi Velfi marked this pull request as ready for review October 17, 2023 16:59
@Velfi Velfi requested review from a team as code owners October 17, 2023 16:59
@Velfi Velfi changed the title Zhessler minimum throughput body Minimum throughput body timeouts Pt.1 Oct 17, 2023
@github-actions
Copy link

A new generated diff is ready to view.

  • AWS SDK (ignoring whitespace)
  • No codegen difference in the Client Test
  • No codegen difference in the Server Test
  • No codegen difference in the Server Test Python
  • No codegen difference in the Server Test Typescript

A new doc preview is ready to view.

aws/sdk/integration-tests/s3/tests/throughput-timeout.rs Outdated Show resolved Hide resolved
aws/sdk/integration-tests/s3/tests/throughput-timeout.rs Outdated Show resolved Hide resolved
aws/sdk/integration-tests/s3/tests/throughput-timeout.rs Outdated Show resolved Hide resolved
#[pin]
inner: InnerBody,
// A record of when and how much data was read
throughput_logs: VecDeque<(SystemTime, u64)>,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How large does this queue get in practice?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On my machine, a 100MB download with a tick rate of 1s produces 6700 logs. The download completes in only a few secs.

Comment on lines 127 to 129
if this.throughput_logs.len() > LOG_WINDOW_SIZE {
this.throughput_logs.pop_front();
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like it should pop anything that's older than our time window, with the window size being a large upper bound. 16 is probably not high enough? Need to experiment with that number.

@Velfi Velfi marked this pull request as draft October 17, 2023 19:18
@github-actions
Copy link

A new generated diff is ready to view.

  • AWS SDK (ignoring whitespace)
  • No codegen difference in the Client Test
  • No codegen difference in the Server Test
  • No codegen difference in the Server Test Python
  • No codegen difference in the Server Test Typescript

A new doc preview is ready to view.

@github-actions
Copy link

A new generated diff is ready to view.

  • AWS SDK (ignoring whitespace)
  • No codegen difference in the Client Test
  • No codegen difference in the Server Test
  • No codegen difference in the Server Test Python
  • No codegen difference in the Server Test Typescript

A new doc preview is ready to view.

@@ -83,7 +83,7 @@ pub fn default_async_sleep() -> Option<SharedAsyncSleep> {
/// Future returned by [`AsyncSleep`].
#[non_exhaustive]
#[must_use]
pub struct Sleep(Pin<Box<dyn Future<Output = ()> + Send + 'static>>);
pub struct Sleep(Pin<Box<dyn Future<Output = ()> + Send + Sync + 'static>>);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Futures shouldn't be Sync.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After talking with Carl, I decided that this future should in fact be sync. While we could wrap this future in a mutex or similar to make it Sync, I'm opting to just add the bound. The reasoning is that omitting the Sync bound will cause more user pain than including it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Requiring a future to be Sync can cause some issue when used with an async block (lang forum), but I don't necessarily see how that bites us here 🤔

At least, I think this PR needs a label breaking-change because Sleep::new now fails to take an async block for certain cases (playground - derived from the above lang forum).

remove `Sync` bound from `Sleep` fut
update NUMBER_OF_LOGS_IN_ONE_KB const to be more accurate
add trailer support to MTB
fix SdkBody trailer handling
@Velfi
Copy link
Contributor Author

Velfi commented Oct 20, 2023

This PR is currently blocked on some HTTP Body changes that @ysaito1001 will soon PR.

@github-actions
Copy link

A new generated diff is ready to view.

  • AWS SDK (ignoring whitespace)
  • No codegen difference in the Client Test
  • No codegen difference in the Server Test
  • No codegen difference in the Server Test Python
  • No codegen difference in the Server Test Typescript

A new doc preview is ready to view.

@github-actions
Copy link

A new generated diff is ready to view.

  • AWS SDK (ignoring whitespace)
  • No codegen difference in the Client Test
  • No codegen difference in the Server Test
  • No codegen difference in the Server Test Python
  • No codegen difference in the Server Test Typescript

A new doc preview is ready to view.

@Velfi Velfi marked this pull request as ready for review October 23, 2023 19:12
@github-actions
Copy link

A new generated diff is ready to view.

  • AWS SDK (ignoring whitespace)
  • No codegen difference in the Client Test
  • No codegen difference in the Server Test
  • No codegen difference in the Server Test Python
  • No codegen difference in the Server Test Typescript

A new doc preview is ready to view.

@Velfi
Copy link
Contributor Author

Velfi commented Oct 23, 2023

The current CI failures are slightly related to Yuki's work to feature gate stuff, I'll resolve them once he gets that work merged.

@Velfi Velfi mentioned this pull request Oct 25, 2023
9 tasks
@Velfi Velfi added the breaking-change This will require a breaking change label Oct 25, 2023
Copy link
Collaborator

@rcoh rcoh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good—a few questions

this.sleep_fut.replace(sleep_fut);

// Calculate the current throughput and emit an error if it's too low.
let actual_throughput = this.throughput_logs.calculate_throughput()?;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it worth benchmarking this at some point? if we're getting a ton of data how much time do we spend calculating throughput? I wonder if we should only calculate it when the sleep expires, e.g.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, just not in this PR.

}
Self::TimeTravel(_) => write!(
f,
"negative time has elapsed while reading the inner body, this is a bug"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it isn't really though—we're using SystemTime and not Instant. SystemTime doesn't guarantee monotonicity:

Distinct from the Instant type, this time measurement is not monotonic. This means that you can save a file to the file system, then save another file to the file system, and the second file has a SystemTime measurement earlier than the first. In other words, an operation that happens after another operation in real time may have an earlier SystemTime!

https://doc.rust-lang.org/std/time/struct.SystemTime.html

We should probably try to handle this gracefully or add some sort of InstantTimeSource in our bag of tricks

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, just not in this PR.

@github-actions
Copy link

A new generated diff is ready to view.

  • AWS SDK (ignoring whitespace)
  • No codegen difference in the Client Test
  • No codegen difference in the Server Test
  • No codegen difference in the Server Test Python
  • No codegen difference in the Server Test Typescript

A new doc preview is ready to view.

@github-actions
Copy link

A new generated diff is ready to view.

  • AWS SDK (ignoring whitespace)
  • No codegen difference in the Client Test
  • No codegen difference in the Server Test
  • No codegen difference in the Server Test Python
  • No codegen difference in the Server Test Typescript

A new doc preview is ready to view.

add feature gate to poll_next_trailers
@github-actions
Copy link

A new generated diff is ready to view.

  • AWS SDK (ignoring whitespace)
  • No codegen difference in the Client Test
  • No codegen difference in the Server Test
  • No codegen difference in the Server Test Python
  • No codegen difference in the Server Test Typescript

A new doc preview is ready to view.

@Velfi Velfi added this pull request to the merge queue Oct 26, 2023
Merged via the queue into main with commit 0f1f1a6 Oct 26, 2023
40 of 41 checks passed
@Velfi Velfi deleted the zhessler-minimum-throughput-body branch October 26, 2023 20:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
breaking-change This will require a breaking change
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants