3 changes: 2 additions & 1 deletion quic/s2n-quic-core/Cargo.toml
@@ -14,7 +14,7 @@ exclude = ["corpus.tar.gz"]
default = ["alloc", "std"]
alloc = ["atomic-waker", "bytes", "crossbeam-utils", "s2n-codec/alloc"]
std = ["alloc", "once_cell"]
testing = ["std", "generator", "s2n-codec/testing", "checked-counters", "insta", "futures-test"]
testing = ["std", "generator", "s2n-codec/testing", "checked-counters", "insta", "futures-test", "bach"]
generator = ["bolero-generator"]
checked-counters = []
branch-tracing = ["tracing"]
@@ -47,6 +47,7 @@ tracing = { version = "0.1", default-features = false, optional = true }
zerocopy = { version = "0.8", features = ["derive"] }
futures-test = { version = "0.3", optional = true } # For testing Waker interactions
once_cell = { version = "1", optional = true }
bach = { version = "0.1.0", optional = true }

[dev-dependencies]
bolero = "0.13"
73 changes: 73 additions & 0 deletions quic/s2n-quic-core/src/io/event_loop.rs
@@ -97,6 +97,9 @@ where

let select = cooldown.wrap(select);

#[cfg(feature = "testing")]
bach_cpu::assert_zero_cpu();

let select::Outcome {
rx_result,
tx_result,
@@ -109,6 +112,9 @@ where
return;
};

#[cfg(feature = "testing")]
bach_cpu::take_cpu().await;

// notify the application that we woke up and why
let wakeup_timestamp = clock.get_time();
{
@@ -126,10 +132,16 @@

match rx_result {
Some(Ok(())) => {
#[cfg(feature = "testing")]
bach_cpu::assert_zero_cpu();

// we received some packets. give them to the endpoint.
rx.queue(|queue| {
endpoint.receive(queue, &clock);
});

#[cfg(feature = "testing")]
bach_cpu::take_cpu().await;
}
Some(Err(error)) => {
// The RX provider has encountered an error. shut down the event loop
@@ -160,11 +172,20 @@ where
}
}

#[cfg(feature = "testing")]
bach_cpu::assert_zero_cpu();

// Let the endpoint transmit, if possible
tx.queue(|queue| {
endpoint.transmit(queue, &clock);
});

#[cfg(feature = "testing")]
bach_cpu::take_cpu().await;

#[cfg(feature = "testing")]
bach_cpu::assert_zero_cpu();

// Get the next expiration from the endpoint and update the timer
let timeout = endpoint.timeout();
if let Some(timeout) = timeout {
@@ -187,3 +208,55 @@ where
}
}
}

/// This allows various parts of s2n-quic to "spend" CPU cycles within bach simulations
/// deterministically. The goal is to allow simulating (especially) handshakes accurately, which
/// incur significant CPU cycles and as such delay processing subsequent packets. It's inaccurate
/// to model this as network delay.
mod bach_cpu {
#[cfg(feature = "testing")]
use core::cell::Cell;
use core::time::Duration;

// CPU is currently attributed within the event loop, which is (at least today) always
// single-threaded, and we never yield while there's still unspent CPU.
Contributor: Another thought I kind of wanted to jot down: this breaks with offloading; it messes with the assumptions being made in this PR, because if the TLS task is now async then you could be attributing CPU while the event loop task is sleeping.

Contributor: I actually think we should build on the changes we made for the async TLS task to accomplish this. Basically we could use that runtime trait to spawn a wrapped TLS task that adds delays. We may want to also add support for delaying responses? Not sure... Anyway, that area of the code seems like the right place to hook in.

Contributor: Hmmm yeah, that does sound doable. It wouldn't help if you wanted to simulate a non-offload handshake, though.

Contributor: That's true, yeah.

//
// FIXME: I *think* an alternative to this is to wire up an event or pseudo-event that s2n-quic
// itself would subscribe to -- that would be a bit less plumbing, but the crypto code doesn't
// directly publish events today so it wouldn't be quite enough either.
#[cfg(feature = "testing")]
thread_local! {
static CPU_SPENT: Cell<Duration> = const { Cell::new(Duration::ZERO) };
}

#[inline]
pub fn attribute_cpu(time: Duration) {
#[cfg(feature = "testing")]
{
CPU_SPENT.with(|c| {
let old = c.get();
let new = old + time;
c.set(new);
});
}
}

#[cfg(feature = "testing")]
pub(super) async fn take_cpu() {
// Make sure assert_zero_cpu works in all cfg(testing), not just with bach.
let taken = CPU_SPENT.take();

if !bach::is_active() {
return;
}

bach::time::sleep(taken).await;
}

#[cfg(feature = "testing")]
pub(super) fn assert_zero_cpu() {
assert_eq!(CPU_SPENT.get(), Duration::ZERO);
}
}

pub use bach_cpu::attribute_cpu;
2 changes: 1 addition & 1 deletion quic/s2n-quic-core/src/io/rx.rs
@@ -12,7 +12,7 @@ pub trait Rx: Sized {
// TODO make this generic over lifetime
// See https://github.com/aws/s2n-quic/issues/1742
type Queue: Queue<Handle = Self::PathHandle>;
type Error;
type Error: Send;

/// Returns a future that yields after a packet is ready to be received
#[inline]
2 changes: 1 addition & 1 deletion quic/s2n-quic-core/src/io/tx.rs
@@ -15,7 +15,7 @@ pub trait Tx: Sized {
// TODO make this generic over lifetime
// See https://github.com/aws/s2n-quic/issues/1742
type Queue: Queue<Handle = Self::PathHandle>;
type Error;
type Error: Send;

/// Returns a future that yields after a packet is ready to be transmitted
#[inline]
13 changes: 13 additions & 0 deletions quic/s2n-quic-tls/src/callback.rs
@@ -360,6 +360,16 @@ where
fn on_read(&mut self, data: &mut [u8]) -> usize {
let max_len = Some(data.len());

// This is a semi-arbitrary number. However, it happens to work out OK as an approximation of
// CPU spend during handshakes. On one side of a connection a handshake typically costs about
// 0.5-1ms of CPU. Most of that is driven by the peer sending information that the local
// endpoint acts on (e.g., verifying signatures).
//
// In practice this was chosen to make s2n-quic-sim simulate an uncontended mTLS handshake
// as taking 2ms (in combination with the transition edge adding some extra cost), which is
// fairly close to what we see in one scenario with real handshakes.
s2n_quic_core::io::event_loop::attribute_cpu(core::time::Duration::from_micros(100));
Contributor: I'm fine with this for now. I do wonder if @maddeleine's TLS offload work would be a bit less intrusive, though, since you could essentially intercept the task and inject this delay. But I'm fine with unblocking you for now.

Collaborator (author): I could in principle put this in a wrapper around s2n-quic-tls's Provider (similar to slow_tls), it's just fairly painful to write. I think @maddeleine's work doesn't change that -- it's not offloading just the crypto (what we'd probably ideally do here), so there's no obvious attach point. Every poll is too much.

Contributor: That's a good point, yeah.

Contributor: A note: I don't know if attributing some micros per on_read call will lead to the most accurate depiction of TLS handshake times. s2n-tls calls on_read twice for every TLS record (once to retrieve the record header and then once to retrieve the full record). Additionally, it can be called even if there is no TLS data to provide to s2n-tls. Maybe we don't care about being precise here; we're just trying to attribute some time to the TLS ops. But I dunno, it feels like this estimation could get wildly off depending on which handshake you're performing.

let chunk = match self.state.rx_phase {
HandshakePhase::Initial => self.context.receive_initial(max_len),
HandshakePhase::Handshake => self.context.receive_handshake(max_len),
@@ -407,6 +417,9 @@ enum HandshakePhase {

impl HandshakePhase {
fn transition(&mut self) {
// See comment in `on_read` for value and why this exists.
s2n_quic_core::io::event_loop::attribute_cpu(core::time::Duration::from_micros(100));

*self = match self {
Self::Initial => Self::Handshake,
_ => Self::Application,