multitask quic socket operations #601
Conversation
|
Backports to the stable branch are to be avoided unless absolutely necessary for fixing bugs, security issues, and perf regressions. Changes intended for backport should be structured such that a minimum effective diff can be committed separately from any refactoring, plumbing, cleanup, etc that are not strictly necessary to achieve the goal. Any of the latter should go only into master and ride the normal stabilization schedule. |
|
Backports to the beta branch are to be avoided unless absolutely necessary for fixing bugs, security issues, and perf regressions. Changes intended for backport should be structured such that a minimum effective diff can be committed separately from any refactoring, plumbing, cleanup, etc that are not strictly necessary to achieve the goal. Any of the latter should go only into master and ride the normal stabilization schedule. Exceptions include CI/metrics changes, CLI improvements and documentation updates on a case by case basis. |
t-nelson
left a comment
this is pretty much straight from @alessandrod's original work. i took the liberty of removing some debug statements, making the net-utils changes less invasive and a few other things noted below
| pub public_tpu_forwards_addr: Option<SocketAddr>, | ||
| } | ||
|
|
||
| const QUIC_ENDPOINTS: usize = 10; |
didn't seem like this needed to be part of public api, so i moved it out of sdk
| parse_host_port(&string).map(|_| ()) | ||
| } | ||
|
|
||
| #[derive(Clone, Debug)] |
kinda regretting not deriving Copy after typing .clone() more times than anticipated...
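For reference, a minimal sketch of what deriving `Copy` buys here, assuming the config holds only plain flag fields. `ExampleSocketConfig`, its `reuseport` field, `describe`, and `describe_many` are hypothetical stand-ins and not the actual types or functions in this PR; only `reuseaddr` appears in the diff.

```rust
// Hypothetical stand-in for a small socket config struct.
// Only `reuseaddr` is taken from the diff; `reuseport` is an assumed field.
#[derive(Clone, Copy, Debug, Default, PartialEq)]
pub struct ExampleSocketConfig {
    pub reuseaddr: bool,
    pub reuseport: bool,
}

// Takes the config by value, the way the bind helpers in the diff do.
pub fn describe(config: ExampleSocketConfig) -> String {
    format!("reuseaddr={} reuseport={}", config.reuseaddr, config.reuseport)
}

pub fn describe_many(config: ExampleSocketConfig, n: usize) -> Vec<String> {
    // Because the struct is Copy, `config` can be passed by value on every
    // iteration without writing `.clone()`.
    (0..n).map(|_| describe(config)).collect()
}
```

`Copy` can only be derived while every field is itself `Copy`, so this stops working if the config later grows an owned field such as a `String`.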
| } | ||
|
|
||
| impl Default for SocketConfig { | ||
| #[allow(clippy::derivable_impls)] |
| std::iter::once(Ok(socket)) | ||
| .chain((1..num).map(|_| bind_to_with_config(ip, port, config.clone()))) | ||
| .collect() |
made this functional instead of imperative
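A self-contained sketch of the same pattern, for readers unfamiliar with collecting an iterator of `Result`s: `collect()` into `io::Result<Vec<_>>` stops at the first error and returns it. `bind_one` and `bind_more` below are hypothetical stand-ins that bind ephemeral ports with std only; the real helpers in this PR take a `SocketConfig` and rebind the same port.

```rust
use std::io;
use std::net::{IpAddr, SocketAddr, UdpSocket};

// Hypothetical helper standing in for `bind_to_with_config`.
// It binds an ephemeral port (port 0) on the given IP.
fn bind_one(ip: IpAddr) -> io::Result<UdpSocket> {
    UdpSocket::bind(SocketAddr::new(ip, 0))
}

// Returns `first` plus `num - 1` additional sockets on the same IP.
// If any bind fails, the first error is returned and no Vec is produced.
// Note that the result always contains `first`, even when `num` is 0.
pub fn bind_more(first: UdpSocket, num: usize) -> io::Result<Vec<UdpSocket>> {
    let ip = first.local_addr()?.ip();
    std::iter::once(Ok(first))
        .chain((1..num).map(|_| bind_one(ip)))
        .collect()
}
```

The imperative equivalent would push into a `Vec` inside a loop and use `?` on each bind; the iterator form does the same short-circuiting through the `FromIterator` impl for `Result`.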
|
|
||
| [dev-dependencies] | ||
| assert_matches = { workspace = true } | ||
| socket2 = { workspace = true } |
| pub public_tpu_forwards_addr: Option<SocketAddr>, | ||
| } | ||
|
|
||
| const QUIC_ENDPOINTS: usize = 10; |
I tried with 4 -- there was about 20% degradation in terms of throughput. I think 10 is an okay choice for now. I have not seen much of a memory usage increase under the spamming tool.
| ) | ||
| .unwrap(); | ||
| let quic_config = SocketConfig { | ||
| reuseaddr: false, |
Why do we need to redeclare when it was cloned at 2831?
| quic_config.clone(), | ||
| ); | ||
| let tpu_forwards_quic = | ||
| bind_more_with_config(tpu_forwards_quic, QUIC_ENDPOINTS, quic_config.clone()).unwrap(); |
| } | ||
|
|
||
| #[cfg(any(windows, target_os = "ios"))] | ||
| fn udp_socket_with_config(config: SocketConfig) -> io::Result<Socket> { |
| coalesce, | ||
| )); | ||
|
|
||
| let mut accepts = incoming |
Can you please add some comments here explaining why we are doing this select!? It is not immediately obvious.
| Arc<StreamStats>, | ||
| ) { | ||
| let s = UdpSocket::bind("127.0.0.1:0").unwrap(); | ||
| let sockets = (0..10) |
| .collect::<FuturesUnordered<_>>(); | ||
| while !exit.load(Ordering::Relaxed) { | ||
| let timeout_connection = timeout(WAIT_FOR_CONNECTION_TIMEOUT, incoming.accept()).await; | ||
| let timeout_connection = select! { |
I think what we're looking for is just to await the FuturesUnordered, or use JoinSet. This can also be made to work with a timeout, by simply adding a timeout future to the FuturesUnordered or JoinSet (and having the futures return an option/result).
|
SO_REUSEPORT does not load balance as you would expect on RHEL 8 |
|
closing for #611 |
@ripatel-fd do you have a [cit] for this? because we looked into git history and stickiness was there since day 1 |
No, this is behavior I observed when I tested on my box. I can write up a repro C program and post a log of it running on my box. I'll send it to you in Discord once I have it. (I do believe you about the stickiness. Glad to hear it's there... maybe I just messed up my test infra) |
Problem
quinn uses a single task to do socket operations. not great
Summary of Changes
use SO_REUSEPORT to fan out socket operations to multiple quinn::Endpoints