Clean up socket allocation logic in cluster_info.rs#5832
Clean up socket allocation logic in cluster_info.rs#5832alexpyattaev merged 7 commits intoanza-xyz:masterfrom
Conversation
e8ce52e to
a34fa4a
Compare
957f9a4 to
7e8b292
Compare
0b78585 to
b34f81c
Compare
b34f81c to
47861e7
Compare
f436b15 to
1745647
Compare
e4fbce3 to
44f1034
Compare
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #5832 +/- ##
=========================================
- Coverage 83.3% 83.3% -0.1%
=========================================
Files 853 853
Lines 378158 377982 -176
=========================================
- Hits 315300 315110 -190
- Misses 62858 62872 +14 🚀 New features to boost your workflow:
|
bw-solana
left a comment
There was a problem hiding this comment.
I'm having a tough time reviewing thoroughly. It might help to break this up into separate PRs for some of the major changes:
- Reorganizing the socket binding APIs
- Changing the reuse port usage
- Switching
VALIDATOR_PORT_RANGE-->localhost_port_range_for_tests - Moving
get_gossip_port
All but the socket binding reorganization should be trivial to review. For that one, it might be nice to understand the current hierarchy of functions and the new hierarchy w/ your changes to help guide the review
This plan can not work because stuff will fail CI and we will be committing broken code, so first two steps would essentialy have to be merged into one. We can definitely move get_gossip_port and switch to localhost_port_range in a separate PRs though. |
e6e16d6 to
f47ea7f
Compare
gregcusack
left a comment
There was a problem hiding this comment.
mostly looking good, but it appears new_single_bind() is no longer doing single bind
| } | ||
|
|
||
| /// create localhost node for tests with provided pubkey | ||
| /// unlike the public IP version, this will also bind RPC sockets. |
There was a problem hiding this comment.
what is the "public IP version"? are you referring to new_with_external_ip()? Can we refer to it directly by name in this comment
| ) -> (u16, UdpSocket) { | ||
| bind_in_range_with_config(bind_ip_addr, port_range, config).expect("Failed to bind") | ||
| let addr = IpAddr::V4(Ipv4Addr::LOCALHOST); | ||
| let gossip_addr = SocketAddr::new(addr, 0); |
There was a problem hiding this comment.
before we were binding to gossip localhost within the port range provided by localhost_port_range_for_tests(). now we are just binding to a random port, possibly outside the port range. this is probably fine since it's just for tests and stuff but does change behavior
There was a problem hiding this comment.
Thank you for catching this one, it is a bug! It would not "break" anything but would just cause flaky tests again.
| let (tvu_port, tvu) = Self::bind_with_config(bind_ip_addr, port_range, socket_config); | ||
| let (tvu_quic_port, tvu_quic) = | ||
| Self::bind_with_config(bind_ip_addr, port_range, socket_config); | ||
| let ((tpu_port, tpu), (_tpu_quic_port, tpu_quic)) = |
There was a problem hiding this comment.
maybe i'm reading this wrong, but in this old code, tpu gets just a single UdpSocket. but then in the new code we call:
let ((tpu_port, tpu_sockets), (_tpu_port_quic, tpu_quic)) =
bind_two_in_range_with_offset_and_config(
bind_ip_addr,
port_range,
QUIC_PORT_OFFSET,
socket_config,
socket_config,
)
.expect("tou_socket primary bind");
let tpu_sockets =
bind_more_with_config(tpu_sockets, 32, socket_config).expect("tpu_sockets multi_bind");
so now for new_single_bind() we are binding 32 sockets for tpu even though we only want one
There was a problem hiding this comment.
same goes for tvu, tpu_forwards, tpu_vote, broadcast, retransmit_sockets
There was a problem hiding this comment.
Yes, it is different, but it is a distinction without a difference. new_single_bind is only used in test_validator, and it simply does not derive any meaningful benefit from multiple binds, but it does not hurt it either. This does remind that we need to finally deprecate TPU UDP for good though!
There was a problem hiding this comment.
ok so we are good binding multiple sockets to the same ip/port for test validator? that works for me. but then probably would change name from new_single_bind() to match the fact that it is no longer a single bind
There was a problem hiding this comment.
its public API =( I guess we can go ahead and inline it to its sole caller instead.
Co-authored-by: Greg Cusack <greg.cusack@anza.xyz>
| ) | ||
| .expect("tou_socket primary bind"); | ||
| let tpu_sockets = | ||
| bind_more_with_config(tpu_sockets, 32, socket_config).expect("tpu_sockets multi_bind"); |
There was a problem hiding this comment.
Why hard coded the count? Should be based on the request
There was a problem hiding this comment.
it has been hardcoded before. Also, do we even use use a single one of those 32? I thought TPU UDP is not used anymore.
There was a problem hiding this comment.
@sakridge confirmed it is going away so we do not care at the moment.
Co-authored-by: Greg Cusack <greg.cusack@anza.xyz>
| .expect("Number of QUIC endpoints can not be zero"), | ||
| vortexor_receiver_addr: None, | ||
| }; | ||
| let mut node = Self::new_with_external_ip(pubkey, config); |
There was a problem hiding this comment.
This section code is almost identical to the one in new_single_bind. except for the gossip port is set. Maybe put it in a common code?
There was a problem hiding this comment.
I believe new_single_bind should get deprecated. Just axing it is also an option but in this case we can follow procedure. I think it has no practical purpose, just more code duplication. Am I missing something?
lijunwangs
left a comment
There was a problem hiding this comment.
LGTM. it much reduced code duplication
Problem
#5824
Summary of Changes