quic: Connect client UDP sockets before usage #35652
RyanTheOptimist merged 22 commits into envoyproxy:main
This change does a few things:
1. Calls `connect()` on the QUIC UDP sockets, which allows the socket
to know if the peer address is routable earlier.
2. Does not call `bind()` on a QUIC client UDP socket. It's incorrect
to do so.
3. If the socket is already connected, call writev/send instead of
sendmsg. Connected sockets should not specify a peer address and
hence, writev/send is more appropriate.
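The three points above can be sketched with plain POSIX calls. This is a minimal illustration, not Envoy code: `RoundTrip`, `connectedUdpRoundTrip`, and the loopback receiver are names and setup assumed for the example. The client socket is never passed to `bind()`; `connect()` records the peer and the kernel picks the local address, after which `send()` needs no destination argument.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical result type for this sketch only.
struct RoundTrip {
  std::string received;      // payload seen by the receiver, empty on failure
  uint16_t client_port = 0;  // local port the kernel assigned to the client
};

// Hypothetical helper: sends `payload` from a connected, never-bound UDP
// socket to a loopback receiver and reports what arrived.
RoundTrip connectedUdpRoundTrip(const std::string& payload) {
  RoundTrip result;
  int rx = ::socket(AF_INET, SOCK_DGRAM, 0);
  int tx = ::socket(AF_INET, SOCK_DGRAM, 0);

  // Receiver: bound to an ephemeral loopback port chosen by the kernel.
  sockaddr_in peer{};
  peer.sin_family = AF_INET;
  peer.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
  socklen_t peer_len = sizeof(peer);
  timeval timeout{2, 0};  // keep recv() from blocking forever
  bool ok = rx >= 0 && tx >= 0 &&
            ::bind(rx, reinterpret_cast<sockaddr*>(&peer), sizeof(peer)) == 0 &&
            ::getsockname(rx, reinterpret_cast<sockaddr*>(&peer), &peer_len) == 0 &&
            ::setsockopt(rx, SOL_SOCKET, SO_RCVTIMEO, &timeout, sizeof(timeout)) == 0;

  // Client: no bind(). connect() fixes the peer and assigns a local address.
  ok = ok && ::connect(tx, reinterpret_cast<sockaddr*>(&peer), sizeof(peer)) == 0;
  sockaddr_in local{};
  socklen_t local_len = sizeof(local);
  if (ok && ::getsockname(tx, reinterpret_cast<sockaddr*>(&local), &local_len) == 0) {
    result.client_port = ntohs(local.sin_port);
  }

  // send() carries no peer address; the connected socket already knows it.
  if (ok && ::send(tx, payload.data(), payload.size(), 0) ==
                static_cast<ssize_t>(payload.size())) {
    char buf[1500];
    ssize_t n = ::recv(rx, buf, sizeof(buf), 0);
    if (n > 0) result.received.assign(buf, static_cast<size_t>(n));
  }
  if (rx >= 0) ::close(rx);
  if (tx >= 0) ::close(tx);
  return result;
}
```

Calling `connectedUdpRoundTrip("hello")` on a host with a working loopback interface should deliver the payload and report a non-zero client port, even though the client never called `bind()`.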
Signed-off-by: Ali Beyad <abeyad@google.com>
Not really, we saw connect() succeed but a following write fail with "no route to host". The reason we want to connect() here is that a connected UDP socket will surface async socket errors received from ICMP messages.
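A small POSIX sketch of that behavior, under stated assumptions: it is not Envoy code, `errnoAfterSendToClosedPort` is a hypothetical helper, and it relies on the loopback interface answering a datagram to a closed port with an ICMP port-unreachable message (typical on Linux, but dependent on the OS and on ICMP not being filtered). The `connect()` and the first `send()` normally succeed; the error only shows up on a later call on the connected socket.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

#include <cassert>
#include <cerrno>

// Hypothetical helper: returns the errno reported by a call on a connected UDP
// socket after sending to a loopback port with no listener.
// Returns 0 if no error was observed and -1 if the setup itself failed.
int errnoAfterSendToClosedPort() {
  // Find a port that is very likely closed: bind an ephemeral port, record it,
  // then close the socket that held it.
  int probe = ::socket(AF_INET, SOCK_DGRAM, 0);
  if (probe < 0) return -1;
  sockaddr_in peer{};
  peer.sin_family = AF_INET;
  peer.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
  socklen_t len = sizeof(peer);
  bool ok = ::bind(probe, reinterpret_cast<sockaddr*>(&peer), sizeof(peer)) == 0 &&
            ::getsockname(probe, reinterpret_cast<sockaddr*>(&peer), &len) == 0;
  ::close(probe);
  if (!ok) return -1;

  int fd = ::socket(AF_INET, SOCK_DGRAM, 0);
  if (fd < 0) return -1;
  timeval timeout{2, 0};  // bound the wait in case no ICMP error arrives
  ::setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &timeout, sizeof(timeout));

  // connect() on UDP only records the peer; it sends nothing on the wire.
  if (::connect(fd, reinterpret_cast<sockaddr*>(&peer), sizeof(peer)) != 0) {
    ::close(fd);
    return -1;
  }

  int err = 0;
  if (::send(fd, "x", 1, 0) < 0) {
    err = errno;
  } else {
    // The ICMP error generated by the datagram above is reported here.
    char buf[16];
    if (::recv(fd, buf, sizeof(buf), 0) < 0) err = errno;
  }
  ::close(fd);
  return err;
}
```

On a Linux host this is expected to return `ECONNREFUSED`. An unconnected socket using `sendto()` would generally not see the error on a later call, which is the motivation given in the comment above.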
/assign @danzh2010
Alyssa is OOO this week. Assigning Ryan, who has more QUIC context.
danzh2010 left a comment:
Thanks for working on this! Just a few nits!
@danzh2010 I pushed up new changes that I believe fix all the tests. PTAL!
abeyad left a comment:
I also fixed some more broken tests. PTAL, thanks!
/retest
abeyad left a comment:
Addressed yesterday's comments, thanks!
@danzh2010 friendly ping, thanks!
RyanTheOptimist left a comment:
Out of curiosity, have we tried to use EM with this PR to make sure it works as expected?
Also, should we add a runtime guard, since this changes behavior?
    Network::Socket::Type::Datagram,
    // Use the loopback address if `local_addr` is null, to pass in the socket interface used to
    // create the IoHandle, without having to make the more expensive `getifaddrs` call.
    local_addr ? local_addr : getLoopbackAddress(peer_addr->ip()->version()), peer_addr,
I'm curious how this local address is used. In the server case, we need a local address because that's what we listen on. But in the normal POSIX client socket case, a caller typically does not specify a local address. Do you understand how this is used?
Unfortunately, the local address is needed to get a SocketInterface instance, which in turn is needed to create an IoHandle instance. If we don't supply an address here, we segfault when creating the IoHandle in the SocketImpl constructor. Dan and I discussed this and we think the setup of SocketImpl isn't correct: we should be able to use a SocketInterface independent of an Address instance. But that's a change for a separate PR.
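The fallback described here can be sketched in isolation. This is a simplified illustration, not the PR's code: addresses are plain strings rather than Envoy address instances, and `IpVersion`, `loopbackFor`, and `addressForSocketCreation` are hypothetical names. It only shows the selection logic: use the caller's local address when one is given, otherwise a loopback address of the same IP version as the peer.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical stand-in for the peer address's IP version.
enum class IpVersion { v4, v6 };

// Hypothetical stand-in for the PR's getLoopbackAddress(): returns the
// loopback address matching the peer's IP version.
std::string loopbackFor(IpVersion version) {
  return version == IpVersion::v4 ? "127.0.0.1" : "::1";
}

// Picks the address handed to socket creation. When the caller gave no local
// address, a loopback address of the peer's IP version is used as a
// placeholder, so that no interface enumeration is needed.
std::string addressForSocketCreation(const std::optional<std::string>& local_addr,
                                     IpVersion peer_version) {
  return local_addr ? *local_addr : loopbackFor(peer_version);
}
```

In this sketch, `addressForSocketCreation(std::nullopt, IpVersion::v6)` yields `"::1"`, while an explicit local address is passed through unchanged.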
I see. That makes sense, but it makes me nervous. Have we tried running this PR against an external server (using HTTP/3)?
It should only affect upstream traffic. And if we hide it behind a runtime feature, it should be fine.
We have tested it on a mobile app connecting to an H/3 server, but for added safety, I added a runtime guard.
Excellent. Can you also mention that in the PR description and update the release notes?
The release notes were updated in https://github.com/envoyproxy/envoy/pull/35652/files#diff-6f9c718224c533c13c2c0ba1d5abaab86be9d0cc73808749c77934e9f9b0d5d0. I just updated the PR description.
"No route to host" can be reported synchronously during connect(), as we observed in the tests. But we want a connected UDP socket overall because it surfaces any async errors (from ICMP messages) via following
Updated the PR description.
danzh2010 left a comment:
Thanks for working on this!
Thanks for the great reviews, @danzh2010 and @RyanTheOptimist!
This change does a few things:
1. Calls `connect()` on the QUIC UDP sockets, which allows the connection to receive async ICMP error messages on `recvmsg` calls throughout the lifetime of the connection.
2. Does not call `bind()` on a QUIC client UDP socket unless the local address is specified.
3. If the socket is already connected, calls `writev`/`send` instead of `sendmsg`. Connected sockets should not specify a peer address and hence, `writev`/`send` is more appropriate.
4. When `EnvoyQuicClientConnection` did preferred server address probing, it specified the existing remote address, not the new preferred server address. This still worked because the `sendmsg` call took in the actual destination IP address. However, when calling `connect()`, this caused the socket to connect to the wrong address. This issue is fixed in this PR by creating a socket connection to the preferred server address.
5. Uses the loopback address, when no local address is specified, as the address passed to `createConnectionSocket` to get an `IoHandle`. This avoids having to make the more expensive `getifaddrs` call, as mentioned in "Network::Utility::getLocalAddress performance is bad in multi-threads" (#35137).
6. The `connect()` behavior is guarded by the runtime guard `envoy.reloadable_features.quic_connect_client_udp_sockets`, defaulted to `true`.

Fixes #35137
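As a rough illustration of the write-path choice and the guard, here is a minimal POSIX sketch. It is not Envoy's implementation: `writeDatagram` and `roundTrip` are hypothetical helpers, and the boolean `connect_client_udp_sockets` merely stands in for the runtime guard named above. With the flag on, the socket is connected and written with `writev` and no peer address; with it off, the socket stays unconnected and each `sendmsg` call names the destination.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/uio.h>
#include <unistd.h>

#include <cassert>
#include <string>

// Hypothetical write helper: a connected socket is written with writev() and
// no destination; an unconnected one uses sendmsg() with the peer in msg_name.
ssize_t writeDatagram(int fd, bool socket_is_connected, const sockaddr_in& peer,
                      const std::string& data) {
  iovec iov{};
  iov.iov_base = const_cast<char*>(data.data());
  iov.iov_len = data.size();
  if (socket_is_connected) {
    return ::writev(fd, &iov, 1);
  }
  msghdr msg{};
  msg.msg_name = const_cast<sockaddr_in*>(&peer);
  msg.msg_namelen = sizeof(peer);
  msg.msg_iov = &iov;
  msg.msg_iovlen = 1;
  return ::sendmsg(fd, &msg, 0);
}

// Hypothetical driver: sends `data` to a loopback receiver using either path
// and returns what the receiver read (empty on failure).
std::string roundTrip(bool connect_client_udp_sockets, const std::string& data) {
  std::string received;
  int rx = ::socket(AF_INET, SOCK_DGRAM, 0);
  int tx = ::socket(AF_INET, SOCK_DGRAM, 0);
  sockaddr_in peer{};
  peer.sin_family = AF_INET;
  peer.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
  socklen_t len = sizeof(peer);
  timeval timeout{2, 0};  // keep recv() from blocking forever
  bool ok = rx >= 0 && tx >= 0 &&
            ::bind(rx, reinterpret_cast<sockaddr*>(&peer), sizeof(peer)) == 0 &&
            ::getsockname(rx, reinterpret_cast<sockaddr*>(&peer), &len) == 0 &&
            ::setsockopt(rx, SOL_SOCKET, SO_RCVTIMEO, &timeout, sizeof(timeout)) == 0;
  if (ok && connect_client_udp_sockets) {
    ok = ::connect(tx, reinterpret_cast<sockaddr*>(&peer), sizeof(peer)) == 0;
  }
  if (ok && writeDatagram(tx, connect_client_udp_sockets, peer, data) ==
                static_cast<ssize_t>(data.size())) {
    char buf[1500];
    ssize_t n = ::recv(rx, buf, sizeof(buf), 0);
    if (n > 0) received.assign(buf, static_cast<size_t>(n));
  }
  if (rx >= 0) ::close(rx);
  if (tx >= 0) ::close(tx);
  return received;
}
```

Both `roundTrip(true, "ping")` and `roundTrip(false, "ping")` should deliver the payload on a host with a working loopback interface. The difference is that the connected variant always sends to the address given to `connect()`, which is why item 4 requires connecting to the preferred server address rather than the existing remote address.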