Double HBONE implementation for ambient multicluster#1429
istio-testing merged 36 commits into istio:master
Conversation
Skipping CI for Draft Pull Request.
Force-pushed from e27a4a8 to 40bbcd0
Force-pushed from 66db8ae to 99d622f
src/proxy/outbound.rs
Outdated
debug!(component="outbound", dur=?start.elapsed(), "connection completed");
}).instrument(span);

assertions::size_between_ref(1000, 1750, &serve_outbound_connection);
How did we get these numbers?
by looking at the current size and adding a small amount of buffer.
I think we covered this live on the WG call, but just to close the loop here: this must not grow (beyond a trivial amount), else it means every connection will use that much additional memory. Typically the fix here is to Box::pin the futures.
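To illustrate the point about per-connection memory, here is a small std-only sketch (not PR code; the 4 KiB buffer and names are invented). A local held across an await point is embedded in the generated state machine, so a parent that stores the future inline pays its full size, while `Box::pin` reduces the parent's cost to a pointer-sized handle:

```rust
use std::future::Future;
use std::mem::size_of_val;

// Toy future holding a 4 KiB buffer across an await point, so the
// buffer becomes part of the compiler-generated state machine.
async fn big_task() -> usize {
    let buf = [0u8; 4096];
    std::future::ready(()).await;
    buf.len()
}

fn main() {
    // The bare future carries the whole buffer per connection...
    let fut = big_task();
    assert!(size_of_val(&fut) >= 4096);

    // ...while a parent storing Box::pin(big_task()) only holds a
    // (fat) pointer; the state machine lives on the heap instead.
    let boxed: std::pin::Pin<Box<dyn Future<Output = usize>>> = Box::pin(big_task());
    assert!(size_of_val(&boxed) <= 16);
}
```

This is why size assertions like `size_between_ref` catch regressions: any new large local held across an await in `serve_outbound_connection` shows up directly in the measured future size.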
Force-pushed from 99d622f to 96bb4de
src/proxy/pool.rs
Outdated
// Does nothing but spawn new conns when asked
impl ConnSpawner {
    async fn new_unpooled_conn(
Is there anything here we can do higher up, I think? Things might change if we decide to implement pooling in this PR, though.
Yeah, if we want double-HBONE conns to be unpooled and thus need ~none of this surrounding machinery, then I'd be inclined to just start a proxy/double-hbone.rs and use that directly, rather than complicating the purpose of this file.
(Could also just have a common HboneConnMgr trait or something too)
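The trait idea suggested above could look roughly like this hypothetical sketch (the trait name `HboneConnMgr` comes from the comment; everything else is invented for illustration). It lets pooled single-HBONE and unpooled double-HBONE share a call site without complicating pool.rs:

```rust
use std::collections::HashSet;

// Hypothetical common interface; the return value here is just a label
// describing how the connection was obtained.
trait HboneConnMgr {
    fn acquire(&mut self, dest: &str) -> String;
}

// Unpooled variant (double HBONE): always dials a fresh connection.
struct UnpooledMgr;
impl HboneConnMgr for UnpooledMgr {
    fn acquire(&mut self, dest: &str) -> String {
        format!("new connection to {dest}")
    }
}

// Pooled variant (single HBONE): reuses a connection per destination key.
struct PooledMgr {
    pool: HashSet<String>,
}
impl HboneConnMgr for PooledMgr {
    fn acquire(&mut self, dest: &str) -> String {
        if self.pool.insert(dest.to_string()) {
            format!("new connection to {dest}")
        } else {
            format!("reused connection to {dest}")
        }
    }
}

fn main() {
    let mut p = PooledMgr { pool: HashSet::new() };
    assert_eq!(p.acquire("gw"), "new connection to gw");
    assert_eq!(p.acquire("gw"), "reused connection to gw");
    let mut u = UnpooledMgr;
    assert_eq!(u.acquire("gw"), "new connection to gw");
}
```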
src/proxy/outbound.rs
Outdated
// This always drops ungracefully
// drop(conn_client);
// tokio::time::sleep(std::time::Duration::from_secs(1)).await;
// drain_tx.send(true).unwrap();
// tokio::time::sleep(std::time::Duration::from_secs(1)).await;
drain_tx.send(true).unwrap();
let _ = driver_task.await;
// this sleep is important, so we have a race condition somewhere
// tokio::time::sleep(std::time::Duration::from_secs(1)).await;
res
Does anybody have any info on how to properly drop/terminate H2 connections over a stream with nontrivial drops (e.g. shutting down TLS over HTTP/2 CONNECT)? Right now, I'm just dropping things/aborting tasks randomly until something works.
Are you asking about how to clean up after, for example, a RST_STREAM to the inner tunnel? Or something else?
Kinda. I mostly mean the outer TLS stream, because that's what I've looked at. It seems like if I drop conn_client before terminating driver_task, the TCP connection will close without sending close notifies. So yes, I'm asking if there is a way to explicitly do cleanup rather than relying on implicit drops.
I see the code changed; do you still need help figuring this out?
I'm still not confident in it. It works (on my machine), but I couldn't find any docs on proper connection termination/dropping.
Seems like we ignore shutdown errors, so I'm not going to worry about this.
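The ordering the thread converged on can be sketched with a std-only stand-in (threads and channels instead of tokio tasks; all names are illustrative, not PR code): signal drain first, then wait for the connection driver to finish, so in-flight goodbye frames (e.g. the TLS close_notify) can be flushed before anything else is dropped:

```rust
use std::sync::mpsc;
use std::thread;

// Stand-in for the async teardown: 1) signal drain, 2) join the driver
// so it can flush its shutdown handshake, 3) only then drop handles.
fn graceful_shutdown(
    drain_tx: mpsc::Sender<bool>,
    driver: thread::JoinHandle<&'static str>,
) -> &'static str {
    drain_tx.send(true).ok(); // ask the connection driver to wind down
    driver.join().unwrap()    // wait for it before dropping anything else
}

fn main() {
    let (tx, rx) = mpsc::channel();
    // Stand-in "driver task": blocks until drained, then reports a clean close.
    let driver = thread::spawn(move || {
        rx.recv().unwrap();
        "closed cleanly"
    });
    assert_eq!(graceful_shutdown(tx, driver), "closed cleanly");
}
```

Dropping the client handle before the driver finishes inverts step 2 and 3, which matches the observed symptom of the TCP connection closing without close notifies.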
}

async fn send_hbone_request(
fn create_hbone_request(
Git merge is getting confused here
src/config.rs
Outdated
const UNSTABLE_ENABLE_SOCKS5: &str = "UNSTABLE_ENABLE_SOCKS5";

const DEFAULT_WORKER_THREADS: u16 = 2;
const DEFAULT_WORKER_THREADS: u16 = 40;
I may have missed it in the description, but why the change here?
I was hoping it would make debugging async Rust easier (it didn't).
(if you haven't already found it, tokio-console can be helpful)
Force-pushed from 1ea75fb to f1cc535
src/proxy/outbound.rs
Outdated
// Inner HBONE
let upgraded = TokioH2Stream::new(upgraded);
// TODO: dst should take a hostname? and upstream_sans currently contains E/W Gateway certs
let inner_workload = pool::WorkloadKey {
Will reorganize later.
Force-pushed from a8856a4 to 565f41f
src/proxy/outbound.rs
Outdated
Protocol::HBONE | Protocol::DOUBLEHBONE => Some(us.workload_socket_addr()),
Protocol::TCP => None,
};
let (upstream_sans, final_sans) = match us.workload.protocol {
My understanding from talking to @keithmattix is that Upstream.service_sans will be repurposed to contain the identities of remote pods/waypoints, so I should change the logic of the other protocols to only use us.workload.identity instead of us.workload_and_services_san.
Yes, I think this is correct; only the double hbone codepath needs to be added/changed because there are two sans being considered: the e/w gateway SAN and the SANs of the backends. So what you have looks right to me
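The two-SAN-set idea can be sketched as follows (the enum and function names are simplified stand-ins for the PR's types, not its actual code): double HBONE verifies the E/W gateway SAN on the outer TLS layer and the backend SANs on the inner layer, while the other protocols only have one set to check:

```rust
// Simplified protocol enum; the real one lives in ztunnel's workload model.
enum Protocol {
    Tcp,
    Hbone,
    DoubleHbone,
}

// Returns (upstream_sans, final_sans): the SANs for the first TLS hop and
// for the tunneled inner hop, respectively.
fn select_sans(
    proto: Protocol,
    gateway_san: &str,
    backend_sans: Vec<String>,
) -> (Vec<String>, Vec<String>) {
    match proto {
        // Outer hop trusts the E/W gateway; inner hop trusts the backends.
        Protocol::DoubleHbone => (vec![gateway_san.to_string()], backend_sans),
        // Single TLS layer (or none): no separate final SANs.
        _ => (backend_sans, Vec::new()),
    }
}

fn main() {
    let (outer, inner) =
        select_sans(Protocol::DoubleHbone, "spiffe://ew-gw", vec!["spiffe://backend".into()]);
    assert_eq!(outer, vec!["spiffe://ew-gw".to_string()]);
    assert_eq!(inner, vec!["spiffe://backend".to_string()]);

    let (outer, inner) = select_sans(Protocol::Hbone, "unused", vec!["spiffe://backend".into()]);
    assert_eq!(outer, vec!["spiffe://backend".to_string()]);
    assert!(inner.is_empty());
}
```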
src/proxy/pool.rs
Outdated
// send requests over some underlying stream using some underlying http/2 client
struct ConnClient {
    sender: H2ConnectClient,
pub struct ConnClient {
I think the metrics story is clear: only do metrics for inner HBONE. Also, for RBAC, it's only the destination ztunnel that does RBAC, so it's not relevant here?

/test test
src/proxy/h2.rs
Outdated
std::io::Error::new(std::io::ErrorKind::Other, e)
    }
}
tests are in pool.rs
Force-pushed from 50cb231 to b4bd1c0
Force-pushed from b4bd1c0 to 7cd4cf1
src/tls/workload.rs
Outdated
{
    let c = tokio_rustls::TlsConnector::from(self.client_config);
    c.connect(dest, stream).await
    c.connect(DUMMY_DOMAIN.clone(), stream).await
Can you please revert this? We should NOT be sending an SNI.
rustls makes this kind of annoying, but if you just set an IP it will not send anything. We can put 0.0.0.0 if you want.
Putting a dummy IP != putting a dummy domain
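The distinction can be shown with a std-only sketch (the real code would build a rustls `ServerName`; this helper is invented for illustration). TLS SNI (RFC 6066) carries DNS hostnames only, so when the configured server name is an IP literal there is nothing to send, and rustls omits the extension, whereas a dummy domain goes out on the wire in cleartext:

```rust
use std::net::IpAddr;

// Classifies a server name the way SNI handling does: DNS names are sent
// in the SNI extension, IP literals are not (RFC 6066 forbids them).
fn would_send_sni(server_name: &str) -> bool {
    // Parses as an IP literal => no SNI extension on the wire.
    server_name.parse::<IpAddr>().is_err()
}

fn main() {
    assert!(would_send_sni("dummy.example.com")); // leaks a fake hostname
    assert!(!would_send_sni("0.0.0.0"));          // sends no SNI at all
}
```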
src/proxy/outbound.rs
Outdated
.as_ref()
.expect("Workloads with network gateways must be service addressed.");

let (actual_destination, upstream_sans, final_sans) = match &ew_gtw.destination {
nit: can we just put this all in state.rs, like fetch_waypoint but as fetch_network_gateway?
src/proxy/outbound.rs
Outdated
if let Some(ew_gtw) = &us.workload.network_gateway {
    if us.workload.network != source_workload.network {
IIUC this means that if we find a workload in another network but it does NOT have a gateway, we will just send to the IP, which will not work (or worse, may go to some actual destination unrelated to the target). Do we have something protecting against this?
I can add it, but I don't think we ever did this check.
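The guard being discussed could look like this hypothetical sketch (struct and field names are assumptions modeled on the snippet above, not actual PR code): refuse to dial a workload on another network that has no E/W gateway, rather than sending to a raw IP that may reach an unrelated endpoint:

```rust
// Minimal stand-in for the workload model used in the check.
struct Workload {
    network: String,
    network_gateway: Option<String>,
}

// Reject cross-network destinations that cannot be reached via a gateway.
fn check_cross_network(source_network: &str, w: &Workload) -> Result<(), &'static str> {
    if w.network != source_network && w.network_gateway.is_none() {
        return Err("workload on remote network has no network gateway");
    }
    Ok(())
}

fn main() {
    let remote = Workload { network: "net-b".into(), network_gateway: None };
    assert!(check_cross_network("net-a", &remote).is_err());

    let local = Workload { network: "net-a".into(), network_gateway: None };
    assert!(check_cross_network("net-a", &local).is_ok());

    let gatewayed = Workload { network: "net-b".into(), network_gateway: Some("ew-gw".into()) };
    assert!(check_cross_network("net-a", &gatewayed).is_ok());
}
```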
src/proxy/outbound.rs
Outdated
let us_gtw = self
    .pi
    .state
    .fetch_upstream_by_host(
If we follow the above comment about following fetch_waypoint, this would probably end up calling find_hostname --> find_upstream_from_service.
howardjohn left a comment:
Overall looks good, mostly some minor comments, only blocker is the dummy SNI
Force-pushed from f3fa871 to 04e7d30
src/state.rs
Outdated
) -> Result<(SocketAddr, Vec<Identity>, Vec<Identity>), Error> {
    match &gtw.destination {
        Destination::Address(address) => Ok((
            SocketAddr::from((address.address, gtw.hbone_mtls_port)),
The intent of these APIs is that the addresses are not raw IPs we send to, but lookup keys into workloads/services. We should use the same logic as fetch_waypoint, which calls find_upstream.
OK, so I can assume that the address has a Workload object behind it.
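The lookup-key semantics above can be reduced to a minimal sketch (the map and function are invented stand-ins for ztunnel's workload index and find_upstream): the gateway's `Destination::Address` resolves through the index rather than being dialed directly:

```rust
use std::collections::HashMap;

// Stand-in for a find_upstream-style lookup: the address is a key into
// the workload index, not a raw dial target.
fn resolve_gateway_backend<'a>(
    addr: &str,
    workloads: &'a HashMap<String, &'a str>,
) -> Option<&'a str> {
    workloads.get(addr).copied()
}

fn main() {
    let mut index = HashMap::new();
    index.insert("10.0.0.5".to_string(), "ew-gateway-pod");

    // Known address: resolves to the backing Workload entry.
    assert_eq!(resolve_gateway_backend("10.0.0.5", &index), Some("ew-gateway-pod"));
    // Unknown address: no workload behind it, so nothing to dial.
    assert_eq!(resolve_gateway_backend("10.9.9.9", &index), None);
}
```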
the PR title is terrible.

That's my fault; I removed the hold before he could change it.
Initial double HBONE implementation
Right now, the inner HBONE connection will only hold one CONNECT tunnel. Once the inner tunnel terminates, so will the outer tunnel (but not the outer HBONE connection). So when ztunnel receives its first connection to a double HBONE host (E/W gateway), it will perform two TLS handshakes. Subsequent connections to the same host will perform one TLS handshake.
This behavior is not great, but if we put the inner HBONE in the connection pool, then we pin ourselves to a pod in the remote cluster, since ztunnel performs connection pooling but is not aware of the E/W gateway's routing decision.
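The handshake accounting described above can be modeled with a toy sketch (names invented; this is a model of the behavior, not ztunnel code): the outer HBONE connection to the E/W gateway is pooled, the inner one is not, so the first dial costs two TLS handshakes and later dials cost one:

```rust
use std::collections::HashSet;

// Models only the pooling decision: outer connections are cached per
// gateway, inner tunnels always handshake.
struct OuterPool {
    established: HashSet<String>,
}

impl OuterPool {
    // Returns the number of TLS handshakes this dial performs.
    fn connect(&mut self, gateway: &str) -> u32 {
        let outer = if self.established.insert(gateway.to_string()) { 1 } else { 0 };
        outer + 1 // the inner HBONE always performs its own handshake
    }
}

fn main() {
    let mut pool = OuterPool { established: HashSet::new() };
    assert_eq!(pool.connect("ew-gw"), 2); // first connection: outer + inner
    assert_eq!(pool.connect("ew-gw"), 1); // outer reused: inner only
}
```

Pooling the inner connection too would drop the steady-state cost to zero handshakes, but as noted above it would also pin traffic to one remote pod, bypassing the gateway's routing.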
That being said, I think this is a good place to stop and think about control plane implementation and get some feedback on how I'm approaching this.
Tasks:
Some open questions:
N inner HBONE connections per E/W gateway, or per remote cluster?
References: