tcproute: add support for round-robin load balancing #148
Conversation
This looks great, thank you for jumping on this!
Many of my comments are about small improvements or idioms but nothing major. While I think in the future we can probably take a look at other TCP implementations and take some inspiration there, I think it's reasonable to have this as a first step which gives us something that works and then work iteratively up from here.
I'm specifically appreciative of the comments you've been adding, these can be very helpful to future readers of the code and make our codebase more inviting/inclusive for the community.
As an aside and unrelated to any specific bit of code or suggestion: just a general reminder that we're in a memory constrained environment of 512 bytes stack memory with ebpf programs, which felt kind of relevant to me while reading this PR as we've added quite a bit to those programs here. I don't think there's any immediate action to take, as I don't think we're anywhere near in danger of exhausting that limit right now but just something to think about as we continue to add functionality as if we can do some of our optimizations up front we can potentially reduce the size of refactors later.
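To make the stack concern above concrete: every local value an eBPF program holds at once counts against the 512-byte limit. A minimal plain-Rust sketch of that accounting (the struct layouts below are illustrative stand-ins, not Blixt's actual types):

```rust
use std::mem::size_of;

// Hypothetical conntrack key/value an eBPF TC program might keep on the
// stack while processing a packet; fields are illustrative only.
#[repr(C)]
struct ConnKey {
    client_ip: u32,
    client_port: u16,
    _pad: u16,
}

#[repr(C)]
struct ConnValue {
    backend_ip: u32,
    backend_port: u16,
    state: u8, // e.g. 0 = established, 1 = closing
    _pad: u8,
}

fn main() {
    // Locals like these (plus parsed headers, scratch buffers, etc.)
    // all come out of the same 512-byte eBPF stack budget.
    let stack_bytes = size_of::<ConnKey>() + size_of::<ConnValue>();
    println!("{stack_bytes} of 512 bytes used"); // 8 + 8 = 16
    assert!(stack_bytes <= 512);
}
```

One common mitigation when locals grow too large is to stage big values in a per-CPU array map instead of on the stack.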
/cc @astoycos for additional review.
Looks really good @aryan9600!!! Mostly nit's and some questions for yah which might result in some follow up issues
Thanks for all the hard work here!
dataplane/ebpf/src/egress/tcp.rs (outdated)

@@ -68,6 +71,39 @@ pub fn handle_tcp_egress(ctx: TcContext) -> Result<i32, i64> {
    unsafe { (*tcp_hdr).check = 0 };

    // TODO: connection tracking cleanup https://github.com/kubernetes-sigs/blixt/issues/85
So the TODO was to have proper conntrack mechanisms, i.e. connection tracking entries should time out at some point if there's an error in the flow, etc.
@@ -51,6 +54,40 @@ impl BackendService {
    backends_map.remove(&key)?;
    let mut gateway_indexes_map = self.gateway_indexes_map.lock().await;
    gateway_indexes_map.remove(&key)?;

    // Delete all entries in our tcp connection tracking map that this backend
Do we need to think about adding a "timeout" at some point so that stale entries are automatically pruned?
Are you talking about deleting stale TCP connections after a certain period of inactivity? This got me thinking about whether (and how) we should deal with TCP keepalive.
Correct. I'm not even sure we need that mechanism yet, just thinking about it. Let's defer it to a follow-up.
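For the deferred follow-up, one straightforward policy is to stamp each entry with a last-seen time and periodically prune anything idle past a timeout. A hypothetical userspace sketch (the `Conntrack` type and its fields are invented for illustration; the real map lives in the kernel):

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

// Illustrative userspace mirror of a conntrack map:
// (client ip, client port) -> last time the connection was seen.
struct Conntrack {
    entries: HashMap<(u32, u16), Instant>,
    timeout: Duration,
}

impl Conntrack {
    // Record activity for a connection, refreshing its timestamp.
    fn touch(&mut self, key: (u32, u16)) {
        self.entries.insert(key, Instant::now());
    }

    // Drop any connection not seen within `timeout`.
    fn prune(&mut self) {
        let now = Instant::now();
        let timeout = self.timeout;
        self.entries
            .retain(|_, last_seen| now.duration_since(*last_seen) < timeout);
    }
}

fn main() {
    let mut ct = Conntrack {
        entries: HashMap::new(),
        timeout: Duration::from_millis(10),
    };
    ct.touch((0x0a00_0001, 40000)); // goes stale below
    std::thread::sleep(Duration::from_millis(20));
    ct.touch((0x0a00_0002, 40001)); // still fresh
    ct.prune();
    assert_eq!(ct.entries.len(), 1); // only the fresh entry survives
}
```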
Add support for round-robin load balancing for `TCPRoute`. This enables multiple backend references in a TCPRoute object, with traffic being distributed to each backend in a round-robin fashion. A new BPF map `TCP_CONNECTIONS` was introduced to help keep track of active TCP connections, with its key storing the client <ip:port> identifier and the value storing the backend's <ip:port> along with the state of the connection.

Signed-off-by: Sanskar Jaiswal <[email protected]>
Looks great thanks @aryan9600!
[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: aryan9600, astoycos

The full list of commands accepted by this bot can be found here. The pull request process is described here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /lgtm
Add support for round-robin load balancing for `TCPRoute`. This enables multiple backend references in a TCPRoute object, with traffic being distributed to each backend in a round-robin fashion.

A new BPF map `TCP_CONNECTIONS` was introduced to help keep track of active TCP connections, with its key storing the client <ip:port> identifier and the value storing the backend's <ip:port> along with the state of the connection.

Fixes #119

Follow up items:
- Make `TCP_CONNECTIONS` more generic; get rid of `BLIXT_CONNTRACK` and refactor UDP ingress and ICMP egress to use the new generic map.
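The round-robin distribution described above can be sketched as a per-gateway cursor over the backend list (all names below are illustrative, not Blixt's actual API):

```rust
// Hypothetical sketch of per-gateway round-robin backend selection.
#[derive(Clone, Debug, PartialEq)]
struct Backend {
    ip: u32,
    port: u16,
}

struct Gateway {
    backends: Vec<Backend>,
    // Analogous in spirit to an entry in a gateway indexes map:
    // the next backend to hand out for this gateway.
    next_index: usize,
}

impl Gateway {
    // Pick the next backend and advance the cursor, wrapping around.
    fn pick(&mut self) -> &Backend {
        let idx = self.next_index % self.backends.len();
        self.next_index = (self.next_index + 1) % self.backends.len();
        &self.backends[idx]
    }
}

fn main() {
    let mut gw = Gateway {
        backends: vec![
            Backend { ip: 1, port: 80 },
            Backend { ip: 2, port: 80 },
        ],
        next_index: 0,
    };
    let picks: Vec<u32> = (0..4).map(|_| gw.pick().ip).collect();
    assert_eq!(picks, vec![1, 2, 1, 2]); // traffic alternates between backends
}
```

New connections take the next backend in turn, while the conntrack map pins established connections to the backend they were first assigned.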