Extra nft rules from netbird causes crash loop #1788
Comments
I'm going to guess it's a problem with the userspace tooling being on different versions. The version of iptables that we currently bake into the Alpine image is v1.8.9 (although I'm not sure whether that's what k3s uses or whether they use a different container base). If you are able to exec into the image and run

As a helpful troubleshooting step, can you please change your host's iptables userspace to the same version that k3s is running and then see if this problem still occurs?

Similar to what I said here: #1777 (comment), the upstream netfilter project hasn't been ensuring backwards compatibility in the userspace recently.
It looks like iptables-1.8.11 fixes the issue with checking iptables rules that was introduced in iptables-1.8.10 and that has kept us on a legacy iptables userspace in the kube-router container (1.8.9). My hope is that once Alpine releases a version that includes the iptables-1.8.11 userspace, we can use it, and that it will make kube-router more resilient to differences in host userspace tooling versions, since it should be more compatible with newer userspace versions.

While we wait for Alpine to release, I've built my own iptables 1.8.11 packages for Alpine and included them in a PR build here: #1790

If the machine that you're testing doesn't have access to sensitive data, you're willing to trust custom iptables binaries made by the project, and your cluster is using AMD64 architecture machines, you can give
I'm a bit lost with working out the iptables version in the k3s implementation of kube-router, so I think I'll bounce back to them to try and work that out.

Something I missed in my original report was that there is no userspace iptables command on the host system. So I'm guessing netbird is doing something weird as well, given that after a

Hopefully the k3s project can help me confirm the iptables version being used, and what might be needed to test with your updated image.

Thanks
From what I can see, k3s is using iptables 1.8.9 still. I've bumped the upstream to see what it would take to change that. If I install userspace iptables, it will also be 1.8.9. So I might just try installing the userspace iptables as well, as then maybe netbird and k3s will use that instead.
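For anyone comparing userspace versions between the host and the k3s-bundled binary, here is a minimal Python sketch. It assumes the usual `iptables --version` output shape (for example `iptables v1.8.9 (nf_tables)`); the function names are made up for illustration and are not part of kube-router or k3s.

```python
import re
import subprocess

# Matches e.g. "iptables v1.8.9 (nf_tables)" or "iptables v1.8.11 (legacy)".
# The parenthesised backend is optional because older builds omit it.
_VERSION_RE = re.compile(r"v(\d+)\.(\d+)\.(\d+)(?:\s+\(([^)]+)\))?")


def parse_iptables_version(output):
    """Return ((major, minor, patch), backend) from `iptables --version` output.

    backend is the text in parentheses, or None when it is absent.
    """
    match = _VERSION_RE.search(output)
    if match is None:
        raise ValueError(f"unrecognised version string: {output!r}")
    major, minor, patch, backend = match.groups()
    return (int(major), int(minor), int(patch)), backend


def iptables_version(binary="iptables"):
    """Run `<binary> --version` and parse the result (hypothetical helper)."""
    result = subprocess.run(
        [binary, "--version"], capture_output=True, text=True, check=True
    )
    return parse_iptables_version(result.stdout)


def same_userspace(output_a, output_b):
    """True when two version strings agree on both version and backend."""
    return parse_iptables_version(output_a) == parse_iptables_version(output_b)
```

Calling `iptables_version()` once with the host binary and once with the path of the binary bundled by k3s (the one shown in the error log) would show whether the two userspaces actually differ.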
So, it looks like netbird uses (https://github.com/netbirdio/netbird/blob/main/client/firewall/nftables/rule_linux.go) a Google Go implementation to manipulate nftables (https://github.com/google/nftables). Unlike iptables, nftables has a published user-space API, and you don't have to use the project's official user-space library / binary in order to manipulate them.

I can certainly understand why netbird would want to use the Google library, given that it's a pure Go implementation and it creates a consistent user experience, meaning that they don't have to be beholden to whatever library versions people are using on the host system. It makes the binary that they produce completely self-sufficient, kind of like a container without requiring people to run netbird in a container.

However, to me it is a pretty big red flag, and it means that kube-router and other tools may never be copacetic with netbird. If the upstream netfilter project has such a hard time ensuring backwards compatibility between different user spaces, I would have very little hope that all of the rules that the unofficial Google library writes will be problem free. I wouldn't even expect them to be fully compatible with all versions of the official

While this is just a guess, I think it is most likely that the problems you're experiencing have to do with netbird, via the Google library, writing nft rules that are not fully backward compatible with the official iptables legacy binary that is distributed by the netfilter project.

Best bet would be to file this issue with the netbird project and hope that someone on the project is either able to work around the library or get someone at Google to fix the incompatibility. Although on the site for the Google library they say they don't accept GitHub issues and that the library isn't officially supported by Google, so mileage may vary.
One last thing I'll throw out: if the incompatibility that you're experiencing right now between netbird and kube-router is related to kube-router using older versions of the iptables user-space, it is possible that the release I pointed you to may help with that. It is still possible that this specific bug arises because the Google nftables library is using a new feature that the legacy iptables user-space cannot decode. If this is the case, updating the user-space, like I did in the linked kube-router container, would fix this specific issue.
What happened?
Sometime after starting netbird, k3s goes into a crash loop due to being unable to verify existing firewall rules.
Failed to verify rule exists in FORWARD chain due to running [/var/lib/rancher/k3s/data/3b5b4bda92acada40dbf3251e7e040c82cd2a79760260b5cec3e42f7c1cd0a17/bin/aux/iptables -t filter -C FORWARD -m comment --comment kube-router netpol - TEMCG2JMHZYE7H7T -j KUBE-ROUTER-FORWARD --wait]: exit status 3: Error: cmp sreg undef
I think it may be related to #1777, but after looking closely at the backtrace, I'm not sure if it is.
Issue was originally logged in k3s-io/k3s#11493
What did you expect to happen?
Kube network controller should better handle the error instead of crashing out.
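To make that expectation concrete, here is a rough Python sketch of telling "rule is absent" apart from "the userspace could not decode the ruleset" when running an `iptables -C` check. This is not kube-router's code (kube-router is written in Go). The mapping of exit status 3 to a version or compatibility problem is my reading of iptables' exit codes and should be treated as an assumption, as should the treatment of status 1 as "no matching rule"; the function names are hypothetical.

```python
import subprocess

# Assumed meaning of iptables exit statuses for a `-C` (check) call:
#   0 -> a matching rule exists
#   1 -> no matching rule (iptables also uses 1 for other generic failures)
#   3 -> version/compatibility problem, as in "exit status 3: Error: cmp sreg undef"
_STATUS_BY_CODE = {0: "present", 1: "absent", 3: "incompatible"}


def classify_check(returncode):
    """Map an `iptables -C` exit status to a coarse outcome string."""
    return _STATUS_BY_CODE.get(returncode, "error")


def check_rule(rule_args, binary="iptables"):
    """Run `<binary> <rule_args...>` and return (outcome, stderr).

    rule_args is the full argument list, e.g.
    ["-t", "filter", "-C", "FORWARD", "-j", "KUBE-ROUTER-FORWARD", "--wait"].
    The caller decides what to do with "incompatible": for instance log the
    message and retry later rather than exiting the whole process.
    """
    result = subprocess.run([binary, *rule_args], capture_output=True, text=True)
    return classify_check(result.returncode), result.stderr.strip()
```

With this split, the failure in the log above would come back as `"incompatible"` rather than being handled the same way as a missing rule or a fatal error.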
How can we reproduce the behavior you experienced?
Start k3s node, start netbird (connected to a network), wait some amount of time until kubernetes starts to crash.
Screenshots / Architecture Diagrams / Network Topologies
System Information (please complete the following information)
Kube-Router Version (`kube-router --version`): v2.2.1
Kubernetes Version (`kubectl version`): v1.29.10
Logs, other output, metrics
Additional context
Extra rules added by netbird when kube is running first are in `chain FORWARD` and they are:

Netbird is changing nft rules; however, according to the output of `nft list table ip filter` after starting netbird (before k3s), it's being managed by `iptables-nft`, so I'm not sure where the incompatibility is coming from.
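One way to pin down exactly which rules appear after netbird starts is to capture the table as JSON before and after (`nft -j list table ip filter`) and compare rule handles. The Python sketch below assumes the libnftables JSON layout, a top-level `nftables` array whose `rule` objects carry `family`, `table`, `chain` and `handle`; the function names are made up for illustration.

```python
import json


def rules_in_chain(ruleset_json, family, table, chain):
    """Return the rule objects for one chain from `nft -j list ...` output."""
    document = json.loads(ruleset_json)
    rules = []
    for item in document.get("nftables", []):
        rule = item.get("rule")
        if rule is None:
            continue  # metainfo, table and chain entries are skipped
        if (rule.get("family"), rule.get("table"), rule.get("chain")) == (
            family,
            table,
            chain,
        ):
            rules.append(rule)
    return rules


def new_rules(before_json, after_json, family="ip", table="filter", chain="FORWARD"):
    """Return rules present in after_json whose handle is not in before_json."""
    known = {
        rule.get("handle")
        for rule in rules_in_chain(before_json, family, table, chain)
    }
    return [
        rule
        for rule in rules_in_chain(after_json, family, table, chain)
        if rule.get("handle") not in known
    ]
```

The `expr` field of each returned rule shows the raw expressions netbird wrote, which is the part the iptables userspace has to be able to decode.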