This repository was archived by the owner on Jun 20, 2024. It is now read-only.

Reload NPC iptables rules if they get cleared #3919

Open
nicolas-goudry opened this issue Oct 5, 2021 · 3 comments

Comments

@nicolas-goudry

Hello,

I’m using weave CNI in a K8S cluster, on Rocky Linux 8 (RHEL8 based) with firewalld enabled.

I noticed that on firewalld reload/restart, the iptables rules are flushed. This is the expected behavior, because iptables should not be used alongside firewalld and, as stated here, « they fight and firewalld will win ».

After some googling, I stumbled upon this issue, which describes exactly my problem. I dug deeper and found that the issue has been partially addressed by this PR, which was merged last year.

Therefore, I did some testing on one of my cluster nodes (the commands are sketched after the list below):

  • save iptables with iptables-save to be able to come back to a working state later on
  • tail the weave-net pod logs for the weave container
  • reload firewalld, either with firewall-cmd --reload or systemctl reload firewalld.service
  • observe the following logs in the weave container
INFO: 2021/10/05 09:38:17.357690 iptables canary mangle/WEAVE-CANARY deleted
INFO: 2021/10/05 09:38:17.361742 Reloading after iptables flush
INFO: 2021/10/05 09:38:17.361775 Re-configuring iptables
INFO: 2021/10/05 09:38:17.458063 Re-exposing 10.233.84.0/18 on bridge "weave"
  • check iptables to see that all “router” rules are re-applied (as expected with the changes introduced in the PR mentioned earlier), but the NPC rules have disappeared and never come back
  • restore iptables with iptables-restore
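
For reference, here is a rough sketch of the commands used for the test above (the pod name is a placeholder, the backup path is arbitrary):

$ iptables-save > /root/iptables.backup                  # save the current rules to restore later
$ kubectl -n kube-system logs -f [weave_pod] -c weave    # follow the weave container logs
$ firewall-cmd --reload                                  # or: systemctl reload firewalld.service
$ iptables-save | grep WEAVE-NPC                         # the NPC chains (WEAVE-NPC*) are gone and never come back
$ iptables-restore < /root/iptables.backup               # restore the saved rules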

Note that I don’t want to disable firewalld in my environment.

Is anyone currently working on this? Is this feature on the roadmap yet? I’ll take the liberty of pinging @bboreham, who reviewed the PR mentioned above, for some insights on this.

Thank you 🙂

@petermicuch

I think I might be hitting the same problem; I just hadn't realized it could be caused by iptables being flushed (and, to be honest, I am still not sure).

I am running a K8S 1.20.10 cluster with containerd as the CRI on RHEL 8.4 servers (one master, one node). Exactly the same setup was working just fine on RHEL 7, but ever since I switched to RHEL 8, I have been facing issues with weave networking.

I am even using IPTABLES_BACKEND set to nft, as suggested in #3465 (although it is not mentioned explicitly in that issue, the related PR seemed to add this env variable).
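
In case it helps anyone else, a minimal sketch of how that env variable can be set on the weave container of the weave-net DaemonSet (assuming the default kube-system namespace and container names):

$ kubectl -n kube-system set env daemonset/weave-net -c weave IPTABLES_BACKEND=nft
$ kubectl -n kube-system rollout status daemonset/weave-net   # wait for the pods to restart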

Whenever I enable firewalld, I am not able to communicate on the pod network correctly: pinging my nodes or kube-dns from the weave pods does not work, so DNS resolution does not work at all either. When I disable firewalld, things work fine again. According to this comment, Calico's recommendation is to disable firewalld. Is this also expected in the weave case, @bboreham?
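
For reference, these are the kind of checks I mean ([node_ip] and [kube_dns_pod_ip] are placeholders for values from your cluster, and the busybox image is just an arbitrary choice for a quick DNS test):

$ kubectl -n kube-system exec -it [weave_pod] -c weave -- ping -c 3 [node_ip]
$ kubectl -n kube-system exec -it [weave_pod] -c weave -- ping -c 3 [kube_dns_pod_ip]
$ kubectl run -it --rm dns-test --image=busybox:1.36 --restart=Never -- nslookup kubernetes.default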

@nicolas-goudry if you think this is totally unrelated, I will open a new issue for it.

@bboreham
Contributor

bboreham commented Nov 5, 2021

I don't work on Weave Net any more.

@nicolas-goudry
Author

@petermicuch I think this is clearly related.

My weave pods also use the nft backend, as confirmed by running:

$ kubectl exec -it [weave_pod] -- iptables -V
iptables v1.8.3 (nf_tables)

If you wish to enable firewalld, you’ll have to open:

  • DNS port
  • Weave communication ports
  • Kubernetes system components ports

You’ll also have to allow the docker network to pass through the firewall, as well as allow communication from the pods CIDR to the services CIDR (a rough firewall-cmd sketch follows).
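
To give a rough idea, something along these lines with firewall-cmd — the zones and the exact list of Kubernetes ports depend on your setup, and [pods_cidr] / [services_cidr] are placeholders for your cluster's CIDRs:

$ firewall-cmd --permanent --add-port=6783/tcp --add-port=6783-6784/udp   # weave control/data plane
$ firewall-cmd --permanent --add-port=53/tcp --add-port=53/udp            # DNS
$ firewall-cmd --permanent --add-port=6443/tcp --add-port=10250/tcp       # API server, kubelet, ...
$ firewall-cmd --permanent --zone=trusted --add-interface=docker0         # let the docker network through
$ firewall-cmd --permanent --zone=trusted --add-source=[pods_cidr]        # pods CIDR
$ firewall-cmd --permanent --zone=trusted --add-source=[services_cidr]    # services CIDR
$ firewall-cmd --reload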

Upon every firewall reload/restart/stop/start, you’ll have to either:

  • save iptables before reloading, then restore them afterwards
  • kill all weave pods so that they reapply their rules to the host’s iptables when they restart (see the sketch after this list)
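
A sketch of both options (the backup path is arbitrary, and the pod selector assumes the default name=weave-net label of the weave-net DaemonSet):

# option 1: snapshot the rules, reload firewalld, put the rules back
$ iptables-save > /tmp/iptables.backup
$ firewall-cmd --reload
$ iptables-restore < /tmp/iptables.backup

# option 2: delete the weave pods so they re-create their rules on restart
$ kubectl -n kube-system delete pods -l name=weave-net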
