fix(network_policy): consolidate policy chains #906

Closed · wants to merge 1 commit from the consolidate_network_policy_chains branch

Conversation

@aauren (Collaborator) commented May 22, 2020

@murali-reddy

Instead of attaching each individual pod FW chain to each primary chain in the filter table (INPUT, FORWARD, and OUTPUT), consolidate all of the rules into a chain called KUBE-ROUTER-FIREWALL.

This reduces the number of rules we have to track by about 4 per pod per network policy, and it allows us to better control where network policy rules are placed relative to service rules and other rules in the primary filter chains.

Fixes #905 by ensuring that KUBE-ROUTER-SERVICES is always evaluated first in the INPUT chain, before the new global firewall chain KUBE-ROUTER-FIREWALL. This way, access to services that are local to the node passes through INPUT (while the traffic is still src -> pod IP addr : dst -> service VIP), gets properly disambiguated via IPVS, and then gets filtered correctly when it passes through OUTPUT (by which point it is src -> pod IP addr : dst -> pod IP addr of the service endpoint).
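For illustration, here is a minimal sketch of the consolidation described above, assuming the coreos go-iptables library (which kube-router uses for iptables management). The pod chain name, pod IP, and exact rule specs are placeholders for this example, not the actual rules from the PR:

```go
package main

import (
	"log"

	"github.com/coreos/go-iptables/iptables"
)

func main() {
	ipt, err := iptables.New()
	if err != nil {
		log.Fatal(err)
	}

	// Ensure the consolidated chain exists in the filter table
	// (ClearChain creates it if missing, and flushes it otherwise).
	if err := ipt.ClearChain("filter", "KUBE-ROUTER-FIREWALL"); err != nil {
		log.Fatal(err)
	}

	// One hook point per built-in chain instead of one jump per pod chain.
	for _, builtin := range []string{"INPUT", "FORWARD", "OUTPUT"} {
		if err := ipt.AppendUnique("filter", builtin, "-j", "KUBE-ROUTER-FIREWALL"); err != nil {
			log.Fatal(err)
		}
	}

	// Per-pod firewall chains now hang off the consolidated chain.
	// The chain name and pod IP below are hypothetical placeholders.
	podChain := "KUBE-POD-FW-EXAMPLE"
	if err := ipt.ClearChain("filter", podChain); err != nil {
		log.Fatal(err)
	}
	if err := ipt.AppendUnique("filter", "KUBE-ROUTER-FIREWALL",
		"-d", "10.244.1.5/32", "-j", podChain); err != nil {
		log.Fatal(err)
	}
}
```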

@murali-reddy (Member)
@aauren unfortunately this PR breaks the usability of kube-router's network policy implementation as something that can be used independently with any service proxy and CNI:

  • many users use kube-router as the CNI and network policy implementation along with kube-proxy
  • relying on the node's pod CIDR is specific to kube-router acting as the CNI, so matching by pod IP is needed for a generic implementation

@aauren (Collaborator, Author) commented May 25, 2020

@murali-reddy I understand what you're saying. I now see why you don't rely on podCIDR much in the current code base.

However, this PR could easily be adapted to instead use an ipset that contains all of the pod IP addresses that are local to the node, rather than relying on the podCIDR. This would allow different CNIs to handle their IP address provisioning differently while still implementing single hook points into the primary iptables filter chains. Relying on these single hooks in the primary chains will make our rules easier to maintain, less invasive to the host, and also reduce the number of rules that we have to add to the host, which improves sync performance.
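A rough sketch of what maintaining such a local-pod ipset could look like, shelling out to the ipset binary. The set name kube-router-local-pods and the helper function are hypothetical, and a real implementation would also need to remove stale entries:

```go
package main

import (
	"fmt"
	"log"
	"os/exec"
)

// syncLocalPodSet is a hypothetical helper: it ensures the set exists and adds
// the given node-local pod IPs. The "-exist" flag makes both calls idempotent.
func syncLocalPodSet(podIPs []string) error {
	if out, err := exec.Command("ipset", "create", "kube-router-local-pods", "hash:ip", "-exist").CombinedOutput(); err != nil {
		return fmt.Errorf("ipset create failed: %v: %s", err, out)
	}
	for _, ip := range podIPs {
		if out, err := exec.Command("ipset", "add", "kube-router-local-pods", ip, "-exist").CombinedOutput(); err != nil {
			return fmt.Errorf("ipset add of %s failed: %v: %s", ip, err, out)
		}
	}
	return nil
}

func main() {
	// Example IPs only; in practice these would come from the node's local pod list.
	if err := syncLocalPodSet([]string{"10.244.1.5", "10.244.1.9"}); err != nil {
		log.Fatal(err)
	}
}
```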

In terms of being able to prioritize service rules for other service proxies (like kube-proxy) on the INPUT chain I would be interested to hear your thoughts on possible solutions. I would argue that there may not be a clean way to make this work for all of the potential proxy solutions that someone could use.

Unless we keep a list of all of the potential chains from the main service proxy providers and look for chains by name (like I did with KUBE-ROUTER-SERVICES), we're left with either insert or append semantics, both of which will likely break other proxy and firewall offerings that are concurrently manipulating iptables. In this case, append will break the default firewall engine that comes with Fedora, CentOS, and Red Hat, which at least some of our users use, and insert will likely break most other service proxies.
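As a sketch of the "look for chains by name" idea: the firewall jump could be inserted immediately after an existing service chain jump when one is found in INPUT, and appended otherwise. This assumes the coreos go-iptables library and that the proxy's jump rule mentions KUBE-ROUTER-SERVICES; a real implementation would also need to avoid inserting duplicate rules on each sync:

```go
package main

import (
	"log"
	"strings"

	"github.com/coreos/go-iptables/iptables"
)

func main() {
	ipt, err := iptables.New()
	if err != nil {
		log.Fatal(err)
	}

	// "iptables -S"-style listing: index 0 is the chain policy, so entry i
	// (for i >= 1) corresponds to rule number i in the chain.
	rules, err := ipt.List("filter", "INPUT")
	if err != nil {
		log.Fatal(err)
	}

	pos := 0
	for i, rule := range rules {
		if strings.Contains(rule, "KUBE-ROUTER-SERVICES") {
			pos = i + 1 // place our jump right after the service proxy's jump
			break
		}
	}

	if pos > 0 {
		err = ipt.Insert("filter", "INPUT", pos, "-j", "KUBE-ROUTER-FIREWALL")
	} else {
		// No known proxy chain found: fall back to append semantics.
		err = ipt.AppendUnique("filter", "INPUT", "-j", "KUBE-ROUTER-FIREWALL")
	}
	if err != nil {
		log.Fatal(err)
	}
}
```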

In any case, I think that at the very least making sure that we don't break our own use case (like this PR does) seems prudent. While it is a huge strength that kube-router is so modular and is able to work with many other proxy, firewall, and routing solutions, this may be a case where trying to be all things to all people may not be possible for us.

@aauren closed this May 27, 2020
@murali-reddy (Member) commented May 28, 2020

Thanks @aauren for agreeing to close this PR in favour of #909, #914, and #915. Clarifying the questions you raised:

> However, this PR could easily be adapted to instead use an ipset that contains all of the pod IP addresses that are local to the node, rather than relying on the podCIDR. This would allow different CNIs to handle their IP address provisioning differently while still implementing single hook points into the primary iptables filter chains.

Sure, we could optimize by maintaining an ipset. Right now, as part of #909, I added a rule in each built-in chain, e.g. in the INPUT chain, to jump traffic to KUBE-ROUTER-INPUT. That rule currently applies to all traffic. Modifying it to jump only when the source or destination matches a set of local pod IPs would be the optimization.
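A minimal sketch of that optimization, assuming the coreos go-iptables library; the set name kube-router-local-pods is hypothetical, while KUBE-ROUTER-INPUT is the chain referred to above:

```go
package main

import (
	"log"

	"github.com/coreos/go-iptables/iptables"
)

func main() {
	ipt, err := iptables.New()
	if err != nil {
		log.Fatal(err)
	}

	// Two rules, because a hash:ip set is matched against one direction at a
	// time: one for packets sourced from a local pod, one for packets headed
	// to a local pod. All other traffic skips the kube-router chain entirely.
	for _, dir := range []string{"src", "dst"} {
		err := ipt.AppendUnique("filter", "INPUT",
			"-m", "set", "--match-set", "kube-router-local-pods", dir,
			"-j", "KUBE-ROUTER-INPUT")
		if err != nil {
			log.Fatal(err)
		}
	}
}
```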

> Relying on these single hooks in the primary chains will make our rules easier to maintain, less invasive to the host, and also reduce the number of rules that we have to add to the host, which improves sync performance.

#909 makes it less invasive now

> In terms of being able to prioritize service rules for other service proxies (like kube-proxy) on the INPUT chain I would be interested to hear your thoughts on possible solutions. I would argue that there may not be a clean way to make this work for all of the potential proxy solutions that someone could use.

The best way is not to make any assumptions/expectations about the proxy, to keep things modular and loosely coupled. #914 does this in a proxy-agnostic manner.

@aauren deleted the consolidate_network_policy_chains branch September 4, 2023 15:20
Successfully merging this pull request may close these issues.

Egress network policies should not be applied to pod to service VIP traffic