
ExternalIPs and LoadBalancer IPs No Longer Work with NetworkPolicy without Manual Specification #934

Closed
aauren opened this issue Jun 16, 2020 · 9 comments
@aauren
Collaborator

aauren commented Jun 16, 2020

Since PR #914 was merged, only ClusterIPs and NodePorts are specifically bypassed in the INPUT chain for in-cluster traffic. Essentially, for ClusterIPs and NodePorts, policy enforcement is bypassed in INPUT and then enforced in OUTPUT, after IPVS has disambiguated the VIPs into pod IP addresses. This allows users to reach those services from within the cluster using the standard pod and namespace selectors.

However, since ExternalIPs and LoadBalancer IPs were not covered by #914, traffic to them is still processed in INPUT while the VIP is still the destination, and it will be denied unless it is manually allowed by an ipBlock statement AND a namespace/pod selector (the latter because the same traffic is enforced again when it traverses the OUTPUT chain).
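For illustration only, here is a minimal sketch of the kind of manual policy a user currently has to write, expressed with the client-go API types since the thread's code is Go; the name, namespace, labels, and CIDR are hypothetical, not taken from the issue. The ipBlock peer satisfies the INPUT pass, where the external VIP is still the destination, and the namespace selector satisfies the OUTPUT pass, after IPVS has rewritten the VIP to a pod IP.

package example

import (
	networkingv1 "k8s.io/api/networking/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// manualExternalIPAllow sketches the NetworkPolicy a user must currently create by hand
// so that pods labelled "app: backend" accept traffic that arrives via an external VIP.
func manualExternalIPAllow() networkingv1.NetworkPolicy {
	return networkingv1.NetworkPolicy{
		ObjectMeta: metav1.ObjectMeta{Name: "allow-via-external-ip", Namespace: "demo"},
		Spec: networkingv1.NetworkPolicySpec{
			PodSelector: metav1.LabelSelector{MatchLabels: map[string]string{"app": "backend"}},
			Ingress: []networkingv1.NetworkPolicyIngressRule{{
				From: []networkingv1.NetworkPolicyPeer{
					// Matches while the packet is still addressed to the external VIP (INPUT chain).
					{IPBlock: &networkingv1.IPBlock{CIDR: "203.0.113.10/32"}},
					// Matches after IPVS has translated the VIP to a pod IP (OUTPUT chain).
					{NamespaceSelector: &metav1.LabelSelector{MatchLabels: map[string]string{"team": "frontend"}}},
				},
			}},
		},
	}
}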

While I think the intention of the Kubernetes network SIG is that all in-cluster traffic be routed through ClusterIPs (which is why the ClusterIP is what built-in Kubernetes DNS returns when you look up a service record), we have many users that look up records from Consul instead. We want the services registered in Consul to be routable both from within the cluster and from outside it, so we only add NodePorts and ExternalIPs to the Consul service definitions from Kubernetes. As a result, all of our containerized applications that use Consul try to reach in-cluster services via ExternalIPs or NodePorts.

@murali-reddy
Member

murali-reddy commented Jun 16, 2020

@aauren Seems like a reasonable use case. If I understood correctly, Consul acts as service discovery for both in-cluster and out-of-cluster services, and you would want services with an external IP to be used by both in a consistent manner?

@aauren
Collaborator Author

aauren commented Jun 16, 2020

@murali-reddy Yes, that is correct.

@zerkms
Contributor

zerkms commented Jun 17, 2020

Is it connected to #938?

@aauren aauren self-assigned this Jul 10, 2020
@aauren
Collaborator Author

aauren commented Jul 30, 2020

@murali-reddy I think we're ready to move forward with a contribution to hopefully fix this. While we could fix it with a range, as we did for ClusterIPs with a subnet mask or NodePorts with a port range, I think it's likely we'd miss use cases if we tried to express this one as a range. So I would propose that we add an additional ipset to track ExternalIPs and then handle them the way we handle ClusterIPs and NodePorts in the INPUT chain, e.g. alongside the existing rule:

whitelistServiceVips := []string{"-m", "comment", "--comment", "allow traffic to cluster IP", "-d", npc.serviceClusterIPRange, "-j", "RETURN"}

(A sketch of the corresponding external-IP rule follows below.)
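For illustration only, a hedged sketch of what the corresponding INPUT bypass rule for an external-IP ipset might look like, mirroring the rule quoted above; the ipset name "kube-router-external-ips" and the comment text are assumptions, not code from any existing PR:

whitelistExternalIPs := []string{
	"-m", "comment", "--comment", "allow traffic to external IPs",
	"-m", "set", "--match-set", "kube-router-external-ips", "dst",
	"-j", "RETURN"}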

The only downside is that it introduces an additional point of churn for the NPC. Right now the NPC only watches pod, namespace, and networkpolicy events. To do this in the NPC we would have to add a service informer as well (so we're notified whenever someone changes a service's ExternalIPs and can add them to or remove them from our ipset). However, I don't think we'd have to run the entire NPC sync when this happens; instead that handler could update only the ExternalIP ipset that we're maintaining (see the sketch below).
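A minimal sketch of that handler idea, assuming it lives inside kube-router's existing netpol package; registerServiceHandler and syncExternalIPSet are hypothetical names used only to illustrate reacting to ExternalIP changes without a full policy sync:

import (
	"reflect"

	v1 "k8s.io/api/core/v1"
	"k8s.io/client-go/tools/cache"
)

func (npc *NetworkPolicyController) registerServiceHandler(svcInformer cache.SharedIndexInformer) {
	svcInformer.AddEventHandler(cache.ResourceEventHandlerFuncs{
		UpdateFunc: func(oldObj, newObj interface{}) {
			oldSvc, ok1 := oldObj.(*v1.Service)
			newSvc, ok2 := newObj.(*v1.Service)
			if !ok1 || !ok2 {
				return
			}
			// Refresh only the external-IP ipset; a full network policy sync is not needed.
			if !reflect.DeepEqual(oldSvc.Spec.ExternalIPs, newSvc.Spec.ExternalIPs) {
				npc.syncExternalIPSet() // hypothetical helper that rebuilds the ipset
			}
		},
	})
}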

Alternatively, we could have the NSC manage this new ipset by adding another one here:

serviceIPsIPSetName = "kube-router-service-ips"
since it already does informer work on service updates. However, I'm guessing you won't be a fan of that approach, as it would mean one controller relying on a different controller to be running, which would be crossing the streams, so to speak.

Let me know your thoughts on this and I'll begin drafting a PR.

@murali-reddy
Member

Unfortunately I did not foresee that there would be cases where pods access services by external IP within the cluster. Given we already have --service-cluster-ip-range, I am a bit biased toward handling all of the service VIPs in a consistent manner.

While we could fix it with a range, as we did for ClusterIPs with a subnet mask or NodePorts with a port range, I think it's likely we'd miss use cases if we tried to express this one as a range.

While external IPs are not managed by Kubernetes and we cannot make general assumptions about the nature of the external IP pool, I feel it's reasonable to accept multiple ranges of external IPs.

Alternatively, we can spare users this configuration by bookkeeping the set of external IPs ourselves. We can do either of the above approaches, but there is going to be some overhead.

The only downside is that it introduces an additional point of churn for the NPC. Right now the NPC only watches pod, namespace, and networkpolicy events. To do this in the NPC we would have to add a service informer as well (so we're notified whenever someone changes a service's ExternalIPs and can add them to or remove them from our ipset). However, I don't think we'd have to run the entire NPC sync when this happens; instead that handler could update only the ExternalIP ipset that we're maintaining.

This sounds reasonable. If there is any objection to accepting the external IP ranges as configuration, then I would prefer this.

However, I'm guessing you won't be a fan of that approach, as it would mean one controller relying on a different controller to be running, which would be crossing the streams, so to speak.

Yes, I would prefer to keep the individual functionalities properly decoupled. I understand that for users like you a cohesive solution is better, but it would be better to keep the decoupling intact (for users who don't run kube-router's service proxy, and to preserve the possibility of running the individual functionalities as separate containers in the kube-router pod).

@aauren
Collaborator Author

aauren commented Aug 3, 2020

I mean ranges won't be a huge problem for us, but I do think it would be difficult to make ranges work for users who potentially have a bunch of /32s or /28s or some such. Also, every time a range was added they would have to roll their entire daemonset, or update all of their hosts if they run kube-router as a host daemon.

To me, this feels like a pretty big usability concession. Maybe we make the range flag optional, so that people can either add ranges themselves or have kube-router maintain the list of external IPs for them? This would essentially let users choose between manual configuration with better performance, or automatic configuration with some performance trade-off.

What would you think about an option similar to the following?

--external-ip-ranges string (off|auto|subnets) Specify one or multiple comma-delimited external IP CIDRs that are used for in-cluster communication; auto will have kube-router calculate these, but impacts performance; off will disable in-cluster ExternalIP use (i.e. your cluster uses ClusterIPs instead) (default: off)

@murali-reddy
Member

I mean ranges won't be a huge problem for us, but I do think it would be difficult to make ranges work for users who potentially have a bunch of /32s or /28s or some such. Also, every time a range was added they would have to roll their entire daemonset, or update all of their hosts if they run kube-router as a host daemon.

I feel this may not be that bad, at least judging from the non-cloud implementations of the LoadBalancer service type that manage an external IP pool, like OpenShift (ExternalIPNetworkCIDRs) and MetalLB (address-pools).

Since no one else has run into this problem, at least at the moment, how about just imposing the restriction that one has to configure --external-ip-ranges to let pods in the cluster access the services' external IPs? If there is a concern from any users, we can make the flag optional, and if it is not configured kube-router can implicitly figure out the ranges as you suggest.

@aauren
Collaborator Author

aauren commented Aug 5, 2020

That's fine. We can start with ranges, and if someone else has a strong use case for the auto functionality proposed above, we can always add it later. The two approaches would stack nicely if we ever had to add the other.

So for posterity, what we've decided to do is add the following:

--external-ip-ranges string (off|subnets) Specify one or multiple comma-delimited external IP CIDRs that are used for in-cluster communication; off will disable in-cluster ExternalIP use (i.e. your cluster uses ClusterIPs instead) (default: off)

Off is essentially the same functionality that we provide with the current 1.0 release.
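For illustration, wiring the proposed flag up might look roughly like the sketch below; only the flag name and semantics come from the proposal above, while the pflag FlagSet, the config field name, and the example CIDRs are assumptions:

// hypothetical wiring in the options setup, using a *pflag.FlagSet named fs
fs.StringVar(&config.ExternalIPCIDRs, "external-ip-ranges", "off",
	"Comma-delimited list of external IP CIDRs allowed for in-cluster communication, or \"off\" to disable (default)")

A user would then pass something like --external-ip-ranges=203.0.113.0/28,198.51.100.0/27 (hypothetical addresses) in the daemonset args or the host daemon's command line.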

@rdegez

rdegez commented Aug 7, 2020

A bit unrelated to $title but the following statement ...

I feel this may not be that bad, at least judging from the non-cloud implementations of the LoadBalancer service type that manage an external IP pool, like OpenShift (ExternalIPNetworkCIDRs) and MetalLB (address-pools).

... made me think of a project we recently discovered, which is yet another way to handle LoadBalancer service IP assignment without any k8s cloud-controller-manager: https://github.com/Nordix/assign-lb-ip

aauren added a commit to aauren/kube-router that referenced this issue Aug 13, 2020
aauren added a commit to aauren/kube-router that referenced this issue Aug 13, 2020
aauren added a commit to aauren/kube-router that referenced this issue Aug 14, 2020