Point gce ingress health checks at the node for onlylocal services #17
From @MrHohn on February 28, 2017 1:28

Sorry, it's a bit late to ask, but I'm lost on some of the arguments here:

I'm not sure what "if all endpoints evacuate a node" means. If it means the node is broken and all endpoints get rescheduled to other nodes, why would it take 10*10 seconds to mark this node unhealthy? I just checked the default setup for the ingress health check: the unhealthy threshold is set to 10 consecutive failures and the interval is set to 1 second. So I assume it would take 10*1 seconds for this node to be marked unhealthy for that specific backend group (also assuming health checks are sent to every instance in this group)? Adding @nicksardo, who might also have the answer.
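For anyone who wants to verify what the controller actually configured (the exact interval and threshold come up again later in this thread), here is a rough inspection sketch. The health check resource name is a placeholder, and depending on the controller version the check may be a legacy HTTP health check or the newer health-check type, so look in whichever group is non-empty:

```sh
# List the health checks the ingress controller created on GCE.
gcloud compute http-health-checks list
gcloud compute health-checks list

# Placeholder name: substitute the real one from the list output above.
HC_NAME=k8s-be-30301--example-uid
gcloud compute http-health-checks describe "$HC_NAME" \
  --format='value(checkIntervalSec, unhealthyThreshold, timeoutSec, requestPath)'
```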
From @thockin on March 1, 2017 8:29
If, for some reason, all the endpoints for a Service get moved to a […] the way the healthchecks function today […]. I think the point is that this is […].

The way I think it should work is:

If OnlyLocal { […] }

For OnlyLocal Services, this will result in a node dropping out when there are no local endpoints on it. For Regular Services, this will result in a node dropping out if kube-proxy […].

What we do today - actually HC'ing the Service - isn't right. It will pick […]. The proposed above is not robust to, e.g. iptables failures. Maybe we can […].
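One rough reading of the (partially preserved) proposal above, sketched as manual probes. The node IP, the healthCheckNodePort value, the request path, and kube-proxy's healthz port and reachability are all assumptions here:

```sh
NODE_IP=10.128.0.5   # placeholder: a node's IP
HC_PORT=31530        # placeholder: the service's healthCheckNodePort

# OnlyLocal Service: the LB health check would target the per-service
# healthCheckNodePort served by kube-proxy, so the node drops out as soon as
# it has no local endpoints (expect roughly: 200 with endpoints, 503 without).
curl -i "http://${NODE_IP}:${HC_PORT}/healthz"

# Regular Service: the LB health check would instead target kube-proxy's own
# healthz endpoint (default port 10256; bind address and off-node reachability
# vary by version/config), so the node drops out only if kube-proxy is unhealthy.
curl -i "http://${NODE_IP}:10256/healthz"
```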
From @MrHohn on March 1, 2017 18:27
Thanks for clarifying. Yeah, I think this is really the point. It seems like the current ingress health check setup is totally broken: I don't think bad endpoints can be easily detected when good endpoints also exist.

This seems feasible. We may also want to take kubernetes/kubernetes#14661 into account. In the future, if we also have node-level health checks from LBs for non-OnlyLocal services, we could extend the above health check mechanism to non-OnlyLocal LoadBalancer services as well.

We may also need to think about NodePort services. ESIPP does not assign a healthCheckNodePort to OnlyLocal NodePort services for now, which means the above health check mechanism cannot be used on them. To extend this to NodePort services, we would need to assign a healthCheckNodePort to NodePort services as well. But then we may start worrying about allocating too many healthCheckNodePorts, considering potential port exhaustion and the overhead in kube-proxy of holding too many health check servers.

More to think about: if we are extending this mechanism to non-OnlyLocal LoadBalancer services, what about non-OnlyLocal NodePort services? Feels like we need a concrete proposal :)
From @MrHohn on March 1, 2017 18:34

Oh, I made a mistake above. For OnlyLocal NodePort services the original health check already works: traffic sent to nodes that do not have endpoints on them will be dropped.
From @sanderploegsma on June 22, 2017 13:55
This does not seem to be correct: the current source code explicitly mentions an interval of 60 seconds. Even better: creating an ingress connected to a service that has a readiness probe will combine these intervals, so if the readiness probe is set up to check at an interval of 10 seconds, the resulting health check will have an interval of 70 seconds. It will then take over 10 minutes before a node stops receiving traffic that it will drop because of the annotation.

We can of course change these parameters by hand (see the sketch below), but it looks like the documentation fails to mention this behaviour. We found out the hard way in our production environment, with almost 15 minutes of seemingly random outage after deploying a new version of one of our applications...
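A sketch of the manual tuning mentioned above, assuming a controller-created legacy HTTP health check whose name and values are placeholders; note the caveats in the replies that follow about the extra health check load and about the controller dropping manual changes:

```sh
# Placeholder name: substitute the health check the controller actually created.
HC_NAME=k8s-be-30301--example-uid

# Illustrative values only; the controller may overwrite these settings.
gcloud compute http-health-checks update "$HC_NAME" \
  --check-interval=10s \
  --unhealthy-threshold=3
```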
From @nicksardo on June 22, 2017 17:14

I wouldn't recommend lowering the interval, @sanderploegsma. Since health checks hit the service's nodePort, if you run a few pods on a cluster with many nodes, those pods will receive a significant amount of health check traffic. As thockin said earlier, […]

Therefore, you shouldn't rely on LB health checks for monitoring pod or node health at this time.
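A back-of-envelope illustration of the load concern above; the node count, interval, and pod count are made-up placeholders:

```sh
# Each node is probed once per interval, and every probe is proxied through
# the nodePort to one of the few pods, so the aggregate rate seen by the pods
# is roughly NODES / INTERVAL_SEC requests per second.
NODES=500; INTERVAL_SEC=60; PODS=3
echo "$(( NODES / INTERVAL_SEC )) requests/sec total, spread across ${PODS} pods (approx.)"
```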
From @sanderploegsma on June 22, 2017 18:15

Right, that makes sense. So there currently is no way for us to combine the […]
From @nicksardo on June 22, 2017 18:38

I don't know of anyone else who has tried using that annotation with ingress. Be aware that it's not a documented/supported annotation for services used by GCE ingress. Also, I don't recommend manually modifying health checks: GLBC 0.9.3 and 0.9.4 will drop these settings when migrating to a newer health check type. This was made seamless in 0.9.5.

Better solutions are being worked on.
From @bprashanth on November 18, 2016 22:35
We now have a new beta annotation on Services, external-traffic (http://kubernetes.io/docs/user-guide/load-balancer/#loss-of-client-source-ip-for-external-traffic). With this annotation set to onlyLocal, NodePort Services only proxy to local endpoints. If there are no local endpoints, iptables is configured to drop packets. Currently, sticking an onlyLocal Service behind an Ingress works, but does so in a suboptimal way.

The issue is that, currently, the best way to configure LB health checks is to set a high failure threshold so we detect nodes with bad networking but don't flake on bad endpoints. With this approach, if all endpoints evacuate a node, it'll take e.g. 10 health checks * 10 seconds per health check = 100 seconds to mark that node unhealthy, but the node will start DROPing packets for the NodePort immediately. If we pointed the LB health check at the healthcheck-nodeport (a nodePort that's managed by kube-proxy), it would fail in < 10s even with the high thresholds described above.

@thockin
Copied from original issue: kubernetes/ingress-nginx#19
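For reference, a minimal sketch of opting a Service into the behaviour described above using the beta annotation from that era; the service name is a placeholder, and the exact annotation keys and values may differ by Kubernetes version:

```sh
# Hypothetical example: mark an existing LoadBalancer Service as OnlyLocal via
# the beta annotation discussed in this issue ("my-svc" is a placeholder).
kubectl annotate service my-svc service.beta.kubernetes.io/external-traffic=OnlyLocal

# For LoadBalancer Services this should cause a health-check node port to be
# allocated and surfaced as another beta annotation; inspect it with:
kubectl get service my-svc -o yaml | grep -i healthcheck
```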