Add shell script for graceful shutdown #28
Conversation
If a long-lived connection such as a WebSocket exists in the current HAProxy process, how is it handled? Wait 30s and then perform a hard shutdown? That doesn't sound like a graceful shutdown.
/test e2e-aws
    if [[ "$graceful_shutdown" = 'true' ]]; then
      echo 'Performing a graceful shutdown:'
      echo " - Sleeping $DRAIN_PERIOD seconds to let connections drain..."
      sleep "$DRAIN_PERIOD"
Granted the reload script will be no-op'ing, but if we haven't issued USR1 to haproxy, it's not clear to me how connection draining is implemented with this sleep.
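For context, the commit message further down pairs this sleep with a subsequent USR1. A minimal sketch of that sequence, for illustration only (not the exact script from this PR; the default drain period is an assumption):

```sh
# Sketch of the drain-then-soft-stop sequence described in the commit message
# (illustrative; not the exact shutdown-haproxy script).
DRAIN_PERIOD="${DRAIN_PERIOD:-45}"   # assumed default

# Give external load balancers and other proxies time to stop sending new traffic.
echo " - Sleeping $DRAIN_PERIOD seconds to let connections drain..."
sleep "$DRAIN_PERIOD"

# SIGUSR1 asks HAProxy to stop accepting new connections and to exit once the
# existing connections complete (HAProxy's soft-stop behaviour).
pkill -USR1 -x haproxy || true
```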
    # GRACE_PERIOD is the number of seconds that this script will wait for HAProxy
    # processes to terminate after it sends the SIGUSR1 signal, before it sends
    # SIGTERM to any remaining HAProxy processes.
    GRACE_PERIOD=30
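A minimal sketch of how a grace period like this might be enforced before escalating to SIGTERM (illustrative; the actual shutdown-haproxy script may be structured differently):

```sh
GRACE_PERIOD="${GRACE_PERIOD:-30}"

# Wait up to GRACE_PERIOD seconds for soft-stopped HAProxy processes to exit
# on their own.
for (( i = 0; i < GRACE_PERIOD; i++ )); do
    pgrep -x haproxy > /dev/null || exit 0   # all processes gone; nothing left to do
    sleep 1
done

# Escalate: terminate any HAProxy processes that are still running.
echo "HAProxy still running after ${GRACE_PERIOD}s; sending SIGTERM."
pkill -TERM -x haproxy || true
```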
Should GRACE_PERIOD and DRAIN_PERIOD be configurable?
They easily could be, but I wasn't planning to add any API for these parameters to the ingress controller resource, so I don't know that it would be much use to make them configurable here.
The ingress controller is necessarily aware of how to configure the haproxy-router, I'm not sure that implies we have to expose this as public API. I was just suggesting that if we ever need to tweak these values, we might want to do it through an operator release rather than an haproxy-router release. Especially if we end up tweaking these values dynamically in the operator...
The stop hook script might even be possible to define in the operator itself rather than here, but that could be a little too coupled?
> I was just suggesting that if we ever need to tweak these values, we might want to do it through an operator release rather than an haproxy-router release. Especially if we end up tweaking these values dynamically in the operator...
Hm, that's a good point—the operator might know that one cloud provider needs a shorter or longer drain period than the typical cloud provider, in which case the operator could set a platform-appropriate value.
> The stop hook script might even be possible to define in the operator itself rather than here, but that could be a little too coupled?
You mean with a configmap volume? We could do that, but (1) the reload and shutdown scripts are loosely coupled by the /var/lib/haproxy/run/terminating marker, and (2) it would be another piece that could break, so it adds real operational complexity to reduce hypothetical build complexity.
Latest push makes TERMINATION_MARKER, GRACE_PERIOD, and DRAIN_PERIOD configurable.
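For reference, such environment-variable overrides typically look like the following sketch. The marker path comes from the discussion above and GRACE_PERIOD=30 from the snippet above; the DRAIN_PERIOD default is an assumption:

```sh
# Allow the deployment (e.g. the ingress operator) to override these defaults.
TERMINATION_MARKER="${TERMINATION_MARKER:-/var/lib/haproxy/run/terminating}"
DRAIN_PERIOD="${DRAIN_PERIOD:-45}"   # assumed default
GRACE_PERIOD="${GRACE_PERIOD:-30}"
```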
Force-pushed from 507b9c1 to 1c15371.
[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Miciah

The full list of commands accepted by this bot can be found here. The pull request process is described here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing
Force-pushed from 1c15371 to b83f97f.
Add a shutdown-haproxy shell script, which can be used as the pre-stop handler for a router deployment in order to provide graceful shutdown. Whereas a hard shutdown simply sends a SIGTERM signal to all haproxy processes to make them immediately shut down, a graceful shutdown does the following:

1. Create a marker file that tells the router's /healthz health check to return false.
2. Sleep for 90 seconds to allow time for external load-balancers to observe the failing health check and drain connections to the router.
3. Create a marker file that tells the reload-haproxy script to exit immediately.
4. Send a SIGUSR1 signal to all HAProxy processes to make them stop accepting new connections.
5. Wait up to 30 seconds for all HAProxy processes to terminate.
6. Perform a hard shutdown of any remaining HAProxy processes.

This commit is related to bug 1709958.

https://bugzilla.redhat.com/show_bug.cgi?id=1709958

* images/router/haproxy/reload-haproxy: Exit immediately if the file /var/lib/haproxy/run/terminating is present.
* images/router/haproxy/shutdown-haproxy: New file. Perform a graceful shutdown as described above.
* pkg/cmd/infra/router/template.go (Run): Configure a listener for readiness checks specifically intended for external load balancers. In particular, these checks include the new IsDraining check.
* pkg/router/metrics/health.go (IsDraining): New function. Return a healthz check that fails if the "draining" marker that the shutdown script creates exists.
* pkg/router/metrics/metrics.go (Listener): Rename LiveChecks to PodLiveChecks. Rename ReadyChecks to PodReadyChecks. Add a new field, LBReadyChecks, for readiness checks specific to external load balancers. (handler): Register LBReadyChecks as /healthz, PodLiveChecks as /healthz/live, and PodReadyChecks as /healthz/ready.
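As a rough way to see the three endpoints described above in action, one could probe them from inside a router pod. A hedged sketch (the port 1936 is an assumption; use whatever port the router's health listener actually binds):

```sh
# Probe the three health endpoints described in the commit message.
# Port 1936 is an assumption, not confirmed by this PR.
for path in /healthz /healthz/live /healthz/ready; do
    printf '%s: ' "$path"
    curl -s -o /dev/null -w '%{http_code}\n' "http://localhost:1936${path}"
done

# Once shutdown-haproxy creates the "draining" marker, /healthz (the
# external-LB readiness check) should begin failing, while the pod-level
# checks are unaffected by IsDraining.
```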
Force-pushed from b83f97f to 77dc2da.
@Miciah: The following test failed, say
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.
What’s the status of this?
It does not work as is. As soon as the pod is deleted, the pod's endpoints are deleted, even if the pod has a pre-stop lifecycle handler. The service proxy observes that the node has no local endpoints for the LB service, and it starts blackholing traffic to the node. To make draining work, we must initiate it before the pod is deleted. To do this, we need to make the load balancer's health check start failing before deleting the pod. Right now, the health check for LB services is serviced by the service proxy, so we would need to make the health check configurable (proof of concept: Miciah/origin@108a036).

Alternatively, we have explored using surge during rollout, using affinity rules to schedule new replicas on the same nodes as old replicas to ensure that local endpoints never disappear during a rollout (proof of concept: Miciah/cluster-ingress-operator@93517fd). This approach has other challenges, the most salient being that we need specific behavior when the old replicas are descheduled, namely to prefer descheduling old replicas on nodes where the new replicas are ready. (Someone suggested a change along these lines here: kubernetes/kubernetes#4301 (comment).)

My next step is to create a proof-of-concept change to the replicaset controller to give us the desired behavior.
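One way to observe the endpoint removal described here is to watch the endpoints object during a pod deletion. A sketch (namespace and resource names are illustrative):

```sh
# In one terminal, watch the router service's endpoints
# (namespace and service name are illustrative):
kubectl -n openshift-ingress get endpoints router-default -w

# In another terminal, delete a router pod; its address is removed from the
# endpoints object as soon as the deletion starts, even while the pre-stop
# hook is still running:
kubectl -n openshift-ingress delete pod <router-pod-name>
```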
Superseded by openshift/cluster-ingress-operator#280.
@Miciah: Closed this PR.

In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
Add a shutdown-haproxy shell script, which can be used as the pre-stop handler for a router deployment in order to provide graceful shutdown.

Whereas a hard shutdown simply sends a SIGTERM signal to all haproxy processes to make them immediately shut down, a graceful shutdown does the following:

1. Create a marker file that tells the reload-haproxy script to exit immediately.
2. Sleep for 30 seconds to allow other proxies such as kube-proxy and any cloud load-balancers to drain connections.
3. Send a SIGUSR1 signal to all HAProxy processes to make them stop accepting new connections.
4. Wait up to 30 seconds for all HAProxy processes to terminate.
5. Perform a hard shutdown of any remaining HAProxy processes.
This commit is related to bug 1709958.
https://bugzilla.redhat.com/show_bug.cgi?id=1709958
* images/router/haproxy/reload-haproxy: Exit immediately if the file /var/lib/haproxy/run/terminating is present.
* images/router/haproxy/shutdown-haproxy: New file. If the file /var/lib/haproxy/run/terminating is present, perform a hard shutdown, or else perform a graceful shutdown as described above.
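A minimal sketch of the two behaviours described in these bullet points, assuming the marker path /var/lib/haproxy/run/terminating from the description (the real scripts may differ in detail):

```sh
# reload-haproxy (excerpt, sketched): do nothing if a shutdown is in progress.
if [[ -f /var/lib/haproxy/run/terminating ]]; then
    exit 0
fi

# shutdown-haproxy (sketched): hard shutdown if we are already terminating,
# otherwise start a graceful shutdown.
if [[ -f /var/lib/haproxy/run/terminating ]]; then
    pkill -TERM -x haproxy || true           # hard shutdown
else
    touch /var/lib/haproxy/run/terminating   # tell reload-haproxy to no-op
    pkill -USR1 -x haproxy || true           # soft-stop existing connections
fi
```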