Don't exit the probe on connection issues #240
k8s-ci-robot merged 1 commit into kubernetes-csi:master
Tested with a CSI driver that's crashlooping. The liveness probe sidecar is still running and just restarts the driver container. "connection refused" is not great, but it fails the probe just fine:
ejweber left a comment
I think it makes more sense to simply remove acquireConnection altogether, as I did in #237. Its main purpose as a wrapper seems to be adding timeout functionality, which connlib now gives us for free. I'm not sure there's a practical reason for someone to set probeTimeout > 30, but if they did, connlib.Connect would time out before probeTimeout, causing confusion.
That being said, this looks like it fixes the issue identified in #236, so no major complaints if you go this route.
Do not exit the liveness probe process when the liveness probe cannot connect to the CSI driver. The driver could be crashlooping, and we should not crashloop the liveness probe process too. The process should only fail all probes to the /healthz endpoint. Since the HTTP server is not running when connecting to the driver for the first time, "connection refused" must be a good enough failure.
I basically copied #237 and updated it with
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: ejweber, jsafrane, xing-yang
The full list of commands accepted by this bot can be found here.
The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
f8c8cc4 Merge pull request kubernetes-csi#237 from msau42/prow
b36b5bf Merge pull request kubernetes-csi#240 from dannawang0221/upgrade-go-version
adfddcc Merge pull request kubernetes-csi#243 from pohly/git-subtree-pull-fix
c465088 pull-test.sh: avoid "git subtree pull" error
7b175a1 Update csi-test version to v5.2.0
987c90c Update go version to 1.21 to match k/k
2c625d4 Add script to generate patch release notes
git-subtree-dir: release-tools
git-subtree-split: f8c8cc4
…ersion Update go version to 1.21 to match k/k
What type of PR is this?
/kind bug
What this PR does / why we need it:
Do not exit the liveness probe process when it cannot connect to the CSI driver. The driver could be crashlooping, and we should not crashloop the liveness probe process too.
The process should only fail all probes to the /healthz endpoint. Since the HTTP server is not running when connecting to the driver for the first time, "connection refused" must be a good enough failure.
This PR is heavily inspired by / copied from #237
Which issue(s) this PR fixes:
Fixes #236
Special notes for your reviewer:
Does this PR introduce a user-facing change?: