From 54ee4e2cffe2cdeadab26f0777907dfe8d356994 Mon Sep 17 00:00:00 2001 From: Dan Mace Date: Tue, 9 Jun 2020 10:48:29 -0400 Subject: [PATCH] Fix quorum-guard timeouts Before this change, the quorum-guard `timeoutSeconds` and `failureThreshold` values were left unspecified in the manifest, and were defaulted. The default value for `timeoutSeconds` is 1, while the probe itself enforces a 2 second timeout. This means that in cases where the probe itself should succeed, Kube will consider the probe failed because of the stricter timeout on the probe specification. The effect is the probe sporadically reports false negative outcomes. This change increases the `timeoutSeconds` value to exceed the probe logic's internal timeout so that the probe command is the source of truth with regards to timeouts. This change also makes the `failureThreshold` value explicit, but the default value is preserved because I don't have a clear reason yet to change it. --- ...0_machine-config-operator_07_etcdquorumguard_deployment.yaml | 2 ++ 1 file changed, 2 insertions(+) diff --git a/install/0000_80_machine-config-operator_07_etcdquorumguard_deployment.yaml b/install/0000_80_machine-config-operator_07_etcdquorumguard_deployment.yaml index 41ecc2f81c..8a259fbd81 100644 --- a/install/0000_80_machine-config-operator_07_etcdquorumguard_deployment.yaml +++ b/install/0000_80_machine-config-operator_07_etcdquorumguard_deployment.yaml @@ -84,6 +84,8 @@ spec: curl --max-time 2 --silent --cert "${cert//:/\:}" --key "$key" --cacert "$cacert" "$health_endpoint" |grep '{ *"health" *: *"true" *}' initialDelaySecond: 5 periodSecond: 5 + failureThreshold: 3 + timeoutSeconds: 3 resources: requests: cpu: 10m