From 75dbab9c54c6cb3470075af1da1b139ecea02d38 Mon Sep 17 00:00:00 2001
From: Colin Walters
Date: Thu, 28 May 2020 13:06:42 +0000
Subject: [PATCH] machine-config-daemon-firstboot.service: Make idempotent and
 block kubelet

See https://bugzilla.redhat.com/show_bug.cgi?id=1840222

Something in the baremetal IPI stack is forcibly powering off
nodes during firstboot. This causes all sorts of problems,
but we should be more robust in handling it.

The problem with `BindsTo=ignition-firstboot-complete.service`
is twofold:

First, if that service fails, this unit doesn't run, and boot
silently continues on to e.g. `kubelet.service`. That's bad - we
should not land user workloads until a node is up to date and secure.

Second, the binding is wrong because at some point we may move
that service into the initramfs in CoreOS, and that would cause
this to break.

The "stamp file" approach is a generally good method of
achieving idempotence, and we already have one, so let's use it.

We also add a `RequiredBy={kubelet,crio}.service` to ensure
they don't run unless we succeed.
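
For illustration only, the stamp-file pattern described above can be
sketched in shell. This is a hypothetical sketch, not the daemon's
implementation: `run_firstboot`, the `STAMP` variable, and the `/tmp`
default path are invented names standing in for the unit and for
`/etc/ignition-machine-config-encapsulated.json`.

```shell
#!/bin/sh
# Hypothetical sketch of stamp-file idempotence; not the MCD code.
# STAMP stands in for /etc/ignition-machine-config-encapsulated.json.
STAMP="${STAMP:-/tmp/firstboot-demo.stamp}"

run_firstboot() {
    # Mirrors ConditionPathExists=: skip when the stamp file is gone.
    if [ ! -e "$STAMP" ]; then
        echo "skipped"
        return 0
    fi
    # Run the work passed as arguments. If it fails (or the machine is
    # powered off here), the stamp stays in place and the next boot retries.
    "$@" || { echo "failed"; return 1; }
    # Remove the stamp only after the work has succeeded.
    rm -f "$STAMP"
    echo "done"
}
```

Calling `run_firstboot some-command` repeatedly is safe under these
assumptions: an interrupted or failed run is retried, and a completed
run is skipped.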
---
 .../_base/units/machine-config-daemon-firstboot.service | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/templates/common/_base/units/machine-config-daemon-firstboot.service b/templates/common/_base/units/machine-config-daemon-firstboot.service
index 61ac9a9836..b551ee5418 100644
--- a/templates/common/_base/units/machine-config-daemon-firstboot.service
+++ b/templates/common/_base/units/machine-config-daemon-firstboot.service
@@ -5,12 +5,13 @@ contents: |
   Description=Machine Config Daemon Firstboot
   # Make sure it runs only on OSTree booted system
   ConditionPathExists=/run/ostree-booted
-  BindsTo=ignition-firstboot-complete.service
+  # Removal of this file signals firstboot completion
   ConditionPathExists=/etc/ignition-machine-config-encapsulated.json
   # We only want to run on 4.3 clusters and above; this came from
   # https://github.com/coreos/coreos-assembler/pull/768
   ConditionPathExists=/sysroot/.coreos-aleph-version.json
   After=ignition-firstboot-complete.service
+  Before=crio.service crio-wipe.service
   Before=kubelet.service
 
   [Service]
@@ -20,3 +21,4 @@ contents: |
 
   [Install]
   WantedBy=multi-user.target
+  RequiredBy=crio.service kubelet.service