-
Notifications
You must be signed in to change notification settings - Fork 462
daemon: Use systemd-run, not machine-config-daemon-host.service #1803
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This is the mirror/dual of openshift#1802 With this the MCD (when run as a pod) stops using `machine-config-daemon-host.service`. and creates a dynamic unit instead. With the combination of both, `machine-config-daemon-host.service` is on the path to not being used by default and migrating to a "4.1 bootimage aid". The systemd-run model of creating a unit dynamically is much clearer for what we want here; conceptually the service is just a dynamic child of this pod (if we could we'd tie the lifecycle together). Further: - Let's shorten our systemd unit names by using the `mco-` prefix - Inject the `RPMOSTREE_CLIENT_ID`, see coreos/rpm-ostree@016c1c5 For example, one weird semantic of `systemctl start` is that it "joins" if somehow the service is started for another reason. But here, if somehow two instances of the MCD were running then `systemd-run` will say e.g.: `Failed to start transient service unit: Unit mco-pivot.service already exists.`
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: cgwalters The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
Another reason to do this is that the current |
|
/hold |
|
Looking at the logs for this one...we have no coverage of this in current CI 😢 (the upgrade test is testing from an older revision, so we won't see the new logs). |
| // tie the lifecycle together). Further, let's shorten our systemd unit names | ||
| // by using the mco- prefix, and we also inject the RPMOSTREE_CLIENT_ID now. | ||
| err := exec.Command("systemd-run", "--wait", "--collect", "--unit=mco-pivot", | ||
| "-E", "RPMOSTREE_CLIENT_ID=mco", "/usr/libexec/machine-config-daemon", "pivot", osImageURL).Run() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
umm, so we are directly calling m-c-d pivot here which means in case of an update pivot will run systemctl reboot (https://github.com/openshift/machine-config-operator/blob/master/cmd/machine-config-daemon/pivot.go#L253) . Can this cause system to reboot without executing remaining control flow?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My original theory that somehow the pivot code was rebooting turned out to be wrong. That code only reboots if a CLI argument is passed or if a stamp file exists, but just above in this function we unlink it to be sure:
// This is written by code injected by the MCS, but we always want the MCD to be in control of reboots
if err := os.Remove("/run/pivot/reboot-needed"); err != nil && !os.IsNotExist(err) {
return errors.Wrap(err, "deleting pivot reboot-needed file")
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also I think we could just remove that pivot reboot code now too. Nothing should be using it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, it won't have /run/pivot/reboot-needed file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Curious, what is the use of passing osImageURL to m-c-d pivot, does it get processed?
| func followPivotJournalLogs(stopCh <-chan time.Time) { | ||
| cmd := exec.Command("journalctl", "-f", "-b", "-o", "cat", | ||
| "-u", "rpm-ostreed", | ||
| "-u", "pivot", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
since we are removing pivot service, does it make sense to remove from logging?
| "-u", "rpm-ostreed", | ||
| "-u", "pivot", | ||
| "-u", "mco-pivot", | ||
| "-u", "machine-config-daemon-host") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need this since mco-pivot replaces machine-config-daemon-host service
|
Since we are directly calling |
Not without landing #1802 first right? I'm thinking that we should keep all of these "cleanup firstboot" PRs on hold until we land #1766 - or we can pile them all together there so that way we get CI coverage too. |
ummm, it should be ideally optional because right now RunPivot() is called during both firstboot and subsequent boot.
I agree with this, will be nice to club all related commits in a single PR and get them well tested together. |
|
Closing in favor of piling onto doing this in "one shot" with #1766 |
This is the mirror/dual of
#1802
With this the MCD (when run as a pod) stops using
machine-config-daemon-host.service.and creates a dynamic unit instead.
With the combination of both,
machine-config-daemon-host.serviceis on thepath to not being used by default and migrating to a "4.1 bootimage aid".
The systemd-run model of creating a unit dynamically is much clearer for what we want here;
conceptually the service is just a dynamic child of this pod (if we could we'd
tie the lifecycle together). Further:
mco-prefixRPMOSTREE_CLIENT_ID, see coreos/rpm-ostree@016c1c5For example, one weird semantic of
systemctl startis that it "joins" ifsomehow the service is started for another reason. But here, if somehow
two instances of the MCD were running then
systemd-runwill say e.g.:Failed to start transient service unit: Unit mco-pivot.service already exists.