docs/dev/upgrades.md: 25 additions & 1 deletion

@@ -56,4 +56,28 @@ Which in practice can be described in runlevels:
0000_70_*: disruptive node-level components: dns, sdn, multus
0000_80_*: machine operators
0000_90_*: reserved for any post-machine updates
```

## Why does the OpenShift 4 upgrade process "restart" in the middle?

Since the release of OpenShift 4, a frequently asked question has been: why does a cluster upgrade (`oc adm upgrade`) sometimes appear to restart partway through? [This bugzilla](https://bugzilla.redhat.com/show_bug.cgi?id=1690816), for example, has a number of duplicates, and I've seen the question come up in chat and email forums.

The answer to this question is worth explaining in detail, because it illustrates some fundamentals of the [self-driving, operator-focused OpenShift 4](https://blog.openshift.com/openshift-4-a-noops-platform/). During the initial development of OpenShift 4, the top-level [cluster-version-operator](https://github.com/openshift/cluster-version-operator/) (CVO) and the [machine-config-operator](https://github.com/openshift/machine-config-operator/) (MCO) were developed concurrently (and still are).

The MCO is just one of a number of "second level" operators that the CVO manages. However, the relationship between the CVO and MCO is somewhat special because the MCO [updates the operating system itself](https://github.com/openshift/machine-config-operator/blob/master/docs/OSUpgrades.md) for the control plane.
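
These "second level" operators surface in the cluster as `ClusterOperator` objects. A quick way to list them (a sketch, assuming a running cluster and the usual `machine-config` operator name) is:

```
# List the "second level" operators the CVO manages
oc get clusteroperators

# Inspect the machine-config operator specifically
oc get clusteroperator machine-config -o yaml
```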

If the new release image contains an updated operating system (`machine-os-content`), then pulling down the update ends up causing the CVO to (indirectly) restart itself.
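
One way to check whether a particular release bumps the OS is to ask the release image which `machine-os-content` it references. A sketch (the release pullspec below is purely illustrative):

```
# Show the machine-os-content image a release references
# (substitute the release pullspec or version you actually care about)
oc adm release info quay.io/openshift-release-dev/ocp-release:4.6.1-x86_64 \
  --image-for=machine-os-content
```

Comparing that output for the current and target releases tells you whether the MCO will be rolling out a new OS as part of the upgrade.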

> **Review comment (Member):** Idle curiosity, what percentage of OCP releases bump `machine-os-content`?


This is because, in order to apply the OS update (or any configuration changes), the MCO drains each node it is updating and then reboots it. The CVO is just a regular pod (driven by a `deployment`) running in the cluster (`oc -n openshift-cluster-version get pods`); it gets drained and rescheduled just like the rest of the platform it manages, as well as user applications.
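
You can watch this happen directly during an upgrade; a sketch, assuming the usual `cluster-version-operator` deployment name:

```
# The CVO is an ordinary deployment-managed pod
oc -n openshift-cluster-version get deployment cluster-version-operator

# Watch the CVO pod get evicted and rescheduled as its node is drained
oc -n openshift-cluster-version get pods -o wide -w
```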

Also, besides operating system updates, there's the case where an updated payload changes the CVO image itself.

Today, there's no special support in the CVO for passing "progress" between the previous and new pod; the new pod just looks at the current cluster state and attempts to reconcile between the observed and desired state. This is generally true of the "second level" operators as well, from the MCO to the network operator, the router, etc.

Hence, the termination and restart of the CVO is visible to components watching the `clusterversion` object, because the new pod recalculates the status.
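
In practice the blip looks like a brief status recalculation while the new CVO pod comes up. One way to observe it (assuming the conventional `version` ClusterVersion object):

```
# Watch overall upgrade progress; expect a brief recalculation when the CVO pod restarts
oc get clusterversion -w

# Or watch just the Progressing condition message
oc get clusterversion version \
  -o jsonpath='{.status.conditions[?(@.type=="Progressing")].message}{"\n"}'
```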

I could imagine at some point adding clarification for this; perhaps a basic boolean flag in e.g. a `ConfigMap` denoting that the pod was drained due to an upgrade, which the new CVO pod would "consume" and turn into "Resuming upgrade..." text in its status. But I think that's probably all we should do.
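
To be explicit, nothing like this exists today; a purely hypothetical sketch (all names invented for illustration) might be no more than:

```
# Hypothetical only: no such ConfigMap exists in the product today.
# The outgoing CVO pod would set a flag before being drained...
oc -n openshift-cluster-version create configmap upgrade-progress \
  --from-literal=drained-for-upgrade=true

# ...and the incoming pod would read it, add "Resuming upgrade..." to its status,
# and then clean it up.
oc -n openshift-cluster-version delete configmap upgrade-progress
```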

> **Review comment:** @jottofar @wking correct me if I'm wrong, but a boolean flag for this is not necessary. The new CVO can detect an in-progress update when it starts (since the current version doesn't yet match `desiredVersion`). Not sure if it can autodetect the percentage, though.

> **Review comment (Contributor):** It knows it's in an update, but I don't know how much state it saves, e.g. `RetrievedUpdates`, to know it has already loaded the update.

> **Review comment (Member):** I am in favor of storing information about synced/failing manifests in ClusterVersion, which would allow us to pick back up where the previous CVO left off. But there's no such stored state today.

> **Review comment (@LalatenduMohanty, Sep 8, 2020):** This discussion should be tracked in a Jira.

> **Review comment (Member):** No need to track in Jira; this is already tracked in the bug that these docs link.
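
As the thread above notes, the in-progress state is already observable from outside the CVO: the most recent `status.history` entry stays in the `Partial` state until the update completes. A rough sketch (again assuming the conventional `version` ClusterVersion object):

```
# Compare the desired version with the most recent history entry and its state
oc get clusterversion version -o jsonpath='{.status.desired.version}{"\n"}'
oc get clusterversion version \
  -o jsonpath='{.status.history[0].version}{" "}{.status.history[0].state}{"\n"}'
```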


Because the CVO does not special-case upgrading itself, its restart works the same way it would if the kernel hit a panic and froze, the hardware died, there was an unrecoverable network partition, and so on. By having the "normal" code path work in exactly the same way as the "exceptional" path, we ensure the upgrade process is robust and tested constantly.

In conclusion, OpenShift 4 installations by default have the cluster "self-manage", and the transient cosmetic upgrade status blip is a normal and expected consequence of this.