AGENT-1250: Simplified cluster operations for disconnected environments #1821
Conversation
@andfasano: This pull request references AGENT-1250, which is a valid Jira issue.
/assign @zaneb
Force-pushed from b1878df to 4b793e5.
[APPROVALNOTIFIER] This PR is NOT APPROVED. It needs approval from an approver in each of the affected files.
Force-pushed from 4b793e5 to 50f9cc0.
```yaml
name: openshift-releases
spec:
  releases:
    - "quay.io/openshift-release-dev/ocp-release:4.18.0-x86_64"
```
We'll need to figure out how to map these names to the temporary registry.
It's also essential for security reasons that releases here are referenced only by digest and never by tag.
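As a sketch, a digest-pinned entry in the quoted `spec.releases` list would look like the following (the digest is a placeholder, not a real release digest):

```yaml
name: openshift-releases
spec:
  releases:
    # pinned by digest, never by tag
    - "quay.io/openshift-release-dev/ocp-release@sha256:<release-digest>"
```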
> …was always available in the environment. Removing such dependency across all the possible locations appeared to be particularly complex and challenging (and moreover no formal way to enforce it once removed)
I'd say all 3 of the alternatives from the discussion doc are worth documenting as rejected alternatives:
- the initial version of this proposal, before we realised that we would have to expose the local registries via api-int.
- the version using PinnedImageSet
- the idea of writing a registry that exposes the containers-storage as a registry
> the idea of writing a registry that exposes the containers-storage as a registry
See also https://github.com/cgwalters/cstor-dist/.
Yeah, I have chatted to Colin about that. Unfortunately there are some fundamental limitations to what is standardised in the OCI Distribution protocol that we concluded made this approach too high-risk for our purposes.
> * The `spec.releases` will be used to configure the desired release contents. Adding (or removing) an entry from the list will trigger an update on all the control plane nodes.
Where do the control plane nodes source the images from?
Once copied, from the disk. Did you maybe mean the exact location on disk?
I think I meant: how does the control plane know where to copy the images from? I didn't see anywhere in the API where you configure "the ISO is loaded at this mount point".
This wasn't clear to me either.
A temporary registry will be started when the ISO is mounted (a udev rule matching on a disk label triggers a systemd service that starts it). The MCD will know how to get the image from that registry - it could be e.g. a static pod with a Service that gives it a well-known URL accessible within the cluster.
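A minimal sketch of that trigger chain, assuming a hypothetical filesystem label, unit name, mount point and registry invocation; the enhancement would define the real values:

```ini
# /etc/udev/rules.d/90-ove-release-iso.rules (hypothetical path and label)
# When a block device with the expected filesystem label appears, pull in the
# temporary-registry unit via udev's systemd integration.
ACTION=="add", SUBSYSTEM=="block", ENV{ID_FS_LABEL}=="OVE-RELEASE", TAG+="systemd", ENV{SYSTEMD_WANTS}+="ove-temp-registry.service"

# /etc/systemd/system/ove-temp-registry.service (hypothetical unit)
[Unit]
Description=Temporary registry serving release content from the attached ISO

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStartPre=/usr/bin/mkdir -p /mnt/ove-release
ExecStartPre=/usr/bin/mount -o ro /dev/disk/by-label/OVE-RELEASE /mnt/ove-release
# Serve the pre-populated distribution storage tree shipped on the ISO.
ExecStart=/usr/bin/podman run -d --name ove-temp-registry --net host \
  -v /mnt/ove-release/registry:/var/lib/registry:ro docker.io/library/registry:2
ExecStop=/usr/bin/podman rm -f ove-temp-registry
ExecStopPost=/usr/bin/umount /mnt/ove-release
```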
(sorry, the file rename due to the markdownlint failures scrambled the latest comments a little bit; I will keep addressing the unresolved ones in the next commits - hopefully no more file renaming 😥)
> …to each local control plane node registry), managed by the Machine Config Daemon (MCD). Once the copy has been completed on all the control plane nodes, the user could start the usual offline upgrade process via `oc adm upgrade --allow-explicit-upgrade --to-image=<release-image>`.
What is the suggested approach to extract the relevant 'release-image' from the ISO?
Or maybe, since the ISO is constructed for a specific release image anyway, we can keep it transparent for users? E.g. apply a MachineConfig that starts the upgrade (and detects the right version internally).
The assumption is that the user will have previously downloaded a specific OCP ISO version, so it should be known information upfront, explicitly specified by the user.
This sounds completely disconnected from the graph; does that matter? Do you care that you can easily go from 4.20.100 to 4.21.0 and thereby fall back in time by two years?
If we had OSUS here this would just work normally, and you could use symbolic names. But we don't have a registry, so we don't have OSUS. We could at least verify that the current version is listed in the target release's list of previous versions; that would prevent upgrading into an older version, a path we would never have tested and that will almost certainly have some regression if you move to a considerably older release. Perhaps put the IRI resource into a failed state if the current version is neither the version of the IRI nor in its list of previous versions?
```console
$ oc adm release info 4.19.8 -o json | jq '.metadata.previous'
[
  "4.18.19",
  "4.18.20",
  "4.18.21",
  "4.18.22",
  "4.19.0",
  "4.19.1",
  "4.19.2",
  "4.19.3",
  "4.19.4",
  "4.19.5",
  "4.19.6",
  "4.19.7"
]
```
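A sketch of such a check, using only the command above plus the cluster's current version; `TARGET_RELEASE_PULLSPEC` is a placeholder for the release being added, and where the check is applied (copy time vs. upgrade time) is exactly the open question discussed below:

```bash
# Sketch: warn (or fail the IRI entry) if the current version is not a tested
# upgrade path to the target release. TARGET_RELEASE_PULLSPEC is a placeholder.
current=$(oc get clusterversion version -o jsonpath='{.status.desired.version}')
oc adm release info "${TARGET_RELEASE_PULLSPEC}" -o json \
  | jq -e --arg v "$current" '.metadata.previous | index($v) != null' >/dev/null \
  || echo "warning: ${current} is not in the target release's previous-versions list"
```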
I like that idea, although ideally we would apply that at the time of upgrading rather than at the time of copying the image into the local registry. Is this something we should build into CVO in the absence of OSUS?
I've lost track of the specific validations the CVO does, but I believe the way it's intended to work today is: if you provide a symbolic version, it has to be on the graph; as soon as you provide a specific pull spec (--allow-explicit-upgrade), the CVO assumes you know what you're doing, not that that's ever been documented explicitly. There's a lot of blurred lines between CLI-side guards and what the API accepts.
@wking can you check me here? What protections are in place, and what do you think would be the best way to keep users from making a mistake by potentially selecting the wrong mirrored image when expecting to upgrade? Maybe it's time to move all the client-side protections into the ClusterVersion CR and CVO via a new interface?
It would be great if we had symbolic versions. What I think we'd need to do that is a mapping of tags to sha digests that is cryptographically signed. If that existed it would be fairly easy to build into the ISO, but the infrastructure to set it up if it doesn't would be potentially daunting. Do you know of any existing artifacts that could act as building blocks here? A brief reading of sigstore docs suggests that it won't help.
FWIW: some time ago, before OSUS and oc-mirror were around, OCP4 upgrades on disconnected environments had to be done with --allow-explicit-upgrade, and that did carry the risks you are pointing out here of not properly obeying the graph. In that era, the message was that you had to be careful to follow the graph on your own, or you risked not being properly supported.
However, I'd prefer to have safeguard mechanisms like the ones you are considering. The older approach of following the graph on your own is too risky.
Arguably there's a high probability that most disconnected upgrades still follow this path and don't use OSUS, and that we should therefore punt this to a future enhancement rather than block on it now. (I agree that I'd prefer to have it, all things being equal.)
> Initially the registry image will be placed into a read-only [additional image …
Currently we simply use 'podman load' on the registry tar file.
Are you referring to the same flow?
Not exactly. By placing the registry image directly in the (podman) additional image store, there would be no need to load it beforehand (and, being read-only, it could not be pruned).
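As a sketch of that mechanism, the additional image store would be declared in the node's containers-storage configuration; the store path below is illustrative:

```toml
# /etc/containers/storage.conf (fragment) - hypothetical store location
[storage]
driver = "overlay"

[storage.options]
# Read-only additional image store pre-populated with the registry image.
additionalimagestores = [
  "/usr/lib/ove/registry-image-store",
]
```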
Hmm, so how is it simpler than 'podman load'? I mean, we would still need some service/script to place the image in the store, right?
The assumption was to have it already available at build time, but I think you're right for the live phase: since currently it doesn't seem to be (easily) possible to extend the ISO content, in that case we'll have to fall back to the podman load approach.
As for the installation phase instead (so after the reboot), the Assisted Installer could directly copy it to the target node (as mentioned here for the backend data). If so, I will rephrase this part to make it clearer.
> At this point the user could edit the existing IRI resource to add the new release to trigger the copy process (from the temporary registry to each local control plane node registry), managed by the Machine Config Daemon (MCD). Once the copy has been completed on all the control plane nodes, …
It might be helpful to have a single controller orchestrating the copy process across the MCDs running on the control plane nodes. This makes it easier to aggregate the individual node states into overall conditions and to report error modes at the operator level.
This design was inspired by PinnedImageSet in the hope that much of that can be... if not reused, at least used as a template.
IIUC PinnedImageSet does have its own controller running inside the MCO. I don't think the language here was intended to exclude that, so it could probably be clarified slightly.
Ah, I see. Agreed that we should clarify whether PIS is going to be reused here.
They're specified at the MachineConfigPool level, and the node-level images are represented in the status of the MachineConfigNode objects. We should be able to leverage that. We'd need some sort of translation between InternalReleaseImage <=> PinnedImageSets; maybe at the control-plane level?
I'm also not sure if PIS currently works in bootstrap/firstboot mode, so that might be something I can try getting clarification from the team on (cc @yuqi-zhang if you have context here). The flow above hints that the IRI manifest is added at the end of the installation, so perhaps the MCO translation piece is not relevant at bootstrapping time?
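If the PIS machinery were reused, the translated object might look roughly like the following; this is a sketch from memory of the PinnedImageSet API, and the group/version, pool association and naming may well differ:

```yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: PinnedImageSet
metadata:
  # hypothetical name derived from the corresponding IRI release entry
  name: iri-ocp-release-bundle-4-19-0
spec:
  pinnedImages:
    # PIS pins by digest; the digest below is a placeholder
    - name: quay.io/openshift-release-dev/ocp-release@sha256:<release-digest>
```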
Hmm, I guess I'm not too clear on this either. It reads to me that the install flow is completely handled by the agent installer (i.e. it will "set up" the registry and image content from the install ISO on all the control plane nodes), and upgrade-wise that responsibility falls onto the MCD via specifications in IRI? Does IRI come into the install process at all through the worker nodes?
Re: upgrade, my read of the process outlined below is that:
- the user plugs a USB disk ("attach the extended ISO") into the first control plane node
- the user writes IRI for that specific node, so that the MCD copies it
- the user unplugs it, plugs it into the second control plane node
- write IRI to the second node
repeat until done for the control plane, then trigger the upgrade for the whole cluster, where each of the 3 control plane nodes reads off itself and the workers then read off the control plane?
If that's the case, I think we'd need some daemon process to handle the actual "copy" regardless. We probably also need some controller handling the central state, though, to ensure the user has successfully completed all copies and to write some status before they can upgrade.
@djoshy it's not literally a PinnedImageSet, but I am saying that the architecture of PIS and any lessons learned are readily adaptable to the architecture of this (i.e. it's a thing that has to run on every node, report back when it is done, and roll up status so we can see when the whole cluster is up to date). Hopefully that can speed up the design, because it is not uncharted territory.

> It reads to me that the install flow is completely handled by the agent installer (i.e. it will "set up" the registry and image content from the install iso on all the control plane nodes)

This is how the Dev Preview prototype works now. The plan was indeed for this to continue, because that means we can make sure that all of the nodes have their images cached on disk before proceeding with the installation (although we don't yet do that).
It would certainly be advantageous if we could rely on the IRI and MCO even at install time to create the local registry, as it would avoid duplicating the implementation, and eliminate the risk of ending up with both files on disk due to the ones created by the installation not being managed by the MCO. We could monitor the MCN status and not let the install continue until it is complete on the first 2 control plane nodes, but we would have no way of verifying the third node (the one running assisted-service) after it reboots, although theoretically since we'd have two working copies in the cluster there'd be nothing to stop it from eventually converging. This would require us to run an accessible registry on the bootstrap node, and pull data over the network even though it was previously available locally (though this is the case for worker nodes and during upgrades anyway). Perhaps a middle ground is to pre-populate just the images, and let MCO set up the registry service. @andfasano WDYT?
> Does IRI come into the install process at all through the worker nodes?
We're proposing that worker nodes will not have a local registry, and that the control plane will operate as an HA registry for all nodes.
> the user unplugs it, plugs it into the second control plane node
No, the idea is that when it is plugged in to the first control plane node, we'll start a temporary registry that is accessible from all control plane nodes. So they'll all ~simultaneously pull the image from the node with the ISO attached.
> I think regardless we'd need some daemon process to handle the actual "copy". We probably also need some controller handling the central state though, to ensure the user has successfully completed all copies and write some status before they can upgrade.
Agree.
> It would certainly be advantageous if we could rely on the IRI and MCO even at install time to create the local registry
On reflection, this wouldn't work for single-node 🙁
> Perhaps a middle ground is to pre-populate just the images, and let MCO set up the registry service. @andfasano WDYT?
If this means letting the AS copy the registry data before the first reboot, then yes, I think it'd be more convenient to let the MCO set up the registry as soon as possible. The registry must be available at least before machine-config-daemon-pull.service and resolv-prepender.
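As an illustration of that ordering requirement, the MCO-injected unit could declare it directly; the unit name, paths and registry invocation below are assumptions, and the exact resolv-prepender unit name varies by platform so it is only indicated in a comment:

```ini
# Sketch: /etc/systemd/system/iri-local-registry.service (hypothetical unit name)
[Unit]
Description=Node-local release image registry
# Must be up before the MCD pulls its image and before name resolution is rewritten.
Before=machine-config-daemon-pull.service
# Before=<resolv-prepender unit for this platform>
Wants=network-online.target
After=network-online.target

[Service]
# Serve the pre-populated storage tree on the host network, on the proposed port.
ExecStart=/usr/bin/podman run --rm --name iri-local-registry --net host \
  -e REGISTRY_HTTP_ADDR=:22625 \
  -v /var/lib/iri-registry:/var/lib/registry:Z docker.io/library/registry:2
Restart=always

[Install]
WantedBy=multi-user.target
```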
palonsoro left a comment
I'd have some questions and I also found a typo.
```yaml
enabled: true
status:
  releases:
    - "quay.io/openshift-release-dev/ocp-release:4.18.0-x86_64"
```
Thinking about what we should report here, I wonder if each individual status entry for each payload should have subfields around progressing/success/error/etc., instead of overall conditions, so it's easier to see which payloads are there, which are being pulled, etc.
Although since a user is unlikely to specify more than 2 payloads here at a time, perhaps we just report the most recent error? Would the MCD be constantly reconciling all releases it expects to be on disk, or just the latest added/removed one?
> Would the MCD be constantly reconciling all releases it expects to be on disk? Or just the latest added/removed one?

It should reconcile them all. Otherwise, mid-upgrade scenarios could potentially break if the images from the older release are not being properly synchronized for some reason but are still needed because their component is not yet fully upgraded.
In that case I think the status should have sub-sections per release, i.e. it would be nice to see how many nodes have successfully pulled each release, what errors there are, etc.
I fully agree @yuqi-zhang. If a problem of any kind arises, we would indeed need that degree of detail. Otherwise, we would only know that "something failed somewhere".
So given this direction, what subfields will the release status have?
In general I see the action of adding/removing a specific release as an atomic operation, with its (global) status reported here. For the detailed error, could the previously proposed approach be reused?
Is it possible that a user is both adding and removing a release simultaneously? If so, and there are errors both in adding and in removing those release images, how would that be reported at the top level?
I think adding some more status to this list for each image (perhaps even tallying which release(s) are currently in use by CVO) makes sense, so that we can work out problems with the individual releases.
These could just be a set of conditions for each release, perhaps InUse, Ready, etc., as sketched below.
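A rough sketch of what such per-release conditions could look like; the condition types, reasons and digests are illustrative, not a settled API:

```yaml
status:
  releases:
    - name: ocp-release-bundle-4.18.0-x86_64
      image: quay.io/openshift-release-dev/ocp-release@sha256:<digest-4.18>
      conditions:
        - type: Ready        # copied to every control plane node registry
          status: "True"
          reason: CopyCompleted
        - type: InUse        # referenced by ClusterVersion (desired or recent history)
          status: "True"
          reason: CurrentClusterVersion
    - name: ocp-release-bundle-4.19.0-x86_64
      image: quay.io/openshift-release-dev/ocp-release@sha256:<digest-4.19>
      conditions:
        - type: Ready
          status: "False"
          reason: CopyInProgress
          message: "2/3 control plane nodes have pulled the release contents"
        - type: InUse
          status: "False"
          reason: NotReferenced
```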
> …extended RHCOS ISO is attached, and then mount it to launch a temporary registry to serve its content. After the user modified the IRI resource with the new release pullspec, the new OCP release image payload will be copied on each node registry using skopeo.
Is skopeo pointed directly at the host service, or does it also get an IDMS entry so that skopeo thinks it's talking to quay.io? If it had an IDMS entry, could the IRI controller be used, unaltered, in a connected environment, because it would eventually synchronize from the origin?
There's something to be said for that, and also something to be said for not doing that because users who are connected but bandwidth-constrained will hate us if we download data over the network when they already supplied it locally 😄
Either way, the answer needs to be made explicit in this proposal.
They'd have supplied it locally via an IDMS pointed at their local registry, right? I assume there's a way to say "never even attempt to reach the origin, fail instead"?
Sadly I don't think there is.
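For reference, a sketch of the copy step described in the quoted text above; the registry endpoints, repository name, port, auth file and digest/tag are all illustrative:

```bash
# MCD-side sketch: copy the new release payload from the temporary registry
# (on the node with the ISO attached) into the node-local registry.
skopeo copy --all --preserve-digests \
  --src-tls-verify=false \
  --dest-authfile /etc/iri-registry/auth.json \
  docker://<temp-registry-host>:5000/openshift/release-images@sha256:<release-digest> \
  docker://localhost:22625/openshift/release-images:4.19.0-x86_64
```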
> …adopted by the Machine Config Server (MCS). This is a new requirement that will need to be properly documented. In addition, to minimize the risks, the registry will have to be secured via a pull-secret.
How would a user configure the appropriate pull secrets that will be allowed to pull from this registry?
Do we really need a user to do that? Could auto-generating the credentials (during the registry setup) and eventually saving them into an openshift-config/iri-registry CM be enough? I don't expect the user to have any specific need to access it (from outside the cluster), though.
We'll probably need to document how the user can force the credentials to be rotated.
I imagine that at installation time we'll generate credentials and save them in a Secret, as well as add the auth data to the cluster's pull secret.
If the user wants to rotate, they'll need to delete the Secret (MCO will automatically replace it?) and manually update the cluster pull secret.
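A hedged sketch of that rotation flow; the Secret name and namespace are assumptions, as is the generated Secret being in dockerconfigjson form, while updating the global pull secret via `oc set data secret/pull-secret -n openshift-config` follows the documented mechanism:

```bash
# 1. Delete the generated Secret so the controller re-creates it with fresh
#    credentials (hypothetical name/namespace).
oc delete secret iri-registry-credentials -n openshift-machine-config-operator

# 2. Merge the regenerated auth entry into the cluster-wide pull secret.
oc get secret iri-registry-credentials -n openshift-machine-config-operator \
  -o jsonpath='{.data.\.dockerconfigjson}' | base64 -d > /tmp/iri-auth.json
oc get secret pull-secret -n openshift-config \
  -o jsonpath='{.data.\.dockerconfigjson}' | base64 -d > /tmp/pull-secret.json
jq -s '.[0] * .[1]' /tmp/pull-secret.json /tmp/iri-auth.json > /tmp/merged.json
oc set data secret/pull-secret -n openshift-config \
  --from-file=.dockerconfigjson=/tmp/merged.json
```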
This sounds like an important workflow to document in the EP
…operations.md Co-authored-by: Pablo Alonso Rodriguez <[email protected]>
> To allow pulling images from the cluster, an ImageDigestMirrorSet (IDMS) policy will be added to the cluster manifests - using the `api-int` DNS entry - so that the set of registries running within the control plane nodes will become the nominal pullspec of the current release…
This is ambiguous.
We will certainly need an IDMS because the containers inside the release payload refer to quay images by digest, and we will need to get them from the mirror.
It doesn't follow from this that the nominal pullspec (the one the release image is referred to as in CVO) is using api-int, so the word 'so' is out of place here. It could be any of:
- quay.io with tag + ITMS (possibly has to be this to use the upgrade graph checking in CVO?)
- quay.io with digest (covered by existing IDMS)
- api-int with tag
- api-int with digest
We need to be explicit about which.
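For the by-digest part, a sketch of what such an IDMS could look like; the mirror host, port and resource name are assumptions, and which form the nominal pullspec takes is exactly the open question above:

```yaml
apiVersion: config.openshift.io/v1
kind: ImageDigestMirrorSet
metadata:
  name: internal-release-mirror   # illustrative name
spec:
  imageDigestMirrors:
    - source: quay.io/openshift-release-dev/ocp-release
      mirrors:
        - api-int.<cluster-domain>:22625/openshift-release-dev/ocp-release
    - source: quay.io/openshift-release-dev/ocp-v4.0-art-dev
      mirrors:
        - api-int.<cluster-domain>:22625/openshift-release-dev/ocp-v4.0-art-dev
```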
> …desired release entry value.
>
> A ValidatingAdmissionPolicy will be added to prevent the user from removing a release entry that is currently being used by the cluster (as reported by…
We can look at ClusterVersion's status.desired.image to get the target version.
If an upgrade is in progress... presumably we have to look at status.history and include every release back to the last one with status=Completed? (Not looking forward to seeing the VAP code for this 😬)
Also how can we tell if the release is using the internal registry? Presumably we need to use api-int as the nominal pullspec to tell them apart from ones pulled directly from the Internet.
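For illustration only, a minimal VAP along those lines might look like the following; the IRI API group and resource are placeholders, the release entries are assumed to carry an `image` field, and the expression only handles the simple desired-image case (walking `status.history` would be considerably hairier, as noted above):

```yaml
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicy
metadata:
  name: iri-protect-in-use-release
spec:
  failurePolicy: Fail
  paramKind:
    apiVersion: config.openshift.io/v1
    kind: ClusterVersion
  matchConstraints:
    resourceRules:
      - apiGroups: ["<iri-api-group>"]       # placeholder
        apiVersions: ["*"]
        operations: ["UPDATE"]
        resources: ["internalreleaseimages"] # placeholder
  validations:
    - expression: >-
        object.spec.releases.exists(r, r.image == params.status.desired.image)
      message: "the release currently in use by the cluster cannot be removed"
---
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicyBinding
metadata:
  name: iri-protect-in-use-release
spec:
  policyName: iri-protect-in-use-release
  validationActions: ["Deny"]
  paramRef:
    name: version                 # the ClusterVersion singleton
    parameterNotFoundAction: Deny
```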
> ##### Removal of InternalReleaseImage resource
>
> The deletion of the IRI `cluster` resource will be handled by a finalizer, and it will stop all the registry services running on the control plane nodes.
Does 'stop' here mean systemctl stop + systemctl disable? Or is it 'roll out a new MachineConfig without the service'?
If the former, how will we get it to stick without the drift detection re-enabling it?
From an implementation point of view, it will be handled by the IRI controller by deleting the local registry MachineConfig resource
|  | ||
|
|
||
| A systemd service injected by MCO will ensure the registry will be available | ||
| after the reboot. |
During an upgrade, MCO also needs to replace the registry container image with the one from the new release image.
> A systemd service injected by MCO will ensure the registry will be available after the reboot. The registry will have to run on a host network port on each node to ensure it's accessible from the outside, specifically port 22625 (to be [registered](https://github.com/openshift/enhancements/blob/master/dev-guide/host-port-registry.md)).
Suggest registering as part of this PR 🙂
> Finally, a new `InternalReleaseImage` custom resource, managed by MCO, will also be added as a cluster manifest to keep track and manage the currently stored internal release image.
Add details of the OLM aspect of the installation here?
Do you maybe mean reporting the list of what exactly was included in the current release bundle? If so, I will update the proposal soon with more details on this topic (see also this comment).
cc @danmanor for additional comments
Added a more detailed description for its usage, and proposed a new release identifier (automatically discovered) to be used for adding a new release bundle
pablintino left a comment
Adding some minor comments. Tomorrow I'll give this a second read, but I think it's OK from an MCO point of view.
```yaml
releases:
  - name: ocp-release-bundle-4.18.0-x86_64
    image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:5bca02661d61955b62889e7e2e648905b7202d5788f5ba5ab69055a73dffdb5c
    conditions:
```
nit/question: I'm a bit surprised to see conditions outside the status.conditions section. Although it makes sense, I think it's worth double-checking whether we already have other examples like this one. If this usage of conditions is fine, should we also report, among the global conditions, one indicating whether the IRI is ready or not from a node perspective?
I went for conditions as the most flexible way to describe the story of each individual release bundle on a given node, useful especially for troubleshooting in case of issues. The alternative could be to switch to a more rigid model with a predefined set of fields. Initially I checked the PinnedImageSet example, but it was slightly different.
Anyhow, with the conditions approach I wouldn't see the need for additional fields to report the global condition, as the IRI controller could look for a specific condition reason.
Perhaps to help differentiate we can tweak the naming
yuqi-zhang left a comment
Some more questions inline
```yaml
availableReleases:
  - name: ocp-release-bundle-4.19.0-x86_64
releases:
  - name: ocp-release-bundle-4.18.0-x86_64
    image: quay.io/openshift-release-dev/ocp-release@sha256:5bca02661d61955b62889e7e2e648905b7202d5788f5ba5ab69055a73dffdb5c
  - name: ocp-release-bundle-4.19.0-x86_64
    image: quay.io/openshift-release-dev/ocp-release@sha256:3482dbdce3a6fb2239684d217bba6fc87453eff3bdb72f5237be4beb22a2160b
```
Maybe we want to think more about the naming of these two fields. "Available" to me suggests this is something I could use, and I'm not sure exactly what the second releases field is for.
If I've read correctly, availableReleases is the currently mounted release, and releases is the list of stored releases in the local registry, right?
Yes. In particular status.availableReleases indicates the currently mounted and not yet installed bundles, and it's meant to provide an auto-discovery facility to the end user.
We may want to think of a name that better distinguishes this from releases that are already installed. mountedReleases?
"Available" to me is too close to "I can use this for my cluster now", just thinking about how we use the term available in other contexts.
maybe a more explicit installableReleases?
Just my 2 cents: if we grab the idea from the CVO ClusterVersion CR, which already has the concept of candidates, maybe availableReleases would be more aligned. IMHO mountedReleases exposes the mount implementation detail of how the feature internally works.
As I understand it, one of the ambiguities with the word "available" comes from the uncertainty over whether it's "available for the cluster to use" or "available to be pulled in, so that the cluster can use it".
So the argument for disambiguation would be mountedRelease vs. locallyAvailableRelease, although I feel like that wordplay might make it more confusing than it helps, since mounted could also be understood as "locally mounted" or similar.
Alternatively, like Pablo mentions, the CVO considers what is "available to be upgraded to, but not yet on the cluster" as available, so maybe following that pattern would be simpler (and we can clarify via godocs). Then we'd get availableReleases vs. activeReleases/localReleases or something like that.
(I guess what I'm saying is I feel like neither field should just be releases, to help differentiate, but I have no strong opinion beyond that.)
I'd tend to agree with @pablintino, as mountedReleases seems too technical to me as well.
@andfasano: all tests passed!
> The release image and operators images will be stored inside the extended RHCOS ISO in the storage format used by the [distribution/distribution](https://github.com/distribution/distribution) registry's filesystem storage plugin. The Appliance builder will generate a new `release-bundle` (signed) image…
Is it just a blank image with the bundle.json file? What's the process of signing this container image?
Currently yes, just the bundle.json file. I'm not aware of the specific details of the signing process, but I do expect we'll have to address it while building the OVE ISO in Konflux (I'd expect something similar to CVO; cc @bfournie in case he has more details about it).
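For reference, the quoted text above says the ISO ships the content in the distribution registry's filesystem storage format; that on-disk layout looks roughly like this (repository and tag names are placeholders):

```text
docker/registry/v2/
  blobs/sha256/<xx>/<digest>/data            # deduplicated blobs (layers, configs, manifests)
  repositories/<repo-name>/
    _layers/sha256/<digest>/link             # layer blobs linked into the repository
    _manifests/revisions/sha256/<digest>/link
    _manifests/tags/<tag>/current/link       # tag -> manifest digest
```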
> [Assisted Service](https://github.com/openshift/assisted-service/) and [Assisted Installer](https://github.com/openshift/assisted-installer) will be enhanced accordingly to take care of this step, and to ensure it will be properly monitored during the installation process.
So the benefit of this approach is basically the ability to precisely monitor the installation, right?
If so, I just wonder about the priority of such an effort. I mean, it sounds like the right direction, but it's a pretty major change. Do you think it's a mandatory change that should be handled first? Compared to the current behavior, do we get any advantages other than being able to fail more gracefully?
Just want to understand whether it's merely a nice-to-have initiative :/
IMO it's not a nice-to-have feature but a critical step to make the installation workflow more robust and properly monitored.
Copying 40GB+ of data takes a good amount of time on a bare metal instance (up to 1 hour as per the current testing), and several things can go wrong during the process, with the potential risk of leaving the data incomplete or corrupted (as already experienced in the 4.19 DP release). The user should therefore be properly informed that everything went fine (or that some problem happened), to avoid later issues during the joining process, which are usually very painful to troubleshoot.
During the installation phase, the Assisted Installer is currently the eligible actor on each node responsible for carrying out all the required tasks (such as writing the RHCOS image), and monitoring is already supported via communication with the Assisted Service, which the user can consume via the openshift-install wait-for command or via the local Assisted UI. In this way we could also ensure that, prior to rebooting a node to let it join the control plane, all the data required for it to work will effectively be available.
Inactive enhancement proposals go stale after 28d of inactivity. See https://github.com/openshift/enhancements#life-cycle for details. Mark the proposal as fresh by commenting `/remove-lifecycle stale`. If this proposal is safe to close now please do so.

/lifecycle stale

/remove-lifecycle stale
/draft
This patch contains the enhancement proposal for simplified cluster operations in disconnected environments.