OCPCLOUD-3344: Update installing-cluster-api-components-in-ocp #1918
> Runtime:
> * At startup, CAPI installer reads a list of provider images associated with the current OpenShift release
> * CAPI installer pulls these images and extracts manifests and metadata from the `/capi-operator-manifests` directory
> * The revision controller creates a new Revision that references all manifests relevant to the current cluster
> * The installer controller installs the new Revision using Boxcutter
> * Once successful, the installer controller deletes orphaned artifacts associated with previous Revisions
This may be further on, but so far not seeing how we exclude CRDs in this flow
I haven't included it, yet. I have an idea of how I will incorporate it (essentially by pulling unmanaged CRDs into a virtual initial component that installs CompatibilityRequirements and waits for them to be Admitted and Compatible), but I wanted to focus on getting the rest of the code written and functional before spending too much time on the detail.
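As a rough sketch of the readiness check such a virtual initial component might need, assuming CompatibilityRequirements surface `Admitted` and `Compatible` conditions in `status.conditions` (the type and condition names here come only from this comment, not a settled API):

```go
package capiinstall

import (
	"k8s.io/apimachinery/pkg/api/meta"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// compatibilityRequirementReady reports whether a hypothetical
// CompatibilityRequirement has been both Admitted and found Compatible.
// The condition types are assumptions taken from the comment above.
func compatibilityRequirementReady(conditions []metav1.Condition) bool {
	return meta.IsStatusConditionTrue(conditions, "Admitted") &&
		meta.IsStatusConditionTrue(conditions, "Compatible")
}
```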
> unmanagedCustomResourceDefinitions:
> - machines.cluster-api.x-k8s.io
> status:
>   currentRevision: 4.22.5-0d2d3148cd1faa581e3d2924cdd8e9122d298de41cda51cf4e55fcdc1f5d1463-1
What does the hash represent in this name?
This is so we can reflect spec changes in a uniquely-named new revision. For example, the spec currently contains `unmanagedCustomResourceDefinitions`: we would want a new revision if that value is altered, even if the manifests have not changed. Additionally, I anticipate a future where the spec contains a list of additional platforms to install. Again, we'd want to be able to reflect this in a new revision.
Why not use the generation of the resource instead of a hash? This is what other revision-based rollouts do in OpenShift. The generation will increment every time there is a spec update (and on annotation and label changes, IIRC).
The resource generation won't change on upgrade, although the ContentID referenced from provider images will.
In the implementation I've truncated it to 8 characters. It doesn't look too nasty.
You said the hash was of the spec, but the spec of what? I thought you meant the spec of the ClusterAPI resource, in which case generation would change at the same cadence as the hash.
TODO: Clarify somewhere in this document which 'content' is covered by this 'ContentID'.
It's the fully substituted manifests to be applied, in the order they will be applied, plus any modifications to those APIs based on the spec (e.g. unmanagedCRD). So the source is not just the spec, but also content pulled directly from provider images.
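To make the naming concrete, here is a minimal sketch of how such a name could be derived. The hash inputs follow the description above (the substituted manifests in apply order plus spec-driven modifications); the function shape, and showing the digest untruncated, are assumptions rather than the actual implementation (a comment above mentions truncating it to 8 characters).

```go
package capiinstall

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// revisionName builds a revision name of the form <release>-<contentID>-<sequence>,
// mirroring the example "4.22.5-0d2d...-1" quoted above. contentID is a digest over
// the fully substituted manifests in the order they will be applied, plus any
// spec-driven modifications (e.g. unmanagedCustomResourceDefinitions), so either a
// manifest change or a spec change produces a new, uniquely-named revision.
func revisionName(release string, orderedManifests [][]byte, specModifications []byte, sequence int) string {
	h := sha256.New()
	for _, m := range orderedManifests {
		h.Write(m)
	}
	h.Write(specModifications)
	contentID := hex.EncodeToString(h.Sum(nil))
	return fmt.Sprintf("%s-%s-%d", release, contentID, sequence)
}
```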
> The installer will process every profile in every provider image.
> The purpose of implementing multiple profiles is to enable future selection of different manifests based on FeatureGates.
> Note that FeatureGate support is not intended for the initial implementation.
How do we know which profile to install?
profile is a placeholder name. It is the name I'm currently giving to a single (manifest + metadata) within an image. An image may contain arbitrarily many (manifest + metadata)s. I'm calling them profiles until somebody comes up with a better name.
The revision controller will consider all profiles from all images equally. It will add any profile whose metadata matches its selection criteria, meaning that arbitrarily many profiles may be installed from a single image. It would be the responsibility of the provider to ensure that the metadata is mutually exclusive if only one is required.
Ok, so each profile will have some sort of metadata file that contains, for example, a featuregate that must be enabled to apply the manifests within? So the revision controller isn't looking at the name of the profile, but selecting based on metadata in this file.
I am not following the profiles much; how will we be consuming these?
- Will the user specify the desired profile in the spec somehow, based on the specific feature gates they desire to enable?
- Are these upstream or downstream feature gates?
- What happens to the ENV vars currently used in the CAPI manifests to select the FeatureGates?
@damdo It's what @JoelSpeed said. The revision controller doesn't pre-select profiles; it looks at all of them. So if an image contains a default and a techpreview profile, the revision controller doesn't care what they're called and doesn't distinguish between them. It processes both of them equally. It's down to the provider maintainer to ensure that the metadata included in a profile ensures that profile is only installed when it is required.
These would be downstream featuregates. The featuregate support isn't implemented or even designed, yet. We should copy an existing selection pattern from another component. There are a few to choose from.
The current CAPI installer doesn't actually support setting envsubst vars by feature gate. It sets one environment variable unconditionally: https://github.com/openshift/cluster-capi-operator/blob/88febbea0a8ff1d432a6b26828741ce30b71b295/cmd/cluster-capi-operator/main.go#L382-L397
I'm thinking I might actually move envsubst values into profile metadata, which would allow them to:
- be specified in the provider repo, not cluster-capi-operator
- be targeted at their specific provider
- be specified differently according to any other selection criteria we implement for profiles, e.g. featuregates
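To make the shape of this concrete, here is a hypothetical sketch of what such profile metadata and its selection rule could look like. The field names, the feature-gate criterion, and the envsubst map are placeholders drawn from this thread, not a settled format:

```go
package capiinstall

// ProfileMetadata is a hypothetical sketch of a per-profile metadata file as
// discussed above. Field names are placeholders, not a committed contract.
type ProfileMetadata struct {
	// RequiredFeatureGates lists downstream feature gates that must all be
	// enabled for this profile's manifests to be included in a revision.
	RequiredFeatureGates []string `json:"requiredFeatureGates,omitempty"`

	// EnvsubstValues are provider-specific substitution values, moved here
	// from cluster-capi-operator so that the provider repo owns them.
	EnvsubstValues map[string]string `json:"envsubstValues,omitempty"`
}

// profileSelected sketches the selection rule described in this thread: the
// revision controller never looks at a profile's name, it includes every
// profile whose metadata matches the cluster's enabled feature gates.
func profileSelected(meta ProfileMetadata, enabledGates map[string]bool) bool {
	for _, gate := range meta.RequiredFeatureGates {
		if !enabledGates[gate] {
			return false
		}
	}
	return true
}
```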
damdo left a comment:
Thanks @mdbooth
Looking good! A bunch of nits
> - Failure to apply manifests on upgrade
> - CAPI components are already installed.
>   The cluster has been upgraded.
>   The CAPI installer fails to apply the new manifests
> - Depending on the nature of the failure, it is possible that the CAPI components from the previous version will continue to operate correctly.
> - However, if the failure is because an upgraded Deployment has entered a crash loop, it will not be possible to provision new Machines in the cluster.
> - Either way, the `cluster-api` ClusterOperator will be marked Degraded with a Message indicating an installation failure.
> - The CAPI installer logs will contain details of the failure.
Will this mechanism also bubble up to the cluster-api's ClusterOperator conditions if a Deployment has issues at Runtime (not just during the installation)?
It should, yes. However, it will need a safety buffer so it doesn't trip immediately during normal temporary interruptions.
I've implemented something similar for the ephemeral errors reported by the Revision controller's Progressing condition. Essentially an ephemeral error like a 5xx from kube-apiserver is reported as Progressing=True, Reason=EphemeralError, Message=. If this situation persists for 5 minutes we also go Degraded. But we don't want to go Degraded immediately just because kube-apiserver hiccoughed.
I've actually implemented this logic directly in the Revision controller, but I wonder if I should instead move it to the (yet to be written) code which aggregates conditions in the ClusterOperator. i.e. The Revision controller would only ever set its Progressing condition, and the aggregator would upgrade that to Degraded after 5 minutes.
We could do the same for the installer controller: it would set the Progressing condition to true if it observes a Probe failure (which I'm pretty sure is how it will see a runtime interruption of an operand Deployment). The same aggregator can then upgrade that to Degraded only if it persists for a certain length of time.
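A minimal sketch of that aggregation rule, assuming standard `metav1.Condition` types; the helper name and structure are illustrative, and only the condition semantics and the 5-minute grace period come from the comment above:

```go
package capiinstall

import (
	"time"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

const ephemeralErrorGracePeriod = 5 * time.Minute

// shouldDegrade sketches the proposed rule: a controller only ever reports
// Progressing=True with Reason=EphemeralError, and the ClusterOperator
// aggregator upgrades that to Degraded only once the condition has persisted
// beyond the grace period.
func shouldDegrade(progressing *metav1.Condition, now time.Time) bool {
	if progressing == nil ||
		progressing.Status != metav1.ConditionTrue ||
		progressing.Reason != "EphemeralError" {
		return false
	}
	return now.Sub(progressing.LastTransitionTime.Time) >= ephemeralErrorGracePeriod
}
```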
> the aggregator would upgrade that to Degraded after 5 minutes.

+1 from me on this pattern.
> - Failure to apply revision after some components successfully applied
> - While applying a revision, some components were installed but one failed.
> - Depending on the nature of the failure, the component that failed to apply may or may not be in a working state.
> - If it is not in a working state, this will most likely result in an inability to provision new Machines using Cluster API.
> - The `cluster-api` ClusterOperator will be marked Degraded with a Message indicating an installation failure.
> - The CAPI installer logs will contain details of the failure.
It'd be nice if there was a way in k8s to atomically apply a set of manifests and roll them back if any fails.. alas
We could actually do that, tbh: we'd create a new revision which is identical to the currently applied revision. It would then re-apply the old revision before removing any dangling objects from the unsuccessfully applied revision.
We'd just need to come up with some way to poke that logic in the revision controller. Perhaps a spec.pin: CurrentAppliedRevision 🤔
We can add that later, though.
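For illustration only, a sketch of what the hypothetical `spec.pin` knob mentioned above might look like as a typed field; nothing here is a committed API:

```go
package capiinstall

// PinPolicy is a hypothetical spec field sketching the rollback idea above:
// when set to CurrentAppliedRevision, the revision controller would create a
// new revision identical to the currently applied one, re-apply it, and then
// prune dangling objects from the unsuccessfully applied revision.
type PinPolicy string

const (
	// PinNone is the default: new revisions track new content as usual.
	PinNone PinPolicy = ""
	// PinCurrentAppliedRevision asks the controller to regenerate the
	// currently applied revision's content rather than moving forward.
	PinCurrentAppliedRevision PinPolicy = "CurrentAppliedRevision"
)
```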
> * *PROs*: it would ensure compatibility with the clusterctl ecosystem and format/contract
> * *CONs*: it would require a lot of extra complexity (~500 more LOC) and dependencies to achieve a similar result
>
> ## Reference
Is this becoming a downstream CAPI provider contract? If so, should we mention this?
Yes, this is absolutely a downstream CAPI provider contract. I wrote it to be exactly that. I don't know if this is the best place for it to live, but it's a place it can live. If I were looking for it this is one of the places I'd look, so I figured I'd put it here.
> This must use the `registry.ci.openshift.org` registry.
> `manifests-gen` will return an error if it detects any image reference which does not use `registry.ci.openshift.org`.
>
> `protect-cluster-resource` indicates the name of the infrastructure cluster resource type which the CAPI operator will create for this provider.
Protect? In what way?
The next sentence is intended to explain that.
I didn't add this to manifests-gen, btw, it was already there! Not the command line option, but the behaviour.
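For the `registry.ci.openshift.org` requirement quoted above, here is a minimal sketch of the kind of validation `manifests-gen` is described as performing; the helper name and error message are illustrative, only the registry requirement comes from the document:

```go
package manifestsgen

import (
	"fmt"
	"strings"
)

const requiredRegistry = "registry.ci.openshift.org"

// validateImageRef returns an error for any image reference that does not use
// the required CI registry, matching the behaviour described above.
func validateImageRef(ref string) error {
	if !strings.HasPrefix(ref, requiredRegistry+"/") {
		return fmt.Errorf("image reference %q does not use %s", ref, requiredRegistry)
	}
	return nil
}
```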
> * Set the namespace of all namespaced objects to `openshift-cluster-api`
> * Replace cert-manager with OpenShift Service CA:
>   Upstream CAPI providers typically include `cert-manager` metadata and manifests for webhooks.
>   `manifests-gen` will automatically translate `cert-manager` manifests and metadata to use OpenShift Service CA instead.
> * Exclude `Namespace` and `Secret` objects from the manifests: we expect these to be created by other means.
> * The following set of changes to all Deployments:
>   * Set the annotation `target.workload.openshift.io/management: {"effect": "PreferredDuringScheduling"}`.
>   * Set resource requests of all containers to `cpu: 10m` and `memory: 50Mi`.
>   * Remove resource limits from all containers.
>   * Set the terminationMessagePolicy of all containers to `FallbackToLogsOnError`.
I worry this is likely to become stale quickly.
Is it worth referencing some external resource instead? Maybe an AI-generated summary of the changes made by manifests-gen? Maybe we can add it to the AGENTS.md rules for CCAPIO.
Can we try to solve that problem later?
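For readers skimming the quoted list of Deployment changes above, a minimal sketch of what that transformation could look like with the standard `k8s.io/api` types. The function name is hypothetical, and applying the workload annotation to the pod template (rather than the Deployment object itself) is an assumption:

```go
package manifestsgen

import (
	appsv1 "k8s.io/api/apps/v1"
	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
)

// transformDeployment sketches the per-Deployment changes listed in the quoted
// section above; the actual manifests-gen code may be structured differently.
func transformDeployment(d *appsv1.Deployment) {
	// Workload management annotation (applied to the pod template here; the
	// exact target object is an assumption).
	if d.Spec.Template.Annotations == nil {
		d.Spec.Template.Annotations = map[string]string{}
	}
	d.Spec.Template.Annotations["target.workload.openshift.io/management"] = `{"effect": "PreferredDuringScheduling"}`

	for i := range d.Spec.Template.Spec.Containers {
		c := &d.Spec.Template.Spec.Containers[i]
		// Fixed requests, no limits.
		c.Resources.Requests = corev1.ResourceList{
			corev1.ResourceCPU:    resource.MustParse("10m"),
			corev1.ResourceMemory: resource.MustParse("50Mi"),
		}
		c.Resources.Limits = nil
		// Surface container logs on abnormal termination.
		c.TerminationMessagePolicy = corev1.TerminationMessageFallbackToLogsOnError
	}
}
```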
> In summary:
>
> This proposal does not seek to allow installing multiple infrastructure CAPI providers. Although we want to design things in a way so that we do not inhibit this being implemented in the future.
> * Add support for image-based CAPI manifests to the CAPI installer
Do we explicitly explain what this means (it may help with understanding the following sections), and why our current system is bad (motivation)?
> Runtime:
> * At startup, CAPI installer reads a list of provider images associated with the current OpenShift release
IIUC, images have manifests baked in now - so we basically do this:
$ oc adm release info registry.ci.openshift.org/ocp/release:<release version>
Name: 4.22.0-0.ci-2026-01-22-020420
Digest: sha256:33d624609594a63555fbd5496a8d80333268ddc441550a08bc2ce758ed795244
Created: 2026-01-22T02:04:39Z
OS/Arch: multi (linux/amd64)
Manifests: 837
Metadata files: 2
....
....
Images:
NAME DIGEST
agent-installer-api-server sha256:621fd9923cb65f6a3c51df07b2ecb46c0a8f569a3dec83bacdf76a84c01df6d1
agent-installer-csr-approver sha256:c54f4030017755cc69f4e334bd054af2fbbae857e99c4bdf9d92d70502e64a8f
agent-installer-node-agent sha256:a6d9228934f501db46be58543a206c10a3b5374f50710f3788edbd911b9bf0f3
agent-installer-orchestrator sha256:abde659d8a19a92742d06b155bf8c1a148f7c54841f1033261e96a1d020768e8
...
...and get a map of component -> image SHA, which we can then pull and extract manifests from `/capi-operator-manifests`.
This means on upgrade we know which manifests are in release n-1 as well as n.
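Under those assumptions, the extraction step might look roughly like this; the directory walk, the `.yaml` suffix filter, and single-document decoding are simplifications for illustration, not the actual installer code:

```go
package capiinstall

import (
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
	"strings"

	"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
	"sigs.k8s.io/yaml"
)

// loadManifests walks an extracted provider image's /capi-operator-manifests
// directory and decodes every .yaml file into an unstructured object.
// Multi-document YAML files are not handled in this sketch.
func loadManifests(root string) ([]*unstructured.Unstructured, error) {
	var objs []*unstructured.Unstructured
	err := filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error {
		if err != nil || d.IsDir() || !strings.HasSuffix(path, ".yaml") {
			return err
		}
		data, readErr := os.ReadFile(path)
		if readErr != nil {
			return readErr
		}
		obj := &unstructured.Unstructured{}
		if unmarshalErr := yaml.Unmarshal(data, obj); unmarshalErr != nil {
			return fmt.Errorf("decoding %s: %w", path, unmarshalErr)
		}
		objs = append(objs, obj)
		return nil
	})
	return objs, err
}
```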
> Runtime:
> * At startup, CAPI installer reads a list of provider images associated with the current OpenShift release
> * CAPI installer pulls these images and extracts manifests and metadata from the `/capi-operator-manifests` directory
> * The revision controller creates a new Revision that references all manifests relevant to the current cluster
Can we go into some more detail here? I'm not following: what's an example revision here, and what is it generated from?
Edit: just saw the TODO below.