Conversation

@mdbooth
Contributor

@mdbooth mdbooth commented Jan 8, 2026

No description provided.

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Jan 8, 2026
@openshift-ci-robot

openshift-ci-robot commented Jan 8, 2026

@mdbooth: This pull request references OCPCLOUD-3344 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set.


In response to this:

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci openshift-ci bot requested review from adambkaplan and jupierce January 8, 2026 14:08
@openshift-ci
Contributor

openshift-ci bot commented Jan 8, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign moadz for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@mdbooth mdbooth marked this pull request as draft January 8, 2026 14:16
@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jan 8, 2026
Comment on lines 110 to 115
Runtime:
* At startup, CAPI installer reads a list of provider images associated with the current OpenShift release
* CAPI installer pulls these images and extracts manifests and metadata from the `/capi-operator-manifests` directory
* The revision controller creates a new Revision that references all manifests relevant to the current cluster
* The installer controller installs the new Revision using Boxcutter
* Once successful, the installer controller deletes orphaned artifacts associated with previous Revisions
Contributor

This may be covered further on, but so far I'm not seeing how we exclude CRDs in this flow

Contributor Author

I haven't included it, yet. I have an idea of how I will incorporate it (essentially by pulling unmanaged CRDs into a virtual initial component that installs CompatibilityRequirements and waits for them to be Admitted and Compatible), but I wanted to focus on getting the rest of the code written and functional before spending too much time on the detail.

unmanagedCustomResourceDefinitions:
- machines.cluster-api.x-k8s.io
status:
currentRevision: 4.22.5-0d2d3148cd1faa581e3d2924cdd8e9122d298de41cda51cf4e55fcdc1f5d1463-1
Contributor

What does the hash represent in this name?

Contributor Author

This is so we can reflect spec changes in a uniquely-named new revision. For example, the spec currently contains unmanagedCustomResourceDefinitions: we would want a new revision if that value is altered, even if the manifests have not changed. Additionally, I anticipate a future where the spec contains a list of additional platforms to install. Again, we'd want to be able to reflect this in a new revision.

Contributor

Why not use the generation of the resource instead of a hash? This is what other revision-like rollouts do in OpenShift. The generation will increment every time there is a spec update (and annotations and labels, IIRC).

Contributor Author

The resource generation won't change on upgrade, but the ContentID, which is derived from the provider images, will.

In the implementation I've truncated it to 8 characters. It doesn't look too nasty.

Contributor

You said the hash was of the spec, but the spec of what? I thought you meant the spec of the ClusterAPI resource, in which case the generation would change at the same cadence as the hash.

Contributor Author

TODO: Clarify somewhere in this document which 'content' is covered by this 'ContentID'.

It's the fully substituted manifests to be applied, in the order they will be applied, plus any modifications to those APIs based on the spec (e.g. unmanagedCRD). So the source is not just the spec, but also content pulled directly from provider images.
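To make that concrete, here is a minimal, purely illustrative Go sketch of how a ContentID and a revision name could be derived; the function names and inputs are assumptions for this example, not the actual cluster-capi-operator code.

```go
package revision

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// contentID hashes the fully substituted manifests in the order they will be
// applied, together with any spec-driven modifications (e.g. the list of
// unmanaged CRDs), so that a change to either produces a different ID.
func contentID(orderedManifests [][]byte, specModifiers []string) string {
	h := sha256.New()
	for _, m := range orderedManifests {
		h.Write(m)
	}
	for _, s := range specModifiers {
		h.Write([]byte(s))
	}
	return hex.EncodeToString(h.Sum(nil))
}

// revisionName combines the OpenShift version, the ContentID (optionally
// truncated, e.g. to 8 characters) and a sequence number, giving names like
// "4.22.5-0d2d3148-1".
func revisionName(version, id string, seq int) string {
	return fmt.Sprintf("%s-%s-%d", version, id, seq)
}
```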

Comment on lines +188 to +190
The installer will process every profile in every provider image.
The purpose of implementing multiple profiles is to enable future selection of different manifests based on FeatureGates.
Note that FeatureGate support is not intended for the initial implementation.
Contributor

How do we know which profile to install?

Contributor Author

profile is a placeholder name. It is the name I'm currently giving to a single (manifest + metadata) pair within an image. An image may contain arbitrarily many (manifest + metadata) pairs. I'm calling them profiles until somebody comes up with a better name.

The revision controller will consider all profiles from all images equally. It will add any profile whose metadata matches its selection criteria, meaning that arbitrarily many profiles may be installed from a single image. It would be the responsibility of the provider to ensure that the metadata is mutually exclusive if only one is required.
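A rough Go sketch of that selection behaviour, assuming a hypothetical Profile type and metadata fields (platform, feature gates); none of these names come from the proposal itself:

```go
package revision

// Profile is one (manifest + metadata) pair extracted from a provider image.
// The field names here are placeholders for illustration only.
type Profile struct {
	Name      string
	Manifests [][]byte
	Metadata  ProfileMetadata
}

// ProfileMetadata carries whatever selection criteria the provider bakes into
// its image, e.g. a target platform or required feature gates.
type ProfileMetadata struct {
	Platform     string
	FeatureGates []string
}

// selectProfiles considers every profile from every image equally and keeps
// any whose metadata matches the cluster. The profile name is never used for
// selection; mutual exclusivity is the provider's responsibility.
func selectProfiles(all []Profile, platform string, enabledGates map[string]bool) []Profile {
	var selected []Profile
	for _, p := range all {
		if p.Metadata.Platform != "" && p.Metadata.Platform != platform {
			continue
		}
		ok := true
		for _, gate := range p.Metadata.FeatureGates {
			if !enabledGates[gate] {
				ok = false
				break
			}
		}
		if ok {
			selected = append(selected, p)
		}
	}
	return selected
}
```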

Contributor

Ok, so each profile will have some sort of metadata file that contains, for example, a featuregate that must be enabled to apply the manifests within? The revision controller isn't looking at the name of the profile, but selecting based on metadata in this file.

Member

I am not following the profile concept much; how will we be consuming these?

  1. Will the user specify the desired profile in the spec somehow, based on the specific feature gates they desire to enable?

  2. Are these upstream or downstream feature gates?

  3. What happens to the ENV vars currently used in the CAPI manifests to select the FeatureGates?

Contributor Author

@damdo It's what @JoelSpeed said. The revision controller doesn't pre-select profiles; it looks at all of them. So if an image contains a default and a techpreview profile, the revision controller doesn't care what they're called and doesn't distinguish between them. It processes both of them equally. It's down to the provider maintainer to write a profile's metadata so that the profile is only installed when it is required.

These would be downstream featuregates. The featuregate support isn't implemented or even designed yet. We should copy an existing selection pattern from another component. There are a few to choose from.

The current CAPI installer doesn't actually support setting envsubst vars by feature gate. It sets one environment variable unconditionally: https://github.com/openshift/cluster-capi-operator/blob/88febbea0a8ff1d432a6b26828741ce30b71b295/cmd/cluster-capi-operator/main.go#L382-L397

I'm thinking I might actually move envsubst values into profile metadata (see the sketch after this list), which would allow them to:

  • be specified in the provider repo, not cluster-capi-operator
  • be targeted at their specific provider
  • be specified differently according to any other selection criteria we implement for profiles, e.g. featuregates
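As a sketch of what that could look like, assuming profile metadata grows a simple key/value map of envsubst values (the field and helper below are hypothetical):

```go
package revision

import "os"

// applyEnvSubst expands ${VAR} references in a raw manifest using values taken
// from the profile's own metadata rather than from cluster-capi-operator.
// Unknown variables expand to the empty string in this simple sketch.
func applyEnvSubst(manifest string, envSubst map[string]string) string {
	return os.Expand(manifest, func(key string) string {
		return envSubst[key]
	})
}
```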

Member

@damdo damdo left a comment

Thanks @mdbooth

Looking good! A bunch of nits

Comment on lines +282 to +289
- Failure to apply manifests on upgrade
- CAPI components are already installed.
The cluster has been upgraded.
The CAPI installer fails to apply the new manifests
- Depending on the nature of the failure, it is possible that the CAPI components from the previous version will continue to operate correctly.
- However, if the failure is because an upgraded Deployment has entered a crash loop, it will not be possible to provision new Machines in the cluster.
- Either way, the `cluster-api` ClusterOperator will be marked Degraded with a Message indicating an installation failure.
- The CAPI installer logs will contain details of the failure.
Member

Will this mechanism also bubble up to the `cluster-api` ClusterOperator's conditions if a Deployment has issues at runtime (not just during installation)?

Contributor Author

It should, yes. However, it will need a safety buffer so it doesn't trip immediately during normal temporary interruptions.

I've implemented something similar for the ephemeral errors reported by the Revision controller's Progressing condition. Essentially an ephemeral error like a 5xx from kube-apiserver is reported as Progressing=True, Reason=EphemeralError, Message=. If this situation persists for 5 minutes we also go Degraded. But we don't want to go Degraded immediately just because kube-apiserver hiccoughed.

I've actually implemented this logic directly in the Revision controller, but I wonder if I should instead move it to the (yet to be written) code which aggregates conditions in the ClusterOperator. i.e. The Revision controller would only ever set its Progressing condition, and the aggregator would upgrade that to Degraded after 5 minutes.

We could do the same for the installer controller. It could set the Progressing condition to true when it observes a Probe failure (which I'm pretty sure is how it will see a runtime interruption of an operand Deployment). The same aggregator can then upgrade that to Degraded only if it persists for a certain length of time.
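A minimal sketch of that aggregation idea, assuming standard metav1 conditions and a hard-coded 5-minute threshold; the condition reason and helper name are illustrative, not the implemented behaviour:

```go
package status

import (
	"time"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// degradedAfter is the safety buffer before an ephemeral problem is promoted
// from Progressing to Degraded.
const degradedAfter = 5 * time.Minute

// shouldDegrade reports whether a Progressing=True condition with an
// ephemeral-error reason has persisted long enough that the aggregator
// should also mark the ClusterOperator Degraded.
func shouldDegrade(progressing *metav1.Condition, now time.Time) bool {
	if progressing == nil || progressing.Status != metav1.ConditionTrue {
		return false
	}
	if progressing.Reason != "EphemeralError" {
		return false
	}
	return now.Sub(progressing.LastTransitionTime.Time) >= degradedAfter
}
```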

Contributor

the aggregator would upgrade that to Degraded after 5 minutes.

+1 from me on this pattern

Comment on lines +291 to +296
- Failure to apply revision after some components successfully applied
- While applying a revision, some components were installed but one failed.
- Depending on the nature of the failure, the component that failed to apply may or may not be in a working state.
- If it is not in a working state, this will most likely result in an inability to provision new Machines using Cluster API.
- The `cluster-api` ClusterOperator will be marked Degraded with a Message indicating an installation failure.
- The CAPI installer logs will contain details of the failure.
Member

It'd be nice if there were a way in k8s to atomically apply a set of manifests and roll them back if any fails... alas

Contributor Author

We could actually do that, tbh: we'd create a new revision which is identical to the currently applied revision. It would then re-apply the old revision before removing any dangling objects from the unsuccessfully applied revision.

We'd just need to come up with some way to poke that logic in the revision controller. Perhaps a spec.pin: CurrentAppliedRevision 🤔

We can add that later, though.
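A purely speculative sketch of how such a pin could short-circuit revision selection; spec.pin and the names below are not part of the current proposal:

```go
package revision

// desiredRevision picks the revision the controller should reconcile towards.
// Pinning the currently applied revision effectively re-applies it, which also
// prunes dangling objects left over from a failed newer revision.
func desiredRevision(specPin, currentApplied, computed string) string {
	if specPin == "CurrentAppliedRevision" && currentApplied != "" {
		return currentApplied
	}
	return computed
}
```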

* *PROs*: it would ensure compatibility with the clusterctl ecosystem and format/contract
* *CONs*: it would require a lot of extra complexity (~500 more LOC) and dependencies to achieve a similar result

## Reference
Member

Is this becoming a downstream CAPI provider contract? If so, should we mention this?

Contributor Author

Yes, this is absolutely a downstream CAPI provider contract. I wrote it to be exactly that. I don't know if this is the best place for it to live, but it's a place it can live. If I were looking for it this is one of the places I'd look, so I figured I'd put it here.

This must use the `registry.ci.openshift.org` registry.
`manifests-gen` will return an error if it detects any image reference which does not use `registry.ci.openshift.org`.

`protect-cluster-resource` indicates the name of the infrastructure cluster resource type which the CAPI operator will create for this provider.
Member

Protect? In what way?

Contributor Author

The next sentence is intended to explain that.

I didn't add this to manifests-gen, btw, it was already there! Not the command line option, but the behaviour.

Comment on lines +395 to +404
* Set the namespace of all namespaced objects to `openshift-cluster-api`
* Replace cert-manager with OpenShift Service CA:
Upstream CAPI providers typically include `cert-manager` metadata and manifests for webhooks.
`manifests-gen` will automatically translate `cert-manager` manifests and metadata to use OpenShift Service CA instead.
* Exclude `Namespace` and `Secret` objects from the manifests: we expect these to be created by other means.
* The following set of changes to all Deployments:
* Set the annotation `target.workload.openshift.io/management: {"effect": "PreferredDuringScheduling"}`.
* Set resource requests of all containers to `cpu: 10m` and `memory: 50Mi`.
* Remove resource limits from all containers.
* Set the terminationMessagePolicy of all containers to `FallbackToLogsOnError`.
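For illustration, a sketch of those Deployment rewrites as they might look in Go, assuming manifests-gen works on decoded appsv1.Deployment objects; the function name is made up, and whether the management annotation lands on the Deployment or its pod template is also an assumption here:

```go
package manifestsgen

import (
	appsv1 "k8s.io/api/apps/v1"
	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
)

// transformDeployment applies the OpenShift-specific rewrites listed above.
func transformDeployment(d *appsv1.Deployment) {
	// Workload partitioning annotation (assumed to be set on the pod template).
	if d.Spec.Template.Annotations == nil {
		d.Spec.Template.Annotations = map[string]string{}
	}
	d.Spec.Template.Annotations["target.workload.openshift.io/management"] = `{"effect": "PreferredDuringScheduling"}`

	for i := range d.Spec.Template.Spec.Containers {
		c := &d.Spec.Template.Spec.Containers[i]
		// Fixed requests, no limits.
		c.Resources.Requests = corev1.ResourceList{
			corev1.ResourceCPU:    resource.MustParse("10m"),
			corev1.ResourceMemory: resource.MustParse("50Mi"),
		}
		c.Resources.Limits = nil
		// Surface container logs in the termination message on error.
		c.TerminationMessagePolicy = corev1.TerminationMessageFallbackToLogsOnError
	}
}
```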
Member

I worry this is likely to become stale quickly.
Is it worth referencing some external resource instead? Maybe an AI-generated summary of the changes made by manifests-gen? Maybe we can add it to the AGENTS.md rules for CCAPIO.

Contributor Author

Can we try to solve that problem later?

In summary:

This proposal does not seek to allow installing multiple infrastructure CAPI providers, although we want to design things in a way that does not inhibit this being implemented in the future.
* Add support for image-based CAPI manifests to the CAPI installer


Do we explicitly explain what this means (it may help with understanding the following sections), and why our current system is bad? (motivation)


![runtime](./assets/proposed-runtime.png)
Runtime:
* At startup, CAPI installer reads a list of provider images associated with the current OpenShift release


IIUC, images have manifests baked in now - so we basically do this:

$ oc adm release info  registry.ci.openshift.org/ocp/release:<release version>
Name:           4.22.0-0.ci-2026-01-22-020420
Digest:         sha256:33d624609594a63555fbd5496a8d80333268ddc441550a08bc2ce758ed795244
Created:        2026-01-22T02:04:39Z
OS/Arch:        multi (linux/amd64)
Manifests:      837
Metadata files: 2
....
....

Images:
  NAME                                           DIGEST
  agent-installer-api-server                     sha256:621fd9923cb65f6a3c51df07b2ecb46c0a8f569a3dec83bacdf76a84c01df6d1
  agent-installer-csr-approver                   sha256:c54f4030017755cc69f4e334bd054af2fbbae857e99c4bdf9d92d70502e64a8f
  agent-installer-node-agent                     sha256:a6d9228934f501db46be58543a206c10a3b5374f50710f3788edbd911b9bf0f3
  agent-installer-orchestrator                   sha256:abde659d8a19a92742d06b155bf8c1a148f7c54841f1033261e96a1d020768e8
...
...

and get a map of component -> image SHA; we can then pull each image and extract its manifests from `/capi-operator-manifests`.

This means that on upgrade we know which manifests are in release n-1 as well as n.

Runtime:
* At startup, CAPI installer reads a list of provider images associated with the current OpenShift release
* CAPI installer pulls these images and extracts manifests and metadata from the `/capi-operator-manifests` directory
* The revision controller creates a new Revision that references all manifests relevant to the current cluster


Can we go into some more detail here? I'm not following: what's an example revision here, and what is it generated from?

Edit: just saw the TODO below.
