Enhancement proposal for Confidential Clusters #1878
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Conversation
[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing /approve in a comment.
Hi @uril. Thanks for your PR. I'm waiting for an openshift member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test. Once the patch is verified, the new status will be reflected by the ok-to-test label. I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
/ok-to-test
Force-pushed from c213033 to 677a330.
/retest
cgwalters left a comment:
Awesome work overall! There's of course a huge amount of detail in some of this, but I think the outline looks good.
> instance
>
> * RHEL CoreOS
>   * Support verifying the integrity of the disk content during re-provisioning
We need to boot in a pure stateless mode here, where we're not accessing any persistent storage for /etc and /var, right?
To address the larger concern that we cannot trust the filesystem itself on the disk on first boot, we need some form of integrity verification that covers the entire partition. It could be implemented in a similar fashion to what is done with Secure Execution on s390x.
If we don't take this concern into account, then we could indeed only read the fs-verity-verified content from the composefs repo and re-generate the /etc and /var content from it.
> * Measure Ignition config in a PCR value, before parsing it
>
> * Machine Config Operator
>   * Ensure that MachineConfigs are only served to attested nodes
I still really hope that we get away from having an MCS at all by shrinking the role of Ignition such that everything needed to join fits into the bootstrap config, which really, really should be able to fit completely in e.g. the AWS instance user-data store and the like.
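As a rough sketch of that direction (not something the enhancement text specifies), a join config small enough for instance user-data could embed the bootstrap kubeconfig inline instead of pointing at an MCS; the path and the base64 placeholder below are illustrative assumptions, and `mode` 384 is decimal for 0600:

```json
{
  "ignition": { "version": "3.4.0" },
  "storage": {
    "files": [
      {
        "path": "/etc/kubernetes/kubeconfig",
        "mode": 384,
        "contents": { "source": "data:;base64,<base64-encoded bootstrap kubeconfig>" }
      }
    ]
  }
}
```

For scale, EC2 user-data tops out around 16 KB, so anything much beyond a kubeconfig and a couple of small unit files would still have to be fetched from somewhere else.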
> node, which is considered trusted and it is used to bootstrap the trust for the
> rest of the cluster.
>
> In phase 2, the bootstrap node itself must be attested to establish trust. It is
Yeah, but won't most people who want to do this actually want HCP anyways? I would definitely put HCP support far in front of this as a priority.
That could be an option. HCP deployments place the trust in the cluster hosting the control plane, so for it to make sense for Confidential Clusters, it would be a configuration with a Hosted Control Plane in a trusted environment (likely a Bare Metal cluster) and HCP workers in a cloud.
If we want everything in the cloud, then we are back to the standalone cluster case for the control plane part, as you cannot claim that your workers are confidential if the control plane is hosted on the same cloud on non-confidential VMs.
> have been pre-computed and stored, or pull the container image itself and
> directly compute the values.
I'd lean towards pull, but of course have a cache of container-sha ➡️ PCRs.
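Purely as an illustration of such a cache (no format for it is defined in the enhancement), an entry keyed by the bootc image digest could pre-record the expected measurements; the PCR indices and field names here are assumptions:

```json
{
  "<rhcos-bootc-image>@sha256:<digest>": {
    "pcr4": "<expected measurement of the UKI as the EFI boot application>",
    "pcr7": "<expected Secure Boot policy measurement>",
    "pcr11": "<expected measurement of the UKI sections extended by systemd-stub>"
  }
}
```

A verifier could consult the cache first and only pull the image to compute the values on a miss.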
Force-pushed from 677a330 to ff97116.
yuqi-zhang left a comment:
Some general comments inline
> ## Proposal
>
> Run all OpenShift nodes on Confidential VMs (CVMs). Use remote attestation to
Basic question: the CVM is the entire node, right? You can't, say, run 2 CVMs on one machine, or run things outside of the CVM on that machine?
Right, in our case, OpenShift node == CVM == OpenShift machine.
The "host machine" (the cloud server) can run many CVMs (and other things).
Yes, the entire node runs as a Confidential VM provided by the cloud provider. You don't control which host your VM runs on (it's a cloud), and you cannot run things outside of it.
> components:
>
> * OpenShift API
>   * Allow nodes to be marked as confidential. This is specific per cloud
As a clarification here, this will be a cluster level setting? Or are you proposing that in one cluster, some nodes can be CVMs and others not?
All the nodes of a confidential cluster are CVMs.
The specific configuration/API for requesting that a cloud provider create a CVM is platform dependent and is not kept at the cluster level.
A mixed cluster of confidential and non-confidential nodes is technically possible, but it is not safe.
It will be cluster wide. A cluster will either be all confidential nodes or none at all. Technically you can mix things, but it does not make sense for a cluster running in a cloud to be mixed.
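To make the platform-specific part concrete, the GCP machine-api provider spec already exposes a confidential-compute knob, roughly as sketched below (rendered as JSON to match the other snippets in this thread; the exact field names and allowed values should be checked against the current MAPI/CAPI types):

```json
{
  "providerSpec": {
    "value": {
      "apiVersion": "machine.openshift.io/v1beta1",
      "kind": "GCPMachineProviderSpec",
      "machineType": "n2d-standard-4",
      "confidentialCompute": "Enabled",
      "onHostMaintenance": "Terminate"
    }
  }
}
```

Other platforms expose different knobs (e.g. SEV-SNP/TDX options on Azure and AWS instance types), which is why the per-node request lives in the provider spec while "all nodes must be confidential" is a cluster-level expectation.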
> reference-values (expected "correct" values) in Trustee.
>
> * RHEL CoreOS
>   * Add support for composefs (native), UKI, and systemd-boot to bootc (Bootable
Since the MCO and on-cluster RHCOS operations don't currently use bootc at all, would that integration be needed here?
We will indeed need "direct" bootc support in the MCO (i.e. not use rpm-ostree at all anymore).
> (cloud provider specific).
> * Deploy the Confidential Cluster Operator on the bootstrap node
>
> * Confidential Cluster Operator
Will this be running as a core payload operator that's always present, or only deployed conditionally?
It will be an operator that is part of the core payload, but it will only run when needed.
> installer, passing in the URL of the external Trustee instance chosen above.
> 1. The OpenShift installer generates a set of configuration files for the
>    external Trustee instance.
> 1. If the cluster creator adds/removes/modifies MachineConfigs, the
Could you clarify on this point? The admin shouldn't be able to "modify configs on the fly" during installation. The MCO has a singular render generation phase, if that's what you're trying to fetch here.
The idea is that the config that will be used for the external Trustee server will include the full config passed to the bootstrap node. If anything modifies this config, then the Trustee config will have to be re-generated. We don't expect the manifests to be modified live during the installation.
> This enhancement introduces some new API extensions:
>
> * **Running nodes on cloud CVMs**:
>   For each supported cloud provider, confidential computing types and code need to
Could you provide some examples for this? Just curious what that would look like in practice.
Also curious if this affects the ongoing MAPI->CAPI transition at all
> In phase 2, the initial configuration will be modified to tell Ignition to fetch
> the new configuration from a remotely attested resource endpoint. The MCS will
> not serve Ignition configs directly for nodes anymore but will store those as
> resources in a Trustee instance. To access those configurations, the node will
Is the Trustee instance responsible for asking the MCS for the contents through some in-cluster proxy, or would we have to have the MCS initiate that?
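As a hedged sketch of the phase-2 flow described in the quoted text, the node's pointer config could merge from Trustee's resource endpoint instead of the MCS. The repository/type/tag path below follows the upstream KBS resource API and is an assumption here, and in practice an attestation client on the node would have to perform the attestation handshake rather than a plain Ignition HTTP fetch:

```json
{
  "ignition": {
    "version": "3.4.0",
    "config": {
      "merge": [
        {
          "source": "https://<trustee>/kbs/v0/resource/<repository>/ignition/worker"
        }
      ]
    }
  }
}
```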
> is created, which hosts a temporary control plane used to create the final
> control plane and worker nodes of the cluster.
>
> In phase 1, the Confidential Cluster Operator is deployed on this bootstrap
Would we GA on phase 1, or would we techpreview phase 1, implement phase 2, and GA the feature after we complete phase 2?
We will dev/tech-preview on phase 1 and phase 2.
I'm not sure about GA after phase 1.
Force-pushed from ff97116 to 766b9a1.
@uril: The following test failed, say /retest to rerun all failed tests:

Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.
> As part of the cluster installation process in cloud platforms, a bootstrap node
> is created, which hosts a temporary control plane used to create the final
> control plane and worker nodes of the cluster.
There is another part of the installation process we should consider, where the installer:
- generates the ignition file for the bootstrap node
- uploads bootstrap ignition to a cloud storage bucket
- puts pointer ignition in the bootstrap node userdata, redirecting the bootstrap node to pull ignition from the storage bucket (using a pre-signed URL, although for Azure I think pre-signed URL support is WIP and Azure uses storage account keys)
It's unclear whether this model will continue to work with the remote attestation service. If the First Boot configuration from the attestation service can be merged alongside the bootstrap ignition bucket, then it would require fewer changes to the installer. For example (pseudo), the bootstrap pointer ignition would be injected with the additional attestation server source:
```json
{
  "ignition": {
    "config": {
      "merge": [
        {
          "source": "http://<registration-service>/ignition"
        },
        {
          "source": "http://<cloud-bucket>/ignition"
        }
      ]
    }
  }
}
```

This would utilize Ignition's merge functionality to grab the configs from both the attestation server and the cloud bucket.
But if it's a requirement that the remote attestation server is contacted first (or only) then the installer would presumably need to be updated to upload bootstrap ignition so it could be served by the attestation service.
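If the attestation service did end up being the only source, the pointer config would collapse to a single merge entry served by it (illustrative only, with the same caveats as the snippet above), which is what would force the installer to upload the bootstrap Ignition to the attestation side:

```json
{
  "ignition": {
    "version": "3.4.0",
    "config": {
      "merge": [
        {
          "source": "http://<registration-service>/ignition"
        }
      ]
    }
  }
}
```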
This enhancement proposes the integration of confidential computing capabilities into OpenShift clusters, enabling the deployment of Confidential Clusters.