Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CORS-3883: Azure Machine Identity API #9538

Open
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

patrickdillon
Copy link
Contributor

@patrickdillon patrickdillon commented Mar 6, 2025

Most significantly, this PR removes the automatic creation of the cluster identity which was attached to control-plane and compute nodes. This identity was originally created in order to authenticate the in-tree cloud-provider-azure. When cloud-provider-azure moved out-of-tree and under the management of the CCMO, it began to use credentials requests to authenticate.

The identity we create for that purpose is now unused. By removing the creation of the identity, we can reduce the permissions required to run an Azure install, paritcularly the User Access Administrator role, which is a significant role.

This PR also extends CAPZ's identity API into the install config machine pool to allow users to customize the identities attached to VMs. The identity will default to None, but we also allow users to BYO UserAssigned identities or to have Azure create a SystemAssigned identity.

For example:

controlPlane:
  platform:
    azure:
      identity:
        type: UserAssigned
        userAssignedIdentities:
        - name: test-capi-id
          resourceGroup: os4-common
          subscription: 433715e6-37fe-4328-af75-3661e13b15fc

uses an existing identity called test-capi-id from the os4-common resource group.

controlPlane:
  platform:
    azure:
      identity:
        type: SystemAssigned

Will create a system-assigned identity and attach it to the VM. It defaults to a contributor role. The system-assigned identity role can also be customized:

controlPlane:
  platform:
    azure:
      identity:
        type: SystemAssigned
        systemAssignedIdentityRole:
          definitionID: /providers/Microsoft.Authorization/roleDefinitions/8e3af657-a8ff-443c-a75c-2fe8c4bcb635

Will give the system-assigned identity the Owner role (the value for definitionID corresponds to the owner role). This will also work for custom roles.

008926c removes any identity from being attached to compute nodes. To configure an identity in MAPI, MAPI expects a user-assigned identity in the machine's resource group. That would only be possible when installing to an existing resource group, but the installer enforces that resource group to be empty. This can be addressed in future work, particularly the MAPI->CAPI transition will allow us to extend this API directly to compute nodes.

Removes automatic creation of the user-assigned identity, as it is
no longer required to authenticate the cloud-provider (CCM handles
this). All subsequent identity-VM relationships will be handled
directly through CAPZ. Subsequent commits will mirror the CAPZ API
to the install config.
@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Mar 6, 2025
@openshift-ci-robot
Copy link
Contributor

openshift-ci-robot commented Mar 6, 2025

@patrickdillon: This pull request references CORS-3883 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the epic to target the "4.19.0" version, but no target version was set.

In response to this:

Most significantly, this PR removes the automatic creation of the cluster identity which was attached to control-plane and compute nodes. This identity was originally created in order to authenticate the in-tree cloud-provider-azure. When cloud-provider-azure moved out-of-tree and under the management of the CCMO, it began to use credentials requests to authenticate.

The identity we create for that purpose is now unused. By removing the creation of the identity, we can reduce the permissions required to run an Azure install, paritcularly the User Access Administrator role, which is a significant role.

This PR also extends CAPZ's identity API into the install config machine pool to allow users to customize the identities attached to VMs. The identity will default to None, but we also allow users to BYO UserAssigned identities or to have Azure create a SystemAssigned identity.

For example:

controlPlane:
 platform:
   azure:
     identity:
       type: UserAssigned
       userAssignedIdentities:
       - name: test-capi-id
         resourceGroup: os4-common
         subscription: 433715e6-37fe-4328-af75-3661e13b15fc

uses an existing identity called test-capi-id from the os4-common resource group.

controlPlane:
 platform:
   azure:
     identity:
       type: SystemAssigned

Will create a system-assigned identity and attach it to the VM. It defaults to a contributor role.

System-assigned identities are also configurable:

controlPlane:
 platform:
   azure:
     identity:
       type: SystemAssigned
       systemAssignedIdentityRole:
         definitionID: Owner

008926c removes any identity from being attached to compute nodes. To configure an identity in MAPI, MAPI expects a user-assigned identity in the machine's resource group. That would only be possible when installing to an existing resource group, but the installer enforces that resource group to be empty. This can be addressed in future work, particularly the MAPI->CAPI transition will allow us to extend this API directly to compute nodes.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci openshift-ci bot requested review from bfournie and jhixson74 March 6, 2025 21:10
Copy link
Contributor

openshift-ci bot commented Mar 6, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please ask for approval from patrickdillon. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci-robot
Copy link
Contributor

openshift-ci-robot commented Mar 6, 2025

@patrickdillon: This pull request references CORS-3883 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the epic to target the "4.19.0" version, but no target version was set.

In response to this:

Most significantly, this PR removes the automatic creation of the cluster identity which was attached to control-plane and compute nodes. This identity was originally created in order to authenticate the in-tree cloud-provider-azure. When cloud-provider-azure moved out-of-tree and under the management of the CCMO, it began to use credentials requests to authenticate.

The identity we create for that purpose is now unused. By removing the creation of the identity, we can reduce the permissions required to run an Azure install, paritcularly the User Access Administrator role, which is a significant role.

This PR also extends CAPZ's identity API into the install config machine pool to allow users to customize the identities attached to VMs. The identity will default to None, but we also allow users to BYO UserAssigned identities or to have Azure create a SystemAssigned identity.

For example:

controlPlane:
 platform:
   azure:
     identity:
       type: UserAssigned
       userAssignedIdentities:
       - name: test-capi-id
         resourceGroup: os4-common
         subscription: 433715e6-37fe-4328-af75-3661e13b15fc

uses an existing identity called test-capi-id from the os4-common resource group.

controlPlane:
 platform:
   azure:
     identity:
       type: SystemAssigned

Will create a system-assigned identity and attach it to the VM. It defaults to a contributor role.

008926c removes any identity from being attached to compute nodes. To configure an identity in MAPI, MAPI expects a user-assigned identity in the machine's resource group. That would only be possible when installing to an existing resource group, but the installer enforces that resource group to be empty. This can be addressed in future work, particularly the MAPI->CAPI transition will allow us to extend this API directly to compute nodes.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@patrickdillon patrickdillon force-pushed the azure-identity-cors-3878 branch from f10fb05 to a4ea8d7 Compare March 7, 2025 16:42
Adds types to the machinepool in order to allows users to specify
how identities on control plane and compute nodes should be handled.
By default, we will not use an identity on nodes.
Updates control-plane machine manifests to use the identities specified
in the install config machinepools.

The MAPI API for identities is very basic, and expects to only use
an identity within the same resource group as the cluster. Currently
the installer only allows installing into an empty resource
group so it is not possible for users to BYO identity with MAPI.

That is fine for the time being as there is no known use for having
an identity attached to compute nodes. We can resolve this in future
work, particularly with the MAPI->CAPI transition.
@patrickdillon patrickdillon force-pushed the azure-identity-cors-3878 branch from a4ea8d7 to 329c047 Compare March 7, 2025 17:37
This commit removes the SDK-based creation of the resource group
and depends on CAPZ to reconcile the resource group. Doing this
reduces the amount of code we need to maintain and simplifies the
Azure Stack implementation so that we don't need to worry about
setting the API version for this client.
@patrickdillon
Copy link
Contributor Author

patrickdillon commented Mar 7, 2025

I was going to open a new PR, branched off this one, for[ CORS-3861](https://issues.redhat.com/browse/CORS-3861), which refactors our code to use CAPZ to reconcile resource groups rather than creating with an SDK. I decided that it is simple enough that I should just stick it on the end of this one, so I added that in 90f61a0

@openshift-ci-robot
Copy link
Contributor

openshift-ci-robot commented Mar 7, 2025

@patrickdillon: This pull request references CORS-3883 which is a valid jira issue.

In response to this:

Most significantly, this PR removes the automatic creation of the cluster identity which was attached to control-plane and compute nodes. This identity was originally created in order to authenticate the in-tree cloud-provider-azure. When cloud-provider-azure moved out-of-tree and under the management of the CCMO, it began to use credentials requests to authenticate.

The identity we create for that purpose is now unused. By removing the creation of the identity, we can reduce the permissions required to run an Azure install, paritcularly the User Access Administrator role, which is a significant role.

This PR also extends CAPZ's identity API into the install config machine pool to allow users to customize the identities attached to VMs. The identity will default to None, but we also allow users to BYO UserAssigned identities or to have Azure create a SystemAssigned identity.

For example:

controlPlane:
 platform:
   azure:
     identity:
       type: UserAssigned
       userAssignedIdentities:
       - name: test-capi-id
         resourceGroup: os4-common
         subscription: 433715e6-37fe-4328-af75-3661e13b15fc

uses an existing identity called test-capi-id from the os4-common resource group.

controlPlane:
 platform:
   azure:
     identity:
       type: SystemAssigned

Will create a system-assigned identity and attach it to the VM. It defaults to a contributor role. The system-assigned identity role can also be customized:

controlPlane:
 platform:
   azure:
     identity:
       type: SystemAssigned
       systemAssignedIdentityRole:
         definitionID: /providers/Microsoft.Authorization/roleDefinitions/8e3af657-a8ff-443c-a75c-2fe8c4bcb635

Will give the system-assigned identity the Owner role (the value for definitionID corresponds to the owner role). This will also work for custom roles.

008926c removes any identity from being attached to compute nodes. To configure an identity in MAPI, MAPI expects a user-assigned identity in the machine's resource group. That would only be possible when installing to an existing resource group, but the installer enforces that resource group to be empty. This can be addressed in future work, particularly the MAPI->CAPI transition will allow us to extend this API directly to compute nodes.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@sadasu
Copy link
Contributor

sadasu commented Mar 7, 2025

Do we want to add validations around the new install-config identity parameters?

Copy link
Contributor

openshift-ci bot commented Mar 7, 2025

@patrickdillon: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/e2e-azurestack 90f61a0 link false /test e2e-azurestack
ci/prow/e2e-vsphere-externallb-ovn 90f61a0 link false /test e2e-vsphere-externallb-ovn
ci/prow/okd-scos-e2e-aws-ovn 90f61a0 link false /test okd-scos-e2e-aws-ovn
ci/prow/e2e-azure-ovn 90f61a0 link true /test e2e-azure-ovn
ci/prow/e2e-azure-ovn-shared-vpc 90f61a0 link false /test e2e-azure-ovn-shared-vpc
ci/prow/e2e-vsphere-static-ovn 90f61a0 link false /test e2e-vsphere-static-ovn
ci/prow/okd-scos-images 90f61a0 link true /test okd-scos-images
ci/prow/e2e-vsphere-ovn-multi-network 90f61a0 link false /test e2e-vsphere-ovn-multi-network

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
jira/valid-reference Indicates that this PR references a valid Jira ticket of any type.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants