Skip to content

Conversation

@bennerv
Copy link
Member

@bennerv bennerv commented Dec 15, 2020

Work in progress for some early feedback

Which issue this PR addresses:

Fixes 8858759

What this PR does / why we need it:

Expose the cluster service principal validity as a condition in the ARO cluster CRD through use of the operator.

Test plan for issue:

Unit tests, e2e, manual

Check the Service Principal is Valid

oc get cluster cluster -o json | jq -r '.status.conditions[] | select(.type == "ServicePrincipalValid") | .'
{
  "lastTransitionTime": "2021-01-04T23:57:03Z",
  "message": "service principal is valid",
  "reason": "CheckDone",
  "status": "True",
  "type": "ServicePrincipalValid"
}

Get Service Principal Information

CLUSTER=bvesel
RESOURCEGROUP=bvesel
CLUSTER_CLIENT_ID=$(az aro show -g bvesel -n $CLUSTER | jq -r .servicePrincipalProfile.clientId)
CLUSTER_SERVICE_PRINCIPAL=$(az ad sp list --all --filter "appId eq '${CLUSTER_CLIENT_ID}'" | jq -r '.[].objectId')

Get and delete the VNET role assignment

$ az role assignment list --all -o json | jq -r --arg RESOURCEGROUP $RESOURCEGROUP --arg ARG $CLUSTER_SERVICE_PRINCIPAL '.[] | select( (.principalId == $ARG) and (.resourceGroup == $RESOURCEGROUP) and (.roleDefinitionName == "Network Contributor") ) | .id'
/subscriptions/SUB_ID/resourceGroups/bvesel/providers/Microsoft.Network/virtualNetworks/bvesel-vnet/providers/Microsoft.Authorization/roleAssignments/RES_ID

$ az role assignment delete --ids \
	/subscriptions/SUB_ID/resourceGroups/bvesel/providers/Microsoft.Network/virtualNetworks/bvesel-vnet/providers/Microsoft.Authorization/roleAssignments/RES_ID

Recheck the Service Principal

$ oc get cluster cluster -o json | jq -r '.status.conditions[] | select(.type == "ServicePrincipalValid") | .'
{
  "lastTransitionTime": "2021-01-05T00:59:04Z",
  "message": "400: InvalidServicePrincipalPermissions: : The provided service principal does not have Network Contributor permission on vnet '/subscriptions/SUB_ID/resourceGroups/bvesel/providers/Microsoft.Network/virtualNetworks/bvesel-vnet'.",
  "reason": "CheckDone",
  "status": "False",
  "type": "ServicePrincipalValid"
}

Add the Role Assignment Back

# This will add back the role assignment
az aro update -g $RESOURCEGROUP -n $CLUSTER

# Check it's back
$ az role assignment list --all -o json | jq -r --arg RESOURCEGROUP $RESOURCEGROUP --arg ARG $CLUSTER_SERVICE_PRINCIPAL '.[] | select( (.principalId == $ARG) and (.resourceGroup == $RESOURCEGROUP) and (.roleDefinitionName == "Network Contributor") ) | .id'
/subscriptions/SUB_ID/resourceGroups/bvesel/providers/Microsoft.Network/virtualNetworks/bvesel-vnet/providers/Microsoft.Authorization/roleAssignments/RES_ID

$ oc get cluster cluster -o json | jq -r '.status.conditions[] | select(.type == "ServicePrincipalValid") | .'
{
  "lastTransitionTime": "2021-01-04T23:57:03Z",
  "message": "service principal is valid",
  "reason": "CheckDone",
  "status": "True",
  "type": "ServicePrincipalValid"
}

Is there any documentation that needs to be updated for this PR?

No

@bennerv bennerv force-pushed the operator-validate-cluster-sp branch 3 times, most recently from e60402b to f24f13e Compare December 15, 2020 23:08
@bennerv bennerv changed the title Operator validate cluster sp Operator: Expose Cluster SP as Condition Dec 15, 2020
@bennerv bennerv force-pushed the operator-validate-cluster-sp branch 2 times, most recently from d88067f to 68dc45f Compare December 16, 2020 03:55
Copy link
Contributor

@mjudeikis mjudeikis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a good start, but we have some structural changes we need to address before we progress. kicked off POC PR how it could look like. Can you take a look and lets discuss this.

@github-actions
Copy link

Please rebase pull request.

@github-actions github-actions bot added the needs-rebase branch needs a rebase label Dec 18, 2020
@bennerv bennerv force-pushed the operator-validate-cluster-sp branch from 68dc45f to a57ecdd Compare December 21, 2020 21:14
@github-actions github-actions bot removed the needs-rebase branch needs a rebase label Dec 21, 2020
@bennerv bennerv force-pushed the operator-validate-cluster-sp branch 4 times, most recently from 54da84e to 7204bce Compare December 23, 2020 04:57
@bennerv bennerv marked this pull request as ready for review December 23, 2020 05:15
@bennerv bennerv requested a review from jim-minter as a code owner December 23, 2020 05:15
@bennerv
Copy link
Member Author

bennerv commented Dec 23, 2020

@mjudeikis @olga-mir @jim-minter This is ready for review, but there's a few things that need to be changed before the e2e tests pass.

I still need to test running the container on a cluster. I tested locally using a kubeconfig, but when attempting to update the image to a custom quay image I am running into a problem with my glibc version of the compiled binary not matching the version in ubi8-min. Additionally, we will need to push this to arointsvc.azurecr.io/aro at some point (and then prod) for the e2e test to pass as the new condition being exposed by the operator won't be present for the installation to succeed until the image is updated.

Will work on reconciling the glibc version so I can test locally and then sync up on any additional review items.

@bennerv bennerv force-pushed the operator-validate-cluster-sp branch from 7204bce to 532ad86 Compare January 4, 2021 21:42
@github-actions github-actions bot added the needs-rebase branch needs a rebase label Jan 7, 2021
@github-actions
Copy link

github-actions bot commented Jan 7, 2021

Please rebase pull request.

Copy link
Contributor

@nilsanderselde nilsanderselde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good, just some nitpicking

@troy0820
Copy link
Contributor

This PR is marked next up but doesn't #1244 need to be merged first and this to be refactored to use the changes from the implementation of the validator? @bennerv @mjudeikis

@github-actions github-actions bot removed the needs-rebase branch needs a rebase label Feb 17, 2021
@github-actions
Copy link

Please rebase pull request.

@github-actions github-actions bot added the needs-rebase branch needs a rebase label Feb 22, 2021
@bennerv bennerv force-pushed the operator-validate-cluster-sp branch from 319724a to c6ebdbf Compare March 16, 2021 17:46
@github-actions github-actions bot removed the needs-rebase branch needs a rebase label Mar 16, 2021
@bennerv bennerv force-pushed the operator-validate-cluster-sp branch from c6ebdbf to 5d1c18b Compare March 16, 2021 18:16
@github-actions github-actions bot added the needs-rebase branch needs a rebase label Mar 22, 2021
@github-actions
Copy link

Please rebase pull request.

Copy link
Contributor

@mjudeikis mjudeikis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

very close :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now this call is not needed. Validator initiation with NewValidator is needed to instantiate clients. In this case we dont use clients as we do only ValidateServicePrincipals which gets token inside.

Comment on lines 70 to 72
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All this could be like this:

// IMPORTANT: We provide authorizer = nil because operator service principal validation do not need any of the clients to be initiated inside validator. 
spDynamic, err := dynamic.NewValidator(r.log, &azEnv, resource.SubscriptionID, nil, dynamic.AuthorizerClusterServicePrincipal)
	if err != nil {
		return err
	}

Copy link
Member Author

@bennerv bennerv Mar 22, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mjudeikis these are two different endpoints (GraphEndpoint, ResourceManagerEndpoint) we're verifying against which is why there are two aad.GetToken calls: 1 within the validator, 1 within the sp checker.

Do we only need to validate against the graph endpoint now? I thought we needed both graph and resourcemanager endpoints as this is what we're doing in openshiftcluster_validatedynamic.go.

https://github.com/Azure/ARO-RP/blob/master/pkg/api/validate/openshiftcluster_validatedynamic.go#L73-L86

@bennerv bennerv force-pushed the operator-validate-cluster-sp branch from 5d1c18b to 29af0ae Compare March 22, 2021 14:13
@bennerv bennerv requested a review from jewzaam as a code owner March 22, 2021 14:13
@github-actions github-actions bot removed the needs-rebase branch needs a rebase label Mar 22, 2021
@bennerv bennerv force-pushed the operator-validate-cluster-sp branch from 29af0ae to 83dd576 Compare March 22, 2021 14:20
@github-actions github-actions bot added the needs-rebase branch needs a rebase label Mar 24, 2021
@github-actions
Copy link

Please rebase pull request.

@bennerv bennerv force-pushed the operator-validate-cluster-sp branch from 83dd576 to 616cb04 Compare March 25, 2021 14:15
@github-actions github-actions bot removed the needs-rebase branch needs a rebase label Mar 25, 2021
@github-actions
Copy link

github-actions bot commented Apr 8, 2021

Please rebase pull request.

@github-actions github-actions bot added the needs-rebase branch needs a rebase label Apr 8, 2021
Copy link
Contributor

@mjudeikis mjudeikis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be good to tide this up and merge :)

@bennerv bennerv force-pushed the operator-validate-cluster-sp branch from 616cb04 to 70a0d87 Compare April 14, 2021 19:34
@github-actions github-actions bot removed the needs-rebase branch needs a rebase label Apr 14, 2021
@bennerv bennerv force-pushed the operator-validate-cluster-sp branch from 70a0d87 to 444aaca Compare April 14, 2021 19:35
@mjudeikis mjudeikis merged commit 7feb3bb into Azure:master Apr 15, 2021
@bennerv bennerv deleted the operator-validate-cluster-sp branch August 9, 2021 13:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants