
CONSOLE 2919: create OAuth clients on spoke clusters - rebased #602

Merged
jhadvig merged 3 commits into openshift:master-multi-cluster-feature from florkbr:CONSOLE-2919-OAUTH-REBASED on Nov 10, 2021

Conversation

@florkbr
Contributor

@florkbr florkbr commented Oct 12, 2021

Creates oauth clients on spoke clusters using the new managed cluster
controller via a dynamic client.
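
For orientation, here is a minimal sketch of what creating an OAuthClient on a spoke cluster through client-go's dynamic client can look like. The client name, secret, and redirect URI are placeholders, and the controller wiring in this PR may differ:

package managedcluster

import (
	"context"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
	"k8s.io/apimachinery/pkg/runtime/schema"
	"k8s.io/client-go/dynamic"
	"k8s.io/client-go/rest"
)

// createOAuthClient sketches creating an OAuthClient through a dynamic client
// built from a spoke cluster's rest.Config. All values are placeholders.
func createOAuthClient(ctx context.Context, spokeConfig *rest.Config) error {
	client, err := dynamic.NewForConfig(spokeConfig)
	if err != nil {
		return err
	}
	gvr := schema.GroupVersionResource{Group: "oauth.openshift.io", Version: "v1", Resource: "oauthclients"}
	oauthClient := &unstructured.Unstructured{Object: map[string]interface{}{
		"apiVersion": "oauth.openshift.io/v1",
		"kind":       "OAuthClient",
		"metadata":   map[string]interface{}{"name": "console-managed-cluster"}, // placeholder name
		"secret":     "placeholder-secret",
		"redirectURIs": []interface{}{"https://console.example.com/auth/callback"}, // placeholder URI
		"grantMethod":  "auto",
	}}
	// OAuthClient is cluster-scoped, so no Namespace() call on the resource interface.
	_, err = client.Resource(gvr).Create(ctx, oauthClient, metav1.CreateOptions{})
	return err
}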

TheRealJon and others added 2 commits October 11, 2021 22:33
- Still need feature gate and oauth work for multicluster to be fully implemented
- Add managed cluster controller that watches ACM ManagedCluster resources and builds a config file that can be consumed by the console backend
- Update console deployment sync with new volumes and volume mounts for managed cluster config and CA bundles.
- Add managed cluster and API server CA bundle ConfigMaps to subresources
- Update informer event filter utils
- Add github.com/open-cluster-management/api dependency
Creates oauth clients on spoke clusters using the new managed cluster
controller via a dynamic client.

CONSOLE 2919: clean up oauth client impl on spoke clusters

Add informers for CRD changes via dynamic informer
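
As a rough illustration of the dynamic-informer approach mentioned above, the sketch below watches the ACM ManagedCluster CRD; the resync period and handlers are illustrative only:

package managedcluster

import (
	"time"

	"k8s.io/apimachinery/pkg/runtime/schema"
	"k8s.io/client-go/dynamic"
	"k8s.io/client-go/dynamic/dynamicinformer"
	"k8s.io/client-go/tools/cache"
)

// newManagedClusterInformer sketches watching the ManagedCluster CRD via a
// dynamic informer so the controller can resync when clusters are added,
// updated, or detached.
func newManagedClusterInformer(client dynamic.Interface, stopCh <-chan struct{}) cache.SharedIndexInformer {
	gvr := schema.GroupVersionResource{
		Group:    "cluster.open-cluster-management.io",
		Version:  "v1",
		Resource: "managedclusters",
	}
	factory := dynamicinformer.NewDynamicSharedInformerFactory(client, 10*time.Minute)
	informer := factory.ForResource(gvr).Informer()
	informer.AddEventHandler(cache.ResourceEventHandlerFuncs{
		AddFunc:    func(obj interface{}) { /* queue a sync */ },
		UpdateFunc: func(oldObj, newObj interface{}) { /* queue a sync */ },
		DeleteFunc: func(obj interface{}) { /* queue a sync and clean up spoke resources */ },
	})
	factory.Start(stopCh)
	return informer
}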
@openshift-ci openshift-ci Bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Oct 12, 2021
@openshift-ci openshift-ci Bot requested review from jhadvig and spadgett October 12, 2021 02:41
@florkbr florkbr requested a review from TheRealJon October 12, 2021 02:42
@florkbr
Contributor Author

florkbr commented Oct 13, 2021

/retest

2 similar comments
@florkbr
Contributor Author

florkbr commented Oct 13, 2021

/retest

@florkbr
Contributor Author

florkbr commented Oct 13, 2021

/retest

apiVersion: v1
kind: ConfigMap
metadata:
namespace: openshift-config-managed
Member

Since this is a config specific to the console, I think this should be created in the openshift-console namespace.

Suggested change
namespace: openshift-config-managed
namespace: openshift-console

Member

yes, we should put it in the console's namespace

Member

@TheRealJon TheRealJon left a comment

This looks good! I think after addressing the few comments I left, we can probably merge this. There is still more follow-on work to do that we can address in future PRs:

  • Update console deployment with Volumes and VolumeMounts for the managed cluster default ingress cert bundles.
  • Add Oauth.ClientID, Oauth.ClientSecret, and Oauth.CAFile properties for each managed cluster config in the main managed-clusters ConfigMap.
  • Make the managed cluster controller idempotent. This will require changes to my original logic because I didn't consider this. When a managed cluster is detached, we need to make sure all related resources created by our controller get deleted. We also need to make sure we clean up when no ManagedClusters are found or the ManagedCluster API is not found on the cluster (ACM isn't installed).

Comment thread pkg/console/subresource/managedclusteraction/managed_cluster_action.go Outdated
Comment thread pkg/console/subresource/managedclusterview/managed_cluster_view.go Outdated
Comment thread pkg/console/subresource/managedclusterview/managed_cluster_view.go
Comment thread pkg/console/subresource/util/util.go
return nil, ""
}

func (c *ManagedClusterController) removeManagedClusters(ctx context.Context) error {
Member

We will need to add logic here to remove any ManagedClusterAction, ManagedClusterView, and ConfigMap resources that were created by this controller when the operator is in a "Removed" state.

Member

It would be good to have an object in the controller struct that contains the list of created resources, to keep track of them.
On removeManagedClusters we would just flush the list.

Or just label all those resources with api.ManagedClusterLabel and then list and delete them.
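
A rough sketch of that label-based cleanup; it assumes the controller holds a dynamic client (c.dynamicClient here is hypothetical) and that api.ManagedClusterLabel is the label key set on everything the controller creates:

func (c *ManagedClusterController) cleanupLabeledResources(ctx context.Context, gvrs []schema.GroupVersionResource) error {
	// Everything the controller created carries api.ManagedClusterLabel, so
	// cleanup is a list-and-delete by label selector across the relevant GVRs.
	opts := metav1.ListOptions{LabelSelector: api.ManagedClusterLabel} // existence selector on the label key
	for _, gvr := range gvrs {
		list, err := c.dynamicClient.Resource(gvr).Namespace(metav1.NamespaceAll).List(ctx, opts)
		if err != nil {
			return err
		}
		for _, item := range list.Items {
			if err := c.dynamicClient.Resource(gvr).Namespace(item.GetNamespace()).Delete(ctx, item.GetName(), metav1.DeleteOptions{}); err != nil {
				return err
			}
		}
	}
	return nil
}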

URL: clientConfig.URL,
CAFile: fmt.Sprintf("%s/%s/%s", api.ManagedClusterAPIServerCAMountDir, configmapsub.APIServerCAConfigMapName(clusterName), api.ManagedClusterAPIServerCAKey),
},
})
Member

Follow-on: We will need to add oauth info for each managed cluster to the main managed-cluster configmap:

Suggested change
})
Oauth: {
ClientID: // Managed cluster oauth client name,
ClientSecret: // Managed cluster oauth client secret,
CAFile: // Path to ca bundle file (determined by the VolumeMount that's created in the Deployment)
},
})

if !status || statusErr != nil {
namespace, namespaceErr := managedclusterviewsub.GetNamespace(managedClusterOAuthView)
if namespaceErr != nil || namespace == "" {
managedClusterListErrors = append(managedClusterListErrors, fmt.Sprintf("Unable to create oauth client for cluster %v: managed cluster view status unknown", namespace))
Member

namespace is likely to be an empty string here, but we are using it in a formatted string.

Member

Suggested change
managedClusterListErrors = append(managedClusterListErrors, fmt.Sprintf("Unable to create oauth client for cluster %v: managed cluster view status unknown", namespace))
managedClusterListErrors = append(managedClusterListErrors, fmt.Sprintf("Unable to create oauth client for cluster %s: managed cluster view status unknown", namespace))

Member

We should log the namespaceErr here
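
A minimal sketch of surfacing namespaceErr before appending the aggregated error; it assumes k8s.io/klog/v2 is imported as klog and that this sits inside the range loop shown below:

if namespaceErr != nil || namespace == "" {
	// Log the underlying error so it isn't swallowed by the aggregated message.
	klog.Errorf("failed to get namespace for managed cluster view %q: %v", managedClusterOAuthView.GetName(), namespaceErr)
	managedClusterListErrors = append(managedClusterListErrors, fmt.Sprintf("Unable to create oauth client for cluster %s: managed cluster view status unknown", namespace))
	continue
}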

managedClusterListErrors := []string{}
for _, managedClusterOAuthView := range managedClusterOAuthClientViews {
status, statusErr := managedclusterviewsub.GetStatus(managedClusterOAuthView)
if !status || statusErr != nil {
Member

If I'm reading correctly, we create an MCA for any MCV where we don't know a status, but we do know a namespace? Seems like there could be a race condition here where the status hasn't had time to propagate to the MCV and we try to create an MCA.

Member

Does the status of the view contain any meaningful info? I don't see anything particularly helpful in the Unstructured struct & funcs.

Member

Does !status always coincide with statusErr != nil?
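
One way to avoid the race raised above: treat an unknown MCV status as "not ready yet", record it, and let the next sync retry rather than creating an MCA from incomplete information. A rough sketch against the loop shown in this thread; the control flow is illustrative, not the PR's actual logic:

for _, managedClusterOAuthView := range managedClusterOAuthClientViews {
	status, statusErr := managedclusterviewsub.GetStatus(managedClusterOAuthView)
	if statusErr != nil || !status {
		// The status may simply not have propagated to the MCV yet; record it
		// and let the next sync loop retry instead of acting on it now.
		managedClusterListErrors = append(managedClusterListErrors, fmt.Sprintf("managed cluster view %s is not ready yet", managedClusterOAuthView.GetName()))
		continue
	}
	// Status is known and healthy; safe to proceed with OAuth client creation.
}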

}

func (c *ManagedClusterController) SyncManagedClusterViewIngressCert(ctx context.Context, operatorConfig *operatorv1.Console) ([]*unstructured.Unstructured, error, string) {
managedClusters, listErr := c.listManagedClusters(ctx)
Member

We list managed clusters at least 3 times per sync loop. We could probably just do this once at the beginning of the sync loop.

Member

👍

Member

The same for:

gvr := managedclusterviewsub.GetGVR()
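
A rough sketch of that suggestion: list once per sync and thread the results through. The signatures below are hypothetical, not the ones in this PR:

func (c *ManagedClusterController) sync(ctx context.Context, operatorConfig *operatorv1.Console) error {
	// List ManagedClusters and resolve the MCV GVR once, then reuse them.
	managedClusters, listErr := c.listManagedClusters(ctx)
	if listErr != nil {
		return listErr
	}
	gvr := managedclusterviewsub.GetGVR()

	if _, err, _ := c.SyncManagedClusterViewIngressCert(ctx, operatorConfig, managedClusters, gvr); err != nil {
		return err
	}
	if _, err, _ := c.SyncManagedClusterViewOAuthClient(ctx, operatorConfig, managedClusters, gvr); err != nil {
		return err
	}
	return nil
}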

@TheRealJon
Member

  • Make the managed cluster controller idempotent. This will require changes to my original logic because I didn't consider this. When a managed cluster is detached, we need to make sure all related resources created by our controller get deleted. We also need to make sure we clean up when no ManagedClusters are found or the ManagedCluster API is not found on the cluster (ACM isn't installed).

We'll also need to consider this when we add the feature flag. Going from true to false state will mean that we need to clean up.

Member

@jhadvig jhadvig left a comment

Hey @florkbr @TheRealJon thanks for the PR with your commits. The work looks really good!
Added a couple of comments.


// Get a list of ManagedCluster resources, degraded if error is returned
managedClusterClientConfigs, managedClusterClientConfigValidationErr, managedClusterClientConfigValidationErrReason := c.ValidateManagedClusterClientConfigs(ctx, operatorConfig, controllerContext.Recorder())
statusHandler.AddConditions(status.HandleProgressingOrDegraded("ManagedClusterValidation", managedClusterClientConfigValidationErrReason, managedClusterClientConfigValidationErr))
Member

Here and below as well, I would suggest reducing the number of condition type prefixes.

Suggested change
statusHandler.AddConditions(status.HandleProgressingOrDegraded("ManagedClusterValidation", managedClusterClientConfigValidationErrReason, managedClusterClientConfigValidationErr))
statusHandler.AddConditions(status.HandleProgressingOrDegraded("ManagedClusterSync", managedClusterClientConfigValidationErrReason, managedClusterClientConfigValidationErr))

The issue here is that each type prefix will register conditions that we need to maintain. For example calling status.HandleProgressingOrDegraded("ManagedClusterValidation" ...) will register these two conditions:

  • ManagedClusterValidationProgressing
  • ManagedClusterValidationDegraded

For that reason we should reduce those prefixes to only the ones that make sense for debugging issues, and instead make the error reasons clear.

}

// Return slice of clusterv1.ClientConfigs that have been validated or error and reaons if unable to list ManagedClusters
func (c *ManagedClusterController) ValidateManagedClusterClientConfigs(ctx context.Context, operatorConfig *operatorv1.Console, recorder events.Recorder) (map[string]*clusterv1.ClientConfig, error, string) {
Member

nit: error should go as last return value

Suggested change
func (c *ManagedClusterController) ValidateManagedClusterClientConfigs(ctx context.Context, operatorConfig *operatorv1.Console, recorder events.Recorder) (map[string]*clusterv1.ClientConfig, error, string) {
func (c *ManagedClusterController) ValidateManagedClusterClientConfigs(ctx context.Context, operatorConfig *operatorv1.Console, recorder events.Recorder) (map[string]*clusterv1.ClientConfig, string, error) {

return statusHandler.FlushAndReturn(configSyncErr)
}

// Return slice of clusterv1.ClientConfigs that have been validated or error and reaons if unable to list ManagedClusters
Member

Suggested change
// Return slice of clusterv1.ClientConfigs that have been validated or error and reaons if unable to list ManagedClusters
// Return slice of clusterv1.ClientConfigs that have been validated or error and its reason, if unable to list ManagedClusters

return validatedClientConfigs, nil, ""
}

// Using ManagedCluster.spec.ManagedClusterClientConfigs, sync ConfigMaps containing the API server CA bundle for each managed cluster
Member

Great to see all those godocs 👍

Comment thread pkg/console/subresource/util/util.go
OAuthServingCert(useDefaultCAFile).
APIServerURL(getApiUrl(infrastructureConfig)).
InactivityTimeout(inactivityTimeoutSeconds).
ManagedClusterConfigFile(fmt.Sprintf("%v/%v", api.ManagedClusterConfigMountDir, api.ManagedClusterConfigKey)).
Member

Suggested change
ManagedClusterConfigFile(fmt.Sprintf("%v/%v", api.ManagedClusterConfigMountDir, api.ManagedClusterConfigKey)).
ManagedClusterConfigFile(fmt.Sprintf("%s/%s", api.ManagedClusterConfigMountDir, api.ManagedClusterConfigKey)).

Member

Is this file always present on the cluster? What does it contain if the cluster is not part of ACM?

Member

Wondering if we shouldn't pass canMountManagedClusterConfig to the DefaultConfigMap() method and, based on that bool, either pass fmt.Sprintf("%s/%s", api.ManagedClusterConfigMountDir, api.ManagedClusterConfigKey) or not?
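
A rough sketch of that idea; configBuilder stands in for whatever builder the chain above hangs off of, the fluent methods are assumed to return the builder, and canMountManagedClusterConfig would be plumbed in by the caller:

// Only reference the managed cluster config file when the backing ConfigMap
// and volume mount actually exist; otherwise leave the field unset.
cliConfig := configBuilder.
	OAuthServingCert(useDefaultCAFile).
	APIServerURL(getApiUrl(infrastructureConfig)).
	InactivityTimeout(inactivityTimeoutSeconds)

if canMountManagedClusterConfig {
	cliConfig = cliConfig.ManagedClusterConfigFile(fmt.Sprintf("%s/%s", api.ManagedClusterConfigMountDir, api.ManagedClusterConfigKey))
}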

// Degraded if managed cluster config map is present but doesn't have the correct data key
_, ok := managedClusterConfigMap.Data[api.ManagedClusterConfigKey]
if !ok {
return false, "MissingManagedClusterConfig", fmt.Errorf("%v ConfigMap is missing %v data key", api.ManagedClusterConfigMapName, api.ManagedClusterConfigKey)
Member

Suggested change
return false, "MissingManagedClusterConfig", fmt.Errorf("%v ConfigMap is missing %v data key", api.ManagedClusterConfigMapName, api.ManagedClusterConfigKey)
return false, "MissingManagedClusterConfig", fmt.Errorf("%s ConfigMap is missing %s data key", api.ManagedClusterConfigMapName, api.ManagedClusterConfigKey)

// Create managed cluster views for oauth clients
managedClusterViewOAuthClients, managedClusterViewOAuthClientErr, managedClusterViewOAuthClientErrReason := c.SyncManagedClusterViewOAuthClient(ctx, operatorConfig)
statusHandler.AddConditions(status.HandleProgressingOrDegraded("ManagedClusterViewOAuthClientSync", managedClusterViewOAuthClientErrReason, managedClusterViewOAuthClientErr))

Member

Why aren't we erroring out on error here and below?

if managedClusterViewOAuthClientErr != nil {
	return statusHandler.FlushAndReturn(managedClusterViewOAuthClientErr)
}

@florkbr florkbr changed the title [WIP] CONSOLE 2919: create OAuth clients on spoke clusters - rebased CONSOLE 2919: create OAuth clients on spoke clusters - rebased Nov 2, 2021
@openshift-ci openshift-ci Bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Nov 2, 2021
@florkbr
Contributor Author

florkbr commented Nov 2, 2021

/retest

2 similar comments
@florkbr
Contributor Author

florkbr commented Nov 2, 2021

/retest

@florkbr
Contributor Author

florkbr commented Nov 3, 2021

/retest

Member

@jhadvig jhadvig left a comment

/approve
/lgtm

@jhadvig
Member

jhadvig commented Nov 4, 2021

Tagging the PR to merge to the feature branch. It would be good to address the pending comments.

@openshift-ci openshift-ci Bot added the lgtm Indicates that a PR is ready to be merged. label Nov 4, 2021
@openshift-ci
Contributor

openshift-ci Bot commented Nov 4, 2021

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: florkbr, jhadvig

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci Bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Nov 4, 2021
@jhadvig
Member

jhadvig commented Nov 4, 2021

/test e2e-aws-operator

@openshift-bot
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@jhadvig
Member

jhadvig commented Nov 4, 2021

/test e2e-aws-operator

@jhadvig
Member

jhadvig commented Nov 4, 2021

=== RUN   TestUnmanaged
    unmanaged_test.go:12: waiting for setup to reach settled state...
    unmanaged_test.go:13: changing console operator state to 'Unmanaged'...
    framework.go:71: deleting console's console-config ConfigMap...
    framework.go:71: deleting console's oc-cli-downloads ConsoleCLIDownloads...
    framework.go:71: deleting console's odo-cli-downloads ConsoleCLIDownloads...
    framework.go:71: deleting console's console Deployment...
    framework.go:71: deleting console's downloads Deployment...
    framework.go:71: deleting console's console Route...
    framework.go:71: deleting console's console Service...
    unmanaged_test.go:28: validating that the operator does not recreate deleted resources when ManagementState:Unmanaged...
validating console resources have been removed... true
validating console resources have been removed... true
E1104 18:42:38.865559   10271 request.go:1027] Unexpected error when reading response body: read tcp 10.131.71.50:39510->52.8.29.89:6443: read: connection reset by peer
validating console resources have been removed... false
    unmanaged_test.go:31: unexpected error when reading response body. Please retry. Original error: read tcp 10.131.71.50:39510->52.8.29.89:6443: read: connection reset by peer
    unmanaged_test.go:18: waiting for cleanup to reach settled state...
--- FAIL: TestUnmanaged (177.39s)
=== RUN   TestEditUnmanagedConfigMap
    unmanaged_test.go:12: waiting for setup to reach settled state...
    unmanaged_test.go:13: changing console operator state to 'Unmanaged'...
    util.go:28: patching Data on the console ConfigMap
    util.go:35: polling for patched Data on the console ConfigMap
    unmanaged_test.go:41: error: timed out waiting for the condition
    unmanaged_test.go:18: waiting for cleanup to reach settled state...
    console-operator.go:370: waited 10 seconds to reach settled state...
--- FAIL: TestEditUnmanagedConfigMap (33.79s)

/test e2e-aws-operator

@openshift-bot
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

3 similar comments
@openshift-bot
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@jhadvig jhadvig merged commit 33ecbfd into openshift:master-multi-cluster-feature Nov 10, 2021
florkbr added a commit to florkbr/console-operator that referenced this pull request Dec 13, 2021

Labels

approved: Indicates a PR has been approved by an approver from all required OWNERS files.
lgtm: Indicates that a PR is ready to be merged.
