
CONSOLE 2919: create OAuth clients on spoke clusters - rebased #602

Merged
jhadvig merged 3 commits into openshift:master-multi-cluster-feature from florkbr:CONSOLE-2919-OAUTH-REBASED on Nov 10, 2021

Conversation

@florkbr
Contributor

@florkbr florkbr commented Oct 12, 2021

Creates oauth clients on spoke clusters using the new managed cluster
controller via a dynamic client.
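
For orientation, here is a minimal sketch of what creating an OAuthClient on a spoke cluster through client-go's dynamic client can look like. The client name, secret, and redirect URI are placeholders, and the controller wiring in this PR may differ:

package managedcluster

import (
	"context"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
	"k8s.io/apimachinery/pkg/runtime/schema"
	"k8s.io/client-go/dynamic"
	"k8s.io/client-go/rest"
)

// createOAuthClient sketches creating an OAuthClient through a dynamic client
// built from a spoke cluster's rest.Config. All values are placeholders.
func createOAuthClient(ctx context.Context, spokeConfig *rest.Config) error {
	client, err := dynamic.NewForConfig(spokeConfig)
	if err != nil {
		return err
	}
	gvr := schema.GroupVersionResource{Group: "oauth.openshift.io", Version: "v1", Resource: "oauthclients"}
	oauthClient := &unstructured.Unstructured{Object: map[string]interface{}{
		"apiVersion": "oauth.openshift.io/v1",
		"kind":       "OAuthClient",
		"metadata":   map[string]interface{}{"name": "console-managed-cluster"}, // placeholder name
		"secret":     "placeholder-secret",
		"redirectURIs": []interface{}{"https://console.example.com/auth/callback"}, // placeholder URI
		"grantMethod":  "auto",
	}}
	// OAuthClient is cluster-scoped, so no Namespace() call on the resource interface.
	_, err = client.Resource(gvr).Create(ctx, oauthClient, metav1.CreateOptions{})
	return err
}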

TheRealJon and others added 2 commits October 11, 2021 22:33
- Still need feature gate and oauth work for multicluster to be fully implemented
- Add managed cluster controller that watches ACM ManagedCluster resources and builds a config file that can be consumed by the console backend
- Update console deployment sync with new volumes and volume mounts for managed cluster config and CA bundles.
- Add managed cluster and API server CA bundle ConfigMaps to subresources
- Update informer event filter utils
- Add github.com/open-cluster-management/api dependency
Creates oauth clients on spoke clusters using the new managed cluster
controller via a dynamic client.

CONSOLE 2919: clean up oauth client impl on spoke clusters

Add informers for CRD changes via dynamic informer
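
As a rough illustration of the dynamic-informer approach mentioned above, the sketch below watches the ACM ManagedCluster CRD; the resync period and handlers are illustrative only:

package managedcluster

import (
	"time"

	"k8s.io/apimachinery/pkg/runtime/schema"
	"k8s.io/client-go/dynamic"
	"k8s.io/client-go/dynamic/dynamicinformer"
	"k8s.io/client-go/tools/cache"
)

// newManagedClusterInformer sketches watching the ManagedCluster CRD via a
// dynamic informer so the controller can resync when clusters are added,
// updated, or detached.
func newManagedClusterInformer(client dynamic.Interface, stopCh <-chan struct{}) cache.SharedIndexInformer {
	gvr := schema.GroupVersionResource{
		Group:    "cluster.open-cluster-management.io",
		Version:  "v1",
		Resource: "managedclusters",
	}
	factory := dynamicinformer.NewDynamicSharedInformerFactory(client, 10*time.Minute)
	informer := factory.ForResource(gvr).Informer()
	informer.AddEventHandler(cache.ResourceEventHandlerFuncs{
		AddFunc:    func(obj interface{}) { /* queue a sync */ },
		UpdateFunc: func(oldObj, newObj interface{}) { /* queue a sync */ },
		DeleteFunc: func(obj interface{}) { /* queue a sync and clean up spoke resources */ },
	})
	factory.Start(stopCh)
	return informer
}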
@openshift-ci openshift-ci Bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Oct 12, 2021
@openshift-ci openshift-ci Bot requested review from jhadvig and spadgett October 12, 2021 02:41
@florkbr florkbr requested a review from TheRealJon October 12, 2021 02:42
@florkbr
Contributor Author

florkbr commented Oct 13, 2021

/retest

2 similar comments
@florkbr
Contributor Author

florkbr commented Oct 13, 2021

/retest

@florkbr
Contributor Author

florkbr commented Oct 13, 2021

/retest

apiVersion: v1
kind: ConfigMap
metadata:
namespace: openshift-config-managed
Member

Since this is a config specific to the console, I think this should be created in the openshift-console namespace.

Suggested change
namespace: openshift-config-managed
namespace: openshift-console

Member

yes, we should put it in the console's namespace

Member

@TheRealJon TheRealJon left a comment

This looks good! I think after addressing the few comments I left, we can probably merge this. There is still more follow-on work to do that we can address in future PRs:

  • Update console deployment with Volumes and VolumeMounts for the managed cluster default ingress cert bundles.
  • Add Oauth.ClientID, Oauth.ClientSecret, and Oauth.CAFile properties for each managed cluster config in the main managed-clusters ConfigMap.
  • Make the managed cluster controller idempotent. This will require changes to my original logic because I didn't consider this. When a managed cluster is detached, we need to make sure all related resources created by our controller get deleted. We also need to make sure we clean up when no ManagedClusters are found or the ManagedCluster API is not found on the cluster (ACM isn't installed).

Comment thread pkg/console/subresource/managedclusteraction/managed_cluster_action.go Outdated
Comment thread pkg/console/subresource/managedclusterview/managed_cluster_view.go Outdated
Comment thread pkg/console/subresource/managedclusterview/managed_cluster_view.go
Comment thread pkg/console/subresource/util/util.go
return nil, ""
}

func (c *ManagedClusterController) removeManagedClusters(ctx context.Context) error {
Member

We will need to add logic here to remove any ManagedClusterAction, ManagedClusterView, and ConfigMap resources that were created by this controller when the operator is in a "Removed" state.

Member

It would be good to have an object in the controller struct that contains the list of created resources, to keep track of them.
On removeManagedClusters we would just flush the list.

Or just label all those resources with api.ManagedClusterLabel and then list and delete them.
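
A rough sketch of that label-based cleanup; it assumes the controller holds a dynamic client (c.dynamicClient here is hypothetical) and that api.ManagedClusterLabel is the label key set on everything the controller creates:

func (c *ManagedClusterController) cleanupLabeledResources(ctx context.Context, gvrs []schema.GroupVersionResource) error {
	// Everything the controller created carries api.ManagedClusterLabel, so
	// cleanup is a list-and-delete by label selector across the relevant GVRs.
	opts := metav1.ListOptions{LabelSelector: api.ManagedClusterLabel} // existence selector on the label key
	for _, gvr := range gvrs {
		list, err := c.dynamicClient.Resource(gvr).Namespace(metav1.NamespaceAll).List(ctx, opts)
		if err != nil {
			return err
		}
		for _, item := range list.Items {
			if err := c.dynamicClient.Resource(gvr).Namespace(item.GetNamespace()).Delete(ctx, item.GetName(), metav1.DeleteOptions{}); err != nil {
				return err
			}
		}
	}
	return nil
}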

URL: clientConfig.URL,
CAFile: fmt.Sprintf("%s/%s/%s", api.ManagedClusterAPIServerCAMountDir, configmapsub.APIServerCAConfigMapName(clusterName), api.ManagedClusterAPIServerCAKey),
},
})
Member

Follow-on: We will need to add oauth info for each managed cluster to the main managed-cluster configmap:

Suggested change
})
Oauth: {
ClientID: // Managed cluster oauth client name,
ClientSecret: // Managed cluster oauth client secret,
CAFile: // Path to ca bundle file (determined by the VolumeMount that's created in the Deployment)
},
})

if !status || statusErr != nil {
namespace, namespaceErr := managedclusterviewsub.GetNamespace(managedClusterOAuthView)
if namespaceErr != nil || namespace == "" {
managedClusterListErrors = append(managedClusterListErrors, fmt.Sprintf("Unable to create oauth client for cluster %v: managed cluster view status unknown", namespace))
Member

namespace is likely to be an empty string here, but we are using it in a formatted string.

Member

Suggested change
managedClusterListErrors = append(managedClusterListErrors, fmt.Sprintf("Unable to create oauth client for cluster %v: managed cluster view status unknown", namespace))
managedClusterListErrors = append(managedClusterListErrors, fmt.Sprintf("Unable to create oauth client for cluster %s: managed cluster view status unknown", namespace))

Member

We should log the namespaceErr here
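
A minimal sketch of surfacing namespaceErr before appending the aggregated error; it assumes k8s.io/klog/v2 is imported as klog and that this sits inside the range loop shown below:

if namespaceErr != nil || namespace == "" {
	// Log the underlying error so it isn't swallowed by the aggregated message.
	klog.Errorf("failed to get namespace for managed cluster view %q: %v", managedClusterOAuthView.GetName(), namespaceErr)
	managedClusterListErrors = append(managedClusterListErrors, fmt.Sprintf("Unable to create oauth client for cluster %s: managed cluster view status unknown", namespace))
	continue
}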

managedClusterListErrors := []string{}
for _, managedClusterOAuthView := range managedClusterOAuthClientViews {
status, statusErr := managedclusterviewsub.GetStatus(managedClusterOAuthView)
if !status || statusErr != nil {
Member

If I'm reading correctly, we create an MCA for any MCV where we don't know a status, but we do know a namespace? Seems like there could be a race condition here where the status hasn't had time to propagate to the MCV and we try to create an MCA.

Member

Does the status of the view contain any meaningful info? I don't see anything particularly helpful in the Unstructured struct & funcs.

Member

Does !status always coincide with statusErr != nil?
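
One way to avoid the race raised above: treat an unknown MCV status as "not ready yet", record it, and let the next sync retry rather than creating an MCA from incomplete information. A rough sketch against the loop shown in this thread; the control flow is illustrative, not the PR's actual logic:

for _, managedClusterOAuthView := range managedClusterOAuthClientViews {
	status, statusErr := managedclusterviewsub.GetStatus(managedClusterOAuthView)
	if statusErr != nil || !status {
		// The status may simply not have propagated to the MCV yet; record it
		// and let the next sync loop retry instead of acting on it now.
		managedClusterListErrors = append(managedClusterListErrors, fmt.Sprintf("managed cluster view %s is not ready yet", managedClusterOAuthView.GetName()))
		continue
	}
	// Status is known and healthy; safe to proceed with OAuth client creation.
}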

}

func (c *ManagedClusterController) SyncManagedClusterViewIngressCert(ctx context.Context, operatorConfig *operatorv1.Console) ([]*unstructured.Unstructured, error, string) {
managedClusters, listErr := c.listManagedClusters(ctx)
Member

We list managed clusters at least 3 times per sync loop. We could probably just do this once at the beginning of the sync loop.

Member

👍

Member

The same for:

gvr := managedclusterviewsub.GetGVR()
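
A rough sketch of that suggestion: list once per sync and thread the results through. The signatures below are hypothetical, not the ones in this PR:

func (c *ManagedClusterController) sync(ctx context.Context, operatorConfig *operatorv1.Console) error {
	// List ManagedClusters and resolve the MCV GVR once, then reuse them.
	managedClusters, listErr := c.listManagedClusters(ctx)
	if listErr != nil {
		return listErr
	}
	gvr := managedclusterviewsub.GetGVR()

	if _, err, _ := c.SyncManagedClusterViewIngressCert(ctx, operatorConfig, managedClusters, gvr); err != nil {
		return err
	}
	if _, err, _ := c.SyncManagedClusterViewOAuthClient(ctx, operatorConfig, managedClusters, gvr); err != nil {
		return err
	}
	return nil
}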

@TheRealJon
Member

  • Make the managed cluster controller idempotent. This will require changes to my original logic because I didn't consider this. When a managed cluster is detached, we need to make sure all related resources created by our controller get deleted. We also need to make sure we clean up when no ManagedClusters are found or the ManagedCluster API is not found on the cluster (ACM isn't installed).

We'll also need to consider this when we add the feature flag. Going from true to false state will mean that we need to clean up.

Member

@jhadvig jhadvig left a comment

Hey @florkbr @TheRealJon thanks for the PR with your commits. The work looks really good!
Added a couple of comments.


// Get a list of ManagedCluster resources, degraded if error is returned
managedClusterClientConfigs, managedClusterClientConfigValidationErr, managedClusterClientConfigValidationErrReason := c.ValidateManagedClusterClientConfigs(ctx, operatorConfig, controllerContext.Recorder())
statusHandler.AddConditions(status.HandleProgressingOrDegraded("ManagedClusterValidation", managedClusterClientConfigValidationErrReason, managedClusterClientConfigValidationErr))
Member

Here and below as well, I would suggest reducing the number of condition type prefixes.

Suggested change
statusHandler.AddConditions(status.HandleProgressingOrDegraded("ManagedClusterValidation", managedClusterClientConfigValidationErrReason, managedClusterClientConfigValidationErr))
statusHandler.AddConditions(status.HandleProgressingOrDegraded("ManagedClusterSync", managedClusterClientConfigValidationErrReason, managedClusterClientConfigValidationErr))

The issue here is that each type prefix will register conditions that we need to maintain. For example calling status.HandleProgressingOrDegraded("ManagedClusterValidation" ...) will register these two conditions:

  • ManagedClusterValidationProgressing
  • ManagedClusterValidationDegraded

For that reason we should reduce those prefixes to only the ones that make sense for debugging issues, and instead make the error reasons clear.

}

// Return slice of clusterv1.ClientConfigs that have been validated or error and reaons if unable to list ManagedClusters
func (c *ManagedClusterController) ValidateManagedClusterClientConfigs(ctx context.Context, operatorConfig *operatorv1.Console, recorder events.Recorder) (map[string]*clusterv1.ClientConfig, error, string) {
Member

nit: error should go as last return value

Suggested change
func (c *ManagedClusterController) ValidateManagedClusterClientConfigs(ctx context.Context, operatorConfig *operatorv1.Console, recorder events.Recorder) (map[string]*clusterv1.ClientConfig, error, string) {
func (c *ManagedClusterController) ValidateManagedClusterClientConfigs(ctx context.Context, operatorConfig *operatorv1.Console, recorder events.Recorder) (map[string]*clusterv1.ClientConfig, string, error) {

return statusHandler.FlushAndReturn(configSyncErr)
}

// Return slice of clusterv1.ClientConfigs that have been validated or error and reaons if unable to list ManagedClusters
Member

Suggested change
// Return slice of clusterv1.ClientConfigs that have been validated or error and reaons if unable to list ManagedClusters
// Return slice of clusterv1.ClientConfigs that have been validated or error and its reason, if unable to list ManagedClusters

return validatedClientConfigs, nil, ""
}

// Using ManagedCluster.spec.ManagedClusterClientConfigs, sync ConfigMaps containing the API server CA bundle for each managed cluster
Member

Great to see all those godocs 👍

Comment thread pkg/console/subresource/util/util.go
OAuthServingCert(useDefaultCAFile).
APIServerURL(getApiUrl(infrastructureConfig)).
InactivityTimeout(inactivityTimeoutSeconds).
ManagedClusterConfigFile(fmt.Sprintf("%v/%v", api.ManagedClusterConfigMountDir, api.ManagedClusterConfigKey)).
Member

Suggested change
ManagedClusterConfigFile(fmt.Sprintf("%v/%v", api.ManagedClusterConfigMountDir, api.ManagedClusterConfigKey)).
ManagedClusterConfigFile(fmt.Sprintf("%s/%s", api.ManagedClusterConfigMountDir, api.ManagedClusterConfigKey)).

Member

Is this file always present on the cluster? What does it contain if the cluster is not part of ACM?

Member

Wondering if we shouldn't pass canMountManagedClusterConfig to the DefaultConfigMap() method and, based on that bool, either pass fmt.Sprintf("%s/%s", api.ManagedClusterConfigMountDir, api.ManagedClusterConfigKey) or not?
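
A rough sketch of that idea; configBuilder stands in for whatever builder the chain above hangs off of, the fluent methods are assumed to return the builder, and canMountManagedClusterConfig would be plumbed in by the caller:

// Only reference the managed cluster config file when the backing ConfigMap
// and volume mount actually exist; otherwise leave the field unset.
cliConfig := configBuilder.
	OAuthServingCert(useDefaultCAFile).
	APIServerURL(getApiUrl(infrastructureConfig)).
	InactivityTimeout(inactivityTimeoutSeconds)

if canMountManagedClusterConfig {
	cliConfig = cliConfig.ManagedClusterConfigFile(fmt.Sprintf("%s/%s", api.ManagedClusterConfigMountDir, api.ManagedClusterConfigKey))
}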

// Degraded if managed cluster config map is present but doesn't have the correct data key
_, ok := managedClusterConfigMap.Data[api.ManagedClusterConfigKey]
if !ok {
return false, "MissingManagedClusterConfig", fmt.Errorf("%v ConfigMap is missing %v data key", api.ManagedClusterConfigMapName, api.ManagedClusterConfigKey)
Member

Suggested change
return false, "MissingManagedClusterConfig", fmt.Errorf("%v ConfigMap is missing %v data key", api.ManagedClusterConfigMapName, api.ManagedClusterConfigKey)
return false, "MissingManagedClusterConfig", fmt.Errorf("%s ConfigMap is missing %s data key", api.ManagedClusterConfigMapName, api.ManagedClusterConfigKey)

// Create managed cluster views for oauth clients
managedClusterViewOAuthClients, managedClusterViewOAuthClientErr, managedClusterViewOAuthClientErrReason := c.SyncManagedClusterViewOAuthClient(ctx, operatorConfig)
statusHandler.AddConditions(status.HandleProgressingOrDegraded("ManagedClusterViewOAuthClientSync", managedClusterViewOAuthClientErrReason, managedClusterViewOAuthClientErr))

Member

Why aren't we erroring out on error here and below?

if managedClusterViewOAuthClientErr != nil {
	return statusHandler.FlushAndReturn(managedClusterViewOAuthClientErr)
}

@florkbr florkbr changed the title [WIP] CONSOLE 2919: create OAuth clients on spoke clusters - rebased CONSOLE 2919: create OAuth clients on spoke clusters - rebased Nov 2, 2021
@openshift-ci openshift-ci Bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Nov 2, 2021
@florkbr
Contributor Author

florkbr commented Nov 2, 2021

/retest

2 similar comments
@florkbr
Contributor Author

florkbr commented Nov 2, 2021

/retest

@florkbr
Contributor Author

florkbr commented Nov 3, 2021

/retest

Member

@jhadvig jhadvig left a comment

/approve
/lgtm

@jhadvig
Member

jhadvig commented Nov 4, 2021

Tagging the PR to merge to the feature branch. It would be good to address the pending comments.

@openshift-ci openshift-ci Bot added the lgtm Indicates that a PR is ready to be merged. label Nov 4, 2021
@openshift-ci
Contributor

openshift-ci Bot commented Nov 4, 2021

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: florkbr, jhadvig

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci Bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Nov 4, 2021
@jhadvig
Member

jhadvig commented Nov 4, 2021

/test e2e-aws-operator

@openshift-bot
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@jhadvig
Member

jhadvig commented Nov 4, 2021

/test e2e-aws-operator

@jhadvig
Member

jhadvig commented Nov 4, 2021

=== RUN   TestUnmanaged
    unmanaged_test.go:12: waiting for setup to reach settled state...
    unmanaged_test.go:13: changing console operator state to 'Unmanaged'...
    framework.go:71: deleting console's console-config ConfigMap...
    framework.go:71: deleting console's oc-cli-downloads ConsoleCLIDownloads...
    framework.go:71: deleting console's odo-cli-downloads ConsoleCLIDownloads...
    framework.go:71: deleting console's console Deployment...
    framework.go:71: deleting console's downloads Deployment...
    framework.go:71: deleting console's console Route...
    framework.go:71: deleting console's console Service...
    unmanaged_test.go:28: validating that the operator does not recreate deleted resources when ManagementState:Unmanaged...
validating console resources have been removed... true
validating console resources have been removed... true
E1104 18:42:38.865559   10271 request.go:1027] Unexpected error when reading response body: read tcp 10.131.71.50:39510->52.8.29.89:6443: read: connection reset by peer
validating console resources have been removed... false
    unmanaged_test.go:31: unexpected error when reading response body. Please retry. Original error: read tcp 10.131.71.50:39510->52.8.29.89:6443: read: connection reset by peer
    unmanaged_test.go:18: waiting for cleanup to reach settled state...
--- FAIL: TestUnmanaged (177.39s)
=== RUN   TestEditUnmanagedConfigMap
    unmanaged_test.go:12: waiting for setup to reach settled state...
    unmanaged_test.go:13: changing console operator state to 'Unmanaged'...
    util.go:28: patching Data on the console ConfigMap
    util.go:35: polling for patched Data on the console ConfigMap
    unmanaged_test.go:41: error: timed out waiting for the condition
    unmanaged_test.go:18: waiting for cleanup to reach settled state...
    console-operator.go:370: waited 10 seconds to reach settled state...
--- FAIL: TestEditUnmanagedConfigMap (33.79s)

/test e2e-aws-operator

@openshift-bot
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

3 similar comments
@openshift-bot
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@jhadvig jhadvig merged commit 33ecbfd into openshift:master-multi-cluster-feature Nov 10, 2021
florkbr added a commit to florkbr/console-operator that referenced this pull request Dec 13, 2021

Labels

approved: Indicates a PR has been approved by an approver from all required OWNERS files.
lgtm: Indicates that a PR is ready to be merged.
