Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(api): add multi-namespace ClusterCryostat API #520

Merged
merged 21 commits into from
Feb 23, 2023

Conversation

ebaron
Copy link
Member

@ebaron ebaron commented Feb 6, 2023

This PR adds a new ClusterCryostat CRD. This is essentially a cluster-scoped version of the Cryostat CRD with a couple of additions. Instead of installing Cryostat into the same namespace of the CR, the ClusterCryostat CRD has an Spec.InstallNamespace property. Then there is the Spec.TargetNamespaces property where the user can specify which namespaces they want this Cryostat instance to work with. The operator accomplishes this by creating a RoleBinding in each of these namespaces, binding the CR's service account to the Cryostat Role (now a Cluster Role to save resources).

The reasoning for making a new CRD instead of using the existing Cryostat is to guard against unexpected privilege escalation. OLM-installed operators automatically aggregate view, edit, and admin permissions for each CRD to the corresponding Cluster Roles using role aggregation. A cluster-admin might give an unprivileged user admin or edit permissions for a particular namespace by binding the admin or edit cluster role to that user. This cluster-admin may not expect that giving such permissions to a user would allow them to create a Cryostat CR that in turn creates a multi-namespace Cryostat installation through the Spec.TargetNamespaces field. By making ClusterCryostat cluster-scoped, the only way to allow unprivileged users to create one is by a cluster-admin explicitly creating a Cluster Role Binding for a user. This action makes it explicit that the user will be authorized to monitor as much of the cluster as they wish with Cryostat.

Here is an example for a user "developer" that has the "admin" role bound to it in the current namespace:

$ kubectl create -f config/samples/operator_v1beta1_clustercryostat.yaml 
Error from server (Forbidden): error when creating "config/samples/operator_v1beta1_clustercryostat.yaml": clustercryostats.operator.cryostat.io is forbidden: User "developer" cannot create resource "clustercryostats" in API group "operator.cryostat.io" at the cluster scope
$ kubectl create -f config/samples/operator_v1beta1_cryostat.yaml 
cryostat.operator.cryostat.io/cryostat-sample created

In order to not break API between minor releases and to allow unprivileged users to still create single-namespace Cryostat installations as before, the existing Cryostat CRD remains. Much of the code changes of this PR are to make the controller and test code able to handle an abstraction of the two CRDs, called CryostatInstance in the new model package.

I had to make test changes in quite a few places to avoid hardcoding the "cryostat" CR name for all created resources. This is necessary because for cluster-wide CRs we want to be able to test with two (differently named) CRs at once to ensure they don't conflict with each other.

To test in OpenShift:

  1. Build an operator bundle:
    make oci-build bundle bundle-build OPERATOR_IMG=quay.io/foo/cryostat-operator:tag BUNDLE_IMG=quay.io/foo/cryostat-operator-bundle:tag
    
  2. Push both images to Quay
  3. Install bundle with Operator SDK (install cert-manager first, if not already done):
    operator-sdk run bundle --install-mode AllNamespaces quay.io/foo/cryostat-operator-bundle:tag
    
  4. In the OpenShift Console, navigate to "Operators -> Installed Operators". Select "Cryostat Operator".
  5. Create a ClusterCryostat, filling in the new Install Namespace and Target Namespaces fields.
    Screenshot 2023-02-08 at 18-27-39 Create ClusterCryostat · Red Hat OpenShift
    Screenshot 2023-02-08 at 18-29-01 Create ClusterCryostat · Red Hat OpenShift

Fixes: #513

@ebaron ebaron added the feat New feature or request label Feb 6, 2023
@ebaron ebaron marked this pull request as ready for review February 14, 2023 00:43
@ebaron ebaron requested review from tthvo and andrewazores February 14, 2023 00:43
@andrewazores
Copy link
Member

Thanks for the explanation writeup, it sounds like it makes a lot of sense to me. Building and manual testing everything looks great too. It'll take some time to work through the patch review but I'll be looking at it this week.

Copy link
Member

@andrewazores andrewazores left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. I'm sure we'll find things that need adjusting between this and the various Cryostat changes that are also in the pipeline around this story, but we'll see what happens when everything gets done and tested together.

@ebaron
Copy link
Member Author

ebaron commented Feb 22, 2023

I've seen this error come up a couple times, but I'm not sure what causes it.

1.6771049505679452e+09	INFO	controllers.ClusterCryostat	Reconciling ClusterCryostat	{"Request.Name": "clustercryostat-sample"}
1.6771049505681043e+09	ERROR	controllers.ClusterCryostat	Error reading ClusterCryostat instance	{"Request.Name": "clustercryostat-sample", "error": "cache had type *v1beta1.Cryostat, but *v1beta1.ClusterCryostat was asked for"}
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:121
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:320
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:273
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:234

@ebaron
Copy link
Member Author

ebaron commented Feb 23, 2023

I've seen this error come up a couple times, but I'm not sure what causes it.

1.6771049505679452e+09	INFO	controllers.ClusterCryostat	Reconciling ClusterCryostat	{"Request.Name": "clustercryostat-sample"}
1.6771049505681043e+09	ERROR	controllers.ClusterCryostat	Error reading ClusterCryostat instance	{"Request.Name": "clustercryostat-sample", "error": "cache had type *v1beta1.Cryostat, but *v1beta1.ClusterCryostat was asked for"}
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:121
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:320
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:273
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:234

In the most recent occurrence, I noticed that I deployed the operator using the make deploy target. This causes the WATCH_NAMESPACES environment variable to be set to the current namespace. The leads to the controller manager's cache being restricted to this namespace. Perhaps the error is caused by this restriction, and just manifesting in an odd way.

From https://sdk.operatorframework.io/docs/building-operators/golang/operator-scope/#overview:

IMPORTANT: When a Manager instance is created in the main.go file, the Namespaces are set via Manager Options as described below. These Namespaces should be watched and cached for the Client which is provided by the Manager. Only clients provided by cluster-scoped Managers are able to manage cluster-scoped CRD’s.

If this is the cause of the error, perhaps we want to disable the ClusterCryostat controller for SingleNamespace operator installations. We could simply not start the ClusterCryostat controller and log a message.

@ebaron
Copy link
Member Author

ebaron commented Feb 23, 2023

I've seen this error come up a couple times, but I'm not sure what causes it.

1.6771049505679452e+09	INFO	controllers.ClusterCryostat	Reconciling ClusterCryostat	{"Request.Name": "clustercryostat-sample"}
1.6771049505681043e+09	ERROR	controllers.ClusterCryostat	Error reading ClusterCryostat instance	{"Request.Name": "clustercryostat-sample", "error": "cache had type *v1beta1.Cryostat, but *v1beta1.ClusterCryostat was asked for"}
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:121
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:320
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:273
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:234

In the most recent occurrence, I noticed that I deployed the operator using the make deploy target. This causes the WATCH_NAMESPACES environment variable to be set to the current namespace. The leads to the controller manager's cache being restricted to this namespace. Perhaps the error is caused by this restriction, and just manifesting in an odd way.

From https://sdk.operatorframework.io/docs/building-operators/golang/operator-scope/#overview:

IMPORTANT: When a Manager instance is created in the main.go file, the Namespaces are set via Manager Options as described below. These Namespaces should be watched and cached for the Client which is provided by the Manager. Only clients provided by cluster-scoped Managers are able to manage cluster-scoped CRD’s.

This last sentence doesn't seem to match up with what I'm seeing. The underlying Informer will restrict list/watches to a set of namespaces, if provided. This doesn't happen for cluster-scoped resources though:
https://github.com/kubernetes-sigs/controller-runtime/blob/f6f37e6cc1ec7b7d18a266a6614f86df211b1a0a/pkg/cache/internal/informers.go#L378-L508

If this is the cause of the error, perhaps we want to disable the ClusterCryostat controller for SingleNamespace operator installations. We could simply not start the ClusterCryostat controller and log a message.

I've tried this and still encounter the error. I'm now able to reliably reproduce it:

  1. Install the operator
  2. Create a ClusterCryostat.
  3. The operator should deploy Cryostat as expected.
  4. Restart the operator with kubectl rollout restart deploy cryostat-operator-controller-manager
  5. The above error should occur.

Still no idea why this is happening, but it's pretty bad. So I'm going to hold off on merging until we can figure this out. At least now we have a reproducer.

@ebaron
Copy link
Member Author

ebaron commented Feb 23, 2023

Fixed now, it was just a typo in the ClusterCryostatList struct... @andrewazores still good to merge?

@ebaron ebaron merged commit 1004a6a into cryostatio:multi-namespace-wip Feb 23, 2023
ebaron added a commit that referenced this pull request Feb 23, 2023
* Add ClusterCryostat type and controller

* Use abstraction for tests

* Make CR name in tests configurable

* Refactor tests to share between APIs

* Run tests for ClusterCryostat controller

* Implement RBAC per target namespace

* Pass CRYOSTAT_K8S_NAMESPACES to deployment

* Fix generated files with operator-sdk

* Use ClusterCryostat as initialization-resource

* Use Kustomize patch to adjust x-descriptors

* Use Kustomize patch to add x-descriptors to targetNamespace elements

* Regenerate bundle

* Add note on multi-tenancy

* Use common SetupWithManager

* Rename reconciler interface

* Use cluster role instead of role

* Restore role permission

* Add new cluster role to bundle, fix logging

* Rename Instance to Object

* Don't default to installNamespace

* Fix list item typo
ebaron added a commit that referenced this pull request Feb 23, 2023
* Add ClusterCryostat type and controller

* Use abstraction for tests

* Make CR name in tests configurable

* Refactor tests to share between APIs

* Run tests for ClusterCryostat controller

* Implement RBAC per target namespace

* Pass CRYOSTAT_K8S_NAMESPACES to deployment

* Fix generated files with operator-sdk

* Use ClusterCryostat as initialization-resource

* Use Kustomize patch to adjust x-descriptors

* Use Kustomize patch to add x-descriptors to targetNamespace elements

* Regenerate bundle

* Add note on multi-tenancy

* Use common SetupWithManager

* Rename reconciler interface

* Use cluster role instead of role

* Restore role permission

* Add new cluster role to bundle, fix logging

* Rename Instance to Object

* Don't default to installNamespace

* Fix list item typo
@ebaron ebaron linked an issue Feb 23, 2023 that may be closed by this pull request
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feat New feature or request
Projects
No open projects
Status: Done
Development

Successfully merging this pull request may close these issues.

[Story] Add API to allow users to install a multi-namespace Cryostat
2 participants