Allow BMO to watch all Namespaces #106
Changes from all commits
```diff
@@ -262,6 +262,21 @@ func newMetal3Containers(images *Images, config *metal3iov1alpha1.ProvisioningSpec
 	return containers
 }
 
+func getWatchNamespace(config *metal3iov1alpha1.ProvisioningSpec) corev1.EnvVar {
+	if config.WatchAllNamespaces {
+		return corev1.EnvVar{}
+	} else {
+		return corev1.EnvVar{
+			Name: "WATCH_NAMESPACE",
+			ValueFrom: &corev1.EnvVarSource{
+				FieldRef: &corev1.ObjectFieldSelector{
+					FieldPath: "metadata.namespace",
+				},
+			},
+		}
+	}
+}
+
 func createContainerMetal3BaremetalOperator(images *Images, config *metal3iov1alpha1.ProvisioningSpec) corev1.Container {
 	container := corev1.Container{
 		Name: "metal3-baremetal-operator",
```
```diff
@@ -280,14 +295,7 @@ func createContainerMetal3BaremetalOperator(images *Images, config *metal3iov1alpha1.ProvisioningSpec) corev1.Container {
 			inspectorCredentialsMount,
 		},
 		Env: []corev1.EnvVar{
-			{
-				Name: "WATCH_NAMESPACE",
-				ValueFrom: &corev1.EnvVarSource{
-					FieldRef: &corev1.ObjectFieldSelector{
-						FieldPath: "metadata.namespace",
-					},
-				},
-			},
+			getWatchNamespace(config),
 			{
 				Name: "POD_NAMESPACE",
 				ValueFrom: &corev1.EnvVarSource{
```
|
Are we OK with unconditionally watching all namespaces, or should we have a Provisioning CR option to enable it just for the hub-cluster use case? @dhellmann I recall we discussed both options before - in the regular IPI case, where BMH resources only exist in the openshift-machine-api namespace, will there be any overhead to this being unconditional?
Contributor
I don't think there would be significant overhead in always watching all namespaces in a normal cluster. There's a Kubernetes API to list all resources of a type, regardless of the namespace, so I would expect the client in the controller-runtime to use that instead of the API for listing resources within a namespace. There could be some user confusion if a user creates a host outside of the openshift-machine-api namespace.
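Illustratively, with the controller-runtime client the cluster-wide list described here is a single call, and restricting to a namespace is just an option on the same call. A sketch (the BareMetalHost import path is an assumption and varies by BMO version):

```go
import (
	"context"

	metal3v1alpha1 "github.com/metal3-io/baremetal-operator/apis/metal3.io/v1alpha1" // path is an assumption
	"sigs.k8s.io/controller-runtime/pkg/client"
)

// listHosts shows both list shapes; c is any controller-runtime client.
func listHosts(ctx context.Context, c client.Client) error {
	hosts := &metal3v1alpha1.BareMetalHostList{}

	// Cluster-wide: one API call covering every namespace.
	if err := c.List(ctx, hosts); err != nil {
		return err
	}

	// Namespaced: the same call restricted to openshift-machine-api.
	return c.List(ctx, hosts, client.InNamespace("openshift-machine-api"))
}
```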
I think that will be the next step, e.g. look at the work being done in the machine-api team to enable machines in a multi-namespace environment. IIUC the current plan is a machine-controller (and thus CAPBM) per namespace, so we'll need to sync up with folks working on that and figure out how to make it work on baremetal (current testing is only on AWS, I believe). @JoelSpeed is ^^ accurate, and can you point to any docs/scripts etc. that we can refer to re: running multi-namespace machine controllers? (I'm thinking it's probably simplest to start without Hive and just test Machine+BaremetalHost resources.)

In a normal OpenShift cluster we restrict the controllers to only watch a single namespace (openshift-machine-api). This entirely depends on what kinds of resources you are watching. For example, in the machine controller we need to load secrets for userdata and credentials. Normally you would use the controller-runtime cached client for this so that you get notified of changes to the secrets to trigger a reconcile. If you don't restrict this cache to a certain namespace, it will cache all secrets from all namespaces, which could not only impact memory usage but also crosses a security boundary (e.g. multi-tenant clusters must not list secrets from multiple namespaces).
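For reference, controller-runtime of roughly this vintage offers a way to bound the shared cache without going fully cluster-wide; a sketch, with function and variable names assumed:

```go
import (
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/cache"
)

// newScopedManager limits the shared informer cache (and therefore what
// the cached client can see, Secrets included) to an explicit list of
// namespaces instead of the whole cluster.
func newScopedManager(namespaces []string) (ctrl.Manager, error) {
	return ctrl.NewManager(ctrl.GetConfigOrDie(), ctrl.Options{
		NewCache: cache.MultiNamespacedCacheBuilder(namespaces),
	})
}
```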
Contributor
Author
@hardys In some earlier discussions around this, I recall that we said we cannot assume a Provisioning CR is going to be present in the multi-namespace scenario. Has that changed? @JoelSpeed the issues you mentioned above are valid. I think our team is viewing this approach only for 4.8 and aligning with MAO's approach for 4.9, which might include multiple instances of BMO or even CBO (still under review). The current assumption is that we can live with these limitations/concerns for one release.
Contributor
We do watch Secrets as well as the hosts, but only those associated with the hosts. Based on what you're saying, it sounds like we may want to change the deployment so that our single baremetal-operator doesn't have to watch all Secrets? I'm not sure we'll actually save any RAM that way, since we'll have N copies of the operator with smaller caches, but it does address the security boundary issue. We really can't afford to run multiple copies of Ironic, though. I think that means splitting the metal3 pod apart, which we didn't feel we would have time to do this release.

I spoke with @hardys yesterday, and I think given your current plans for 4.8 it makes sense to watch all namespaces for the first iteration and look into splitting per namespace later. Security boundaries are less important if all the hosts belong to the same person, for example. But I do still question removing the namespace-limiting functionality altogether; is it not useful to keep that for regular OCP clusters and just have the ability to configure whether it should be namespaced or watch everything?
Depending on what you do with the secrets, it may be better to use a non-caching client and Get instead of List. It looks like you're creating the secrets, though, so that's probably not possible here; otherwise you'd lose the event notification if they are modified out of band.
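A sketch of the non-caching pattern being suggested, using controller-runtime's client package (the function and variable names are illustrative assumptions, not BMO code):

```go
import (
	"context"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/types"
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

// getBMCSecret reads one Secret directly from the API server. Nothing is
// held in an informer cache, but no watch events are received either, so
// out-of-band edits would not trigger a reconcile.
func getBMCSecret(ctx context.Context, namespace, name string) (*corev1.Secret, error) {
	// client.New returns an uncached client that talks straight to the
	// API server.
	c, err := client.New(ctrl.GetConfigOrDie(), client.Options{})
	if err != nil {
		return nil, err
	}
	secret := &corev1.Secret{}
	err = c.Get(ctx, types.NamespacedName{Namespace: namespace, Name: name}, secret)
	return secret, err
}
```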
Contributor
Author
The CBO is not responsible for creating the Secrets for BMC credentials and does not try to access them. I think the above comment is meant for those Secrets, and we need to make sure within BMO that we follow the above guidelines, i.e. do a Get for a specific namespace vs. a List across all namespaces, and make sure we are using the right client to access the Secrets.
@sadasu I don't recall saying that - I don't see any reason we couldn't put a configuration option in the Provisioning CR, so that the current single-namespace behavior is maintained but watching all namespaces can be enabled for the hub-cluster use case. The main disadvantage of that approach is probably that it expands the test matrix, as we potentially have to care about the situation where we switch from single- to multi-namespace mode, as mentioned in metal3-io/baremetal-operator#797 (comment).
Contributor
Author
cb47870 is taking care of adding this config to the Provisioning CR.
Contributor
Question: what will be the impact, if any, on leader election?
Contributor
Author
It appears that leader election needs a Namespace to be passed in: https://github.com/kubernetes-sigs/controller-runtime/blob/0b554ebb54901a88427d4a89014af0d2dba1bbcf/pkg/manager/manager.go#L319. With this change, BMO will not have the Namespace passed in via WATCH_NAMESPACE. But with #107 we are passing in POD_NAMESPACE, which is the correct Namespace to use for leader election. /cc @asalkeld
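A sketch of the manager wiring this implies (the election ID and exact option set are assumptions for illustration; #107 is where POD_NAMESPACE is introduced):

```go
import (
	"os"

	ctrl "sigs.k8s.io/controller-runtime"
)

func newManager() (ctrl.Manager, error) {
	return ctrl.NewManager(ctrl.GetConfigOrDie(), ctrl.Options{
		// May be empty when WatchAllNamespaces is set: watch everything.
		Namespace: os.Getenv("WATCH_NAMESPACE"),

		LeaderElection:   true,
		LeaderElectionID: "baremetal-operator", // hypothetical ID
		// Keep the election lock in the pod's own namespace rather than
		// deriving it from the (possibly empty) watch namespace.
		LeaderElectionNamespace: os.Getenv("POD_NAMESPACE"),
	})
}
```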
We generally try to avoid using `bool` fields in the API, because it makes it difficult to expand or redefine them later. Here I would suggest an enum with two values, `Single` and `All`, using `All` as the default.
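A sketch of what that enum could look like in the Provisioning API (type and value names are assumptions, not the merged design):

```go
// WatchNamespaceMode selects how broadly BMO watches for BareMetalHosts.
// +kubebuilder:validation:Enum=Single;All
type WatchNamespaceMode string

const (
	// WatchNamespaceSingle restricts BMO to its own namespace.
	WatchNamespaceSingle WatchNamespaceMode = "Single"
	// WatchNamespaceAll watches BareMetalHosts in every namespace.
	WatchNamespaceAll WatchNamespaceMode = "All"
)
```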