Merged
8 changes: 8 additions & 0 deletions api/v1alpha1/provisioning_types.go
@@ -99,6 +99,14 @@ type ProvisioningSpec struct {
// accessible from the machine networks. User should provide two IPs on
// the external network that would be used for provisioning services.
ProvisioningNetwork ProvisioningNetwork `json:"provisioningNetwork,omitempty"`

// WatchAllNamespaces provides a way to explicitly allow use of this
// Provisioning configuration across all Namespaces. It is an
// optional configuration which defaults to false and in that state
// will be used to provision baremetal hosts in only the
// openshift-machine-api namespace. When set to true, this provisioning
// configuration would be used for baremetal hosts across all namespaces.
WatchAllNamespaces bool `json:"watchAllNamespaces,omitempty"`
Contributor:
We generally try to avoid using bool fields in the API, because it makes it difficult to expand or redefine them later. Here I would suggest an enum with 2 values Single and All, using All as the default.
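A minimal sketch of the enum the reviewer is suggesting, assuming hypothetical names (the PR as merged kept the bool field):

```go
// Illustrative only: an enum-style replacement for the WatchAllNamespaces
// bool, along the lines the reviewer suggests. None of these identifiers
// appear in the PR itself.
package main

import "fmt"

// WatchNamespacesMode selects which namespaces the provisioning
// services watch for BareMetalHost resources.
// +kubebuilder:validation:Enum=Single;All
type WatchNamespacesMode string

const (
	// WatchModeSingle restricts watches to the openshift-machine-api namespace.
	WatchModeSingle WatchNamespacesMode = "Single"
	// WatchModeAll watches BareMetalHosts in every namespace
	// (the default the reviewer proposes).
	WatchModeAll WatchNamespacesMode = "All"
)

func main() {
	fmt.Println(WatchModeSingle, WatchModeAll)
}
```

An enum leaves room for later modes (for example, an explicit namespace list) without redefining the field, which is the expansion concern the comment raises.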

}

// ProvisioningStatus defines the observed state of Provisioning
3 changes: 3 additions & 0 deletions config/crd/bases/metal3.io_provisionings.yaml
@@ -56,6 +56,9 @@ spec:
provisioningOSDownloadURL:
description: ProvisioningOSDownloadURL is the location from which the OS Image used to boot baremetal host machines can be downloaded by the metal3 cluster.
type: string
watchAllNamespaces:
description: WatchAllNamespaces provides a way to explicitly allow use of this Provisioning configuration across all Namespaces. It is an optional configuration which defaults to false and in that state will be used to provision baremetal hosts in only the openshift-machine-api namespace. When set to true, this provisioning configuration would be used for baremetal hosts across all namespaces.
type: boolean
type: object
status:
description: ProvisioningStatus defines the observed state of Provisioning
34 changes: 17 additions & 17 deletions config/rbac/role.yaml
@@ -91,6 +91,23 @@ rules:
- infrastructures/status
verbs:
- get
- apiGroups:
- metal3.io
resources:
- baremetalhosts
verbs:
- get
- list
- patch
- update
- watch
- apiGroups:
- metal3.io
resources:
- baremetalhosts/finalizers
- baremetalhosts/status
verbs:
- update
- apiGroups:
- metal3.io
resources:
@@ -198,23 +215,6 @@ rules:
- patch
- update
- watch
- apiGroups:
- metal3.io
resources:
- baremetalhosts
verbs:
- get
- list
- patch
- update
- watch
- apiGroups:
- metal3.io
resources:
- baremetalhosts/finalizers
- baremetalhosts/status
verbs:
- update
- apiGroups:
- monitoring.coreos.com
resources:
4 changes: 2 additions & 2 deletions controllers/provisioning_controller.go
@@ -67,8 +67,6 @@ type ensureFunc func(*provisioning.ProvisioningInfo) (bool, error)

// +kubebuilder:rbac:namespace=openshift-machine-api,groups="",resources=configmaps,verbs=get;list;watch;create;update;patch;delete
// +kubebuilder:rbac:namespace=openshift-machine-api,groups="",resources=secrets,verbs=get;list;watch;create;update;patch;delete
// +kubebuilder:rbac:namespace=openshift-machine-api,groups=metal3.io,resources=baremetalhosts,verbs=get;list;watch;update;patch
// +kubebuilder:rbac:namespace=openshift-machine-api,groups=metal3.io,resources=baremetalhosts/status;baremetalhosts/finalizers,verbs=update
// +kubebuilder:rbac:namespace=openshift-machine-api,groups=security.openshift.io,resources=securitycontextconstraints,verbs=use
// +kubebuilder:rbac:namespace=openshift-machine-api,groups=apps,resources=deployments,verbs=get;list;watch;create;update;patch;delete
// +kubebuilder:rbac:namespace=openshift-machine-api,groups=apps,resources=daemonsets,verbs=get;list;watch;create;update;patch;delete
@@ -83,6 +81,8 @@ type ensureFunc func(*provisioning.ProvisioningInfo) (bool, error)
// +kubebuilder:rbac:groups=metal3.io,resources=provisionings,verbs=get;list;watch;create;update;patch;delete
// +kubebuilder:rbac:groups=metal3.io,resources=provisionings/status,verbs=get;update;patch
// +kubebuilder:rbac:groups=metal3.io,resources=provisionings/finalizers,verbs=get;list;watch;create;update;patch;delete
// +kubebuilder:rbac:groups=metal3.io,resources=baremetalhosts,verbs=get;list;watch;update;patch
// +kubebuilder:rbac:groups=metal3.io,resources=baremetalhosts/status;baremetalhosts/finalizers,verbs=update
// +kubebuilder:rbac:groups="",resources=secrets,verbs=get;list;watch;create;update;patch;delete
// +kubebuilder:rbac:groups=apps,resources=deployments,verbs=get;list;watch;create;update;patch;delete
// +kubebuilder:rbac:groups=apps,resources=daemonsets,verbs=get;list;watch;create;update;patch;delete
@@ -56,6 +56,9 @@ spec:
provisioningOSDownloadURL:
description: ProvisioningOSDownloadURL is the location from which the OS Image used to boot baremetal host machines can be downloaded by the metal3 cluster.
type: string
watchAllNamespaces:
description: WatchAllNamespaces provides a way to explicitly allow use of this Provisioning configuration across all Namespaces. It is an optional configuration which defaults to false and in that state will be used to provision baremetal hosts in only the openshift-machine-api namespace. When set to true, this provisioning configuration would be used for baremetal hosts across all namespaces.
type: boolean
type: object
status:
description: ProvisioningStatus defines the observed state of Provisioning
34 changes: 17 additions & 17 deletions manifests/0000_31_cluster-baremetal-operator_05_rbac.yaml
@@ -68,23 +68,6 @@ rules:
- patch
- update
- watch
- apiGroups:
- metal3.io
resources:
- baremetalhosts
verbs:
- get
- list
- patch
- update
- watch
- apiGroups:
- metal3.io
resources:
- baremetalhosts/finalizers
- baremetalhosts/status
verbs:
- update
- apiGroups:
- monitoring.coreos.com
resources:
@@ -196,6 +179,23 @@ rules:
- infrastructures/status
verbs:
- get
- apiGroups:
- metal3.io
resources:
- baremetalhosts
verbs:
- get
- list
- patch
- update
- watch
- apiGroups:
- metal3.io
resources:
- baremetalhosts/finalizers
- baremetalhosts/status
verbs:
- update
- apiGroups:
- metal3.io
resources:
64 changes: 64 additions & 0 deletions provisioning/baremetal_config_test.go
@@ -17,6 +17,7 @@ package provisioning

import (
"fmt"
"strconv"
"strings"
"testing"

@@ -408,3 +409,66 @@ func (pb *provisioningBuilder) ProvisioningOSDownloadURL(value string) *provisioningBuilder {
pb.ProvisioningSpec.ProvisioningOSDownloadURL = value
return pb
}

func enableMultiNamespace() *provisioningBuilder {
return &provisioningBuilder{
metal3iov1alpha1.ProvisioningSpec{
ProvisioningInterface: "",
ProvisioningIP: "172.30.20.3",
ProvisioningNetworkCIDR: "172.30.20.0/24",
ProvisioningOSDownloadURL: "http://172.22.0.1/images/rhcos-44.81.202001171431.0-openstack.x86_64.qcow2.gz?sha256=e98f83a2b9d4043719664a2be75fe8134dc6ca1fdbde807996622f8cc7ecd234",
ProvisioningNetwork: "Disabled",
WatchAllNamespaces: true,
},
}
}

func disableMultiNamespace() *provisioningBuilder {
return &provisioningBuilder{
metal3iov1alpha1.ProvisioningSpec{
ProvisioningInterface: "",
ProvisioningIP: "172.30.20.3",
ProvisioningNetworkCIDR: "172.30.20.0/24",
ProvisioningOSDownloadURL: "http://172.22.0.1/images/rhcos-44.81.202001171431.0-openstack.x86_64.qcow2.gz?sha256=e98f83a2b9d4043719664a2be75fe8134dc6ca1fdbde807996622f8cc7ecd234",
ProvisioningNetwork: "Disabled",
WatchAllNamespaces: false,
},
}
}

func (pb *provisioningBuilder) WatchAllNamespaces(value bool) *provisioningBuilder {
pb.ProvisioningSpec.WatchAllNamespaces = value
return pb
}

func TestWatchAllNamespaces(t *testing.T) {
tCases := []struct {
name string
spec *metal3iov1alpha1.ProvisioningSpec
expectedValue bool
}{
{
name: "Default",
spec: managedProvisioning().build(),
expectedValue: false,
},
{
name: "Single Namespace",
spec: disableMultiNamespace().build(),
expectedValue: false,
},
{
name: "Multiple Namespaces",
spec: enableMultiNamespace().build(),
expectedValue: true,
},
}
for _, tc := range tCases {
t.Run(tc.name, func(t *testing.T) {
t.Logf("Testing tc : %s", tc.name)
assert.NotNil(t, tc.spec.WatchAllNamespaces)
assert.Equal(t, tc.expectedValue, tc.spec.WatchAllNamespaces, fmt.Sprintf("WatchAllNamespaces : Expected : %s Actual : %s", strconv.FormatBool(tc.expectedValue), strconv.FormatBool(tc.spec.WatchAllNamespaces)))
return
})
}
}
24 changes: 16 additions & 8 deletions provisioning/baremetal_pod.go
@@ -262,6 +262,21 @@ func newMetal3Containers(images *Images, config *metal3iov1alpha1.ProvisioningSpec
return containers
}

func getWatchNamespace(config *metal3iov1alpha1.ProvisioningSpec) corev1.EnvVar {
if config.WatchAllNamespaces {
return corev1.EnvVar{}
} else {
return corev1.EnvVar{
Name: "WATCH_NAMESPACE",
ValueFrom: &corev1.EnvVarSource{
FieldRef: &corev1.ObjectFieldSelector{
FieldPath: "metadata.namespace",
},
},
}
}
}

func createContainerMetal3BaremetalOperator(images *Images, config *metal3iov1alpha1.ProvisioningSpec) corev1.Container {
container := corev1.Container{
Name: "metal3-baremetal-operator",
@@ -280,14 +295,7 @@ func createContainerMetal3BaremetalOperator(images *Images, config *metal3iov1alpha1.ProvisioningSpec) corev1.Container {
inspectorCredentialsMount,
},
Env: []corev1.EnvVar{
{
Name: "WATCH_NAMESPACE",
Comment:
Are we Ok with unconditionally watching all namespaces, or should we have a provisioning CR option to enable it just for the hub-cluster use-case?

@dhellmann I recall we discussed both options before - in the regular IPI case where BMH resources only exist in the openshift-machine-api namespace, will there be any overhead to this being unconditional?

Contributor:
I don't think there would be significant overhead in always watching all namespaces in a normal cluster. There's a kubernetes API to list all resources of a type, regardless of the namespace, so I would expect the client in the controller-runtime to use that instead of the API for listing resources within a namespace.

There could be some user confusion if a user creates a host outside of the openshift-machine-api namespace and it is inspected but cannot be used for provisioning because CAPBM won't see it.

Comment:
There could be some user confusion if a user creates a host outside of the openshift-machine-api namespace and it is inspected but cannot be used for provisioning because CAPBM won't see it.

I think that will be the next step, e.g look at the work being done in the machine-api team to enable machines in a multi-namespace environment, IIUC the current plan is a machine-controller (and thus CAPBM) per-namespace, so we'll need to sync up with folks working on that and figure out how to make it work on baremetal (current testing is only on AWS I believe)

@JoelSpeed is ^^ accurate, and can you point to any docs/scripts etc that we can refer to re running multi-namespace machine controllers (I'm thinking it's probably simplest to start without hive and just test Machine+BaremetalHost resources?)

Comment:
In a normal OpenShift cluster we restrict the controllers to only watch a single namespace (openshift-machine-api), so if you were to create a machine outside of this namespace, nothing would happen, and that is expected.

I don't think there would be significant overhead in always watching all namespaces in a normal cluster.

This entirely depends on what kinds of resources you are watching. For example, in the machine controller we need to load secrets for userdata and credentials. Normally you would use the controller-runtime cached client for this so that you get notified of changes to the secrets to trigger a reconcile. If you don't restrict this cache to a certain namespace, it will cache all secrets from all namespaces, which not only could impact memory usage, crosses a security boundary (eg multi-tenant clusters must not list secrets from multiple namespaces).

Contributor Author (@sadasu, Feb 16, 2021):
Are we Ok with unconditionally watching all namespaces, or should we have a provisioning CR option to enable it just for the hub-cluster use-case?

@hardys In some earlier discussions around this, I recall that we said we cannot assume Provisioning CR is going to be present in the multi-namespace scenario. Has that changed?

@JoelSpeed the issues you mentioned above are valid. I think our team is viewing this approach only for 4.8 and aligning with MAO's approach for 4.9 which might include multiple instances of BMO or even CBO (still under review). The current assumption is that we can live with these limitations/concerns listed above for 1 release.

Contributor:
In a normal OpenShift cluster we restrict the controllers to only watch a single namespace (openshift-machine-api), so if you were to create a machine outside of this namespace, nothing would happen, and that is expected.

I don't think there would be significant overhead in always watching all namespaces in a normal cluster.

This entirely depends on what kinds of resources you are watching. For example, in the machine controller we need to load secrets for userdata and credentials. Normally you would use the controller-runtime cached client for this so that you get notified of changes to the secrets to trigger a reconcile. If you don't restrict this cache to a certain namespace, it will cache all secrets from all namespaces, which not only could impact memory usage, crosses a security boundary (eg multi-tenant clusters must not list secrets from multiple namespaces).

We do watch Secrets as well as the hosts, but only those associated with the hosts. Based on what you're saying, it sounds like we may want to change the deployment so that our single baremetal-operator doesn't have to watch all Secrets? I'm not sure we'll actually save any RAM that way, since we'll have N copies of the operator with smaller caches, but it does address the security boundary issue.

We really can't afford to run multiple copies of Ironic, though. I think that means splitting the metal3 pod apart, which we didn't feel we would have time to do this release.

Comment:
I spoke with @hardys yesterday and I think given your current plans for 4.8, it makes sense to be able to watch all namespaces for the first iteration and look into later splitting into per namespace. Security boundaries are less important if all the hosts belong to the same person for example.

But I do still question removing the namespace limiting functionality altogether, is it not useful to keep that for regular OCP clusters and just have the ability to configure whether it should be namespaced or watch everything?

We do watch Secrets

Depending on what you do with the secrets, it may be better to use a non-caching client and Get instead of list. Looks like you're creating the secrets though so that's probably not possible here, else you'd lose the event notification if they are modified out of band

Contributor Author:
The CBO is not responsible for creating the Secrets for BMC credentials and does not try to access them. I think the above comment is meant for those Secrets and we need to make sure within BMO that we follow the above guidelines i.e do a GET for a specific Namespace vs a List for all Namespaces etc. Also, make sure we are using the right client to access the Secrets.

Comment (@hardys, Feb 22, 2021):
I recall that we said we cannot assume Provisioning CR is going to be present in the multi-namespace scenario. Has that changed?

@sadasu I don't recall saying that - I don't see any reason we couldn't put some configuration option in the provisioning CR, so that the current single-namespace behavior is maintained, but for the hub-cluster use-case watching all namespaces could be enabled?

The main disadvantage of that approach is probably that it expands the test matrix, as we potentially have to care about the situation where we switch from single->multi namespace mode as mentioned in metal3-io/baremetal-operator#797 (comment)

Contributor Author:
Commit cb47870 takes care of adding this config to the Provisioning CR.

ValueFrom: &corev1.EnvVarSource{
Contributor:
Question: what will be the impact, if any, on LeaderElectionNamespace for BMO in case of empty namespace?

Contributor Author:
It appears that leader election needs a Namespace to be passed in https://github.com/kubernetes-sigs/controller-runtime/blob/0b554ebb54901a88427d4a89014af0d2dba1bbcf/pkg/manager/manager.go#L319.

With this change, BMO will not have the Namespace passed in via WATCH_NAMESPACE. But, with #107 we are passing in the POD_NAMESPACE which is the correct Namespace to be used for leader election. /cc @asalkeld

FieldRef: &corev1.ObjectFieldSelector{
FieldPath: "metadata.namespace",
},
},
},
getWatchNamespace(config),
{
Name: "POD_NAMESPACE",
ValueFrom: &corev1.EnvVarSource{
Expand Down