Impl OverprovisionControl using OpenShift's ClusterResourceQuota#1282
openshift-merge-robot merged 7 commits into red-hat-storage:master.
Conversation
Force-pushed from e1d5ead to 5c4768b.
| PVs overprovisioning relative to the effective usable storage capacity. | ||
| items: | ||
| description: OverprovisionControlSpec defines the allowed overprovisioning | ||
| PC consumtion from the underlying cluster. This may be either | ||
| as an absolute value or as a percentage from the overall effective | ||
| capacity. When defined in terms of percentages, zero value indicates | ||
| no over-provisioning. |
Does "no over-provisioning" mean that the limits are not checked, or that consumption is limited to 100% of the available capacity?
Following a review by @jarrpa, this needs to be fixed: either have explicit over-provisioning or none at all.
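As a sketch of the semantics under discussion (the function name, the headroom interpretation, and the int64-bytes representation are all assumptions for illustration, not the actual ocs-operator code): with a percentage of 0 the hard limit collapses to exactly the effective capacity, i.e. no over-provisioning.

```go
package main

import "fmt"

// hardLimitBytes is a hypothetical sketch of the discussed semantics:
// an OverprovisionControl entry carries either an absolute capacity or a
// percentage relative to the effective usable capacity. In the percentage
// form, 0 means "no over-provisioning", i.e. the limit equals 100% of the
// effective capacity.
func hardLimitBytes(effectiveCapacity int64, capacity *int64, percentage int64) int64 {
	if capacity != nil {
		// An explicit absolute value takes precedence when set.
		return *capacity
	}
	// The percentage is interpreted as extra headroom above the capacity.
	return effectiveCapacity + effectiveCapacity*percentage/100
}

func main() {
	oneTiB := int64(1) << 40
	fmt.Println(hardLimitBytes(oneTiB, nil, 0))  // no over-provisioning: exactly 1 TiB
	fmt.Println(hardLimitBytes(oneTiB, nil, 50)) // 50% headroom: 1.5 TiB
	abs := int64(2) << 40
	fmt.Println(hardLimitBytes(oneTiB, &abs, 0)) // explicit 2 TiB hard limit
}
```

Under this reading, "no over-provisioning" is the 100%-of-capacity case, not "limits unchecked", which is exactly the ambiguity the review flags.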
| } | ||
| requestName := resourceRequestName(opc.StorageClassName) | ||
| storageQuota := &quotav1.ClusterResourceQuota{ | ||
| TypeMeta: metav1.TypeMeta{APIVersion: quotav1.SchemeGroupVersion.String(), Kind: clusterResourceQuotaKind}, |
I don't think the TypeMeta is required here - that is automatically taken care of when you create the quotav1.ClusterResourceQuota.
You are correct! nice.
You are correct. Better to remove it
| }, | ||
| }, | ||
| } | ||
| currentQuota, err := r.QuotaV1.ClusterResourceQuotas().Get(context.TODO(), storageQuota.Name, metav1.GetOptions{}) |
Wouldn't r.Client.Get() work here?
It will not work. OpenShift's ClusterResourceQuota has its own client interface. I do not know why they did it this way, but I tried to follow their conventions. See for example:
This is infuriating and they should feel bad. 😤
It should work. Controller runtime client works with any runtime.Object provided that correct scheme is added.
@umangachapagain Could you please elaborate on how to add the correct scheme in this case?
Right, a search is basically a get call. For create you will anyway need to specify the namespace for either client, correct?
On second thoughts - In which namespace is the ocs operator creating these ClusterResourceQuotas?
Should it only create the ClusterResourceQuotas for the entries in the StorageCluster and create them in the openshift-storage namespace?
The users can create other ClusterResourceQuotas elsewhere but OCS should not care about those.
This is a very good question, which touches the core issue here: unlike all other resources in OCS, ClusterResourceQuota is not associated with any specific namespace. Think of it as a "global" resource, which limits resource consumption from other (labeled) namespaces.
global resources exist even without OpenShift and ClusterResourceQuota. controller-runtime client can handle it. We do that for StorageClass.
OK, it looks like it is doable after all (i.e., avoiding the ClusterResourceQuota-specific interface). Many thanks @nbalacha and @umangachapagain for your inputs.
| storageQuota.Spec.DeepCopyInto(&currentQuota.Spec) | ||
| _, err = r.QuotaV1.ClusterResourceQuotas().Update(context.TODO(), currentQuota, metav1.UpdateOptions{}) | ||
| if err != nil { | ||
| r.Log.Error(err, "Update ClusterResourceQuota failed") |
Please include the name of the ClusterResourceQuota here.
| } | ||
|
|
||
| func storageQuotaName(storageClassName string, idx int) string { | ||
| subKey := storageClassName |
If idx is the index in the array this could be problematic if an entry in the middle is deleted. The names of the later entries would change post that.
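One way to avoid the index instability noted above is to derive the quota name from the storage class name alone; the sketch below is a hypothetical illustration (the `-quota` suffix and the lowercasing are assumptions, not the PR's actual naming scheme).

```go
package main

import (
	"fmt"
	"strings"
)

// storageQuotaNameFor derives a ClusterResourceQuota name from the storage
// class name only, so deleting a middle OverprovisionControl entry does not
// rename the quotas generated for later entries. The "-quota" suffix and
// the lowercasing are illustrative assumptions (Kubernetes object names
// must be DNS-1123 compatible, hence the lowering).
func storageQuotaNameFor(storageClassName string) string {
	return fmt.Sprintf("%s-quota", strings.ToLower(storageClassName))
}

func main() {
	fmt.Println(storageQuotaNameFor("ocs-storagecluster-ceph-rbd"))
	// ocs-storagecluster-ceph-rbd-quota
}
```

Because the name is a pure function of the storage class, reordering or removing entries leaves the remaining quotas' names, and hence their identity in the cluster, unchanged.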
| const storageClassSuffix string = ".storageclass.storage.k8s.io/" | ||
|
|
||
| // V1ResourceByStorageClass returns a quota resource name by storage class. | ||
| func V1ResourceByStorageClass(storageClassName string, resourceName corev1.ResourceName) corev1.ResourceName { |
How does this work if there are multiple quotas for the same storageClass but different projects?
Personally, I think importing the pkg is better than copying it.
Same. Bringing in additional dependencies is not a concern for me. Having copy-pasted code from another package is dangerous if the source ever changes, and that's a higher priority concern.
| if hardLimit == nil { | ||
| continue | ||
| } | ||
| requestName := resourceRequestName(opc.StorageClassName) |
Should there be a check to see if the storageclass is actually using OCS ceph provisioners?
Following a review by @jarrpa this will be fixed
api/v1/storagecluster_types.go
Outdated
| SchemeBuilder.Register(&StorageCluster{}, &StorageClusterList{}) | ||
| } | ||
|
|
||
| // OverprovisionControlSpec defines the allowed overprovisioning PC consumtion from the underlying cluster. |
| // OverprovisionControlSpec defines the allowed overprovisioning PC consumtion from the underlying cluster. | |
| // OverprovisionControlSpec defines the allowed overprovisioning PC consumption from the underlying cluster. |
api/v1/storagecluster_types.go
Outdated
| } | ||
|
|
||
| // OverprovisionControlSpec defines the allowed overprovisioning PC consumtion from the underlying cluster. | ||
| // This may be either as an absolute value or as a percentage from the overall effective capacity. |
| // This may be either as an absolute value or as a percentage from the overall effective capacity. | |
| // This may be an absolute value or a percentage of the overall effective capacity. |
api/v1/storagecluster_types.go
Outdated
|
|
||
| // OverprovisionControlSpec defines the allowed overprovisioning PC consumtion from the underlying cluster. | ||
| // This may be either as an absolute value or as a percentage from the overall effective capacity. | ||
| // When defined in terms of percentages, zero value indicates no over-provisioning. |
| // When defined in terms of percentages, zero value indicates no over-provisioning. | |
| // When defined in percentage, zero value indicates no over-provisioning. |
| requestName := resourceRequestName(opc.StorageClassName) | ||
| storageQuota := &quotav1.ClusterResourceQuota{ | ||
| TypeMeta: metav1.TypeMeta{APIVersion: quotav1.SchemeGroupVersion.String(), Kind: clusterResourceQuotaKind}, | ||
| ObjectMeta: metav1.ObjectMeta{Name: storageQuotaName(opc.StorageClassName, idx)}, |
we have a separate file for all name generating functions. Move storageQuotaName there and follow similar naming.
| // https://github.com/kubernetes/kubernetes/blob/v1.21.2/pkg/quota/v1/evaluator/core/persistent_volume_claims.go#L63 | ||
| // | ||
| // Avoids importing "k8s.io/kubernetes/pkg/quota/v1/evaluator/core" and its chain of dependencies | ||
| // TODO: Ask Jose A.Rivera if he prefers import |
| // TODO: Ask Jose A.Rivera if he prefers import |
@jarrpa What do you prefer?
| // storageClassSuffix is the suffix to the qualified portion of storage class resource name. | ||
| // For example, if you want to quota storage by storage class, you would have a declaration | ||
| // that follows <storage-class>.storageclass.storage.k8s.io/<resource>. | ||
| const storageClassSuffix string = ".storageclass.storage.k8s.io/" |
Should probably move this and all name generation functions below to generate.go?
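For reference, the copied upstream helper is a plain string concatenation of the storage class name, the fixed suffix, and the resource name. The sketch below reproduces that shape with a local `ResourceName` stand-in so it stays dependency-free (the type alias is an assumption; upstream uses `corev1.ResourceName`).

```go
package main

import "fmt"

// ResourceName stands in for corev1.ResourceName so the sketch has no
// external dependencies.
type ResourceName string

// storageClassSuffix mirrors the constant quoted in the diff above.
const storageClassSuffix = ".storageclass.storage.k8s.io/"

// V1ResourceByStorageClass builds the per-storage-class quota resource
// name, following the shape of the upstream evaluator referenced in the
// diff: storage class name, then the fixed suffix, then the resource name.
func V1ResourceByStorageClass(storageClassName string, resourceName ResourceName) ResourceName {
	return ResourceName(storageClassName + storageClassSuffix + string(resourceName))
}

func main() {
	fmt.Println(V1ResourceByStorageClass("ocs-storagecluster-ceph-rbd", "requests.storage"))
	// ocs-storagecluster-ceph-rbd.storageclass.storage.k8s.io/requests.storage
}
```

Because the result is keyed only by storage class and resource name, quotas scoped to different projects select different namespaces via label selectors rather than via this resource name, which is the point raised in the comment above.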
| func createFakeStorageClusterWithQuotaReconciler(t *testing.T, obj ...runtime.Object) *StorageClusterReconciler { | ||
| scheme := createFakeSchemeWithQuota(t) | ||
| client := fake.NewClientBuilder().WithScheme(scheme).WithRuntimeObjects(obj...).Build() | ||
|
|
||
| return &StorageClusterReconciler{ | ||
| Client: client, | ||
| QuotaV1: fakequota.NewSimpleClientset().QuotaV1(), | ||
| Scheme: scheme, | ||
| serverVersion: &k8sVersion.Info{}, | ||
| Log: logf.Log.WithName("storagequota_test"), | ||
| platform: &Platform{platform: configv1.NonePlatformType}, | ||
| } | ||
| } | ||
|
|
||
| func createFakeSchemeWithQuota(t *testing.T) *runtime.Scheme { | ||
| scheme := createFakeScheme(t) | ||
| err := quotav1.AddToScheme(scheme) | ||
| if err != nil { | ||
| assert.Fail(t, "failed to add quotav1 scheme") | ||
| } | ||
| return scheme | ||
| } |
Try reusing already available utilities rather than creating more.
| TypeMeta: metav1.TypeMeta{ | ||
| APIVersion: "quota.openshift.io", | ||
| Kind: "ClusterResourceQuota", | ||
| }, |
| TypeMeta: metav1.TypeMeta{ | |
| APIVersion: "quota.openshift.io", | |
| Kind: "ClusterResourceQuota", | |
| }, |
| func createFakeStorageClusterWithQuotaReconciler(t *testing.T, obj ...runtime.Object) *StorageClusterReconciler { | ||
| scheme := createFakeSchemeWithQuota(t) | ||
| client := fake.NewClientBuilder().WithScheme(scheme).WithRuntimeObjects(obj...).Build() | ||
|
|
||
| return &StorageClusterReconciler{ | ||
| Client: client, | ||
| QuotaV1: fakequota.NewSimpleClientset().QuotaV1(), | ||
| Scheme: scheme, | ||
| serverVersion: &k8sVersion.Info{}, | ||
| Log: logf.Log.WithName("storagequota_test"), | ||
| platform: &Platform{platform: configv1.NonePlatformType}, | ||
| } | ||
| } | ||
|
|
||
| func createFakeSchemeWithQuota(t *testing.T) *runtime.Scheme { | ||
| scheme := createFakeScheme(t) | ||
| err := quotav1.AddToScheme(scheme) | ||
| if err != nil { | ||
| assert.Fail(t, "failed to add quotav1 scheme") | ||
| } | ||
| return scheme | ||
| } |
Don't do this, we want to avoid this kind of duplication if possible. See the following commit for an example: 837ae4b
| var obj ocsStorageQuota | ||
| err := obj.ensureCreated(r, sc) | ||
| assert.NoError(t, err) | ||
| assert.Equal(t, len(listStorageQuotas(t, r)), 2) | ||
|
|
||
| sc.Spec.OverprovisionControl[1].Percentage = 60 | ||
| err = obj.ensureCreated(r, sc) | ||
| assert.NoError(t, err) | ||
| assert.Equal(t, len(listStorageQuotas(t, r)), 2) | ||
|
|
||
| sc.Spec.OverprovisionControl[0].Capacity = &testQuantity1T | ||
| err = obj.ensureCreated(r, sc) | ||
| assert.NoError(t, err) | ||
| assert.Equal(t, len(listStorageQuotas(t, r)), 2) |
For things like this, try to follow the test-cases format we're using elsewhere: https://github.com/openshift/ocs-operator/blob/master/controllers/storagecluster/storagecluster_controller_test.go#L248-L304 This helps reduce duplication of actual test code so extending them becomes much easier.
In this case, you'll want to supply the different OPCs as test variables. Also, be sure to call Created and Deleted for every test iteration so we have clean test environments.
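A minimal sketch of the suggested table-driven shape, with hypothetical types (`opcEntry` and `quotaNames` are illustrations, not the PR's code): each case carries its own OPC-like input, and the expected quota count is derived from that input rather than hardcoded.

```go
package main

import "fmt"

// opcEntry is a hypothetical stand-in for an OverprovisionControl entry;
// only the fields needed for the sketch are included.
type opcEntry struct {
	storageClassName string
	percentage       int64
}

// quotaNames derives the set of expected quota names from the spec itself,
// so tests never hardcode a count such as 2. Duplicate storage classes
// collapse into a single quota.
func quotaNames(opcs []opcEntry) map[string]struct{} {
	names := make(map[string]struct{})
	for _, opc := range opcs {
		names[opc.storageClassName+"-quota"] = struct{}{}
	}
	return names
}

func main() {
	// Table-driven cases in the suggested style: each case is
	// self-describing, and the expectation comes from its own input.
	cases := []struct {
		label string
		opcs  []opcEntry
	}{
		{"single entry", []opcEntry{{"ceph-rbd", 50}}},
		{"two entries", []opcEntry{{"ceph-rbd", 50}, {"cephfs", 0}}},
	}
	for _, c := range cases {
		// A real test would call ensureCreated and ensureDeleted per
		// iteration for a clean environment, then list the created quotas.
		got := len(quotaNames(c.opcs))
		want := len(c.opcs)
		if got != want {
			panic(fmt.Sprintf("%s: got %d quotas, want %d", c.label, got, want))
		}
		fmt.Printf("%s: %d quotas\n", c.label, got)
	}
}
```

Extending such a table is a one-line change per scenario, which is the duplication-reduction benefit the review points to.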
| var testQuantity1T = resource.MustParse("1Ti") | ||
| var testQuantity2T = resource.MustParse("2Ti") | ||
| var testStorageClusterWithOverprovision = &api.StorageCluster{ |
s/test/mock, just out of convention.
|
|
||
| var testQuantity1T = resource.MustParse("1Ti") | ||
| var testQuantity2T = resource.MustParse("2Ti") | ||
| var testStorageClusterWithOverprovision = &api.StorageCluster{ |
Rather than defining a new StorageCluster, define a valid StorageDeviceSets list with appropriate OPCs. Since you already have a function for the DeepCopy, you could do a DeepCopy of the existing mock StorageCluster and then DeepCopy the SDSs/OPCs into that.
| var obj ocsStorageQuota | ||
| err := obj.ensureCreated(r, sc) | ||
| assert.NoError(t, err) | ||
| assert.Equal(t, len(listStorageQuotas(t, r)), 2) |
Try not to hardcode any values like this. You should be able to determine the expected count from the spec you're providing.
| func TestStorageQuotaEnsureCreatedDeleted(t *testing.T) { | ||
| r := createFakeStorageClusterWithQuotaReconciler(t) | ||
| sc := createStorageClusterWithOverprovision() | ||
|
|
||
| var obj ocsStorageQuota | ||
| err := obj.ensureCreated(r, sc) | ||
| assert.NoError(t, err) | ||
| assert.Equal(t, len(listStorageQuotas(t, r)), 2) | ||
|
|
||
| err = obj.ensureDeleted(r, sc) | ||
| assert.NoError(t, err) | ||
| assert.Equal(t, len(listStorageQuotas(t, r)), 0) | ||
| } |
This is a bit of a tricky situation for all parts of ocs-operator. But generally, you should not need a separate test function for testing create and delete. The test cases for the overall feature already contain the ensureCreated() and ensureDeleted() calls, so as long as you have proper error reporting we'll see what went wrong.
A few other top-level change requests:

This is for returning errors.
| //nolint | ||
| type StorageClusterReconciler struct { | ||
| client.Client | ||
| QuotaV1 quotav1.QuotaV1Interface |
Please rename to QuotaClient. QuotaV1 is usually the format for the API itself
| "github.com/go-logr/logr" | ||
| nbv1 "github.com/noobaa/noobaa-operator/v2/pkg/apis/noobaa/v1alpha1" | ||
|
|
||
| quotav1 "github.com/openshift/client-go/quota/clientset/versioned/typed/quota/v1" |
Please import as quotaclient or similar
When calculating the usable capacity of a cluster, other modules will need to take into account the same count-and-replica logic. Avoid code duplication by exporting this logic via a dedicated function. Signed-off-by: Shachar Sharon <ssharon@redhat.com>
The core logic of creating ClusterResourceQuota objects for the OCS storage cluster. For each OverprovisionControl entry in StorageCluster, define a corresponding ClusterResourceQuota object using OpenShift's quota-client mechanism. The actual value is one of two cases: - An explicit hard limit - Derived as a percentage of the cluster's raw capacity Enable ocsStorageQuota as part of the resource-managers reconcile loop. Enable apiGroup 'quota.openshift.io' with the clusterresourcequotas resource. Signed-off-by: Shachar Sharon <ssharon@redhat.com>
Using ClusterResourceQuota fake client to test overprovisionControl functionality (Create/Update/Delete). Signed-off-by: Shachar Sharon <ssharon@redhat.com>
Signed-off-by: Shachar Sharon <ssharon@redhat.com>
Signed-off-by: Shachar Sharon <ssharon@redhat.com>
Export storage clusterresourcequota HARD/USED metrics via the Prometheus exporter. For each ClusterResourceQuota storage entry, export a pair of hard-limit/used-count metrics. Signed-off-by: Shachar Sharon <ssharon@redhat.com>
Signed-off-by: Shachar Sharon <ssharon@redhat.com>
|
/retest-required |
|
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: jarrpa
The following patch-set implements storage quota (a.k.a. OverprovisionControl) using OpenShift's ClusterResourceQuota mechanism. It enables a new section in the StorageCluster CRD called OverprovisionControl, which allows end-users to define a ClusterResourceQuota over the OCS storage cluster (either as a fixed value or as a percentage of raw capacity).
Refs KNIP-1617