Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Migrate controllers to v1beta1 #604

Merged
merged 3 commits into from
Mar 8, 2023

Conversation

alculquicondor
Copy link
Contributor

@alculquicondor alculquicondor commented Mar 3, 2023

What type of PR is this?

/kind feature
/kind api-change

What this PR does / why we need it:

Uses the v1beta1 API throughout the code base and changes the storage version.
The code is functionally the same, but there are some differences:

  • Use of client.Status() when admitting or evicting workloads.
  • The concept of "codependent resources" disappears in favor of the API's resourceGroups.
  • The queue name in a Job can be specified as a label or annotation.

Makes use of APIs introduced in #532 (contained in the first two commits).

Which issue(s) this PR fixes:

Fixes #23

Special notes for your reviewer:

I heavily changed the testing wrappers in a attempt to make the test cases more slim.
The wrappers also add coveredResources automatically and panic if you define flavors with different resources.

To be done if follow ups:

Change-Id: Ic4cc56f07e8ef5caa7b55ac02901a54708c35f20
Change-Id: I9a12e50eaa66c1897406b44e59f9e0813382645d
@k8s-ci-robot k8s-ci-robot added kind/feature Categorizes issue or PR as related to a new feature. kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API labels Mar 3, 2023
@netlify
Copy link

netlify bot commented Mar 3, 2023

Deploy Preview for kubernetes-sigs-kueue canceled.

Name Link
🔨 Latest commit cfaf110
🔍 Latest deploy log https://app.netlify.com/sites/kubernetes-sigs-kueue/deploys/6408adbbb8915b00089939d4

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: alculquicondor

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. approved Indicates a PR has been approved by an approver from all required OWNERS files. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Mar 3, 2023
@alculquicondor
Copy link
Contributor Author

/hold
for all the eyes that want to have a look :)

cc @ahg-g

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 3, 2023
@alculquicondor alculquicondor force-pushed the beta1-controllers branch 7 times, most recently from 03be22c to 28aa2ee Compare March 3, 2023 20:32
@alculquicondor
Copy link
Contributor Author

/test pull-kueue-test-integration-main

Copy link
Contributor

@ahg-g ahg-g left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unbelievable effort, but hard to review :)

I think this could have been made easier to review if we introduced the changes to v1beta1 over different PRs (for example, first add v1beta1 as an exact copy of v1alpha2, then do the Workload Admission move, then the ResourceFlavor changes, then the ClusterQueue etc.), the wrappers refactoring could have also been introduced in a separate PR.

As for the logic itself, I wonder if we actually needed to make all those changes at once? maybe by copying the external type to the existing internal types we could have avoided doing all those changes in one go?

Regardless, since this doesn't intend to introduce new behavior, our only hope is that we have enough test coverage for validation :)

qKey := workload.QueueKey(wi.Obj)
if _, ok := c.admittedWorkloadsPerQueue[qKey]; ok {
c.admittedWorkloadsPerQueue[qKey] += int(m)
}
}

func updateUsage(wi *workload.Info, usedResources ResourceQuantities, m int64) {
func updateUsage(wi *workload.Info, cqUsage FlavorResourceQuantities, m int64) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

make this a member function of ClusterQueue since it updates the Usage of a CQ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, because the snapshot also uses this function to update the usage of the cohort, not just the clusterQueue.

}
cqFlv, cqFlvExist := cqUsage[wlResFlv]
if cqFlvExist && wlResExist {
cqFlv[wlRes] += v * m
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a similar check to the previous one is still applicable:

if _, exist := cqFlv[wlRes]; exist {

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reverted

Comment on lines 628 to 629
v, ok := job.Labels[constants.QueueLabel]
if ok {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
v, ok := job.Labels[constants.QueueLabel]
if ok {
if v, ok := job.Labels[constants.QueueLabel]; ok {

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@@ -11,7 +11,7 @@ See the License for the specific language governing permissions and
limitations under the License.
*/

package v1alpha2
package webhook
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why did we move them?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe when there are conversion webhooks it'll make sense to the tests to the version, but for now it doesn't make much sense.

@ahg-g
Copy link
Contributor

ahg-g commented Mar 8, 2023

should we delete v1alpha2? it is orphaned now, right?

@alculquicondor
Copy link
Contributor Author

alculquicondor commented Mar 8, 2023

I think this could have been made easier to review if we introduced the changes to v1beta1 over different PRs (for example, first add v1beta1 as an exact copy of v1alpha2, then do the Workload Admission move, then the ResourceFlavor changes, then the ClusterQueue etc.), the wrappers refactoring could have also been introduced in a separate PR.

Yeah, I was too far ahead with changes once I realized the PR was too big :(

The wrapping changes was an attempt to make it easier for me to re-write the unit tests.

As for the logic itself, I wonder if we actually needed to make all those changes at once? maybe by copying the external type to the existing internal types we could have avoided doing all those changes in one go?

There are no internal types in CRDs. Only a "storage version", which is v1beta1 now.

should we delete v1alpha2? it is orphaned now, right?

Next PR

@ahg-g
Copy link
Contributor

ahg-g commented Mar 8, 2023

There are no internal types in CRDs. Only a "storage version", which is v1beta1 now.

we have internal types in the code itself, not sure how much we use the api types directly though in the core logic.

@ahg-g
Copy link
Contributor

ahg-g commented Mar 8, 2023

/hold
/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Mar 8, 2023
Copy link
Member

@tenzen-y tenzen-y left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the great work!

Maybe for a follow up, we must update the samples:

https://github.com/kubernetes-sigs/kueue/tree/main/config/samples

@@ -210,7 +210,7 @@ clean-manifests = (cd config/components/manager && $(KUSTOMIZE) edit set image c

.PHONY: install
install: manifests kustomize ## Install CRDs into the K8s cluster specified in ~/.kube/config.
$(KUSTOMIZE) build config/components/crd | kubectl apply -f -
$(KUSTOMIZE) build config/components/crd | kubectl apply --server-side -f -
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's next step :)

}

func queueName(job *batchv1.Job) string {
if v, ok := job.Labels[constants.QueueLabel]; ok {
return v
}
return job.Annotations[constants.QueueAnnotation]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why take the queue name from annotations? For backward compatible?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great :)

@alculquicondor
Copy link
Contributor Author

/hold cancel

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 8, 2023
@alculquicondor
Copy link
Contributor Author

/hold
for squash

Change-Id: If389945ad5833d8b869823afa1816772af22663c
@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 8, 2023
@alculquicondor
Copy link
Contributor Author

squashed in two commits
/hold cancel

@alculquicondor alculquicondor mentioned this pull request Mar 8, 2023
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 8, 2023
@tenzen-y
Copy link
Member

tenzen-y commented Mar 8, 2023

/lgtm

@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Mar 8, 2023
@tenzen-y
Copy link
Member

tenzen-y commented Mar 8, 2023

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Mar 8, 2023
@k8s-ci-robot k8s-ci-robot merged commit 1bb8e0d into kubernetes-sigs:main Mar 8, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API kind/feature Categorizes issue or PR as related to a new feature. lgtm "Looks good to me", indicates that a PR is ready to be merged. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Graduate API to beta
4 participants