Validate slice annotations #5582
Conversation
✅ Deploy Preview for kubernetes-sigs-kueue canceled.
Hi @lchrzaszcz. Thanks for your PR. I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test. Once the patch is verified, the new status will be reflected by the ok-to-test label. I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
Force-pushed 4c0f6a2 to 4a8f2f1.
Force-pushed 4a8f2f1 to ace4ac6.
	} else if val < 1 {
		allErrs = append(allErrs, field.Invalid(annotationsPath.Key(kueuealpha.PodSetSliceSizeAnnotation), sliceSizeValue, "must be greater than or equal to 1"))
	}
}
Can we validate that it divides into pod count with no remainder?
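For illustration, a minimal sketch of such a check in Go, reusing the val and allErrs names from the snippet above and assuming a hypothetical podCount variable holding the PodSet size (which, as discussed below, is not readily available at this stage):

	// Hypothetical check: require the slice size to evenly divide the PodSet size.
	// podCount is assumed to exist here; today the PodSet size is not accessible
	// at this point of validation.
	if podCount%val != 0 {
		allErrs = append(allErrs, field.Invalid(annotationsPath.Key(kueuealpha.PodSetSliceSizeAnnotation), sliceSizeValue, "must evenly divide the number of pods in the PodSet"))
	}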
I think it's a good idea, but I think it'll get more complex, as we don't have easy access to the number of pods in a PodSet at that stage (see the PR description). I'll give it a look, though; maybe it's easy.
Does it make the algorithm substantially harder? If not, then I would rather also support Jobs that divide with a remainder.
API-wise it makes sense to me that you may want a Job of 1000 Pods split into racks of 16 nodes. It might get awkward to require a user to pad the Job to 1008 Pods so it splits into racks evenly.
I don't think it will make the algorithm much more complex. If we don't mind assuming that the last slice behaves like a full slice, then it should work almost out of the box (corner case: if 1000 pods fit but 1008 do not, then we also won't fit such a workload).
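For illustration, a sketch of that assumption in Go with hypothetical podCount and sliceSize values; the ceiling division treats a trailing partial slice as a full slice:

	// Hypothetical slice-count computation: round up, so a trailing partial
	// slice counts as a full one (e.g. 1000 pods with slice size 16 -> 63 slices,
	// and capacity is checked as if 63*16 = 1008 pods were requested).
	sliceCount := (podCount + sliceSize - 1) / sliceSize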
I think it is fair, for simplicity, to allow it, with a remark about this behavior in the documentation, for example in the limitations section.
Cool. I'll make sure to reflect it in the main PR and write a test for that.
There is still the caveat of checking that the slice size does not exceed the number of pods in the PodSet. However, that will be a bigger change, so we can proceed with this PR while I work on adding that validation.
mimowo left a comment:
/lgtm
/approve
Thanks 👍
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: lchrzaszcz, mimowo. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing /approve in a comment.
LGTM label has been added. Git tree hash: 17c89b17988fb33e33155fb9d8a263c4a241f0df
Oh, I missed setting /ok-to-test. Approval triggered the tests, but they may potentially fail.
/ok-to-test
/lgtm
LGTM label has been added. Git tree hash: 0503c84317a01f8ca5317cc188bd425dfa14b09b
I don't think this requires a separate release note as it is part of 2-level scheduling in TAS (#5353).
What type of PR is this?
/kind feature
What this PR does / why we need it:
#5353 introduces two-level scheduling, but it does not add validation. This PR is a preparatory PR that introduces validation for the annotations kueue.x-k8s.io/podset-slice-required-topology and kueue.x-k8s.io/podset-slice-size.
Which issue(s) this PR fixes:
Related to #5439
Special notes for your reviewer:
For better readability I've added tests only for Job. I see that we repeat the same tests for each (or at least most) Job type, which would be copy-paste, so I've decided to present the tests for Job for now; once I get an approval I'll add the rest of the tests, or do it in a follow-up PR.
I think we should also validate that the slice size is not greater than the number of pods in the PodSet, because if someone sets it to a huge number, the algorithm might calculate that there are 0 slices, which could have unexpected behavior. However, that requires knowing the size of the PodSet, which is not easy at this stage of validation. I will address this in a follow-up PR.
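A rough sketch of that follow-up check, assuming a hypothetical podCount for the PodSet size were available alongside the existing val and allErrs:

	// Hypothetical follow-up validation: reject slice sizes larger than the PodSet.
	if val > podCount {
		allErrs = append(allErrs, field.Invalid(annotationsPath.Key(kueuealpha.PodSetSliceSizeAnnotation), sliceSizeValue, "must not be greater than the number of pods in the PodSet"))
	}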
Does this PR introduce a user-facing change?