Conversation

@lchrzaszcz
Contributor

@lchrzaszcz lchrzaszcz commented Jun 9, 2025

What type of PR is this?

/kind feature

What this PR does / why we need it:

#5353 introduces two-level scheduling but does not add validation. This is a preparatory PR that introduces validation for the annotations kueue.x-k8s.io/podset-slice-required-topology and kueue.x-k8s.io/podset-slice-size.

Which issue(s) this PR fixes:

Related to #5439

Special notes for your reviewer:

For readability, I've added tests only for Job. I see that we repeat the same tests for each Job type (or at least most of them), so covering the rest would be mostly copy-paste. I've decided to present the Job tests for now; once I get approval, I'll add the remaining tests here or in a follow-up PR.

I think we should also validate that the slice size is not greater than the number of pods in the PodSet: if someone sets it to a huge number, the algorithm might calculate that there are 0 slices, which could lead to unexpected behavior. However, that requires knowing the size of the PodSet, which is not easy to obtain at this stage of validation. I will address this in a follow-up PR.
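The "0 slices" concern can be illustrated with a minimal sketch. It assumes the slice count is derived by integer division, which is an assumption and not confirmed by this PR. Both fullSlices and validateSliceFitsPodSet are hypothetical names, not Kueue functions.

```go
package main

import "fmt"

// fullSlices returns how many complete slices fit into a PodSet using
// integer (floor) division. Hypothetical helper; assumes sliceSize >= 1.
func fullSlices(podCount, sliceSize int) int {
	return podCount / sliceSize
}

// validateSliceFitsPodSet sketches the follow-up validation described above:
// the slice size must be at least 1 and must not exceed the PodSet size.
// This is an illustration, not the check that was eventually implemented.
func validateSliceFitsPodSet(podCount, sliceSize int) error {
	if sliceSize < 1 {
		return fmt.Errorf("slice size %d must be greater than or equal to 1", sliceSize)
	}
	if sliceSize > podCount {
		return fmt.Errorf("slice size %d must not exceed the PodSet size %d", sliceSize, podCount)
	}
	return nil
}

func main() {
	// A huge slice size on a small PodSet yields zero slices under floor division.
	fmt.Println(fullSlices(8, 1000000))
	// The hypothetical check rejects that configuration and accepts a sane one.
	fmt.Println(validateSliceFitsPodSet(8, 1000000))
	fmt.Println(validateSliceFitsPodSet(8, 4))
}
```

The difficulty raised in the description remains: the sketch takes podCount as a plain argument, whereas the real validation would first have to obtain the PodSet size.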

Does this PR introduce a user-facing change?

NONE

@k8s-ci-robot k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. labels Jun 9, 2025
@netlify

netlify bot commented Jun 9, 2025

Deploy Preview for kubernetes-sigs-kueue canceled.

Name Link
🔨 Latest commit de420b7
🔍 Latest deploy log https://app.netlify.com/projects/kubernetes-sigs-kueue/deploys/68484ea620c73f00085bb52e

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Jun 9, 2025
@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Jun 9, 2025
@k8s-ci-robot
Contributor

Hi @lchrzaszcz. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Jun 9, 2025
@lchrzaszcz lchrzaszcz force-pushed the slice-annotations-validation branch from 4c0f6a2 to 4a8f2f1 Compare June 10, 2025 13:15
@lchrzaszcz lchrzaszcz force-pushed the slice-annotations-validation branch from 4a8f2f1 to ace4ac6 Compare June 10, 2025 13:19
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 10, 2025
@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. labels Jun 10, 2025
@lchrzaszcz lchrzaszcz changed the title WIP Validate slice annotations Validate slice annotations Jun 10, 2025
@lchrzaszcz lchrzaszcz marked this pull request as ready for review June 10, 2025 14:24
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jun 10, 2025
	} else if val < 1 {
		allErrs = append(allErrs, field.Invalid(annotationsPath.Key(kueuealpha.PodSetSliceSizeAnnotation), sliceSizeValue, "must be greater than or equal to 1"))
	}
}
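For readers without the full diff, here is a self-contained sketch of the kind of check the fragment above performs. It is written against the standard library only, so it returns a plain error instead of a field.ErrorList. The name validateSliceSize is hypothetical, and the parse-error branch is an assumption about what precedes the `else if` in the real code.

```go
package main

import (
	"fmt"
	"strconv"
)

// validateSliceSize is a hypothetical, simplified stand-in for the
// annotation check shown in the diff. It accepts the raw annotation value.
func validateSliceSize(raw string) error {
	val, err := strconv.Atoi(raw)
	if err != nil {
		// Assumed branch: the value is not a valid integer.
		return fmt.Errorf("invalid value %q: must be an integer", raw)
	}
	if val < 1 {
		// Same message as in the diff fragment.
		return fmt.Errorf("invalid value %q: must be greater than or equal to 1", raw)
	}
	return nil
}

func main() {
	for _, raw := range []string{"16", "0", "-4", "abc"} {
		fmt.Printf("%q -> %v\n", raw, validateSliceSize(raw))
	}
}
```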
Contributor

Can we validate that it divides into pod count with no remainder?

Contributor Author

I think it's a good idea, but it will get more complex, as we don't have easy access to the number of pods in a PodSet at that stage (see the PR description). I'll give it a look though; maybe it's easy.

Contributor

Does it make the algorithm substantially harder? If not, then I would rather also support Jobs that divide with a remainder.

API-wise, it makes sense to me that you may want to have a Job of 1000 Pods split into racks of 16 nodes. It might get awkward to require a user to pad the Job to 1008 Pods so that it splits into racks equally.
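The arithmetic behind the 1000 versus 1008 figures can be sketched as follows. sliceLayout is a hypothetical helper written for illustration, not part of Kueue.

```go
package main

import "fmt"

// sliceLayout computes, for a PodSet of podCount pods and a given slice size:
//   - slices: number of slices, counting a trailing partial slice (ceiling division)
//   - lastSlice: number of pods in the last slice
//   - padded: pod count if the last slice were padded to a full slice
// Hypothetical helper; assumes podCount >= 1 and sliceSize >= 1.
func sliceLayout(podCount, sliceSize int) (slices, lastSlice, padded int) {
	slices = (podCount + sliceSize - 1) / sliceSize
	lastSlice = podCount - (slices-1)*sliceSize
	padded = slices * sliceSize
	return slices, lastSlice, padded
}

func main() {
	slices, lastSlice, padded := sliceLayout(1000, 16)
	// 62 full slices of 16 pods plus one slice of 8 pods; padding gives 1008.
	fmt.Println(slices, lastSlice, padded) // prints "63 8 1008"
}
```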

Contributor Author

I don't think it will make the algorithm much more complex. If we don't mind assuming that the last slice behaves like a full slice, then it should work almost out of the box (corner case: if 1000 pods fit but 1008 do not, then we won't fit such a workload either).
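The corner case can be made concrete with a deliberately simplified model. The model is an assumption for illustration and not Kueue's actual TAS algorithm: each slice must be placed entirely within one topology domain (for example a rack), and each domain is described only by its number of free pod slots. All names are hypothetical.

```go
package main

import "fmt"

// slicesThatFit counts how many full slices can be placed when every slice
// must sit entirely inside a single domain. Assumes sliceSize >= 1.
func slicesThatFit(freePerDomain []int, sliceSize int) int {
	total := 0
	for _, free := range freePerDomain {
		total += free / sliceSize
	}
	return total
}

// fitsWithFullLastSlice treats a trailing partial slice as if it were full,
// so the number of required slices is the ceiling of podCount / sliceSize.
func fitsWithFullLastSlice(podCount, sliceSize int, freePerDomain []int) bool {
	needed := (podCount + sliceSize - 1) / sliceSize
	return slicesThatFit(freePerDomain, sliceSize) >= needed
}

func main() {
	// 62 racks with 16 free slots plus one rack with 8: exactly 1000 free slots.
	racks := make([]int, 0, 63)
	for i := 0; i < 62; i++ {
		racks = append(racks, 16)
	}
	racks = append(racks, 8)

	// 1000 pods would physically fit, but the last slice is counted as 16.
	fmt.Println(fitsWithFullLastSlice(1000, 16, racks)) // prints "false"

	// With a full 63rd rack, the padded requirement of 1008 slots is met.
	racks[62] = 16
	fmt.Println(fitsWithFullLastSlice(1000, 16, racks)) // prints "true"
}
```

Under this model the workload is rejected in the first case even though 1000 free slots exist, which is the trade-off accepted in the reply above.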

Contributor

@mimowo mimowo Jun 10, 2025

I think it is fair, for simplicity, to allow it, with a remark about this behavior in the documentation, for example in the limitations section.

Contributor Author

Cool. I'll make sure to reflect it in the main PR and write a test for that.

There is still the caveat of checking that the slice size does not exceed the number of pods in the PodSet. However, that will be a bigger change, so we can proceed with this PR while I work on adding that validation.

Contributor

@mimowo mimowo left a comment

/lgtm
/approve
Thanks 👍

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 10, 2025
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: lchrzaszcz, mimowo

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot
Contributor

LGTM label has been added.

Git tree hash: 17c89b17988fb33e33155fb9d8a263c4a241f0df

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 10, 2025
@mimowo
Contributor

mimowo commented Jun 10, 2025

Oh, I missed setting /ok-to-test. The approval triggered the tests, but they may fail.

@mimowo
Contributor

mimowo commented Jun 10, 2025

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Jun 10, 2025
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 10, 2025
@k8s-ci-robot k8s-ci-robot requested a review from mimowo June 10, 2025 15:26
@mimowo
Contributor

mimowo commented Jun 10, 2025

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 10, 2025
@k8s-ci-robot
Contributor

LGTM label has been added.

Git tree hash: 0503c84317a01f8ca5317cc188bd425dfa14b09b

@k8s-ci-robot k8s-ci-robot merged commit 466ad41 into kubernetes-sigs:main Jun 10, 2025
23 checks passed
@k8s-ci-robot k8s-ci-robot added this to the v0.13 milestone Jun 10, 2025
@mimowo
Contributor

mimowo commented Jul 7, 2025

I don't think this requires a separate release note, as it is part of two-level scheduling in TAS (#5353).
/release-note-edit

NONE

@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. and removed release-note Denotes a PR that will be considered when it comes time to generate release notes. labels Jul 7, 2025
Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. release-note-none Denotes a PR that doesn't merit a release note. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.

4 participants