@mwielgus
Contributor

What type of PR is this?

/kind feature
/kind api-change

What this PR does / why we need it:

KEP for a new resource fair sharing method.

Which issue(s) this PR fixes:

Fixes #4136

Special notes for your reviewer:

Does this PR introduce a user-facing change?

NONE

@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. kind/feature Categorizes issue or PR as related to a new feature. kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Feb 12, 2025
@k8s-ci-robot k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Feb 12, 2025
@netlify

netlify bot commented Feb 12, 2025

Deploy Preview for kubernetes-sigs-kueue canceled.

Name Link
🔨 Latest commit ba587fa
🔍 Latest deploy log https://app.netlify.com/sites/kubernetes-sigs-kueue/deploys/680275236f513d0008c66cdc

@mimowo
Contributor

mimowo commented Feb 13, 2025

/assign @PBundyra @gabesaba
To help with the review. I appreciate it is planned for 0.12 as we already have a plan for 0.11: #4249
and this seems big.

admission logic. If there are two AdmissionScopes on the path from CQ/Cohort to the top of
the hierarchy tree, the higher one is used.

Const (
Contributor

fix formatting


* Create a new struct AdmissionScope and make it an optional field for CQ and Cohort Spec. If
not provided, CQ or Cohort is not considered an AdmissionScope and is not a subject for new
admission logic. If there are two AdmissionScopes on the path from CQ/Cohort to the top of
Contributor

Why? Let's make this an invalid state, since the lower scope does nothing?

Contributor Author

Because someone may be changing the scope. Scope change is not atomic and we cannot block the entire hierarchy in the meantime.
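
To make the "higher one is used" rule concrete, here is a minimal sketch of resolving the effective scope by walking from a CQ or Cohort to the top of the tree. The `Node` type, the `Mode` field, and `EffectiveScope` are hypothetical names used only for illustration; they are not taken from the KEP or from Kueue's code.

```go
package main

import "fmt"

// AdmissionScope stands in for the struct proposed in the KEP.
// The Mode field is a hypothetical placeholder.
type AdmissionScope struct {
	Mode string
}

// Node models a ClusterQueue or Cohort in the hierarchy (hypothetical type).
type Node struct {
	Name   string
	Parent *Node           // nil at the top of the hierarchy tree
	Scope  *AdmissionScope // nil when the node is not an AdmissionScope
}

// EffectiveScope walks from n up to the root and returns the highest node
// on that path that declares an AdmissionScope, or nil if there is none.
func EffectiveScope(n *Node) *Node {
	var highest *Node
	for cur := n; cur != nil; cur = cur.Parent {
		if cur.Scope != nil {
			highest = cur // keep overwriting so the topmost match wins
		}
	}
	return highest
}

func main() {
	root := &Node{Name: "root-cohort", Scope: &AdmissionScope{Mode: "A"}}
	team := &Node{Name: "team-cohort", Parent: root, Scope: &AdmissionScope{Mode: "B"}}
	cq := &Node{Name: "cq", Parent: team}

	// Two scopes sit on the path; the lower one is tolerated but ignored.
	fmt.Println(EffectiveScope(cq).Name)
}
```

Under this reading, a hierarchy that temporarily carries two scopes during a non-atomic scope change stays valid, and admission simply follows the topmost one.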

or following is possible then that workload is admitted, under condition that it might get preempted.

3. AdmissionScope at Cohort level - Kueue operates in a mixed mode. Inside CQ workloads are
selected according to their AdmissionMode (if specified). If a workload fits entirely into
Contributor

Inside CQ workloads are selected according to their AdmissionMode (if specified)

Only the highest AdmissionScope is used. Do you mean the queueing policy (FIFO/BestEffort)?

Contributor Author

The candidates inside individual CQs are selected based on the specified logic and bubbled up.

Contributor

@mimowo mimowo left a comment

First pass. For now trying to wrap my head around the overlap with the classical fair sharing, and the enablement scope. Also, I'm not entirely clear if we support reclamation in the new mode.

@PBundyra
Contributor

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Apr 18, 2025
@k8s-ci-robot
Contributor

LGTM label has been added.

Git tree hash: 79e3e4ee380e8a269461ace0f105f414e8f9a438

@mimowo
Contributor

mimowo commented Apr 18, 2025

/lgtm
/approve
I think this is a very important feature to provide Fair Sharing based on past cumulative usage.

I'm not quite sure about the abstraction of "Admission Fair Sharing", but this is something we will need to figure out as we go.

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: mimowo, mwielgus

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Apr 18, 2025
Member

@tenzen-y tenzen-y left a comment

/hold
Basically lgtm

One comment on the API types.

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Apr 18, 2025
Comment on lines +66 to +75
* Modify CQ’s FairSharing struct with

```go
type FairSharing struct {
	// Weight denotes how important the given queue is when competing against other queues
	// for unused shared resources. The exact impact of the weight in fair share calculations
	// depends on the fair share algorithm used. Default = 1.
	Weight *resource.Quantity `json:"weight,omitempty"`
}
```
Member

IIUC, we already have this field

// fairSharing defines the properties of the ClusterQueue when
// participating in FairSharing. The values are only relevant
// if FairSharing is enabled in the Kueue configuration.
// +optional
FairSharing *FairSharing `json:"fairSharing,omitempty"`

Weight *resource.Quantity `json:"weight,omitempty"`

Contributor Author

Yes, it already exists. I am changing its meaning a bit to make it more generic.

Member

I see. We will change only the meaning, not the shape of the API. Thanks.
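
As an illustration of the "Default = 1" note in the field comment, a minimal sketch of how a consumer might read the optional weight. It uses `*float64` in place of `*resource.Quantity` so that it has no dependencies, and `EffectiveWeight` is a hypothetical helper, not part of Kueue's API.

```go
package main

import "fmt"

// FairSharing mirrors the quoted struct, with *float64 standing in for
// *resource.Quantity (an assumption made to keep the sketch standalone).
type FairSharing struct {
	Weight *float64
}

// EffectiveWeight returns the configured weight, or 1 when either the
// struct or the field is unset, following the "Default = 1" comment.
func EffectiveWeight(fs *FairSharing) float64 {
	if fs == nil || fs.Weight == nil {
		return 1
	}
	return *fs.Weight
}

func main() {
	w := 2.5
	fmt.Println(EffectiveWeight(nil))                      // struct omitted
	fmt.Println(EffectiveWeight(&FairSharing{}))           // field omitted
	fmt.Println(EffectiveWeight(&FairSharing{Weight: &w})) // explicit weight
}
```

How the resulting number enters the share calculation is left to the fair sharing algorithm in use, as the field comment says.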

@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Apr 18, 2025
Comment on lines +97 to +98
// LastUpdate is the time when share and consumed resources were updated.
LastUpdate metav1.Time `json:"lastUpdate,omitempty"`
Member

@tenzen-y tenzen-y Apr 18, 2025

Why not use conditions on each resource instead of this dedicated LastUpdate?
With this approach, I think that if we want to add a reason and message for FairShare, we will need to add similar condition-like fields here.

Contributor Author

Conditions have last transition, not last update.

// lastTransitionTime is the last time the condition transitioned from one status to another.
// This should be when the underlying condition changed. 

Here there is no transition, just updates.

Member

@tenzen-y tenzen-y Apr 18, 2025

I see. You mean this is the time when kueue takes a snapshot of consumed usage.
In that case, we probably want to call it usageSnapshotAt, usageCalculationAt, or something similar, so that it is obvious what this time represents.

WDYT?

Contributor Author

Personally, I see LastUpdate as more straightforward, and it could also cover WeightedShare or whatever else we add in the future. UsageSnapshotAt limits us: if any new field is added, we will need another timestamp.

Member

@tenzen-y tenzen-y left a comment

@mimowo
Contributor

mimowo commented Apr 24, 2025

/lgtm
/unhold
To avoid blocking the implementation work on a design that has only minor questions open.

@tenzen-y I believe we can address this later or during the implementation, and then update the KEP to align.

@k8s-ci-robot k8s-ci-robot added lgtm "Looks good to me", indicates that a PR is ready to be merged. and removed do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. labels Apr 24, 2025
@k8s-ci-robot
Contributor

LGTM label has been added.

Git tree hash: 73d6215304b15875284d34eb036170d3f524c32f

@k8s-ci-robot k8s-ci-robot merged commit f61f8e9 into kubernetes-sigs:main Apr 24, 2025
7 checks passed
@k8s-ci-robot k8s-ci-robot added this to the v0.12 milestone Apr 24, 2025
@gabesaba
Contributor

/lgtm

@tenzen-y
Member

@tenzen-y I believe we can address this later or during the implementation, and then update the KEP to align.

That makes sense.
If I notice anything in the implementation PR, I will leave comments there.


Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API kind/feature Categorizes issue or PR as related to a new feature. lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note-none Denotes a PR that doesn't merit a release note. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.

Development

Successfully merging this pull request may close these issues.

Fair sharing mechanism without preemptions

8 participants