-
Notifications
You must be signed in to change notification settings - Fork 480
KEP-4136: Admission Fair Sharing #4252
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
✅ Deploy Preview for kubernetes-sigs-kueue canceled.
|
| admission logic. If there are two AdmissionScopes on the path from CQ/Cohort to the top of | ||
| the hierarchy tree, the higher one is used. | ||
|
|
||
| Const ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fix formatting
|
|
||
| * Create a new struct AdmissionScope and make it an optional field for CQ and Cohort Spec. If | ||
| not provided, CQ or Cohort is not considered an AdmissionScope and is not a subject for new | ||
| admission logic. If there are two AdmissionScopes on the path from CQ/Cohort to the top of |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why? Let's make this invalid state, as the lower scope is doing nothing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because someone may be changing the scope. Scope change is not atomic and we cannot block the entire hierarchy in the meantime.
| or following is possible then that workload is admitted, under condition that it might get preempted. | ||
|
|
||
| 3. AdmissionScope at Cohort level - Kueue operates in a mixed mode. Inside CQ workloads are | ||
| selected according to their AdmissionMode (if specified). If a workload fits entirely into |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Inside CQ workloads are selected according to their AdmissionMode (if specified)
Only highest AdmissionScope is used. Do you mean Queueing Policy (FIFO/BestEffort)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The candidates inside individual CQs are selected based on the specified logic and bubbled up.
mimowo
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First pass. For now trying to wrap my head around the overlap with the classical fair sharing, and the enablement scope. Also, I'm not entirely clear if we support reclamation in the new mode.
|
/lgtm |
|
LGTM label has been added. Git tree hash: 79e3e4ee380e8a269461ace0f105f414e8f9a438
|
|
/lgtm I'm not quite sure about the abstraction of "Admission Fair Sharing", but this is something we will need to figure out as we go. |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: mimowo, mwielgus The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
tenzen-y
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/hold
Basically lgtm
one comment for API typed
| * Modify CQ’s FairSharing struct with | ||
|
|
||
| ```go | ||
| type FairSharing struct { | ||
| // Weight denotes how important the given queue when competing against other queues | ||
| // for unused shared resources. The exact impact of the weight in fair share calculations | ||
| // depends on the fair share algorithm used. Default = 1. | ||
| Weight *resource.Quantity `json:"weight,omitempty"` | ||
| } | ||
| ``` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIUC, we already have this field
kueue/apis/kueue/v1beta1/clusterqueue_types.go
Lines 128 to 132 in 636d57a
| // fairSharing defines the properties of the ClusterQueue when | |
| // participating in FairSharing. The values are only relevant | |
| // if FairSharing is enabled in the Kueue configuration. | |
| // +optional | |
| FairSharing *FairSharing `json:"fairSharing,omitempty"` |
| Weight *resource.Quantity `json:"weight,omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, there is. I change the meaning a bit to be more generic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. We will change only the meaning, and not change the API's looking. Thanks.
Co-authored-by: Yuki Iwai <[email protected]>
| // LastUpdate is the time when share and consumed resources were updated. | ||
| LastUpdate metav1.Time `json:"lastUpdate,omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not use conditions on each resource instead of this dedicated LastUpdate?
In this approach, I think If we want to add any reason and message for FairShare, we need to add those condition similarity fields here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Conditions have last transition, not last update.
// lastTransitionTime is the last time the condition transitioned from one status to another.
// This should be when the underlying condition changed.
Here there is no transition, just updates.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. You mean this is the time when kueue takes a snapshot of consumed usage.
In that case, we probably want to say usageSnapshotAt, usageCalculationAt or something so that we can obviously clarify what is this time.
WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Personally I see LastUpdate time as more straightforward and potentially covering also WeightedShare or whatever else we add in the future. UsageSnapshotAt limits us, and if any new field is added then we will need another timestamp.
Co-authored-by: Yuki Iwai <[email protected]>
tenzen-y
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Other than https://github.com/kubernetes-sigs/kueue/pull/4252/files#r2050817373
lgtm, thanks
|
/lgtm @tenzen-y I believe we can address this later or during the implementation , and then we update KEP to align. |
|
LGTM label has been added. Git tree hash: 73d6215304b15875284d34eb036170d3f524c32f
|
|
/lgtm |
That makes sense. |
What type of PR is this?
/kind feature
/kind api-change
What this PR does / why we need it:
KEP for a new resource fair sharing method.
Which issue(s) this PR fixes:
Fixes #4136
Special notes for your reviewer:
Does this PR introduce a user-facing change?