Added proposal with priority but assured concurrency and fairness #930
Conversation
[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: MikeSpreitzer

If they are not already assigned, you can assign the PR to them by writing `/assign` in a comment. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing `/approve` in a comment.

CC @yue9944882
.. to make the management like today's max-in-flight handler.

```yaml
queueLengthLimit: 100
```

Some flow schemata.
I believe the api model of flow schema here can be simplified by:

- removing support for `patternMatch` and `nonPatternMatch`. Additionally, all controllers share the same user identity as `system:kube-controller-manager` but differ in the user-agent HTTP header.
- making the field `matchingPriority` optional, filled only to resolve conflicts among multiple flow schemas. A default value will work for most cases, I suppose.
- instead of defining explicit boolean tests in the api model, making it a bit simpler at the cost of some ambiguity, as the yaml examples below show (a matching sketch follows the examples):
kind: FlowSchema
meta:
  name: system-top
spec:
  matchingPriority: 500 # default to 500, override if necessary
  requestPriority:
    name: system-top
  matchRules:
    - field: groups
      valueSet: [ "system:masters" ]
---
kind: FlowSchema
meta:
  name: system-high-node-heartbeat
spec:
  matchingPriority: 500 # default to 500, override if necessary
  requestPriority:
    name: system-high
  flowDistinguisher:
    source: user
    # no transformation in this case
  matchRules:
    - field: groups
      valueSet: [ "system:nodes" ]
    - field: resource
      valueSet: [ "nodes" ]
---
kind: FlowSchema
meta:
  name: system-high-system-objects
spec:
  matchingPriority: 500 # default to 500, override if necessary
  requestPriority:
    name: system-high
  flowDistinguisher:
    source: user
    # no transformation in this case
  matchRules:
    - field: groups
      valueSet: [ "system:nodes" ]
    - field: namespace
      valueSet: [ "kube-system" ]
---
kind: FlowSchema
meta:
  name: workload-low-serviceaccounts
spec:
  matchingPriority: 500 # default to 500, override if necessary
  requestPriority:
    name: workload-low
  flowDistinguisher:
    source: namespace
    # no transformation in this case
  matchRules:
    - field: groups
      valueSet: [ "system:serviceaccounts" ]
....
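To make the sketched matching semantics concrete, here is a minimal Go sketch of how requests could be classified against such schemas. The type names, the AND semantics across `matchRules`, and the lower-value-wins tie-break on `matchingPriority` are assumptions for illustration, not something this comment or the proposal pins down.

```go
package main

import (
	"fmt"
	"sort"
)

// Hypothetical types mirroring the YAML above; names are illustrative.
type MatchRule struct {
	Field    string   // e.g. "groups", "resource", "namespace"
	ValueSet []string // rule matches if the request's value for Field is in this set
}

type FlowSchema struct {
	Name             string
	MatchingPriority int // assumption: lower value wins when several schemas match
	RequestPriority  string
	MatchRules       []MatchRule // assumption: all rules must match (AND)
}

// requestAttrs flattens the attributes of an incoming request.
type requestAttrs map[string][]string

// matches reports whether every rule of the schema is satisfied.
func matches(fs FlowSchema, req requestAttrs) bool {
	for _, rule := range fs.MatchRules {
		ok := false
		for _, have := range req[rule.Field] {
			for _, want := range rule.ValueSet {
				if have == want {
					ok = true
				}
			}
		}
		if !ok {
			return false
		}
	}
	return true
}

// classify picks the matching schema with the lowest matchingPriority,
// breaking ties by name so the result is deterministic.
func classify(schemas []FlowSchema, req requestAttrs) (FlowSchema, bool) {
	sort.Slice(schemas, func(i, j int) bool {
		if schemas[i].MatchingPriority != schemas[j].MatchingPriority {
			return schemas[i].MatchingPriority < schemas[j].MatchingPriority
		}
		return schemas[i].Name < schemas[j].Name
	})
	for _, fs := range schemas {
		if matches(fs, req) {
			return fs, true
		}
	}
	return FlowSchema{}, false
}

func main() {
	schemas := []FlowSchema{
		{Name: "system-top", MatchingPriority: 500, RequestPriority: "system-top",
			MatchRules: []MatchRule{{Field: "groups", ValueSet: []string{"system:masters"}}}},
		{Name: "system-high-node-heartbeat", MatchingPriority: 500, RequestPriority: "system-high",
			MatchRules: []MatchRule{
				{Field: "groups", ValueSet: []string{"system:nodes"}},
				{Field: "resource", ValueSet: []string{"nodes"}},
			}},
	}
	req := requestAttrs{"groups": {"system:nodes"}, "resource": {"nodes"}}
	if fs, ok := classify(schemas, req); ok {
		fmt.Printf("classified to %s (request priority %s)\n", fs.Name, fs.RequestPriority)
	}
}
```

A node heartbeat request here classifies to `system-high-node-heartbeat` rather than `system-top`, since only the former's rules all match.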
> all controllers share the same user identity as `system:kube-controller-manager` but differ in the user-agent HTTP header.

that's not generally accurate... typically, each controller loop gets a distinct identity
(ACV), which is calculated as follows.

```
ACV(l) = ceil( SCL * ACS(l) / (100 + sum[priority levels k] ACS(k)) )
```
why is it 100? i have a complex feeling about introducing more magic numbers..
This number is not particularly magic. It just establishes a magnitude standard to compare the ACS values with. All that matters is their ratios. We could create another configuration parameter to define the share for shared concurrency. But, without loss of generality, we can fix it at a particular number.
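To see what the 100 does in practice, here is a small worked example applying the quoted formula. The SCL and ACS values below are made-up numbers for illustration only, not figures from the proposal.

```go
package main

import (
	"fmt"
	"math"
)

func main() {
	// Illustrative numbers only (assumptions, not from the proposal):
	// scl is the server concurrency limit (SCL); acs maps each priority
	// level to its assured concurrency share (ACS).
	scl := 600.0
	acs := map[string]float64{
		"system-top":   10,
		"system-high":  30,
		"workload-low": 60,
	}

	// The denominator adds 100 to the total of the shares; as the comment
	// above explains, 100 is just a magnitude standard that reserves part
	// of the total for shared (non-assured) concurrency.
	var sum float64
	for _, s := range acs {
		sum += s
	}
	for level, s := range acs {
		acv := math.Ceil(scl * s / (100 + sum))
		fmt.Printf("ACV(%s) = ceil(600 * %.0f / (100 + %.0f)) = %.0f\n", level, s, sum, acv)
	}
}
```

With these numbers the assured values come to 30 + 90 + 180 = 300, leaving the other half of the 600-slot limit as shared concurrency, which is exactly the 100/(100 + 100) fraction the constant reserves.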
In the meeting today we agreed to start simpler, and #933 is an attempt to capture that agreement.
@MikeSpreitzer: PR needs rebase. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
@MikeSpreitzer - should this one be closed now?
It is not something we are trying to merge now. OTOH, it has a more advanced design that I think we will want to look at second.
Issues go stale after 90d of inactivity. If this issue is safe to close now please do so with `/close`. Send feedback to sig-testing, kubernetes/test-infra and/or fejta.

Stale issues rot after 30d of inactivity. If this issue is safe to close now please do so with `/close`. Send feedback to sig-testing, kubernetes/test-infra and/or fejta.

Rotten issues close after 30d of inactivity. Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
@fejta-bot: Closed this PR. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.