-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Downward API support for HugePages #2055
Downward API support for HugePages #2055
Conversation
/cc @sjenning @liggitt @dchen1107 PTAL |
@derekwaynecarr: GitHub didn't allow me to request PR reviews from the following users: PTAL. Note that only kubernetes members and repo collaborators can review this PR, and authors cannot review their own PRs. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/milestone v1.20 |
|
||
The primary risk for this proposal is that it loosens validation for | ||
Pods accepted by the API server. As a result, the `kube-apiserver` should | ||
not enable support for this feature until all known `kubelet` in the cluster |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this is not a new field, but is a newly accepted value for an existing field, clarify what the gate protects. I would expect:
- With the gate enabled, the newly allowed values are permitted in all creation and update scenarios.
- When the gate disabled, the new values are permitted only in updates of objects which already contain the new values. Use in creation or in updates of objects which do not already use the new values will fail validation.
feature-gates: | ||
- name: DownwardAPIHugePages | ||
components: | ||
- kubelet |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
clarify what the feature gate controls in the kubelet... I don't think we have a great mechanism for enabling a gate in one component independently, so the "enable in the kubelet 2 releases before the apiserver" bit might be tricky to do with a single gate. Options are multiple gates, or leaving the kubelet code path ungated to be triggered by the presence of the data in the API
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I prefer kubelet behavior is ungated, apiserver is gated, will clarify.
largely lgtm from an API perspective, just a couple clarifications/comments |
c18bca7
to
f89d463
Compare
@liggitt addressed feedback, thanks again! PTAL. |
Wouldn't every new downward API element have this same problem? Maybe the fix is that apiserver should stop validating downward API field selectors and just let kubelet do something appropriate with invalid ones |
@danwinship I don't have an issue with your last statement, just for my understanding, would that mean that if I entered an invalid DownwardAPI field in the podSpec, the pod would be created but of coarse the field would not be exposed to the pod? |
Yes
deferring failure of stringly typed expressions to async pod runtime makes for a pretty poor user experience |
|
||
## Design Details | ||
|
||
Add support for `requests.hugepages-<pagesize>` and `limits.hugepages-<pagesize>` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will validation ensure that only pagesizes actually present in requests/limits are allowed? if not, what value is used if a pagesize that is not set as a request or limit is specified as a downward API value?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
value projected in pod will always match value presented in the cgroup (which absent an explicit request is 0) because hugepages are not subject to overcommit like memory. validation will not require a value present in request/limit as i anticipate pod templates will use the downward API form in order to know if a resource request was made or not for specific page sizes as the same workload is applied to nodes with different page sizes, or have some kustomize template that omits the request when testing, but the workload would still want to know it has 0 pages at that size..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i anticipate pod templates will use the downward API form in order to know if a resource request was made or not for specific page sizes as the same workload is applied to nodes with different page sizes
Are supported page sizes well-known? Would a pod template have to enumerate a lot of pagesizes to be aware of all of the possible values? How would an injected container discover hugepage requests/limits via this mechanism without first-hand knowledge of the pod template it was injected into? All of the existing downwardAPI resource keys are fixed (e.g. memory, cpu, ephemeral-storage), so it's easy to discovery resource configurations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@liggitt the list of pagesizes varies by architecture. there are a practical set of pagesizes supported by things like runc. the validation for the downward api field will not vary from the validation for resource requirements today for hugepages.
additions look good, thanks. I had one other question about validation and behavior for things that match the pattern but do not have a corresponding pagesize present in requests/limits |
f89d463
to
37f78d8
Compare
37f78d8
to
029967e
Compare
@liggitt added text for requested updates in design detail section. thanks all for the prompt assistance! |
sure but so does implementing new features and then telling users they can't use them for two releases... wasn't there some idea floating around somewhere about non-fatal warnings on creation? |
@danwinship in retrospect, it was a mistake that we didn't have downward API for hugepages when initially adding support for hugepages. i would say in the future, any new standard resource types should have downward API covered as part of the alpha/beta criteria. |
/lgtm Agreed with @derekwaynecarr at #2055 (comment) We can enforce this during the KEP review from now on. |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: dchen1107, derekwaynecarr The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Adding support for hugepages in the downward API requires validation to loosen.
As a result, we need to coordinate adding support in the kubelet, and ensure that the feature on the kube-apiserver is not enabled for 2 releases to ensure all nodes in the cluster have latest support before enabling the feature by default for existing clusters. This KEP describes that process.