Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Downward API support for HugePages #2055

Merged

Conversation

derekwaynecarr
Copy link
Member

Adding support for hugepages in the downward API requires validation to loosen.

As a result, we need to coordinate adding support in the kubelet, and ensure that the feature on the kube-apiserver is not enabled for 2 releases to ensure all nodes in the cluster have latest support before enabling the feature by default for existing clusters. This KEP describes that process.

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Oct 2, 2020
@k8s-ci-robot k8s-ci-robot added approved Indicates a PR has been approved by an approver from all required OWNERS files. kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory sig/node Categorizes an issue or PR as relevant to SIG Node. labels Oct 2, 2020
@derekwaynecarr
Copy link
Member Author

/cc @sjenning @liggitt @dchen1107 PTAL

@k8s-ci-robot
Copy link
Contributor

@derekwaynecarr: GitHub didn't allow me to request PR reviews from the following users: PTAL.

Note that only kubernetes members and repo collaborators can review this PR, and authors cannot review their own PRs.

In response to this:

/cc @sjenning @liggitt @dchen1107 PTAL

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@derekwaynecarr
Copy link
Member Author

/milestone v1.20


The primary risk for this proposal is that it loosens validation for
Pods accepted by the API server. As a result, the `kube-apiserver` should
not enable support for this feature until all known `kubelet` in the cluster
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this is not a new field, but is a newly accepted value for an existing field, clarify what the gate protects. I would expect:

  • With the gate enabled, the newly allowed values are permitted in all creation and update scenarios.
  • When the gate disabled, the new values are permitted only in updates of objects which already contain the new values. Use in creation or in updates of objects which do not already use the new values will fail validation.

feature-gates:
- name: DownwardAPIHugePages
components:
- kubelet
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clarify what the feature gate controls in the kubelet... I don't think we have a great mechanism for enabling a gate in one component independently, so the "enable in the kubelet 2 releases before the apiserver" bit might be tricky to do with a single gate. Options are multiple gates, or leaving the kubelet code path ungated to be triggered by the presence of the data in the API

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer kubelet behavior is ungated, apiserver is gated, will clarify.

@liggitt
Copy link
Member

liggitt commented Oct 2, 2020

largely lgtm from an API perspective, just a couple clarifications/comments

@derekwaynecarr
Copy link
Member Author

@liggitt addressed feedback, thanks again! PTAL.

@danwinship
Copy link
Contributor

Adding support for hugepages in the downward API requires validation to loosen.

Wouldn't every new downward API element have this same problem? Maybe the fix is that apiserver should stop validating downward API field selectors and just let kubelet do something appropriate with invalid ones

@Billy99
Copy link

Billy99 commented Oct 2, 2020

@danwinship I don't have an issue with your last statement, just for my understanding, would that mean that if I entered an invalid DownwardAPI field in the podSpec, the pod would be created but of coarse the field would not be exposed to the pod?

@liggitt
Copy link
Member

liggitt commented Oct 2, 2020

Wouldn't every new downward API element have this same problem?

Yes

Maybe the fix is that apiserver should stop validating downward API field selectors and just let kubelet do something appropriate with invalid ones

deferring failure of stringly typed expressions to async pod runtime makes for a pretty poor user experience


## Design Details

Add support for `requests.hugepages-<pagesize>` and `limits.hugepages-<pagesize>`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will validation ensure that only pagesizes actually present in requests/limits are allowed? if not, what value is used if a pagesize that is not set as a request or limit is specified as a downward API value?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

value projected in pod will always match value presented in the cgroup (which absent an explicit request is 0) because hugepages are not subject to overcommit like memory. validation will not require a value present in request/limit as i anticipate pod templates will use the downward API form in order to know if a resource request was made or not for specific page sizes as the same workload is applied to nodes with different page sizes, or have some kustomize template that omits the request when testing, but the workload would still want to know it has 0 pages at that size..

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i anticipate pod templates will use the downward API form in order to know if a resource request was made or not for specific page sizes as the same workload is applied to nodes with different page sizes

Are supported page sizes well-known? Would a pod template have to enumerate a lot of pagesizes to be aware of all of the possible values? How would an injected container discover hugepage requests/limits via this mechanism without first-hand knowledge of the pod template it was injected into? All of the existing downwardAPI resource keys are fixed (e.g. memory, cpu, ephemeral-storage), so it's easy to discovery resource configurations.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@liggitt the list of pagesizes varies by architecture. there are a practical set of pagesizes supported by things like runc. the validation for the downward api field will not vary from the validation for resource requirements today for hugepages.

@liggitt
Copy link
Member

liggitt commented Oct 2, 2020

additions look good, thanks. I had one other question about validation and behavior for things that match the pattern but do not have a corresponding pagesize present in requests/limits

@derekwaynecarr
Copy link
Member Author

@liggitt added text for requested updates in design detail section.
@kikisdeliveryservice added alpha criteria!

thanks all for the prompt assistance!

@danwinship
Copy link
Contributor

deferring failure of strongly typed expressions to async pod runtime makes for a pretty poor user experience

sure but so does implementing new features and then telling users they can't use them for two releases...

wasn't there some idea floating around somewhere about non-fatal warnings on creation?

@derekwaynecarr
Copy link
Member Author

@danwinship in retrospect, it was a mistake that we didn't have downward API for hugepages when initially adding support for hugepages. i would say in the future, any new standard resource types should have downward API covered as part of the alpha/beta criteria.

@dchen1107
Copy link
Member

/lgtm
/approve

Agreed with @derekwaynecarr at #2055 (comment) We can enforce this during the KEP review from now on.

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Oct 6, 2020
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: dchen1107, derekwaynecarr

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot merged commit 9d4e9c9 into kubernetes:master Oct 6, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory lgtm "Looks good to me", indicates that a PR is ready to be merged. sig/node Categorizes an issue or PR as relevant to SIG Node. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants