Downward API support for HugePages #2055
Merged
k8s-ci-robot
merged 1 commit into
kubernetes:master
from
derekwaynecarr:downward-api-hugepages
Oct 6, 2020
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,265 @@ | ||
# KEP-1967: Downward API HugePages | ||
|
||
<!-- toc --> | ||
- [Release Signoff Checklist](#release-signoff-checklist) | ||
- [Summary](#summary) | ||
- [Motivation](#motivation) | ||
- [Goals](#goals) | ||
- [Non-Goals](#non-goals) | ||
- [Proposal](#proposal) | ||
- [Risks and Mitigations](#risks-and-mitigations) | ||
- [Design Details](#design-details) | ||
- [Test Plan](#test-plan) | ||
- [Graduation Criteria](#graduation-criteria) | ||
- [Alpha](#alpha) | ||
- [Alpha -> Beta Graduation](#alpha---beta-graduation) | ||
- [Beta -> GA Graduation](#beta---ga-graduation) | ||
- [Upgrade / Downgrade Strategy](#upgrade--downgrade-strategy) | ||
- [Version Skew Strategy](#version-skew-strategy) | ||
- [Production Readiness Review Questionnaire](#production-readiness-review-questionnaire) | ||
- [Feature Enablement and Rollback](#feature-enablement-and-rollback) | ||
- [Rollout, Upgrade and Rollback Planning](#rollout-upgrade-and-rollback-planning) | ||
- [Monitoring Requirements](#monitoring-requirements) | ||
- [Dependencies](#dependencies) | ||
- [Scalability](#scalability) | ||
- [Troubleshooting](#troubleshooting) | ||
- [Implementation History](#implementation-history) | ||
- [Drawbacks](#drawbacks) | ||
- [Alternatives](#alternatives) | ||
- [Infrastructure Needed (Optional)](#infrastructure-needed-optional) | ||
<!-- /toc --> | ||
|
||
## Release Signoff Checklist | ||
|
||
Items marked with (R) are required *prior to targeting to a milestone / release*. | ||
|
||
- [ ] (R) Enhancement issue in release milestone, which links to KEP dir in [kubernetes/enhancements] (not the initial KEP PR) | ||
- [ ] (R) KEP approvers have approved the KEP status as `implementable` | ||
- [ ] (R) Design details are appropriately documented | ||
- [ ] (R) Test plan is in place, giving consideration to SIG Architecture and SIG Testing input | ||
- [ ] (R) Graduation criteria is in place | ||
- [ ] (R) Production readiness review completed | ||
- [ ] Production readiness review approved | ||
- [ ] "Implementation History" section is up-to-date for milestone | ||
- [ ] User-facing documentation has been created in [kubernetes/website], for publication to [kubernetes.io] | ||
- [ ] Supporting documentation—e.g., additional design documents, links to mailing list discussions/SIG meetings, relevant PRs/issues, release notes | ||
|
||
## Summary | ||
|
||
This KEP exposes hugepages in the downward API. | ||
|
||
## Motivation | ||
|
||
Pods are unable to know their hugepage request or limits via the downward API. HugePages | ||
are a natively supported resource in Kubernetes and should be visible in downward API | ||
consistent with other resources like cpu, memory, ephemeral-storage. | ||
|
||
### Goals | ||
|
||
- Add support for hugepage requests and limits for all page sizes in downward API | ||
|
||
### Non-Goals | ||
|
||
- Change any other aspect of hugepage support | ||
|
||
## Proposal | ||
|
||
Define a new feature gate: `DownwardAPIHugePages`. | ||
|
||
When the feature gate is enabled, the `kube-apiserver` will allow pod specifications | ||
to make use of hugepages in the downward API. The `kubelet` will add support for | ||
hugepages in the downward API independent of the feature gate. | ||
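
As a rough sketch of what enabling the gate could look like, the fragment below passes `--feature-gates=DownwardAPIHugePages=true` to the `kube-apiserver`. The static pod manifest layout is an assumption for illustration; how the flag is actually set depends on how the control plane is deployed.

```yaml
# Fragment of a kube-apiserver static pod manifest.
# The manifest layout is an assumption, not part of this KEP;
# only the feature gate name comes from the proposal.
spec:
  containers:
  - name: kube-apiserver
    command:
    - kube-apiserver
    - --feature-gates=DownwardAPIHugePages=true
    # ...remaining flags unchanged...
```

No kubelet flag appears here because, per the proposal, kubelet support does not depend on the feature gate.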
|
||
### Risks and Mitigations | ||
|
||
The primary risk for this proposal is that it loosens validation for Pods. | ||
|
||
The mitigation proposed is as follows: | ||
|
||
- Add support for the new fields in `kubelet` by default. This is considered | ||
low risk as the code is inert when pods do not use the tokens, and the subsystem | ||
in the kubelet is localized. | ||
- The `kube-apiserver` will have the feature gate disabled by default for 2 | ||
releases until we know all supported skew scenarios result in all kubelets having | ||
the supported code present. | ||
|
||
When the gate is enabled, the `kube-apiserver` will permit the newly allowed | ||
values in all creation and update scenarios. When the gate is disabled, the | ||
new values are permitted only in updates of objects which already contain | ||
the new values. Use in creation or in updates of objects which do not | ||
already use the new values will fail validation. | ||
|
||
## Design Details | ||
|
||
Add support for `requests.hugepages-<pagesize>` and `limits.hugepages-<pagesize>` | ||
to downward API consistent with cpu, memory, and ephemeral storage. Enable the | ||
support by default in the kubelet, but gate its usage by default in the `kube-apiserver` | ||
for 2 releases to ensure all nodes in the cluster have proper support. | ||
|
||
It is important to remember that `hugepages-<pagesize>` is not a resource | ||
that is subject to overcommit. A pod must have a matching request and limit | ||
for an explicit `hugepages-<pagesize>` in order to consume hugepages. Absent | ||
an explicit request, no `hugepages-<pagesize>` is provided to a pod. | ||
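
A minimal sketch of a pod that consumes 2Mi hugepages and reads its own request and limit through the downward API might look like the following. The pod name, container name, image, environment variable names, and sizes are all assumptions chosen for illustration; only the `requests.hugepages-<pagesize>` and `limits.hugepages-<pagesize>` resource names come from this KEP.

```yaml
# Illustrative only: names, image, and sizes are hypothetical.
apiVersion: v1
kind: Pod
metadata:
  name: hugepages-downward-api-example
spec:
  containers:
  - name: app
    image: registry.example.com/app:latest   # hypothetical image
    resources:
      requests:
        memory: 128Mi
        hugepages-2Mi: 64Mi
      limits:
        memory: 128Mi
        hugepages-2Mi: 64Mi   # matches the request; hugepages are not overcommitted
    env:
    - name: HUGEPAGES_2MI_REQUEST
      valueFrom:
        resourceFieldRef:
          containerName: app
          resource: requests.hugepages-2Mi
    - name: HUGEPAGES_2MI_LIMIT
      valueFrom:
        resourceFieldRef:
          containerName: app
          resource: limits.hugepages-2Mi
```

The `resourceFieldRef` form is the same one already used for `cpu`, `memory`, and `ephemeral-storage`, which is the consistency this section calls for.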
|
||
The `kube-apiserver` will not require pods to make an explicit `hugepages-<pagesize>` | ||
request in its pod spec in order to use the field in the downward API. The rationale | ||
for this behavior is that pod templates for specific workload types may support | ||
running with or without `hugepages-<pagesize>` made available to them and as a result, | ||
it may include both memory and hugepages in the downward API in order to know how to adjust. | ||
The `kubelet` will ensure that the downward API value projected into the container for | ||
a specific `hugepages-<pagesize>` will match what is provided with its bounding pod | ||
and/or container cgroup. | ||
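
To illustrate the template case described above, the sketch below shows a pod that makes no hugepages request but still projects both its memory limit and its `hugepages-2Mi` limit into files through a downward API volume. The pod, container, volume, and path names are hypothetical. Under this proposal such a spec would be expected to pass validation, and the projected hugepages value should reflect the bounding cgroup.

```yaml
# Illustrative only: names, image, and paths are hypothetical.
# The container makes no hugepages request, yet still asks for the value.
apiVersion: v1
kind: Pod
metadata:
  name: hugepages-optional-example
spec:
  containers:
  - name: app
    image: registry.example.com/app:latest   # hypothetical image
    resources:
      requests:
        memory: 128Mi
      limits:
        memory: 128Mi
    volumeMounts:
    - name: podinfo
      mountPath: /etc/podinfo
  volumes:
  - name: podinfo
    downwardAPI:
      items:
      - path: memory_limit
        resourceFieldRef:
          containerName: app
          resource: limits.memory
      - path: hugepages_2mi_limit
        resourceFieldRef:
          containerName: app
          resource: limits.hugepages-2Mi
```

A workload built from a shared template could read `/etc/podinfo/hugepages_2mi_limit` at startup and fall back to regular memory when no hugepages of that size were provided.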
|
||
### Test Plan | ||
|
||
Unit and e2e testing will be added consistent with other resources in downward API. | ||
|
||
e2e testing will only function if a node in the cluster exposes hugepages, otherwise, | ||
it will gracefully skip (as expected). | ||
|
||
### Graduation Criteria | ||
|
||
|
||
#### Alpha | ||
|
||
- Feature gate is present and enforced in kube-apiserver | ||
- Validation logic is in-place in kube-apiserver | ||
- Kubelet has support for projecting the value in the pod | ||
- unit testing for downward API enhancement | ||
|
||
#### Alpha -> Beta Graduation | ||
|
||
- Added support in kube-apiserver protected by feature gate | ||
- Added support in kubelet for 2 releases. | ||
- e2e testing for hosts with hugepages enabled | ||
|
||
#### Beta -> GA Graduation | ||
|
||
- Enable support by default one release after kube-apiserver feature gate is enabled in beta. | ||
|
||
### Upgrade / Downgrade Strategy | ||
|
||
The kubelet will have the support for 2 releases before it is | ||
enabled in the kube-apiserver. This ensures that pods cannot | ||
get accepted in the platform for which nodes do not have support. | ||
|
||
### Version Skew Strategy | ||
|
||
The kubelet will have the support for 2 releases before it is | ||
enabled in the kube-apiserver. This ensures that pods cannot | ||
get accepted in the platform for which nodes do not have support. | ||
|
||
## Production Readiness Review Questionnaire | ||
|
||
### Feature Enablement and Rollback | ||
|
||
_This section must be completed when targeting alpha to a release._ | ||
|
||
* **How can this feature be enabled / disabled in a live cluster?** | ||
- [x] Feature gate (also fill in values in `kep.yaml`) | ||
- Feature gate name: DownwardAPIHugePages | ||
- Components depending on the feature gate: kube-apiserver | ||
- Will enabling / disabling the feature require downtime or reprovisioning | ||
of a node? No | ||
|
||
* **Does enabling the feature change any default behavior?** | ||
Yes, the kube-apiserver will admit pods that use the new downward API support. | ||
|
||
* **Can the feature be disabled once it has been enabled (i.e. can we roll back | ||
the enablement)?** | ||
Yes, but only if no pods that use the feature were admitted. | ||
|
||
* **What happens if we reenable the feature if it was previously rolled back?** | ||
Nothing. New pods will now accept the new fields in admission. | ||
|
||
* **Are there any tests for feature enablement/disablement?** | ||
No, this will be handled by coordinating support in the kubelet. | ||
|
||
### Rollout, Upgrade and Rollback Planning | ||
|
||
* **How can a rollout fail? Can it impact already running workloads?** | ||
If not all kubelets in a cluster have support for hugepages in the downward API | ||
before the kube-apiserver accepts pods that use it, a pod may not start with the | ||
downward API information made available. This would impact the operating | ||
environment for the application and not the cluster. | ||
|
||
* **What specific metrics should inform a rollback?** | ||
None. | ||
|
||
* **Were upgrade and rollback tested? Was the upgrade->downgrade->upgrade path tested?** | ||
I do not believe this is applicable. | ||
|
||
* **Is the rollout accompanied by any deprecations and/or removals of features, APIs, | ||
fields of API types, flags, etc.?** | ||
Even if applying deprecation policies, they may still surprise some users. | ||
No, validation is loosened but coordinated across N-2 releases. | ||
|
||
### Monitoring Requirements | ||
|
||
* **How can an operator determine if the feature is in use by workloads?** | ||
An operator could audit pods that use the new downward API tokens. | ||
|
||
* **What are the SLIs (Service Level Indicators) an operator can use to determine | ||
the health of the service?** | ||
This does not seem relevant to this feature. | ||
|
||
* **What are the reasonable SLOs (Service Level Objectives) for the above SLIs?** | ||
This does not seem relevant to this feature. | ||
|
||
* **Are there any missing metrics that would be useful to have to improve observability | ||
of this feature?** | ||
No. | ||
|
||
### Dependencies | ||
|
||
* **Does this feature depend on any specific services running in the cluster?** | ||
No | ||
|
||
### Scalability | ||
|
||
* **Will enabling / using this feature result in any new API calls?** | ||
No. | ||
|
||
* **Will enabling / using this feature result in introducing new API types?** | ||
No | ||
|
||
* **Will enabling / using this feature result in any new calls to the cloud | ||
provider?** | ||
No | ||
|
||
* **Will enabling / using this feature result in increasing size or count of | ||
the existing API objects?** | ||
No | ||
|
||
* **Will enabling / using this feature result in increasing time taken by any | ||
operations covered by [existing SLIs/SLOs]?** | ||
No | ||
|
||
* **Will enabling / using this feature result in non-negligible increase of | ||
resource usage (CPU, RAM, disk, IO, ...) in any components?** | ||
No | ||
|
||
### Troubleshooting | ||
|
||
* **How does this feature react if the API server and/or etcd is unavailable?** | ||
No impact. | ||
|
||
* **What are other known failure modes?** | ||
Not applicable. | ||
|
||
* **What steps should be taken if SLOs are not being met to determine the problem?** | ||
Not applicable | ||
|
||
## Implementation History | ||
|
||
## Drawbacks | ||
|
||
None. | ||
|
||
## Alternatives | ||
|
||
None. | ||
|
||
## Infrastructure Needed (Optional) | ||
|
||
None. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,47 @@ | ||
title: Downward API HugePages | ||
kep-number: 2053 | ||
authors: | ||
- "@derekwaynecarr" | ||
owning-sig: sig-node | ||
participating-sigs: [] | ||
status: implementable | ||
creation-date: 2020-06-18 | ||
reviewers: | ||
- "@dashpole" | ||
- "@sjenning" | ||
approvers: | ||
- "@dashpole" | ||
- "@sjenning" | ||
- "@dchen1107" | ||
prr-approvers: | ||
- "deads2k" | ||
- "johnbelamaric" | ||
- "wojtek-t" | ||
see-also: | ||
- "/keps/sig-node/20190129-hugepages.md" | ||
replaces: [] | ||
|
||
# The target maturity stage in the current dev cycle for this KEP. | ||
stage: alpha | ||
|
||
# The most recent milestone for which work toward delivery of this KEP has been | ||
# done. This can be the current (upcoming) milestone, if it is being actively | ||
# worked on. | ||
latest-milestone: "v1.20" | ||
|
||
# The milestone at which this feature was, or is targeted to be, at each stage. | ||
milestone: | ||
alpha: "v1.20" | ||
beta: "v1.21" | ||
stable: "v1.22" | ||
|
||
# The following PRR answers are required at alpha release | ||
# List the feature gate name and the components for which it must be enabled | ||
feature-gates: | ||
- name: DownwardAPIHugePages | ||
components: | ||
- kube-apiserver | ||
disable-supported: true | ||
|
||
metrics: | ||
- "N/A" |
will validation ensure that only pagesizes actually present in requests/limits are allowed? if not, what value is used if a pagesize that is not set as a request or limit is specified as a downward API value?
The value projected in the pod will always match the value presented in the cgroup (which, absent an explicit request, is 0) because hugepages are not subject to overcommit like memory. Validation will not require a value present in request/limit, as I anticipate pod templates will use the downward API form in order to know if a resource request was made or not for specific page sizes as the same workload is applied to nodes with different page sizes, or have some kustomize template that omits the request when testing, but the workload would still want to know it has 0 pages at that size.
Are supported page sizes well-known? Would a pod template have to enumerate a lot of pagesizes to be aware of all of the possible values? How would an injected container discover hugepage requests/limits via this mechanism without first-hand knowledge of the pod template it was injected into? All of the existing downwardAPI resource keys are fixed (e.g. memory, cpu, ephemeral-storage), so it's easy to discover resource configurations.
@liggitt the list of pagesizes varies by architecture. there are a practical set of pagesizes supported by things like runc. the validation for the downward api field will not vary from the validation for resource requirements today for hugepages.