Merge pull request #1985 from dims/remove-dockershim-from-kubelet
Removing dockershim from kubelet
Showing 2 changed files with 343 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,302 @@ | ||
# KEP-1985: Removing dockershim from kubelet | ||
|
||
<!-- toc --> | ||
- [Release Signoff Checklist](#release-signoff-checklist) | ||
- [Terms](#terms) | ||
- [Summary](#summary) | ||
- [Motivation](#motivation) | ||
- [Pros](#pros) | ||
- [Cons](#cons) | ||
- [Goals](#goals) | ||
- [Non-Goals](#non-goals) | ||
- [Proposal](#proposal) | ||
- [Dockershim removal criteria](#dockershim-removal-criteria) | ||
- [Dockershim removal plan](#dockershim-removal-plan) | ||
- [Risks and Mitigations](#risks-and-mitigations) | ||
- [Test Plan](#test-plan) | ||
- [Graduation Criteria](#graduation-criteria) | ||
- [Upgrade / Downgrade Strategy](#upgrade--downgrade-strategy) | ||
- [Version Skew Strategy](#version-skew-strategy) | ||
- [Production Readiness Review Questionnaire](#production-readiness-review-questionnaire) | ||
- [Feature Enablement and Rollback](#feature-enablement-and-rollback) | ||
- [Rollout, Upgrade and Rollback Planning](#rollout-upgrade-and-rollback-planning) | ||
- [Monitoring Requirements](#monitoring-requirements) | ||
- [Dependencies](#dependencies) | ||
- [Scalability](#scalability) | ||
- [Troubleshooting](#troubleshooting) | ||
- [Implementation History](#implementation-history) | ||
- [Drawbacks](#drawbacks) | ||
- [Alternatives](#alternatives) | ||
- [Infrastructure Needed (Optional)](#infrastructure-needed-optional) | ||
<!-- /toc --> | ||
|
||
## Release Signoff Checklist | ||
|
||
Items marked with (R) are required *prior to targeting to a milestone / release*. | ||
|
||
- [ ] (R) Enhancement issue in release milestone, which links to KEP dir in [kubernetes/enhancements] (not the initial KEP PR) | ||
- [ ] (R) KEP approvers have approved the KEP status as `implementable` | ||
- [ ] (R) Design details are appropriately documented | ||
- [ ] (R) Test plan is in place, giving consideration to SIG Architecture and SIG Testing input | ||
- [ ] (R) Graduation criteria is in place | ||
- [ ] (R) Production readiness review completed | ||
- [ ] Production readiness review approved | ||
- [ ] "Implementation History" section is up-to-date for milestone | ||
- [ ] User-facing documentation has been created in [kubernetes/website], for publication to [kubernetes.io] | ||
- [ ] Supporting documentation—e.g., additional design documents, links to mailing list discussions/SIG meetings, relevant PRs/issues, release notes | ||
|
||
## Terms | ||
|
||
- **CRI:** Container Runtime Interface – a plugin interface which enables kubelet to use a wide variety of container | ||
runtimes, without the need to recompile. | ||
|
||
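As a rough illustration of the plugin idea, the sketch below models a runtime interface that a caller selects only by endpoint, so swapping runtimes needs no recompilation of the caller. `RuntimeService`, `fakeRuntime`, `connect`, and the socket paths are hypothetical names invented for this example; the real CRI is a gRPC API with a much larger surface.

```go
package main

import (
	"fmt"
	"strings"
)

// RuntimeService is a hypothetical, heavily reduced stand-in for the CRI
// runtime API. It is NOT the real CRI definition.
type RuntimeService interface {
	Version() string
	RunPodSandbox(name string) (string, error)
}

// fakeRuntime simulates a runtime that lives behind an endpoint.
type fakeRuntime struct {
	name string
}

func (r fakeRuntime) Version() string { return r.name + "/0.0.0-example" }

func (r fakeRuntime) RunPodSandbox(name string) (string, error) {
	if name == "" {
		return "", fmt.Errorf("sandbox name must not be empty")
	}
	return r.name + "-sandbox-" + name, nil
}

// connect chooses a runtime purely from the endpoint string. The caller is
// compiled once and never needs to know which runtime it is talking to.
func connect(endpoint string) (RuntimeService, error) {
	if !strings.HasPrefix(endpoint, "unix://") {
		return nil, fmt.Errorf("unsupported endpoint %q", endpoint)
	}
	// Derive an illustrative runtime name from the socket file name.
	base := endpoint[strings.LastIndex(endpoint, "/")+1:]
	return fakeRuntime{name: strings.TrimSuffix(base, ".sock")}, nil
}

func main() {
	endpoints := []string{"unix:///run/example-a.sock", "unix:///run/example-b.sock"}
	for _, ep := range endpoints {
		rt, err := connect(ep)
		if err != nil {
			fmt.Println("connect failed:", err)
			continue
		}
		id, err := rt.RunPodSandbox("web")
		if err != nil {
			fmt.Println("sandbox failed:", err)
			continue
		}
		fmt.Println(rt.Version(), id)
	}
}
```

The point of the design is that the runtime choice is configuration (an endpoint), not a compile-time dependency of the caller.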
## Summary | ||
|
||
Dockershim, the CRI implementation for Docker, is currently built into the kubelet code base. This proposal aims | ||
to deprecate dockershim and subsequently remove it from kubelet. | ||
|
||
## Motivation | ||
|
||
In Kubernetes, the CRI is used to talk to a container runtime. CRI is designed so that a CRI implementation can run | ||
as a separate binary. However, the CRI implementation for Docker (a.k.a. dockershim) is currently part of the kubelet code, runs | ||
as part of kubelet, and is tightly coupled with kubelet's lifecycle. | ||
|
||
This is not ideal, because kubelet then depends on a specific container runtime. That dependency creates a maintenance burden not | ||
only for developers in sig-node, but also for cluster administrators when critical issues (e.g. a runc CVE) affect container | ||
runtimes. The pros of removing dockershim are straightforward: | ||
|
||
### Pros | ||
- Docker is not special and should be a CRI implementation like every other CRI implementation in our ecosystem. | ||
- Currently, dockershim "enjoys" some inconsistent integrations for various reasons (see [legacyLogProvider](https://cs.k8s.io/?q=legacyLogProvider&i=nope&files=&repos=kubernetes/kubernetes) for example). Removing these "features" should reduce the maintenance burden of kubelet. | ||
- cri-dockerd can be maintained independently by folks who are interested in keeping this functionality. | ||
- Over time we can remove vendored docker dependencies in kubelet. | ||
- Because inheriting the built-in shim for the container runtime is convenient, there is less incentive to move to | ||
new container runtimes. Production issues that have already been found and addressed in both containerd and CRI-O may | ||
still cause production issues for some users, which places a large maintenance burden on the Kubernetes community. | ||
- The community can focus and move faster on new container runtime-related enhancements once we drop dockershim. | ||
|
||
Having said that, the cons of removing the built-in dockershim require careful attention: | ||
|
||
### Cons | ||
- Deployment pain with a new binary in addition to kubelet. | ||
- An additional component may add to the current complexity. This may be relieved as Docker versions evolve. | ||
- The number of affected users may be large. | ||
- Users must change their existing experience of using Kubernetes with Docker. | ||
- Users have to change their existing workflows to adapt to these changes. | ||
- Other impacts that have not been recorded yet. | ||
- CRI is still technically in alpha and should graduate to GA before dockershim is removed from kubelet. | ||
[KEP 2041](https://github.com/kubernetes/enhancements/pull/2041) proposes graduating the CRI API version. | ||
- cri-dockerd will vendor kubernetes/kubernetes, which may be hard to maintain. | ||
- cri-dockerd, as independent software running on the node, should be allocated enough resources to guarantee its availability. | ||
|
||
> You can check [the discussion in sig-node mailing list](https://groups.google.com/forum/#!msg/kubernetes-sig-node/0qVzfugYhro/l6Au216XAgAJ) for more details. | ||
### Goals | ||
|
||
- Concrete dockershim removal criteria. | ||
- A brief plan to remove dockershim spanning multiple releases. | ||
|
||
### Non-Goals | ||
|
||
- Refactoring or redesigning dockershim itself as part of the deprecation. | ||
|
||
## Proposal | ||
|
||
### Dockershim removal criteria | ||
|
||
- CRI itself is alpha, so another KEP is needed to graduate the CRI API. | ||
- kubelet has no dependency on dockershim/docker in its whole lifecycle. This is already done using the `dockerless` build tag. | ||
- All node related features are CRI generic and have no "back door" dependency on dockershim/docker. | ||
- Deprecate and remove, or replace all Docker-specific features. | ||
- Benchmark results show an acceptable level of performance degradation after dockershim moves out of tree. | ||
- The E2E test framework has been updated to fully support out-of-tree CRI container runtimes. | ||
|
||
### Dockershim removal plan | ||
|
||
Step 1: Deprecate in-tree dockershim and decouple dockershim from kubelet. | ||
|
||
Target releases: 1.20, 1.21 | ||
|
||
Actions: | ||
|
||
- Mark in-tree dockershim as "maintenance mode": | ||
- CRI generic changes/features can continue on dockershim. | ||
- WIP efforts on dockershim can continue to completion. | ||
- dockershim/docker specific changes/features should be rejected. | ||
- Deprecate the legacy features of dockershim in kubelet by providing a specific timeline. Currently, kubelet still has: | ||
- vendored dockershim | ||
- flags that are used to configure dockershim. | ||
- support to get container logs when docker uses journald as the driver. | ||
- logic of moving docker processes to a given cgroup | ||
- Ensure e2e/Node e2e test framework is CRI generic and test cases are independent of container runtime. | ||
- Refactor the e2e/Node e2e test framework to include installing the CRI implementation for Docker (or use another CRI container runtime). | ||
- Ensure cluster/node e2e are 100% CRI focused. | ||
- Ensure test-infra installs appropriate CRI implementations on e2e machines. | ||
- Ensure Windows scenarios that currently depend on docker are fully supported in alternative CRI implementations. | ||
|
||
Step 2: Release kubelet without dockershim | ||
|
||
Target releases: 1.22 | ||
|
||
Actions: | ||
- Document and announce the migration guide. | ||
- The release harness would build kubelet with the `dockerless` tag on, so the default build will not support Docker out of | ||
the box. | ||
- Folks who need this support would have to build kubelet themselves, as the code is still present in the | ||
source tree. | ||
|
||
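A minimal sketch of what a `dockerless` build means for runtime selection, under stated assumptions: `dockershimCompiledIn`, `selectRuntime`, and the runtime kind strings are hypothetical names for this example and are not taken from the kubelet source. In a real source tree the constant would typically be set by two files guarded with `//go:build dockerless` and `//go:build !dockerless`; here it is hard-coded to model the release build.

```go
package main

import (
	"errors"
	"fmt"
)

// dockershimCompiledIn is a hypothetical flag. It is hard-coded to false to
// model a binary built with the `dockerless` tag; a real tree would set it
// from build-constrained files instead.
const dockershimCompiledIn = false

var errNoDockershim = errors.New("docker runtime requested, but this binary was built without dockershim")

// selectRuntime returns a description of the runtime the caller would use.
// kind and remoteEndpoint are illustrative parameters, not real kubelet flags.
func selectRuntime(kind, remoteEndpoint string, dockershimAvailable bool) (string, error) {
	switch kind {
	case "docker":
		if !dockershimAvailable {
			// The default release build ends up here: no in-process shim.
			return "", errNoDockershim
		}
		return "in-process dockershim", nil
	case "remote":
		if remoteEndpoint == "" {
			return "", errors.New("remote runtime requires an endpoint")
		}
		return "remote CRI runtime at " + remoteEndpoint, nil
	default:
		return "", fmt.Errorf("unknown runtime kind %q", kind)
	}
}

func main() {
	for _, kind := range []string{"docker", "remote"} {
		desc, err := selectRuntime(kind, "unix:///run/example.sock", dockershimCompiledIn)
		if err != nil {
			fmt.Println(kind, "->", "error:", err)
			continue
		}
		fmt.Println(kind, "->", desc)
	}
}
```

Users who rebuild without the tag correspond to calling `selectRuntime` with `dockershimAvailable` set to `true`.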
Step 3: Completely remove in-tree dockershim from kubelet. | ||
|
||
Target releases: The deprecation period should last at least a year, so the target is the earliest possible release after that period. | ||
|
||
Actions: | ||
|
||
- Delete the in-tree dockershim code from kubelet after a certain "grace period". | ||
|
||
### Risks and Mitigations | ||
|
||
The easier we make it for folks to switch to other CRI implementations, the lower the risk. Another option would be for | ||
folks to build a brand new CRI implementation that targets Docker. Even this option means that folks will have to | ||
run an extra process outside of kubelet. The worst case scenario is for us to carry dockershim for a couple | ||
more releases. | ||
|
||
### Test Plan | ||
|
||
Node e2e testing will be augmented to test kubelet built with the `dockerless` tag. | ||
|
||
### Graduation Criteria | ||
|
||
- All feedback gathered from users | ||
- Adequate test signal quality for node e2e | ||
- Tests are in Testgrid and linked in KEP | ||
- Allowing time for additional user feedback and bug reports | ||
|
||
### Upgrade / Downgrade Strategy | ||
|
||
Upgrade: Users should follow the migration guide before upgrading to a version of the kubelet that no longer | ||
includes dockershim. | ||
|
||
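One way an operator might find nodes that need the migration guide before upgrading is to look at each node's reported container runtime version, which uses a `<runtime>://<version>` form (for example `docker://...`). The sketch below is only an assumption-laden illustration: `usesDocker`, `nodesToMigrate`, the node names, and the version strings are invented for this example, and it works on a plain map rather than calling any Kubernetes API.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// usesDocker reports whether a reported runtime version string names Docker.
// It assumes the "<runtime>://<version>" form.
func usesDocker(containerRuntimeVersion string) bool {
	return strings.HasPrefix(containerRuntimeVersion, "docker://")
}

// nodesToMigrate returns, in sorted order, the names of nodes whose reported
// runtime is Docker and which therefore rely on dockershim.
func nodesToMigrate(runtimeByNode map[string]string) []string {
	var out []string
	for node, version := range runtimeByNode {
		if usesDocker(version) {
			out = append(out, node)
		}
	}
	sort.Strings(out)
	return out
}

func main() {
	// Hypothetical inventory; real data would come from node status.
	inventory := map[string]string{
		"node-a": "docker://19.3.13",
		"node-b": "containerd://1.4.1",
		"node-c": "docker://20.10.0",
	}
	for _, node := range nodesToMigrate(inventory) {
		fmt.Println("needs migration before upgrade:", node)
	}
}
```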
Downgrade: Not applicable. | ||
|
||
### Version Skew Strategy | ||
|
||
Not applicable. | ||
|
||
## Production Readiness Review Questionnaire | ||
|
||
### Feature Enablement and Rollback | ||
|
||
_This section must be completed when targeting alpha to a release._ | ||
|
||
* **How can this feature be enabled / disabled in a live cluster?** | ||
- [ ] Feature gate (also fill in values in `kep.yaml`) | ||
- Feature gate name: NONE | ||
- Components depending on the feature gate: kubelet | ||
- Will enabling / disabling the feature require downtime or reprovisioning | ||
of a node? No | ||
|
||
* **Does enabling the feature change any default behavior?** | ||
Yes, the kubelet will size the empty dir volume to match the precise | ||
amount of memory the pod is able to write rather than over or undersizing. | ||
Prior behavior is node dependent, and so pod authors had no mechanism | ||
to control this behavior properly. | ||
|
||
* **Can the feature be disabled once it has been enabled (i.e. can we roll back | ||
the enablement)?** Yes | ||
|
||
* **What happens if we reenable the feature if it was previously rolled back?** | ||
Pods that run on that node will have memory backed volumes sized based on Linux | ||
host default. The sizing may not align with actual available memory for an app. | ||
|
||
* **Are there any tests for feature enablement/disablement?** | ||
No, testing behavior with the feature disabled is dependent on node operating | ||
system configuration. The point of this KEP is to address that coupling. | ||
|
||
### Rollout, Upgrade and Rollback Planning | ||
|
||
* **How can a rollout fail? Can it impact already running workloads?** | ||
TBD | ||
|
||
* **What specific metrics should inform a rollback?** | ||
None. | ||
|
||
* **Were upgrade and rollback tested? Was the upgrade->downgrade->upgrade path tested?** | ||
I do not believe this is applicable. | ||
|
||
* **Is the rollout accompanied by any deprecations and/or removals of features, APIs, | ||
fields of API types, flags, etc.?** | ||
Even if applying deprecation policies, they may still surprise some users. | ||
No. | ||
|
||
### Monitoring Requirements | ||
|
||
* **How can an operator determine if the feature is in use by workloads?** | ||
Not applicable (no feature gate). | ||
|
||
* **What are the SLIs (Service Level Indicators) an operator can use to determine | ||
the health of the service?** | ||
This does not seem relevant to this feature. | ||
|
||
* **What are the reasonable SLOs (Service Level Objectives) for the above SLIs?** | ||
This does not seem relevant to this feature. | ||
|
||
* **Are there any missing metrics that would be useful to have to improve observability | ||
of this feature?** | ||
No. | ||
|
||
### Dependencies | ||
|
||
* **Does this feature depend on any specific services running in the cluster?** | ||
No | ||
|
||
### Scalability | ||
|
||
* **Will enabling / using this feature result in any new API calls?** | ||
No. | ||
|
||
* **Will enabling / using this feature result in introducing new API types?** | ||
No | ||
|
||
* **Will enabling / using this feature result in any new calls to the cloud | ||
provider?** | ||
No | ||
|
||
* **Will enabling / using this feature result in increasing size or count of | ||
the existing API objects?** | ||
No | ||
|
||
* **Will enabling / using this feature result in increasing time taken by any | ||
operations covered by [existing SLIs/SLOs]?** | ||
No | ||
|
||
* **Will enabling / using this feature result in non-negligible increase of | ||
resource usage (CPU, RAM, disk, IO, ...) in any components?** | ||
No | ||
|
||
### Troubleshooting | ||
|
||
* **How does this feature react if the API server and/or etcd is unavailable?** | ||
No impact. | ||
|
||
* **What are other known failure modes?** | ||
Not applicable. | ||
|
||
* **What steps should be taken if SLOs are not being met to determine the problem?** | ||
Not applicable | ||
|
||
## Implementation History | ||
|
||
## Drawbacks | ||
|
||
None. | ||
|
||
This eliminates unnecessary vendoring of code from the docker/docker GitHub repository and many others dragged in | ||
transitively. | ||
|
||
## Alternatives | ||
|
||
None. | ||
|
||
## Infrastructure Needed (Optional) | ||
|
||
None. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,41 @@ | ||
title: Removing dockershim from kubelet | ||
kep-number: 1985 | ||
authors: | ||
- "@resouer" | ||
- "@dims" | ||
owning-sig: sig-node | ||
participating-sigs: | ||
status: implementable | ||
creation-date: 2020-09-14 | ||
reviewers: | ||
- "@yujuhong" | ||
- "@dchen1107" | ||
- "@derekwaynecarr" | ||
- "@dashpole" | ||
- "@sjennning" | ||
approvers: | ||
- "@dchen1107" | ||
- "@derekwaynecarr" | ||
prr-approvers: | ||
- johnbelamaric | ||
see-also: | ||
replaces: | ||
|
||
# The target maturity stage in the current dev cycle for this KEP. | ||
stage: alpha | ||
|
||
# The most recent milestone for which work toward delivery of this KEP has been | ||
# done. This can be the current (upcoming) milestone, if it is being actively | ||
# worked on. | ||
latest-milestone: "v1.20" | ||
|
||
# The milestone at which this feature was, or is targeted to be, at each stage. | ||
milestone: | ||
|
||
# The following PRR answers are required at alpha release | ||
# List the feature gate name and the components for which it must be enabled | ||
feature-gates: | ||
disable-supported: true | ||
|
||
# The following PRR answers are required at beta release | ||
metrics: |