From 5b1da2bb7bd2066b5b63a9ae18e3060c127912b7 Mon Sep 17 00:00:00 2001
From: Sascha Grunert
Date: Tue, 30 May 2023 11:06:17 +0200
Subject: [PATCH] Add Pod Security Standards to User Namespaces KEP

This KEP update outlines the required changes for Pod Security Standards
in relation to the User Namespaces support. Planned graduation to beta is
v1.28, which is now reflected in `kep.yaml` as well. Updating the PRR for
it will follow in another PR.

Signed-off-by: Sascha Grunert
---
 keps/sig-node/127-user-namespaces/README.md | 45 ++++++++++++++++++---
 keps/sig-node/127-user-namespaces/kep.yaml  |  3 +-
 2 files changed, 41 insertions(+), 7 deletions(-)

diff --git a/keps/sig-node/127-user-namespaces/README.md b/keps/sig-node/127-user-namespaces/README.md
index b767ed6ff754..347b0a13580b 100644
--- a/keps/sig-node/127-user-namespaces/README.md
+++ b/keps/sig-node/127-user-namespaces/README.md
@@ -24,6 +24,7 @@
  - [Example without idmap mounts](#example-without-idmap-mounts)
  - [Example with idmap mounts](#example-with-idmap-mounts)
  - [Regarding the previous implementation for volumes](#regarding-the-previous-implementation-for-volumes)
+ - [Pod Security Standards (PSS) integration](#pod-security-standards-pss-integration)
  - [Unresolved](#unresolved)
  - [Test Plan](#test-plan)
  - [Prerequisite testing updates](#prerequisite-testing-updates)
@@ -130,7 +131,7 @@ Here we use UIDs, but the same applies for GIDs.
   the pod (not valid in the host).
 - Benefit from the security hardening that user namespaces provide against some
   of the future unknown runtime and kernel vulnerabilities.
-- Support only stateless pods
+- Initially support stateless and later stateful pods, too.
 
 ### Non-Goals
 
@@ -141,7 +142,6 @@ Here we use UIDs, but the same applies for GIDs.
 - Implement all the very nice use cases that user namespaces allows. The goal
   here is to allow them as incremental improvements, not implement all the
   possible ideas related with user namespaces.
-- Support stateful pods
 
 [kubelet-userns]: https://github.com/kubernetes/enhancements/tree/master/keps/sig-node/2033-kubelet-in-userns-aka-rootless
 
@@ -329,7 +329,7 @@ way, the Kubelet can read all the allocated mappings if it restarts.
 During alpha, to make sure we don't exhaust the host UID namespace, we will
 limit the number of pods using user namespaces to `min(maxPods, 1024)`. This
 leaves us plenty of host UID space free and this limits is probably never hit in
-practice. See UNRESOLVED for more some UNRESOLVED info we still have on this.
+practice. See the [Unresolved section](#unresolved) for more details on this.
 
 ### Handling of stateless volumes
 
@@ -422,6 +422,34 @@ components that implement the interface.
 
 [kubeletVolumeHost-interface]: https://github.com/kubernetes/kubernetes/blob/36450ee422d57d53a3edaf960f86b356578fe996/pkg/volume/plugins.go#L322
 
+### Pod Security Standards (PSS) integration
+
+[Pod Security Standards](https://k8s.io/docs/concepts/security/pod-security-standards)
+define three different policies to broadly cover the whole security spectrum of
+Kubernetes, while the User Namespaces feature should integrate into them. This
+will happen only if the feature is graduated to GA, which _may_ result in
+changing the `Restricted` profile to disallow host user namespaces.
+
+With graduating the feature to beta, the Pod Security will relax in a controlled
+way for pods which enable user namespaces. This behavior can be controlled by an API
+Server Feature Gate, which allows an early opt-in for end users. The overall
+burden to ensure that all nodes will honor user namespaces is on the cluster
+admin, though. The relaxation in detail means that if user namespaces are
+enabled, then the following fields won't be restricted any more because they
+always have to refer to the user inside the container:
+
+- `spec.securityContext.runAsNonRoot`
+- `spec.containers[*].securityContext.runAsNonRoot`
+- `spec.initContainers[*].securityContext.runAsNonRoot`
+- `spec.ephemeralContainers[*].securityContext.runAsNonRoot`
+- `spec.securityContext.runAsUser`
+- `spec.containers[*].securityContext.runAsUser`
+- `spec.initContainers[*].securityContext.runAsUser`
+- `spec.ephemeralContainers[*].securityContext.runAsUser`
+- `spec.containers[*].securityContext.allowPrivilegeEscalation`
+- `spec.initContainers[*].securityContext.allowPrivilegeEscalation`
+- `spec.ephemeralContainers[*].securityContext.allowPrivilegeEscalation`
+
 ### Unresolved
 
 Here is a list of considerations raised in PRs discussion that hasn't yet
@@ -551,16 +579,21 @@ use container runtime versions that have the needed changes.
 
 ##### Beta
 
-- Make plans on whether, when, and how to enable by default
+- Gather and address feedback from the community
+- Add API Server feature flag to integrate into [Pod Security Standards (PSS)](#pod-security-standards-pss-integration)
+- Get review from VM container runtimes maintainers
+
+###### Open Questions
+
 - Should we reconsider making the mappings smaller by default?
 - Should we allow any way for users to for "more" IDs mapped? If yes, how many more and how?
 - Should we allow the user to ask for specific mappings?
-- Get review from VM container runtimes maintainers
-- Gather and address feedback from the community
 
 ##### GA
 
 - Gather and address feedback from the community
+- Support stateful pods
+- Fully integrate into [Pod Security Standards (PSS)](#pod-security-standards-pss-integration)
 
 ### Upgrade / Downgrade Strategy
 
diff --git a/keps/sig-node/127-user-namespaces/kep.yaml b/keps/sig-node/127-user-namespaces/kep.yaml
index d211b1dd3378..7b7b56c649a0 100644
--- a/keps/sig-node/127-user-namespaces/kep.yaml
+++ b/keps/sig-node/127-user-namespaces/kep.yaml
@@ -3,6 +3,7 @@ kep-number: 127
 authors:
   - "@rata"
   - "@giuseppe"
+  - "@saschagrunert"
 owning-sig: sig-node
 participating-sigs: []
 status: implementable
@@ -15,7 +16,7 @@ approvers:
   - "@derekwaynecarr"
 
 stage: alpha
-latest-milestone: "v1.27"
+latest-milestone: "v1.28"
 milestone:
   alpha: "v1.25"