diff --git a/keps/prod-readiness/sig-storage/3636.yaml b/keps/prod-readiness/sig-storage/3636.yaml
new file mode 100644
index 000000000000..004bae345372
--- /dev/null
+++ b/keps/prod-readiness/sig-storage/3636.yaml
@@ -0,0 +1,5 @@
kep-number: 3636
alpha:
  approver: "@deads2k"
beta:
  approver: "@deads2k"

diff --git a/keps/sig-windows/3636-windows-csi-host-process-pods/README.md b/keps/sig-windows/3636-windows-csi-host-process-pods/README.md
new file mode 100644
index 000000000000..0923da5f824a
--- /dev/null
+++ b/keps/sig-windows/3636-windows-csi-host-process-pods/README.md
@@ -0,0 +1,1414 @@

# KEP-3636: CSI Drivers in Windows as HostProcess Pods

- [Release Signoff Checklist](#release-signoff-checklist)
- [Summary](#summary)
  - [Glossary](#glossary)
- [Motivation](#motivation)
  - [Goals](#goals)
  - [Non-Goals](#non-goals)
- [Proposal](#proposal)
  - [User Stories (Optional)](#user-stories-optional)
    - [Story 1](#story-1)
    - [Story 2](#story-2)
  - [Notes/Constraints/Caveats (Optional)](#notesconstraintscaveats-optional)
  - [Risks and Mitigations](#risks-and-mitigations)
- [Design Details](#design-details)
  - [Prerequisite: Make CSI Proxy an embedded library without a server component](#prerequisite-make-csi-proxy-an-embedded-library-without-a-server-component)
  - [Preferred option: Update the CSI Drivers to use the server code directly](#preferred-option-update-the-csi-drivers-to-use-the-server-code-directly)
  - [Alternative: Update the translation layer to use the server code gRPC](#alternative-update-the-translation-layer-to-use-the-server-code-grpc)
  - [Alternative: Convert CSI Proxy to a Library of Functions](#alternative-convert-csi-proxy-to-a-library-of-functions)
  - [Comparison Matrix](#comparison-matrix)
  - [Maintenance of the new model and existing client/server model of CSI Proxy](#maintenance-of-the-new-model-and-existing-clientserver-model-of-csi-proxy)
  - [Security analysis](#security-analysis)
  - [Test Plan](#test-plan)
    - [Prerequisite testing updates](#prerequisite-testing-updates)
    - [Unit tests](#unit-tests)
    - [Integration tests](#integration-tests)
    - [e2e tests](#e2e-tests)
  - [Graduation Criteria](#graduation-criteria)
    - [Alpha](#alpha)
    - [Beta](#beta)
    - [GA](#ga)
    - [Deprecation](#deprecation)
  - [Upgrade / Downgrade Strategy](#upgrade--downgrade-strategy)
  - [Version Skew Strategy](#version-skew-strategy)
- [Production Readiness Review Questionnaire](#production-readiness-review-questionnaire)
  - [Feature Enablement and Rollback](#feature-enablement-and-rollback)
  - [Rollout, Upgrade and Rollback Planning](#rollout-upgrade-and-rollback-planning)
  - [Monitoring Requirements](#monitoring-requirements)
  - [Dependencies](#dependencies)
  - [Scalability](#scalability)
  - [Troubleshooting](#troubleshooting)
- [Implementation History](#implementation-history)
- [Drawbacks](#drawbacks)
- [Alternatives](#alternatives)
- [Infrastructure Needed (Optional)](#infrastructure-needed-optional)

## Release Signoff Checklist

Items marked with (R) are required *prior to targeting to a milestone / release*.
- [x] (R) Enhancement issue in release milestone, which links to KEP dir in [kubernetes/enhancements] (not the initial KEP PR)
- [ ] (R) KEP approvers have approved the KEP status as `implementable`
- [ ] (R) Design details are appropriately documented
- [ ] (R) Test plan is in place, giving consideration to SIG Architecture and SIG Testing input (including test refactors)
  - [ ] e2e Tests for all Beta API Operations (endpoints)
  - [ ] (R) Ensure GA e2e tests meet requirements for [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md)
  - [ ] (R) Minimum Two Week Window for GA e2e tests to prove flake free
- [ ] (R) Graduation criteria is in place
  - [ ] (R) [all GA Endpoints](https://github.com/kubernetes/community/pull/1806) must be hit by [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md) within one minor version of promotion to GA
- [ ] (R) Production readiness review completed
- [ ] (R) Production readiness review approved
- [ ] "Implementation History" section is up-to-date for milestone
- [ ] User-facing documentation has been created in [kubernetes/website], for publication to [kubernetes.io]
- [ ] Supporting documentation—e.g., additional design documents, links to mailing list discussions/SIG meetings, relevant PRs/issues, release notes

[kubernetes.io]: https://kubernetes.io/
[kubernetes/enhancements]: https://git.k8s.io/enhancements
[kubernetes/kubernetes]: https://git.k8s.io/kubernetes
[kubernetes/website]: https://git.k8s.io/website

## Summary

In Kubernetes, CSI enables third-party storage providers to write and deploy plugins without needing
to alter the core Kubernetes codebase.

A CSI Driver in Kubernetes has two main components: a controller plugin that runs in
the control plane and a node plugin that runs on every node.
The node plugin requires direct access to the host to make block devices and/or filesystems
available to the kubelet. In Linux-based nodes, CSI Drivers use the [mkfs(8)](https://man7.org/linux/man-pages/man8/mkfs.8.html)
and [mount(8)](https://man7.org/linux/man-pages/man8/mount.8.html) commands to format and mount filesystems.

In Windows-based nodes, a node plugin cannot execute similar Windows commands because Windows containers
could not run privileged operations. To solve this issue, the CSI community created a proxy binary
called [CSI Proxy](https://kubernetes.io/blog/2020/04/03/kubernetes-1-18-feature-windows-csi-support-alpha/), which
performs privileged storage operations on behalf of the CSI Driver. First, cluster administrators run CSI Proxy
on the node as a service. Next, CSI Drivers connect to named pipes set up by CSI Proxy and issue commands through a gRPC API.
CSI Proxy then runs privileged PowerShell commands to mount and format filesystems. This strategy was adopted by
several CSI Drivers that wanted to support Windows nodes and eventually led to
[CSI Proxy becoming stable and GA in Kubernetes 1.22](https://kubernetes.io/blog/2021/08/09/csi-windows-support-with-csi-proxy-reaches-ga/).

In 2021, the SIG Windows community introduced a feature called [HostProcess containers](https://kubernetes.io/blog/2021/08/16/windows-hostprocess-containers/).
This feature enables running Windows containers directly on the host as processes (hence the name HostProcess containers).

With this feature, a CSI Driver node plugin can run as a HostProcess container and issue privileged
storage operations directly, without a proxy binary. This KEP explains the implementation details
of how Windows-based node plugins can adopt HostProcess containers and the evolution of
CSI Proxy from a client/server-based proxy to a library of privileged storage operation functions, similar to
[kubernetes/mount-utils](https://github.com/kubernetes/mount-utils).
### Glossary

Terms used in this document:

* API Group - A grouping of APIs in CSI Proxy by purpose. For example, the Volume API Group has API Methods related to volume interaction.
  [There are 4 API Groups (Disk, Filesystem, Volume, SMB) in v1 status and 2 API Groups in v1beta status](https://github.com/kubernetes-csi/csi-proxy/blob/c0c6293490fd8aec269685bb4089be56d69921b1/README.md#feature-status).
* API Version - An API Group can have multiple versions.
  [The versions include v1alpha1, v1beta1, v1beta2, v1beta3, v1, v2alpha1](https://github.com/kubernetes-csi/csi-proxy/blob/c0c6293490fd8aec269685bb4089be56d69921b1/README.md#feature-status).
* Translation layer - Generated Go code in CSI Proxy that transforms client versioned requests into server "version-agnostic" requests.
* CSI Proxy server - The CSI Proxy binary running in the host node.
* CSI Proxy client - The Go module client used by CSI Drivers and addons to connect to the CSI Proxy server.
* CSI Proxy v1 - The CSI Proxy implementation using the client/server model.
* CSI Proxy v2 - The CSI Proxy implementation using a Go module imported by CSI Drivers.

## Motivation

The client/server model of CSI Proxy enabled running privileged storage operations from CSI node plugins
in Windows nodes. The server is the CSI Proxy binary running as a Windows service in the node,
and the client is the CSI Driver node plugin, which makes an RPC request to CSI Proxy for
most node CSI operations. While this model works, it has a few drawbacks:

- **Different deployment model than Linux** - On Linux, privileged containers perform the privileged storage
  operations (format/mount). However, Windows containers aren't privileged. To work around the problem, the CSI Driver runs as a non-privileged container,
  and privileged operations are relayed to CSI Proxy. In deployment manifests, the Windows component needs an
  additional section to mount the named pipes exposed by CSI Proxy as a hostpath.
- **Additional component in the host to maintain** - The cluster administrator needs to install and run CSI Proxy
  during node bootstrap. The cluster administrator also needs to think about the upgrade workflow in addition to
  upgrading the CSI Driver.
- **Difficult releases of bugfixes & features** - After a bugfix, we create a new version of CSI Proxy to be
  redeployed in the cluster. After a feature is merged, in addition to redeploying a new version of CSI Proxy,
  the client needs to be updated with a new version of the CSI Proxy client and connect to the new version of the named pipes.
  This workflow is not as simple as the Linux counterpart, which only needs to update Go dependencies.
- **Multiple API versions to maintain** - As part of the original design of CSI Proxy, it was decided to have different
  protobuf versions whenever there were breaking changes (like updates in the protobuf services & messages). This led
  to having multiple versions of the API (v1alphaX, v1betaX, v1). In addition, if we want to add a new feature, we'd need
  to create a new API version, e.g. v2alpha1 ([see this PR as an example of adding methods to the Volume API Group](https://github.com/kubernetes-csi/csi-proxy/pull/186)).
  The API also includes an API Group for handling the SMB protocol, which a CSI Driver might not use.

In 1.22, SIG Windows introduced [HostProcess containers](https://kubernetes.io/blog/2021/08/16/windows-hostprocess-containers/)
as an alternative way to run containers. HostProcess containers run directly in the host
and behave like a regular process. The HostProcess containers feature became stable in 1.26.

Using HostProcess containers in CSI Drivers enables CSI Drivers to perform the privileged storage operations
directly. Most of the drawbacks in the client/server model are no longer present in the new model.
### Goals

- Define the implementation details to transform CSI Proxy from its client/server model to
  a Go module that can be directly imported by CSI Drivers.

### Non-Goals

- Improve the performance of CSI Drivers in Windows - There should be an improvement in performance from
  removing the communication between the CSI Driver and CSI Proxy (the protobuf serialization/deserialization,
  the gRPC call through named pipes). However, this improvement might be negligible, as most of the latency
  comes from running PowerShell commands, which is outside the scope of this change.
- Deprecate the client/server model - This model is still used by the majority of CSI Driver implementations.
  Adopting the new Go module model will take time, and in the meantime we still plan to maintain
  both models.
- Define strict security implementation details - A goal is to understand the security implications of enabling HostProcess
  containers. We aim to provide guidelines, but not implementation details, about the components that need to be installed
  in the cluster.

## Proposal

This proposal advocates for evolving CSI Proxy from a standalone binary into a Go library that CSI Drivers can directly import.
By leveraging HostProcess containers, CSI Drivers for Windows can bundle the necessary privileged functionality,
eliminating the need for a separate proxy component running on each node.
This shift simplifies the deployment and maintenance model for both driver developers and cluster administrators.

The core of this proposal is to:

- Refactor the CSI Proxy codebase to expose its API Groups as importable Go packages, removing the client/server gRPC architecture.
- Provide clear migration guidelines for CSI Driver developers to adopt the new library model and transition their
  drivers to use HostProcess containers. This will include code examples, manifest changes, and security best practices.
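To make the manifest changes concrete, the following is a sketch of the Windows node-plugin fragment of a hypothetical CSI Driver DaemonSet (all names and the image are illustrative, not from a real driver). The key difference versus the client/server model is the `hostProcess` security context and the absence of any hostPath mounts for CSI Proxy named pipes:

```yaml
# Hypothetical Windows node-plugin DaemonSet fragment (illustrative names).
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: example-csi-node-win
spec:
  selector:
    matchLabels: {app: example-csi-node-win}
  template:
    metadata:
      labels: {app: example-csi-node-win}
    spec:
      nodeSelector:
        kubernetes.io/os: windows
      securityContext:
        windowsOptions:
          hostProcess: true
          runAsUserName: "NT AUTHORITY\\SYSTEM"
      hostNetwork: true  # HostProcess pods are required to use the host network
      containers:
        - name: csi-driver
          image: example.registry.io/example-csi-driver:v1.0.0  # hypothetical image
```

Note that there is no named-pipe `hostPath` volume and no CSI Proxy sidecar to coordinate with; the driver binary itself runs on the host.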
### User Stories (Optional)

#### Story 1

As a CSI Driver developer, I want to consume CSI Proxy as a Go library so that I can simplify my driver's architecture,
remove the runtime dependency on an external CSI Proxy binary, and align the Windows deployment model more closely with the Linux model,
reducing maintenance overhead.

#### Story 2

As a Kubernetes cluster administrator, I want CSI Drivers for Windows to be self-contained, without requiring me to
separately install and manage the CSI Proxy lifecycle on my nodes.
This will simplify node bootstrapping, reduce operational complexity,
and make driver upgrades more straightforward.

### Notes/Constraints/Caveats (Optional)

HostProcess containers run as processes in the host. One of the differences from a privileged Linux container
is that there's no filesystem isolation. This means that enabling HostProcess containers should be done for
system components only. This point will be expanded on in the detailed design.

### Risks and Mitigations

Security implications of HostProcess containers will be reviewed by SIG Windows, SIG Storage,
and SIG Security.

One risk of enabling the HostProcess containers feature is not having enough security policies in the cluster
for workloads. If workloads can be deployed as HostProcess containers, or if there's an escalation that allows
non-privileged pods to become HostProcess containers, then workloads have complete access to the host filesystem.
This allows access to the tokens in `/var/lib/kubelet` as well as the volumes of other pods inside `/var/lib/kubelet/`.

## Design Details

CSI Proxy has a client/server design with two main components:

* a binary that runs in the host (the CSI Proxy server). This binary can execute privileged storage operations on the
  host. Once configured to run as a Windows service, it creates named pipes on startup for all the versions of the API
  Groups defined in the codebase.
* client Go libraries that CSI Drivers and Addons import to connect to the CSI Proxy server. The methods and objects
  available in the library are defined with [protobuf](https://github.com/kubernetes-csi/csi-proxy#feature-status). On
  startup, the CSI Driver initializes a client for each version of the API Groups required, which will connect and issue
  requests through gRPC to their pre-configured named pipes on the host.

CSI Driver implementers can write a Windows-specific implementation of the node component of the CSI Driver. In the
implementation, a CSI Driver will make use of the imported CSI Proxy client libraries to issue privileged storage
operations. Assuming that a volume was created and attached to a node by the controller component of the CSI Driver,
the following CSI calls will be made by the kubelet to the CSI Driver:

**Volume set up**

* NodeStageVolume - Create a Windows volume, format it to NTFS, and create a partition access path in the node (global mount).
* NodePublishVolume - Create a symlink from the kubelet Pod-PVC path to the global path (pod mount).

**Volume tear down**

* NodeUnpublishVolume - Remove the symlink created above.
* NodeUnstageVolume - Remove the partition access path.

CSI Proxy is designed to be backwards compatible, and a single binary running in the Windows node can serve requests from
multiple CSI Proxy client versions. We're able to do this because the CSI Proxy binary will create named
pipes on startup for all the versions available in every API Group (e.g. the Volume, Disk, Filesystem, and SMB groups). In addition,
there's a translation layer in the CSI Proxy binary that transforms client version-specific requests into server
"version-agnostic" requests, which are then processed by the CSI Proxy binary.
The following diagram shows the conversion process
(from the [CSI Proxy development docs](https://github.com/kubernetes-csi/csi-proxy/blob/c0c6293490fd8aec269685bb4089be56d69921b1/docs/DEVELOPMENT.md)):

![CSI Proxy client/server model](./csi-proxy-client-server.jpg)

Understanding the translation layer will help in the transition to HostProcess containers, as most of the code that the
clients use to communicate with the CSI Proxy server is generated. The translation layer's objective is to generate Go code
that maps versioned client requests to server-agnostic requests. It does so by analyzing the generated `api.pb.go`
files (generated through `protoc` from the protobuf files) for each version of the API Groups and generating multiple
files for different purposes (taking as example the Volume API Group):

* [\<version\>/server_generated.go](https://github.com/kubernetes-csi/csi-proxy/blob/c0c6293490fd8aec269685bb4089be56d69921b1/pkg/server/volume/impl/v1beta3/server_generated.go)
  - The gRPC server implementation of the methods of a versioned API Group. Each method receives a versioned request and
  expects a versioned response. The generated code follows this pattern:

```
func v1Foo(v1Request v1FooRequest) v1FooResponse {

  // convert versioned request to server request (version agnostic)
  fooRequest := convertV1FooRequestToFooRequest(v1Request)

  // process request (server handler)
  fooResponse := server.Foo(fooRequest)

  // convert server response (version agnostic) to versioned response
  v1Response := convertFooResponseToV1FooResponse(fooResponse)

  return v1Response
}
```

* [types_generated.go](https://github.com/kubernetes-csi/csi-proxy/blob/c0c6293490fd8aec269685bb4089be56d69921b1/pkg/server/volume/impl/types_generated.go)
  The idea is to collect all the methods available across all the versions of an API Group so that the server has a
  corresponding implementation for it.
The generator reads all the methods found across the
  `volume/<version>/api.pb.go` files and generates an interface with all the methods found, which the server must
  implement; in the example above, the server interface will have the `Foo` method.
* [\<version\>/conversion_generated.go](https://github.com/kubernetes-csi/csi-proxy/blob/c0c6293490fd8aec269685bb4089be56d69921b1/pkg/server/volume/impl/v1/conversion_generated.go)
  The generated implementation of the conversion functions shown above (e.g. `convertV1FooRequestToFooRequest`,
  `convertFooResponseToV1FooResponse`). In some cases, it's possible that the conversion code generator generates a nested
  data structure that's not built correctly. There's an additional file with overrides for the functions that were
  generated incorrectly.
* Client [\<group\>/\<version\>/client_generated.go](https://github.com/kubernetes-csi/csi-proxy/blob/c0c6293490fd8aec269685bb4089be56d69921b1/client/groups/volume/v1/client_generated.go)
  Generated in the client libraries to be used by users of the CSI Proxy client. It creates proxy methods corresponding
  to the `api.pb.go` methods of the versioned API Group. This file defines the logic to create a connection to the
  corresponding named pipe, creating a gRPC client out of it and storing it for later usage. As a result, the proxy
  methods don't need a reference to the gRPC client.

### Prerequisite: Make CSI Proxy an embedded library without a server component

If we configure the Windows node component of a CSI Driver/Addon to be a Windows HostProcess pod, then it'll be able to
use the same PowerShell commands that we use in the server code of CSI Proxy. The idea is to use the server code of CSI
Proxy as a library in CSI Drivers/Addons. With this, we also remove the server component.
As described in the [Windows HostProcess Pod](https://kubernetes.io/docs/tasks/configure-pod-container/create-hostprocess-pod/)
guide, we'd need to configure the PodSpec of the node component of the CSI Driver/Addon that runs in Windows nodes with:

```yaml
spec:
  securityContext:
    windowsOptions:
      hostProcess: true
      runAsUserName: "NT AUTHORITY\\SYSTEM"
```

### Preferred option: Update the CSI Drivers to use the server code directly

Modify the client code to use the server API handlers directly, which would in turn call the server implementation. This
means that the concept of an "API version" is also removed from the codebase; the clients instead would import and use
the internal server structs (request and response objects).

Currently, the GCE PD CSI Driver uses the v1 Filesystem API Group as follows:

```go
// note the API version in the imports
import (
  fsapi "github.com/kubernetes-csi/csi-proxy/client/api/filesystem/v1"
  fsclient "github.com/kubernetes-csi/csi-proxy/client/groups/filesystem/v1"
)

func NewCSIProxyMounterV1() (*CSIProxyMounterV1, error) {
  fsClient, err := fsclient.NewClient()
  if err != nil {
    return nil, err
  }
  return &CSIProxyMounterV1{
    FsClient: fsClient,
  }, nil
}

// PathExists - Checks if a path exists. Unlike util ExistsPath, this call does not follow symlinks.
func (mounter *CSIProxyMounterV1) PathExists(path string) (bool, error) {
  isExistsResponse, err := mounter.FsClient.PathExists(context.Background(),
    &fsapi.PathExistsRequest{
      Path: mount.NormalizeWindowsPath(path),
    })
  if err != nil {
    return false, err
  }
  return isExistsResponse.Exists, err
}

// usage
csiProxyV1, _ := NewCSIProxyMounterV1()
csiProxyV1.PathExists(path)
```

Internally, the `PathExists` call is in the file [\<group\>/\<version\>/client_generated.go](https://github.com/kubernetes-csi/csi-proxy/blob/c0c6293490fd8aec269685bb4089be56d69921b1/client/groups/volume/v1/client_generated.go)
described above, which performs the execution through gRPC. In the proposal, we'd need to use the server implementation
instead:

```go
// note that there is no version in the import
import (
  fsapi "github.com/kubernetes-csi/csi-proxy/pkg/os/filesystem"
  fsserver "github.com/kubernetes-csi/csi-proxy/pkg/server/filesystem"
  fsserverimpl "github.com/kubernetes-csi/csi-proxy/pkg/server/filesystem/impl"
)

// no need to initialize a gRPC client; the server handler impl is initialized instead
// no need for a versioned client

func NewCSIProxyMounter() (*CSIProxyMounter, error) {
  fsServer, err := fsserver.NewServer(fsapi.New())
  if err != nil {
    return nil, err
  }
  return &CSIProxyMounter{
    FsServer: fsServer,
  }, nil
}

// PathExists - Checks if a path exists. Unlike util ExistsPath, this call does not follow symlinks.
func (mounter *CSIProxyMounter) PathExists(path string) (bool, error) {
  isExistsResponse, err := mounter.FsServer.PathExists(context.Background(),
    &fsserverimpl.PathExistsRequest{
      Path: mount.NormalizeWindowsPath(path),
    },
    // 3rd arg is the version, remove the version here too!
  )
  if err != nil {
    return false, err
  }
  return isExistsResponse.Exists, err
}

// usage
csiProxy, _ := NewCSIProxyMounter()
csiProxy.PathExists(path)
```

![csi-proxy-library](./csi-proxy-library.jpg)

Pros:

* We remove the concept of API Version & the translation layer and instead consider the Go module version as the API
  version. This is how other libraries like [k8s.io/mount-utils](https://github.com/kubernetes/mount-utils) work.
  * Version-dependent server validation in the API handler layer is removed.
  * Legacy structs for older API versions are removed.
* New APIs are easier to add. Only the server handler & impl code is modified, so there's no need for the code
  generation tool anymore.

Cons:

* The client goes through a bigger diff. Every occurrence of a call to a CSI Proxy method needs to be modified to use
  the server handler & impl code, but this penalty is paid only once.
  * Legacy interface implementations for the v1beta API in the CSI Drivers are removed.
* As we no longer use protobuf to define the API and use internal structs instead, we'd need to update the API docs to
  be directly generated from source code (including the comments around server handler methods and internal server
  structs).

It is worth noting that at this point, the notion of a server is no longer valid, as CSI Proxy has become a
library. We can take this opportunity to reorganize the packages by:

1. Moving `/pkg/server/<api group>` and `/pkg/server/<api group>/impl` to `/pkg/<api group>`
2. Moving `/pkg/os/<api group>` to `/pkg/<api group>/api`

The new structure looks like:

```
pkg
├── disk
│   ├── api
│   │   ├── api.go
│   │   └── types.go
│   ├── disk.go
│   └── types.go
├── iscsi
│   ├── api
│   │   ├── api.go
│   │   └── types.go
│   ├── iscsi.go
│   └── types.go
```

There are also three minor details we can take care of while we're migrating:

1. 
The two structs under `pkg/shared/disk/types.go` are only ever referenced by `pkg/os/disk`, so they can be safely added
   to `pkg/disk/api/types.go`.
2. The FS server receives `workingDirs` as an input, in addition to the OS API. It's only used to sandbox which directories
   CSI Proxy is allowed to operate on. Now that control is part of the CSI Driver, we can safely remove it.
3. `pkg/os/filesystem` is no longer necessary, as the implementation just calls out to the Go standard library `os`
   package. We can deprecate it in release notes and remove it in a future release.

### Alternative: Update the translation layer to use the server code gRPC

Modify the implementation of [\<group\>/\<version\>/client_generated.go](https://github.com/kubernetes-csi/csi-proxy/blob/c0c6293490fd8aec269685bb4089be56d69921b1/client/groups/volume/v1/client_generated.go)
so that it calls the server implementation directly (which should be part of the imported Go module). The current
implementation uses `w.client`, which is the gRPC client:

```go
func (w *Client) GetVolumeStats(
  context context.Context,
  request *v1.GetVolumeStatsRequest,
  opts ...grpc.CallOption,
) (*v1.GetVolumeStatsResponse, error) {
  return w.client.GetVolumeStats(context, request, opts...)
}
```

The new implementation should use the server code instead. In the server code, `volumeserver` is the implementation-agnostic
server that's instantiated by every versioned client `volumeservervX`. E.g.,

```go
import (
  v1 "github.com/kubernetes-csi/csi-proxy/client/api/volume/v1"
  volumeserver "github.com/kubernetes-csi/csi-proxy/pkg/server/volume"
  volumeserverv1 "github.com/kubernetes-csi/csi-proxy/pkg/server/volume/v1"
)

// initialize all the versioned volume servers, i.e. do what cmd/csi-proxy does but on the client
serverImpl := volumeserver.NewServer()

// shim that would need to be auto-generated for every version
serverv1 := volumeserverv1.NewVersionedServer(serverImpl)

// client still calls the conversion handler code
func (w *Client) GetVolumeStats(
  context context.Context,
  request *v1.GetVolumeStatsRequest,
) (*v1.GetVolumeStatsResponse, error) {
  return serverv1.GetVolumeStats(context, request)
}
```

![csi-proxy-reuse-client-server-pod](./csi-proxy-reuse-client-server-pod.jpg)

Pros:

* We get to reuse the protobuf code.
* We would still support the client/server model, as this is a new method that clients would use.
* We only need to change the client import paths to use the alternative version that doesn't connect to the server with
  gRPC, which minimizes the changes necessary in the client code.

Cons:

* New APIs would need to be added to the protobuf file, and we would need to run the code generation tool again, with
  the rule of not modifying already released API Groups. This means that we would also need to create another API Group
  version for a new API.
* We still have two distinct concepts of version: the Go module version and the API version. Given that we want to use
  CSI Proxy as a library, it makes sense to use the Go module version as the source of truth and implement a single API
  version in each Go version.

### Alternative: Convert CSI Proxy to a Library of Functions

With the new changes, CSI Proxy is effectively just a library of Go functions mapping to Windows commands. The notion of
servers and clients is no longer relevant, so it makes sense to restructure the package into a library of functions,
with each API Group's interfacing functions and types provided under `pkg/<api group>` (right now, these files sit at
`pkg/server/<api group>/server.go` and `pkg/server/<api group>/impl/types.go`).
The OS-facing API at `/pkg/os` is kept as is,
and the corresponding OS API struct is initialized globally inside each `pkg/<api group>` (to allow for substituting it
during testing). All other code can be safely deleted.

```go
// there is now only one single import
import fs "github.com/kubernetes-csi/csi-proxy/pkg/fs"

// there is no longer a need to initialize a server
func NewCSIProxyMounter() *CSIProxyMounter {
  return &CSIProxyMounter{}
}

// PathExists - Checks if a path exists. Unlike util ExistsPath, this call does not follow symlinks.
func (*CSIProxyMounter) PathExists(path string) (bool, error) {
  // both mounter.FsServer and fsserverimpl are changed to just fs
  isExistsResponse, err := fs.PathExists(context.Background(),
    &fs.PathExistsRequest{
      Path: mount.NormalizeWindowsPath(path),
    })
  if err != nil {
    return false, err
  }
  return isExistsResponse.Exists, err
}

// usage
csiProxy := NewCSIProxyMounter()
csiProxy.PathExists(path)

// at test time
fs.UseAPI(mockAPI)
// run tests...
fs.ResetAPI()
```

This is the most invasive option of all three. Specifically, we combine the two imports into one and move to a pure-function
paradigm. However, the method implementation sees very minimal changes, requiring only import path updates.

Pros:

* Like implementation idea 2, we switch to a single notion of version via Go modules.
* The pure-function paradigm more accurately reflects the nature of the new design, which simplifies how clients use the
  library.
* Like implementation idea 2, new APIs are easier to add by moving away from code generation.

Cons:

* There is now an implicit dependency on the OS API package-level variable. Testing can still be done by substituting the
  variable with a mock implementation during test time.
* More work (2 imports -> 1, remove server initialization, replace function call and request type package names) needs
  to be done by clients to adapt to the new change, though it's not that much more than implementation idea 2. Again,
  the price is only paid once.
* Like implementation idea 2, we also need to transition our API doc generation to generate from Go source.

### Comparison Matrix

| | Preferred option: Update the CSI Drivers to use the server code directly | Alternative: Update the translation layer to use the server code gRPC | Alternative: Convert CSI Proxy to a Library of Functions |
| --- | --- | --- | --- |
| Adoption cost | Considerable (imports and API calls) | Minimal (only changing imports) | Considerable (imports, API calls, and initialization) |
| Future development | Directly add methods to Go code, but leaves legacy notion of "server" | Still need code generation and protobuf | Directly add functions to Go code. Code base cleaned up |
| Versioning | Go mod version only | Both Go mod version and API version are maintained | Go mod version only |
| Testing | Current tests should still work. | Current tests should still work. | OS API mocking needs to be substituted in, as we have an implicit dependency |
| Support for legacy client/server model | Not supported | Still supported | Not supported |

### Maintenance of the new model and existing client/server model of CSI Proxy

We plan to maintain both versions (the client/server model and the library model)
while the majority of CSI Drivers are in the client/server model.

The `library-development` branch will be used for the development of this model.
We will create release artifacts from the `library-development` branch and use them in CSI Drivers.
Once the library reaches GA, we will create a `v2` from the `library-development` branch and make it the new default.
For compatibility purposes, `master` will still point to the client/server model.
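To sketch what consumption could look like once a `v2` module exists (the module path follows the Go convention of a `/v2` major-version suffix; the tag below is hypothetical), a CSI Driver would pick up library fixes with an ordinary Go dependency update instead of redeploying a host binary:

```
// hypothetical go.mod fragment of a CSI Driver using the library model
require github.com/kubernetes-csi/csi-proxy/v2 v2.0.0
```

This is the same workflow Linux drivers already use for `k8s.io/mount-utils`, which is the deployment-model alignment the Motivation section calls out.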

### Security analysis

- Install the Pod Security Admission controller and use Pod Security Standards.
  - Embrace the principle of least privilege, quoting [Enforcing Pod Security Standards | Kubernetes](https://kubernetes.io/docs/setup/best-practices/enforcing-pod-security-standards/#embrace-the-principle-of-least-privilege):
    - Namespaces that lack any configuration at all should be considered significant gaps in your cluster security model.
      We recommend taking the time to analyze the types of workloads occurring in each namespace, and by referencing the Pod Security Standards,
      decide on an appropriate level for each of them. Unlabeled namespaces should only indicate that they've yet to be evaluated.
    - Namespaces allowing privileged workloads should establish and enforce appropriate access controls.
    - For workloads running in those permissive namespaces, maintain documentation about their unique security requirements.
      If at all possible, consider how those requirements could be further constrained.
  - In namespaces without privileged workloads:
    - Follow the guidelines in https://kubernetes.io/docs/tasks/configure-pod-container/enforce-standards-namespace-labels/#applying-to-a-single-namespace,
      for example, add the following labels to a namespace:

      ```plain
      kubectl label --overwrite ns my-existing-namespace \
        pod-security.kubernetes.io/enforce=restricted \
        pod-security.kubernetes.io/enforce-version=v1.25
      ```

    - Both the baseline and restricted Pod Security Standards disallow the creation of HostProcess pods (docs).
- Create a Windows group with limited permissions to create files under the kubelet-controlled path `C:\var\lib\kubelet`, and set the `runAsUserName` field in the PodSpec to that group.

### Test Plan

CSI Proxy is implemented out of tree; as such, its testing is not
tied strictly to Kubernetes testing.

- **Unit tests and integration tests** - Unit and integration tests will be set up
  in the repository through a combination of tests that run in GitHub Actions Windows workers
  and in Kubernetes clusters with Windows nodes created through `kubernetes/test-infra`.
- **e2e tests** - Because CSI Proxy v2 is a library to be used by CSI Drivers, it cannot
  be e2e tested on its own. Instead, CSI Driver authors need to ensure that the Kubernetes
  external storage test suite passes after integrating their CSI Drivers with CSI Proxy v2.

##### Prerequisite testing updates

##### Unit tests

For CSI Proxy we already have unit tests inside `pkg/`. These tests run on presubmit for every PR.

Examples:

- [volume tests](https://github.com/kubernetes-csi/csi-proxy/blob/c0c6293490fd8aec269685bb4089be56d69921b1/pkg/volume/volume_test.go)
- [filesystem tests](https://github.com/kubernetes-csi/csi-proxy/blob/c0c6293490fd8aec269685bb4089be56d69921b1/pkg/filesystem/filesystem_test.go)

##### Integration tests

For CSI Proxy we already have integration tests inside `integrationtests`. These tests run on presubmit for every PR.

Examples:

- [volume integration tests](https://github.com/kubernetes-csi/csi-proxy/blob/c0c6293490fd8aec269685bb4089be56d69921b1/integrationtests/volume_test.go)
- [filesystem integration tests](https://github.com/kubernetes-csi/csi-proxy/blob/c0c6293490fd8aec269685bb4089be56d69921b1/integrationtests/filesystem_test.go)
- [iscsi integration tests](https://github.com/kubernetes-csi/csi-proxy/blob/c0c6293490fd8aec269685bb4089be56d69921b1/integrationtests/iscsi_test.go)
- [system integration tests](https://github.com/kubernetes-csi/csi-proxy/blob/c0c6293490fd8aec269685bb4089be56d69921b1/integrationtests/system_test.go)
- [smb integration tests](https://github.com/kubernetes-csi/csi-proxy/blob/c0c6293490fd8aec269685bb4089be56d69921b1/integrationtests/smb_test.go)
- [disk integration tests](https://github.com/kubernetes-csi/csi-proxy/blob/c0c6293490fd8aec269685bb4089be56d69921b1/integrationtests/disk_test.go)

##### e2e tests

Because CSI Proxy v2 is a library to be used by CSI Drivers, it cannot be e2e tested on its own.
Instead, CSI Driver authors need to ensure that the Kubernetes external storage test suite passes after integrating
their CSI Drivers with CSI Proxy v2.

### Graduation Criteria

Most of the code used by CSI Drivers through CSI Proxy is already GA. This KEP defines a new mechanism to run
the same code that the CSI Driver executes through CSI Proxy directly inside the CSI Driver.
To verify that the new mechanism is mature, we define the following graduation criteria:

#### Alpha

At least 1 CSI Driver uses an alpha release of CSI Proxy v2.

#### Beta

At least 2 CSI Drivers use a release of CSI Proxy v2.

#### GA

At least 2 CSI Drivers use a release of CSI Proxy v2 for at least 1 Kubernetes release (to check for issues during cluster upgrades).

**Note:** Generally we also wait at least two releases between beta and
GA/stable, because there's no opportunity for user feedback, or even bug reports,
in back-to-back releases.

**For non-optional features moving to GA, the graduation criteria must include
[conformance tests].**

[conformance tests]: https://git.k8s.io/community/contributors/devel/sig-architecture/conformance-tests.md

#### Deprecation

We plan to maintain both versions (the client/server model and the Go library model)
because the majority of CSI Drivers use the client/server model.
There is no deprecation of the CSI Proxy v1 model.

### Upgrade / Downgrade Strategy

During the development of a new minor version of the CSI Driver we suggest the following changes:

**CSI Proxy**

- Start a development branch for the upcoming work (`library-development`).
- Refactor the filesystem, disk, volume, system, iSCSI, and SMB API groups out of the current client/server code.
- Remove the client/server code from the codebase.
- Update the unit and integration tests to work with the refactored code.
- Run the integration tests in a HostProcess container.
- Update the README and DEVELOPMENT docs.
- Once the above items are completed, create an alpha tag in the `library-development` branch to import in CSI Drivers.

**CSI Driver**

- Update the CSI Proxy library to the alpha v2 tag from the `library-development` branch.
- Update the codebase imports to use the server implementation directly instead of the client library.
- Update the CSI Driver deployment manifest with the HostProcess container fields in the `PodSpec`.
- Run the e2e tests.

When the CSI Driver is upgraded to the next minor version, it will include the imported CSI Proxy library.

### Version Skew Strategy

CSI Proxy v1 has a different release cycle than the CSI Driver; each CSI Proxy binary has its own version and
supports different CSI Proxy client versions. CSI Proxy v2 is a Go library imported by the CSI Drivers,
so the responsibility of handling version skew is owned by the CSI Driver.

This is a dataplane-only component, and it doesn't need to handle API server version skew;
management of possible version skew for CSI features implemented in the CSI Driver is handled
by the CSI Driver and not by CSI Proxy v2.

## Production Readiness Review Questionnaire

### Feature Enablement and Rollback

###### How can this feature be enabled / disabled in a live cluster?

- [x] Other
  - Describe the mechanism:
    - Since this component is out of tree, it's up
      to CSI Driver authors to integrate CSI Proxy v2 with their CSI Drivers.
  - Will enabling / disabling the feature require downtime of the control
    plane?
    - No
  - Will enabling / disabling the feature require downtime or reprovisioning
    of a node?
    - Yes, after a node plugin is integrated with CSI Proxy v2
      it needs to be installed on the node. Since the node plugin is a container,
      there needs to be a new rollout.

###### Does enabling the feature change any default behavior?

The CSI Proxy v2 change is for CSI Driver authors and their CSI Drivers;
no default behavior that affects cluster users is changed.

###### Can the feature be disabled once it has been enabled (i.e. can we roll back the enablement)?

Yes, by rolling back to an old implementation of the CSI Driver node plugin that uses the CSI Proxy v1 client.
Since that requires a rollout, there would be downtime of the CSI Driver node plugin.

###### What happens if we reenable the feature if it was previously rolled back?

Nothing; the behavior is constrained to the CSI Driver version only, so
the CSI Driver would use the Go module library again.

###### Are there any tests for feature enablement/disablement?

No. Because this component is out of tree, CSI Driver authors need to verify
that their CSI Driver handles feature enablement/disablement.

### Rollout, Upgrade and Rollback Planning

###### How can a rollout or rollback fail? Can it impact already running workloads?

###### What specific metrics should inform a rollback?

###### Were upgrade and rollback tested? Was the upgrade->downgrade->upgrade path tested?

###### Is the rollout accompanied by any deprecations and/or removals of features, APIs, fields of API types, flags, etc.?

### Monitoring Requirements

###### How can an operator determine if the feature is in use by workloads?

###### How can someone using this feature know that it is working for their instance?

- [ ] Events
  - Event Reason:
- [ ] API .status
  - Condition name:
  - Other field:
- [ ] Other (treat as last resort)
  - Details:

###### What are the reasonable SLOs (Service Level Objectives) for the enhancement?

###### What are the SLIs (Service Level Indicators) an operator can use to determine the health of the service?

- [ ] Metrics
  - Metric name:
  - [Optional] Aggregation method:
  - Components exposing the metric:
- [ ] Other (treat as last resort)
  - Details:

###### Are there any missing metrics that would be useful to have to improve observability of this feature?

### Dependencies

###### Does this feature depend on any specific services running in the cluster?

### Scalability

###### Will enabling / using this feature result in any new API calls?

###### Will enabling / using this feature result in introducing new API types?

###### Will enabling / using this feature result in any new calls to the cloud provider?

###### Will enabling / using this feature result in increasing size or count of the existing API objects?

###### Will enabling / using this feature result in increasing time taken by any operations covered by existing SLIs/SLOs?

###### Will enabling / using this feature result in non-negligible increase of resource usage (CPU, RAM, disk, IO, ...) in any components?

###### Can enabling / using this feature result in resource exhaustion of some node resources (PIDs, sockets, inodes, etc.)?

### Troubleshooting

###### How does this feature react if the API server and/or etcd is unavailable?

###### What are other known failure modes?

###### What steps should be taken if SLOs are not being met to determine the problem?

## Implementation History

## Drawbacks

## Alternatives

## Infrastructure Needed (Optional)

diff --git a/keps/sig-windows/3636-windows-csi-host-process-pods/csi-proxy-client-server.jpg b/keps/sig-windows/3636-windows-csi-host-process-pods/csi-proxy-client-server.jpg
new file mode 100644
index 000000000000..656deb3868e2
Binary files /dev/null and b/keps/sig-windows/3636-windows-csi-host-process-pods/csi-proxy-client-server.jpg differ
diff --git a/keps/sig-windows/3636-windows-csi-host-process-pods/csi-proxy-library.jpg b/keps/sig-windows/3636-windows-csi-host-process-pods/csi-proxy-library.jpg
new file mode 100644
index 000000000000..9a708170e981
Binary files /dev/null and b/keps/sig-windows/3636-windows-csi-host-process-pods/csi-proxy-library.jpg differ
diff --git a/keps/sig-windows/3636-windows-csi-host-process-pods/csi-proxy-reuse-client-server-pod.jpg b/keps/sig-windows/3636-windows-csi-host-process-pods/csi-proxy-reuse-client-server-pod.jpg
new file mode 100644
index 000000000000..a7ed48292ea4
Binary files /dev/null and b/keps/sig-windows/3636-windows-csi-host-process-pods/csi-proxy-reuse-client-server-pod.jpg differ
diff --git a/keps/sig-windows/3636-windows-csi-host-process-pods/kep.yaml b/keps/sig-windows/3636-windows-csi-host-process-pods/kep.yaml
new file mode 100644
index 000000000000..0c54ee568b14
--- /dev/null
+++ b/keps/sig-windows/3636-windows-csi-host-process-pods/kep.yaml
@@ -0,0 +1,40 @@
+title: CSI Drivers in Windows as HostProcess Pods
+kep-number: 3636
+authors:
+  - "@mauriciopoppe"
+owning-sig: sig-storage
+participating-sigs:
+  - sig-windows
+status: implementable
+creation-date: 2022-10-23
+reviewers:
+  - "@msau42"
+approvers:
+  - "@msau42"
+
+see-also:
+  - "/keps/sig-windows/1122-windows-csi-support"
+replaces:
+  - "/keps/sig-windows/1122-windows-csi-support"
+
+# The target maturity stage in the current dev cycle for this KEP.
+stage: alpha
+
+# The most recent milestone for which work toward delivery of this KEP has been
+# done. This can be the current (upcoming) milestone, if it is being actively
+# worked on.
+latest-milestone: "v1.35"
+
+# The milestone at which this feature was, or is targeted to be, at each stage.
+milestone:
+  alpha: "v1.35"
+  beta: "v1.36"
+  stable: "v1.37"
+
+# The following PRR answers are required at alpha release
+# List the feature gate name and the components for which it must be enabled
+feature-gates: []
+disable-supported: true
+
+# The following PRR answers are required at beta release
+metrics: []