
KEP-4639: adding OCI VolumeSource #4642

Merged Jun 21, 2024 (37 commits)

Conversation

@sallyom (Contributor, author) commented May 17, 2024

@k8s-ci-robot k8s-ci-robot added kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. sig/node Categorizes an issue or PR as relevant to SIG Node. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels May 17, 2024
@sftim (Contributor) commented May 18, 2024

Can this feature be provided out-of-tree (for example, using CSI)?

Comment on lines 784 to 788
<!--
What other approaches did you consider, and why did you rule them out? These do
not need to be as detailed as the proposal, but should include enough
information to express the idea and why it was not acceptable.
-->
Contributor:

If CSI is an option for this, explain how that alternative could look.

Contributor (author):

will add this

Comment:

A CSI driver would be a better approach IMO

Member:

Runtimes already pull and handle images using the CRI. I'm wondering how a CSI parallel pull, auth and storage implementation is better than extending an existing solution.


Contributor (author):

@vsoch very cool! is it possible to gate pulls based on signature verification & policy?

Comment:

Sorry missed this! Not at the moment, but I don't see why not.

Comment on lines 181 to 182
We propose to add a new `VolumeSource` that supports OCI images and/or artifacts. This `VolumeSource` will allow users to mount an OCI image or artifact directly into a pod,
making the files within the image accessible to the containers without the need for a shell or additional image manipulation.
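For concreteness, the proposed shape could look something like the sketch below. The field names (`oci`, `reference`, `pullPolicy`) and the helper function are hypothetical, modeled on existing Kubernetes volume-source conventions rather than on any API shape the KEP had settled at this point:

```python
# Hypothetical pod spec using the proposed OCI volume source; the "oci"
# volume field and its sub-fields are illustrative only, not the KEP's API.
pod_spec = {
    "volumes": [
        {
            "name": "model-data",
            "oci": {
                "reference": "registry.example.com/models/weights:v1",
                "pullPolicy": "IfNotPresent",
            },
        }
    ],
    "containers": [
        {
            "name": "app",
            "image": "registry.example.com/app:v2",
            "volumeMounts": [
                # Files from the OCI image appear read-only under /models,
                # with no shell or extra image manipulation in the container.
                {"name": "model-data", "mountPath": "/models", "readOnly": True}
            ],
        }
    ],
}

def oci_volume_references(spec):
    """Collect the OCI references a kubelet would have to pull for this pod."""
    return [v["oci"]["reference"] for v in spec.get("volumes", []) if "oci" in v]
```

The helper illustrates the kubelet-side view: image-volume references become additional pull targets for the pod, resolved before the containers start.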
Contributor:

Which kinds of artifacts? Not all OCI artifacts represent filesystems.

Contributor (author):

Here I'm referencing files, the equivalent of an OCI scratch image, for container volume mounts. An artifact that can be represented as files is all that makes sense here, I think; but really, anything that can be mounted within a pod's or container's filesystem.

Contributor:

How will the volume driver know what kind of OCI artifact it's trying to pull and mount? I know “container image” is one of the options.

Member:

Should we handle artifacts/images shared across a pod's containers in a completely different way than artifacts/images of individual containers... hmm

@sftim (Contributor) commented May 21, 2024

If this does end up in-tree, I'd hope to see a dedicated artifact type that means “here are files, don't try to run them”; this is just data.

An example: you run an antimalware service with scanning patterns for email messages. The scanner uses a blue/green approach where the same application container image gets deployed alongside an updated scanning-patterns image, n times an hour. Both the application image artifact and the scanning-pattern artifact are signed. The files in that signed patterns image consist of compiled patterns that would work as a volume source, but there's no executable code in there.

(I do like the idea of making signature verification an eventual goal; that's a useful feature to leave room for).

@sftim (Contributor) commented May 24, 2024

To me what needs clarifying is:

  1. are we planning to support OCI container images as a read-only volume type, using CRI to handle those images but not to run any code?
  2. are we planning to support arbitrary OCI artifacts as volume sources, including OCI container images as one of potentially many kinds of artifact you can mount?

If ① then CRI feels like the better fit; that way we can use the Pod's RuntimeClass to fetch the image, in case it matters. If ② then we can't assume CRI knows about that kind of artifact.

Contributor:

Either way we should document the alternative, even if we rule it out. It's good for KEPs to mention a sketch of the design(s) we didn't pick as well as the one we did.

Member:

On the arbitrary OCI artifacts note: this needs investigation/discussion. We are talking about setting up a call to discuss next week.

There are a number of ways artifacts have been implemented pre- and post-OCI 1.1 specs.

@sftim, to your proposal question 1: What does it mean to provide a mount volume into the pod's containers but not to run any code? I get that we are not (yet) expanding a pod to hold an arbitrary container image/artifact with its own command to run. Sure. But what does it mean to mount a container image or artifact without running any code there or from there? Does the volume include all platforms/versions, or just the one image the OCI tooling would have chosen if it were being run on the node as a container by a runtime engine? Does the volume include the entire index manifest tree/chain, with manifests, configs, and layers, or just the layers that would otherwise make up the container's rootfs?

To your proposal question 2: we also can't assume any CSI driver would know about all particular kinds of artifacts and registries, and how to pull, parse & mount them.

Comment:

> On the arbitrary OCI artifacts note.. needs investigation/discussion.. We are talking about setting up a call to discuss next week.

The OCI volume source should be a mountable artifact in the first place.
Arbitrary OCI artifacts may not work very well: ORAS supports pushing any object to a registry as an OCI artifact, but not all of them can be mounted as volumes.

Member:

> If ① then CRI feels like the better fit; that way we can use the Pod's RuntimeClass to fetch the image, in case it matters

This may matter for VM-based runtimes that use block devices to expose the image rootfs.

> But what does it mean to mount a container image or artifact, but not to run any code there or from there? Does the volume include platforms/versions or just the one image the oci tooling would've chosen if it was being run on the node as a container by a runtime engine, does the volume include the entire index manifest tree/chain, manifest configs layers and all? Or just the layers that would otherwise makeup the container's rootfs?

As a basic idea here: the one image the container runtime would have chosen (i.e., matching platforms in the case of an index) as a union (or flattened), but without the config (since the config does not affect any files). IOW the union of the layers that would otherwise make up the container's rootfs (and probably without any runtime-generated files such as /etc/hosts and without runtime-generated mounts like /proc or /sys).
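The "union of the layers" semantics described above can be sketched in a few lines. This is a toy model, not CRI or runtime code: layers are plain `{path: content}` dicts, and the `.wh.` filename prefix is the OCI image-spec whiteout marker that deletes the shadowed path from lower layers:

```python
# Flatten an ordered list of layers into a single rootfs view.
# A file named ".wh.<name>" in a layer removes "<name>" from the result,
# mirroring the whiteout convention in the OCI image layer spec.
WHITEOUT = ".wh."

def flatten_layers(layers):
    rootfs = {}
    for layer in layers:  # lower layers first, upper layers override
        for path, content in layer.items():
            directory, _, name = path.rpartition("/")
            if name.startswith(WHITEOUT):
                # Whiteout: delete the shadowed path instead of adding a file.
                victim = (directory + "/" if directory else "") + name[len(WHITEOUT):]
                rootfs.pop(victim, None)
            else:
                rootfs[path] = content
    return rootfs
```

The config blob plays no part here, matching the point above that it does not affect any files; runtime-generated entries like /etc/hosts, /proc, or /sys would likewise never enter this view.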

> If ② then we can't assume CRI knows about that kind of artifact.

> To your 2. proposal we also can't assume any CSI driver would know about all particular kinds of artifacts, registries and how to pull, parse & mount them.

An argument in favor of the CSI approach is that a cluster admin could deploy a CSI driver that matches whatever artifact types they expect to have used in the cluster. On the other hand, another approach that we could take on the runtime side is to build out an extension mechanism such that the runtime delegates to some sort of mount-provider program based on the mediatype that is found after pulling the artifact.
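The "mount-provider per mediatype" extension mechanism floated above could be sketched as a simple registry keyed on the mediatype found after the pull. The container-image config mediatype below is the real OCI one; the scan-patterns type and the handler bodies are made up purely for illustration:

```python
# Dispatch table from artifact mediatype to a mount-provider callable.
MOUNT_PROVIDERS = {}

def mount_provider(media_type):
    """Decorator registering a handler for one mediatype."""
    def register(fn):
        MOUNT_PROVIDERS[media_type] = fn
        return fn
    return register

@mount_provider("application/vnd.oci.image.config.v1+json")  # real OCI mediatype
def mount_container_image(artifact):
    # A container image mounts as the union of its layers (see flatten sketch).
    return f"union-mount of {len(artifact['layers'])} layers"

@mount_provider("application/vnd.example.scan-patterns.v1")  # hypothetical type
def mount_scan_patterns(artifact):
    return "copy files read-only"

def mount(artifact):
    """Delegate to whichever provider claims this artifact's mediatype."""
    try:
        handler = MOUNT_PROVIDERS[artifact["mediaType"]]
    except KeyError:
        raise ValueError(f"no mount provider for {artifact['mediaType']}")
    return handler(artifact)
```

Under this design a cluster admin extends support by installing a new provider, without the runtime itself needing to understand every artifact kind.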

@sallyom (Contributor, author) commented May 21, 2024

> Can this feature be provided out-of-tree (for example, using CSI)?

Anyone know what happened with this? I'll add it as an Alternative; I'm gathering info on csi-driver-image-populator.

@rchincha commented
Noticing that there are a lot of groups working on "similar" things.
Getting like-minded folks together ...
https://github.com/converged-computing/oras-csi

- Update Kubelet and CRI to recognize and handle new media types associated with OCI artifacts.
- Ensure that pulling and storing these artifacts is as efficient and secure as with OCI images.

**Lifecycling and Garbage Collection:**



### [KEP 1495: Volume Populators](https://github.com/kubernetes/enhancements/tree/master/keps/sig-storage/1495-volume-populators)

The volume-populators API extension allows you to populate a volume with data from an external data source when the volume is created.

Comment:

A volume populator with a data source pointing to an OCI artifact on a registry could still be an option. But with all the options k8s has, it's prudent to pick one.

### Custom CSI Plugin

See [https://github.com/warm-metal/container-image-csi-driver](https://github.com/warm-metal/container-image-csi-driver)


volumes and container images. The in-tree OCI VolumeSource will utilize these existing mechanisms.

2. **Integration with Kubernetes:**
- **Optimal Performance:** Deep integration with the scheduler and kubelet ensures optimal performance and

Comment:

More details are needed about how one would handle large OCI artifacts (the AI/ML inference-data use case, for example). The real win we are looking for here is saving the latency/network cost of pulling said artifact by not needing to repeat the pull across multiple pod restarts.

@rchincha commented May 25, 2024

https://kubernetes.io/docs/concepts/storage/volumes/#gitrepo
^ could be similar to this, though that was deprecated in favor of initContainer/emptyDir

We likely need a "persistent" (cache?) volume source.

Comment:

> https://kubernetes.io/docs/concepts/storage/volumes/#gitrepo ^ could be similar to this, though that was deprecated in favor of initContainer/emptyDir

> We likely need a "persistent" (cache?) volume source.

On-demand loading with a separate standalone cache can handle large artifact sizes very well. We have done a POC on this.

If the artifacts support lazy loading, then only some metadata needs to be saved locally, and there is no need to worry too much about garbage collection or layer data reuse. We only need a local cache system to save temporary data, and the management of images/artifacts can be easier.
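The cache discussion above leans on the fact that OCI blobs are content-addressed: a blob named by its sha256 digest can never change, so once verified it can be reused across pod restarts without re-pulling. A toy sketch of such a digest-keyed cache (not a real kubelet or CSI mechanism; `fetch` stands in for a registry pull):

```python
import hashlib

class BlobCache:
    """Digest-keyed blob store: pull once, verify, reuse forever after."""

    def __init__(self):
        self._blobs = {}
        self.pulls = 0  # how many times we actually hit the registry

    def get(self, digest, fetch):
        blob = self._blobs.get(digest)
        if blob is None:
            self.pulls += 1
            blob = fetch()
            # Content addressing makes verification trivial: recompute and compare.
            if "sha256:" + hashlib.sha256(blob).hexdigest() != digest:
                raise ValueError("digest mismatch; refusing to cache")
            self._blobs[digest] = blob
        return blob
```

Garbage collection then reduces to dropping unreferenced digests, and lazy-loading variants only need to keep the manifest metadata local, as the comment above suggests.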

an additional operational burden. For a generic, vendor-agnostic, and widely-adopted solution this would not make sense.
- External CSI plugins implement their own lifecycle management and garbage collection mechanisms,
yet these already exist in-tree for OCI images.
- Performance: There is additional overhead with an out-of-tree CSI plugin, especially in scenarios requiring frequent image pulls
Member:

Can you expand more on what this additional overhead is?

Signed-off-by: Sascha Grunert <[email protected]>
@saschagrunert (Member) commented

Requesting approval from @kubernetes/sig-node-feature-requests

@k8s-ci-robot k8s-ci-robot added the kind/feature Categorizes issue or PR as related to a new feature. label Jun 19, 2024
saschagrunert and others added 2 commits June 20, 2024 09:09
Co-authored-by: Brandon Mitchell <[email protected]>
Signed-off-by: Sascha Grunert <[email protected]>
@SergeyKanzhelev (Member) left a comment

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 20, 2024
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 21, 2024
@haircommander (Contributor) commented

excellent work here!

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 21, 2024
@mrunalp (Contributor) commented Jun 21, 2024

/approve

@saschagrunert (Member) commented

@kubernetes/prod-readiness-reviewers PTAL

@soltysh (Contributor) commented Jun 21, 2024

/approve
David previously approved the PRR, and he's currently out, so replacing him on that job 😉

@k8s-ci-robot (Contributor) commented

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: mrunalp, sallyom, soltysh

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 21, 2024
@k8s-ci-robot k8s-ci-robot merged commit fa8ed51 into kubernetes:master Jun 21, 2024
4 checks passed
@k8s-ci-robot k8s-ci-robot added this to the v1.31 milestone Jun 21, 2024

To clear old volumes, all workloads using the `VolumeSource` needs to be
recreated after restarting the kubelets. The kube-apiserver does only the API
validation whereas the kubelets serve the implementation. This means means that
Member:

nit:
s/means means/means/

Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/feature Categorizes issue or PR as related to a new feature. kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory lgtm "Looks good to me", indicates that a PR is ready to be merged. sig/node Categorizes an issue or PR as relevant to SIG Node. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.