Raw block volume support #805
Conversation
@erinboyd Seems the proposal is overlapping with local persistent storage, which includes features like local block devices as a volume and local raw block devices.

This proposal also looks pretty rough, and I'm missing details about how the raw block storage is now envisioned.
Authors: erinboyd@, screeley44@, mtanino@

This document presents a proposal for managing raw block storage in Kubernetes using the persistent volume soruce API as a consistent model

s/soruce/source/
Before the advent of storage plugins, emptyDir and hostPath were widely used to quickly prototype stateless applications in Kube.
Both have limitations for their use in applications that need to store persistent data or state.
EmptyDir, though quick and easy to use, provides no real guarantee of persistence for a suitable amount of time. Appropriately used as scratch space, it does not have the persistence required by stateful applications. HostPath became an initial offering for local storage, but had many drawbacks. Without having the ability to guarantee space & ensure ownership, one would lose data once a node was rescheduled. Therefore, the risk outweighed the reward when trying to leverage the power of local storage needed for stateful applications like databases.

s/ / /

@screeley44 @msau42 @mtanino please review
I'm not sure that many people are familiar with the difference between a raw block volume and a block volume with a filesystem on it.

Raw block will provide the ability to perform block operations on the volumes, which are required or desired by several stateful systems, for example BlueStore/BlueFS for Ceph, and MySQL.

@ianchakeres
Agree. We need to define the terminology at the top of the document.

There's a 'system utility' & bootstrap use case that should be noted. If a fs health check, partition/format, or other disk utility needs to run on the node, it will want raw block. Examples may be local storage partition & format of a raw disk, deploying kube to a bare metal node, utility pods that check disk health after reboot, etc.
# Local Raw Block Consumption via Persistent Volume Source
We can remove "local" from the title since both local and network attached are targets of this proposal.
Can we use the same terminology throughout the spec?
- block volume
- raw block volume
- raw block device
I think we don't need "via Persistent Volume Source" on the title since this spec will cover both static and dynamic provisioning, right?
I think it's implying that we're not going to support direct volume access (ie VolumeSource in the pod)
I get the intention, thanks.
So, I noticed someone changed the title to just raw block, are we fine with this?
I am fine with it. The changes proposed here can be applied to any volume plugin, not just local storage.
@erinboyd
I mean the title of committed document would be "Raw Block Volume Consumption via Persistent Volume Source" or something like that. At least, please remove the "local".
I think the title of this PR was originally "Raw block". I'd prefer a title for this PR like "[Proposal] Raw block volume support", but this is a nit.
# Value add to Kubernetes

Before the advent of storage plugins, emptyDir and hostPath were widely used to quickly prototype stateless applications in Kube.
s/Kube/Kubernetes/
* Enable durable access to block storage
* Support storage requirements for all workloads supported by Kubernetes
* Provide flexibility for users/vendors to utilize various types of storage devices
* Agree on API changes for block
block volume support
or raw block volume support
or raw block device support
Updated
This goal is already implied by "Enable durable access to block storage"
DESCRIPTION:

A developer wishes to enable their application to use a local raw block device as the volume for the container. The admin has already created PVs that the user will bind to by specifying 'block' as the volume type of their PVC.
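As a sketch only (the exact API shape is what this PR is debating), such a claim might look like the following, reusing the `raw-pvc` name and proposed `volumeType` field from elsewhere in this proposal:

```yaml
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: raw-pvc
spec:
  volumeType: block        # proposed new field: request a raw block device
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
```

The pod would then reference the claim via `persistentVolumeClaim: claimName: raw-pvc` as usual; how the raw device is surfaced inside the container is one of the open CRI questions below.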
I think iSCSI or FC are types of network attached storage.
I suppose that local storage means the server itself has its own physical storage, like SSDs inside the server.
I recommend adding a terminology section that defines terms such as local storage and network attached storage at the top of this document to avoid confusion.
Yes, this example is a little confusing because it mentions local storage and iscsi/fc together.
Currently, most storage plugins are network attached, so I think we should introduce examples based mainly on network attached storage plugins such as GCE, AWS, FC, iSCSI, NFS, etc., and then add the local storage plugin (local storage case) as an additional example. But we need to clearly distinguish between the network attached case and the local storage case to avoid confusion.
	Unmounter
	GetVolumePath() string
}
```
Please add a "Features & Milestones" section; we need to mention that there are two phases of block volume support.
Done
accessModes:
  - "ReadWriteOnce"
gcePersistentDisk:
  fsType: "raw"
fsType: "raw" will also require a change in mount_linux.go for the formatAndMount method, correct? If so, we probably want to call this out as a change in the proposal, like you do for the API changes in the yaml.
Updated in the milestones and features
Instead of special casing "raw" as a value for fsType, can fsType just be ignored/empty if volumeType == block?
Could there be a requested volume (other than GCE) where you would request volumeType: block and also want a fs put on? It seems that if we always ignore fsType when block is requested, we could have a problem for other storage plugins or even future use cases.
+1 to @msau42. If someone wants a raw device, then the volume plugin should ignore the fsType part of the PV. If some crazy pod wants a raw device with a filesystem on it, the pod can call mkfs on its own.
Mounter binding matrix for "Dynamic Provisioning":

 # | Plugin fstype | PVC volumeType | Result | Result of volume
---|---|---|---|---
00 | unspecified | | BIND | file(ext4)
01 | | block | BIND | block
@mtanino @jsafrane @msau42 so there is much discussion around not having an fstype. So what if you aren't pre-creating the volumes (hence the PV doesn't exist)? You would only have the PVC and the StorageClass to indicate block. You could potentially have a provisioner that then creates the block device and adds a filesystem or not based on the fstype. Thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Like @jsafrane mentioned, today provisioners don't format the filesystem. It's only at mount time that the plugin formats a filesystem. So I don't think provisioners or binding should use fstype, but they should use volumeType.
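A StorageClass carrying volumeType, along the lines of the dynamic-provisioning examples later in this proposal (the class name here is hypothetical; the provisioner name is taken from the proposal), might look like:

```yaml
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: raw-block-ssd        # hypothetical class name
provisioner: kubernetes.io/local-block-ssd
parameters:
  volumeType: block          # provisioner and binder key off volumeType, not fsType
```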
Sorry I'm not sure which above you're referring to. Can you clarify?
* Provide flexibility for users/vendors to utilize various types of storage devices
* Agree on API changes for block
* Provide a consistent security model for block devices
* Provide block storage usage isolation
What do you mean by isolation?
Basically, to only allow one user to use the raw block device, as is done with fsGroup today.
Can you clarify that here? There are other kinds of isolation that we sometimes talk about, like capacity and IOPS isolation.
I don't think we are guaranteeing IOPS or capacity isolation. I will be explicit on this bullet point
* Support all storage devices natively in upstream Kubernetes. Non-standard storage devices are expected to be managed using extension mechanisms.

# Value add to Kubernetes
This section talks about ephemeral vs persistent, and downsides of hostpath volumes for local storage. But it doesn't really answer why block is valuable.
This has been updated, please review
The additional benefit of explicitly defining how the volume is to be consumed will provide a means for indicating the method by which the device should be scrubbed when the claim is deleted, as this method will differ for a raw block device compared to a filesystem. The ownership of scrubbing the device properly shall be up to the plugin method being utilized.
The last design point is block devices should be able to be fully restricted by the admin in accordance with how inline volumes
I opened up issue kubernetes/kubernetes#44982 about this. Currently pod security policy only enforces direct volume sources, and not Persistent Volumes.
## Persistent Volume Claim API Changes:
In the simplest case of static provisioning, a user asks for a volumeType of block. The binder will only bind to a PV defined with the same label.
s/label/volumeType
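A sketch of the static binding pair described here, assuming volumeType appears on both objects (the PV name is hypothetical; the device path reuses the /dev/xvdc example from this proposal):

```yaml
kind: PersistentVolume
apiVersion: v1
metadata:
  name: raw-pv               # hypothetical name
spec:
  volumeType: block          # proposed field; the binder matches on this
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  local:
    path: /dev/xvdc
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: raw-pvc
spec:
  volumeType: block          # must match the PV's volumeType to bind
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
```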
      claimName: raw-pvc
```

## UC3:
How is this different from UC1?
Updated to have UC1 admin focused and UC3 to be user focused
metadata:
  name: web
spec:
  serviceName: "nginx"
I know the actual gluster spec may be more complicated than this, but it may fit the example better if we used gluster here instead of nginx. And then have an example of an application pod using gluster volumes?
This is UC7
Yeah, so the example in UC7 shows an nginx application using ebs raw block volumes. I think what you actually want is a gluster application using ebs raw block volumes, and then an nginx application using gluster volumes.
@lpabon can you clarify the example here? My understanding is that the volume access would be like: application -> gluster -> raw block
@lpabon ^^^
@msau42 > I think what you actually want is a gluster application using ebs raw block volumes, and then an nginx application using gluster volumes.

Correct. The goal here is for raw block devices to be created using dynamic provisioning, then attached to the host/container for access, without a file system being created on top of it.
I think we can remove the nginx example. It could create confusion.
@lpabon so just remove UC7 all together?
@erinboyd Please don't :) . Just remove the nginx example. I will provide an example for you
## UC9:

DESCRIPTION: Developer wishes to consume a raw device
How is this different from UC1?
Removed as duplicate @msau42
# Implementation Plan

Phase 1: Pre-provisioned PVs to precreated devices
Also CRI changes
provisioner: no-provisioning
parameters:
  volumeType: block
please remove empty line.
- ReadWriteOnce
local:
  path: /dev/xvdc
please remove empty line.
provisioner: kubernetes.io/local-block-ssd
parameters:
  volumeType: block
please remove empty line.
resources:
  requests:
    storage: 10Gi
please remove empty line.
gcePersistentDisk:
  fsType: "raw"
  pdName: "gce-disk-1"
please remove empty line.
To access this proposal easily, could you update the title from "Raw block"
metadata:
  name: myclaim
spec:
  **volumeType: block**
This doesn't render correctly if the entire block is bold-formatted. How about we simply add a comment "# new field"?
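If that suggestion were adopted, the snippet (using the myclaim example from this proposal) might render as:

```yaml
metadata:
  name: myclaim
spec:
  volumeType: block   # new field
```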
@k8s-sig-api @k8s-sig-apps @k8s-sig-node

@msau42 @mtanino @screeley44 please re-review changes according to comments
# Local Raw Block Consumption in Kubernetes
Please remove "Local".
Local Raw Block Consumption in Kubernetes
@mtanino done
provisioner: no-provisioning
parameters:
```
Could you add the "volumeType combination matrix" which we have in the google doc? Users and admins need to know the combination of parameters between PV, PVC and StorageClass.
@mtanino Added
/cc
/sig storage
Therefore, the risk outweighed the reward when trying to leverage the power of local storage needed for stateful applications like databases.

By extending the API for volumes to specifically request a raw block device, we provide a consistent method for accessing volumes. In
How was it not consistent before?
By having the raw block devices created as PVs, they are more consumable. In contrast, in terms of how file on block is handled today, the raw devices are created outside of kube, and kube has no knowledge or reference to what is available.
I guess my confusion is that from a user point of view, the access of volume's today is consistent. It's always a filesystem.
Updated to make it more concise
By extending the API for volumes to specifically request a raw block device, we provide a consistent method for accessing volumes. In addition, the ability to use a raw block device without a filesystem will allow Kubernetes to better support high performance applications that can utilize raw block devices directly for their storage.
Can you give some examples here?
I think these are covered in the Use Cases. I don't see a need to duplicate it here unless you feel it's necessary
I don't mean yaml examples, I mean just naming some of the DBs that could benefit from it.
@msau42 added some examples, but as said above, these are covered extensively in the use cases
# Value add to Kubernetes

Before the advent of storage plugins, emptyDir and hostPath were widely used to quickly prototype stateless applications in Kubernetes.
Not sure if we need all this storage history.
I would like it in there...code is 90% history
## Persistent Volume API Changes:
For static provisioning the admin creates the volume and also is intentional about how the volume should be consumed. For backwards compatibility, the absence of volumeType will default to volumes work today, which are formatted with a filesystem depending on
default to file, which is how volumes work today
@msau42 fixed
For static provisioning the admin creates the volume and also is intentional about how the volume should be consumed. For backwards compatibility, the absence of volumeType will default to volumes work today, which are formatted with a filesystem depending on the plug-in chosen. Recycling will not be a supported reclaim policy. Once the user deletes the claim against a PV, the volume will be scrubbed according to how it was bound. The path value in the local PV definition would be overloaded to define the path of the raw
emphasize that it's up to the plugin to scrub appropriately
ack
## UC1:

DESCRIPTION: An admin wishes to pre-create a series of raw block devices to expose as PVs for consumption. The admin wishes to specify the purpose of these devices by specifying 'block' as the volumeType for the PVs.
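A sketch of such a pre-created local PV (the PV name and capacity are hypothetical; volumeType is the proposed field, and the device path reuses the /dev/xvdc example from this proposal):

```yaml
kind: PersistentVolume
apiVersion: v1
metadata:
  name: local-raw-pv-01                   # hypothetical name
spec:
  volumeType: block                       # proposed field: expose the device raw, no filesystem
  capacity:
    storage: 100Gi                        # hypothetical capacity
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Delete   # Recycle is not a supported reclaim policy for block
  local:
    path: /dev/xvdc
```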
Since you're using local volumes as the example, you can say "local raw block devices" in the description. Or, if you want it to be generic, and not just local, then maybe also give an example of some other plugins, like gce pd or fc.
@msau42 UC8 gives an example of GCE network attached raw block device. Updated UC1 to be local specific
```
NOTE: *accessModes correspond to the container runtime values, where RWO == RWM (mknod) to enable the device to be written to and create new files (default is RWM), and ROX == R.
**(RWX is NOT valid for block and should return an error.)** *This has been validated among runc, Docker and rocket.
Is this something we can include in the PV validation?
@msau42 we don't do this today for any other parameters; it's just used for 'matching' labels. I am not against it for making usability better, but I think we would have to validate all access modes.
Actually I think I am a little confused here. Access mode in this comment is referring to the container runtime values, and not the access mode that you specify in the PV/PVC. So maybe this comment belongs in a section about CRI + runtimes.
@msau42 I think we are just noting this as a suggestion to the CRI team: we have validated it for various runtimes, and they should behave so as to respect those values.
Yeah, I think it would be valuable to have a section about CRI and how you plan on using CRI to implement this. Then this comment could go there. Right now it's kind of out of place in the user examples.
Done
done
This note on access modes should be moved to the CRI section
***This has implementation details that have yet to be determined. It is included in this proposal for completeness of design.***
## UC6:
How is this different from UC3?
@msau42 UC6 uses a Storage Class and UC3 does not
metadata:
  name: web
spec:
  serviceName: "nginx"
Yeah, so the example in UC7 shows an nginx application using ebs raw block volumes.
I think what you actually want is a gluster application using ebs raw block volumes, and then an nginx application using gluster volumes.
accessModes:
  - "ReadWriteOnce"
gcePersistentDisk:
  fsType: "raw"
Instead of special casing "raw" as a value for fsType, can fsType just be ignored/empty if volumeType == block?
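A sketch of that suggestion, using the `volumeType` field proposed in this PR (the PV name is hypothetical): the raw-block intent is carried by `volumeType`, so `fsType` is simply omitted rather than set to `"raw"`.

```yaml
# Hypothetical PV shape per the review suggestion above: volumeType
# expresses the raw-block intent, so fsType is left out entirely.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: gce-raw-pv            # hypothetical name
spec:
  volumeType: block           # proposed API change
  capacity:
    storage: 10Gi
  accessModes:
    - "ReadWriteOnce"
  gcePersistentDisk:
    pdName: "gce-disk-1"      # no fsType: the device is consumed raw
```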
I miss some technical details about how will we expose block devices to pods and how we make them usable by non-root processes.
# Non Goals
* Support all storage devices natively in upstream Kubernetes. Non-standard storage devices are expected to be managed using extension mechanisms.
What extension mechanism? flex? CSI?
@jsafrane I don't know. It's a non-goal, so we haven't figured that out yet, and it isn't included in this design.
For extension, flexvolume is an example.
@erinboyd Hypervisor based container runtime is one of best use cases here I think. xref https://github.com/kubernetes/frakti
thanks @resouer
# Terminology
* Raw Block Device - a physically attached device devoid of a filesystem
* Raw Block Volume - a logical abstraction of the raw block device as defined by a path
* File on Block - a formatted (i.e. xfs) filesystem on top of a raw block device
do you mean Filesystem on Block?
@jsafrane they will need to be exposed through the CRI (deviceinfo and devices) as part of the runtime interfaces.
I meant that "File on Block" does not make much sense; do you really mean a single file on a block device? Most people use filesystems for that.
+1 "Filesystem on block" is more clear.
and consumption of the devices. Since it's possible to create a volume as a block device and then later consume it by provisioning
a filesystem on top, the design requires explicit intent for how the volume will be used.
The additional benefit of explicitly defining how the volume is to be consumed is that it provides a means for indicating the method
by which the device should be scrubbed when the claim is deleted, as this method will differ for a raw block device compared to a
Why should kubernetes / volume plugin scrub a device? Recycler is deprecated and local storage manages scrubbing in a different way, hidden from kubernetes.
Because if we go to reuse the volume, the data will still be there. Depending on the plugin type, the scrubbing may differ. For instance, if you are just using the raw block device vs. file on block, the clean up will be different. Therefore, it should fall to the plugin to 'know' how to scrub it.
I think the point you're trying to make here is that explicitly having volumeType
in the PV and PVC can help the storage provider determine what scrubbing method to use. As an example, for local storage:
- PV = block, PVC = block: zero the bytes
- PV = block, PVC = file: destroy the filesystem
- PV = file, PVC = file: delete the files
Thanks @msau42, added this to the doc
Again, Kubernetes does not scrub devices. Recycler is deprecated, the only policies are Retain or Delete (the whole volume). So why is scrubbing part of this proposal? The only thing that scrubs devices can be an external DaemonSet for local storage. And that's outside of this PR, isn't it?
@jsafrane it will be plugin dependent on how to scrub the device. (Not that I am disagreeing with you above)
This sounds to me like another project which can be included later. I completely agree that scrubbing is important, but this is mainly on bare-metal deployments. On the cloud, when a network block device is deleted, the cloud vendor takes care of all of this.
@lpabon yes it is a different project and must be a consideration of each plugin. It is noted here for thoroughness
What is the difference between recycling and scrubbing? To me it's the same thing.
> it will be plugin dependent on how to scrub the device
Plugins don't scrub anything. An external local provisioner might want to do this; however, it's an external provisioner and we should not define the policy in Kubernetes.
+1 with @jsafrane, this is confusing. We do not want to imply that there is a recycle or 'scrub' function of any sort encouraged by the API... You could say that the existing PV retention policies apply the same to PV.block as PV.file, or not say anything, but bringing up "scrub" implies recycle, which is very deprecated.
by which the device should be scrubbed when the claim is deleted, as this method will differ for a raw block device compared to a
filesystem. The ownership of scrubbing the device properly shall be up to the plugin method being utilized.
The last design point is that block devices should be able to be fully restricted by the admin, in accordance with how inline volumes
are today. Ideally, the admin would want to be able to restrict either raw local devices and/or raw network-attached devices.
I wonder why an admin would want to restrict usage of block devices. What damage can a user do? The device is retained/deleted afterwards, i.e. nobody else can use it. The damage is IMO zero.
from @thockin in the google doc review: "I could easily write something that tastes like ext4 but causes the kernel
to do silly things like allocate infinite memory or dereference bad
pointers.
Mount is privileged for very good reasons."
Mount is a privileged operation. If a pod is privileged, it can destroy the machine in many various ways; mounting a bad ext4 is quite a complicated one. One does not need to use a block device to mount a bad ext4, a simple local file and `mount -o loop` is enough.
And there should be no way to change a PVC with `volumeType=block` to `volumeType=ext4`; our validator should not allow this change (btw, a note in this proposal would be nice).
So, is there any other thing to be afraid of?
  name: myclaim
spec:
  volumeType: block # proposed API change
  accessModes:
indentation is wrong here, accessModes (and other fields) should be on the same level as volumeType
actually, most YAML examples in this PR have wrong indentation.
@jsafrane fixed
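For reference, a correctly indented sketch of the whole claim (fields per this PR's proposed `volumeType` API; the requested size is a hypothetical example):

```yaml
# Sketch of the proposed raw block PVC with the indentation fixed:
# volumeType sits at the same level as accessModes and resources.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: myclaim
spec:
  volumeType: block           # proposed API change
  accessModes:
    - "ReadWriteOnce"
  resources:
    requests:
      storage: 10Gi           # hypothetical size
```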
ADMIN:

* Admin creates a disk, attaches it to the server, and passes the disk through to the KVM guest inside which the Kubelet node1 is running.
The way I understand the FC volume plugin, it's the plugin that attaches a volume to a node, like iSCSI or AWS does. There is no need for the admin to attach the volume nor label nodes.
@jsafrane
In this case, kubelet is working inside a VM on a hypervisor. Therefore, the admin has to create a disk, expose and attach the disk to the hypervisor, and also pass it through to the VM beforehand.
(This might be done by an external provisioner if one exists.)
After that, the FC plugin can find (attach) the volume to a container on the VM.
But then it's not a PV with `FCVolumeSource`; it looks like a HostPath or local volume. The fact that this volume is FC is irrelevant to the use case.
@jsafrane
IIRC, the following steps are needed to use a pre-created FC volume (note: these steps are not for the virtual environment case):
- Admin creates a volume and exposes the volume to a worker node. (This is done by a storage operation.)
- Admin creates an FC PV using the device location, such as WWN and LUN.
- User creates a PVC to bind the pre-created FC PV.
- User creates a Pod with the PVC. During Pod creation, the FC plugin finds the FC disk which is attached to the worker node using "echo 1 > /sys...", then the FC volume is recognized by the worker node and the Pod consumes this FC volume.
As you mentioned, I think an FC volume is similar to a HostPath without an external provisioner. But I think this is not a local volume case, because the FC volume is a network attached volume and does not live inside the server itself.
> FC Plugin find FC disk which is attached to the worker node using "echo 1 > /sys...", then
Indeed, our FC plugin expects that someone attaches the volume before. I am sorry for the confusion, I expected that our FC plugin is smarter and can attach a FC LUN by itself. So, could we use a simpler and more obvious use case with, say, iSCSI?
@jsafrane
Thank you for the confirmation.
> I expected that our FC plugin is smarter and can attach a FC LUN by itself.
I see. The current FC plugin only has very limited functionality.
> So, could we use a simpler and more obvious use case with say iSCSI?
How about the following?
- Admin creates a volume and exposes it to the kubelet node1 VM running on KVM. (This is done by a storage operation.)
- Admin creates an iSCSI persistent volume using storage information such as portal IP, IQN and LUN.
- Admin adds a "raw-disk" label to the kubelet node1 VM.
- User creates a persistent volume claim with the volumeType: block option to bind the pre-created iSCSI PV.
- User creates a Pod yaml which uses the raw-pvc PVC and selects the node where the kubelet node1 VM is running via the nodeSelector option.
During Pod creation, the iSCSI plugin attaches the iSCSI volume to the kubelet node1 VM using the storage information.
Yes, that sounds better. Out of curiosity, why does the admin expose the volume just to one node? Wouldn't it be better if the volume could be used by all nodes in the cluster so the pod can run anywhere? That's IMO the typical use case.
@jsafrane
I forgot to change that scenario. That is specific to the FC plugin case. I'll change it.
As for the FC case, if we expose an FC volume to all worker nodes, the FC volume may be recognized by the OS on all of those worker nodes. If a container consumes the volume on a specific worker node, the device file like /dev/sdX will be removed from that worker node during volume deletion, but the device file keeps remaining on the other worker nodes even when it's unnecessary.
To avoid this situation, I suggested exposing the FC volume to a specific worker node and consuming it on the selected worker node.
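The iSCSI workflow agreed above could be sketched as the following pre-created PV (portal address, IQN and LUN are made-up examples; `volumeType` is the field proposed in this PR):

```yaml
# Hypothetical iSCSI raw block PV for the workflow above.
# All storage identifiers below are invented examples.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: raw-iscsi-pv
spec:
  volumeType: block                # proposed API change
  capacity:
    storage: 10Gi
  accessModes:
    - "ReadWriteOnce"
  iscsi:
    targetPortal: 10.0.0.10:3260   # hypothetical portal IP
    iqn: iqn.2017-07.com.example:raw-disk
    lun: 0
```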
image: mysql
volumeMounts:
- name: my-db-data
  mountPath: /var/lib/mysql/data
is `/var/lib/mysql/data` the right path for a device?
@jsafrane The PV would have the device path. This is the path for the container application
So, how is a raw block device exposed to a pod if it is not `mountPath`?
@jsafrane In the PV where it's defined.
PV defines the source of the volume. How is it exposed to the pod? Above you wrote that it's not `mountPath`. So how is a MySQL in a container supposed to find the device?
@jsafrane The pod would specify the PVC in it, which is bound to the PV you created with the device path
@jsafrane from your comment below (I can't seem to comment on it). Why would you have several raw block devices exposed in a pod? I guess I thought we were trying to follow the same process for consuming raw block devices as we do filesystems and one PV would be == to one device. Therefore, a pod would only be able to use the single claim to a single rbd.
One pod can use several PersistentVolumes as filesystems right now. With raw block volumes I expect the same functionality. Simple example would be a Gluster brick, managing several physical hard drives.
@jsafrane personally I have never done this but I can understand the use case. I don't understand how it would be different for this design, as we are using the naming conventions we use today for non-block devices by making a single PV map to a single path... and in this case a device.
The answer to @jsafrane's question is that volumeMounts.mountPath will also be used.
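For context, the API shape that eventually merged (see the VolumeMode/VolumeDevice merge messages at the bottom of this page) answers this question with a separate `volumeDevices` list on the container, distinct from `volumeMounts`; a sketch with a hypothetical device path and claim name:

```yaml
# Sketch of the merged API shape: a raw block claim surfaces inside the
# container at devicePath instead of being mounted at a mountPath.
apiVersion: v1
kind: Pod
metadata:
  name: mysql-block
spec:
  containers:
  - name: mysql
    image: mysql
    volumeDevices:             # block devices, not filesystem mounts
    - name: my-db-data
      devicePath: /dev/xvda    # hypothetical in-container device path
  volumes:
  - name: my-db-data
    persistentVolumeClaim:
      claimName: raw-pvc       # hypothetical claim name
```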
accessModes:
  - "ReadWriteOnce"
gcePersistentDisk:
  fsType: "raw"
+1 to @msau42, if someone wants a raw device then volume plugin should ignore fsType part of the PV. If some crazy pod wants a raw device with a filesystem on it the pod can call mkfs on its own.
  pdName: "gce-disk-1"
```

***If admin specifies volumeType: block + fstype: ext4 then they would get what they already get today***
I don't understand this statement. Who gets what?
I think the statement is trying to say that "file on block" is today's behavior.
@jsafrane I made this more concise
    storage: 10Gi
```
***SUITABLE FOR: NETWORK ATTACHED BLOCK***
I would perhaps add an explicit use case where the admin / dynamic provisioner creates a PV with `volumeType: block` and the claim wants `volumeType: file`, and everything works as usual: the PVC gets bound, the device is automatically formatted by Kubernetes and mounted as a filesystem wherever a pod wants it.
So one PV can be consumed both as block or filesystem, depending on the PVC.
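A sketch of the claim side of that suggested use case (field names per this PR's proposal; the claim name is hypothetical): the PV is provisioned as `volumeType: block`, and the claim asks for a filesystem, so Kubernetes would format and mount the device as usual.

```yaml
# Hypothetical "file on block" claim from the review comment above: the
# bound PV is a raw block device, but the claim consumes it as a filesystem.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: file-on-block-claim
spec:
  volumeType: file            # proposed API change; the PV side is block
  accessModes:
    - "ReadWriteOnce"
  resources:
    requests:
      storage: 10Gi
```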
I would also add an example where the claim wants `volumeType: block`, and the provisioner cannot support it.
@jsafrane I was confused on that model also. I cannot come up with a use case where the PV is block and the claim is filesystem; it just seems that is what is done today.
@lpabon matrix updated...
By extending the API for volumes to specifically request a raw block device, we provide an explicit method for volume consumption,
whereas previously it was always a filesystem. In addition, the ability to use a raw block device without a filesystem will allow
Kubernetes to better support high performance applications that can utilize raw block devices directly for their storage.
/Kuberenets/Kubernetes/
By extending the API for volumes to specifically request a raw block device, we provide an explicit method for volume consumption,
whereas previously it was always a filesystem. In addition, the ability to use a raw block device without a filesystem will allow
Kubernetes to better support high performance applications that can utilize raw block devices directly for their storage.
For example, MariaDB or MongoDB.
- Ceph Bluestore
Therefore, the risk outweighed the reward when trying to leverage the power of local storage needed for stateful applications like
databases.

By extending the API for volumes to specifically request a raw block device, we provide an explicit method for volume consumption,
/raw block device/raw block volume/
I am consistently using raw block device rather than volume as for some people volume implies filesystem and these are raw
The value add to kubernetes section seems incomplete. The last sentence seems like the only actual use case (I want to bypass the filesystem for better performance for a few databases); the rest seems like a background section.
This section should clearly describe why someone who has Kubernetes as it is today wants raw block devices, and make the case that Kubernetes should be changed. Give examples of use cases that raw block devices enable that are not possible today.
@smarterclayton The use cases below the value add section are all things you cannot do today in Kubernetes. I will notate this in this section to better develop the case for changing it.
definitions for the block devices will be driven through the PVC and PV and Storage Class definitions. Along with Storage
Resource definitions, this will provide the admin with a consistent way of managing all storage.
The API changes proposed in the following section are minimal with the idea of defining a volumeType to indicate both the definition
and consumption of the devices. Since it's possible to create a volume as a block device and then later consume it by provisioning
/of the devices/of the volumes/
GetVolumePath() string
}
```
# Mounter binding matrix
PV binder does not have (and should not have) access to `fsType` in PV *VolumeSource. The whole table makes very little sense to me. Again, why do we need `fsType=raw`? It just complicates things unnecessarily.
@jsafrane you could have a provisioner that does both. Hence the provisioner takes the fsType in as a parameter. For instance, in GlusterFS they have several provisioners that use the same base code, but based on a flag it either installs a filesystem (fsType=ext4) onto the device or (fsType=raw) uses it as a raw block device. I just think if you want to automate something like Gluster E2E then you would need the ability to create or discover the block devices and then install the filesystem on them, instead of it being a multi-step process. Otherwise, it's not an improvement over what we have today, which is very manual and error prone.
Again, the binder does not see fsType and thus can't make any decision based on it. These two lines are equal to the binder, yet you assume a different action.
PV volumeType | Plugin fstype | PVC volumeType | Result |
---|---|---|---|
unspecified | ext4 | block | NO BIND |
unspecified | raw | block | BIND |
So are these lines (note that I added the second one as it's not covered by the table).
PV volumeType | Plugin fstype | PVC volumeType | Result |
---|---|---|---|
unspecified | raw | unspecified | NO BIND |
unspecified | ext4 | unspecified | BIND |
> For instance, in GlusterFS they several provisioners that use the same base code
That's what PVC.VolumeType is for. The provisioner sees a PVC and provides the right PV (Gluster volume or iSCSI LUN) for it.
> based on a flag it either installs a filesystem (fstype=ext4) onto the defvice or (fstype=raw) uses it as a raw block device.
Provisioners typically don't install filesystems on a device; that happens when the volume is mounted for the first time. And it won't have any FS created by Kubernetes or the provisioner when it's used as a raw block device.
> you would need to have the ability to create or discover the block devices and then install the filesystem on it,
Kubernetes installs a FS on volumes when needed, why should a test do that?
Automatic merge from submit-queue (batch tested with PRs 50457, 55558, 53483, 55731, 52842). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>. VolumeMode and VolumeDevice api **What this PR does / why we need it:** Adds volumeType api to PV and PVC for local block support based on this proposal (kubernetes/community#805) and this feature issue: kubernetes/enhancements#351 **Special notes for your reviewer:** There are other PR changes coming, this just simply creates the api fields #53385 - binding logic changes dependent on this change **Release note:** NONE Notes will be added in subsequents PR with the volume plugin changes, CRI, etc... cc @msau42 @liggitt @jsafrane @mtanino @saad-ali @erinboyd
Automatic merge from submit-queue (batch tested with PRs 50457, 55558, 53483, 55731, 52842). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>. VolumeMode and VolumeDevice api **What this PR does / why we need it:** Adds volumeType api to PV and PVC for local block support based on this proposal (kubernetes/community#805) and this feature issue: kubernetes/enhancements#351 **Special notes for your reviewer:** There are other PR changes coming, this just simply creates the api fields #53385 - binding logic changes dependent on this change **Release note:** NONE Notes will be added in subsequents PR with the volume plugin changes, CRI, etc... cc @msau42 @liggitt @jsafrane @mtanino @saad-ali @erinboyd Kubernetes-commit: 5b32e4d24dd65573fc79b654c99f7c7f46de4ebc
Automatic merge from submit-queue (batch tested with PRs 55112, 56029, 55740, 56095, 55845). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>. Block volume: Command line printer update **What this PR does / why we need it**: Add cmdline printer support changes. **Which issue this PR fixes**: Based on this proposal (kubernetes/community#805 & kubernetes/community#1265) and this feature issue: kubernetes/enhancements#351 **Special notes for your reviewer**: There are other PRs related to this functionality. (#50457) API Change (#53385) VolumeMode PV-PVC Binding change (#51494) Container runtime interface change, volumemanager changes, operationexecutor changes (#55112) Block volume: Command line printer update Plugins (#51493) Block volumes Support: FC plugin update (#54752) Block volumes Support: iSCSI plugin update **Release note**: ``` NONE ``` /sig storage /cc @msau42 @jsafrane @saad-ali @erinboyd @screeley44 @kubernetes/sig-storage-pr-reviews - Command results:

```
~/sample/storage/fc_loop/file % k get pv,pvc,pod
NAME             CAPACITY  ACCESS MODES  VOLUME MODE  RECLAIM POLICY  STATUS  CLAIM                      STORAGECLASS  REASON  AGE
pv/block-pv0001  1Gi       RWO           Block        Retain          Bound   default/nginx-block-pvc01  slow                  2m
pv/file-pv0001   1Gi       RWO           Filesystem   Retain          Bound   default/nginx-file-pvc01   slow                  24s

NAME                   STATUS  VOLUME        CAPACITY  ACCESS MODES  VOLUME MODE  STORAGECLASS  AGE
pvc/nginx-block-pvc01  Bound   block-pv0001  1Gi       RWO           Block        slow          2m
pvc/nginx-file-pvc01   Bound   file-pv0001   1Gi       RWO           Filesystem   slow          25s

NAME                READY  STATUS             RESTARTS  AGE
po/nginx-file-pod1  0/1    ContainerCreating  0         4s
po/nginx-pod1       1/1    Running            0         2m

~/sample/storage/fc_loop/file % k get pv,pvc,pod
NAME             CAPACITY  ACCESS MODES  VOLUME MODE  RECLAIM POLICY  STATUS  CLAIM                      STORAGECLASS  REASON  AGE
pv/block-pv0001  1Gi       RWO           Block        Retain          Bound   default/nginx-block-pvc01  slow                  2m
pv/file-pv0001   1Gi       RWO           Filesystem   Retain          Bound   default/nginx-file-pvc01   slow                  40s

NAME                   STATUS  VOLUME        CAPACITY  ACCESS MODES  VOLUME MODE  STORAGECLASS  AGE
pvc/nginx-block-pvc01  Bound   block-pv0001  1Gi       RWO           Block        slow          2m
pvc/nginx-file-pvc01   Bound   file-pv0001   1Gi       RWO           Filesystem   slow          40s

NAME                READY  STATUS   RESTARTS  AGE
po/nginx-file-pod1  1/1    Running  0         19s
po/nginx-pod1       1/1    Running  0         2m

~/sample/storage/fc_loop/file % k describe pv/block-pv0001
Name:            block-pv0001
Labels:          <none>
Annotations:     pv.kubernetes.io/bound-by-controller=yes
                 volume.beta.kubernetes.io/storage-class=slow
StorageClass:    slow
Status:          Bound
Claim:           default/nginx-block-pvc01
Reclaim Policy:  Retain
Access Modes:    RWO
VolumeMode:      Block
Capacity:        1Gi
Message:
Source:
    Type:        FC (a Fibre Channel disk)
    TargetWWNs:  28000001ff0414e2
    LUN:         0
    FSType:
    ReadOnly:    true
Events:          <none>

~/sample/storage/fc_loop/file % k describe pv/file-pv0001
Name:            file-pv0001
Labels:          <none>
Annotations:     pv.kubernetes.io/bound-by-controller=yes
                 volume.beta.kubernetes.io/storage-class=slow
StorageClass:    slow
Status:          Bound
Claim:           default/nginx-file-pvc01
Reclaim Policy:  Retain
Access Modes:    RWO
VolumeMode:      Filesystem
Capacity:        1Gi
Message:
Source:
    Type:        FC (a Fibre Channel disk)
    TargetWWNs:  28000001ff0414e2
    LUN:         0
    FSType:
    ReadOnly:    true
Events:          <none>

~/sample/storage/fc_loop/file % k describe pvc/nginx-block-pvc01
Name:          nginx-block-pvc01
Namespace:     default
StorageClass:  slow
Status:        Bound
Volume:        block-pv0001
Labels:        <none>
Annotations:   pv.kubernetes.io/bind-completed=yes
               pv.kubernetes.io/bound-by-controller=yes
               volume.beta.kubernetes.io/storage-class=slow
Capacity:      1Gi
Access Modes:  RWO
VolumeMode:    Block
Events:        <none>

~/sample/storage/fc_loop/file % k describe pvc/nginx-file-pvc01
Name:          nginx-file-pvc01
Namespace:     default
StorageClass:  slow
Status:        Bound
Volume:        file-pv0001
Labels:        <none>
Annotations:   pv.kubernetes.io/bind-completed=yes
               pv.kubernetes.io/bound-by-controller=yes
               volume.beta.kubernetes.io/storage-class=slow
Capacity:      1Gi
Access Modes:  RWO
VolumeMode:    Filesystem
Events:        <none>
```
Automatic merge from submit-queue (batch tested with PRs 55938, 56055, 53385, 55796, 55922). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>. VolumeMode binding logic update Adds VolumeMode binding logic to pv-controller for local block support based on this proposal (kubernetes/community#805) and this feature issue: kubernetes/enhancements#351 **Special notes for your reviewer:** this change is dependent on #50457 cc @msau42 @jsafrane @mtanino @erinboyd
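The binding-logic change above boils down to one extra matching rule: a claim requesting `Block` must never bind to a `Filesystem` PV and vice versa, with an unset mode treated as `Filesystem` for backward compatibility. Below is a minimal sketch of that rule in Go, using simplified stand-in types and a hypothetical helper name `volumeModesMatch` — not the real k8s.io/api definitions or the actual pv-controller code.

```go
package main

import "fmt"

// PersistentVolumeMode is a simplified stand-in for the api field added
// in this feature (assumption: not the real k8s.io/api type).
type PersistentVolumeMode string

const (
	PersistentVolumeBlock      PersistentVolumeMode = "Block"
	PersistentVolumeFilesystem PersistentVolumeMode = "Filesystem"
)

// volumeModesMatch sketches the extra binding check: both sides must
// request the same mode. A nil (unset) mode defaults to Filesystem so
// that pre-existing PVs and PVCs keep binding as before.
func volumeModesMatch(pvMode, pvcMode *PersistentVolumeMode) bool {
	norm := func(m *PersistentVolumeMode) PersistentVolumeMode {
		if m == nil {
			return PersistentVolumeFilesystem
		}
		return *m
	}
	return norm(pvMode) == norm(pvcMode)
}

func main() {
	block := PersistentVolumeBlock
	fs := PersistentVolumeFilesystem
	fmt.Println(volumeModesMatch(&block, &block)) // true
	fmt.Println(volumeModesMatch(&block, &fs))    // false
	fmt.Println(volumeModesMatch(nil, &fs))       // true: unset defaults to Filesystem
}
```

In the real controller this check sits alongside the existing capacity and access-mode matching, so a `Block` claim simply never sees `Filesystem` PVs as candidates.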
…anager Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>. Block volumes Support: CRI, volumemanager and operationexecutor changes **What this PR does / why we need it**: This PR contains the following items to enable the block volumes support feature. - container runtime interface change - volumemanager changes - operationexecutor changes **Which issue this PR fixes**: Based on this proposal (kubernetes/community#805) and this feature issue: kubernetes/enhancements#351 **Special notes for your reviewer**: There are other PRs related to this functionality. (#50457) API Change (#53385) VolumeMode PV-PVC Binding change (#51494) Container runtime interface change, volumemanager changes, operationexecutor changes (#55112) Block volume: Command line printer update Plugins (#51493) Block volumes Support: FC plugin update (#54752) Block volumes Support: iSCSI plugin update **Release note**: ``` Adds alpha support for block volume, which allows users to attach a raw block volume to their pod without a filesystem on top of the volume. ``` /cc @msau42 @liggitt @jsafrane @saad-ali @erinboyd @screeley44
…rt-fc Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>. Block volumes Support: FC plugin update **What this PR does / why we need it**: Add interface changes to the FC volume plugin to enable the block volumes support feature. **Which issue this PR fixes**: Based on this proposal (kubernetes/community#805 & kubernetes/community#1265) and this feature issue: kubernetes/enhancements#351 **Special notes for your reviewer**: This PR temporarily includes the following changes, except the FC plugin change, for reviewing purposes. These changes will be removed from the PR once they are merged. - (kubernetes#50457) API Change - (kubernetes#53385) VolumeMode PV-PVC Binding change - (kubernetes#51494) Container runtime interface change, volumemanager changes, operationexecutor changes There are other PRs related to this functionality. (kubernetes#50457) API Change (kubernetes#53385) VolumeMode PV-PVC Binding change (kubernetes#51494) Container runtime interface change, volumemanager changes, operationexecutor changes (kubernetes#55112) Block volume: Command line printer update Plugins (kubernetes#51493) Block volumes Support: FC plugin update (kubernetes#54752) Block volumes Support: iSCSI plugin update **Release note**: ``` FC plugin: Support for block volume - this enables users to attach a raw block volume to their pod without a filesystem, through the FC plugin. ```
This PR has broken GitHub and has been duplicated to #1265
Automatic merge from submit-queue (batch tested with PRs 54230, 58100, 57861, 54752). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>. Block volumes Support: iSCSI plugin update **What this PR does / why we need it**: Add interface changes to the iSCSI volume plugin to enable the block volumes support feature. **Which issue this PR fixes**: Based on this proposal (kubernetes/community#805 & kubernetes/community#1265) and this feature issue: kubernetes/enhancements#351 **Special notes for your reviewer**: This PR temporarily includes the following changes, except the iSCSI plugin change, for reviewing purposes. These changes will be removed from the PR once they are merged. - (#50457) API Change - (#51494) Container runtime interface change, volumemanager changes, operationexecutor changes There are other PRs related to this functionality. (#50457) API Change (#53385) VolumeMode PV-PVC Binding change (#51494) Container runtime interface change, volumemanager changes, operationexecutor changes (#55112) Block volume: Command line printer update Plugins (#51493) Block volumes Support: FC plugin update (#54752) Block volumes Support: iSCSI plugin update **Release note**: ``` NONE ```
Design proposal for raw block via PV