spec: Remove `VolumeHandle` as request parameter for volume-related RPCs #103
Comments
One of our goals for CSI was to facilitate the creation of fully stateless plugins for simple cases. We included volume metadata in the request/response payloads with the idea that it should be possible for a plugin to place data in the metadata with the expectation that the CO will pass it back to the plugin on subsequent calls. This allows the plugin to fully implement CSI without needing a backing store or back-channel communication between the node plugins and the controller plugin. If we replace the handle with a simple name, then we'll be requiring plugins to maintain a backing store so that they can recover context from one call to the next. So, to me it seems like we'd lose a bunch of capability, and gain very little capability. Plugins that just want volume names and don't want to use any metadata can simply not use the optional metadata part of the handle at which point the handle is basically just a volume name plus a tiny bit of annoying overhead. But probably I am missing something? |
Hi @julian-hj, A few things:
|
And FYI - two different assumptions are fine as long as you also agree that the situation I'm describing is equally valid. In my context a name is more valuable than an ID because it enables an immediate attempt to obtain a lock for the volume -- and the plug-in can forgo any further operations if no lock is possible. If an ID-to-name translation is required, then locking becomes much slower. |
I guess I was assuming that the metadata would be derived from the storage platform plus some set of input configuration passed into the If I am understanding your requirement right, then we can basically solve it by just replacing That way the key we use to identify the volume is consistent across calls, and if the plugin has some other internal id it wants to use to look stuff up down the road, it can just put it in a key in Having said all that, if a plugin wants to use the name for locking purposes, it can simply decide to set |
Hi @julian-hj, I'm not discussing anything that is necessarily plug-in specific, so let's try and avoid using that as our point of context.
This is what I'm trying to avoid -- let's stop discussing what a plug-in may or may not do. A volume's name, even from the CO perspective, is the only piece of information that is useful for locking, as the name exists even if the volume does not. Therefore I believe the name should be required for all volume-related calls -- not as some piece of the volume metadata, not instead of the ID. In fact, it makes sense that a `VolumeHandle` should be:
message VolumeHandle {
string name = 1; // required
string id = 2; // optional
map<string, string> metadata = 3; // optional
} As long as the Also, I'm fine with a |
I agree that message VolumeHandle {
string name = 1; // required
string id = 2; // optional
map<string, string> metadata = 3; // optional
} |
I like the idea of the proposed direction. I would in fact take a step further: message VolumeHandle {
string id = 1; // required
map<string, string> metadata = 2; // optional
}
message CreateVolumeRequest {
...
string id = 2; // required, renamed from 'name'
} Plugin specific ID information can go to I think the only tricky part is |
@jieyu - Your suggestion removes name altogether and is in direct contradiction to this proposal, so I'm not really sure I follow what you're saying. The ID cannot replace the name for the create request. The ID is not yet known. I'm suggesting the name be a first class field, like the ID, as the name is far more important than the ID for things like concurrency and idempotency. The ID is important to the storage platform only, whereas the name is useful to the COs and concurrency implemented external to the storage platform. |
@akutz I meant to rename |
Ah! Somehow I missed the intent of that example. I saw it, but only inferred the removal of the name in favor of the handle. Thank you for the clarification. It makes perfect sense to me now! It's also in concert with my suggestion that the spec require the name be the Handle ID returned by ListVolumes' volume info structs. |
As for your concern about ListVolumes; I share the hesitance. I believe that's why, I think, I originally proposed we include Name and ID as first class fields in the Handle. That way it works with existing volumes. |
Sorry, on mobile and I keep accidentally hitting send. Anyway, thank you again for the response, and I'm sorry I misunderstood it. I think you're right that existing volumes may be the largest barrier here. Maybe I got ahead of myself with the idea that the ID be the name, but maybe, like you said, this isn't of concern? That the plug-in should be connected to storage provisioned only by CSI? Again, I think having the name and ID as separate fields would at least handle this situation and still promote the name as a first class component of the model. |
cc @saad-ali here. I think the very early draft of this spec asked the CO to specify the volume ID. I think the reason we changed it to the current way is exactly because we want to handle pre-existing volumes. IIRC, there is a requirement from k8s to support specifying the connection information for the source of a volume directly (essentially the plugin specific If we required that the CO generated ID (or name) be specified for each volume operation, what would be the ID (or name) the CO should use in the pre-existing volume case? I am not sure it makes sense to ask the user to specify that, given that this is a CO internal detail. Another solution is to introduce some sort of "import" operation, which takes a |
FWIW, I am on board with the idea of only the name being necessary, but the pre-provisioned volumes case does present a wrinkle. Internally, a CO is still going to generate a name/ID for the volume, correct? The CO must call
(emphasis added) So, how is the CO to know the ID, as the ID is provided by the plugin? Seems like a chicken and egg problem when a volume is pre-provisioned. @jieyu brings up a good example with NFS. Before d2bdb91, this was not an issue, as the In order to rely only on a name, One thing that came out of me thinking about this is that I do think d2bdb91 broke the workflow for pre-provisioned volumes. It's not possible for a CO to provide a name for a pre-provisioned volume to |
Hi @codenrhoden,
On today's CSI call we agreed that the above use case must be handled by the CO first becoming aware of pre-provisioned volumes through a |
RPC interactions documented here: https://github.com/container-storage-interface/spec/blob/master/spec.md#rpc-interactions please re-open if needed |
This issue is taking an even more aggressive stance than PR #88 and proposes that `VolumeHandle` be removed from the following RPCs' request parameters:
- `DeleteVolume`
- `ControllerPublishVolume`
- `ControllerUnpublishVolume`
- `NodePublishVolume`
- `NodeUnpublishVolume`
The `VolumeHandle` should be replaced entirely by the volume's name for three key reasons:
Names are Unique
The CSI specification states that COs must treat volume names as unique. This in and of itself isn't a justifiable reason to remove the `VolumeHandle` as a request parameter as the latter can shorten the time to look up a specific volume.
Concurrency
However, consider concurrency. An important part of idempotency is the specification's requirement that COs try their best to guarantee a single in-flight operation on a given volume. Crash scenarios mean that plug-ins must be able to cope with concurrency as well. That requires some type of lock system and some type of lock key. And what piece of information is available whether or not a volume has already been created or has been deleted? The volume name.
Because volume names exist regardless of the volume, the name is the obvious piece of information for plug-ins to use as a lock key.
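To make the lock-key argument concrete, here is a minimal sketch of a non-blocking lock keyed by volume name. The `NameLocker` type and its methods are hypothetical names invented for illustration; they are not part of the CSI specification or of GoCSI. The point it tries to show is that the key can be locked before the volume exists and after it has been deleted.

```go
package main

import (
	"fmt"
	"sync"
)

// NameLocker is a hypothetical try-lock keyed by volume name.
// It is an illustration only, not an API from CSI or GoCSI.
type NameLocker struct {
	mu   sync.Mutex
	held map[string]struct{}
}

// NewNameLocker returns an empty locker.
func NewNameLocker() *NameLocker {
	return &NameLocker{held: make(map[string]struct{})}
}

// TryLock reports whether the caller obtained the lock for name.
// It never blocks, so a caller can fail fast when an operation is pending.
func (l *NameLocker) TryLock(name string) bool {
	l.mu.Lock()
	defer l.mu.Unlock()
	if _, busy := l.held[name]; busy {
		return false
	}
	l.held[name] = struct{}{}
	return true
}

// Unlock releases the lock for name; releasing an unheld name is a no-op.
func (l *NameLocker) Unlock(name string) {
	l.mu.Lock()
	defer l.mu.Unlock()
	delete(l.held, name)
}

func main() {
	locks := NewNameLocker()
	// "vol-a" need not exist on the storage platform: the name alone is the key.
	fmt.Println(locks.TryLock("vol-a")) // first caller obtains the lock
	fmt.Println(locks.TryLock("vol-a")) // a second caller is refused
	locks.Unlock("vol-a")
	fmt.Println(locks.TryLock("vol-a")) // available again after release
}
```

A lock keyed by ID could not cover `CreateVolume`, because the ID is only known once the storage platform has created the volume.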
Idempotency
Once a plug-in can guarantee serial access to volumes, idempotency becomes much easier to implement. In fact, the GoCSI project provides a gRPC interceptor that provides both serial and idempotent access to volumes using names as the lock key. For all RPCs except `CreateVolume`, the interceptor follows this pattern:
The first step to each of the interceptor's RPC-related functions, except `CreateVolume`, is to take the provided volume ID and look up the volume's name. If the name cannot be found then the `VOLUME_DOES_NOT_EXIST` error code is returned. Otherwise the name is used to try and obtain a lock for that volume. If no lock can be obtained, an `OPERATION_PENDING` error is returned.
The point is that the volume name is the most important piece of information because it alone can be used to implement a generic means of both serial access and idempotency simply because the volume name always exists -- even if the volume does not.
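The flow described above can be sketched roughly as follows. This is not the GoCSI interceptor itself: the `Guard` type, the `Do` method, the in-memory `idToName` index, and the Go error variables are all assumptions made for illustration. Only the two error code names come from the text above.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// Error values named after the codes mentioned above; the Go identifiers
// themselves are hypothetical.
var (
	ErrVolumeDoesNotExist = errors.New("VOLUME_DOES_NOT_EXIST")
	ErrOperationPending   = errors.New("OPERATION_PENDING")
)

// Guard serializes per-volume operations using the volume name as lock key.
type Guard struct {
	mu       sync.Mutex
	idToName map[string]string   // assumed ID -> name index
	busy     map[string]struct{} // names with an in-flight operation
}

// NewGuard builds a Guard over a fixed ID -> name index.
func NewGuard(idToName map[string]string) *Guard {
	return &Guard{idToName: idToName, busy: make(map[string]struct{})}
}

// Do resolves id to a name, takes the name lock, and runs op while holding it.
func (g *Guard) Do(id string, op func(name string) error) error {
	g.mu.Lock()
	name, ok := g.idToName[id]
	if !ok {
		g.mu.Unlock()
		return ErrVolumeDoesNotExist
	}
	if _, pending := g.busy[name]; pending {
		g.mu.Unlock()
		return ErrOperationPending
	}
	g.busy[name] = struct{}{}
	g.mu.Unlock()

	// Release the name lock once the operation finishes.
	defer func() {
		g.mu.Lock()
		delete(g.busy, name)
		g.mu.Unlock()
	}()
	return op(name)
}

func main() {
	g := NewGuard(map[string]string{"id-1": "vol-a"})
	err := g.Do("id-1", func(name string) error {
		// A nested call on the same volume stands in for a concurrent request.
		return g.Do("id-1", func(string) error { return nil })
	})
	fmt.Println(err)
	fmt.Println(g.Do("id-404", func(string) error { return nil }))
}
```

Note the extra ID-to-name lookup at the top of `Do`. If the CO supplied the name directly, that step would disappear and the lock attempt could happen immediately.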
Alternative Suggestion
A different approach would be to update the pending #88 and instead of making the request parameter `oneof` a volume name or its handle -- require both. That way the name must be provided by the CO as well as the ID.
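As a rough illustration of the "require both" variant, the sketch below models a request fragment that carries the name and the ID together and rejects requests missing either one. The `VolumeRef` type, its fields, and `Validate` are hypothetical and do not mirror any generated CSI type.

```go
package main

import (
	"errors"
	"fmt"
)

// VolumeRef is a hypothetical request fragment carrying both identifiers.
type VolumeRef struct {
	Name string // CO-assigned name, usable as a lock key without any lookup
	ID   string // plug-in-assigned ID, usable for fast lookup on the platform
}

// Validate enforces that both fields are present, as the alternative proposes.
func (r VolumeRef) Validate() error {
	if r.Name == "" {
		return errors.New("volume name is required")
	}
	if r.ID == "" {
		return errors.New("volume ID is required")
	}
	return nil
}

func main() {
	fmt.Println(VolumeRef{Name: "vol-a", ID: "id-1"}.Validate())
	fmt.Println(VolumeRef{ID: "id-1"}.Validate())
}
```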