Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

spec: Remove VolumeHandle as request parameter for volume-related RPCs #103

Closed
akutz opened this issue Sep 14, 2017 · 16 comments
Closed

spec: Remove VolumeHandle as request parameter for volume-related RPCs #103

akutz opened this issue Sep 14, 2017 · 16 comments

Comments

@akutz
Copy link
Contributor

akutz commented Sep 14, 2017

This issue is taking an even more aggressive stance than PR #88 and proposes that VolumeHandle be removed from the following RPCs' request parameters:

  • DeleteVolume
  • ControllerPublishVolume
  • ControllerUnpublishVolume
  • NodePublishVolume
  • NodeUnpublishVolume

The VolumeHandle should be replaced entirely by the volume's name for three key reasons:

  1. Names are unique
  2. Concurrency
  3. Idempotency

Names are Unique

The CSI specification states that COs must treat volume names as unique. This in and of itself isn't a justifiable reason to remove the VolumeHandle as a request parameter as the latter can shorten the time to look up a specific volume.

Concurrency

However, consider concurrency. An important part of idempotency is the specification's requirements that COs try their best to guarantee a single, in-flight operation on a given volume. Because crash scenarios mean that plug-ins must be able to cope with concurrency as well. That requires some type of lock system and some type of lock key. And what piece of information is available whether or not a volume has already been created or has been deleted? The volume name.

Because volume names exist regardless of the volume, the name is the obvious piece of information for plug-ins to use as a lock key.

Idempotency

Once a plug-in can guarantee serial access to volumes, idempotency becomes much easier to implement. In fact, the GoCSI project provides a gRPC interceptor that provides both serial and idempotent access to volumes using names as the lock key. For all RPCs except CreateVolume, the interceptor follows this pattern:

	name, err := i.p.GetVolumeName(ctx, req.VolumeId)
	if err != nil {
		return nil, err
	}

The first step to each of the interceptor's RPC-related functions, except CreateVolume, is to take the provided volume ID and look up the volume's name. If the name cannot be found then the VOLUME_DOES_NOT_EXIST error code is returned. Otherwise the name is used to try and obtain a lock for that volume. If no lock can be obtained, an OPERATION_PENDING error is returned.

The point is that the volume name is the most important piece of information because it alone can be used to implement a generic means of both serial access and idempotency simply because the volume name always exists -- even if the volume does not.

Alternative Suggestion

A different approach would be to update the pending #88 and instead of making the request parameter oneof a volume name or its handle -- require both. That way the name must be provided by the CO as well as the ID.

@julian-hj
Copy link
Contributor

One of our goals for CSI was to facilitate the creation of fully stateless plugins for simple cases. We included volume metadata in the request/response payloads with the idea that it should be possible for a plugin to place data in the metadata with the expectation that the CO will pass it back to the plugin on subsequent calls. This allows the plugin to fully implement CSI without needing a backing store or back-channel communication between the node plugins and the controller plugin.

If we replace the handle with a simple name, then we'll be requiring plugins to maintain a backing store so that they can recover context from one call to the next.

So, to me it seems like we'd lose a bunch of capability, and gain very little capability. Plugins that just want volume names and don't want to use any metadata can simply not use the optional metadata part of the handle at which point the handle is basically just a volume name plus a tiny bit of annoying overhead.

But probably I am missing something?

@akutz
Copy link
Contributor Author

akutz commented Sep 14, 2017

Hi @julian-hj,

A few things:

  • You and I are operating on two different assumptions -- in my mind the ID and name and even metadata are all derived from the storage platform, not the plug-in. So losing state wouldn't matter -- especially once the idempotent interceptor can locate its cache externally.

  • Did you see my alt suggestion of simply always requiring the name? The fact is that it's the one piece of info that can be used to provide serial access, and thus idempotency. An ID doesn't exist when a volume doesn't -- but a name does -- and thus can be used to lock volume requests ahead of a volume's creation and after its deletion.

@akutz
Copy link
Contributor Author

akutz commented Sep 14, 2017

And FYI - two diff assumptions is fine as long as you also agree that the situation I'm describing is equally valid. In my context a name is more valuable than ID as it enable an immediate attempt to obtain a lock for the volume -- and can forgoe any more operations if no lock is possible. If an ID to name translation is required then locking becomes much slower.

@julian-hj
Copy link
Contributor

julian-hj commented Sep 14, 2017

I guess I was assuming that the metadata would be derived from the storage platform plus some set of input configuration passed into the parameters and credentials fields of CreateVolumeRequest. We didn't want to assume that the plugin would have a convenient way to store that information in the event that it needs it later in the lifecycle, so the idea was that it could echo it back in whatever way it likes through the metadata in the VolumeHandle.

If I am understanding your requirement right, then we can basically solve it by just replacing id with name in the VolumeHandle and stipulating that response.volume_info.handle.name must equal request.name. Does that make sense?

That way the key we use to identify the volume is consistent across calls, and if the plugin has some other internal id it wants to use to look stuff up down the road, it can just put it in a key in response.volume_info.handle.metadata.

Having said all that, if a plugin wants to use the name for locking purposes, it can simply decide to set response.volume_info.handle.id = request.name during the create call and then the CO will pass the original name back to the plugin with any new request, so I am not sure we really need to change the spec to accommodate that requirement.

@akutz
Copy link
Contributor Author

akutz commented Sep 17, 2017

Hi @julian-hj,

I'm not discussing anything that is necessarily plug-in specific, so let's try and avoid using that as our point of context.

Having said all that, if a plugin wants to use the name for locking purposes

This is what I'm trying to avoid -- let's stop discussing what a plug-in may or may not do.

A volume's name, even from the CO perspective, is the only piece of information that is useful for locking, as the name exists even if the volume does not. Therefore I believe the name should be required for all volume-related calls -- not as some piece of the volume metadata, not instead of the ID. In fact, it makes sense that a `VolumeHandle should be:

message VolumeHandle {
  string name = 1; // required
  string id = 2; // optional
  map<string, string> metadata = 3; // optional
}

As long as the ListVolumes RPC exists, the above change to a VolumeHandle would also allow the retrieval of a volume's name via a first-class field.

Also, I'm fine with a VolumeHandle's id field being the volume name, but if that's the case, the spec should indicate that the value provided via a CreateVolume request's name parameter should be the value used in a VolumeHandle's id field for all volumes. Let the metadata contain any storage platform specific ID then.

@cpuguy83
Copy link
Contributor

I agree that name should be an actual required parameter in the message since name is not exactly metadata and like the last example you provided with:

message VolumeHandle {
  string name = 1; // required
  string id = 2; // optional
  map<string, string> metadata = 3; // optional
}

@jieyu
Copy link
Member

jieyu commented Sep 20, 2017

I like the idea of the proposed direction. I would in fact take a step further:

message VolumeHandle {
  string id = 1; // required
  map<string, string> metadata = 2; // optional
}

message CreateVolumeRequest {
  ...
  string id = 2; // required, renamed from 'name'
}

Plugin specific ID information can go to metadata.

I think the only tricky part is ListVolumes, because it is now required to show CO generated id in the result. That means, the volumes in the ListVolumes result must be volumes that were previously provisioned by the CO. Maybe this is not a bad thing?

@akutz
Copy link
Contributor Author

akutz commented Sep 20, 2017

@jieyu - Your suggestion removes name altogether and is in direct contradiction to this proposal, so I'm not really sure I follow what you're saying.

The ID cannot replace the name for the create request. The ID is not yet known.

I'm suggesting the name be a first class field, like the ID, as the name is far more important than the ID for things like concurrency and idempotency.

The ID is important to the storage platform only, whereas the name is useful to the COs and concurrency implemented external to the storage platform.

@jieyu
Copy link
Member

jieyu commented Sep 20, 2017

@akutz I meant to rename name to id (see CreateVolumeRequest).

@akutz
Copy link
Contributor Author

akutz commented Sep 20, 2017

Ah! Somehow I missed the intent of that example. I saw it, but only inferred the removal of the name in favor of the handle. Thank you for the clarification. It makes perfect sense to me now! It's also in concert with my suggestion that the spec require the name be the Handle ID returned by ListVolumes' volume info structs.

@akutz
Copy link
Contributor Author

akutz commented Sep 20, 2017

As for your concern about ListVolumes; I share the hesitance. I believe that's why, I think, I originally proposed we include Name and ID as first class fields in the Handle. That way it works with existing volumes.

@akutz
Copy link
Contributor Author

akutz commented Sep 20, 2017

Sorry, on mobile and I keep accidentally hitting send.

Anyway, thank you again for the response, and I'm sorry I misunderstood it.

I think you're right that existing volumes may be the largest barrier here. Maybe I got ahead of myself with the idea that ID be the name, but maybe like you said this isn't of concern? That the plug-in should be connected to storage provisioned only by CSI?

Again, I think having the name and ID as separate fields would at least Handle this situation and still promote the name as a first class component of the model.

@jieyu
Copy link
Member

jieyu commented Sep 20, 2017

cc @saad-ali here.

I think the very early draft of this spec asked the CO the specify the volume ID. I think the reason we changed it to the current way is exactly because we want to handle pre-existing volumes. IIRC, there is a requirement from k8s to support specifying the connection information for the source of a volume directly (essentially the plugin specific metadata field). For example, nfs ip and share name.

If we required that the CO generated ID (or name) be specified for each volume operation, what will be the ID (or name) the CO should use in the pre-existing volume case? I am not sure if it makes sense asking the user to specify that given that this is CO internal details. Another solution is to introduce some sort of "import" operation, which takes a VolumeHandle in the request, and the storage provider will establish the mapping between CO generated ID or name to the metadata (e.g., by tagging the volume, checkpointing, etc.).

@codenrhoden
Copy link

FWIW, I am on board with the idea of only the name being necessary, but the pre-provisioned volumes case does present a wrinkle. Internally, a CO is still going to generate a name/ID for the volume, correct? The CO must call ValidateVolumeCapabilies to validate the existing volume, and that message has a volume_info.handle struct which has a required ID field. That ID is defined as

ID is the identity of the provisioned volume specified by the Plugin.

(emphasis added)

So, how is the CO to know the ID, as the ID is provided by the plugin? Seems like a chicken and egg problem when a volume is pre-provisioned. @jieyu brings up a good example with NFS. Before d2bdb91, this was not an issue, as the VolumeID was only a map of the volume metadata, so the entire volume could be referenced by it's IP and share.

In order to rely only on a name, ValidateVolumeCapabilities would need to be modified such that a name is not required, and then an SP could provide back a name in a format that it would understand on subsequent calls in order to look up the volume. But in this scenario, the SP-provided name and the CO-internal generated name would not match. So does the CO have to be smart enough to provide it's name when it's a volume it has provisioned, and the SP's name when it's pre-provisioned (I would argue it does), or do have to support both an ID and a name, with a clear distinction that one is owned by the CO and one is owned by the SP?

One thing that came out of me thinking about this is that I do think d2bdb91 broke the workflow for pre-provisioned volumes. It's not possible for a CO to provide a name for a pre-provisioned volume to ValidateVolumeCapabilities. I should have noticed this when working on {code}'s csi-nfs plugin, but it was last tested right before that commit went into the spec.

@akutz
Copy link
Contributor Author

akutz commented Sep 20, 2017

Hi @codenrhoden,

One thing that came out of me thinking about this is that I do think d2bdb91 broke the workflow for pre-provisioned volumes. It's not possible for a CO to provide a name for a pre-provisioned volume to ValidateVolumeCapabilities. I should have noticed this when working on {code}'s csi-nfs plugin, but it was last tested right before that commit went into the spec.

On today's CSI call we agreed that the above use case must be handled by the CO first becoming aware of pre-provisioned volumes through a ListVolumes RPC, and then using the volume handles returned by that call to do any volume-related RPCs, such as ValidateVolumeCapabilities.

@jdef
Copy link
Member

jdef commented Dec 19, 2017

RPC interactions documented here: https://github.com/container-storage-interface/spec/blob/master/spec.md#rpc-interactions

please re-open if needed

@jdef jdef closed this as completed Dec 19, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants