-
Notifications
You must be signed in to change notification settings - Fork 5.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Introduce CSI secret/credential handling #1816
Conversation
|
||
The CSI external-provisioner will reserve the parameter keys `csiProvisionerSecretName` and `csiProvisionerSecretNamespace`. If specified, the CSI Provisioner will fetch the secret `csiProvisionerSecretName` in the Kubernetes namespace `csiProvisionerSecretNamespace` and pass it to: | ||
1. The CSI `CreateVolumeRequest` in the `controller_create_credentials` field. | ||
2. The CSI `DeleteVolumeRequest` in the `controller_delete_credentials` field. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see the controller_create_credentials
or controller_delete_credentials
fields in these structs... what is the data type?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah, thanks, I was looking at the CSI types vendored in kube
since Secret objects can contain binary data (data map[string][]byte
), how would non-utf-8 data be coerced into map[string]string
in the controller_*_credentials
fields? if non-utf-8 data was encountered, what would the provisioner do? corrupt the data, transform into base64 strings, fail the provision request, or something else?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@liggitt Would unconditionally converting secret's data (string or binary) to base64 before assigning to controller_*_credentials and then converting it back by a plugin not be an acceptable solution?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Per the spec, CSI will only except string data. But a plugin can dictate in its documentation that a specific secret contain binary data and specify what binary-to-text encoding to use (base64, quoted-printable, etc.). It is the responsibility of the entity (cluster admin, user, etc.) that creates the secret, in that case, to ensure its content is what the plugin expects and is encoded in the format the plugin expects.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will clarify this in the doc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would unconditionally converting secret's data (string or binary) to base64 before assigning to controller_*_credentials and then converting it back by a plugin not be an acceptable solution?
no... I'd expect transparent plumbing of Secret data in. that likely means limiting the accepted secrets to ones containing data that fits in the string type without corruption.
a plugin can dictate in its documentation that a specific secret contain binary data and specify what binary-to-text encoding to use (base64, quoted-printable, etc.). It is the responsibility of the entity (cluster admin, user, etc.) that creates the secret, in that case, to ensure its content is what the plugin expects and is encoded in the format the plugin expects.
I agree, just wanted to spec out the behavior if the provisioner encountered a secret containing binary data
|
||
Permissions for the corresponding external-provisioner may be limited to the specified namespace to prevent a compromised provisioner from gaining any other secrets. | ||
|
||
If the secret object contains more than one secret, all secrets are passed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
describe how the data key/values from the secret are placed into the controller_create_credentials/controller_delete_credentials fields
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will do.
// sensitive information to pass to the CSI driver during NodePublish. | ||
// Optional: ControllerPublishSecretRef is a reference to the secret object | ||
// containing sensitive information to pass to the CSI driver during | ||
// ControllerPublishVolume and ControllerUnpublishVolume calls. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what piece of the volume lifecycle (provision/attach/mount) does publish correspond to? same question for unpublish? what component is expected to have access to this secret (a controller? the kubelet responsible for this volume?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ControllerPublishVolume -> Attach (called by external-attacher)
ControllerUnpublishVolume -> Dettach (called by external-attacher)
NodePublishVolume -> Mount (Setup) (called by kubelet)
NodeUnpublishVolume -> Unmount (Teardown) (called by kubelet)
Will clarify in doc.
1. The CSI `ControllerPublishVolume` in the `controller_publish_credentials` field. | ||
2. The CSI `ControllerUnpublishVolume` in the `controller_unpublish_credentials` field. | ||
|
||
For the beta implementation of this feature instead of a new filed in the `CSIPersistentVolumeSource`, the secret reference will be passed via an annotation in the PV object: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we don't do beta fields as annotations, since they are impossible to migrate to first class fields. see https://github.com/kubernetes/community/blob/master/contributors/devel/api_changes.md#adding-unstable-features-to-stable-versions for details about adding new fields to stable objects. I would recommend simply using the *SecretReference type as was previously described
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ack, will use add new optional field.
// This may be empty if no secret is required. If the secret object contains | ||
// more than one secret, all secrets are passed. | ||
// +optional | ||
AttachSecretRef *SecretReference `json:"attachSecretRef,omitempty" protobuf:"bytes,4,opt,name=attachSecretRef"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Previously, this design distinguished between mount secrets (made accessible to kubelets) and attach secrets (only used by the attach/detach controller). Is that distinction being removed? Under what circumstances (if any) are kubelets expected to have access to "publish secrets"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That distinction still exists, but the secrets are renamed to align with the name of the CSI calls:
csiProvisionerSecretName
andcsiProvisionerSecretNamespace
inStorageClass
- Used for:
- CreateVolume -> Provision (called by external-provisioner)
- DeleteVolume -> Deprovision (called by external-provisioner)
- Used for:
ControllerPublishSecretRef *SecretReference
inCSIPersistentVolumeSource
- Used for:
- ControllerPublishVolume -> Attach (called by external-attacher)
- ControllerUnpublishVolume -> Dettach (called by external-attacher)
- Used for:
NodePublishSecretRef *LocalObjectReference
inPersistentVolumeClaimSpec
- Used for:
- NodePublishVolume -> Mount (Setup) (called by kubelet)
- NodeUnpublishVolume -> Unmount (Teardown) (called by kubelet)
- Used for:
Kubelet is only expected to have access to 3. Not to secrets corresponding to 1 and 2.
ac7d271
to
17d1381
Compare
@liggitt feedback addressed, PTAL! |
|
||
Because the kubelet would be responsible for fetching and passing the mount secret to the CSI driver,the Kubernetes NodeAuthorizer must be updated to allow kubelet read access to mount secrets. | ||
Kubelet is already allowed to fetch secrets in the namespace of the pod that it is setting up, so no additional permissions are required. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Kubelet is already allowed to fetch secrets in the namespace of the pod that it is setting up
that is not accurate... this will require watching PVC objects and adding their referenced secrets to the node authorizer graph
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will update and clarify.
type PersistentVolumeClaimSpec struct { | ||
|
||
// Optional: NodePublishSecretRef is a reference to the secret object | ||
// containing sensitive information to pass to the CSI driver during |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there has been much discussion of parameters on PVCs in the past, and this is essentially adding that via a referenced secret. for these user-provided credentials, it's worth discussing:
- how the user is expected to discover the structure of the parameters the provisioner expects
- the impact on portability of adding artifacts to a PVC (especially when combined with a default storage class)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this covers user-provided credentials. is it common for the provisioner to need to make available a confidential parameter to the kubelet? if so, I would expect that to be provided on the PV.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great point @liggitt. This would effectively make PVC, an otherwise portable object, potentially non-portable if this new field is added.
In order to keep it completely portable, the credentials should be specified in the PV object only not in the PVC. But we do need to be able to specify credentials unique per workload. Imagine the case of a single NFS share (PV/PVC) that permits multiple users access via different credentials. We should be able to specify that somehow (this use case corresponds to number 3 below).
this covers user-provided credentials. is it common for the provisioner to need to make available a confidential parameter to the kubelet? if so, I would expect that to be provided on the PV.
Maybe something like an encryption key, unique per volume. That would correspond to ControllerPublishSecretRef
, number 2 below.
So, lets break it down this way -- we effectively have three secrets in CSI:
- Create/Delete Volume credential per class of volume
- Grants user of StorageClass ability to provision and delete volumes.
- Secret name/namespace in StorageClass makes sense.
- Attach/Detach Volume credential per volume
- Grants all consumers access to volume, e.g. encryption key for volume
- Secret name/namespace in PV makes sense.
- Mount/Unmount Volume credential per workload
- Grants a specific user of volume ability to consume volume: e.g. a single NFS share (represented by a PVC/PV) that permits multiple users access via different credentials.
- Secret name specified in PVC or pod would be nice, but as you stated that makes PVC non-portable.
For number 3, to achieve both portability of PVC object and the ability to specify workload specific credentials, how about we: remove NodePublishSecretRef *LocalObjectReference
from PVC and move it to CSIPersistentVolumeSource
with the caveat that it is still a LocalObjectReference
and the namespace is assumed to be the namespace of the pod referencing the volume. This will permit both secrets that are unique per volume, as well as secrets that are unique per workload, while maintaining the portability of PVC.
type CSIPersistentVolumeSource struct {
// Secret unique to this volume, e.g. encryption key.
ControllerPublishSecretRef *SecretReference
// Secret unique per consumer, e.g. user credentials
// This secret must exist in the namespace of the pod referencing this volume.
NodePublishSecretRef *LocalObjectReference
Thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And it may help naming these better:
type CSIPersistentVolumeSource struct {
// Secret required by plugin to access the specified volume.
// Secret is unique to this volume, but independent of any consumer.
// Example: volume decryption key
SecretRef *SecretReference
// Secret used to authorize a workload for this volume.
// Unique per consumer.
// Secret must exist in the namespace of the pod referencing this volume.
// Example: user credentials for a volume that that permits multiple user
// access and requires per consumer auth
WorkloadSecretRef *LocalObjectReference
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove NodePublishSecretRef *LocalObjectReference from PVC and move it to CSIPersistentVolumeSource
moving the reference to the PV makes sense to me
with the caveat that it is still a LocalObjectReference and the namespace is assumed to be the namespace of the pod referencing the volume. This will permit both secrets that are unique per volume, as well as secrets that are unique per workload
I would strongly recommend keeping all the secret references in the PV namespace-qualified. "Unique per volume" should not require exposing the secret into the PVC namespace. That enables two use cases:
- a workload-specific secret living in the same namespace as the PVC
- a volume-specific secret living in a provisioner-managed namespace, hidden from the PVC (for example, a per-volume encryption key generated at provision time, placed in the provisioner-managed namespace, and made available to the kubelet responsible for mounting the volume)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a volume-specific secret living in a provisioner-managed namespace, hidden from the PVC (for example, a per-volume encryption key generated at provision time, placed in the provisioner-managed namespace, and made available to the kubelet responsible for mounting the volume)
Volume specific secret being fully qualified with namespace (SecretReference
), I completely agree.
a workload-specific secret living in the same namespace as the PVC
Workload specific secret being fully qualified with namespace, I'm not so sure about. If workload specific secret is a fully qualified with namespace (SecretReference
), then regardless of the consumer, the secret is expected to exist in one namespace which the consumer may or may not have access to. Instead if workload secret is a LocalObjectReference
, the namespace can be implied as always being the namespace of the caller. This allows for multiple consumers of a single PVC, each specifying their workload secret in their own namespace.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we effectively have three secrets in CSI:
- Create/Delete Volume credential per class of volume
- Grants user of StorageClass ability to provision and delete volumes.
- Secret name/namespace in StorageClass makes sense.
- Attach/Detach Volume credential per volume
- Grants all consumers access to volume, e.g. encryption key for volume
- Secret name/namespace in PV makes sense.
- Mount/Unmount Volume credential per workload
- Grants a specific user of volume ability to consume volume: e.g. a single NFS share (represented by a PVC/PV) that permits multiple users access via different credentials.
- Secret name specified in PVC or pod would be nice, but as you stated that makes PVC non-portable.
A few questions:
- kubelets should only have access to
#3
, correct? - is it odd to couple when a secret is needed to the location/visibility of the secret? don't we have secrets used at mount time that are set up by the provisioner today (azure file comes to mind)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@saad-ali Could you please confirm if #3
has nothing to do with CSI, as I have a hard time mapping it to any CSI provided functions and if it is the case, should we limit this proposal to address just #1
and #2
which are relevant to CSI. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@saad-ali Could you please confirm if #3 has nothing to do with CSI, as I have a hard time mapping it to any CSI provided functions and if it is the case, should we limit this proposal to address just #1 and #2 which are relevant to CSI. What do you think?
@liggitt and I chatted offline. What I suggested in 3 won't work because a PVC/PV pair can only exist in a single namespace, and reusing the same volume requires creating a new PVC/PV pair, therefore it does not make sense to try and differentiate per volume vs per workload secrets. However, it enabling different secrets for different CSI operations should be allowed to allow a plugin to control which components have access to a secret. A secret required only on attach should not be required to be passed in on mount, for example (that would require kubelet to also have access to it when only external-attacher needs access).
Based on this discussion I will revise the proposal as follows:
type CSIPersistentVolumeSource struct {
// Secrets required by the CSI driver to complete the ControllerPublishVolume call
// Fetched by external-attacher.
ControllerPublishSecretRef *SecretReference
// Secrets required by the CSI driver to complete the NodeStageVolume call
// Fetched by kubelet.
NodeStageSecretRef *SecretReference
// Secrets required by the CSI driver to complete the NodePublishVolume call
// Fetched by kubelet.
NodePublishSecretRef *SecretReference
}
All of these will be SecretReference
with both namespace and name specified.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that looks good to me:
- maps naturally to the CSI phases
- documents clearly which kubernetes component is expected to work with each piece
- provides the right information for the node authorizer to allow kubelets proper access
- supports a wide variety of use cases
@liggitt |
Good question. The requirement for a secret is specific per storage implementation. If I move my PVC from a cluster who's storage system requires a secret to a cluster that doesn't require one, I shouldn't have to modify my PVC. |
The default storage class allows PVCs to be provisioned without requesting a particular storage class
StorageClass is just a name. It doesn't have to refer to a particular backend (you could request a storage class That's not to say it should never be done (I was very interested in parameterization of storage classes), but it should be done intentionally, with the impact on portability called out, and take into account how users determine what parameters are supported/expected. Related issues/designs: |
|
||
Kubernetes also requires secret string data to be base64 encoded, but CSI does not. | ||
|
||
Therefore, before passing secret data to CSI, Kubernetes (either the core components or helper containers) will convert the secret data from bytes to string (Kubernetes does not specify the character encoding, but Kubernetes internally uses golang to cast from string to byte and vice versa which assumes UTF-8 character set), and the resulting string will be run through base64 decoding to get the plain secret to pass to CSI. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the resulting string will be run through base64 decoding to get the plain secret to pass to CSI
I'm unsure what this means... an example might be useful.
Is the decoding you're referring to the natural decoding golang does of base64-encoded byte slices in the secret data field?
What will the helper do if non-UTF8 byte data is placed in a secret data field that cannot be coerced to a string for CSI?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://kubernetes.io/docs/concepts/configuration/secret/#creating-a-secret-manually says Each item must be base64 encoded
. No such restriction exists on the CSI side (a plain string is allowed).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That documentation only applies to the JSON API. The secret must be base64-encoded when submitted as json because it is allowing for storing binary data in a json string.
If CSI only supports string data, then we just need to define what happens when a referenced secret contains binary data that cannot be represented in a CSI string.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm, maybe I'm importing the problems we had with json-serialized golang strings where clients would corrupt non-utf-8 string data in configmaps.
from https://blog.golang.org/strings:
It's important to state right up front that a string holds arbitrary bytes. It is not required to hold Unicode text, UTF-8 text, or any other predefined format. As far as the content of a string is concerned, it is exactly equivalent to a slice of bytes.
As long as all the CSI API serializations and clients hold the same view of strings (arbitrary bytes, not assumed to be UTF-8), then secret map[string][]byte
-> CSI map[string]string
is lossless.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ack. In that case conversion from byte to string should be all that is needed. I will drop requirement for base64 decode.
17d1381
to
f28cbd4
Compare
@liggitt feedback addressed, PTAL. |
@@ -441,37 +441,143 @@ Alternatively, deployment could be simplified by having all components (includin | |||
|
|||
### CSI Credentials | |||
|
|||
This part of proposal is not going to be implemented in alpha release. | |||
CSI allows specifying credentials in CreateVolume/DeleteVolume, ControllerPublishVolume/ControllerUnpublishVolume, and NodePublishVolume/NodeUnpublishVolume operations. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add NodeStage (and Unstage?) operation pairs here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch, fixing.
AttachSecretRef *SecretReference `json:"attachSecretRef,omitempty" protobuf:"bytes,4,opt,name=attachSecretRef"` | ||
// NodeStageSecretRef is a reference to the secret object containing sensitive | ||
// information to pass to the CSI driver to complete the CSI NodeStageVolume | ||
// and NodeStageVolume calls. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NodeUnstageVolume?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch, fixing.
f28cbd4
to
c19175f
Compare
@liggitt feedback addressed, PTAL. |
LGTM from a sig-auth, node authorizer, and API review perspective |
/lgtm |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: liggitt, saad-ali The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/lgmt |
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>. Adding credentials support for k8s core CSI PR implements changes proposed in: kubernetes/community#1816 ```release-note CSI now allows credentials to be specified on CreateVolume/DeleteVolume, ControllerPublishVolume/ControllerUnpublishVolume, and NodePublishVolume/NodeUnpublishVolume operations ```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>. Adding credentials support for k8s core CSI PR implements changes proposed in: kubernetes/community#1816 ```release-note CSI now allows credentials to be specified on CreateVolume/DeleteVolume, ControllerPublishVolume/ControllerUnpublishVolume, and NodePublishVolume/NodeUnpublishVolume operations ``` Kubernetes-commit: 8e8601a1cbf9400ea15116b1591c03e72dfcb0bb
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>. Adding credentials support for k8s core CSI PR implements changes proposed in: kubernetes/community#1816 ```release-note CSI now allows credentials to be specified on CreateVolume/DeleteVolume, ControllerPublishVolume/ControllerUnpublishVolume, and NodePublishVolume/NodeUnpublishVolume operations ``` Kubernetes-commit: 8e8601a1cbf9400ea15116b1591c03e72dfcb0bb
Introduce CSI secret/credential handling
CSI now allows credentials to be specified on
CreateVolume
/DeleteVolume
,ControllerPublishVolume
/ControllerUnpublishVolume
, andNodePublishVolume
/NodeUnpublishVolume
operations. This PR describes how credentials stored in Kubernetes secrets maybe passed to CSI.