
Conversation

@gnufied
Member

@gnufied gnufied commented Oct 6, 2020

cc @openshift/sig-storage

Comment on lines 71 to 74
Currently, while deploying OpenShift, a user can configure the datastore used by OCP via install-config.yaml. The vSphere CSI driver, however, can't use a datastore directly and must be configured with a vSphere storage policy. This can be done with an optional field in the installer's `vsphere.Platform` type; if populated, the vSphere CSI driver operator will create a storageClass with the selected vSphere storage policy.

One downside of having the storage policy as an optional field is that when the in-tree vSphere driver is removed from Kubernetes, clusters without a default storage policy configured in OCP will come up with no default storageClass installed and will require manual creation of storageClasses.
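For illustration only, a minimal sketch of what such an optional field could look like in install-config.yaml; the `storagePolicy` field name and all values are assumptions, not part of the current installer API:

```yaml
# Hypothetical install-config.yaml excerpt. `storagePolicy` is an illustrative
# name for the proposed optional field; the other fields are existing
# vsphere platform settings.
platform:
  vsphere:
    vCenter: vcenter.example.com
    username: admin
    password: secret
    datacenter: dc1
    defaultDatastore: datastore1
    # Optional (proposed): storage policy the CSI driver operator would use
    # for the default StorageClass.
    storagePolicy: openshift-storage-policy
```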
Contributor

Please note explicitly that we may need to make the storage policy mandatory in a future 4.x release. In the same way, the datastore field may become optional and eventually deprecated and removed.

Member Author

The datastore field may still be needed to configure the location where data files for VMs will be stored. cc @jcpowermac

Contributor

The datastore field may still be needed to configure the location where data files for VMs will be stored. cc @jcpowermac

Exactly, it can't be removed. It's used in MAO and the installer's terraform.

Contributor

I don't like users configuring storage for PVs and for the installer/machine API in two different ways. Would it be possible to create a storagePolicy from the known datastore? Or the other way around? Or update terraform / machine API to use the storagePolicy, if provided?

Member Author

Theoretically it is possible to create a storagePolicy from a datastore. We could create a tag, apply the tag to the datastore, and then create a storagePolicy that selects datastores carrying that tag. If we are planning to do this via the installer, then terraform support would be needed for it.

@jsafrane were you thinking of doing this from the operator?

Contributor

@gnufied we already use tags for virtual machines; no reason we can't extend that to include datastores.

https://github.com/openshift/installer/blob/master/data/data/vsphere/main.tf#L54-L70

Contributor

Tagging a datastore & creating a storage policy at install time sounds great. Thinking about the consequences:

  • We will stick to the datastore in install-config and users can't ever use their own storage policy for VMs.
  • The default storage class will also use only that datastore.
    • If a user wants a special storage policy, they must create their own StorageClass.

Works for me.

Contributor

  • OCP will also need permissions to create storage policies. This may be problematic.


There are certain aspects of the driver which require special handling:

1. When the vSphere CSI operator starts, using credentials provided by cluster-storage-operator, it will first verify the vCenter version and the HW version of the VMs. If the vCenter version is not 6.7u3 or greater and the HW version is not 15 or greater on all VMs, it will set `vSphereOperatorDisabled: true` and stop further processing.
Contributor

What if a new VM with HW ver < 15 is added later? Should it uninstall the existing driver too? I would expect that the driver becomes Available=False, maybe even Degraded=True with an alert or something, but the operator should keep resyncing periodically.

It could even be useful to label nodes that are able to run the driver and run the driver only there.

Member Author

Yeah, that is a good point. I think we will want to mark the operator as Degraded, but it would still keep resyncing, and existing pods/volumes should keep working. But it also means that if a pod wants to be scheduled to a node which is on HW < 15, then that node can't run the pod.

Member Author

Added some language about the degraded driver status when this happens.

To solve this, the following alternatives were considered:

* We could provide manual instructions to the user about updating the hardware version of their VMs. This won't be automatic, and in some cases may not even be possible, because some existing VMs can't be upgraded to HW version 15 in-place.
* We could update the installer to set HW version 15 when an appropriate hypervisor version (6.7u3 or higher) is detected.
Contributor

What if the vSphere cluster has heterogeneous hosts and some support ver 15 and some only older versions?

Contributor

We can check the vCenter version and all ESXi host versions for the specific vSphere cluster that is configured in install-config.yaml. Not 100% sure what the appropriate outcome of that information gathering should be, though.


* We could provide manual instructions to the user about updating the hardware version of their VMs. This won't be automatic, and in some cases may not even be possible, because some existing VMs can't be upgraded to HW version 15 in-place.
* We could update the installer to set HW version 15 when an appropriate hypervisor version (6.7u3 or higher) is detected.
* For UPI installs, we can additionally document the required HW version for creating VMs.
Contributor

I agree this should just be documentation.

### Risks and Mitigations

* We don't yet know the state of the vSphere CSI driver. We need to start running e2e tests with the vSphere driver as early as possible so we can determine how stable it is.
* We have some concerns about the datastore not being supported in the storageClass anymore. This means that in the future, when the in-tree driver is removed, clusters without a storagePolicy will become unupgradeable.
Contributor

Could the operator see this condition and resolve it?

Contributor

Is there a way to reliably pick a storagePolicy by listing the vSphere API? Does the operator have permissions to do so? How would it find the right policy there?
IMO, the operator can make the cluster unupgradeable and wait for the user to provide the policy.

@gnufied
Member Author

gnufied commented Oct 6, 2020

/assign @abhinavdahiya @jsafrane @jcpowermac

@squeed
Contributor

squeed commented Oct 7, 2020

/unassign


There are certain aspects of the driver which require special handling:

1. When the vSphere CSI operator starts, using credentials provided by cluster-storage-operator, it will first verify the vCenter version and the HW version of the VMs. If the vCenter version is not 6.7u3 or greater and the HW version is not 15 or greater on all VMs, it will set `vSphereOperatorDisabled: true` and stop further processing. If additional VMs are added later to the cluster and they do not have HW version 15 or greater, the operator will mark itself as `degraded`, and nodes which don't have the right HW version will have the annotation `vsphere.driver.status.csi.openshift.io: degraded` added to them.
Contributor

The way this reads is that if we don't detect the correct vCenter version or HW version, then we disable the vSphere Operator and stop processing, AND THEN if additional VMs are added later, the Operator will mark itself as degraded.

Perhaps these two cases should be broken into two different scenarios?

Member Author

I pushed an update to address this.

1. When the vSphere CSI operator starts, using credentials provided by cluster-storage-operator, it will first verify the vCenter version and the HW version of the VMs. If the vCenter version is not 6.7u3 or greater and the HW version is not 15 or greater on all VMs, it will set `vSphereOperatorDisabled: true` and stop further processing. If additional VMs are added later to the cluster and they do not have HW version 15 or greater, the operator will mark itself as `degraded`, and nodes which don't have the right HW version will have the annotation `vsphere.driver.status.csi.openshift.io: degraded` added to them.

2. Since VMware will ship its own vSphere CSI driver operator via OLM, it should be possible to disable the default cluster-storage-operator managed operator (i.e. this operator). Currently we are planning to let the user set `ManagementState: Removed` in the `ClusterCSIDriver` CR, which will cause the operand (i.e. the CSI driver) to be removed, but the operator will still be running. This will allow the user to use external vSphere CSI operators.
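As a sketch of the quoted mechanism (the object name follows the convention of naming the `ClusterCSIDriver` CR after the CSI driver; the rest is illustrative):

```yaml
# Disabling the default operand: the operator keeps running, but removes the
# CSI driver it manages.
apiVersion: operator.openshift.io/v1
kind: ClusterCSIDriver
metadata:
  name: csi.vsphere.vmware.com
spec:
  managementState: Removed
```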
Contributor

Should this mention also be included in Disabling the Operator?


There are certain aspects of the driver which require special handling:

1. When the vSphere CSI operator starts, using credentials provided by cluster-storage-operator, it will first verify the vCenter version and the HW version of the VMs. If the vCenter version is not 6.7u3 or greater and the HW version is not 15 or greater on all VMs, it will set `vSphereOperatorDisabled: true` and stop further processing. If additional VMs are added later to the cluster and they do not have HW version 15 or greater, the operator will mark itself as `degraded`, and nodes which don't have the right HW version will have the annotation `vsphere.driver.status.csi.openshift.io: degraded` added to them.
Contributor

Using HW ver < 15 should trigger an alert, but the cluster should not become degraded, at least in 4.7.

4.7 recommends using HW ver. 15, but it works with older versions. At some point we will require ver. 15, but the exact OCP version is not set yet. Users need to update the VMs manually and fix any issues with it. Good documentation is required.

Member Author

I think we want uniform behaviour between the operator noticing a VM below HW version 15 at startup vs. noticing one afterwards. In both cases, we ideally want the same thing to happen.

I am thinking of snowflaking this degraded status in CSO, so that we don't degrade the cluster.

Scenario #1: The vSphere operator starts and notices one or more VMs are not on HW >= 15. It marks itself as degraded but starts the controller and daemonset pods anyway. Nodes which don't have the matching HW version are labelled.

Scenario #2: The vSphere operator notices, during the normal course of operation, that there are VMs which are not on HW >= 15. It marks itself as degraded, but nodes which don't have the matching HW version will still have driver pods running.

@huffmanca this also kinda addresses the point you raised above. I don't think it is a good idea to treat these two cases differently. I do understand this will result in pods silently failing to come up, or, if the controller side of the driver got scheduled on a node with HW < 15, attach/detach and volume provisioning won't work. But if we base our scheduling decisions around the HW version of nodes, it will get very complicated.
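For illustration, the per-node marking described above could look like this on a Node object (the annotation key is taken from the proposal text; the node name is made up):

```yaml
apiVersion: v1
kind: Node
metadata:
  name: worker-0   # illustrative node name
  annotations:
    # Added when the VM backing this node is below HW version 15.
    vsphere.driver.status.csi.openshift.io: degraded
```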

3. The operator will get the RBAC rules necessary for running the operator and its operand (i.e. the actual vSphere CSI driver).
4. A deployment will be created for starting the operator and will be managed by cluster-storage-operator.
5. An instance of `ClusterCSIDriver` will be created to facilitate management of the driver operator. `ClusterCSIDriver` is already defined in https://github.com/openshift/api/blob/master/operator/v1/types_csi_cluster_driver.go but needs to be expanded to include the vSphere CSI driver.
6. cluster-storage-operator will create the required cloud-credentials for talking to the vCenter API.
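A hedged sketch of the kind of CredentialsRequest that could carry the vCenter credentials for the driver operator; the names, namespaces, and provider spec details are illustrative and not taken from the enhancement:

```yaml
apiVersion: cloudcredential.openshift.io/v1
kind: CredentialsRequest
metadata:
  name: openshift-vsphere-csi-driver-operator   # illustrative
  namespace: openshift-cloud-credential-operator
spec:
  serviceAccountNames:
    - vsphere-csi-driver-operator               # illustrative
  secretRef:
    name: vsphere-cloud-credentials             # illustrative
    namespace: openshift-cluster-csi-drivers    # illustrative
  providerSpec:
    apiVersion: cloudcredential.openshift.io/v1
    kind: VSphereProviderSpec
```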
Member

Unfortunately cluster-storage-operator can't create those creds; they should be created by cluster-version-operator regardless of the platform. cloud-credentials-operator might be running in manual mode, and users should be able to create the creds manually.

Member

What about the StorageClass?

Member Author

@gnufied gnufied Oct 27, 2020

Yes, by CSO I meant that it is CSO that will carry those YAMLs and ensure that CVO creates them, the same way we do for other cloud providers. My intention is exactly what you say, but the wording needs fixing I think.

Also, for the storageClass: the SC will be created by the operator itself.

2. A service account will be created for running the operator.
3. The operator will get the RBAC rules necessary for running the operator and its operand (i.e. the actual vSphere CSI driver).
4. A deployment will be created for starting the operator and will be managed by cluster-storage-operator.
5. An instance of `ClusterCSIDriver` will be created to facilitate management of the driver operator. `ClusterCSIDriver` is already defined in https://github.com/openshift/api/blob/master/operator/v1/types_csi_cluster_driver.go but needs to be expanded to include the vSphere CSI driver.
Member

nit: the CR is what flags that the operator should install the driver in the first place.

4. A Deployment will be created and managed by the operator to run the control-plane sidecars and the controller side of the CSI driver services.
5. A DaemonSet will be created and managed by the operator to run the node side of driver operations.
6. A `CSIDriver` object will be created to expose driver features to the control plane (a sketch follows below).
7. The driver operator will use and expose the cloud-credentials created by cluster-storage-operator.
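A sketch of the `CSIDriver` object mentioned in item 6; the driver name matches the vSphere CSI driver, while the spec values shown here are illustrative assumptions:

```yaml
apiVersion: storage.k8s.io/v1
kind: CSIDriver
metadata:
  name: csi.vsphere.vmware.com
spec:
  attachRequired: true        # vSphere volumes go through the attach/detach flow
  podInfoOnMount: false
  volumeLifecycleModes:
    - Persistent
```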
Member

s/CSO/CVO/g
or by the user when CCO is disabled (manual mode)

Member Author

@gnufied gnufied Oct 27, 2020

updated

- The operator will continue installing the CSI driver without creating the storageClass. The operation of creating the storagePolicy and storageClass will, however, be retried, expecting permissions to be fixed or transient errors to go away.
- It is expected that the user will manually create a storageClass to consume vSphere CSI volumes if the CSI operator cannot create the storagePolicy and storageClass (see the sketch after this list).
- The CSI operator will mark itself as degraded, but CSO itself will not be degraded; the vSphere CSI operator's degraded status must be handled in a different way in CSO.
- Create a metric if storagePolicy and storageClass creation fails in 4.7.
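A hedged example of such a manually created StorageClass; the class name and policy name are illustrative, and `storagepolicyname` is the vSphere CSI driver's StorageClass parameter for selecting a storage policy:

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: thin-csi                                        # illustrative name
  annotations:
    storageclass.kubernetes.io/is-default-class: "true" # make it the default class
provisioner: csi.vsphere.vmware.com
parameters:
  storagepolicyname: "openshift-storage-policy"         # illustrative policy name
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
```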
Member

Can something else, like the installer, create the storagePolicy? If the operator could assume the storagePolicy exists, it wouldn't have to do any juggling with conditions nor potentially leave the cluster without a default StorageClass.

If creating the storagePolicy somewhere else is not an option:

The ultimate goal of the vsphere-csi-driver-operator should be creating a StorageClass in the cluster that users can use to provision their volumes. If for some reason it lacks any permissions to achieve such a goal, IMO it shouldn't leave the cluster with part of the work done and expect the user to fill in the gaps.

IMO it's reasonable to expect the operator to have such permissions in the cluster.

Member Author

@gnufied gnufied Oct 27, 2020

We discussed this in one of the design calls. During an upgrade there is no installer running, and hence it must be the vSphere CSI operator that creates the storagePolicy.

The operator is not going to stop at just the storagePolicy; it will also create the storageClass. I will update the wording.


Currently, while deploying OpenShift, a user can configure the datastore used by OCP via install-config.yaml. The vSphere CSI driver, however, can't use a datastore directly and must be configured with a vSphere storage policy.

To solve this problem, the vSphere CSI operator is going to create a storagePolicy by tagging the selected datastore in the installer. This will require OCP to have expanded permissions to create storagePolicies. After creating the storagePolicy, the vSphere CSI operator will also create the corresponding storageClass.
Contributor

tagging the selected datastore in the installer

  1. Tagging the datastore is not mentioned in the installer chapter above.
  2. The installer is not running during an upgrade. To upgrade a running cluster, something must tag the datastore, and I would expect it's the CSI driver operator.

Member Author

That is why we moved the tagging and storagePolicy creation work to the operator and not the installer.

@jsafrane
Contributor

jsafrane commented Nov 5, 2020

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Nov 5, 2020
@openshift-ci-robot

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: gnufied, jsafrane

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci-robot openshift-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Nov 5, 2020
@openshift-merge-robot openshift-merge-robot merged commit 8049956 into openshift:master Nov 5, 2020