DigitalOcean CSI E2E Test Failures #14004

Closed
rifelpet opened this issue Jul 19, 2022 · 4 comments · Fixed by #14005
Labels
kind/bug Categorizes issue or PR as related to a bug.

Comments

@rifelpet
Member

/kind bug

The DigitalOcean e2e job has a few CSI StatefulSet test failures:

https://testgrid.k8s.io/kops-misc#e2e-kops-do-calico

The statefulset pods are failing to be created:

STEP: Creating statefulset ss in namespace statefulset-6052
Jul 18 21:16:18.031: INFO: Default storage class: "do-block-storage"
STEP: Saturating stateful set ss
Jul 18 21:16:18.132: INFO: Waiting for stateful pod at index 0 to enter Running
Jul 18 21:16:18.229: INFO: Waiting for pod ss-0 to enter Running - Ready=false, currently Pending - Ready=false
Jul 18 21:16:28.328: INFO: Waiting for pod ss-0 to enter Running - Ready=false, currently Pending - Ready=false
...

The kube-scheduler logs report:

I0718 21:16:18.099175 11 scheduler.go:351] "Unable to schedule pod; no fit; waiting" pod="statefulset-6052/ss-0" err="0/5 nodes are available: 5 pod has unbound immediate PersistentVolumeClaims. preemption: 0/5 nodes are available: 5 Preemption is not helpful for scheduling."

and the CSI Controller logs report:

I0718 21:16:49.114169 1 event.go:282] Event(v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"statefulset-6052", Name:"datadir-ss-0", UID:"2010fb26-1eae-40fe-b4a0-36af33161d20", APIVersion:"v1", ResourceVersion:"15317", FieldPath:""}): type: 'Warning' reason: 'ProvisioningFailed' failed to provision volume with StorageClass "do-block-storage": rpc error: code = OutOfRange desc = invalid capacity range: required (1) can not be less than minimum supported volume size (1Gi)

The e2e test creates a volume claim template with a storage request of exactly 1 byte, which is below the 1Gi minimum the DigitalOcean controller reports.
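To illustrate why the 1-byte claim is rejected, here is a minimal Python sketch of a strict minimum-size check that would produce an error like the one in the controller log. It is not the DigitalOcean driver's code (the driver is written in Go); `validate_required_bytes` and `OutOfRangeError` are hypothetical names, and the 1Gi minimum is taken from the error message above.

```python
GIB = 1024 ** 3
# Minimum volume size, as reported in the ProvisioningFailed message (1Gi).
MIN_VOLUME_BYTES = 1 * GIB


class OutOfRangeError(ValueError):
    """Hypothetical stand-in for a gRPC OutOfRange status."""


def validate_required_bytes(required_bytes: int, minimum: int = MIN_VOLUME_BYTES) -> int:
    """Reject any request smaller than the minimum instead of rounding it up."""
    if required_bytes < minimum:
        raise OutOfRangeError(
            f"invalid capacity range: required ({required_bytes}) "
            f"can not be less than minimum supported volume size ({minimum})"
        )
    return required_bytes


if __name__ == "__main__":
    try:
        validate_required_bytes(1)  # the e2e test's 1-byte request
    except OutOfRangeError as err:
        print(f"rejected: {err}")
```

With this kind of check, the PVC never binds, which matches the scheduler reporting unbound PersistentVolumeClaims.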

For reference, the equivalent AWS test passes, with these controller logs:

I0718 23:36:23.449330       1 connection.go:184] GRPC request: {"accessibility_requirements":{"preferred":[{"segments":{"topology.ebs.csi.aws.com/zone":"ap-southeast-1a"}}],"requisite":[{"segments":{"topology.ebs.csi.aws.com/zone":"ap-southeast-1a"}}]},"capacity_range":{"required_bytes":1},"name":"pvc-c72f1ca3-65dd-4056-bfb5-5e765f3a72a9","parameters":{"csi.storage.k8s.io/pv/name":"pvc-c72f1ca3-65dd-4056-bfb5-5e765f3a72a9","csi.storage.k8s.io/pvc/name":"datadir-ss-0","csi.storage.k8s.io/pvc/namespace":"statefulset-5710","encrypted":"true","type":"gp3"},"volume_capabilities":[{"AccessType":{"Mount":{"fs_type":"ext4"}},"access_mode":{"mode":1}}]}
...
I0718 23:36:26.815328       1 connection.go:186] GRPC response: {"volume":{"accessible_topology":[{"segments":{"topology.ebs.csi.aws.com/zone":"ap-southeast-1a"}}],"capacity_bytes":1073741824,"volume_id":"vol-0648af63ddb05af0c"}}
...
I0718 23:36:26.815872       1 controller.go:858] successfully created PV pvc-c72f1ca3-65dd-4056-bfb5-5e765f3a72a9 for PVC datadir-ss-0 and csi volume name vol-0648af63ddb05af0c

The "required_bytes":1 in the request and "capacity_bytes":1073741824 (1 GiB) in the response suggest that the AWS controller falls back to the minimum EBS volume size for claims that request fewer bytes than that. I'm wondering if the DO controller should do the same.
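The behavior inferred from the AWS logs can be sketched as clamping the request up to the minimum rather than rejecting it. This is a Python illustration only, not code from either CSI driver; `effective_capacity_bytes` is a hypothetical name, and the 1 GiB minimum is an assumption based on the `capacity_bytes` value in the response above.

```python
GIB = 1024 ** 3


def effective_capacity_bytes(required_bytes: int, minimum_bytes: int = GIB) -> int:
    """Return the size to provision: the request, raised to the minimum if smaller."""
    if required_bytes < 0:
        raise ValueError("required_bytes must be non-negative")
    return max(required_bytes, minimum_bytes)


if __name__ == "__main__":
    # The e2e test's 1-byte request is provisioned at the minimum size.
    print(effective_capacity_bytes(1))
    # Requests above the minimum pass through unchanged.
    print(effective_capacity_bytes(5 * GIB))
```

The trade-off is that the user gets a larger volume than requested, which Kubernetes permits since a PV only needs to be at least as large as the claim.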

/cc @srikiz @timoreimann if you prefer to track this in the CSI driver repo, I'm happy to open an issue there.

@k8s-ci-robot added the kind/bug label Jul 19, 2022
@hakman
Member

hakman commented Jul 19, 2022

@rifelpet This should already be fixed in the latest release https://github.com/digitalocean/csi-digitalocean/releases/tag/v4.2.0 by digitalocean/csi-digitalocean#441.
Please correct me if I'm wrong @srikiz.

@srikiz
Contributor

srikiz commented Jul 19, 2022

Yes @hakman - updating to the new CSI controller should fix it.

@hakman
Member

hakman commented Jul 19, 2022

Thanks for confirming @srikiz.

@rifelpet
Member Author

Confirmed this fixed the e2e failure 👍🏻
https://prow.k8s.io/view/gs/kubernetes-jenkins/logs/e2e-kops-do-calico/1549377185301663744

4 participants