kubectl exec -it <podname> -- df -h shows size of 60Z #108

Open
somethingwikid opened this issue Dec 3, 2021 · 19 comments

@somethingwikid

I followed the instructions on this page. When I got to testing the deployment, I received the following output from:

kubectl exec -it deployment-localdisk-6f95f4f858-bjkms -- df -h
Filesystem Size Used Avail Use% Mounted on
overlay 124G 23G 102G 19% /
tmpfs 64M 0 64M 0% /dev
tmpfs 32G 0 32G 0% /sys/fs/cgroup
/dev/sdb1 60Z 60Z 0 100% /mnt/localdisk
/dev/sda1 124G 23G 102G 19% /etc/hosts
shm 64M 0 64M 0% /dev/shm
tmpfs 32G 12K 32G 1% /run/secrets/kubernetes.io/serviceaccount
tmpfs 32G 0 32G 0% /proc/acpi
tmpfs 32G 0 32G 0% /proc/scsi
tmpfs 32G 0 32G 0% /sys/firmware

I connected to the pod and tried to make a directory at the mount path /mnt/localdisk:
mkdir test
mkdir: cannot create directory 'test': Structure needs cleaning
The logs show many of these:
/bin/sh: 1: cannot create /mnt/localdisk/outfile: Structure needs cleaning
/bin/sh: 1: cannot create /mnt/localdisk/outfile: Structure needs cleaning
/bin/sh: 1: cannot create /mnt/localdisk/outfile: Structure needs cleaning
/bin/sh: 1: cannot create /mnt/localdisk/outfile: Structure needs cleaning
/bin/sh: 1: cannot create /mnt/localdisk/outfile: Structure needs cleaning
/bin/sh: 1: cannot create /mnt/localdisk/outfile: Structure needs cleaning
/bin/sh: 1: cannot create /mnt/localdisk/outfile: Structure needs cleaning
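
For context, "Structure needs cleaning" is the message Linux attaches to errno EUCLEAN, which ext4 and XFS return when they detect on-disk corruption. The following is a minimal sketch, assuming a Linux host with glibc, that looks the message up in Python's errno table. The helper name errno_names_for is made up for illustration and is not part of any tool used in this thread.

```python
import errno
import os


def errno_names_for(message):
    """Return the symbolic errno names whose OS message equals `message`."""
    return sorted(
        name
        for code, name in errno.errorcode.items()
        if os.strerror(code) == message
    )


if __name__ == "__main__":
    # On Linux with glibc this is expected to include EUCLEAN;
    # other platforms may return an empty list.
    print(errno_names_for("Structure needs cleaning"))
```

Seeing this error on both mkdir and shell redirection suggests the kernel is rejecting the filesystem metadata itself, not a permission or quota problem.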

@somethingwikid
Author

AKS Kubernetes version 1.21.2

@somethingwikid
Author

Node type: E8ds_v4

@andyzhangx
Collaborator

What's your disk size? It looks like the file system is corrupted.

@andyzhangx
Collaborator

@somethingwikid
Author

I am on AKS. If I drop the deployment and re-establish it, it's supposed to repartition and reformat the disk, correct? This doesn't seem to be happening. Can I drop the partition from a pod and recreate it?

@somethingwikid
Author

df
Filesystem 1K-blocks Used Available Use% Mounted on
overlay 129900528 24193776 105690368 19% /
tmpfs 65536 0 65536 0% /dev
tmpfs 32934120 0 32934120 0% /sys/fs/cgroup
/dev/sdb1 69679034472154449220 69679034471845929516 0 100% /mnt/localdisk
/dev/sda1 129900528 24193776 105690368 19% /etc/hosts
shm 65536 0 65536 0% /dev/shm
tmpfs 32934120 12 32934108 1% /run/secrets/kubernetes.io/serviceaccount
tmpfs 32934120 0 32934120 0% /proc/acpi
tmpfs 32934120 0 32934120 0% /proc/scsi
tmpfs 32934120 0 32934120 0% /sys/firmware
root@deployment-localdisk-6f95f4f858-jct9d:/# e2fsck /dev/sdb1
e2fsck 1.46.2 (28-Feb-2021)
/dev/sdb1 is mounted.

WARNING!!! The filesystem is mounted. If you continue you WILL
cause SEVERE filesystem damage.

Do you really want to continue? yes
e2fsck: No such file or directory while trying to open /dev/sdb1
Possibly non-existent device?
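
One possible reading of the output above: df prints /dev/sdb1 because that is the source recorded in the mount table, while e2fsck fails because no such device node is visible at that path inside the container. This is an assumption and is not confirmed anywhere in the thread. A small sketch to check it from inside the pod (describe_device is a hypothetical helper name):

```python
import os
import stat


def describe_device(path):
    """Report whether `path` exists and, if so, whether it is a block device node."""
    try:
        mode = os.stat(path).st_mode
    except FileNotFoundError:
        return "missing"
    return "block device" if stat.S_ISBLK(mode) else "not a block device"


if __name__ == "__main__":
    # Run inside the pod; "missing" would match the e2fsck error above.
    print(describe_device("/dev/sdb1"))
```

If the node is missing in the pod, any filesystem check would have to run where the device is actually exposed.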

@somethingwikid
Author

This seems like something the provisioner should be handling. I don't seem to be able to do anything about it.

@somethingwikid
Author

su -
root@deployment-localdisk-687ccc64f6-4vgs2:~# umount /dev/sdb1
umount: /mnt/localdisk: must be superuser to unmount.

@somethingwikid
Author

Also, it's a 300G disk, but I reserved 299.

@somethingwikid
Author

I also tried an AKS debug pod. That didn't work either.

root@aks-default2-35939363-vmss000000:/# su -
root@aks-default2-35939363-vmss000000:~# umount /dev/sdb1
umount: /host/var/lib/kubelet/plugins/kubernetes.io/local-volume/mounts/local-pv-41ef20bb: must be superuser to unmount.

It seems I cannot fix the issue on the node. Setting up a new node pool every time this happens is less than desirable.
The problem is either in the Kubernetes version or in this driver.

@andyzhangx
Collaborator

Could you provide the output of kubectl logs local-volume-provisioner-xxx -n kube-system on that agent node? Thanks.

@somethingwikid
Author

provisioner_node2.log
Here is the log. I basically followed the example, then deleted the deployment, PVC, and PV. That left the mount point for the PV on the node in a corrupted state at /host/var/lib/kubelet/plugins/kubernetes.io/local-volume/mounts/local-pv-d655187b

ls
ls: reading directory '.': Structure needs cleaning

@somethingwikid
Author

It looks like the provisioner is still trying to clean /dev/sdb1 with the shred script, so I guess I just have to wait. The Go program is also scanning every file under /dev, which seems to be slowing it down considerably.
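
The provisioner log lines quoted in this thread report each /dev entry that "does not match pattern(sdb1*)". A rough sketch of that filtering step, assuming the pattern follows ordinary glob semantics (the provisioner's actual matching code is not shown here, and matching_entries is an illustrative name):

```python
import fnmatch


def matching_entries(entries, pattern):
    """Return the directory entries whose names match a glob pattern."""
    return [name for name in entries if fnmatch.fnmatchcase(name, pattern)]


if __name__ == "__main__":
    # Entry names taken from the log lines in this thread, plus sdb and sdb1.
    dev_entries = ["vga_arbiter", "termination-log", "isst_interface", "sdb", "sdb1"]
    print(matching_entries(dev_entries, "sdb1*"))
```

Under this assumption, pointing discovery at /dev means every entry is tested and logged on each pass, even though only sdb1 can ever match.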

@somethingwikid
Author

The cleanup finished. I waited a few minutes after that and then redeployed, with the same result.
I1206 19:08:29.589363 1 discovery.go:287] file(vga_arbiter) under(/dev) does not match pattern(sdb1*)
I1206 19:08:30.002249 1 deleter.go:319] Cleanup pv "local-pv-d655187b": StderrBuf - "shred: /dev/sdb1: pass 3/3 (000000)...296GiB/300GiB 98%"
I1206 19:08:35.002239 1 deleter.go:319] Cleanup pv "local-pv-d655187b": StderrBuf - "shred: /dev/sdb1: pass 3/3 (000000)...298GiB/300GiB 99%"
I1206 19:08:39.139795 1 deleter.go:319] Cleanup pv "local-pv-d655187b": StderrBuf - "shred: /dev/sdb1: pass 3/3 (000000)...300GiB/300GiB 100%"
I1206 19:08:39.140383 1 deleter.go:283] Completed cleanup of pv "local-pv-d655187b"
I1206 19:08:39.589961 1 discovery.go:287] file(termination-log) under(/dev) does not match pattern(sdb1*)
I1206 19:08:39.590022 1 discovery.go:394] Found new volume at host path "/dev/sdb1" with capacity 322120450048, creating Local PV "local-pv-d655187b", required volumeMode "Filesystem"
I1206 19:08:39.605529 1 discovery.go:428] Created PV "local-pv-d655187b" for volume at "/dev/sdb1"
I1206 19:08:39.605556 1 discovery.go:287] file(isst_interface) under(/dev) does not match pattern(sdb1*)
I1206 19:08:39.605561 1 cache.go:55] Added pv "local-pv-d655187b" to cache

kubectl exec -it deployment-localdisk-6f95f4f858-twx7s -- df -h
Filesystem Size Used Avail Use% Mounted on
overlay 124G 21G 103G 17% /
tmpfs 64M 0 64M 0% /dev
tmpfs 32G 0 32G 0% /sys/fs/cgroup
/dev/sdb1 64Z 64Z 295G 100% /mnt/localdisk
/dev/sda1 124G 21G 103G 17% /etc/hosts
shm 64M 0 64M 0% /dev/shm
tmpfs 32G 12K 32G 1% /run/secrets/kubernetes.io/serviceaccount
tmpfs 32G 0 32G 0% /proc/acpi
tmpfs 32G 0 32G 0% /proc/scsi
tmpfs 32G 0 32G 0% /sys/firmware
logs_removepvc.log
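
As a sanity check on the numbers in this thread, the sketch below compares the 1K-block count that df printed for /dev/sdb1 earlier with the capacity the provisioner logged for the same device. Both constants are copied from the output above; the helper to_unit is illustrative only.

```python
KIB = 2**10
GIB = 2**30
ZIB = 2**70

# "1K-blocks" value df printed for /dev/sdb1 earlier in the thread.
df_1k_blocks = 69679034472154449220
# Capacity in bytes the provisioner logged when it rediscovered /dev/sdb1.
device_bytes = 322120450048


def to_unit(num_bytes, unit):
    """Convert a byte count to the given binary unit."""
    return num_bytes / unit


if __name__ == "__main__":
    fs_bytes = df_1k_blocks * KIB
    print(f"filesystem claims: {to_unit(fs_bytes, ZIB):.1f} ZiB")
    print(f"device holds:      {to_unit(device_bytes, GIB):.1f} GiB")
    print(f"ratio:             {fs_bytes / device_bytes:.2e}")
```

A filesystem cannot be larger than the partition it lives on, so a size in the zebibyte range on a roughly 300 GiB partition points to invalid superblock contents, which is consistent with the corruption suspected earlier in the thread.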

@somethingwikid
Author

So I see two issues:
The scan of everything under /dev.
The shred script wiping the volume without repartitioning and formatting it afterwards.

Is it appropriate on AKS E8ds_v4 nodes to be using the following?
hostDir: /dev
mountDir: /dev

@somethingwikid
Author

It seems to contradict what I am seeing as the mount path on the node:
/host/var/lib/kubelet/plugins/kubernetes.io/local-volume/mounts/local-pv-d655187b

@somethingwikid
Author

somethingwikid commented Dec 6, 2021

The mount path for the temp disk appears to be this on a non-provisioned node:
lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 128G 0 disk
|-sda1 8:1 0 127.9G 0 part /host
|-sda14 8:14 0 4M 0 part
`-sda15 8:15 0 106M 0 part /host/boot/efi
sdb 8:16 0 300G 0 disk
`-sdb1 8:17 0 300G 0 part /host/mnt

@somethingwikid
Author

I have not heard anything about this issue in quite a while. I am kind of stuck in terms of using tempdb space and NVMe space until I understand how to resolve this or how to configure it correctly. If there is a patch coming, do you have any kind of ETA on it?

@somethingwikid
Author

somethingwikid commented Dec 14, 2021

I am following these instructions:
https://github.com/Azure/kubernetes-volume-drivers/tree/master/local
Anything yet?
