Velero+restic backup gets stuck when NFS persistent volume is included #2721
Hi @guimenezes, give it a go and run another backup, then run "velero backup describe <name> --details". At the bottom you should see the backup of your PVC.
Hi, thanks for your response. This is my installation command now: However, the backup now seems to be stuck in the "New" state:
Pods:
kubectl get pods -n velero
kubectl logs deployment/velero -n velero
kubectl logs restic-w8bjc -n velero
Does any known issue ring a bell? Please let me know what other information to provide. I am sure I am missing something here.
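As an aside, one quick way to see which phase a backup is wedged in is to pull the Phase line out of the describe output. A minimal sketch (the backup name and the output below are invented; in a real cluster you would capture the actual `velero backup describe` output instead of the heredoc string):

```shell
# Hypothetical output of `velero backup describe my-backup`; in a real cluster
# you would use: describe_output=$(velero backup describe my-backup)
describe_output='Name:       my-backup
Namespace:  velero
Phase:      New'

# Extract the phase. A backup that never leaves "New" usually means the Velero
# server never picked it up; one stuck "InProgress" points at the restic side.
phase=$(printf '%s\n' "$describe_output" | awk '/^Phase:/ {print $2}')
echo "Backup phase: $phase"
```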
I still think there are issues with your install command, specifically the s3 parameters section. Here's a blanked version of mine; note the quotation marks around the s3 parameters and also the region setting. I use a cert for backups over HTTPS, but you don't need that if you're going over HTTP. I'm also using an internal registry, hence the repo flags, which you don't need either. velero install
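Since the scraped comment lost the actual command, here is a hedged sketch of the kind of invocation being described. Every value (bucket name, plugin version, Minio URL) is a placeholder, not the commenter's real configuration; the point is the quoting around --backup-location-config, which keeps the comma-separated key=value pairs as one shell argument:

```shell
velero install \
  --provider aws \
  --plugins velero/velero-plugin-for-aws:v1.1.0 \
  --bucket velero \
  --secret-file ./credentials-velero \
  --use-restic \
  --backup-location-config "region=minio,s3ForcePathStyle=true,s3Url=http://minio.example.com:9000"
```

Without the quotes, an unescaped comma or expansion in the s3Url can split the argument and silently misconfigure the backup storage location.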
Thanks. Honestly, I believe the s3 configuration is fine, since I can back up other resources as long as the pod is not annotated with the NFS PVC. I was able to make some progress, though. After ssh'ing into the minikube VM, I listed the contents of /var/lib/kubelet/pods/ to see if I could access the NFS mount point manually. It seems my commands get stuck when trying to contact the NFS server from inside the minikube VM:
However, I do see that the actual MySQL pod can access it normally to write its data. This is the NFS partition on the NFS server:
So it sounds like restic is stuck while trying to access this NFS-mounted directory inside the minikube VM; my manual commands hit exactly the same problem. Unless I am missing something, I am not sure why this happens: the NFS server is accessible and visible from inside the MySQL container.
@guimenezes If I am understanding your issue correctly, you are able to back up and restore applications that don't use the NFS PVs, but when backing up applications that are using NFS PVs, the backup does not make progress and gets stuck in If the Backup is stuck in
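The comment above distinguishes between phases a stuck backup can sit in; these are the usual places to look for each case (the velero namespace and daemonset name are the defaults, so adjust to your install; a sketch, not an official runbook):

```shell
# Stuck in "New": the Velero server likely never picked the backup up.
kubectl -n velero logs deploy/velero | grep -i error

# Stuck in "InProgress": check the per-volume restic objects and the daemonset.
kubectl -n velero get podvolumebackups.velero.io
kubectl -n velero logs ds/restic --all-containers | tail -n 50
```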
Hello, the same thing is happening on my OpenShift 4.3 cluster. Velero and restic are installed and running properly (kubectl get pod -n velero). I get this error if I check for errors in the restic pod:

velero backup describe 000-test4-okd-nprod --details

Phase: PartiallyFailed (run
Errors: 6
Namespaces:
Resources:
Label selector:
Storage Location: default
Velero-Native Snapshot PVs: auto
TTL: 720h0m0s
Hooks:
Backup Format Version: 1
Started: 2020-08-06 11:19:12 +0000 UTC
Expiration: 2020-09-05 11:19:12 +0000 UTC
Total items to be backed up: 176
Resource List:
Velero-Native Snapshots:
Restic Backups:

Can someone give me some guidance? It seems that velero + restic cannot back up AWS EFS yet. Please comment here asap.
Sorry that you are having issues with this. Related: #2789
I use automatic annotation (https://github.com/duyanghao/velero-volume-controller) with NFS, which works fine with NFS PVs. Not tested with minikube, though; my setup is with kubeadm.
Perfect, I use kubectl too. Could you please elaborate on how you automated the annotation for NFS (AWS EFS)?
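For context, the controller linked above just automates the per-pod annotation that Velero's restic integration requires. Done by hand it looks roughly like this (the namespace and volume name are assumptions taken from the mysql + wordpress tutorial used in this issue, not confirmed values):

```shell
kubectl -n wordpress annotate pod/wordpress-mysql-5dcc45d9f9-2zgkn \
  backup.velero.io/backup-volumes=mysql-persistent-storage
```

The controller watches pods and applies backup.velero.io/backup-volumes automatically, which is what "automatic annotation" means here.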
@ashish-amarnath In the original issue the backup gets stuck "In Progress". Unfortunately I don't have the setup anymore, but I can reproduce it again as soon as I get some time. Also, in the original issue, when looking inside the minikube VM I got into the same stuck condition just by ls'ing the NFS mount point, so I think it is likely restic was getting stuck in the same way. I am in the process of deploying Kubernetes on EKS and can verify whether this happens there as well. I suspect (hopefully) this may just be a minikube issue, but let's see.
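The "stuck ls" reproduction above can be made less painful with a timeout, so a hung NFS hard mount turns into a detectable failure instead of a frozen shell. A small sketch (the kubelet path is a placeholder; /tmp is used only to give the function a healthy path to demonstrate on):

```shell
# A hung NFS hard mount leaves `ls` in uninterruptible sleep; `timeout`
# bounds the wait so the hang becomes an observable failure.
check_path() {
  if timeout 5 ls "$1" >/dev/null 2>&1; then
    echo "OK: $1"
  else
    echo "HUNG or unreadable: $1"
  fi
}

check_path /tmp   # a healthy local path
# check_path /var/lib/kubelet/pods/<pod-uid>/volumes/kubernetes.io~nfs/<pv-name>
```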
I think this is a Velero issue. I am having the same issue on a cluster installed using kubeadm + an NFS server. By the way, @guimenezes, the Velero installation you did was just fine. I think the issue here is that Velero doesn't support NFS migrations. Is that correct?
@guimenezes Related: #1229. Check this one out.
kubeadm + NFS server + Minio without a cacert is what I faced this with
@cloudcafetech that's interesting. May I know how you managed to fix the cacert issue?
I am not using a cacert; for that, I use a Minio setup with HTTP, not HTTPS. So: kubeadm + NFS server + Minio, without a cacert.
@ironknight78 @cloudcafetech @guimenezes I'm curious: how is the NFS volume mounted into the pod in each of your cases? Can I see the YAML of a pod?
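While waiting for an actual manifest, here is a hypothetical NFS PV of the kind being asked about; every value (name, server, path, size) is invented. The mountOptions are the detail worth checking in each reporter's setup: NFS defaults to a "hard" mount, which blocks I/O forever when the server is unreachable, matching the hang described in this thread, while "soft" plus a timeout makes the access fail instead.

```yaml
# Hypothetical NFS PersistentVolume (server, path, and sizes are placeholders).
# The default NFS "hard" mount blocks forever if the server stops responding;
# "soft" + "timeo" makes I/O error out instead. Whether soft mounts are
# acceptable for your data is a judgment call, since they can surface
# partial writes as errors.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: mysql-pv
spec:
  capacity:
    storage: 20Gi
  accessModes:
    - ReadWriteOnce
  mountOptions:
    - nfsvers=4.1
    - soft
    - timeo=150
  nfs:
    server: 192.168.1.10
    path: /exports/mysql
```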
This issue has been inactive for a while, so I'm closing it.
Since when is closing an issue for inactivity a proper solution to the issue itself? The problem is still there. This issue was even tagged "needs-investigation", and you close it for a meaningless reason? "Standards". I mean, if there were a reason like "closing due to a duplicate" or "we don't have the time and don't want to spend the effort (and of course money) on this", it would be understandable. It can be reopened.
Did you find the problem? Did you see any error in logs from nfs server side? Did rpcbind start? |
If it worked, I wouldn't suggest reopening it. The backup gets stuck, as the title of this issue states, and I therefore do not get any errors from either the NFS server or Velero.
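For anyone who does want to rule out the server side per the rpcbind question above, a quick checklist (assumes a systemd-based NFS server; unit names vary by distro, so treat these as a sketch):

```shell
systemctl status nfs-server rpcbind     # are the daemons actually running?
showmount -e localhost                  # are the exports published?
exportfs -v                             # which options are they exported with?
journalctl -u nfs-server --since "-1h"  # recent server-side errors
```

If all of these look healthy yet clients still hang, the problem is more likely on the mount/client side than on the NFS server.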
No dude, still not working with efs-aws
Having the same issue. Restic gets stuck when trying to back up an nfs-pv |
Please update here if you have new findings.
A bit useless as a tool if it cannot be used with all the storage systems available.
see #3450 |
This might be an implementation-specific issue. Velero and restic were able to successfully back up & restore with nfs-client-provisioner: v3.1.0-k8s1.11

$ velero backup describe ecommerce-etcd-qa-02-13-2021 --details
Name: ecommerce-etcd-qa-02-13-2021
Namespace: kube-velero
Labels: velero.io/storage-location=default
Annotations: velero.io/source-cluster-k8s-gitversion=v1.17.6-docker-1
velero.io/source-cluster-k8s-major-version=1
velero.io/source-cluster-k8s-minor-version=17+
Phase: Completed
Errors: 0
Warnings: 0
Namespaces:
Included: ecommerce-etcd-qa
Excluded: <none>
Resources:
Included: *
Excluded: <none>
Cluster-scoped: auto
Label selector: <none>
Storage Location: default
Velero-Native Snapshot PVs: auto
TTL: 720h0m0s
Hooks: <none>
Backup Format Version: 1.1.0
Started: 2021-02-13 07:38:21 -0800 PST
Completed: 2021-02-13 07:41:31 -0800 PST
Expiration: 2021-03-15 08:38:21 -0700 PDT
Total items to be backed up: 38
Items backed up: 38
Resource List:
...
v1/PersistentVolume:
- pvc-192e7963-a024-4c53-80c6-21552b30ee91
- pvc-41b92c87-1a8c-413e-aae8-5d6c1ef0a2b2
- pvc-685729c3-d6a9-40e1-be57-1455b6cf9780

$ kubectl describe pv pvc-192e7963-a024-4c53-80c6-21552b30ee91
Name: pvc-192e7963-a024-4c53-80c6-21552b30ee91
Labels: <none>
Annotations: pv.kubernetes.io/provisioned-by: fuseim.pri/ifs
Finalizers: [kubernetes.io/pv-protection]
StorageClass: managed-nfs-storage
Status: Bound
Claim: ecommerce-etcd-qa/data-etcd-1
Reclaim Policy: Delete
Access Modes: RWO
VolumeMode: Filesystem
Capacity: 8Gi
Node Affinity: <none>
Message:
Source:
Type: NFS (an NFS mount that lasts the lifetime of a pod)
Server: redacted
Path: /redacted/ecommerce-etcd-qa-data-etcd-1-pvc-192e7963-a024-4c53-80c6-21552b30ee91
ReadOnly: false
Events:            <none>
What steps did you take and what happened:
I am testing Velero to back up my application.
In a minikube deployment, I followed the basic mysql + wordpress tutorial here:
https://kubernetes.io/docs/tutorials/stateful-application/mysql-wordpress-persistent-volume/
I set up an NFS server in a VM and changed the example to use an NFS persistent volume instead.
The mysql pod config I used is the following:
The example also has a wordpress-deployment.yaml file and a kustomization.yaml file, which I did not modify.
After deployment:
The S3 endpoint is a vanilla Minio S3 server. I installed velero with:
I started a backup (without the annotation) and things worked perfectly for metadata:
However, after adding the annotation, the backup gets stuck:
These are the last messages in the log:
I enabled debug logs as described in https://velero.io/docs/master/troubleshooting/.
I also re-deployed to a clean state, and the pod name changed to wordpress-mysql-5dcc45d9f9-2zgkn.
Full output is here: velero.txt
What did you expect to happen:
Backup should either fail or succeed.
The output of the following commands will help us better understand what's going on:
(Pasting long output into a GitHub gist or other pastebin is fine.)
kubectl logs deployment/velero -n velero
velero.txt
velero backup describe <backupname>
or kubectl get backup/<backupname> -n velero -o yaml
velero backup logs <backupname>
velero restore describe <restorename>
or kubectl get restore/<restorename> -n velero -o yaml
NA
velero restore logs <restorename>
NA
Anything else you would like to add:
NA
Environment:
- Velero version (use velero version):
- Velero features (use velero client config get features):
- Kubernetes version (use kubectl version):
- OS (e.g. from /etc/os-release): MacOS

THANKS!!
Vote on this issue!
This is an invitation to the Velero community to vote on issues; you can see the project's top-voted issues listed here.
Use the "reaction smiley face" up to the right of this comment to vote.