Add exponential backoff retries to CreateSnapshot #153
ggriffiths wants to merge 1 commit into kubernetes-csi:master
@jsafrane Can you take a look to see if I'm on the right track here? I added a few notes in the PR description for tasks I still need to do. |
|
/ok-to-test |
e432cae to c86ff40
|
Is it ready for review? I see failing tests. And please don't add |
Yeah, it's not quite ready yet. I still need to fix the tests and will mark it ready for review when it is :-) And thanks, I'll remove it; I added it accidentally. |
5023849 to e4949cb
1a085e1 to 9bf26a9
9bf26a9 to 9a7c8ce
|
@msau42 @xing-yang tests are passing now and this PR is ready for review. I did not add Exponential retries to the |
    _, updateErr := ctrl.storeSnapshotUpdate(snapshotObj)
    if updateErr != nil {
        // We will get an "snapshot update" event soon, this is not a big error
        klog.V(4).Infof("createSnapshot [%s]: cannot update internal cache: %v", snapshotKey(snapshotObj), updateErr)
do we continue to retry updating in this case? @xing-yang
      if snapshot.Status.Error != nil && !isControllerUpdateFailError(snapshot.Status.Error) {
          klog.V(4).Infof("error is already set in snapshot, do not retry to create: %s", snapshot.Status.Error.Message)
    -     return snapshot, nil
    +     return snapshot, snapshotter.SnapshottingInBackground, nil
I think the errors before CreateSnapshot should be NoChange. The reasoning behind this is: we haven't issued a new CreateSnapshot yet, however this could be our subsequent retry, so the state should remain the same as before.
82dfc90 to 7bc228d
|
One thing to note is that I haven't added a
I didn't think I would need to add this, but now I realize I'm not really accessing I can add the snapshotInformer/indexer, but am open to other ideas before continuing on that work. |
|
I don't think you need to follow external-provisioner exactly. I think the main thing that's missing is in Also cc @wongma7 since he is familiar with the provisioner design. |
7bc228d to 9595e5a
|
Sounds good, I've updated the PR to check snapshotsInProgress if we don't find one from the informer. |
        ctrl.updateSnapshot(newSnapshot)
    }

    ctrl.snapshotQueue.Forget(keyObj)
I don't think we want to forget and delete in this case. We should probably follow the whole L230-235 section. Can we inject the snapshotsInProgressCheck before L230 so they both go through the same codepath?
Makes sense, updated to remove the forget/delete and reworked the informer/snapshotsInProgress checks.
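For readers following the thread, the informer/snapshotsInProgress fallback discussed above could look roughly like the sketch below. It is not the PR's code: the `controller` struct, its `informerCache` map, and the `lookup` method are hypothetical stand-ins, and only the name `snapshotsInProgress` comes from the conversation.

```go
package main

import (
	"fmt"
	"sync"
)

// controller is a pared-down, hypothetical stand-in for the snapshot controller.
type controller struct {
	// informerCache mimics the informer's indexer: key -> snapshot name.
	informerCache map[string]string
	// snapshotsInProgress tracks snapshots whose CreateSnapshot call may
	// still be running on the driver.
	snapshotsInProgress sync.Map
}

// lookup prefers the informer and falls back to snapshotsInProgress, so a
// snapshot that is gone from the informer but still being created goes
// through the same retry path instead of being dropped.
func (c *controller) lookup(key string) (string, bool) {
	if snap, ok := c.informerCache[key]; ok {
		return snap, true
	}
	if v, ok := c.snapshotsInProgress.Load(key); ok {
		return v.(string), true
	}
	return "", false
}

func main() {
	c := &controller{informerCache: map[string]string{"ns/a": "a"}}
	c.snapshotsInProgress.Store("ns/b", "b")
	for _, key := range []string{"ns/a", "ns/b", "ns/c"} {
		snap, ok := c.lookup(key)
		fmt.Printf("%s -> %q found=%v\n", key, snap, ok)
	}
}
```

Only when both lookups miss would the key be forgotten and deleted, which matches the request to send both cases through one code path.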
9595e5a to 192fbca
308ca7d to 8766952
Signed-off-by: Grant Griffiths <grant@portworx.com>
8766952 to fd0e055
|
@xing-yang should I redo this PR for a future snapshotter release, now that the split controller work is done? |
Yes, please do this after the 1.17 release. I'm actually still working on improving the controller logic. |
Sounds good, will do 👍 |
|
@ggriffiths: PR needs rebase. |
|
@ggriffiths are you working on this rebase? Would like to see this with |
|
Hi @humblec, I think this PR is not needed any more. In the Beta version of external-snapshotter, we made some changes. I tested myself and don't see this issue any more. That's why I asked the person who opened the issue to re-test. |
Thanks @xing-yang for clarifying. Then let's double-check and close this. 👍 |
|
Closing this; it was put on hold for the beta API changes last release. If this is still needed after testing #134 with the beta API, I'll submit a separate PR. |
Signed-off-by: Grant Griffiths grant@portworx.com
What type of PR is this?
/kind bug
What this PR does / why we need it:
Which issue(s) this PR fixes:
Fixes #134
Special notes for your reviewer:
This PR is modeled after @jsafrane's retry work on the external-provisioner:
pkg/snapshotter/ are similar to kubernetes-csi/external-provisioner@8203a03
pkg/controller are based on:
Ready for review now.
Does this PR introduce a user-facing change?: