
AlreadyExists error causes repeated provisioning of a volume #124

Closed
orainxiong opened this issue Aug 12, 2018 · 18 comments

@orainxiong

orainxiong commented Aug 12, 2018

For reasons we have not yet identified, external-provisioner sometimes provisions a volume that already exists (the volume name is the unique key used to decide whether it exists). With the current logic, this leads to repeated provisioning of a volume for the given PVC. Worse, there is no external API I can use to delete the volume manually.

In a very simplified view, provisionClaimOperation is responsible for provisioning a volume and can be broken down into the following steps:

  • Obtain the storage class that the given PVC maps to;
  • Call the CSI driver to provision a volume;
  • Create a PV object to record the volume.

Based on the CSI RPC specification, the current default behavior is that the CSI driver explicitly raises an AlreadyExists error if the volume already exists. As a result, provisionClaimOperation keeps re-provisioning the volume unless the already existing volume is deleted manually.

	volume, err := ctrl.provisioner.Provision(options)
	if err != nil {
		if ierr, ok := err.(*IgnoredError); ok {
			// Provision ignored, do nothing and hope another provisioner will provision it.
			glog.Infof("provision of claim %q ignored: %v", claimToClaimKey(claim), ierr)
			return nil
		}
		strerr := fmt.Sprintf("Failed to provision volume with StorageClass %q: %v", claimClass, err)
		glog.Errorf("Failed to provision volume for claim %q with StorageClass %q: %v", claimToClaimKey(claim), claimClass, err)
		ctrl.eventRecorder.Event(claim, v1.EventTypeWarning, "ProvisioningFailed", strerr)
		return err
	}

I plan to take some time to figure out why more than one provisioning operation happens for the same PVC.
For now, IMO we could treat the AlreadyExists error as an ignored error to improve robustness:

	opts := wait.Backoff{Duration: backoffDuration, Factor: backoffFactor, Steps: backoffSteps}
	err = wait.ExponentialBackoff(opts, func() (bool, error) {
		ctx, cancel := context.WithTimeout(context.Background(), p.timeout)
		defer cancel()
		rep, err = p.csiClient.CreateVolume(ctx, &req)
		if err == nil {
			// CreateVolume has finished successfully
			return true, nil
		}

		if status, ok := status.FromError(err); ok {
			if status.Code() == codes.DeadlineExceeded {
				// CreateVolume timed out, give it another chance to complete
				glog.Warningf("CreateVolume timeout: %s has expired, operation will be retried", p.timeout.String())
				return false, nil
			}
		}
		// CreateVolume failed, no reason to retry, bailing from ExponentialBackoff
		return false, err
	})

	// proposed change: treat the `AlreadyExists` error as an ignored error
	if status, ok := status.FromError(err); ok {
		if status.Code() == codes.AlreadyExists {
			// the volume already exists, so skip re-provisioning it
			return nil, &controller.IgnoredError{Reason: "Volume already exists"}
		}
	}
@orainxiong
Author

@vladimirvivien

Looking forward to your comments.

@msau42
Collaborator

msau42 commented Aug 12, 2018

In the CSI drivers I've seen, we return success when the volume already exists, not an error.

@msau42
Collaborator

msau42 commented Aug 12, 2018

Also from the CSI spec:

This operation MUST be idempotent. If a volume corresponding to the specified volume name already exists, is accessible from accessibility_requirements, and is compatible with the specified capacity_range, volume_capabilities and parameters in the CreateVolumeRequest, the Plugin MUST reply 0 OK with the corresponding CreateVolumeResponse
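
For illustration, a minimal sketch of what that idempotent behavior can look like on the driver side. The driver type and the lookupVolumeByName/createBackendVolume helpers are hypothetical; only the CSI request/response types and gRPC status codes come from the spec's Go bindings, so treat this as a sketch rather than any particular driver's implementation.

import (
	"context"

	"github.com/container-storage-interface/spec/lib/go/csi"
	"google.golang.org/grpc/codes"
	"google.golang.org/grpc/status"
)

func (d *driver) CreateVolume(ctx context.Context, req *csi.CreateVolumeRequest) (*csi.CreateVolumeResponse, error) {
	// Hypothetical backend lookup keyed by the CSI volume name.
	existing, err := d.lookupVolumeByName(ctx, req.GetName())
	if err != nil {
		return nil, status.Error(codes.Internal, err.Error())
	}
	if existing != nil {
		if existing.GetCapacityBytes() < req.GetCapacityRange().GetRequiredBytes() {
			// Same name but incompatible parameters: this is the case
			// the spec reserves the AlreadyExists code for.
			return nil, status.Errorf(codes.AlreadyExists,
				"volume %q exists but is incompatible with the request", req.GetName())
		}
		// A compatible volume already exists: reply 0 OK, per the spec.
		return &csi.CreateVolumeResponse{Volume: existing}, nil
	}
	// Otherwise create the volume on the backend (hypothetical helper).
	created, err := d.createBackendVolume(ctx, req)
	if err != nil {
		return nil, status.Error(codes.Internal, err.Error())
	}
	return &csi.CreateVolumeResponse{Volume: created}, nil
}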

@orainxiong
Author

orainxiong commented Aug 13, 2018

@msau42 Many thanks.

I have modified the createVolume logic to return success in that case so that it is idempotent, but I hit another serious problem.

After that change, provisionClaimOperation goes on to create a PV object in order to record the volume. Because the corresponding PV already exists as well, PersistentVolumes().Create() raises a similar AlreadyExists error, just as the CSI driver did.

	for i := 0; i < ctrl.createProvisionedPVRetryCount; i++ {
		glog.V(4).Infof("provisionClaimOperation [%s]: trying to save volume %s", claimToClaimKey(claim), volume.Name)
		if _, err = ctrl.client.CoreV1().PersistentVolumes().Create(volume); err == nil {
			// Save succeeded.
			glog.Infof("volume %q for claim %q saved", volume.Name, claimToClaimKey(claim))
			break
		}
		// Save failed, try again after a while.
		glog.Infof("failed to save volume %q for claim %q: %v", volume.Name, claimToClaimKey(claim), err)
		time.Sleep(ctrl.createProvisionedPVInterval)
	}

After all of the attempts fail, the default behavior is to clean up by deleting the volume:

		for i := 0; i < ctrl.createProvisionedPVRetryCount; i++ {
			if err = ctrl.provisioner.Delete(volume); err == nil {
				// Delete succeeded
				glog.V(4).Infof("provisionClaimOperation [%s]: cleaning volume %s succeeded", claimToClaimKey(claim), volume.Name)
				break
			}
			// Delete failed, try again after a while.
			glog.Infof("failed to delete volume %q: %v", volume.Name, err)
			time.Sleep(ctrl.createProvisionedPVInterval)
		}

For more code details

As a result, the volume gets deleted while both the PVC and the PV still exist. If the volume is in use, this causes unpredictable I/O errors.

IMO, we should distinguish the AlreadyExists error from the other errors (see the sketch below).
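
One way to make that distinction, sketched against the retry loop above; it mirrors what the upstream PV controller does. The controller fields are taken from the snippets in this thread, and the only addition is the standard IsAlreadyExists helper from k8s.io/apimachinery.

import (
	apierrs "k8s.io/apimachinery/pkg/api/errors"
)

	for i := 0; i < ctrl.createProvisionedPVRetryCount; i++ {
		_, err = ctrl.client.CoreV1().PersistentVolumes().Create(volume)
		if err == nil || apierrs.IsAlreadyExists(err) {
			// Either we just saved the PV or an earlier attempt already did.
			// Treat both as success so we never fall through to the cleanup
			// path that deletes the backing volume.
			err = nil
			break
		}
		// Save failed for some other reason, try again after a while.
		glog.Infof("failed to save volume %q for claim %q: %v", volume.Name, claimToClaimKey(claim), err)
		time.Sleep(ctrl.createProvisionedPVInterval)
	}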

@msau42
Collaborator

msau42 commented Aug 13, 2018

Can you clarify the sequence of events you are seeing? The PV object should only be created after Provision() successfully returns. And once the PV object is successfully created, you should not see Provision() being called again.

@wongma7
Contributor

wongma7 commented Aug 13, 2018

Yes, we should be checking for AlreadyExists in the PV.Create part of the code, like upstream does: https://github.com/kubernetes/kubernetes/blob/master/pkg/controller/volume/persistentvolume/pv_controller.go#L1502. However, the issue should be extremely rare because of the safeguard at the start of Provision (https://github.com/kubernetes-incubator/external-storage/blob/master/lib/controller/controller.go#L949) and the impossibility (I would hope) of two controllers racing to Provision the same thing at the same time.

As for a CSI driver returning AlreadyExists: if it's following the spec, then AlreadyExists is an error meaning the volume already exists but is incompatible.

"Indicates that a volume corresponding to the specified volume name already exists but is incompatible with the specified capacity_range, volume_capabilities or parameters."

A volume that already exists and is compatible should result in a nil error, as msau42 said.

@orainxiong
Author

I am not sure I fully understand this problem yet, but I think it is probably a cache-related issue rather than a race between multiple controllers.

The following code (for more details) is responsible for checking whether the corresponding PV already exists; if it does, Provision() should not be called again.

	_, exists, err := ctrl.volumes.GetByKey(fmt.Sprintf("%s/%s", namespace, pvName))

To avoid API throttling, the latest version of external-provisioner was refactored to use a local cache rather than querying the API server directly. For more details, please see #68.

When more than one provisionClaimOperation is triggered for a given PVC within a very short window (0.1–0.5 seconds), the second one may not find the corresponding PV object in the local cache even though the first one has already returned successfully. The log below shows the details:

// The first one started
I0813 17:57:58.956265       1 controller.go:1099] scheduleOperation[provision-default/archive-yt-test1-0[5bc11503-9edf-11e8-923d-faa1f8c84900]]
I0813 17:57:58.956292       1 controller.go:749] provisionClaimOperation [default/archive-yt-test1-0] started, class: "csi-qcfs"

// The first one successfully returned
I0813 17:58:00.617709       1 controller.go:831] provisionClaimOperation [default/archive-yt-test1-0]: trying to save volume pvc-5bc115039edf11e8
I0813 17:58:00.684158       1 controller.go:834] volume "pvc-5bc115039edf11e8" for claim "default/archive-yt-test1-0" saved
I0813 17:58:00.684180       1 controller.go:870] volume "pvc-5bc115039edf11e8" provisioned for claim "default/archive-yt-test1-0"

// The second one came along in about 0.3 second after the PV object is successfully created
I0813 17:58:00.931261       1 controller.go:1099] scheduleOperation[provision-default/archive-yt-test1-0[5bc11503-9edf-11e8-923d-faa1f8c84900]]
I0813 17:58:00.931360       1 controller.go:749] provisionClaimOperation [default/archive-yt-test1-0] started, class: "csi-qcfs"

I think this is a cache out-of-sync issue. IMO, there are two options to fix it (option 1 is sketched after this list):

  • Call the API server directly at this point to fetch the exact state of the corresponding PV object;
  • Make provisionClaimOperation idempotent, as the CSI driver is.
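
A rough sketch of option 1; ctrl.client and pvName follow the snippets above, and the Get signature matches the context-free client-go API used elsewhere in this thread.

import (
	apierrs "k8s.io/apimachinery/pkg/api/errors"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

	// Ask the API server directly instead of trusting the informer cache,
	// which may lag behind a PV that was created a moment ago.
	_, err := ctrl.client.CoreV1().PersistentVolumes().Get(pvName, metav1.GetOptions{})
	switch {
	case err == nil:
		// The PV already exists: do not call Provision() again.
		return nil
	case apierrs.IsNotFound(err):
		// Genuinely missing: safe to go ahead and provision.
	default:
		// Any other API error: return it and retry later rather than
		// risking a duplicate provision.
		return err
	}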

@wongma7
Contributor

wongma7 commented Aug 13, 2018

What version of the controller exactly did that happen in? Two provision claim operations happening on the same claim object within 2 seconds shouldn’t be possible.

@wongma7
Contributor

wongma7 commented Aug 13, 2018

By version I mean the git commit hash, btw. I think scheduleOperation is a goroutinemap thing.

@orainxiong
Author

Based on release 0.2.0, I refactored the code to use the informer cache.

The volume in question is successfully created and then deleted within a very short time.

I think the second Provision() happens because of a cache resync.

@orainxiong
Author

orainxiong commented Aug 13, 2018

I think it can be reproduced when all of the following conditions are met at the same time:

  • The first Provision() successfully returns, triggered by the PVC add event;
  • The second Provision() follows very shortly afterwards, triggered by a PVC update event;
  • The second Provision() checks the local PV cache to see whether the PV already exists, but the cache is not yet up to date at that point.

Poor performance (CPU, memory, network) increases the probability of this happening.

@msau42
Collaborator

msau42 commented Aug 13, 2018

The way that the PV controller avoids this issue is that it updates its in-memory cache first, and then sends the API update.
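
A minimal sketch of that ordering, assuming a hypothetical ctrl.volumeStore with the usual cache.Store semantics from k8s.io/client-go/tools/cache; it only illustrates the idea, not the actual PV controller code.

	// ctrl.volumeStore is a hypothetical cache.Store shared with the existence
	// check, so the check can see the PV before the API write has landed.
	if err := ctrl.volumeStore.Add(volume); err != nil {
		return err
	}
	if _, err := ctrl.client.CoreV1().PersistentVolumes().Create(volume); err != nil {
		// The API call failed: roll back the optimistic cache entry so the
		// cache does not advertise a PV that was never persisted.
		_ = ctrl.volumeStore.Delete(volume)
		return err
	}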

@wongma7
Contributor

wongma7 commented Aug 14, 2018

@orainxiong okay, I understand the issue now. Recall my comments here: kubernetes-retired/external-storage#837 (comment). I said I was okay with the change under the assumption that the informer cache would reflect the new PV immediately, which is probably wrong. I also said something about switching from goroutinemap to work queues in the 3rd comment there, but I'm not sure what I meant. If you can replicate this easily, I would try again with the latest lib version instead of v0.2.0, since it already has the work queues and the informer changes; maybe we will figure out what I meant :)

Call the API server directly at this point to fetch the exact state of the corresponding PV object;

I would prefer this, i.e. we just change it back to the way it was.

Make provisionClaimOperation idempotent, as the CSI driver is.

It does; like I said, the CSI driver returning AlreadyExists is an error. It should return nil.

IMO we should do both. If you tally up the API calls, it's at most 2 kube + 1 backend.

@wongma7
Contributor

wongma7 commented Aug 14, 2018

Sorry, I made a mistake in the last comment: provisionClaimOperation isn't idempotent, since we still need to make it check for the PV AlreadyExists error (different from the CSI AlreadyExists... confusing).

@orainxiong
Author

@wongma7 You are right. Indeed, I didn't fully understand what you meant until I recalled your comments. :-)

IMO we should move on with the refactoring, so I'm going to mirror the PV controller's current logic and update its in-memory cache first, as @msau42 proposed, to keep the cache up to date.

I will break this work down into the following steps:

  • Upgrade to the latest external-storage v5.0.0 and check that it is compatible with external-provisioner;
  • Refactor the code to mirror the PV controller's current logic;
  • Run an endurance test, e.g. repeatedly provisioning/deleting 100 volumes at once over a day, and that sort of thing.

If it works, I will create a PR against the external-storage repo.

Please let me know if I have missed something important.

@wongma7
Contributor

wongma7 commented Aug 15, 2018

I don't want to maintain an internal cache just to solve this, and we cannot modify the SharedInformer's cache. My understanding is that the PV controller needs an internal cache not only to avoid provisioning twice but also to avoid conflicts with syncClaim & syncVolume; it is complicated. I would prefer we bring back the single API call in Provision. Consider that the PV controller still does the API call (https://github.com/kubernetes/kubernetes/blob/master/pkg/controller/volume/persistentvolume/pv_controller.go#L1404) even though its internal cache should be up to date and trustworthy: https://github.com/kubernetes/kubernetes/blob/master/pkg/controller/volume/persistentvolume/pv_controller.go#L1510

@orainxiong
Author

Anyway, I think we missed an important detail.

In fact, external-storage uses getProvisionedVolumeNameForClaim() to derive a deterministic PV name (like pvc-9278f458-a053-11e8-9af0-5254003d0a20) when checking whether the PV already exists, while external-provisioner uses makeVolumeName() to generate the PV name (like pvc-9278f458a05311e8) that is actually used to create the PV. The two functions are independent and produce different names. That means we don't actually know the correct PV name for a given unbound PVC, so whether we query the API server or the local cache, the check finds nothing even when the PV object already exists.

(A screenshot showing the difference between the two generated names is omitted here; the sketch below reproduces it.)
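
A small, self-contained illustration of the mismatch, using the PVC UID from the logs above. The exact derivation inside makeVolumeName() lives in the external-provisioner source, so the string handling here is only an assumption that happens to reproduce the observed names.

package main

import (
	"fmt"
	"strings"
)

func main() {
	uid := "5bc11503-9edf-11e8-923d-faa1f8c84900" // PVC UID from the logs above

	// external-storage, getProvisionedVolumeNameForClaim(): "pvc-" + UID.
	// This is the name the existence check looks for.
	lookupName := "pvc-" + uid

	// external-provisioner, makeVolumeName() (assumed derivation): strip the
	// dashes and truncate. This is the name the PV is actually created with.
	compact := strings.Replace(uid, "-", "", -1)
	createName := "pvc-" + compact[:16]

	fmt.Println(lookupName) // pvc-5bc11503-9edf-11e8-923d-faa1f8c84900
	fmt.Println(createName) // pvc-5bc115039edf11e8
}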

There are two main steps intended to ensure the provisioning operation behaves correctly:

  • GoRoutineMap guarantees that no more than one Provision() is in flight for a given PVC;
  • Querying the API server avoids calling Provision() for an already existing PV.

We had expected both of them to work. In fact, step 2 never works because of the name mismatch, so GoRoutineMap is effectively the only thing keeping the operations correct. Without an E2E endurance test this is very hard to notice. However, after fixing step 2, we hit the cache out-of-sync issue again, and we can easily reproduce it by repeatedly creating/deleting a bunch of PVCs at once. So it is a complicated issue that involves several factors.

Sorry for my half-solution and mistakes. I intend to create a patch that fixes all of these issues based on v5.0.0, which mainly involves the following changes:

  • Make makeVolumeName() and getProvisionedVolumeNameForClaim() produce the same name. We could start by simply duplicating the code, and later discuss how to make external-storage aware of makeVolumeName(), perhaps by introducing a new parameter exposed to external-provisioner;
  • Go back to calling the API server instead of using the local cache.

@wongma7
Contributor

wongma7 commented Aug 16, 2018

@orainxiong OK, thanks again for the thorough analysis.

Make makeVolumeName() and getProvisionedVolumeNameForClaim() produce the same name. We could start by simply duplicating the code, and later discuss how to make external-storage aware of makeVolumeName(), perhaps by introducing a new parameter exposed to external-provisioner;

There have been requests in the past to make getProvisionedVolumeNameForClaim() configurable / to expose some interface that lets provisioners determine the name. I will work on exposing it. As long as it returns a unique name that can be derived from the PVC, there should be no problem. Edit: there was a PR, kubernetes-retired/external-storage#612, but it needs work. I'll pick it up.
