option to predeploy any resource #906

Open
1 task
bcha opened this issue Oct 4, 2022 · 3 comments

@bcha

bcha commented Oct 4, 2022

Feature request

  • If the maintainers agree with the feature as described here, I intend to submit a Pull Request myself.

Proposal: Would it be possible to predeploy any resource, controlled by an annotation such as krane.shopify.io/predeployed, which currently only supports CRDs?

My personal use case is deploying kind: ExternalSecret resources from https://github.com/external-secrets/external-secrets. Essentially, it's an operator that syncs secrets from external sources, such as the secret management services offered by AWS, Azure, and GCP, and creates corresponding kind: Secret resources.

The problem is that a kind: ExternalSecret doesn't always have enough time for its initial sync, because it is deployed at the same time as, for example, a kind: Deployment that needs the resulting Secret. This can cause krane to fail like so:

[INFO][2022-10-04 06:47:53 +0000]	----------------------------------Phase 4: Deploying all resources----------------------------------
[INFO][2022-10-04 06:47:53 +0000]	Deploying resources:
[INFO][2022-10-04 06:47:53 +0000]	- Deployment/account-deployment (timeout: 420s)
[INFO][2022-10-04 06:47:53 +0000]	- ExternalSecret/foo (timeout: 300s)
...blablabla
[WARN][2022-10-04 06:48:23 +0000]	Don't know how to monitor resources of type ExternalSecret. Assuming ExternalSecret/foo deployed successfully.
...blablabla
[INFO][2022-10-04 06:48:29 +0000]	------------------------------------------Result: FAILURE-------------------------------------------
[FATAL][2022-10-04 06:48:29 +0000]	Successfully deployed 13 resources and failed to deploy 1 resource
[FATAL][2022-10-04 06:48:29 +0000]	
[FATAL][2022-10-04 06:48:29 +0000]	Successful resources
[FATAL][2022-10-04 06:48:29 +0000]	Deployment/account-scheduler-deployment           0 replicas
[FATAL][2022-10-04 06:48:29 +0000]	ExternalSecret/foo                       Not Found
... blablabla
[FATAL][2022-10-04 06:48:29 +0000]	Deployment/account-deployment: FAILED
[FATAL][2022-10-04 06:48:29 +0000]	Latest ReplicaSet: account-deployment-67f9b49988
[FATAL][2022-10-04 06:48:29 +0000]	
[FATAL][2022-10-04 06:48:29 +0000]	The following containers are in a state that is unlikely to be recoverable:
[FATAL][2022-10-04 06:48:29 +0000]	> account: Failed to generate container configuration: secret "foo" not found
[FATAL][2022-10-04 06:48:29 +0000]	
[FATAL][2022-10-04 06:48:29 +0000]	  - Final status: 1 replica, 1 updatedReplica, 1 unavailableReplica
[FATAL][2022-10-04 06:48:29 +0000]	  - Events (common success events excluded):
[FATAL][2022-10-04 06:48:29 +0000]	      [Deployment/account-deployment]	ScalingReplicaSet: Scaled up replica set account-deployment-67f9b49988 to 1 (1 events)
[FATAL][2022-10-04 06:48:29 +0000]	      [Pod/account-deployment-67f9b49988-4hzc5]	Failed: Error: secret "foo" not found (2 events)

Of course, since the secret is often synced just seconds later, the deployment recovers automatically. But because we're running krane in CI/CD, it still reports the deployment as failed.

https://github.com/Shopify/krane#deploying-custom-resources would be another option, but unfortunately external-secrets-operator doesn't currently implement observedGeneration, so I can't use it.
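
To illustrate the race described above, a CI job could wait for the synced Secret to exist before starting the rollout. This is a minimal Python sketch under stated assumptions, not something proposed in this thread or provided by krane: it assumes the ExternalSecret manifests are applied in a separate, earlier step, that kubectl is on the PATH and pointed at the right cluster, and the helper names secret_exists and wait_for are hypothetical.

```python
import subprocess
import time
from typing import Callable


def secret_exists(name: str, namespace: str) -> bool:
    """Return True if `kubectl get secret` finds the Secret (exit code 0)."""
    result = subprocess.run(
        ["kubectl", "get", "secret", name, "--namespace", namespace],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def wait_for(
    check: Callable[[], bool],
    timeout_s: float = 60.0,
    interval_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll check() until it returns True or timeout_s has elapsed."""
    deadline = time.monotonic() + timeout_s
    while True:
        if check():
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval_s)
```

For the failing run above, the call might look like wait_for(lambda: secret_exists("foo", "my-namespace")), where "my-namespace" is a placeholder. The check is passed in as a callable so the polling loop can be exercised without a cluster.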

@benlangfeld
Contributor

All custom resources are already pre-deployed. Your full log output should show this.

@bcha
Author

bcha commented Jan 13, 2023

Yeah, looks like you're correct. I looked back at our CI/CD logs from October to confirm, and back then ESO (external-secrets-operator) resources were not being predeployed, though curiously some other custom resources were. Logs from October:

[INFO][2022-10-03 12:46:08 +0000]	------------------------------Phase 3: Predeploying priority resources------------------------------
[INFO][2022-10-03 12:46:08 +0000]	Deploying ServiceAccount/session-serviceaccount (timeout: 30s)
[INFO][2022-10-03 12:46:11 +0000]	Successfully deployed in 2.3s: ServiceAccount/session-serviceaccount
[INFO][2022-10-03 12:46:11 +0000]	
[INFO][2022-10-03 12:46:11 +0000]	Deploying resources:
[INFO][2022-10-03 12:46:13 +0000]	Deploying Mapping/session-apigw (timeout: 300s)
[WARN][2022-10-03 12:46:15 +0000]	Don't know how to monitor resources of type Mapping. Assuming Mapping/session-apigw deployed successfully.
[INFO][2022-10-03 12:46:15 +0000]	Successfully deployed in 2.3s: Mapping/session-apigw
[INFO][2022-10-03 12:46:15 +0000]	
[INFO][2022-10-03 12:46:15 +0000]	
[INFO][2022-10-03 12:46:15 +0000]	----------------------------------Phase 4: Deploying all resources----------------------------------
[INFO][2022-10-03 12:46:15 +0000]	Deploying resources:
...blablabla
[INFO][2022-10-03 12:46:15 +0000]	- ExternalSecret/xyzzy (timeout: 300s)
[WARN][2022-10-03 12:46:35 +0000]	Don't know how to monitor resources of type ExternalSecret. Assuming ExternalSecret/xyzzy deployed successfully.

I just ran a fresh test, and ExternalSecrets are now being predeployed as they should:

[INFO][2023-01-13 10:07:03 +0000]	------------------------------Phase 3: Predeploying priority resources------------------------------
[INFO][2023-01-13 10:07:03 +0000]	Deploying ExternalSecret/xyzzy (timeout: 300s)
[WARN][2023-01-13 10:07:05 +0000]	Don't know how to monitor resources of type ExternalSecret. Assuming ExternalSecret/xyzzy deployed successfully.
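
A comparison like the one between these two runs can be scripted instead of read by eye. The following Python sketch maps each resource to the krane phase in which it was first listed. It assumes the log layout shown in the excerpts above (phase banners of the form "Phase N: ..." and resource lines ending in "(timeout: Ns)"); the function name phases_by_resource and both regular expressions are illustrative and are not part of krane.

```python
import re

# Phase banners, e.g. "-----Phase 3: Predeploying priority resources-----".
PHASE_RE = re.compile(r"-+Phase (\d+): (.+?)-+\s*$")
# Resource lines, either "Deploying Kind/name (timeout: 300s)"
# or the list form "- Kind/name (timeout: 300s)".
RESOURCE_RE = re.compile(r"(?:Deploying |- )(\w+/[\w.-]+) \(timeout: \d+s\)")


def phases_by_resource(log_text: str) -> dict[str, int]:
    """Map each resource to the number of the phase where it first appears."""
    phase = None
    result: dict[str, int] = {}
    for line in log_text.splitlines():
        banner = PHASE_RE.search(line)
        if banner:
            phase = int(banner.group(1))
            continue
        resource = RESOURCE_RE.search(line)
        if resource and phase is not None:
            # Keep the first phase seen for a resource.
            result.setdefault(resource.group(1), phase)
    return result


sample = "\n".join([
    "[INFO][2023-01-13 10:07:03 +0000]\t------------------------------"
    "Phase 3: Predeploying priority resources------------------------------",
    "[INFO][2023-01-13 10:07:03 +0000]\tDeploying ExternalSecret/xyzzy (timeout: 300s)",
])
print(phases_by_resource(sample))  # {'ExternalSecret/xyzzy': 3}
```

Run against the October excerpt, the same function would report ExternalSecret/xyzzy under phase 4 and Mapping/session-apigw under phase 3, which is the difference described here.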

The krane version was 2.4.7 in both cases, and external-secrets-operator stayed on the same version as well. I tracked down some more logs and found that the change in behavior lines up almost exactly with our upgrade from EKS 1.21 to 1.22, so that's my main suspect right now.

Anyway, I think this behavior in krane is working as expected 👍

@benlangfeld
Contributor

I wonder if this could be related to #773
