Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix perfomance issue of keyvault #6866

Closed
wants to merge 3 commits into from
Closed

Conversation

njuCZ
Copy link
Contributor

@njuCZ njuCZ commented May 12, 2020

fix #6196

I think the root cause of this issue is every key/secret/certificate will list all key vaults, which causes the poor performance and Context Deadline Exceeded.

according to https://docs.microsoft.com/en-us/azure/key-vault/general/about-keys-secrets-certificates#objects-identifiers-and-versioning

the key vault name is global unique, and

An object identifier has the following general format:

https://{keyvault-name}.vault.azure.net/{object-type}/{object-name}/{object-version}

we could directly get baseURL from keyvault name and get keyvault name from baseURL.

AS for get keyvault id from name, I use resourcesClient instead of listing to improve the performance.

image
image

@ghost ghost added the size/XL label May 12, 2020
@tombuildsstuff
Copy link
Contributor

hi @njuCZ

Thanks for opening this PR.

Taking a look at #6196 - there's two reasons behind this issue:

  1. Terraform is returning a "Context Deadline Exceeded" since the default timeouts are being hit..

  2. .. because Azure's returning a HTTP 429 because of too many requests being sent (and so we're being forced to wait and retry) which in turn exceeds the default timeout

Prior to version 2.0 of the Azure Provider all resources shared the same timeout (3 hours) - which whilst convenient in some cases caused issues in others where some Azure Services take a long time to provision (e.g. App Service Environment).

Version 2.0 of the Azure Provider added support for Custom Timeouts on all resources - as such we've gone through all the Data Sources and Resources available in the Azure Provider and added reasonable-ish defaults for these, based on how long it takes to provision/destroy these resources (and we're updating this over time).

Whilst this approach may help a little in the short-term - in practice since this information isn't being cached it likely won't alleviate that too much. Instead we'd be better to take an approach similar to how Storage Accounts caching works by looking this information up and caching it - which would probably give us a small performance boost.

In Terraform we recommend using multiple smaller state files rather than one large state file - primarily to reduce the "blast radius" when something goes wrong - but also to avoid situations like this. Whilst one option would be to work through and add Custom Timeouts to all of the resources (as has been partially tried in #6196) - it's preferable to split this Terraform Configuration up so that it's not doing so much, which would solve this issue.

Whilst this approach may help to alleviate some of the pain in the short-term - even if we cached this we'd end up with the same issue longer-term albeit with a slightly higher threshold. As mentioned above, it should instead be possible to work around this by splitting this configuration into multiple State files - which is our recommendation here - and as such whilst I'd like to thank you for this contribution I'm going to close this PR for the moment.

Thanks!

@njuCZ
Copy link
Contributor Author

njuCZ commented May 14, 2020

@tombuildsstuff thanks for your opinion, that makes sense to me. So what about this issue #6196 ? should we tell the author to split the configuration file and close it ?

@ghost
Copy link

ghost commented Jun 11, 2020

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. If you feel I made an error 🤖 🙉 , please reach out to my human friends 👉 [email protected]. Thanks!

@ghost ghost locked and limited conversation to collaborators Jun 11, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

keyvault: caching of metadata required in larger terraform configurations
2 participants