core-services/prow/02_config: Rational Azure limits #12840
LGTM but has conflicts
Conflicts are with #12842. We can't move forward here until we figure out why the transition to static names didn't work.
In October, James Russell bumped the Azure limits to:
Central US:
2000 standardDSv3Family
400 LowPriorityCores (Total Regional Spot vCPUs) [1]
100 Public IPs
Our three other regions:
400 standardDSv3Family
200 LowPriorityCores (Total Regional Spot vCPUs)
100 Public IPs
Surpassing the DSv3 limits leads to errors like:
Code=OperationNotAllowed
Message=Operation could not be completed as it results in exceeding
approved standardDSv3Family Cores quota. Additional details -
Deployment Model: Resource Manager, Location: centralus, Current
Limit: 1000, Current Usage: 1000, Additional Required: 4,
(Minimum) New Limit Required: 1004. Submit a request for Quota
increase at...
Surpassing LowPriorityCores limits leads to errors like:
Code=OperationNotAllowed
Message=Operation could not be completed as it results in exceeding
approved LowPriorityCores quota. Additional details - Deployment
Model: Resource Manager, Location: eastus2, Current Limit: 10,
Current Usage: 8, Additional Required: 4, (Minimum) New Limit
Required: 12. Submit a request for Quota increase at ...
Surpassing public IP limits leads to errors like:
Code=PublicIPCountLimitReached
Message=Cannot create more than 50 public IP addresses for this
subscription in this region.
There are also "Standard Sku Public IP Addresses" and "Static Public
IP Addresses". The former lead to errors like:
Code=StandardSkuPublicIPCountLimitReached
Message=Cannot create more than 50 standard sku publicIpAddresses
for this subscription in this region.
I don't think I've ever seen the latter in CI, but the error is supposed to look like:
Code=StaticPublicIPCountLimitReached
Message=Cannot create more than 20 public IP addresses with static
allocation method for this subscription in this region.
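The two quota errors above share a fixed layout, so the numbers can be pulled out mechanically when triaging CI logs. A minimal sketch, assuming the message text looks like the samples quoted here; `parse_quota_error` and `QUOTA_RE` are hypothetical names for illustration, not part of any Azure tooling:

```python
import re

# Hypothetical pattern, derived only from the two OperationNotAllowed
# samples quoted above. "Cores" appears after the family name for
# standardDSv3Family but is part of the name for LowPriorityCores.
QUOTA_RE = re.compile(
    r"approved (?P<quota>\S+)(?: Cores)? quota\..*?"
    r"Location: (?P<location>\w+), "
    r"Current Limit: (?P<limit>\d+), "
    r"Current Usage: (?P<usage>\d+), "
    r"Additional Required: (?P<required>\d+)"
)


def parse_quota_error(message):
    """Return the quota name, location, and counts, or None if no match."""
    # Log lines are often wrapped, so collapse all whitespace runs first.
    flat = " ".join(message.split())
    match = QUOTA_RE.search(flat)
    if match is None:
        return None
    fields = match.groupdict()
    return {
        "quota": fields["quota"],
        "location": fields["location"],
        "limit": int(fields["limit"]),
        "usage": int(fields["usage"]),
        "required": int(fields["required"]),
    }


EXAMPLE = """Operation could not be completed as it results in exceeding
approved standardDSv3Family Cores quota. Additional details -
Deployment Model: Resource Manager, Location: centralus, Current
Limit: 1000, Current Usage: 1000, Additional Required: 4,
(Minimum) New Limit Required: 1004."""

print(parse_quota_error(EXAMPLE))
```

The public IP errors carry only a single limit in a different layout, so this sketch returns `None` for them.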
Current docs recommend 40 vCPU per 3-compute cluster [2] with 3 public
IP addresses [3], which makes for the following limits:
Central US:
2000 standardDSv3Family / 40 = 50 clusters
400 LowPriorityCores (Total Regional Spot vCPUs) [1] / 18 vCPUs per spot test = 22 clusters
100 Public IPs / 3 per cluster = 33 clusters
Our three other regions:
400 standardDSv3Family / 40 = 10 clusters
200 LowPriorityCores (Total Regional Spot vCPUs) / 18 vCPUs per spot test = 11 clusters
100 Public IPs / 3 per cluster = 33 clusters
Our default limits:
1000 VNets / 1 per cluster [4] = 1000 clusters
65536 network interfaces / 6+ per cluster [5] = 10+k clusters
5000 network security groups / 2+ per cluster [6] = 2+k clusters
1000 network load balancers / 3+ per cluster [7] = 300+ clusters
?? private IP addresses / 7 per cluster [8] = ?? clusters
18 vCPUs per spot test is from Joel Speed.
I don't know what our current private IP quota is; I guess we'll see
when we bump into it. Anyhow, the limit is 33 clusters for Central US
(capped by public IPs; most of our tests do not involve spot
instances) and 10 in each of the other regions (capped by
standardDSv3Family). The next thing to bump would be
standardDSv3Family in the other regions, followed by public IPs.
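The per-region cap is just the smallest quotient across the quotas that every cluster consumes. A minimal sketch of that arithmetic, using the figures above; the dictionary keys and function names are made up for illustration:

```python
# Per-cluster cost of each quota, from the docs cited above:
# 40 vCPUs and 3 public IPs per 3-compute cluster.
PER_CLUSTER = {"standardDSv3Family": 40, "PublicIPs": 3}

# Regional quotas as listed at the top of this message.
QUOTAS = {
    "Central US": {"standardDSv3Family": 2000, "PublicIPs": 100},
    "other regions": {"standardDSv3Family": 400, "PublicIPs": 100},
}

# Spot tests are counted separately: 18 vCPUs of LowPriorityCores each.
SPOT_VCPUS_PER_TEST = 18


def cluster_capacity(quota):
    """Return (max concurrent clusters, name of the quota that caps it)."""
    per_resource = {
        name: quota[name] // cost for name, cost in PER_CLUSTER.items()
    }
    bottleneck = min(per_resource, key=per_resource.get)
    return per_resource[bottleneck], bottleneck


def spot_capacity(low_priority_cores):
    """Return how many spot tests fit in a LowPriorityCores quota."""
    return low_priority_cores // SPOT_VCPUS_PER_TEST


for region, quota in QUOTAS.items():
    clusters, bottleneck = cluster_capacity(quota)
    print(f"{region}: {clusters} clusters, capped by {bottleneck}")
```

Raising a quota only helps until the next-smallest quotient takes over, which is why public IPs come right after standardDSv3Family for the other regions.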
[1]: https://bugzilla.redhat.com/show_bug.cgi?id=1888380
[2]: https://github.com/openshift/openshift-docs/blame/1338581a9d0c8e44aecf0a415f8d7a2a61d48df2/modules/installation-azure-limits.adoc#L33
[3]: https://github.com/openshift/openshift-docs/blame/1338581a9d0c8e44aecf0a415f8d7a2a61d48df2/modules/installation-azure-limits.adoc#L105-L110
[4]: https://github.com/openshift/openshift-docs/blame/1338581a9d0c8e44aecf0a415f8d7a2a61d48df2/modules/installation-azure-limits.adoc#L66-L68
[5]: https://github.com/openshift/openshift-docs/blame/1338581a9d0c8e44aecf0a415f8d7a2a61d48df2/modules/installation-azure-limits.adoc#L72-L75
[6]: https://github.com/openshift/openshift-docs/blame/1338581a9d0c8e44aecf0a415f8d7a2a61d48df2/modules/installation-azure-limits.adoc#L79-L83
[7]: https://github.com/openshift/openshift-docs/blame/1338581a9d0c8e44aecf0a415f8d7a2a61d48df2/modules/installation-azure-limits.adoc#L92-L102
[8]: https://github.com/openshift/openshift-docs/blame/1338581a9d0c8e44aecf0a415f8d7a2a61d48df2/modules/installation-azure-limits.adoc#L113-L116
Rebased around #14285 with 86f548040d -> 1b664cc. That also adds some handwaving around
petr-muller
left a comment
🤞
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: petr-muller, wking
@wking: Updated the following 2 configmaps: