azurerm_subnet.subnet: autorest:DoErrorUnlessStatusCode 429 PUT #7153

Closed
colemickens opened this issue Jun 13, 2016 · 11 comments · Fixed by #7307

Comments

@colemickens

Most of the time when deploying this terraform file, I get a 429 on one of the subnet operations.

Looks like the subnet is created before the NSG is fully finished provisioning. Retrying always allows me to proceed.

Terraform v0.7.0-rc1 (301da85f30239e87b30db254a25706a6d41c2522)

Affected Resource(s)

  • azurerm_subnet

Terraform Configuration Files

https://gist.githubusercontent.com/anonymous/0b0ffa731d79be1097127380479e2cff/raw/65bc650a553ea88a3d8aa907af1cd0df5e823fc7/azure.tf

Debug Output

[hopefully not necessary]

Standard Output

azurerm_storage_container.sc: Creation complete
Error applying plan:

1 error(s) occurred:

* azurerm_subnet.subnet: autorest:DoErrorUnlessStatusCode 429 PUT https://management.azure.com/subscriptions/27b750cd-ed43-42fd-9044-8d75e124ae55/resourceGroups/colemick-kube1/providers/Microsoft.Network/virtualnetworks/colemick-kube1-vnet/subnets/colemick-kube1-subnet?api-version=2015-06-15 failed with 429

Terraform does not automatically rollback in the face of errors.
Instead, your Terraform state file has been partially updated with
any resources that successfully completed. Please address the error
above and apply again to incrementally change your infrastructure.

Azure Error

This is from the body of the 429 back from Azure:

{
    "error": {
        "code": "RetryableError",
        "details": [
            {
                "code": "ReferencedResourceNotProvisioned",
                "message": "Cannot proceed with operation since resource /subscriptions/27b750cd-ed43-42fd-9044-8d75e124ae55/resourceGroups/colemick-kube1/providers/Microsoft.Network/networkSecurityGroups/colemick-kube1-nsg used by resource /subscriptions/27b750cd-ed43-42fd-9044-8d75e124ae55/resourceGroups/colemick-kube1/providers/Microsoft.Network/virtualNetworks/colemick-kube1-vnet/subnets/colemick-kube1-subnet is not in Succeeded state. Resource is in Updating state and the last operation that updated/is updating the resource is PutSecurityRuleOperation."
            }
        ],
        "message": "A retryable error occured."
    }
}
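
For anyone scripting around this, the body above can be inspected to tell a "dependency still provisioning" 429 apart from other failures. The following is a minimal Go sketch, not code from Terraform or the Azure SDK: the `azureError` type and the `isDependencyNotReady` helper are hypothetical names, and the struct covers only the fields visible in the response above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// azureError is a hypothetical type mirroring only the fields seen in the
// 429 body quoted in this issue; the real API may return more.
type azureError struct {
	Error struct {
		Code    string `json:"code"`
		Message string `json:"message"`
		Details []struct {
			Code    string `json:"code"`
			Message string `json:"message"`
		} `json:"details"`
	} `json:"error"`
}

// isDependencyNotReady reports whether body is a RetryableError whose details
// include ReferencedResourceNotProvisioned.
func isDependencyNotReady(body []byte) (bool, error) {
	var e azureError
	if err := json.Unmarshal(body, &e); err != nil {
		return false, err
	}
	if e.Error.Code != "RetryableError" {
		return false, nil
	}
	for _, d := range e.Error.Details {
		if d.Code == "ReferencedResourceNotProvisioned" {
			return true, nil
		}
	}
	return false, nil
}

func main() {
	// Shortened copy of the body above; the message text is elided here.
	body := []byte(`{"error":{"code":"RetryableError","details":[{"code":"ReferencedResourceNotProvisioned","message":"..."}],"message":"A retryable error occured."}}`)
	ok, err := isDependencyNotReady(body)
	fmt.Println(ok, err)
}
```

Checking the detail code rather than the HTTP status alone keeps this case separate from genuine rate limiting, which also uses 429.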
@colemickens
Author

(And sure enough that gist has my client_secret at the bottom. It's been revoked already in case anyone notices and is concerned.)

@jen20
Contributor

jen20 commented Jun 15, 2016

Hi @colemickens! This is actually a confusing error message I think: HTTP 429 is supposed to represent "Too Many Requests" - implying there is rate limiting taking place on the ARM API. My guess is that we need to implement some kind of catch-all system of allowing 429 to be caught and cause the provider to sleep. Ideally the SDK would do this - I can add it to the Riviera SDK used by some resources, but probably not to the official one, so we will need to work around this in Terraform.

@colemickens
Author

Hm. I'm not sure that's the case. At this point in the plan, it's only had to provision a few resources (vnet, subnet, route table), and when it chokes (about 50% of the time), it's always on this subnet resource. On the other hand, when I retry and it gets past the subnet, it immediately sends a much larger number of requests (within a couple of seconds of passing the subnet, it will have created 20 NICs and 20 VMs), and I never get this error even when creating a relatively large number of VMs.

Do you guys currently poll on the nsg creation operation? If you create the nsg and then the subnet too quickly, I think it will cause this error.

@ghost

ghost commented Jun 17, 2016

+1

@colemickens
Author

@stack72 Replying here to avoid 140 char limits.

Based on the error message, it looks like we are adding a rule to the SG. The SG then goes into an "Updating" state. Before the state is fully back to "Ready", we are trying to assign the SG to the subnet. The Subnet modification fails because it references a resource that is in flux.

So yes, I think that polling for completion of the SecurityGroupRulesClient CreateOrUpdate operation would be helpful.

The latest client should be doing this sort of polling automatically. If it's not, that's a bug on us.

@stack72
Contributor

stack72 commented Jun 23, 2016

Thanks for getting back to me

I have just opened a PR that will add some polling. We are using version v2.1.1-beta-8-gca4d906 of the SDK. When we upgraded to this version, we removed our own polling, so this could be an issue on either side.

P.

@stack72
Contributor

stack72 commented Jun 24, 2016

Closed via #7307

@ghost

ghost commented Jul 4, 2016

Hi @stack72

I've built from master and I'm getting the following:

* azurerm_network_interface.alm-build-server: network.InterfacesClient#CreateOrUpdate: Failure responding to request: StatusCode=429 -- Original Error: autorest/azure: Service returned an error. Status=429 Code="RetryableError" Message="A retryable error occured." Details=[{"code":"ReferencedResourceNotProvisioned","message":"Cannot proceed with operation since resource /subscriptions/XXXXXX/resourceGroups/core-develop/providers/Microsoft.Network/networkSecurityGroups/alm-build-server-security-group used by resource /subscriptions/XXXXXX/resourceGroups/core-develop/providers/Microsoft.Network/networkInterfaces/alm-build-server-0-nic is not in Succeeded state. Resource is in Updating state and the last operation that updated/is updating the resource is PutSecurityRuleOperation."}]
* azurerm_network_interface.alm-resource-server: network.InterfacesClient#CreateOrUpdate: Failure responding to request: StatusCode=429 -- Original Error: autorest/azure: Service returned an error. Status=429 Code="RetryableError" Message="A retryable error occured." Details=[{"code":"ReferencedResourceNotProvisioned","message":"Cannot proceed with operation since resource /subscriptions/XXXXXX/resourceGroups/core-develop/providers/Microsoft.Network/networkSecurityGroups/alm-resource-server-security-group used by resource /subscriptions/XXXXXX/resourceGroups/core-develop/providers/Microsoft.Network/networkInterfaces/alm-resource-server-0-nic is not in Succeeded state. Resource is in Updating state and the last operation that updated/is updating the resource is PutSecurityRuleOperation."}]
* azurerm_public_ip.alm-build-agent: network.PublicIPAddressesClient#CreateOrUpdate: Failure responding to request: StatusCode=429 -- Original Error: autorest/azure: Service returned an error. Status=429 Code="RetryableError" Message="A retryable error occured." Details=[{"Code":"GatewayError","Message":"Error occured in resource provider infrastructure services.","Target":null}]

This looks like it might be related - have you seen it?

@stack72
Contributor

stack72 commented Jul 6, 2016

Hi @Tasquith

It looks like the resource is not waiting until the operation has finished. I need to revisit some of the code in this area.

Paul

@zot24

zot24 commented Aug 2, 2016

Just to let you know, I'm getting the same error as @Tasquith!

I'm using v0.7.0-rc4

@ghost

ghost commented Apr 23, 2020

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@ghost ghost locked and limited conversation to collaborators Apr 23, 2020