Error finding VLAN order: couldn't find resource (21 retries) #2613

Open
martinohansen opened this issue May 12, 2021 · 15 comments
Labels
service/Classic Infrastructure Issues related to classic Infrastructure

Comments

@martinohansen
martinohansen commented May 12, 2021

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or other comments that do not add relevant new information or questions, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Terraform CLI and Terraform IBM Provider Version

$ ./terraform -v
Terraform v0.13.2
+ provider registry.terraform.io/ibm-cloud/ibm v1.24.0

Affected Resource(s)

  • ibm_network_vlan

Terraform Configuration Files

Please include all Terraform configurations required to reproduce the bug. Bug reports without a functional reproduction may be closed without investigation.

resource "ibm_network_vlan" "private_vlans" {
  name            = "test.bcr01a.seo01"
  datacenter      = "seo01"
  type            = "PRIVATE"
  router_hostname = "bcr01a.seo01"
  tags            = ["test"]
}

Debug Output

2021-05-12T14:22:13.085+0200 [DEBUG] plugin.terraform-provider-ibm_v1.24.0: 2021/05/12 14:22:13 [DEBUG] Request URL:  GET https://api.softlayer.com/rest/v3/SoftLayer_Account/getNetworkVlans.json?objectFilter=%7B%22networkVlans%22%3A%7B%22billingItem%22%3A%7B%22orderItem%22%3A%7B%22order%22%3A%7B%22id%22%3A%7B%22operation%22%3A%2278563420%22%7D%7D%7D%7D%7D%7D&objectMask=id
2021-05-12T14:22:13.085+0200 [DEBUG] plugin.terraform-provider-ibm_v1.24.0: 2021/05/12 14:22:13 [DEBUG] Parameters:
module.iks.ibm_network_vlan.private_vlans["seo01"]: Still creating... [4m0s elapsed]
2021-05-12T14:22:14.381+0200 [DEBUG] plugin.terraform-provider-ibm_v1.24.0: 2021/05/12 14:22:14 [DEBUG] Status Code:  200
2021-05-12T14:22:14.381+0200 [DEBUG] plugin.terraform-provider-ibm_v1.24.0: 2021/05/12 14:22:14 [DEBUG] Response:  []
2021/05/12 14:22:14 [DEBUG] module.iks.ibm_network_vlan.private_vlans["seo01"]: apply errored, but we're indicating that via the Error pointer rather than returning it: Error finding VLAN order 78563420: couldn't find resource (21 retries)
2021/05/12 14:22:14 [ERROR] eval: *terraform.EvalApplyPost, err: Error finding VLAN order 78563420: couldn't find resource (21 retries)
2021/05/12 14:22:14 [ERROR] eval: *terraform.EvalSequence, err: Error finding VLAN order 78563420: couldn't find resource (21 retries)

[...]
Error: Error finding VLAN order 78563420: couldn't find resource (21 retries)

  on modules/iks/main.tf line 12, in resource "ibm_network_vlan" "private_vlans":
  12: resource "ibm_network_vlan" "private_vlans" {

Expected Behavior

We would expect the resource to get created within an adjustable timeout.

Actual Behavior

The resource never gets created, and its timeout and retry count are not adjustable. I've tried setting the following timeouts and retries, without effect: IBMCLOUD_TIMEOUT=300 IAAS_CLASSIC_TIMEOUT=300 MAX_RETRIES=30

Using the IBM Cloud CLI, I can indeed replicate the behavior:

# During the Terraform apply
$ ibmcloud sl call-api SoftLayer_Account getNetworkVlans --filter '{"networkVlans":{"billingItem":{"orderItem":{"order":{"id":{"operation":"78563420"}}}}}}'
[]%

# A variable time after Terraform failed -- usually a few minutes.
$ ic sl call-api SoftLayer_Account getNetworkVlans --filter '{"networkVlans":{"billingItem":{"orderItem":{"order":{"id":{"operation":"78563420"}}}}}}'
[
	{
		"accountId": 2257146,
		"id": 3087952,
		"modifyDate": "2021-05-12T05:47:43-06:00",
		"primarySubnetId": 1354559,
		"vlanNumber": 1147
	}
]%
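
For anyone who wants to poll the same endpoint outside Terraform or the CLI, here is a minimal Python sketch that rebuilds the request URL seen in the debug output from an order ID. The helper names `vlan_order_filter` and `vlan_order_url` are made up for illustration; only the endpoint, the filter shape, and `objectMask=id` come from the log above.

```python
import json
from urllib.parse import quote

# Endpoint exactly as it appears in the provider's debug output.
BASE_URL = "https://api.softlayer.com/rest/v3/SoftLayer_Account/getNetworkVlans.json"


def vlan_order_filter(order_id: int) -> dict:
    """Object filter selecting VLANs whose billing item belongs to the order."""
    return {
        "networkVlans": {
            "billingItem": {
                "orderItem": {"order": {"id": {"operation": str(order_id)}}}
            }
        }
    }


def vlan_order_url(order_id: int) -> str:
    """Build the GET URL with a compact, percent-encoded objectFilter."""
    compact = json.dumps(vlan_order_filter(order_id), separators=(",", ":"))
    return f"{BASE_URL}?objectFilter={quote(compact, safe='')}&objectMask=id"


if __name__ == "__main__":
    print(vlan_order_url(78563420))
```

The sketch only builds the URL; it does not send anything. An empty list in the response means the order has not yet produced a VLAN.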

Steps to Reproduce

Execute terraform apply with the above ibm_network_vlan resource.

kavya498 added the service/Classic Infrastructure label May 12, 2021
@hkantare
Collaborator

@martinohansen Can you please share the complete log with us? We are not able to reproduce this on our end.

2021-05-18T11:31:20.623+0530 [DEBUG] plugin.terraform-provider-ibm: 2021/05/18 11:31:20 [DEBUG] Response:  true
2021-05-18T11:31:20.623+0530 [DEBUG] plugin.terraform-provider-ibm: 2021/05/18 11:31:20 [DEBUG] Request URL:  GET https://api.softlayer.com/rest/v3/SoftLayer_Network_Vlan/3090826.json?objectMask=mask%5Bid%2Cname%2CprimaryRouter%5Bdatacenter%5Bname%5D%5D%2CprimaryRouter%5Bhostname%5D%2CvlanNumber%2CbillingItem%5BrecurringFee%5D%2CguestNetworkComponentCount%2Csubnets%5BnetworkIdentifier%2Ccidr%2CsubnetType%5D%2CtagReferences%5Bid%2Ctag%5Bname%5D%5D%5D
2021-05-18T11:31:20.623+0530 [DEBUG] plugin.terraform-provider-ibm: 2021/05/18 11:31:20 [DEBUG] Parameters:  
ibm_network_vlan.private_vlans: Still creating... [30s elapsed]
2021-05-18T11:31:21.901+0530 [DEBUG] plugin.terraform-provider-ibm: 2021/05/18 11:31:21 [DEBUG] Status Code:  200
2021-05-18T11:31:21.901+0530 [DEBUG] plugin.terraform-provider-ibm: 2021/05/18 11:31:21 [DEBUG] Response:  {"id":3090826,"name":"test.bcr01a.seo01","vlanNumber":1212,"guestNetworkComponentCount":0,"billingItem":{"recurringFee":"16.5"},"primaryRouter":{"hostname":"bcr01a.seo01","datacenter":{"name":"seo01"}},"subnets":[],"tagReferences":[{"id":665421164,"tag":{"name":"test"}}]}
ibm_network_vlan.private_vlans: Creation complete after 30s [id=3090826]
2021-05-18T11:31:21.924+0530 [WARN]  plugin.stdio: received EOF, stopping recv loop: err="rpc error: code = Unavailable desc = transport is closing"
2021-05-18T11:31:21.930+0530 [DEBUG] plugin: plugin process exited: path=/Users/harinireddy/terraform/terraform-provider-ibm pid=58288
2021-05-18T11:31:21.930+0530 [DEBUG] plugin: plugin exited

Apply complete! Resources: 1 added, 0 changed, 0 destroyed.

With the sample configuration below:

resource "ibm_network_vlan" "private_vlans" {
  name            = "test.bcr01a.seo01"
  datacenter      = "seo01"
  type            = "PRIVATE"
  router_hostname = "bcr01a.seo01"
  tags            = ["test"]
}

@martinohansen
Author

martinohansen commented May 18, 2021

Interesting, here is the complete log.

@hkantare
Collaborator

Can you please try opening this URL in a browser and share the output with us?
(Use the same classic credentials, username and API key (as the password), when prompted.)
https://api.softlayer.com/rest/v3/SoftLayer_Account/getNetworkVlans.json?objectFilter=%7B%22networkVlans%22%3A%7B%22billingItem%22%3A%7B%22orderItem%22%3A%7B%22order%22%3A%7B%22id%22%3A%7B%22operation%22%3A%2279022860%22%7D%7D%7D%7D%7D%7D&objectMask=id

We should see some output like the below (the id will be different):

[{"id":3090842}]
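
The browser prompt mentioned above is HTTP Basic authentication, with the classic username as the user and the API key as the password. A minimal Python sketch of the same request follows, prepared but not sent. The function name `softlayer_request` and the placeholder credentials are hypothetical.

```python
import base64
from urllib.request import Request


def softlayer_request(url: str, username: str, api_key: str) -> Request:
    """Prepare (but do not send) a GET request using HTTP Basic authentication."""
    # Basic auth is base64("username:password"); the API key acts as the password.
    token = base64.b64encode(f"{username}:{api_key}".encode("utf-8")).decode("ascii")
    return Request(url, headers={"Authorization": f"Basic {token}"})


if __name__ == "__main__":
    # Placeholder URL and credentials, for illustration only.
    req = softlayer_request("https://example.invalid/x.json", "user", "key")
    print(req.get_method(), req.full_url)
```

To actually send it, the prepared request can be passed to `urllib.request.urlopen`.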

@martinohansen
Author

That works: [{"id":3090846}]. But is it not more or less the same as I did here?

# During the Terraform apply
$ ibmcloud sl call-api SoftLayer_Account getNetworkVlans --filter '{"networkVlans":{"billingItem":{"orderItem":{"order":{"id":{"operation":"78563420"}}}}}}'
[]%

# A variable time after Terraform failed -- usually a few minutes.
$ ic sl call-api SoftLayer_Account getNetworkVlans --filter '{"networkVlans":{"billingItem":{"orderItem":{"order":{"id":{"operation":"78563420"}}}}}}'
[
	{
		"accountId": 2257146,
		"id": 3087952,
		"modifyDate": "2021-05-12T05:47:43-06:00",
		"primarySubnetId": 1354559,
		"vlanNumber": 1147
	}
]%

@martinohansen
Author

Also, this might be relevant. I tried importing the state after Terraform failed and the VLAN order went through. The subsequent apply has this diff:

  # module.iks.ibm_network_vlan.private_vlans["seo01"] will be updated in-place
  ~ resource "ibm_network_vlan" "private_vlans" {
        child_resource_count    = 0
        datacenter              = "seo01"
        id                      = "3090266"
      + name                    = "test.bcr01a.seo01"
        resource_controller_url = "https://cloud.ibm.com/classic/network/vlans/3090266"
        router_hostname         = "bcr01a.seo01"
        softlayer_managed       = false
        subnets                 = []
      + tags                    = [
          + "test",
        ]
        type                    = "PRIVATE"
        vlan_number             = 1148

        timeouts {}
    }

So it seems the VLAN gets created but is missing the tag and name.

hkantare added a commit to hkantare/terraform-provider-ibm that referenced this issue May 18, 2021
hkantare added a commit that referenced this issue May 18, 2021
@hkantare
Collaborator

When we googled, we found similar issues opened:
hashicorp/terraform-provider-rancher#45

We similarly increased NotFoundChecks, but we are not sure whether it will fix the issue on your end since we are not able to replicate it.

Can you please try with the latest release, v1.25.0:
https://github.com/IBM-Cloud/terraform-provider-ibm/releases/tag/v1.25.0
https://registry.terraform.io/providers/IBM-Cloud/ibm/latest
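
To make the NotFoundChecks idea concrete, here is a simplified Python simulation of a poll loop that gives up once the resource has been absent for more than a fixed number of checks. This is not the provider's code. It assumes a default of 20 checks, which would be consistent with the "(21 retries)" in the error message, and it skips the sleep between polls. All names (`wait_for_resource`, `make_refresh`, `NotFoundError`) are invented for this sketch.

```python
class NotFoundError(Exception):
    """Raised when the resource stays absent for too many polls."""


def wait_for_resource(refresh, not_found_checks=20, max_polls=1000):
    """Poll `refresh` until it returns a value.

    `refresh` returns None while the resource is absent. Once the number of
    misses exceeds `not_found_checks`, the wait is aborted.
    """
    misses = 0
    for _ in range(max_polls):
        result = refresh()
        if result is not None:
            return result
        misses += 1
        if misses > not_found_checks:
            raise NotFoundError(f"couldn't find resource ({misses} retries)")
    raise TimeoutError("gave up after max_polls polls")


def make_refresh(appears_after):
    """Fake backend: returns nothing for the first `appears_after` calls."""
    calls = {"n": 0}

    def refresh():
        calls["n"] += 1
        return {"id": 3087952} if calls["n"] > appears_after else None

    return refresh


if __name__ == "__main__":
    print(wait_for_resource(make_refresh(5)))
    try:
        wait_for_resource(make_refresh(100))
    except NotFoundError as err:
        print(err)
```

With a backend that takes longer than the allowed number of checks, the loop fails even though the resource would eventually appear, which matches the behavior reported in this issue. Raising `not_found_checks` lets the same call succeed.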

@martinohansen
Author

martinohansen commented May 18, 2021

Looks like 10m wasn't enough in the first run, but a 2nd run finished within ~3 minutes. Not sure what we can expect from the backend here, but I'm gonna play a little more with it to see how often it fails.

Error: Error finding VLAN order 79049398: timeout while waiting for state to become 'complete' (last state: 'pending', timeout: 10m0s)

  on modules/iks/main.tf line 12, in resource "ibm_network_vlan" "private_vlans":
  12: resource "ibm_network_vlan" "private_vlans" {

Thanks for pushing a version so quickly 🙏

Edit: it took 15 min for the first VLAN to get approved.

@martinohansen
Author

FYI I just tested another deployment, it took 82 minutes for the VLAN to get approved and during that period the API call returned an empty response. It was order #79096162 and #79095256.

@hkantare
Copy link
Collaborator

@martinohansen Could you raise a ticket with the SoftLayer API team asking why the provisioning is taking so long?

@martinohansen
Author

martinohansen commented May 19, 2021

Will do 🙏

Update: ticket number CS2311670

@martinohansen
Author

Could we make the timeout adjustable by the consumer at runtime?

@hkantare
Collaborator

Yes, we have customizable timeouts that users can define in their templates (https://registry.terraform.io/providers/IBM-Cloud/ibm/latest/docs/resources/resource_instance#timeouts):

resource "ibm_network_vlan" "private_vlans" {
  name            = "test.bcr01a.seo01"
  datacenter      = "seo01"
  type            = "PRIVATE"
  router_hostname = "bcr01a.seo01"
  tags            = ["test"]
  // User can increase timeouts
  timeouts {
    create = "1h"
    delete = "15m"
  }
}

@martinohansen
Author

But they don't change the NotFoundChecks AFAIK?

@hkantare
Collaborator

Yes, those timeouts can't change the NotFoundChecks. We are still looking into the code to see how we can control NotFoundChecks. Meanwhile, did you get any update from the Classic Infrastructure team on why it takes around 80 minutes for VLAN approval?
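
A rough back-of-the-envelope sketch in Python of why a longer `create` timeout alone does not help while the not-found limit stays fixed. The 10-second poll interval is an assumption chosen for illustration, not a value taken from the provider, and the model assumes evenly spaced polls starting one interval after the wait begins. Both function names are hypothetical.

```python
import math


def first_limit_hit(create_timeout_s, poll_interval_s, not_found_checks):
    """Return which limit ends the wait first if the resource never appears."""
    # The counter trips on the poll after `not_found_checks` misses.
    not_found_at_s = (not_found_checks + 1) * poll_interval_s
    if not_found_at_s < create_timeout_s:
        return ("not_found", not_found_at_s)
    return ("timeout", create_timeout_s)


def checks_needed(wait_s, poll_interval_s):
    """A not_found_checks value large enough to tolerate `wait_s` of absence."""
    return math.ceil(wait_s / poll_interval_s)


if __name__ == "__main__":
    # A 1h create timeout with 20 not-found checks at an assumed 10s interval.
    print(first_limit_hit(3600, 10, 20))
    # Checks required to ride out the 82-minute approval reported above.
    print(checks_needed(82 * 60, 10))
```

Under these assumptions the not-found limit trips within a few minutes regardless of the configured `create` timeout, so both limits would need to be raised to cover an approval delay on the order of an hour.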

@martinohansen
Author

Ticket is still in-progress, no real reply yet.
