Bug: Attempting to create a Virtual Node with ACM cert in Failed status returns HTTP 500 #180

Closed
ewbankkit opened this issue Mar 27, 2020 · 5 comments
Assignees: bigdefect
Labels: Bug

Comments

@ewbankkit

Summary
If I try to create an App Mesh Virtual Node with a TLS listener whose ACM certificate is in the Failed status, the API call returns an HTTP 500:

2020-03-27T16:27:41.200-0400 [DEBUG] plugin.terraform-provider-aws: 2020/03/27 16:27:41 [DEBUG] [aws-sdk-go] DEBUG: Request App Mesh/CreateVirtualNode Details:
2020-03-27T16:27:41.200-0400 [DEBUG] plugin.terraform-provider-aws: ---[ REQUEST POST-SIGN ]-----------------------------
2020-03-27T16:27:41.201-0400 [DEBUG] plugin.terraform-provider-aws: PUT /v20190125/meshes/test001/virtualNodes HTTP/1.1
2020-03-27T16:27:41.201-0400 [DEBUG] plugin.terraform-provider-aws: Host: appmesh.us-west-2.amazonaws.com
2020-03-27T16:27:41.201-0400 [DEBUG] plugin.terraform-provider-aws: User-Agent: aws-sdk-go/1.29.24 (go1.13.3; linux; amd64) APN/1.0 HashiCorp/1.0 Terraform/0.11+compatible (+https://www.terraform.io)
2020-03-27T16:27:41.201-0400 [DEBUG] plugin.terraform-provider-aws: Content-Length: 463
2020-03-27T16:27:41.201-0400 [DEBUG] plugin.terraform-provider-aws: Authorization: AWS4-HMAC-SHA256 Credential=AKIAIRV5XPPAX43M3Z5Q/20200327/us-west-2/appmesh/aws4_request, SignedHeaders=content-length;content-type;host;x-amz-date, Signature=**REDACTED**
2020-03-27T16:27:41.201-0400 [DEBUG] plugin.terraform-provider-aws: Content-Type: application/json
2020-03-27T16:27:41.201-0400 [DEBUG] plugin.terraform-provider-aws: X-Amz-Date: 20200327T202741Z
2020-03-27T16:27:41.201-0400 [DEBUG] plugin.terraform-provider-aws: Accept-Encoding: gzip
2020-03-27T16:27:41.201-0400 [DEBUG] plugin.terraform-provider-aws: 
2020-03-27T16:27:41.201-0400 [DEBUG] plugin.terraform-provider-aws: -----------------------------------------------------
2020-03-27T16:27:41.729-0400 [DEBUG] plugin.terraform-provider-aws: 2020/03/27 16:27:41 [DEBUG] [aws-sdk-go] DEBUG: Response App Mesh/CreateVirtualNode Details:
2020-03-27T16:27:41.729-0400 [DEBUG] plugin.terraform-provider-aws: ---[ RESPONSE ]--------------------------------------
2020-03-27T16:27:41.729-0400 [DEBUG] plugin.terraform-provider-aws: HTTP/1.1 500 Internal Server Error
2020-03-27T16:27:41.729-0400 [DEBUG] plugin.terraform-provider-aws: Connection: close
2020-03-27T16:27:41.729-0400 [DEBUG] plugin.terraform-provider-aws: Content-Length: 47
2020-03-27T16:27:41.729-0400 [DEBUG] plugin.terraform-provider-aws: Content-Type: application/json
2020-03-27T16:27:41.729-0400 [DEBUG] plugin.terraform-provider-aws: Date: Fri, 27 Mar 2020 20:27:41 GMT
2020-03-27T16:27:41.730-0400 [DEBUG] plugin.terraform-provider-aws: Server: envoy
2020-03-27T16:27:41.730-0400 [DEBUG] plugin.terraform-provider-aws: X-Amzn-Errortype: InternalServerErrorException:http://internal.amazon.com/coral/com.amazonaws.lattice.v20190125/
2020-03-27T16:27:41.730-0400 [DEBUG] plugin.terraform-provider-aws: X-Amzn-Requestid: 182a78e7-bc74-4175-b48b-193eece8e39c
2020-03-27T16:27:41.730-0400 [DEBUG] plugin.terraform-provider-aws: X-Envoy-Upstream-Service-Time: 124
2020-03-27T16:27:41.730-0400 [DEBUG] plugin.terraform-provider-aws: 
2020-03-27T16:27:41.730-0400 [DEBUG] plugin.terraform-provider-aws: 
2020-03-27T16:27:41.730-0400 [DEBUG] plugin.terraform-provider-aws: -----------------------------------------------------
2020-03-27T16:27:41.730-0400 [DEBUG] plugin.terraform-provider-aws: 2020/03/27 16:27:41 [DEBUG] [aws-sdk-go] {"message":"An internal server error occurred"}

Steps to Reproduce
Relevant Terraform code:

resource "aws_appmesh_virtual_node" "foo" {
  name      = "test003"
  mesh_name = "${aws_appmesh_mesh.test.id}"

  spec {
    backend {
      virtual_service {
        virtual_service_name = "servicea.simpleapp.local"
      }
    }

    listener {
      port_mapping {
        port     = 8080
        protocol = "http"
      }

      tls {
        certificate {
          acm {
            certificate_arn = "${aws_acm_certificate.cert.arn}"
          }
        }

        mode = "STRICT"
      }
    }

    service_discovery {
      dns {
        hostname = "serviceb.simpleapp.local"
      }
    }
  }
}

The ACM cert referenced by aws_acm_certificate.cert.arn is in Failed state (because the PCA hasn't had its CA cert installed yet).

Are you currently working around this issue?
Installing CA cert on the PCA and reissuing the cert.

Additional context
I have discovered this in the context of adding support to the Terraform AWS Provider: hashicorp/terraform-provider-aws#12541.

The current behaviour causes Terraform to go into a minutes-long retry loop that eventually fails with no useful error message.
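
A likely reason for the retry loop is that HTTP 5xx responses are conventionally treated as transient and retried, whereas most 4xx responses are surfaced to the caller right away. The Go sketch below illustrates that convention only. `shouldRetry` and `attemptsBeforeFailure` are hypothetical names, not functions from aws-sdk-go or the Terraform AWS Provider, and the real retry policy and retry count may differ.

```go
package main

import (
	"fmt"
	"net/http"
)

// shouldRetry reports whether a status code is conventionally treated as
// transient. Hypothetical helper for illustration; not an aws-sdk-go API.
func shouldRetry(status int) bool {
	if status == http.StatusTooManyRequests {
		return true
	}
	return status >= 500 && status <= 599
}

// attemptsBeforeFailure counts how many calls a client makes when every call
// returns the same status and at most maxRetries retries are allowed.
func attemptsBeforeFailure(status, maxRetries int) int {
	attempts := 1 // the initial call
	for attempts <= maxRetries && shouldRetry(status) {
		attempts++
	}
	return attempts
}

func main() {
	// A 500 uses up every retry; a 400 fails on the first call.
	fmt.Println(attemptsBeforeFailure(http.StatusInternalServerError, 3))
	fmt.Println(attemptsBeforeFailure(http.StatusBadRequest, 3))
}
```

Under this assumption, a validation failure reported as a 4xx would end the apply on the first call instead of after the retry budget is spent.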

ewbankkit added the Bug label Mar 27, 2020
bigdefect self-assigned this Mar 27, 2020
@bigdefect
Contributor

Thanks for the detail! I was just looking into this. So to confirm:

  1. Create a PCA, don't install a cert
  2. Attempt to issue a cert from that PCA; it should go into Failed but still generate an ARN
  3. Try to create a virtual node with that cert

That look right?

@ewbankkit
Author

@efe-selcuk Yes, correct.

@bigdefect
Contributor

Ok, I'm able to repro. Looks like we aren't correctly handling certificates that aren't in the ISSUED state. A fix is in progress; we'll throw a helpful error instead.

@bigdefect
Contributor

The fix is rolling out to production regions and should be global by the end of the week.

@bigdefect
Contributor

We had an unrelated hiccup delaying the deployment to me-south-1 and ap-east-1, but otherwise the fix is deployed globally.
