resource/aws_emr_cluster: fix error on missing cluster #16924

Merged
merged 2 commits into from
Jan 6, 2021

Conversation

@apetresc apetresc commented Dec 29, 2020

After a certain amount of time, destroyed EMR clusters completely disappear from AWS API output. They don't just display some deleted state; they're gone. DescribeCluster calls that specifically target their ID return HTTP status 400 with ErrorCode: "NoSuchCluster".

Currently, if you have one of these clusters in a statefile and the state is never refreshed between the time the cluster was deleted and the time it is purged from the API, Terraform fails the next time you try to run it, with an error such as this one:

-----------------------------------------------------: timestamp=2020-12-29T15:33:16.926-0500
2020-12-29T15:33:16.926-0500 [INFO]  plugin.terraform-provider-aws: 2020/12/29 15:33:16 [DEBUG] [aws-sdk-go] {"__type":"InvalidRequestException","ErrorCode":"NoSuchCluster","Message":"Cluster id 'j-38DWY37V6BEMY' is not valid."}: timestamp=2020-12-29T15:33:16.926-0500
2020-12-29T15:33:16.926-0500 [INFO]  plugin.terraform-provider-aws: 2020/12/29 15:33:16 [DEBUG] [aws-sdk-go] DEBUG: Validate Response elasticmapreduce/DescribeCluster failed, attempt 0/25, error InvalidRequestException: Cluster id 'j-38DWY37V6BEMY' is not valid.
{
  RespMetadata: {
    StatusCode: 400,
    RequestID: "981d28dd-c27d-48cc-b351-b68edaaae134"
  },
  ErrorCode: "NoSuchCluster",
  Message_: "Cluster id 'j-38DWY37V6BEMY' is not valid."
}: timestamp=2020-12-29T15:33:16.926-0500
2020/12/29 15:33:16 [TRACE] vertex "aws_emr_cluster.cluster": visit complete
2020/12/29 15:33:16 [TRACE] vertex "aws_emr_cluster.cluster": dynamic subgraph encountered errors
2020/12/29 15:33:16 [TRACE] vertex "aws_emr_cluster.cluster": visit complete
2020/12/29 15:33:16 [TRACE] vertex "aws_emr_cluster.cluster (expand)": dynamic subgraph encountered errors
2020/12/29 15:33:16 [TRACE] vertex "aws_emr_cluster.cluster (expand)": visit complete
2020/12/29 15:33:16 [TRACE] dag/walk: upstream of "meta.count-boundary (EachMode fixup)" errored, so skipping
2020/12/29 15:33:16 [TRACE] dag/walk: upstream of "provider[\"registry.terraform.io/hashicorp/aws\"] (close)" errored, so skipping
2020/12/29 15:33:16 [TRACE] dag/walk: upstream of "root" errored, so skipping
2020/12/29 15:33:16 [INFO] backend/local: plan operation completed

This has been reported in, e.g., #7783.

This patch fixes that.

  • Tested manually.

Community Note

  • Please vote on this pull request by adding a 👍 reaction to the original pull request comment to help the community and maintainers prioritize this request
  • Please do not leave "+1" or other comments that do not add relevant new information or questions, they generate extra noise for pull request followers and do not help prioritize the request

Closes #7783.

Release note for CHANGELOG:

Fixes a crash when planning with a statefile that includes a very old deleted cluster

@apetresc requested a review from a team as a code owner December 29, 2020 22:47
@ghost added the size/XS and service/emr labels Dec 29, 2020
@github-actions bot added the needs-triage label Dec 29, 2020
@apetresc

(Note: the build updated a transitive dependency in go.mod/go.sum. I'm not sure whether those changes are meant to be included in the PR or whether those files get updated periodically on master. Let me know if I should amend it! 🙂)

@bflad added the bug label and removed the needs-triage label Jan 4, 2021
@bflad self-assigned this Jan 4, 2021
@bflad added this to the v3.23.0 milestone Jan 4, 2021

@bflad bflad left a comment

Will add a code comment in there since the message matching could mask some legitimate request errors, but as-is this follows our typical handling of this situation. Thanks, @apetresc 🚀

Output from acceptance testing:

--- PASS: TestAccAWSEMRCluster_configurationsJson (414.06s)
--- PASS: TestAccAWSEMRCluster_disappears (458.00s)
--- PASS: TestAccAWSEMRCluster_Kerberos_ClusterDedicatedKdc (458.66s)
--- PASS: TestAccAWSEMRCluster_additionalInfo (468.10s)
--- PASS: TestAccAWSEMRCluster_security_config (543.09s)
--- PASS: TestAccAWSEMRCluster_CoreInstanceGroup_InstanceCount (551.34s)
--- PASS: TestAccAWSEMRCluster_basic (611.02s)
--- PASS: TestAccAWSEMRCluster_CoreInstanceGroup_AutoscalingPolicy (643.71s)
--- PASS: TestAccAWSEMRCluster_Step_Multiple (682.41s)
--- PASS: TestAccAWSEMRCluster_keepJob (413.06s)
--- PASS: TestAccAWSEMRCluster_terminationProtected (466.42s)
--- PASS: TestAccAWSEMRCluster_Step_Basic (922.25s)
--- PASS: TestAccAWSEMRCluster_CoreInstanceGroup_InstanceType (922.69s)
--- PASS: TestAccAWSEMRCluster_Ec2Attributes_DefaultManagedSecurityGroups (923.50s)
--- PASS: TestAccAWSEMRCluster_MasterInstanceGroup_BidPrice (958.49s)
--- PASS: TestAccAWSEMRCluster_Step_ConfigMode (966.07s)
--- PASS: TestAccAWSEMRCluster_MasterInstanceGroup_InstanceType (987.75s)
--- PASS: TestAccAWSEMRCluster_MasterInstanceGroup_Name (989.32s)
--- PASS: TestAccAWSEMRCluster_CoreInstanceGroup_BidPrice (1020.07s)
--- PASS: TestAccAWSEMRCluster_CoreInstanceGroup_Name (1030.37s)
--- PASS: TestAccAWSEMRCluster_s3Logging (562.82s)
--- PASS: TestAccAWSEMRCluster_step_concurrency_level (471.16s)
--- PASS: TestAccAWSEMRCluster_ebs_config (449.44s)
--- PASS: TestAccAWSEMRCluster_MasterInstanceGroup_InstanceCount (1117.85s)
--- PASS: TestAccAWSEMRCluster_custom_ami_id (475.35s)
--- PASS: TestAccAWSEMRCluster_visibleToAllUsers (700.80s)
--- PASS: TestAccAWSEMRCluster_tags (734.40s)
--- PASS: TestAccAWSEMRCluster_root_volume_size (754.76s)
--- PASS: TestAccAWSEMRCluster_instance_fleet_master_only (456.76s)
--- PASS: TestAccAWSEMRCluster_bootstrap_ordering (1343.54s)
--- PASS: TestAccAWSEMRCluster_instance_fleet (563.45s)

aws/resource_aws_emr_cluster.go (review thread resolved)
@bflad bflad merged commit a46304c into hashicorp:master Jan 6, 2021
bflad added a commit that referenced this pull request Jan 6, 2021
ghost commented Jan 8, 2021

This has been released in version 3.23.0 of the Terraform AWS provider. Please see the Terraform documentation on provider versioning or reach out if you need any assistance upgrading.

For further feature requests or bug reports with this functionality, please create a new GitHub issue following the template for triage. Thanks!

ghost commented Feb 5, 2021

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. Thanks!

@ghost ghost locked as resolved and limited conversation to collaborators Feb 5, 2021
Linked issue that merging this pull request may close:

emr: terraform state containing very old terminated cluster causes plan to fail