Issue destroying EMR on AWS #3465

Closed
ghost opened this issue Feb 20, 2018 · 12 comments · Fixed by #12578
Labels
bug Addresses a defect in current functionality. service/emr Issues and PRs that pertain to the emr service.

Comments

@ghost

ghost commented Feb 20, 2018

This issue was originally opened by @miloup as hashicorp/terraform#17330. It was migrated here as a result of the provider split. The original body of the issue is below.


Hi,

Many of us, across multiple companies, are facing an issue while trying to destroy an EMR infrastructure on AWS.
The Terraform folder has an emr-requirements.tf file which contains the security groups, security configuration, etc., and an emr.tf file which creates the cluster using the configuration in emr-requirements.tf.

When running "terraform apply", the infrastructure is successfully created, but when running "terraform destroy", it seems that Terraform does not wait for the EMR cluster to terminate before destroying the remaining resources, which leads to a failure (timeout) because those resources depend on the cluster. The only way to get a clean "destroy" is to make sure the EMR cluster is terminated (by checking the AWS console, for instance) and then run "terraform destroy" again; at that point all the remaining resources are destroyed.

Would you please fix this bug?

Thanks

@bflad added the bug and service/emr labels Feb 21, 2018
@FireballDWF

I've observed this behavior as well.

@simonvanderveldt
Contributor

We're experiencing the same issue, currently only with a security configuration. It seems like Terraform thinks the EMR cluster has been removed and continues removing the other resources that the cluster depended on, but the cluster isn't actually gone yet, so destroying the security configuration fails because it's still in use.

@Paul-Verardi

We are facing the same issue as well. For the time being, our "solution" has been:

  1. Run Targeted destroy on EMR cluster only
  2. Wait until EMR cluster has been reported as terminated in the console
  3. Run normal destroy to delete the remaining resources

This gives an error-free execution flow, but it may be hard to fully automate with a script: step 2 above would require a CLI command to determine whether the EMR cluster is terminated, and I believe that CLI call is the same one that is reporting back to Terraform prematurely. A rough sketch of such a script is below.
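
A minimal, untested sketch of that flow, assuming the cluster resource is addressed as aws_emr_cluster.cluster and that the cluster id is known up front (both are placeholders), and assuming the AWS CLI waiter reports the cluster state accurately:

#!/bin/bash
set -e

# Cluster id of the running EMR cluster; placeholder value, capture it from a
# Terraform output or the AWS console before destroying anything.
CLUSTER_ID="j-XXXXXXXXXXXXX"

# 1. Destroy only the EMR cluster (resource address is illustrative).
terraform destroy -target=aws_emr_cluster.cluster

# 2. Wait until AWS actually reports the cluster as terminated.
aws emr wait cluster-terminated --cluster-id "$CLUSTER_ID"

# 3. Destroy the remaining resources.
terraform destroy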

@mgasner

mgasner commented Nov 19, 2018

👍

@miloup

miloup commented Nov 19, 2018

Another way to do it is to use a shell script with a retry loop, knowing that terraform returns a "0" exit code when it finishes without errors.

#!/bin/bash

terraform destroy
RETURN_CODE=$?
LOOP_COUNT=10      # retry up to 10 more times (11 attempts in total)

while [ "$RETURN_CODE" -ne 0 ] && [ "$LOOP_COUNT" -gt 0 ]; do
  LOOP_COUNT=$((LOOP_COUNT-1))
  echo "Failed to destroy the cluster. Retrying..."
  terraform destroy
  RETURN_CODE=$?
done

# Display an error message if the destruction still fails after all attempts.
if [ "$RETURN_CODE" -ne 0 ]; then
  echo "Failed to destroy the cluster after 11 attempts. Exiting."
  exit 1
fi

@ahujarajesh

Hi,

Any update on when this will be fixed?
My cluster gets recreated every time :(

@plainsane

I'm having this issue as well when I change the security configuration. Terraform tries to destroy the old one and AWS reports it's still in use; if I wait 5 seconds and run again, all is well.

@aaditi30

aaditi30 commented Mar 3, 2020

We have been facing this issue as well while destroying EMR with Terraform.
On the first destroy we get errors like:
aws_emr_security_configuration.security_config: InvalidRequestException: Security configuration 'clustter-name-security-config' cannot be deleted because it is in use by active clusters
If we delete the cluster first and run another destroy at the end, after the cluster has been destroyed,
that produces even more module-level errors:
* module.stack_module.output.master_lb_dns: Resource 'aws_lb.master_lb' does not have attribute 'dns_name' for variable 'aws_lb.master_lb.dns_name'
* module.stack_module.output.master_public_dns: Resource 'aws_emr_cluster.cluster' does not have attribute 'master_public_dns' for variable 'aws_emr_cluster.cluster.master_public_dns'
* module.stack_module.output.master_private_ip: Resource 'data.aws_instance.master_node' does not have attribute 'private_ip' for variable 'data.aws_instance.master_node.private_ip'
* module.stack_module.output.lb_security_group_id: Resource 'aws_security_group.emr_lb_security_group' does not have attribute 'id' for variable 'aws_security_group.emr_lb_security_group.id'
* module.stack_module.output.name: Resource 'aws_emr_cluster.cluster' does not have attribute 'name' for variable 'aws_emr_cluster.cluster.name'
* module.stack_module.output.master_url: Resource 'aws_route53_record.lb_cname' does not have attribute 'fqdn' for variable 'aws_route53_record.lb_cname.fqdn'
* module.stack_module.output.id: Resource 'aws_emr_cluster.cluster' does not have attribute 'id' for variable 'aws_emr_cluster.cluster.id'

So we can never get an error-free workflow. The cluster does get destroyed, though, so it isn't blocking our work.

We use this Terraform version:
Terraform v0.11.14

  • provider.aws v2.51.0
  • provider.template v2.1.2

Is this issue fixed in Terraform 0.12? Is there a timeline for when it will be fixed, or any workaround we can use in the meantime?

@annyip

annyip commented Nov 16, 2020

any updates on this?

@annyip

annyip commented Dec 3, 2020

Seems like a temporary fix is to use a local-exec provisioner that issues a sleep, and hope for the best that the cluster is done terminating before Terraform destroys the security configuration. Something like the sketch below.
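
A minimal sketch of that workaround, assuming Terraform 0.12+ syntax; the resource name, configuration file path, and sleep duration are all illustrative guesses:

# Destroy-time provisioner: runs just before Terraform deletes the security
# configuration, i.e. after the dependent EMR cluster's destroy call has
# already returned, giving the cluster extra time to finish terminating.
resource "aws_emr_security_configuration" "security_config" {
  name          = "emr-security-config"
  configuration = file("${path.module}/security-config.json")

  provisioner "local-exec" {
    when    = destroy
    command = "sleep 300"
  }
}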

@github-actions

This functionality has been released in v3.70.0 of the Terraform AWS Provider. Please see the Terraform documentation on provider versioning or reach out if you need any assistance upgrading.

For further feature requests or bug reports with this functionality, please create a new GitHub issue following the template. Thank you!

@github-actions

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions bot locked as resolved and limited conversation to collaborators May 24, 2022