Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(custom-resource-handler): auto-delete-[objects|images] breaks on cloudformation rollback #29581

Merged
merged 10 commits into from
Apr 19, 2024

Conversation

GavinZZ
Copy link
Contributor

@GavinZZ GavinZZ commented Mar 22, 2024

Issue # (if applicable)

Closes #27199

Reason for this change

Based on the way the custom resource is implemented, it is likely that unexpected behavior happens on Cloudformation rollback, i.e. the custom resource will prematurely delete the objects.

Consider the following scenario:

UPDATE target resource (replacement, creates a new resource)
UPDATE custom resource (old -> new, objects in old bucket are deleted)
(...stuff happens...)
ERROR, triggers a rollback
UPDATE custom resource (new -> old)
DELETE target resource (deletes the new resource, remembers the existing one)

We will have deleted objects in the bucket that has been rolled back to in this scenario, but the content is now gone.

Description of changes

Instead of deleting it right during update, we send back PhysicalResourceId in the event handler which if the id changes, it will let CFN to empty and delete the bucket at the end of the deployment.

Description of how you validated changes

New & updated tests. Also manually tested with deploying a template

const bucket = new s3.Bucket(this, 'Bucket', {
      removalPolicy: cdk.RemovalPolicy.DESTROY,
      bucketName: <a bucket name that's not used>,
      autoDeleteObjects: true,
    });

    // Intentionally failure since `mybucket-1` exists
    const bucket2 = new s3.Bucket(this, 'Bucket2', {
      removalPolicy: cdk.RemovalPolicy.DESTROY,
      bucketName: <a bucket name that's not used>,
    });

    bucket2.node.addDependency(bucket);

Once the deployment is successful, add some random content to the bucket, then update the code so that the first bucket's bucketName is updated to another valid name. Update the second bucket's bucketName to be an existing bucket name, which will trigger a deployment failure hence roll back.

After the change, the content will stay there if a deployment failure happens. The content & bucket will be deleted if deployment is successful.

Checklist


By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license

@github-actions github-actions bot added bug This issue is a bug. effort/small Small work item – less than a day of effort p1 labels Mar 22, 2024
@aws-cdk-automation aws-cdk-automation requested a review from a team March 22, 2024 17:49
@mergify mergify bot added the contribution/core This is a PR that came from AWS. label Mar 22, 2024
Copy link
Collaborator

@aws-cdk-automation aws-cdk-automation left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The pull request linter has failed. See the aws-cdk-automation comment below for failure reasons. If you believe this pull request should receive an exemption, please comment and provide a justification.

A comment requesting an exemption should contain the text Exemption Request. Additionally, if clarification is needed add Clarification Request to a comment.

@GavinZZ GavinZZ changed the title fix(s3|ecr|custom-resource-handler): auto-delete-[objects|images] breaks on cloudformation rollback fix(custom-resource-handler): auto-delete-[objects|images] breaks on cloudformation rollback Mar 22, 2024
@GavinZZ GavinZZ added the pr-linter/exempt-integ-test The PR linter will not require integ test changes label Mar 22, 2024
@aws-cdk-automation aws-cdk-automation dismissed their stale review March 22, 2024 18:13

✅ Updated pull request passes all PRLinter validations. Dismissing previous PRLinter review.

@GavinZZ GavinZZ force-pushed the yuanhaoz/custom_handler_rollback_fix branch 4 times, most recently from 2acaa3f to 16c524b Compare March 28, 2024 22:09
@GavinZZ GavinZZ marked this pull request as ready for review March 28, 2024 22:46
@GavinZZ GavinZZ force-pushed the yuanhaoz/custom_handler_rollback_fix branch from 16c524b to edc1af6 Compare April 18, 2024 22:24
from the event's PhysicalResourceId will trigger a `Delete` event for the custom
resource. The `Delete` event will trigger `onDelete` function which will
empty the content of the repository and then proceed to delete the repository. */
return { PhysicalResourceId: newRepositoryName };
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What was the original return value from onDelete() and how was it used after this function call? Or does the return value not matter and it was just to perform the deletion as part of thie onUpdate()?

Also where is the logic that checks if the repository name has changed or not and triggers the deletion of the bucket at the end of the deployment?

Copy link
Contributor Author

@GavinZZ GavinZZ Apr 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

onDelete doesn't return any value so it doesn't matter. We used to directly call the delete methods defined in onDelete not not using the default behaviour CFN provided (which is to do the deletion for us if the id changes). We didn't need to return anything before since we're explicitly calling the delete. Now we want to leverage the default behaviour of CFN custom resource so we returns the new physical id.

That logic is default with in CFN custom resource. The value returned for a PhysicalResourceId can change custom resource update operations. If the value returned is the same, it is considered a normal update. If the value returned is different, AWS CloudFormation recognizes the update as a replacement and sends a delete request to the old resource. . https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/crpg-ref-requesttypes-delete.html

@paulhcsun
Copy link
Contributor

@Mergifyio update

Copy link
Contributor

mergify bot commented Apr 18, 2024

update

☑️ Nothing to do

  • #commits-behind>0 [📌 update requirement]
  • -closed [📌 update requirement]
  • -conflict [📌 update requirement]
  • queue-position=-1 [📌 update requirement]

@aws-cdk-automation
Copy link
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: AutoBuildv2Project1C6BFA3F-wQm2hXv2jqQv
  • Commit ID: d5d4ff6
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

Copy link
Contributor

mergify bot commented Apr 19, 2024

Thank you for contributing! Your pull request will be updated from main and then merged automatically (do not update manually, and be sure to allow changes to be pushed to your fork).

@GavinZZ GavinZZ merged commit 69ea52f into main Apr 19, 2024
10 of 11 checks passed
@GavinZZ GavinZZ deleted the yuanhaoz/custom_handler_rollback_fix branch April 19, 2024 19:34
@aws aws locked as resolved and limited conversation to collaborators Jul 25, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
bug This issue is a bug. contribution/core This is a PR that came from AWS. effort/small Small work item – less than a day of effort p1 pr-linter/exempt-integ-test The PR linter will not require integ test changes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

s3|ecr: auto-delete-[objects|images] breaks on cloudformation rollback (suspected)
3 participants