@djglaser djglaser commented Jul 17, 2025

Rollback Plan

If a change needs to be reverted, we will publish an updated version of the library.

Changes to Security Controls

Are there any changes to security controls (access controls, encryption, logging) in this pull request? If so, explain.
No

Description

Adds support for blue/green deployment in ECS Service, enabling users to perform controlled production releases with testing and validation.

Key Changes:

  • Added new deployment configuration attributes:
    • strategy - Supports BLUE_GREEN in addition to existing ROLLING strategy
    • bake_time_in_minutes - Configurable bake time (0-1440 minutes) for blue/green deployments
    • lifecycle_hook - Lambda function hooks for validating deployment stages
  • Added load balancer advanced configuration for blue/green routing:
    • Support for alternate target groups
    • Production and test listener rules
    • IAM role for target group operations
  • Applies Service Connect test_traffic_rules.
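
To make the constraints on the new attributes concrete, here is a minimal Go sketch of the kind of check they imply: the strategy must be ROLLING or BLUE_GREEN, and the bake time must fall within 0-1440 minutes. The `DeploymentConfig` type and `Validate` method are hypothetical names for illustration only; they are not the provider's schema or validation code.

```go
package main

import "fmt"

// Strategy values named in the PR description.
const (
	StrategyRolling   = "ROLLING"
	StrategyBlueGreen = "BLUE_GREEN"
)

// DeploymentConfig is a hypothetical stand-in for the new attributes.
// It is not the provider's actual schema type.
type DeploymentConfig struct {
	Strategy          string
	BakeTimeInMinutes int
}

// Validate checks the strategy name and the 0-1440 minute bake-time range.
func (c DeploymentConfig) Validate() error {
	switch c.Strategy {
	case StrategyRolling, StrategyBlueGreen:
		// Supported strategy; fall through to the range check.
	default:
		return fmt.Errorf("unsupported strategy %q", c.Strategy)
	}
	if c.BakeTimeInMinutes < 0 || c.BakeTimeInMinutes > 1440 {
		return fmt.Errorf("bake_time_in_minutes must be between 0 and 1440, got %d", c.BakeTimeInMinutes)
	}
	return nil
}

func main() {
	cfg := DeploymentConfig{Strategy: StrategyBlueGreen, BakeTimeInMinutes: 15}
	if err := cfg.Validate(); err != nil {
		fmt.Println("invalid:", err)
		return
	}
	fmt.Println("valid")
}
```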

Stabilization logic

  • Enhanced deployment status monitoring for the ECS deployment controller when the customer has set wait_for_steady_state = true.
  • Stabilization logic now checks the deployment status rather than only the task count
  • Ensures lifecycle hooks complete during creation
  • Validates bake time completion during updates
  • Surfaces deployment failures and rollbacks
  • Flow diagram: (image not reproduced here)
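
The stabilization bullets above can be pictured as a polling loop that decides on each poll whether to keep waiting, succeed, or surface a failure. The sketch below is an illustration under stated assumptions, not the provider's waiter: `Classify` and `WaitStable` are hypothetical names, and the status strings are assumed to follow ECS service deployment status names, which should be checked against the ECS API reference.

```go
package main

import "fmt"

// Outcome is what a waiter does after one poll.
type Outcome int

const (
	KeepWaiting Outcome = iota
	Done
	Failed
)

// Classify maps a deployment status to a waiter outcome.
// The status names are assumptions; verify them against the ECS API.
func Classify(status string) Outcome {
	switch status {
	case "SUCCESSFUL":
		return Done
	case "STOPPED", "ROLLBACK_SUCCESSFUL", "ROLLBACK_FAILED":
		// A finished rollback still means the requested change did not land.
		return Failed
	default:
		// PENDING, IN_PROGRESS, and rollback-in-flight states keep polling.
		return KeepWaiting
	}
}

// WaitStable polls until the deployment succeeds, fails, or maxPolls is hit.
// A real waiter would sleep between polls and honor a timeout.
func WaitStable(poll func() string, maxPolls int) error {
	for i := 0; i < maxPolls; i++ {
		status := poll()
		switch Classify(status) {
		case Done:
			return nil
		case Failed:
			return fmt.Errorf("deployment ended with status %s", status)
		}
	}
	return fmt.Errorf("deployment not stable after %d polls", maxPolls)
}

func main() {
	// Simulated status sequence standing in for repeated API calls.
	statuses := []string{"PENDING", "IN_PROGRESS", "SUCCESSFUL"}
	i := 0
	err := WaitStable(func() string {
		s := statuses[i]
		i++
		return s
	}, len(statuses))
	fmt.Println("error:", err)
}
```

Treating a completed rollback as a failure is a design choice in this sketch: the service may be healthy again, but the change that was applied did not take effect, so the caller should hear about it.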

Testing:

  • Added comprehensive test cases covering:
    • Basic blue/green configuration
    • Lifecycle hook behaviors
    • Circuit breaker rollbacks
    • Strategy changes
    • Failure scenarios

Relations

Closes #43431.

References

Output from Acceptance Testing

% make testacc TESTS=TestAccECSService_BlueGreenDeployment PKG=ecs
make: Verifying source code with gofmt...
==> Checking that code complies with gofmt requirements...
TF_ACC=1 go1.24.5 test ./internal/service/ecs/... -v -count 1 -parallel 3  -run='TestAccECSService_BlueGreenDeployment'  -timeout 360m -vet=off
2025/07/17 11:00:47 Creating Terraform AWS Provider (SDKv2-style)...
2025/07/17 11:00:47 Initializing Terraform AWS Provider (SDKv2-style)...
=== RUN   TestAccECSService_BlueGreenDeployment_basic
=== PAUSE TestAccECSService_BlueGreenDeployment_basic
=== RUN   TestAccECSService_BlueGreenDeployment_circuitBreakerRollback
=== PAUSE TestAccECSService_BlueGreenDeployment_circuitBreakerRollback
=== RUN   TestAccECSService_BlueGreenDeployment_createFailure
=== PAUSE TestAccECSService_BlueGreenDeployment_createFailure
=== RUN   TestAccECSService_BlueGreenDeployment_changeStrategy
=== PAUSE TestAccECSService_BlueGreenDeployment_changeStrategy
=== RUN   TestAccECSService_BlueGreenDeployment_updateFailure
=== PAUSE TestAccECSService_BlueGreenDeployment_updateFailure
=== RUN   TestAccECSService_BlueGreenDeployment_waitServiceActive
=== PAUSE TestAccECSService_BlueGreenDeployment_waitServiceActive
=== CONT  TestAccECSService_BlueGreenDeployment_basic
=== CONT  TestAccECSService_BlueGreenDeployment_changeStrategy
=== CONT  TestAccECSService_BlueGreenDeployment_waitServiceActive
--- PASS: TestAccECSService_BlueGreenDeployment_waitServiceActive (926.77s)
=== CONT  TestAccECSService_BlueGreenDeployment_updateFailure
--- PASS: TestAccECSService_BlueGreenDeployment_basic (1387.12s)
=== CONT  TestAccECSService_BlueGreenDeployment_createFailure
--- PASS: TestAccECSService_BlueGreenDeployment_changeStrategy (1526.22s)
=== CONT  TestAccECSService_BlueGreenDeployment_circuitBreakerRollback
--- PASS: TestAccECSService_BlueGreenDeployment_createFailure (362.02s)
--- PASS: TestAccECSService_BlueGreenDeployment_updateFailure (1081.02s)
--- PASS: TestAccECSService_BlueGreenDeployment_circuitBreakerRollback (3065.83s)
PASS
ok      github.com/hashicorp/terraform-provider-aws/internal/service/ecs        4596.810s

@djglaser djglaser requested a review from a team as a code owner July 17, 2025 14:34
@github-actions
Contributor

Community Guidelines

This comment is added to every new Pull Request to provide quick reference to how the Terraform AWS Provider is maintained. Please review the information below, and thank you for contributing to the community that keeps the provider thriving! 🚀

Voting for Prioritization

  • Please vote on this Pull Request by adding a 👍 reaction to the original post to help the community and maintainers prioritize it.
  • Please see our prioritization guide for additional information on how the maintainers handle prioritization.
  • Please do not leave +1 or other comments that do not add relevant new information or questions; they generate extra noise for others following the Pull Request and do not help prioritize the request.

Pull Request Authors

  • Review the contribution guide relating to the type of change you are making to ensure all of the necessary steps have been taken.
  • Whether or not the branch has been rebased will not impact prioritization, but doing so is always a welcome surprise.

github-actions bot commented Jul 17, 2025

✅ Thank you for correcting the previously detected issues! The maintainers appreciate your efforts to make the review process as smooth as possible.

@github-actions github-actions bot added needs-triage Waiting for first response or review from a maintainer. tests PRs: expanded test coverage. Issues: expanded coverage, enhancements to test infrastructure. service/ecs Issues and PRs that pertain to the ecs service. size/XL Managed by automation to categorize the size of a PR. labels Jul 17, 2025
@jar-b jar-b added partner Contribution from a partner. and removed needs-triage Waiting for first response or review from a maintainer. labels Jul 17, 2025
@github-actions github-actions bot added the documentation Introduces or discusses updates to documentation. label Jul 17, 2025
jar-b
jar-b previously approved these changes Jul 17, 2025
Member

@jar-b jar-b left a comment

LGTM 🎉

% make testacc PKG=ecs TESTS=TestAccECSService_
make: Verifying source code with gofmt...
==> Checking that code complies with gofmt requirements...
TF_ACC=1 go1.24.5 test ./internal/service/ecs/... -v -count 1 -parallel 20 -run='TestAccECSService_'  -timeout 360m -vet=off
2025/07/17 12:05:39 Creating Terraform AWS Provider (SDKv2-style)...
2025/07/17 12:05:39 Initializing Terraform AWS Provider (SDKv2-style)...

--- PASS: TestAccECSService_DaemonSchedulingStrategy_setDeploymentMinimum (36.29s)
=== CONT  TestAccECSService_ServiceRegistries_basic
--- PASS: TestAccECSService_DaemonSchedulingStrategy_basic (48.34s)
=== CONT  TestAccECSService_ServiceRegistries_container
--- PASS: TestAccECSService_PlacementConstraints_basic (61.18s)
=== CONT  TestAccECSService_PlacementStrategy_unnormalized
--- PASS: TestAccECSService_iamRole (64.07s)
=== CONT  TestAccECSService_CapacityProviderStrategy_update
--- PASS: TestAccECSService_Tags_basic (72.03s)
=== CONT  TestAccECSService_CapacityProviderStrategy_forceNewDeployment
--- PASS: TestAccECSService_VolumeConfiguration_basic (74.02s)
=== CONT  TestAccECSService_CapacityProviderStrategy_basic
--- PASS: TestAccECSService_LaunchTypeEC2_network (76.06s)
=== CONT  TestAccECSService_LaunchTypeFargate_basic
--- PASS: TestAccECSService_Tags_managed (81.66s)
=== CONT  TestAccECSService_ServiceConnect_full
--- PASS: TestAccECSService_familyAndRevision (93.41s)
=== CONT  TestAccECSService_LatticeConfigurations
--- PASS: TestAccECSService_basic (96.29s)
=== CONT  TestAccECSService_disappears
--- PASS: TestAccECSService_PlacementStrategy_unnormalized (67.63s)
=== CONT  TestAccECSService_Identity_RegionOverride
--- PASS: TestAccECSService_VolumeConfiguration_update (133.11s)
=== CONT  TestAccECSService_Identity_Basic
=== NAME  TestAccECSService_LaunchTypeFargate_basic
    service_test.go:1655: Error running post-test destroy, there may be dangling resources: exit status 1

        Error: deleting ENIs for EC2 Subnet (subnet-0d2a78be5e83f6a2c): listing EC2 Network Interfaces: operation error EC2: DescribeNetworkInterfaces, https response error StatusCode: 0, RequestID: , request send failed, Post "https://ec2.us-east-2.amazonaws.com/": dial tcp: lookup ec2.us-east-2.amazonaws.com: no such host


        Error: deleting ENIs using Security Group (sg-01fc7970e2c4dc9aa): listing EC2 Network Interfaces: operation error EC2: DescribeNetworkInterfaces, https response error StatusCode: 0, RequestID: , request send failed, Post "https://ec2.us-east-2.amazonaws.com/": dial tcp: lookup ec2.us-east-2.amazonaws.com: no such host


        Error: deleting ENIs using Security Group (sg-0e85c200e6201943a): listing EC2 Network Interfaces: operation error EC2: DescribeNetworkInterfaces, https response error StatusCode: 0, RequestID: , request send failed, Post "https://ec2.us-east-2.amazonaws.com/": dial tcp: lookup ec2.us-east-2.amazonaws.com: no such host

--- FAIL: TestAccECSService_LaunchTypeFargate_basic (69.77s)
=== CONT  TestAccECSService_executeCommand
--- PASS: TestAccECSService_ServiceConnect_remove (160.58s)
=== CONT  TestAccECSService_AvailabilityZoneRebalancing
--- PASS: TestAccECSService_LaunchTypeFargate_waitForSteadyState (167.47s)
=== CONT  TestAccECSService_ServiceRegistries_removal
--- PASS: TestAccECSService_ServiceConnect_tls_with_empty_timeout (171.27s)
=== CONT  TestAccECSService_Tags_propagate
--- PASS: TestAccECSService_ServiceRegistries_basic (137.88s)
=== CONT  TestAccECSService_forceNewDeployment
--- PASS: TestAccECSService_CapacityProviderStrategy_forceNewDeployment (106.91s)
=== CONT  TestAccECSService_PlacementStrategy_missing
--- PASS: TestAccECSService_PlacementStrategy_missing (0.92s)
=== CONT  TestAccECSService_PlacementStrategy_basic
--- PASS: TestAccECSService_ServiceConnect_ingressPortOverride (181.06s)
=== CONT  TestAccECSService_forceNewDeploymentTriggers
--- PASS: TestAccECSService_ServiceRegistries_container (136.76s)
=== CONT  TestAccECSService_alb
--- PASS: TestAccECSService_Identity_RegionOverride (58.62s)
=== CONT  TestAccECSService_multipleTargetGroups
--- PASS: TestAccECSService_LaunchTypeFargate_updateWaitForSteadyState (191.85s)
=== CONT  TestAccECSService_clusterName
--- PASS: TestAccECSService_Identity_Basic (60.00s)
=== CONT  TestAccECSService_VolumeConfiguration_tagSpecifications
--- PASS: TestAccECSService_executeCommand (55.38s)
=== CONT  TestAccECSService_VolumeConfiguration_volumeInitializationRate
--- PASS: TestAccECSService_ServiceRegistries_changes (232.94s)
=== CONT  TestAccECSService_BlueGreenDeployment_createFailure
--- PASS: TestAccECSService_AvailabilityZoneRebalancing (74.49s)
=== CONT  TestAccECSService_deploymentCircuitBreaker
--- PASS: TestAccECSService_forceNewDeployment (60.99s)
=== CONT  TestAccECSService_DeploymentValues_minZeroMaxOneHundred
--- PASS: TestAccECSService_Tags_propagate (68.55s)
=== CONT  TestAccECSService_DeploymentValues_basic
--- PASS: TestAccECSService_forceNewDeploymentTriggers (61.62s)
=== CONT  TestAccECSService_DeploymentConfiguration_strategy
--- PASS: TestAccECSService_ServiceConnect_full (161.85s)
=== CONT  TestAccECSService_BlueGreenDeployment_waitServiceActive
--- PASS: TestAccECSService_CapacityProviderStrategy_basic (171.16s)
=== CONT  TestAccECSService_BlueGreenDeployment_updateFailure
--- PASS: TestAccECSService_clusterName (59.29s)
=== CONT  TestAccECSService_BlueGreenDeployment_changeStrategy
--- PASS: TestAccECSService_VolumeConfiguration_tagSpecifications (72.26s)
=== CONT  TestAccECSService_LaunchTypeFargate_platformVersion
--- PASS: TestAccECSService_loadBalancerChanges (268.18s)
=== CONT  TestAccECSService_VolumeConfiguration_throughputTypeChange
--- PASS: TestAccECSService_disappears (177.60s)
=== CONT  TestAccECSService_alarmsAdd
--- PASS: TestAccECSService_ServiceRegistries_removal (110.82s)
=== CONT  TestAccECSService_BlueGreenDeployment_circuitBreakerRollback
--- PASS: TestAccECSService_PlacementStrategy_basic (99.91s)
=== CONT  TestAccECSService_PlacementConstraints_emptyExpression
--- PASS: TestAccECSService_DeploymentControllerType_codeDeploy (286.41s)
=== CONT  TestAccECSService_BlueGreenDeployment_basic
--- PASS: TestAccECSService_healthCheckGracePeriodSeconds (300.92s)
=== CONT  TestAccECSService_alarmsUpdate
--- PASS: TestAccECSService_DeploymentValues_minZeroMaxOneHundred (67.24s)
=== CONT  TestAccECSService_replicaSchedulingStrategy
--- PASS: TestAccECSService_deploymentCircuitBreaker (67.41s)
=== CONT  TestAccECSService_DeploymentControllerType_external
--- PASS: TestAccECSService_DeploymentValues_basic (66.83s)
=== CONT  TestAccECSService_DeploymentControllerType_codeDeployUpdateDesiredCountAndHealthCheckGracePeriod
--- PASS: TestAccECSService_VolumeConfiguration_volumeInitializationRate (110.07s)
=== CONT  TestAccECSService_DeploymentControllerMutability_codeDeployToECS
--- PASS: TestAccECSService_DeploymentConfiguration_strategy (82.42s)
=== CONT  TestAccECSService_ServiceConnect_basic
--- PASS: TestAccECSService_DeploymentControllerType_external (26.93s)
--- PASS: TestAccECSService_LaunchTypeFargate_platformVersion (70.95s)
--- PASS: TestAccECSService_PlacementConstraints_emptyExpression (67.70s)
--- PASS: TestAccECSService_alarmsAdd (75.21s)
--- PASS: TestAccECSService_CapacityProviderStrategy_update (294.51s)
--- PASS: TestAccECSService_VolumeConfiguration_throughputTypeChange (93.67s)
--- PASS: TestAccECSService_alarmsUpdate (65.00s)
--- PASS: TestAccECSService_replicaSchedulingStrategy (67.45s)
--- PASS: TestAccECSService_multipleTargetGroups (265.13s)
--- PASS: TestAccECSService_alb (271.57s)
--- PASS: TestAccECSService_ServiceConnect_basic (146.62s)
--- PASS: TestAccECSService_DeploymentControllerMutability_codeDeployToECS (241.97s)
--- PASS: TestAccECSService_BlueGreenDeployment_createFailure (347.61s)
--- PASS: TestAccECSService_LatticeConfigurations (599.47s)
--- PASS: TestAccECSService_DeploymentControllerType_codeDeployUpdateDesiredCountAndHealthCheckGracePeriod (477.81s)
--- PASS: TestAccECSService_BlueGreenDeployment_waitServiceActive (931.89s)
--- PASS: TestAccECSService_BlueGreenDeployment_updateFailure (1086.95s)
--- PASS: TestAccECSService_BlueGreenDeployment_basic (1416.11s)
--- PASS: TestAccECSService_BlueGreenDeployment_changeStrategy (1541.21s)
--- PASS: TestAccECSService_BlueGreenDeployment_circuitBreakerRollback (3347.24s)
FAIL
FAIL    github.com/hashicorp/terraform-provider-aws/internal/service/ecs        3631.560s

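The TestAccECSService_LaunchTypeFargate_basic failure is transient (a DNS lookup error for ec2.us-east-2.amazonaws.com during post-test destroy) and unrelated to this change.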

@jar-b jar-b self-assigned this Jul 17, 2025
@github-actions github-actions bot added the prioritized Part of the maintainer teams immediate focus. To be addressed within the current quarter. label Jul 17, 2025
Contributor

@ewbankkit ewbankkit left a comment

LGTM 🚀.

% make testacc TESTARGS='-run=TestAccECSService_BlueGreenDeployment_\|TestAccECSService_basic' PKG=ecs ACCTEST_PARALLELISM=3
make: Verifying source code with gofmt...
==> Checking that code complies with gofmt requirements...
TF_ACC=1 go1.24.5 test ./internal/service/ecs/... -v -count 1 -parallel 3  -run=TestAccECSService_BlueGreenDeployment_\|TestAccECSService_basic -timeout 360m -vet=off
2025/07/17 14:48:31 Creating Terraform AWS Provider (SDKv2-style)...
2025/07/17 14:48:31 Initializing Terraform AWS Provider (SDKv2-style)...
=== RUN   TestAccECSService_basic
=== PAUSE TestAccECSService_basic
=== RUN   TestAccECSService_BlueGreenDeployment_basic
=== PAUSE TestAccECSService_BlueGreenDeployment_basic
=== RUN   TestAccECSService_BlueGreenDeployment_circuitBreakerRollback
=== PAUSE TestAccECSService_BlueGreenDeployment_circuitBreakerRollback
=== RUN   TestAccECSService_BlueGreenDeployment_createFailure
=== PAUSE TestAccECSService_BlueGreenDeployment_createFailure
=== RUN   TestAccECSService_BlueGreenDeployment_changeStrategy
=== PAUSE TestAccECSService_BlueGreenDeployment_changeStrategy
=== RUN   TestAccECSService_BlueGreenDeployment_updateFailure
=== PAUSE TestAccECSService_BlueGreenDeployment_updateFailure
=== RUN   TestAccECSService_BlueGreenDeployment_waitServiceActive
=== PAUSE TestAccECSService_BlueGreenDeployment_waitServiceActive
=== CONT  TestAccECSService_basic
=== CONT  TestAccECSService_BlueGreenDeployment_changeStrategy
=== CONT  TestAccECSService_BlueGreenDeployment_circuitBreakerRollback
--- PASS: TestAccECSService_basic (74.02s)
=== CONT  TestAccECSService_BlueGreenDeployment_createFailure
--- PASS: TestAccECSService_BlueGreenDeployment_createFailure (340.37s)
=== CONT  TestAccECSService_BlueGreenDeployment_waitServiceActive
--- PASS: TestAccECSService_BlueGreenDeployment_waitServiceActive (958.40s)
=== CONT  TestAccECSService_BlueGreenDeployment_basic
--- PASS: TestAccECSService_BlueGreenDeployment_changeStrategy (1590.18s)
=== CONT  TestAccECSService_BlueGreenDeployment_updateFailure
--- PASS: TestAccECSService_BlueGreenDeployment_basic (1418.32s)
--- PASS: TestAccECSService_BlueGreenDeployment_updateFailure (1322.56s)
--- PASS: TestAccECSService_BlueGreenDeployment_circuitBreakerRollback (3466.86s)
PASS
ok  	github.com/hashicorp/terraform-provider-aws/internal/service/ecs	3472.166s
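
The `-run` value in the command above is an alternation (the shell turns `\|` into `|`), and a slash-free `-run` pattern is matched against top-level test names as an unanchored regular expression. The following sketch reproduces that selection with Go's `regexp` package; `FilterTests` is a hypothetical helper for illustration, not part of `go test`.

```go
package main

import (
	"fmt"
	"regexp"
)

// FilterTests returns the names matched by an unanchored regular expression,
// mirroring how a slash-free -run pattern selects top-level tests.
func FilterTests(pattern string, names []string) []string {
	re := regexp.MustCompile(pattern)
	var matched []string
	for _, name := range names {
		if re.MatchString(name) {
			matched = append(matched, name)
		}
	}
	return matched
}

func main() {
	// Test names taken from the logs in this conversation.
	names := []string{
		"TestAccECSService_basic",
		"TestAccECSService_alb",
		"TestAccECSService_BlueGreenDeployment_basic",
		"TestAccECSService_BlueGreenDeployment_updateFailure",
	}
	pattern := `TestAccECSService_BlueGreenDeployment_|TestAccECSService_basic`
	for _, name := range FilterTests(pattern, names) {
		fmt.Println(name)
	}
}
```

Because the match is unanchored, the second alternative would also select any test whose name merely begins with `TestAccECSService_basic`; appending `$` narrows it to the exact name.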

jar-b commented Jul 17, 2025

Thanks for your contribution, @djglaser! 👍

@jar-b jar-b merged commit fcc5988 into hashicorp:main Jul 17, 2025
47 checks passed
@github-actions
Contributor

Warning

This Issue has been closed, meaning that any additional comments are much easier for the maintainers to miss. Please assume that the maintainers will not see them.

Ongoing conversations amongst community members are welcome, however, the issue will be locked after 30 days. Moving conversations to another venue, such as the AWS Provider forum, is recommended. If you have additional concerns, please open a new issue, referencing this one where needed.

@github-actions github-actions bot added this to the v6.4.0 milestone Jul 17, 2025
@github-actions
Contributor

This functionality has been released in v6.4.0 of the Terraform AWS Provider. Please see the Terraform documentation on provider versioning or reach out if you need any assistance upgrading.

For further feature requests or bug reports with this functionality, please create a new GitHub issue following the template. Thank you!

@github-actions
Contributor

I'm going to lock this pull request because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems related to this change, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Aug 17, 2025
Labels

documentation Introduces or discusses updates to documentation. partner Contribution from a partner. service/ecs Issues and PRs that pertain to the ecs service. size/XL Managed by automation to categorize the size of a PR. tests PRs: expanded test coverage. Issues: expanded coverage, enhancements to test infrastructure.

Development

Successfully merging this pull request may close these issues.

r/aws_ecs_service: Support deployment configuration strategy

4 participants