Merged
@@ -0,0 +1,23 @@
{
"Resources": {
"ASG": {
"Type": "AWS::AutoScaling::AutoScalingGroup",
"Properties": {
"MinSize": "1",
"MaxSize": "10",
"DesiredCapacity": "2"
},
"UpdatePolicy": {
"AutoScalingRollingUpdate": {
"MaxBatchSize": 2,
"MinInstancesInService": 1,
"MinSuccessfulInstancesPercent": 75,
"MinActiveInstancesPercent": 50,
"PauseTime": "PT5M",
"SuspendProcesses": ["HealthCheck", "ReplaceUnhealthy"],
"WaitOnResourceSignals": true
}
}
}
}
}
Member

I would recommend adding an integ test for this, using a similar UpdatePolicy JSON for the EC2 Auto Scaling resource.

@@ -655,6 +655,27 @@ describe('CDK Include', () => {
);
});

test('All AutoScalingRollingUpdate properties are correctly parsed', () => {
const cfnTemplate = includeTestTemplate(stack, 'autoscaling-rolling-update-all-properties.json');
const cfnAsg = cfnTemplate.getResource('ASG');

expect(cfnAsg.cfnOptions.updatePolicy).toBeDefined();
expect(cfnAsg.cfnOptions.updatePolicy?.autoScalingRollingUpdate).toBeDefined();

// Verify all properties are correctly parsed
expect(cfnAsg.cfnOptions.updatePolicy?.autoScalingRollingUpdate?.maxBatchSize).toBeDefined();
expect(cfnAsg.cfnOptions.updatePolicy?.autoScalingRollingUpdate?.minInstancesInService).toBeDefined();
expect(cfnAsg.cfnOptions.updatePolicy?.autoScalingRollingUpdate?.minSuccessfulInstancesPercent).toBeDefined();
expect(cfnAsg.cfnOptions.updatePolicy?.autoScalingRollingUpdate?.minActiveInstancesPercent).toBeDefined();
expect(cfnAsg.cfnOptions.updatePolicy?.autoScalingRollingUpdate?.pauseTime).toBeDefined();
expect(cfnAsg.cfnOptions.updatePolicy?.autoScalingRollingUpdate?.suspendProcesses).toBeDefined();
expect(cfnAsg.cfnOptions.updatePolicy?.autoScalingRollingUpdate?.waitOnResourceSignals).toBeDefined();

Template.fromStack(stack).templateMatches(
loadTestFileToJsObject('autoscaling-rolling-update-all-properties.json'),
);
});

test('Intrinsics can be used in the leaf nodes of autoscaling replacing update policy', () => {
const cfnTemplate = includeTestTemplate(stack, 'intrinsics-update-policy-autoscaling-replacing-update.json');
const cfnBucket = cfnTemplate.getResource('ASG');
7 changes: 7 additions & 0 deletions packages/aws-cdk-lib/core/lib/cfn-resource-policy.ts
Member

I understand we are doing this only for the CfnInclude construct. Not a blocker, but maybe it would be good to also support this in the rollingUpdate method in the aws-autoscaling module, which is used when UpdatePolicy.rollingUpdate() is called.

That could be a separate PR; just calling it out here.

@@ -187,6 +187,13 @@ export interface CfnAutoScalingRollingUpdate {
*/
readonly minSuccessfulInstancesPercent?: number;

/**
* Specifies the percentage of instances in an Auto Scaling group that must remain in service while AWS CloudFormation
* updates old instances. You can specify a value from 0 to 100. AWS CloudFormation rounds to the nearest tenth of a percent.
* For example, if you update five instances with a minimum active percentage of 50, three instances must remain in service.
*/
readonly minActiveInstancesPercent?: number;
Member

I have some concerns about potential breaking changes:

According to the CFN doc, quoted below:

Setting MinActiveInstancesPercent in your UpdatePolicy will also affect instances launched when the DesiredCapacity property of the AWS::AutoScaling::AutoScalingGroup resource is set higher than the current desired capacity of that Auto Scaling group.

Currently, minActiveInstancesPercent is silently ignored when parsing templates in CfnInclude, which means customers are effectively getting the default behavior (100% active instances). By adding direct support for this property, we could potentially impact existing customers in the following ways:

  1. Customers who specify MinActiveInstancesPercent in their templates, where it is currently being ignored, would suddenly see different scaling behavior when their DesiredCapacity increases
  2. This could affect application availability during scaling events and might break assumptions in their deployment processes

// Example scenario

// Template has:
UpdatePolicy: {
  AutoScalingRollingUpdate: {
    MinActiveInstancesPercent: 50
  }
}
// Current behavior: 100% instances stay active
// New behavior with this change: Only 50% instances stay active

I think we should handle this with a feature flag, since it will change existing behaviour. Please help to check.

Contributor

I don't believe this will cause an immediate traffic disruption for customers, since it only affects update behavior: users will notice the change in the number of active instances only during an update operation, not during regular traffic on the Auto Scaling group. However, it could affect operations that previously required 100% of in-service instances to succeed and that will now be marked as successfully updated earlier, with fewer instances. To be cautious, we can introduce a feature flag, but even then the default behavior should be to honor the minActiveInstancesPercent field.

Member

While I don't anticipate availability issues during rolling updates with this change, I think it would be good to verify the behaviour when the MinActiveInstancesPercent property is enforced during rolling updates. This property specifies the percentage of instances that must be in the InService state, relative to the ASG's desired capacity, for an update to succeed. Currently, since this property is silently ignored, it effectively defaults to 100%. My question is: if an existing customer has specified MinActiveInstancesPercent as 50% with a desired capacity of 4, will enforcing this property allow the rolling update to proceed with only 2 active instances (50% of desired capacity), whereas currently more instances might be kept in service during the update?

I think it would be good to compare the current rolling update behavior against the behavior with this property enforced, focusing specifically on the number of InService instances available during the update. For example, if the current behavior maintains 3 active instances during a rolling update, but MinActiveInstancesPercent: 50 maintains only 2, that would indicate reduced availability, because fewer instances would be handling application traffic during the rolling update period. To be clear, this concern is specifically about availability during the rolling update event, not about the general availability of the application after the update.

If we can verify there's no reduction in the number of available instances during rolling updates compared to current behavior, I'm comfortable proceeding without a feature flag. However, if we cannot conclusively verify the availability impact, it would be better to implement this with a feature flag as a safety measure. Given that rolling update behavior is also influenced by multiple other UpdatePolicy properties working in combination (MinInstancesInService, MaxBatchSize, etc.), it's challenging to predict all possible scenarios.
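To make the numbers in this comparison concrete, here is a rough sketch of the arithmetic. The function name `minActiveInstances` is hypothetical (it is not a CDK or CloudFormation API), and rounding the result up is an assumption based on the doc comment's example, where five instances at 50 percent require three in service. It ignores the interaction with MinInstancesInService and MaxBatchSize mentioned above.

```typescript
// Hypothetical helper for illustration only; not part of aws-cdk-lib.
// Assumption: the required instance count is rounded up, which matches the
// "five instances at 50 percent -> three instances" example in the doc comment.
function minActiveInstances(desiredCapacity: number, minActivePercent: number): number {
  if (!Number.isInteger(desiredCapacity) || desiredCapacity < 0) {
    throw new RangeError('desiredCapacity must be a non-negative integer');
  }
  if (minActivePercent < 0 || minActivePercent > 100) {
    throw new RangeError('minActivePercent must be between 0 and 100');
  }
  // The doc comment says the percentage is rounded to the nearest tenth of a percent.
  const tenthsOfPercent = Math.round(minActivePercent * 10);
  // Integer arithmetic avoids floating-point drift before rounding up.
  return Math.ceil((desiredCapacity * tenthsOfPercent) / 1000);
}

const desiredCapacity = 4;
// Property ignored, treated as the 100% default described in this thread.
console.log(minActiveInstances(desiredCapacity, 100)); // 4
// Property honored with the value from the template.
console.log(minActiveInstances(desiredCapacity, 50)); // 2
```

Under these assumptions the floor drops from 4 to 2 InService instances for a desired capacity of 4, which is the gap the comparison would need to confirm or rule out.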

Contributor Author

@scorbiere Mar 29, 2025

TL;DR I agree to stay on the safe side and implement a feature flag, because we cannot guarantee the potential drop in availability will not negatively impact existing services.

Here is the analysis summary.

Current Behavior (Before Fix)

Currently, when a customer imports a CloudFormation template with MinActiveInstancesPercent using CfnInclude, this property is silently dropped during parsing. When the CDK app synthesizes the template, this property is missing from the output, effectively using the default value of 100%.

New Behavior (After Fix)

After the fix, when a customer imports a CloudFormation template with MinActiveInstancesPercent using CfnInclude, this property will be correctly parsed and included in the synthesized template, using the value specified in the original template.
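The before/after difference described above can be modeled with a small allow-list parser. This is a simplified sketch, not the actual cfn-parse.ts implementation: `parseWithAllowList`, `Converter`, and `asNumber` are hypothetical names, and the only behavior it models is that keys which are not registered never reach the output.

```typescript
// Simplified model of an allow-list parser; NOT the real cfn-parse.ts code.
type Converter = (value: unknown) => unknown;

// Hypothetical helper: copies only the registered keys from the template
// object and ignores every key it was not told about.
function parseWithAllowList(
  source: Record<string, unknown>,
  cases: Record<string, Converter>,
): Record<string, unknown> {
  const result: Record<string, unknown> = {};
  for (const [cdkName, convert] of Object.entries(cases)) {
    // Template keys are PascalCase, parsed property names are camelCase.
    const cfnName = cdkName.charAt(0).toUpperCase() + cdkName.slice(1);
    if (cfnName in source) {
      result[cdkName] = convert(source[cfnName]);
    }
  }
  return result;
}

const asNumber: Converter = (v) => Number(v);

const rollingUpdate = { MaxBatchSize: 2, MinActiveInstancesPercent: 50 };

// Before: the key is not registered, so it is silently dropped.
const before = parseWithAllowList(rollingUpdate, { maxBatchSize: asNumber });

// After: registering the key carries the template value through.
const after = parseWithAllowList(rollingUpdate, {
  maxBatchSize: asNumber,
  minActiveInstancesPercent: asNumber,
});

console.log(before); // { maxBatchSize: 2 }
console.log(after); // { maxBatchSize: 2, minActiveInstancesPercent: 50 }
```

In this model nothing fails when a key is missing from the allow-list, which is why a dropped property would only show up as a behavior difference at deploy time.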

Potential Impact

According to the AWS CloudFormation documentation on UpdatePolicy, the MinActiveInstancesPercent property affects:

  1. During Rolling Updates: It determines the percentage of instances that must remain in service during an update.
  2. When Scaling Up: It affects instances launched when the desired capacity increases.

If a customer had templates with MinActiveInstancesPercent set to a value lower than 100%, but this property was being dropped by CDK, they were effectively getting 100% active instances during updates. After the fix, they will get the percentage they specified, which could be lower and potentially affect application availability during updates.

Actions That Trigger Rolling Updates

The following actions can trigger a rolling update for an Auto Scaling group:

  1. Changes to the Launch Template or Launch Configuration
  2. Changes to the Auto Scaling Group Properties
  3. Stack Updates with an UpdatePolicy attribute
  4. Desired Capacity Changes when increased above the current capacity

Contributor Author

Update

There is no need to add a new feature flag, because the MinActiveInstancesPercent property has been automatically added when using CfnInclude since PR #32321.

For the sake of clarity, I think it is still worth keeping the changes that add explicit handling of this property in the CfnParse code, along with the tests (unit and integ).


/**
* The amount of time that AWS CloudFormation pauses after making a change to a batch of instances to give those instances
* time to start software applications. For example, you might need to specify PauseTime when scaling up the number of
Member

nit: I suggest we add a unit test for this parser, at least for the parseAutoScalingRollingUpdate method.

@@ -457,6 +457,7 @@
rollUp.parseCase('maxBatchSize', FromCloudFormation.getNumber);
rollUp.parseCase('minInstancesInService', FromCloudFormation.getNumber);
rollUp.parseCase('minSuccessfulInstancesPercent', FromCloudFormation.getNumber);
rollUp.parseCase('minActiveInstancesPercent', FromCloudFormation.getNumber);

Check warning on line 460 in packages/aws-cdk-lib/core/lib/helpers-internal/cfn-parse.ts (Codecov / codecov/patch): added line #L460 was not covered by tests.

rollUp.parseCase('pauseTime', FromCloudFormation.getString);
rollUp.parseCase('suspendProcesses', FromCloudFormation.getStringArray);
rollUp.parseCase('waitOnResourceSignals', FromCloudFormation.getBoolean);