fix(aws-ecs): drain hook lambda allows tasks to stop gracefully #13559
…3506

fixes aws#13506

After the container instance is set to draining, the tasks running on it transition from RUNNING > DEACTIVATING > STOPPING > DEPROVISIONING > STOPPED.

The current way of counting running tasks via `instance['runningTasksCount'] + instance['pendingTasksCount']` does not include tasks in those transitional states, leading to the EC2 instance being terminated prematurely.

No unit tests were added as the lambda code is not currently covered by any tests I could find.

I have verified the change by manually updating the automatically created drain hook lambda and then running an ASG refresh. I ran the test with additional debug output to compare the old logic of `runningTasksCount + pendingTasksCount` and the new logic that fetches the status of the tasks.

I interleaved the logs from the ECS events, the application running in the task and the drain hook lambda:

```
2021-03-11T15:56:52.608-08:00 Instance i-1234567890abcdefg has container instance ARN arn:aws:ecs:us-west-2:123456789012:container-instance/fooservice/1234567890abcdefghijklmnopqrstuv
2021-03-11T15:56:52.649-08:00 Instance ARN arn:aws:ecs:us-west-2:123456789012:container-instance/fooservice/1234567890abcdefghijklmnopqrstuv has task ARNs arn:aws:ecs:us-west-2:123456789012:task/fooservice/1234567890abcdefghijklmnopqrstuv
2021-03-11T15:57:03.018-08:00 OLD: Instance arn:aws:ecs:us-west-2:123456789012:container-instance/fooservice/1234567890abcdefghijklmnopqrstuv has 1 tasks
2021-03-11T15:57:03.051-08:00 NEW: Instance arn:aws:ecs:us-west-2:123456789012:container-instance/fooservice/1234567890abcdefghijklmnopqrstuv has 1 tasks
2021-03-11T15:57:13.215-08:00 OLD: Instance arn:aws:ecs:us-west-2:123456789012:container-instance/fooservice/1234567890abcdefghijklmnopqrstuv has 1 tasks
2021-03-11T15:57:13.280-08:00 NEW: Instance arn:aws:ecs:us-west-2:123456789012:container-instance/fooservice/1234567890abcdefghijklmnopqrstuv has 1 tasks
2021-03-11T15:57:15.280-08:00 service fooservice has stopped 1 running tasks: task 1234567890abcdefghijklmnopqrstuv.
2021-03-11T15:57:23.438-08:00 OLD: Instance arn:aws:ecs:us-west-2:123456789012:container-instance/fooservice/1234567890abcdefghijklmnopqrstuv has 0 tasks
2021-03-11T15:57:23.490-08:00 NEW: Instance arn:aws:ecs:us-west-2:123456789012:container-instance/fooservice/1234567890abcdefghijklmnopqrstuv has 1 tasks
2021-03-11T15:57:33.632-08:00 OLD: Instance arn:aws:ecs:us-west-2:123456789012:container-instance/fooservice/1234567890abcdefghijklmnopqrstuv has 0 tasks
2021-03-11T15:57:33.690-08:00 NEW: Instance arn:aws:ecs:us-west-2:123456789012:container-instance/fooservice/1234567890abcdefghijklmnopqrstuv has 1 tasks
2021-03-11T15:57:43.853-08:00 OLD: Instance arn:aws:ecs:us-west-2:123456789012:container-instance/fooservice/1234567890abcdefghijklmnopqrstuv has 0 tasks
2021-03-11T15:57:43.890-08:00 NEW: Instance arn:aws:ecs:us-west-2:123456789012:container-instance/fooservice/1234567890abcdefghijklmnopqrstuv has 1 tasks
2021-03-11T15:57:46.000-08:00 service fooservice has started 1 tasks: task 1234567890abcdefghijklmnopqrstuv.
2021-03-11T15:57:46.000-08:00 (service fooservice, taskSet ecs-svc/1234567890abcdefghi) has begun draining connections on 2 tasks.
2021-03-11T15:57:46.000-08:00 service fooservice deregistered 1 targets in target-group fooservice-vpce-target
2021-03-11T15:57:46.000-08:00 service fooservice deregistered 1 targets in target-group fooservice-target
2021-03-11T15:57:54.032-08:00 OLD: Instance arn:aws:ecs:us-west-2:123456789012:container-instance/fooservice/1234567890abcdefghijklmnopqrstuv has 0 tasks
2021-03-11T15:57:54.090-08:00 NEW: Instance arn:aws:ecs:us-west-2:123456789012:container-instance/fooservice/1234567890abcdefghijklmnopqrstuv has 1 tasks
2021-03-11T15:57:58.000-08:00 service fooservice registered 1 targets in target-group fooservice-vpce-target
2021-03-11T15:57:58.000-08:00 service fooservice registered 1 targets in target-group fooservice-target
2021-03-11T15:58:04.242-08:00 OLD: Instance arn:aws:ecs:us-west-2:123456789012:container-instance/fooservice/1234567890abcdefghijklmnopqrstuv has 0 tasks
2021-03-11T15:58:04.270-08:00 NEW: Instance arn:aws:ecs:us-west-2:123456789012:container-instance/fooservice/1234567890abcdefghijklmnopqrstuv has 1 tasks
2021-03-11T15:58:14.430-08:00 OLD: Instance arn:aws:ecs:us-west-2:123456789012:container-instance/fooservice/1234567890abcdefghijklmnopqrstuv has 0 tasks
2021-03-11T15:58:14.470-08:00 NEW: Instance arn:aws:ecs:us-west-2:123456789012:container-instance/fooservice/1234567890abcdefghijklmnopqrstuv has 1 tasks
2021-03-11T15:58:24.611-08:00 OLD: Instance arn:aws:ecs:us-west-2:123456789012:container-instance/fooservice/1234567890abcdefghijklmnopqrstuv has 0 tasks
2021-03-11T15:58:24.650-08:00 NEW: Instance arn:aws:ecs:us-west-2:123456789012:container-instance/fooservice/1234567890abcdefghijklmnopqrstuv has 1 tasks
2021-03-11T15:58:34.796-08:00 OLD: Instance arn:aws:ecs:us-west-2:123456789012:container-instance/fooservice/1234567890abcdefghijklmnopqrstuv has 0 tasks
2021-03-11T15:58:34.850-08:00 NEW: Instance arn:aws:ecs:us-west-2:123456789012:container-instance/fooservice/1234567890abcdefghijklmnopqrstuv has 1 tasks
2021-03-11T15:58:44.999-08:00 OLD: Instance arn:aws:ecs:us-west-2:123456789012:container-instance/fooservice/1234567890abcdefghijklmnopqrstuv has 0 tasks
2021-03-11T15:58:45.030-08:00 NEW: Instance arn:aws:ecs:us-west-2:123456789012:container-instance/fooservice/1234567890abcdefghijklmnopqrstuv has 1 tasks
2021-03-11T15:58:49.000-08:00 app received SIGTERM
2021-03-11T15:58:54.000-08:00 service fooservice has reached a steady state.
2021-03-11T15:58:55.170-08:00 OLD: Instance arn:aws:ecs:us-west-2:123456789012:container-instance/fooservice/1234567890abcdefghijklmnopqrstuv has 0 tasks
2021-03-11T15:58:55.210-08:00 NEW: Instance arn:aws:ecs:us-west-2:123456789012:container-instance/fooservice/1234567890abcdefghijklmnopqrstuv has 0 tasks
2021-03-11T15:58:55.210-08:00 Terminating instance i-1234567890abcdefg
```
I noticed that the expected output tests contained the lambda code, so I updated them accordingly.
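To make the difference between the two counting approaches concrete, here is a minimal sketch, not the actual lambda code from this change. It assumes dicts shaped like the boto3 ECS responses (`runningTasksCount` and `pendingTasksCount` on a container instance, `lastStatus` on a task); the function names and the sample snapshot are hypothetical and only illustrate the situation the description reports.

```python
# Sketch only: compares two ways of deciding whether an instance is still busy.
# The dict shapes mimic boto3 ECS responses; the sample values are made up.


def count_from_instance_counters(instance):
    """Old approach: trust the container instance's own counters."""
    return instance["runningTasksCount"] + instance["pendingTasksCount"]


def count_from_task_statuses(tasks):
    """New approach: treat every task that is not yet STOPPED as still present."""
    return sum(1 for task in tasks if task["lastStatus"] != "STOPPED")


# Hypothetical snapshot taken shortly after the instance starts draining:
# the counters already read zero while the task is still shutting down.
instance = {"runningTasksCount": 0, "pendingTasksCount": 0}
tasks = [{"taskArn": "example-task", "lastStatus": "DEACTIVATING"}]

old_count = count_from_instance_counters(instance)
new_count = count_from_task_statuses(tasks)
print(f"OLD: {old_count} tasks, NEW: {new_count} tasks")
```

With a snapshot like this, the counter-based check would report an idle instance while the status-based check still sees one task, which matches the `OLD`/`NEW` divergence in the log above.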
Hello - Sorry about the delay in reviewing PRs. We are experiencing an increased backlog of items that need our attention.
SoManyHs
left a comment
Great catch, thank you so much for this fix! Definitely an oversight to not count any task in a transitional state -- I think that some of those states were added after the original lambda integration was added to the ECS construct library. Your contribution is very much appreciated!
@Mergifyio update

Thank you for contributing! Your pull request will be updated from master and then merged automatically (do not update manually, and be sure to allow changes to be pushed to your fork).
fixes #13506
### Description

After the container instance is set to draining, the tasks running on it transition from RUNNING > DEACTIVATING > STOPPING > DEPROVISIONING > STOPPED.

The current way of counting running tasks via `instance['runningTasksCount'] + instance['pendingTasksCount']` does not include tasks in those transitional states, leading to the EC2 instance being terminated prematurely.

### Verification

I have verified the change by manually updating the automatically created drain hook lambda and then running an ASG refresh.

I ran the test with additional debug output to compare the old logic of `runningTasksCount + pendingTasksCount` and the new logic that fetches the status of the tasks. I interleaved the logs from the ECS events, the application running in the task and the drain hook lambda (see the log excerpt in the commit message above).

The logs show that the new approach allows ECS to drain connections, deregister the target and respect the `deregistrationDelay` (set to 1 minute in this case). The old approach would have terminated the EC2 instance 23 seconds prior to ECS even deregistering the target, leading to 502 errors.
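The timing claims can be checked against the log timestamps with a few lines of Python. This is only a sketch of the arithmetic: the variable names are hypothetical, and the four timestamps are copied from the log excerpt in this pull request (first `OLD ... has 0 tasks` line, target deregistration, `app received SIGTERM`, and instance termination).

```python
from datetime import datetime

# Timestamps copied from the interleaved log in this pull request.
old_sees_zero = datetime.fromisoformat("2021-03-11T15:57:23.438-08:00")
deregistered = datetime.fromisoformat("2021-03-11T15:57:46.000-08:00")
sigterm = datetime.fromisoformat("2021-03-11T15:58:49.000-08:00")
terminated = datetime.fromisoformat("2021-03-11T15:58:55.210-08:00")

# How long before deregistration the old logic would have reported an idle instance.
early_by = (deregistered - old_sees_zero).total_seconds()
# How long the task kept running after its targets were deregistered.
drain_window = (sigterm - deregistered).total_seconds()

print(f"old logic reports 0 tasks {early_by:.3f}s before deregistration")  # 22.562s
print(f"task received SIGTERM {drain_window:.3f}s after deregistration")   # 63.000s
print(f"new logic terminated the instance at {terminated.isoformat()}")
```

The first figure rounds to the 23 seconds quoted above, and the second is consistent with a `deregistrationDelay` of 1 minute being respected.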
### Pull Request Checklist

- [x] Testing
  I was not able to find any tests validating the functionality of the lambda. However, I have updated `expected.json` files to expect the new lambda function code.
- [ ] Docs - *Not Applicable*
  No previously documented behavior has changed
- [x] Title and Description
- [ ] Sensitive Modules (requires 2 PR approvers) - *Not Applicable*

### Impact

End users utilizing ECS on EC2 with capacity provided by an ASG will see an increase in instance termination time. However, the process is now much safer, respects the ALB's `deregistrationDelay` and will reduce connection errors.

----

*By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license*