[ECS] [request/bug?]: Add scale in protection on hosts running a task via RunTask #1207
Comments
Have you enabled managed termination protection? https://docs.aws.amazon.com/AmazonECS/latest/APIReference/API_AutoScalingGroupProvider.html
Yes, we have. This is our Terraform template (not sure if it helps)
And the web GUI also shows: ... but tasks started via the RunTask action still sometimes land on instances that are not protected from scale-in.
Maybe this is the wrong place to discuss this, but scale-in protection works differently from what I would expect based on the documentation. The instances started via the Auto Scaling group's launch template register themselves with the cluster, and it seems that only some of them get marked as protected from scale-in, or only after some time. I would have expected the protected-from-scale-in flag to be assigned as soon as the first task starts on the instance and removed as soon as the last task on the host stops. Currently it feels like ECS only removes scale-in protection when it tries to scale down the cluster.
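To make the expected behavior concrete, here is a minimal sketch of the rule described above: an instance should be protected exactly while it has at least one task. This is not how ECS is documented to behave; it only illustrates the expectation. The function name `desired_protection` is hypothetical, and the input is assumed to be shaped like the `containerInstances` entries returned by the ECS `DescribeContainerInstances` API.

```python
def desired_protection(container_instances):
    """Map EC2 instance ID -> whether it should be protected from scale-in.

    Assumption: each entry carries `ec2InstanceId`, `runningTasksCount`
    and `pendingTasksCount`, as in a DescribeContainerInstances response.
    """
    return {
        ci["ec2InstanceId"]: (
            ci.get("runningTasksCount", 0) + ci.get("pendingTasksCount", 0)
        ) > 0
        for ci in container_instances
    }


if __name__ == "__main__":
    # Illustrative data only; these instance IDs are made up.
    sample = [
        {"ec2InstanceId": "i-aaa", "runningTasksCount": 2, "pendingTasksCount": 0},
        {"ec2InstanceId": "i-bbb", "runningTasksCount": 0, "pendingTasksCount": 0},
    ]
    print(desired_protection(sample))
```

Under this rule, `i-aaa` would be protected and `i-bbb` would not.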
I'd prefer to have ECS manage the "protect from scale-in" flag on an EC2 instance. Until then, I added a call in my code's startup handler chain to set "protect from scale-in", and another in the shutdown handler chain to remove that protection, along with setting the corresponding ECS Container Instance state to DRAINING.
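A rough sketch of what such startup and shutdown handlers could look like, assuming boto3-style clients are passed in. The function names and parameters are hypothetical and not the commenter's actual code. The two calls used, `set_instance_protection` (Auto Scaling) and `update_container_instances_state` (ECS), are real boto3 client methods. Draining before removing protection is my own ordering choice; the comment above does not specify one.

```python
def protect_instance(autoscaling, asg_name, instance_id):
    """Startup handler: mark this EC2 instance as protected from scale-in."""
    autoscaling.set_instance_protection(
        AutoScalingGroupName=asg_name,
        InstanceIds=[instance_id],
        ProtectedFromScaleIn=True,
    )


def release_instance(autoscaling, ecs, asg_name, instance_id,
                     cluster, container_instance_arn):
    """Shutdown handler: drain the container instance, then drop protection."""
    # Stop ECS from placing new tasks on this host.
    ecs.update_container_instances_state(
        cluster=cluster,
        containerInstances=[container_instance_arn],
        status="DRAINING",
    )
    # Allow the Auto Scaling group to terminate the instance again.
    autoscaling.set_instance_protection(
        AutoScalingGroupName=asg_name,
        InstanceIds=[instance_id],
        ProtectedFromScaleIn=False,
    )


# Usage (assumes boto3 is installed and credentials are configured):
#   import boto3
#   autoscaling = boto3.client("autoscaling")
#   ecs = boto3.client("ecs")
#   protect_instance(autoscaling, "my-asg", "i-0123456789abcdef0")
```

The instance ID and container instance ARN would have to be discovered at runtime, for example from instance metadata and the ECS agent; that lookup is left out here.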
Community Note
Tell us about your request
When using an ECS cluster with an EC2 Auto Scaling group as capacity provider:
When tasks are started via the RunTask action on ECS, the EC2 instance the task is placed on should be protected from scale-in. These instances are marked as protected from scale-in, but still get stopped when the Auto Scaling group scales in.
Which service(s) is this request for?
ECS with EC2 ASG as capacity provider
Tell us about the problem you're trying to solve. What are you trying to do, and why is it hard?
We are using ECS to manage our background processors (a Rails app with Resque workers running inside a task).
We have two kinds of jobs: short-running, interruptible or repeatable jobs (image processing) and long-running, non-interruptible jobs (video processing and streaming). The short-running jobs are managed by an ECS Service.
Because we can't tell an ECS Service which jobs should be stopped on scale-in, we had to implement our own 'scaling logic' for the long-running jobs. We use RunTask to schedule new workers, and the tasks stop themselves when scaling down.
Bug or unexpected behavior:
When we start a new task via the RunTask action, we would expect the instance the task is started on to be marked as protected from scale-in, but it isn't.
Are you currently working around this issue?
We manually monitor the task count and start additional tasks if any task was stopped because the underlying EC2 host was terminated.
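A minimal sketch of how this top-up workaround might be automated, assuming a boto3-style ECS client is passed in. The function names, the `family` filter, and the `desired_count` parameter are hypothetical and not taken from our actual setup. `list_tasks` and `run_task` are real ECS client methods, and `run_task` accepts at most 10 tasks per call, hence the batching. Placement options such as a capacity provider strategy are omitted.

```python
def count_running_tasks(ecs, cluster, family):
    """Count tasks of a task-definition family whose desired status is RUNNING."""
    count = 0
    kwargs = {"cluster": cluster, "family": family, "desiredStatus": "RUNNING"}
    while True:
        page = ecs.list_tasks(**kwargs)
        count += len(page.get("taskArns", []))
        token = page.get("nextToken")
        if not token:
            return count
        kwargs["nextToken"] = token  # follow pagination


def top_up_tasks(ecs, cluster, family, task_definition, desired_count):
    """Start replacement tasks for any that disappeared; return how many."""
    missing = max(0, desired_count - count_running_tasks(ecs, cluster, family))
    started = 0
    while started < missing:
        batch = min(10, missing - started)  # RunTask allows up to 10 per call
        ecs.run_task(cluster=cluster, taskDefinition=task_definition, count=batch)
        started += batch
    return missing
```

Running something like `top_up_tasks` on a schedule only repairs the task count after the fact; it does not prevent a long-running job from being interrupted when its host is terminated.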
Additional context
We would not need our own scaling logic via RunTask if scale-in for services were controllable; see:
#125