Support clean shutdown #320

Closed · teosoft123 opened this issue Oct 15, 2018 · 5 comments · Fixed by #1051
Labels: feature (New functionality/enhancement)

@teosoft123

It would be great to implement a clean shutdown mode that handles some signal and waits for all child processes to complete before actually shutting down.

This is closely related to upgrades in situations where the Atlantis server code is cleanly separated from its data, for example running the atlantis server in a Docker container with the atlantis user's home directory mounted as a Docker volume.

Tracking child processes could be done nicely with cgroups, but I would discourage that implementation because it has no close analog in the Docker/k8s world, as far as I know.
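
A minimal sketch of what such signal-based clean shutdown could look like in Go, assuming a hypothetical `Drainer` that tracks in-flight Terraform runs (none of these names come from the Atlantis codebase):

```go
// A hypothetical sketch, not Atlantis code: catch SIGTERM/SIGINT and wait
// for in-flight terraform runs before exiting.
package main

import (
	"fmt"
	"os"
	"os/signal"
	"sync"
	"syscall"
)

// Drainer tracks in-flight terraform runs.
type Drainer struct {
	wg sync.WaitGroup
}

// StartOp registers an in-flight operation; the returned func marks it done.
func (d *Drainer) StartOp() func() {
	d.wg.Add(1)
	return d.wg.Done
}

// Wait blocks until every registered operation has completed.
func (d *Drainer) Wait() { d.wg.Wait() }

func main() {
	d := &Drainer{}

	// ... start the webhook/event handlers here; each would wrap its work
	// with: done := d.StartOp(); defer done()

	// On SIGTERM/SIGINT, stop accepting new work and wait for the child
	// terraform processes to finish before really exiting.
	sigCh := make(chan os.Signal, 1)
	signal.Notify(sigCh, syscall.SIGTERM, syscall.SIGINT)
	<-sigCh

	fmt.Println("draining: waiting for in-flight terraform runs to finish")
	d.Wait()
	fmt.Println("all runs complete, shutting down")
}
```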

@lkysow added the feature (New functionality/enhancement) label on Apr 4, 2019
@atheiman

We run Atlantis in AWS Fargate. To upgrade Atlantis or push out configuration changes, we first block ingress to the Fargate task at its AWS ELB. Then, after some time, we assume all Atlantis Terraform processes have completed and recreate the Fargate task with the new configuration or container image tag using local Terraform.

But we don't reliably know when all the Terraform tasks are complete. I think this could be improved by adding an API endpoint that returns the count of Terraform processes currently running.

Then our upgrade process would be:

  1. Restrict ingress to the IP address where I am running Terraform
  2. Poll that API endpoint for the current count of Terraform processes and wait for the count to drop to 0 (a sketch of such a check follows this list)
  3. Safely terraform apply the Atlantis upgrade
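
A hedged sketch of what that status endpoint and the polling step (2) could look like, assuming a hypothetical in-flight counter; the path, JSON field name, and function names are made up for illustration:

```go
// Package status is a hypothetical sketch; Atlantis does not necessarily
// expose such an endpoint today.
package status

import (
	"encoding/json"
	"fmt"
	"net/http"
	"sync/atomic"
	"time"
)

// inFlight would be incremented/decremented around every terraform
// plan/apply that Atlantis executes.
var inFlight int64

// StatusHandler returns the current count, e.g. mounted at GET /status.
func StatusHandler(w http.ResponseWriter, r *http.Request) {
	json.NewEncoder(w).Encode(map[string]int64{
		"in_progress_operations": atomic.LoadInt64(&inFlight),
	})
}

// WaitForZero is what step 2 of the upgrade process could do: poll the
// endpoint until no terraform processes remain.
func WaitForZero(url string) error {
	for {
		resp, err := http.Get(url)
		if err != nil {
			return err
		}
		var body map[string]int64
		err = json.NewDecoder(resp.Body).Decode(&body)
		resp.Body.Close()
		if err != nil {
			return err
		}
		if body["in_progress_operations"] == 0 {
			return nil
		}
		fmt.Printf("%d operations still running, waiting...\n", body["in_progress_operations"])
		time.Sleep(10 * time.Second)
	}
}
```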

@lkysow
Member

lkysow commented Aug 26, 2019

The work needed to know how many TF processes are running is the same as the work to properly pass a context through to everything and then keep the Atlantis process running until all the TF processes have stopped, so I'm not sure we need an API endpoint.
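
For illustration, a rough sketch of the context threading described here, with a hypothetical `RunTerraform` helper; tracking each run this way is what would let Atlantis both report a count and wait for everything to finish on shutdown:

```go
// Package runner is a hypothetical sketch of threading a context through to
// the terraform child processes, not code from the Atlantis repository.
package runner

import (
	"context"
	"os/exec"
	"sync"
)

// inFlight tracks every terraform command currently executing.
var inFlight sync.WaitGroup

// RunTerraform executes one terraform command and registers it so that a
// shutdown path can wait for it; the same bookkeeping would also make a
// "how many are running" count trivial to expose.
func RunTerraform(ctx context.Context, dir string, args ...string) ([]byte, error) {
	inFlight.Add(1)
	defer inFlight.Done()

	cmd := exec.CommandContext(ctx, "terraform", args...)
	cmd.Dir = dir
	return cmd.CombinedOutput()
}

// WaitForRuns blocks until every tracked terraform command has exited,
// which is what would keep the Atlantis process alive during shutdown.
func WaitForRuns() { inFlight.Wait() }
```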

@atheiman

atheiman commented Aug 26, 2019

Yeah, that would be fine as long as terraform apply-ing the Fargate task changes can wait for the clean shutdown to happen; we use https://github.com/terraform-aws-modules/terraform-aws-atlantis

I could see this being a problem if a clean shutdown has to wait an hour for a long Terraform process to finish.

@lkysow
Member

lkysow commented Aug 26, 2019

Hmm, it looks like there's a 2m max (https://forums.aws.amazon.com/thread.jspa?messageID=907417) so that wouldn't necessarily work. Maybe an API endpoint like /drain or something would be necessary.

@benoit74
Contributor

Hi,
I'm working on this (implementing a drain).
I need it because we are deploying Atlantis with Atlantis in a K8s cluster, so we need RollingUpgrades + clean pod termination.
As you suggested, I'm implementing a drain endpoint, with a POST to start the drain and a GET to check its completion.
I will probably also implement an operation like "atlantis shutdown" which calls this endpoint locally and waits for the drain to complete, so that it can be used in the preStop hook on K8s. I will probably have a working prototype before the end of the week.
We will battle-test it ASAP on our cluster.
This means I will also propose a chart update for RollingUpgrades + a preStop hook in the lifecycle.
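
For reference, a rough sketch of what such a POST-to-start / GET-to-poll drain endpoint might look like; the types, paths, and JSON fields here are hypothetical and not taken from the eventual implementation in #1051:

```go
// Package drain is a hypothetical sketch of a drain endpoint, not the code
// that eventually landed.
package drain

import (
	"encoding/json"
	"net/http"
	"sync"
)

// Drainer tracks in-flight operations and whether a drain has been requested.
type Drainer struct {
	mu       sync.Mutex
	draining bool
	ongoing  int
}

// StartDrain flips the server into drain mode so new operations are refused.
func (d *Drainer) StartDrain() {
	d.mu.Lock()
	defer d.mu.Unlock()
	d.draining = true
}

// Status reports whether draining has started and whether it has completed.
func (d *Drainer) Status() (started, completed bool, ongoing int) {
	d.mu.Lock()
	defer d.mu.Unlock()
	return d.draining, d.draining && d.ongoing == 0, d.ongoing
}

// Handler serves both verbs on one path: POST starts the drain, GET (or the
// POST response itself) reports progress.
func Handler(d *Drainer) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		if r.Method == http.MethodPost {
			d.StartDrain()
		}
		started, completed, ongoing := d.Status()
		json.NewEncoder(w).Encode(map[string]interface{}{
			"drain_started":      started,
			"drain_completed":    completed,
			"ongoing_operations": ongoing,
		})
	}
}
```

An "atlantis shutdown"-style command or a K8s preStop hook could then POST to the path once and keep GETing it until `drain_completed` is true.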
