
Implement an automerge feature. #389

Closed
brndnmtthws wants to merge 1 commit into master from automerge

Conversation

brndnmtthws

Similar to autoplan, automerge is a feature whereby Atlantis will merge
a PR/MR once all plans have successfully been applied.

This addresses issue #186.
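
To make the intent concrete, here is a rough sketch of the flow (a hypothetical illustration, not this PR's actual code; type and method names such as VCSClient.MergePull and PendingPlans are made up): after each successful apply, check whether any unapplied plans remain for the pull request, and merge only when none do.

package automerge

import "log"

// PullRequest and VCSClient are simplified stand-ins for the real Atlantis
// types; they exist only to illustrate the automerge flow described above.
type PullRequest struct{ Num int }

type VCSClient interface {
    // MergePull merges the PR/MR on the Git host (GitHub, GitLab, Bitbucket).
    MergePull(pull PullRequest) error
}

type Automerger struct {
    VCS VCSClient
    // PendingPlans reports plans that were generated but not yet applied.
    PendingPlans func(pull PullRequest) ([]string, error)
}

// AutomergeIfDone merges the pull request once every plan has been applied.
func (a Automerger) AutomergeIfDone(pull PullRequest) error {
    pending, err := a.PendingPlans(pull)
    if err != nil {
        return err
    }
    if len(pending) > 0 {
        log.Printf("%d plan(s) still unapplied; not merging #%d yet", len(pending), pull.Num)
        return nil
    }
    log.Printf("all plans applied; automerging #%d", pull.Num)
    return a.VCS.MergePull(pull)
}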

@brndnmtthws (Author)

I've tested this with GitHub, but not GitLab or Bitbucket.

codecov bot commented Dec 13, 2018

Codecov Report

Merging #389 into master will decrease coverage by 0.71%.
The diff coverage is 32.43%.


@@            Coverage Diff             @@
##           master     #389      +/-   ##
==========================================
- Coverage   69.77%   69.05%   -0.72%     
==========================================
  Files          62       62              
  Lines        3801     3875      +74     
==========================================
+ Hits         2652     2676      +24     
- Misses        960     1004      +44     
- Partials      189      195       +6
Impacted Files Coverage Δ
server/events/project_command_builder.go 82.9% <ø> (ø) ⬆️
server/events/yaml/valid/valid.go 92% <ø> (ø) ⬆️
cmd/server.go 79.16% <ø> (ø) ⬆️
server/events/vcs/proxy.go 28.57% <0%> (-2.2%) ⬇️
server/events/vcs/bitbucketcloud/client.go 39.02% <0%> (-2.01%) ⬇️
server/events/vcs/github_client.go 60.22% <0%> (-7.73%) ⬇️
server/events/vcs/gitlab_client.go 18.58% <0%> (-1.23%) ⬇️
server/events/vcs/bitbucketserver/client.go 30.18% <0%> (-1.82%) ⬇️
server/events/vcs/not_configured_vcs_client.go 0% <0%> (ø) ⬆️
server/server.go 66.4% <100%> (+0.13%) ⬆️
... and 3 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 858dbe4...fc45f10.

codecov bot commented Dec 13, 2018

Codecov Report

Merging #389 into master will decrease coverage by 0.85%.
The diff coverage is 52.75%.


@@            Coverage Diff             @@
##           master     #389      +/-   ##
==========================================
- Coverage   70.78%   69.92%   -0.86%     
==========================================
  Files          63       63              
  Lines        3987     4067      +80     
==========================================
+ Hits         2822     2844      +22     
- Misses        970     1016      +46     
- Partials      195      207      +12
Impacted Files Coverage Δ
server/events/yaml/valid/valid.go 93.54% <ø> (ø) ⬆️
server/events/project_command_builder.go 84.47% <ø> (ø) ⬆️
cmd/server.go 78.23% <ø> (-0.72%) ⬇️
server/user_config.go 100% <ø> (ø) ⬆️
server/events/vcs/bitbucketcloud/client.go 39.02% <0%> (-2.01%) ⬇️
server/events/vcs/not_configured_vcs_client.go 0% <0%> (ø) ⬆️
server/events/vcs/gitlab_client.go 18.26% <0%> (-1.19%) ⬇️
server/events/vcs/bitbucketserver/client.go 29.81% <0%> (-1.77%) ⬇️
server/events/vcs/proxy.go 28.57% <0%> (-2.2%) ⬇️
server/events/yaml/raw/config.go 100% <100%> (ø) ⬆️
... and 18 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 56b09b6...88909dd.

@brndnmtthws force-pushed the automerge branch 2 times, most recently from 96fa097 to 317cf83, on December 13, 2018 at 02:33
@brndnmtthws (Author)

The most recent CI failure seems unrelated to any code changes. I think it just needs to be re-run.

ajbouh commented Dec 20, 2018

@brndnmtthws Excited about this PR, looks awesome!

@brndnmtthws (Author)

Just rebased on master to resolve the conflicts.

@brndnmtthws (Author)

Looks like some tests broke because of changes elsewhere, so I've updated them.

@lkysow (Member) commented Jan 4, 2019

Hey Brenden, I'm just getting back into things after the holidays. I'll review this asap.

@brndnmtthws (Author)

Sweet, thanks for the update! Hope you had a nice break.

@@ -0,0 +1,17 @@
# Automerging
Contributor

Wouldn't the service account used to post comments etc. require write permissions for this to work?

Author

The account would indeed require write permissions. Is there something in the doc which contradicts that?

Contributor

Nothing that contradicts it, but it doesn't mention that requirement explicitly anywhere :)

@brndnmtthws (Author) commented Jan 4, 2019

It's required for Atlantis to work correctly (I think?), so I don't see why it would need to be explicitly called out in this section of the doc.

Contributor

Ah, cool. Nvm then 👍

Member

It actually does require increased permissions depending on your setup. For example, my user token can comment on public pull requests but it can't merge unless I'm a collaborator.

The token needs to be from a user with those permissions.

@lkysow (Member) commented Jan 8, 2019

@brndnmtthws don't worry about rebasing or any of the conflicts. I'm pulling this into my own branch to add some finishing touches.

Can you tell me which Git hosts you've tested this with?

@lkysow (Member) commented Jan 8, 2019

Hmm, I've found something concerning. If you generate a plan for multiple projects and one of the plans fails, then if you apply the other project, Atlantis will automerge because there are no outstanding plans. The only reason it thinks that, though, is that there are no plan files.

Instead, we really should keep track of which projects we expect plans for in a PR. I've run into something similar in another bug where just using the planfiles on disk isn't enough. Instead I think we might need to keep track of this stuff in the on-disk database. I'm not asking you to do it, I will take it on, but what do you think?

edit: see https://github.com/lkysow/atlantis-example/pull/5
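
To illustrate the gap, here is a minimal sketch of the fragile check (assuming plans live as *.tfplan files under a per-pull directory; the paths and naming are assumptions, not Atlantis's actual layout): a project whose plan failed never wrote a plan file, so an empty directory is indistinguishable from "all plans applied".

package automerge

import "path/filepath"

// allPlansApplied decides "ready to automerge" purely from plan files left on
// disk. A failed plan leaves no file, so it looks the same as an applied one.
func allPlansApplied(pullDir string) (bool, error) {
    planFiles, err := filepath.Glob(filepath.Join(pullDir, "*", "*.tfplan"))
    if err != nil {
        return false, err
    }
    return len(planFiles) == 0, nil
}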

@brndnmtthws (Author) commented Jan 8, 2019

Can you tell me which Git hosts you've tested this with?

Only tested with GitHub.

Hmm, I've found something concerning. If you generate a plan for multiple projects and one of the plans fails, then if you apply the other project, Atlantis will automerge because there are no outstanding plans. The only reason it thinks that, though, is that there are no plan files.

Instead, we really should keep track of which projects we expect plans for in a PR. I've run into something similar in another bug where just using the planfiles on disk isn't enough. Instead I think we might need to keep track of this stuff in the on-disk database. I'm not asking you to do it, I will take it on, but what do you think?

Ah, I never considered that case. I sort of assumed that if the plan fails, it would not allow you to apply. Perhaps that's the real issue: maybe it shouldn't allow you to run apply until all the plans have succeeded? You could set a trap where it removes any pending plans if any of the plans in the repo fail for any reason.

Edit: to elaborate a little more on that, the scary thing for me would be ending up in an inconsistent state where some of the changes are applied and the rest are not. It seems wise to me to err on the side of caution and back out.
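
A sketch of that back-out behaviour, under the same assumed *.tfplan layout (illustrative only, not the PR's code): when any project's plan fails, discard the remaining pending plans so nothing can be applied piecemeal.

package automerge

import (
    "os"
    "path/filepath"
)

// discardPendingPlans deletes every pending plan file for a pull request so a
// partially-failed plan run cannot be partially applied.
func discardPendingPlans(pullDir string) error {
    planFiles, err := filepath.Glob(filepath.Join(pullDir, "*", "*.tfplan"))
    if err != nil {
        return err
    }
    for _, f := range planFiles {
        if err := os.Remove(f); err != nil {
            return err
        }
    }
    return nil
}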

@lkysow (Member) commented Jan 9, 2019

I'm not sure I like deleting successful plans if any other projects fail because I like that Atlantis gives the users a choice. Automatically deleting the successful plans feels like Atlantis trying to be "too smart". There's all sorts of weird edge cases people might get in, and I like that Atlantis gives you a way to get out of them (for example via a manual atlantis plan -d). I know that this is also giving people the ability to shoot themselves in the foot but personally I like that tradeoff.

@gordonbondon @Vlaaaaaaad @pratikmallya @majormoses @sstarcher how do you feel about this: if there are >1 plans in a pull request and one of them fails, the other plans are deleted and you can't apply any of them until all of the plans generate successfully?

@brndnmtthws (Author)

I'm not sure I like deleting successful plans if any other projects fail because I like that Atlantis gives the users a choice. Automatically deleting the successful plans feels like Atlantis trying to be "too smart". There's all sorts of weird edge cases people might get in, and I like that Atlantis gives you a way to get out of them (for example via a manual atlantis plan -d). I know that this is also giving people the ability to shoot themselves in the foot but personally I like that tradeoff.

Yep this makes sense. For my particular use case, it's preferred to avoid any sort of traps, so it's better to fail all plans if any one plan fails.

Would be good to see what other people think.

Another option is to provide a config parameter for this, but it's probably better to take the convention over configuration approach. More knobs, more problems.

@ajbouh commented Jan 9, 2019

Really excited about this feature!

The only argument I can think of against deleting successful plans if others fail is to fight flakiness.

If you have an enormous number of projects and there's a decent chance one of them will fail to generate a plan, you'd rather "lock in" the ones that work and just try to regenerate the ones that didn't.

In spite of that, I would actually assume that the behavior is all or nothing (just like plan generation for an individual project).

@sstarcher

I think the main benefit of requiring all of the plans to pass is ensuring that all plans are applied before a merge happens. I would not want a user applying a single plan, ignoring the others, and merging in the PR. If, while fixing the other plan, the user decides not to merge the PR after all, they now have changes applied to one set of terraform states that will never be on master, and a user has to run another apply to revert them.

@Vlaaaaaaad

Yup, I agree with failing everything if a single plan fails. I don't want to end up in weird states with weird plans. And if a previous plan was successful, it should be successful the second time it's run too, so deleting all of them works for me.

I think Atlantis should fail fast and give me the chance to fix the issue.

@kipkoan (Contributor) commented Jan 10, 2019 via email

@brndnmtthws (Author)

This is a breaking change so it needs to be a configurable option. Currently Atlantis allows us to apply a single project/workspace. We need that since we have multiple environments in a repo, and need to be able to plan/apply changes in test without also applying in prod.

I agree, this would be a breaking change if you depend on the current behaviour. So the question is, what should the behaviour be? Is this the behaviour we want? My preference is to emulate terraform itself, and fail when anything goes wrong.

Could you please elaborate a little bit more on the way you're using Atlantis?

The idea of "disallow apply when any plan fails" is to prevent any PR with interdependent changes across projects/workspaces from resulting in inconsistent state. It's the same way terraform itself handles terraform plan: if any individual change fails, the entire process fails (unless you explicitly state which target to apply with terraform plan -target=...).

There isn't an easy way to handle separate workspaces in Atlantis yet. Maybe we should add an abstraction which lets you configure the failure handling on a per-workspace basis? The mapping of project to workspace is currently 1:1, but maybe we need to be able to specify workspaces and projects separately.

Here's how it currently looks:

version: 2
projects:
- dir: project1
  workspace: staging
- dir: project1
  workspace: production
- dir: project2
  workspace: staging
- dir: project2
  workspace: production
- dir: project3
  workspace: staging
- dir: project3
  workspace: production

If it is going to be configurable, I think it would make a lot more sense to do it on a per-project/workspace level. Having another global config flag isn't desirable IMO. Something like this would be a lot better:

version: 3
workspaces:
- name: staging
  allow_failures: yes
- name: production
  allow_failures: no
projects:
- dir: project1
- dir: project2
- dir: project3
  allow_failures: yes

Thoughts?

@sstarcher

@kipkoan I don't know if we are talking about the same thing. We are not talking about requiring you to apply all changes, only that all changes must plan successfully. If they do, then you will still be able to apply individual changes.

@kipkoan (Contributor) commented Jan 10, 2019

Our repo directory structure is exactly like this: https://www.terraform.io/docs/enterprise/workspaces/repo-structure.html#directories

When we do a PR, Atlantis runs a plan in each environment directory.

Even if one of the environment plans fails, we need to be able to apply any plan that succeeded.

@Vlaaaaaaad

@kipkoan no separate workspaces for prod vs stage vs dev? Do you modify all of them in a single pull request and do specific applies?

@kipkoan (Contributor) commented Jan 10, 2019

The idea of "disallow apply when any plan fails" is to prevent any PR with interdependent changes across projects/workspaces from resulting in inconsistent state.

Our environment directories (and thus terraform plans) are not interdependent.

If you want multiple plans to be interdependent, and thus need to be applied as a group or not at all, then you can combine them into the same plan.

Atlantis currently behaves the way Terraform behaves -- plans are independent. You can apply one plan even if a different plan is failing. This should be the default.

I'm fine with having Atlantis be able to group plans to make them interdependent (and only apply if all plans succeed), but that should not be the default.

@kipkoan (Contributor) commented Jan 10, 2019

no separate workspaces for prod vs stage vs dev?

We do not use workspaces(*) for separate environments (and have verified that this is HashiCorp's recommendation).

Do you modify all of them in a single pull request and do specific applies?

A single PR could modify a single environment (and thus only one environment would have changes to apply), or a single PR could modify a module that is used across multiple environments. When multiple environments are changed, we almost always apply only in non-prod environments (dev/test/stage) and will open another PR later to apply in prod once testing/QA has been done.

(*) OSS Terraform uses the term "workspaces" differently than TFE does.

@sstarcher

In your specific use-case I would recommend not enabling this automerge feature. Merging in unapplied PRs seems to be a very strong anti-pattern that we should lean away from supporting.

A PR should be considered a group. Yes terraform treats things separately, but Atlantis should not.

@kipkoan (Contributor) commented Jan 10, 2019

In your specific use-case I would recommend not enabling this automerge feature.

Agreed, we don't currently have a use case for automerging. I just wanted to make sure that existing functionality is maintained.

Merging in unapplied PRs seems to be a very strong anti-pattern that we should lean away from supporting.

How do you promote changes through non-prod -> prod environments?

A PR should be considered a group. Yes terraform treats things separately, but Atlantis should not.

I think until there is a way for Atlantis to do environment promotions, Atlantis needs to treat plans separately.

@kipkoan (Contributor) commented Jan 10, 2019

@sstarcher and @Vlaaaaaaad -

  1. What is your repo directory structure?
  2. How do you promote tf changes through non-prod environments to prod?
  3. What is the use case for having multiple plans in a single PR, while at the same time not wanting to combine those into a single plan?

@sstarcher

Atlantis should have nothing to do with environment promotion.

For me I version my terraform modules. I create a PR for dev. When I'm ready for staging I create a PR for staging etc

@kipkoan (Contributor) commented Jan 10, 2019

For me I version my terraform modules. I create a PR for dev. When I'm ready for staging I create a PR for staging etc

How do you guarantee that the PR for prod has already been successfully run in staging?

@kipkoan (Contributor) commented Jan 10, 2019

For me I version my terraform modules. I create a PR for dev. When I'm ready for staging I create a PR for staging etc

How do you version your modules? I'm curious about your workflow to see if it could work for us too. Do you have a separate repo for your modules than your environments?

@sstarcher

Each terraform module is in its own GitHub repo, and each repo has tests. In development we use the module's SHA to select what to use, for fast iteration. Once it's ready for staging we tag the module and update staging to point to that tag. When we're happy with it there, we update prod to use that version.

We treat terraform like we treat any other piece of software. We test, version, tag and pin our versions.

@kipkoan (Contributor) commented Jan 11, 2019

[moved to Slack]

@lkysow (Member) commented Jan 11, 2019

I think if automerge is on, then we require all plans to be successful. Otherwise we keep the original behaviour where some plans can fail but you can still apply successful plans.
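
As a rough sketch of that rule (the names are illustrative, not the eventual implementation):

package automerge

// applyAllowed captures the proposed behaviour: with automerge enabled, an
// apply is only allowed once every project's plan succeeded; without it, the
// existing behaviour stands and successful plans can be applied individually.
func applyAllowed(automergeEnabled bool, failedPlans int) bool {
    if automergeEnabled {
        return failedPlans == 0
    }
    return true
}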

@ajbouh commented Jan 11, 2019

Is this an easy thing to do in terms of code?

@brndnmtthws (Author)

I have updated the PR with the following changes:

  • Atlantis will write errors to a file when either terraform init or terraform plan fails.
  • If any error files are found from any previous steps, automerge will not proceed.
  • Added a few more test cases.

I may have missed some things, but I'd like to collect feedback in the meantime. Everything appears to be working as intended.
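
A minimal sketch of that mechanism, assuming an illustrative *.tferr naming convention (the actual file names in the PR may differ): the terraform output of a failed init or plan is written next to the plans, and the presence of any such file blocks automerge.

package automerge

import (
    "io/ioutil"
    "path/filepath"
)

// writePlanError records the terraform init/plan output of a failed project so
// later steps can see that something went wrong.
func writePlanError(projectDir string, tfOutput []byte) error {
    return ioutil.WriteFile(filepath.Join(projectDir, "plan.tferr"), tfOutput, 0600)
}

// hasPlanErrors reports whether any project in the pull request recorded a
// failure; if so, automerge must not proceed.
func hasPlanErrors(pullDir string) (bool, error) {
    errFiles, err := filepath.Glob(filepath.Join(pullDir, "*", "*.tferr"))
    if err != nil {
        return false, err
    }
    return len(errFiles) > 0, nil
}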

@brndnmtthws (Author)

Instead, we really should keep track of which projects we expect plans for in a PR. I've run into something similar in another bug where just using the planfiles on disk isn't enough. Instead I think we might need to keep track of this stuff in the on-disk database. I'm not asking you to do it, I will take it on, but what do you think?

I looked at the on-disk DB stuff, and I wasn't sure of a good way to handle this, so I went a different route: I'm simply writing a file into the working dir with the TF output when something fails. I think this is nice because it's easy to inspect externally, and doesn't require anything other than the filesystem.

Also, the circleci website_link_check job is failing, but I think that's because I'm pushing from an external repo.

Similar to autoplan, automerge is a feature whereby Atlantis will merge
a PR/MR once all plans have successfully been applied.

This addresses issue runatlantis#186.
@brndnmtthws (Author)

I just rebased this on master again, fixed the conflicts, and removed an extra println() that I forgot to remove.

@pratikmallya (Contributor)

Can this be merged? This is a feature that is often asked for : )

@lkysow (Member) commented Feb 4, 2019

I am currently working on this in a different branch. I don't like the solution of writing out files to the file system to mimic a database. Instead I will be using the BoltDB database we already have.
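
For reference, a minimal sketch of what tracking per-project status in BoltDB could look like (the bucket and key names are assumptions, not the schema the follow-up actually uses):

package automerge

import "github.com/boltdb/bolt"

// recordProjectStatus stores the latest status ("planned", "applied",
// "errored", ...) for one project of a pull request in the existing BoltDB
// database, instead of inferring state from files left on disk.
func recordProjectStatus(db *bolt.DB, pullKey, project, status string) error {
    return db.Update(func(tx *bolt.Tx) error {
        bucket, err := tx.CreateBucketIfNotExists([]byte("pulls/" + pullKey))
        if err != nil {
            return err
        }
        return bucket.Put([]byte(project), []byte(status))
    })
}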

@lkysow mentioned this pull request Feb 6, 2019
@lkysow closed this in #457 Feb 7, 2019