-
Notifications
You must be signed in to change notification settings - Fork 262
installer/scripts/maintenance/: AWS tag and delete scripts #1239
Conversation
92195ff to
57f7781
Compare
| fi | ||
|
|
||
| echo "Tags to delete:" | ||
| cat "$tag_file" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you think it'd be valuable to add a prompt here to ask whether it is going to do what was intended? As we also run this in CI, could add a -f flag to bypass it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As in prompt the user with a printout of tag/config files before actually deleting resources with those tags?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah. Tags to delete: ... Are you sure?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup I can add that
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The last worldwide S3 outage was due to a cleanup/scale-down user-operated script that was given 'a little more'.
|
|
||
| set +eux | ||
|
|
||
| rm -rf $tmp_dir |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You already trap that, right?
| --tag-file A file containing a TagFilter list. | ||
| --date-override (optional) Date of the format YYYY-MM-DD that overrides | ||
| the default tag value. Only use if --tag-file is not used. | ||
| --workspace-dir (optional) Parent directory for a temporary directory. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like the description is not super clear, especially given this script will remove that directory. What if I pass /tmp there? :'(
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the script will create a "${workspace}/config" dir and only remove that
| --end-hour Integer hour to end looking at CloudTrail logs. | ||
| --date-override (optional) Date of the format YYYY-MM-DD that overrides | ||
| the default tag value. | ||
| --workspace-dir (optional) Parent directory for a temporary directory. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto
|
|
||
| set +eux | ||
|
|
||
| rm -rf "$tmp_dir" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto
| -e AWS_ACCESS_KEY_ID="$AWS_ACCESS_KEY_ID" \ | ||
| -e AWS_SECRET_ACCESS_KEY="$AWS_SECRET_ACCESS_KEY" \ | ||
| -e CONFIG_FILE="/tmp/config/delete-config-${region}.toml" \ | ||
| -e TAG_FILE="/tmp/config/$(basename $tag_file)" \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hardcoded /tmp/config, versus $tmp_dir.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
once the file is in the Docker container, its location does not matter
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh yeah cool soz, didn't pay attention
| -e AWS_ACCESS_KEY_ID="$AWS_ACCESS_KEY_ID" \ | ||
| -e AWS_SECRET_ACCESS_KEY="$AWS_SECRET_ACCESS_KEY" \ | ||
| -e CONFIG_FILE="/tmp/config/tag-config-${region}.toml" \ | ||
| -e TAG_FILE="/tmp/config/tag-exclude.json" \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto
|
Great stuff, thanks! Could you explain me why do we need |
| --aws-region The AWS region you wish to query for taggable resources. This | ||
| flag is optional if AWS_REGION is set. | ||
| --tag-file A file containing a TagFilter list. | ||
| --date-override (optional) Date of the format YYYY-MM-DD that overrides |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A bit more description would be nice here too.
| [grafiti] | ||
| region = "$region" | ||
| backoffTime = 500 | ||
| EOF |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Would be nice if some day grafiti didn't require config files, and you could pass these as flags or env vars to avoid needing stuff like this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
or if at least some of them could be overridden with flags
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree. I'll make an issue in coreos/grafiti, unless you'd like to
| @@ -0,0 +1,111 @@ | |||
| #!/usr/bin/env bash | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general it would be nice if most of this logic could be moved into Go or accommodated by grafiti in some way. Having a big bash wrapper script is not fun.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yea
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sym3tri would you like to postpone merging these scripts until that time? I'm going to add env variable support for certain config file fields asap.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@estroz https://github.com/spf13/viper is your best friend. CLI arguments + Env-based configuration + YAML/JSON/TOML/HCL! configuration in one tool. Usually used with https://github.com/spf13/cobra too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@estroz Is this concern resolved? Are we addressing this in another PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Quentin-M @sym3tri final thoughts? Writing this in Go wouldn't be difficult, but perhaps it should be written in Ruby as per @mxinden's work
185050f to
46d17e1
Compare
|
@Quentin-M grafiti can't parse Route53 events because they are not available via the CloudTrail API (only via CloudTrail S3 log files), so I wrote that script to tag all hosted zones in the same manner as |
3fefd29 to
d3bd9df
Compare
|
@estroz What is the difference between the scripts If I understand correctly you want to check the |
7facba8 to
a20abb2
Compare
|
@mxinden yes these scripts correspond to those Jenkins jobs. These scripts are meant to be more versatile than the Jenkins jobs, i.e. we can use them from the command line if needed. |
a20abb2 to
9ee6d53
Compare
|
@estroz One comment/question, the grafiti image is public? or do we need login into quay to get the image? thanks |
|
@cpanato the image is indeed public |
|
@estroz What is the current status of this PR? Are there any blockers? |
57c44af to
73c46ac
Compare
|
@mxinden PR is good to go |
|
Removing label |
|
@estroz Could you address the failing |
scripts/maintenance: scripts that tag and delete AWS resources using the `grafiti` Docker container (see https://github.com/coreos/grafiti), and a script that only tags Route53 hosted zones using the AWS CLI.
73c46ac to
82bdd9f
Compare
|
@estroz Thanks for fixing the shellcheck issues. If there are no further objections by anyone I would like to merge here in order to proceed with #1564 as well. @estroz Outside of this PR: Do you have time to move this logic into the Grafiti go binary so that it can be both configured via a config file as well as via command line parameters? Would that be an option? |
|
@mxinden I'm reworking some of grafiti's configuration options. I'll take into account this use case. We can revisit this issue when I'm finished. For now let's merge this! |
The typos are from 82bdd9f (installer/scripts: AWS tag and delete scripts, 2017-06-28, coreos/tectonic-installer#1239). While I'm touching these lines, also send their output to stderr instead of stdout (because we're reporting an error). Also exit nonzero in these unrecognized-option cases.
…ing deps At least on Bash 4.4, the -V form writes a missing-command message to stderr: $ echo $BASH_VERSION 4.4.23(1)-release $ command -v does-not-exist >/dev/null $ command -V does-not-exist >/dev/null -bash: command: does-not-exist: not found Because we're not redirecting command's stderr in the script, it will fall through to the caller's stderr and make it easier for them to figure out what went wrong. Also write a "Missing required dependencies" string to stderr. We'd had a "Dependencies not installed." string literal from 82bdd9f (installer/scripts: AWS tag and delete scripts, 2017-06-28, coreos/tectonic-installer#1239), but it hadn't been written anywhere so it was effectively an internal comment.
The 'echo -e' calls landed with the script in 82bdd9f (installer/scripts: AWS tag and delete scripts, 2017-06-28, coreos/tectonic-installer#1239). But $tags doesn't actually need any backslash expansion: $ date_string='2000-01-01' $ tags="[{\"Key\":\"expirationDate\",\"Value\":\"$date_string\"}]" $ echo "${tags}" [{"Key":"expirationDate","Value":"2000-01-01"}] Perhaps the original concern was over the \" in the tags definition, but those are expanded by the shell during that initialization. By the time we get around to invoking echo, there are no backslashes left (unless the user has injected some via --date-override, and we don't want to support that ;).
So folks don't need to bother setting --aws-region or $AWS_REGION to use their usual default. Docs for the config file settings are in [1]. I've also adjusted the logic so that the precedence is: 1. --aws-region, falling back to 2. $AWS_REGION, falling back to 3. ~/.aws Previously, $AWS_REGION took precedence, and has since the scripts landed in 82bdd9f (installer/scripts: AWS tag and delete scripts, 2017-06-28, coreos/tectonic-installer#1239). But having environment variables override explicitly-set command line options is not idiomatic. [1]: https://docs.aws.amazon.com/cli/latest/userguide/cli-config-files.html
tag-aws.sh is using grafiti, whose tagPatterns takes jq expressions [1]. We've been using strftime since the script landed in 82bdd9f (installer/scripts: AWS tag and delete scripts, 2017-06-28, coreos/tectonic-installer#1239). jq's strftime doesn't respect your configured $TZ, but the coming jq 1.6 will add strflocaltime which does [2,3]. jq uses seconds since the epoch for date-time values [4]. You can test the new construct with: $ jq --null-input --raw-output 'now + 24*60*60 | strftime("%Y-%m-%d")' 2018-07-27 -d is not part of the POSIX date specification [5], but it (and the 'tomorrow' value) are supported by GNU Coreutils [6,7]. We've been using -d in clean-aws.sh for a while now, so this is now a new dependency. I've also dropped date_override, since we can just set date_string directly. And I've shuffled around some of the conditionals to avoid calling the 'date' and 'jq' commands needlessly when --date-override is set. I've also replaced the multiple date calls in clean-aws.sh with a single call to jq. jq was already a required dependency for this script, and only needing a single child process is much faster: $ time for i in $(seq 100); do A="$(jq --null-input '[["%Y-%m-%d", "%Y-%-m-%-d", "%m-%d-%Y", "%m-%-d-%-Y", "%-m-%-d-%-Y", "%d-%m-%Y", "%d-%-m-%-Y"][] | . as $format | [now, now - 24*60*60][] | strftime($format)]')"; done real 0m0.256s user 0m0.186s sys 0m0.077s $ time for i in $(seq 100); do A="$(date "+%Y-%m-%d" -d "-1 day")\",\"$(date "+%Y-%-m-%-d" -d "-1 day")\",\"$(date "+%m-%-d-%-Y" -d "-1 day")\",\"$(date "+%-m-%-d-%-Y" -d "-1 day")\",\"$(date "+%d-%m-%-Y" -d "-1 day")\",\"$(date "+%d-%-m-%-Y" -d "-1 day")\",\"$(date +%m-%d-%Y)\",\"$(date +%d-%m-%Y)\",\"$(date +%d-%-m-%Y)\",\"$(date +%Y-%m-%d)\",\"$(date +%Y-%-m-%-d)"; done real 0m1.358s user 0m0.604s sys 0m0.832s And that's despite the fact that the old approach skipped some formats for today (e.g. %m-%-d-%-Y had been only used to format yesterday). The plethora of date formats are mostly from 3995263 (ci: add more date format when grafiti apply the cleanning, 2017-09-12, coreos/tectonic-installer#1890), although we've had some since 82bdd9f. The motivation seems to be matching human-generated tags [8], which are less reliably formatted. [1]: https://github.com/coreos/grafiti/blob/89a8bc92ad7fde49cd3dd78197c6b3616a857f36/README.md#configure-grafiti [2]: https://github.com/stedolan/jq/wiki/FAQ [3]: jqlang/jq@06f2060 [4]: https://stedolan.github.io/jq/manual/v1.5/#Dates [5]: http://pubs.opengroup.org/onlinepubs/9699919799/utilities/date.html [6]: https://www.gnu.org/software/coreutils/manual/html_node/Options-for-date.html [7]: https://www.gnu.org/software/coreutils/manual/html_node/Relative-items-in-date-strings.html [8]: coreos/tectonic-installer#1890 (comment)
So folks don't need to bother setting --aws-region or $AWS_REGION to use their usual default. Docs for the config file settings are in [1]. I've also adjusted the logic so that the precedence is: 1. --aws-region, falling back to 2. $AWS_REGION, falling back to 3. ~/.aws Previously, $AWS_REGION took precedence, and has since the scripts landed in 82bdd9f (installer/scripts: AWS tag and delete scripts, 2017-06-28, coreos/tectonic-installer#1239). But having environment variables override explicitly-set command line options is not idiomatic. [1]: https://docs.aws.amazon.com/cli/latest/userguide/cli-config-files.html
tag-aws.sh is using grafiti, whose tagPatterns takes jq expressions [1]. We've been using strftime since the script landed in 82bdd9f (installer/scripts: AWS tag and delete scripts, 2017-06-28, coreos/tectonic-installer#1239). jq's strftime doesn't respect your configured $TZ, but the coming jq 1.6 will add strflocaltime which does [2,3]. jq uses seconds since the epoch for date-time values [4]. You can test the new construct with: $ jq --null-input --raw-output 'now + 24*60*60 | strftime("%Y-%m-%d")' 2018-07-27 -d is not part of the POSIX date specification [5], but it (and the 'tomorrow' value) are supported by GNU Coreutils [6,7]. We've been using -d in clean-aws.sh for a while now, so this is now a new dependency. I've also dropped date_override, since we can just set date_string directly. And I've shuffled around some of the conditionals to avoid calling the 'date' and 'jq' commands needlessly when --date-override is set. I've also replaced the multiple date calls in clean-aws.sh with a single call to jq. jq was already a required dependency for this script, and only needing a single child process is much faster: $ time for i in $(seq 100); do A="$(jq --null-input '[["%Y-%m-%d", "%Y-%-m-%-d", "%m-%d-%Y", "%m-%-d-%-Y", "%-m-%-d-%-Y", "%d-%m-%Y", "%d-%-m-%-Y"][] | . as $format | [now, now - 24*60*60][] | strftime($format)]')"; done real 0m0.256s user 0m0.186s sys 0m0.077s $ time for i in $(seq 100); do A="$(date "+%Y-%m-%d" -d "-1 day")\",\"$(date "+%Y-%-m-%-d" -d "-1 day")\",\"$(date "+%m-%-d-%-Y" -d "-1 day")\",\"$(date "+%-m-%-d-%-Y" -d "-1 day")\",\"$(date "+%d-%m-%-Y" -d "-1 day")\",\"$(date "+%d-%-m-%-Y" -d "-1 day")\",\"$(date +%m-%d-%Y)\",\"$(date +%d-%m-%Y)\",\"$(date +%d-%-m-%Y)\",\"$(date +%Y-%m-%d)\",\"$(date +%Y-%-m-%-d)"; done real 0m1.358s user 0m0.604s sys 0m0.832s And that's despite the fact that the old approach skipped some formats for today (e.g. %m-%-d-%-Y had been only used to format yesterday). The plethora of date formats are mostly from 3995263 (ci: add more date format when grafiti apply the cleanning, 2017-09-12, coreos/tectonic-installer#1890), although we've had some since 82bdd9f. The motivation seems to be matching human-generated tags [8], which are less reliably formatted. [1]: https://github.com/coreos/grafiti/blob/89a8bc92ad7fde49cd3dd78197c6b3616a857f36/README.md#configure-grafiti [2]: https://github.com/stedolan/jq/wiki/FAQ [3]: jqlang/jq@06f2060 [4]: https://stedolan.github.io/jq/manual/v1.5/#Dates [5]: http://pubs.opengroup.org/onlinepubs/9699919799/utilities/date.html [6]: https://www.gnu.org/software/coreutils/manual/html_node/Options-for-date.html [7]: https://www.gnu.org/software/coreutils/manual/html_node/Relative-items-in-date-strings.html [8]: coreos/tectonic-installer#1890 (comment)
maintenance/: scripts that tag and delete AWS resources using the
grafitiDocker container (see https://github.com/coreos/grafiti), and a script that only tags Route53 hosted zones using the AWS CLI.These scripts can be used to tag and delete AWS resources by region, provided you have access to quay.io/coreos/grafiti.