This repository was archived by the owner on Feb 5, 2020. It is now read-only.
Merged
60 changes: 34 additions & 26 deletions Documentation/dev/node-bootstrap-flow.md
@@ -118,9 +118,15 @@ ExecStartPost=/bin/touch /opt/tectonic/init_tectonic.done
Tectonic service unit is not enabled by default. It is instead triggered by a path unit, which waits for assets written synchronously by terraform.

This service waits for bootkube process to be *completed* via systemd dependency.
It is a oneshot service, thus marked as started only once the script return with success.
It is a oneshot service, thus marked as started only once the script returns with success.
It is skipped on further boots, as the condition-path exists.

### `rm-assets.path` and `rm-assets.service`

This service waits for the bootkube and tectonic process to be completed.
It is a oneshot service, thus marked as started only once the script returns with success.
This is an optional service only present on platforms which pull assets from block storage.

## Diagram

This is a visual simplified representation of the overall bootstrapping flow.
@@ -135,29 +141,31 @@ Legend:
* b.s -> bootkube.service
* t.p -> tectonic.path
* t.s -> tectonic.service

.------------------------------------------------------------------------------------------------------------------.
| |
| Provision cloud/userdata +----------+ Provision files |
| ,----------------------------------------------o| TF |o-----------------.------------------------. |
| | +----------+ | | |
| | v v |
| | +----------+ +-----+ +-------+ |
| | .--->| (reboot) |----. | b.p | | t.p | |
| | | +----------+ | +-----+ +-------+ |
| V | | o o |
| +-------+ | v Before +------------+ Before | Trigger Trigger | |
| | IGN | | *---------->| k.s |o--------. | | |
| +-------+ o ^ +------------+ | v v |
| | +----------+ | ^ | | +-----+ Before +-------+ |
| '------>| knb.s |o--------------' | v '--->| b.s |o--------------->| t.s | |
| Enable +----------+ '------' +-----+ +-------+ |
| ^ | |
| | v |
| '----' o o |
| | | |
| * First boot | * Each boot | * First boot |
| * All nodes | * All nodes | * Bootkube master |
| | | |
'----------------------------------------------o----------------------------o--------------------------------------'
* rm.p -> rm-assets.path
* rm.s -> rm-assets.service

.---------------------------------------------------------------------------------------------------------------------------------------+
| |
| Provision cloud/userdata +----------+ Provision files |
| ,----------------------------------------------o| TF |o-----------------.------------------------.-----------------+ |
| | +----------+ | | | |
| | v v v |
| | +----------+ +-----+ +-------+ +------+ |
| | .--->| (reboot) |----. | b.p | | t.p | | rm.p | |
| | | +----------+ | +-----+ +-------+ +------+ |
| V | | o o o |
| +-------+ | v Before +------------+ Before | Trigger Trigger | Trigger | |
| | IGN | | *---------->| k.s |o--------. | | | |
| +-------+ o ^ +------------+ | v v v |
| | +----------+ | ^ | | +-----+ Before +-------+ Before +-----+ |
| '------>| knb.s |o--------------' | v '--->| b.s |o--------------->| t.s |--------> |rm.s | |
| Enable +----------+ '------' +-----+ +-------+ +-----+ |
| ^ | |
| | v |
| '----' o o |
| | | |
| * First boot | * Each boot | * First boot |
| * All nodes | * All nodes | * Bootkube master |
| | | |
'----------------------------------------------o----------------------------o-----------------------------------------------------------+
```
1 change: 1 addition & 0 deletions modules/aws/master-asg/ignition.tf
@@ -19,6 +19,7 @@ data "ignition_config" "main" {
var.ign_tectonic_service_id,
var.ign_bootkube_path_unit_id,
var.ign_tectonic_path_unit_id,
var.ign_rm_assets_path_unit_id,
))}"]
}

2 changes: 1 addition & 1 deletion modules/aws/master-asg/master.tf
@@ -153,7 +153,7 @@ resource "aws_iam_role_policy" "master_policy" {
"s3:GetObject",
"s3:HeadObject",
"s3:ListBucket",
"s3:DeleteObject"
"s3:PutObject"
],
"Resource": "arn:aws:s3:::*",
"Effect": "Allow"
45 changes: 18 additions & 27 deletions modules/aws/master-asg/resources/rm-assets.sh
@@ -2,34 +2,25 @@
set -e

s3_clean() {
# Delete Install assets from S3
# shellcheck disable=SC2086,SC2154,SC2016
/usr/bin/docker run \
--volume /tmp:/tmp \
--network=host \
--env LOCATION="${assets_s3_location}" \
--entrypoint=/bin/bash \
${awscli_image} \
-c '
REGION=$(wget -q -O - http://169.254.169.254/latest/meta-data/placement/availability-zone | sed '"'"'s/[a-zA-Z]$//'"'"')
usr/bin/aws --region=$${REGION} s3 rm s3://$${LOCATION}
'
}

# shellcheck disable=SC2086,SC2154
/usr/bin/docker run \
--volume /run/metadata:/run/metadata \
--volume /opt/detect-master.sh:/detect-master.sh:ro \
--network=host \
--env CLUSTER_NAME=${cluster_name} \
--entrypoint=/detect-master.sh \
${awscli_image}
# instead of simply removing the remote assets.zip,
# overwrite it with a zero byte file, such that terraform doesn't
# detect deletion but rather a simple change which it can ignore.
touch /tmp/assets.zip

# Don't do anything if cluster is still in startup
STARTUP=$(cat /run/metadata/master)
if [ "$STARTUP" == "true" ]; then
exit 0
fi
# shellcheck disable=SC2086,SC2154,SC2016
/usr/bin/docker run \
--volume /tmp:/tmp \
--network=host \
--env LOCATION="${assets_s3_location}" \
--entrypoint=/bin/bash \
${awscli_image} \
-c '
set -e
set -o pipefail
REGION=$(wget -q -O - http://169.254.169.254/latest/meta-data/placement/availability-zone | sed '"'"'s/[a-zA-Z]$//'"'"')
/usr/bin/aws --region="$REGION" s3 cp /tmp/assets.zip s3://"$LOCATION"
'
}

until s3_clean; do
echo "failed to clean up S3 assets. retrying in 5 seconds."
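A note on two shell idioms in this hunk, as a minimal sketch rather than the script's actual code. The script derives the region by stripping the trailing letter from the availability zone, and uploads an empty file in place of the assets. The helper names `region_from_az` and `make_placeholder` are hypothetical and exist only for illustration. One detail worth keeping in mind: `touch` creates a missing file but does not truncate an existing one, whereas `: > file` does both, so the sketch uses the latter to guarantee a zero-byte result.

```shell
#!/bin/sh
# Sketch only: region_from_az and make_placeholder are hypothetical helper
# names, not part of rm-assets.sh.

# Strip the trailing availability-zone letter to obtain the region,
# using the same sed expression as the script.
region_from_az() {
  printf '%s\n' "$1" | sed 's/[a-zA-Z]$//'
}

# Create the file if missing, or truncate it to zero bytes if present,
# so an upload replaces the object content instead of deleting the object.
make_placeholder() {
  : > "$1"
}

region_from_az "us-west-2a"   # prints us-west-2
```

If `/tmp/assets.zip` can never exist before the script runs, `touch` and `: >` behave identically here.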
4 changes: 4 additions & 0 deletions modules/aws/master-asg/variables.tf
@@ -129,3 +129,7 @@ variable "ign_init_assets_service_id" {
variable "ign_rm_assets_service_id" {
type = "string"
}

variable "ign_rm_assets_path_unit_id" {
type = "string"
}
8 changes: 7 additions & 1 deletion modules/ignition/assets.tf
@@ -74,9 +74,15 @@ data "ignition_systemd_unit" "init_assets" {
content = "${file("${path.module}/resources/services/init-assets.service")}"
}

data "ignition_systemd_unit" "rm_assets_path_unit" {
name = "rm-assets.path"
enable = true
content = "${file("${path.module}/resources/paths/rm-assets.path")}"
}

data "ignition_systemd_unit" "rm_assets" {
name = "rm-assets.service"
enable = "${var.assets_location != "" ? true : false}"
enable = false
content = "${file("${path.module}/resources/services/rm-assets.service")}"
Contributor:

Is this meant to be always disabled?

Contributor:

Yes, the idea is for it to be path-activated.

}

4 changes: 4 additions & 0 deletions modules/ignition/outputs.tf
@@ -38,6 +38,10 @@ output "rm_assets_service_id" {
value = "${data.ignition_systemd_unit.rm_assets.id}"
}

output "rm_assets_path_unit_id" {
value = "${data.ignition_systemd_unit.rm_assets_path_unit.id}"
}

output "s3_puller_id" {
value = "${data.ignition_file.s3_puller.id}"
}
7 changes: 7 additions & 0 deletions modules/ignition/resources/paths/rm-assets.path
@@ -0,0 +1,7 @@
[Unit]
Description=Trigger for rm-assets.service
[Path]
PathExists=/opt/tectonic/manifests
Unit=rm-assets.service
[Install]
WantedBy=multi-user.target
5 changes: 3 additions & 2 deletions modules/ignition/resources/services/rm-assets.service
@@ -1,7 +1,7 @@
[Unit]
Description=Clean up install assets from S3
Contributor (@enxebre, Nov 17, 2017):

Couldn't we just run rm-assets.sh as an ExecStartPost in the bootkube/tectonic service to alleviate the dependency hassle? Then each platform implements the cleanup as it wishes (basically adding or omitting the cloud storage API call).

Contributor (Author):

Some platforms don't have the notion of rm-assets.sh, which semantically only deletes remote assets.

Local assets are removed inside tectonic.sh, but here I believe a separate service makes sense.

Platforms not supporting pull semantics (openstack, vmware, baremetal) don't need this service.

Contributor (Author):

We also internally discussed the idea of running a one-shot k8s Job post-install, but this is better punted to a later refactoring (track 2), as it adds yet more manifest skew.


ConditionPathExists=/opt/tectonic/init_tectonic.done
After=tectonic.service
ConditionPathExists=!/opt/tectonic/init_rm_assets.done
After=bootkube.service tectonic.service

[Service]
Type=oneshot
@@ -13,6 +13,7 @@ Group=root

ExecStartPre=/usr/bin/bash /opt/rm-assets.sh
ExecStart=/usr/bin/echo "cleaned up installation assets"
ExecStartPost=/bin/touch /opt/tectonic/init_rm_assets.done

[Install]
WantedBy=multi-user.target
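The combination of `PathExists=` in the path unit, `ConditionPathExists=!` in the service, and the `ExecStartPost` touch amounts to a "run once after a trigger appears" pattern. The following is a rough shell analogy under stated assumptions, not how systemd implements it: `run_once` is a hypothetical helper, and the temporary paths stand in for `/opt/tectonic/manifests` and `/opt/tectonic/init_rm_assets.done`.

```shell
#!/bin/sh
# Sketch only: run_once is a hypothetical helper, and the paths below are
# temporary stand-ins for the real trigger and marker files.

# run_once TRIGGER MARKER CMD...
# Runs CMD only if TRIGGER exists and MARKER does not, then creates MARKER.
# This loosely mirrors PathExists= + ConditionPathExists=! + ExecStartPost.
run_once() {
  trigger=$1
  marker=$2
  shift 2
  [ -e "$trigger" ] || return 0    # trigger path absent: nothing to do yet
  [ ! -e "$marker" ] || return 0   # marker present: work already completed
  "$@" && touch "$marker"          # mark as done only when CMD succeeds
}

workdir=$(mktemp -d)
mkdir "$workdir/manifests"
run_once "$workdir/manifests" "$workdir/done" echo "cleaned up installation assets"
run_once "$workdir/manifests" "$workdir/done" echo "this second run is skipped"
rm -rf "$workdir"
```

Because the marker is written only after the command succeeds, a failed cleanup leaves the marker absent and the work is attempted again on the next activation.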
11 changes: 10 additions & 1 deletion modules/tectonic/resources/tectonic.sh
@@ -104,7 +104,16 @@ wait_for_pods() {

asset_cleanup() {
echo "Cleaning up installation assets"
rm -rf "$${ASSETS_PATH:?}/"*

# shellcheck disable=SC2034
for d in "manifests" "auth" "bootstrap-manifests" "net-manifests" "tectonic" "tls"; do
rm -rf "$${ASSETS_PATH:?}/$${d:?}/"*
done

# shellcheck disable=SC2034
for f in "bootkube.sh" "tectonic.sh" "tectonic-wrapper.sh"; do
rm -f "$${ASSETS_PATH:?}/$${f:?}"
done
}

# chdir into the assets path directory
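The `:?` expansions in `asset_cleanup` are what keep `rm -rf` from ever receiving a pattern such as `/*` when a variable is unset or empty. In the Terraform template the doubled `$${...}` is an escape that renders as a plain `${...}` in the generated script. The sketch below illustrates the guard in isolation; `safe_glob` is a hypothetical helper that only prints the pattern and deletes nothing.

```shell
#!/bin/sh
# Sketch only: safe_glob is a hypothetical helper. It prints the pattern
# that would be handed to rm instead of deleting anything.

safe_glob() {
  # ${1:?} and ${2:?} abort the current (sub)shell with an error when the
  # argument is unset or empty, so the pattern cannot collapse to "/*".
  printf '%s\n' "${1:?assets path must not be empty}/${2:?entry must not be empty}/*"
}

safe_glob "/opt/tectonic" "manifests"   # prints /opt/tectonic/manifests/*

# Run the failing case in a subshell so the abort does not end this script.
( safe_glob "" "manifests" ) 2>/dev/null || echo "refused: empty assets path"
```

Listing the directories and files explicitly, as the new loops do, narrows the deletion to known installer output rather than everything under the assets path.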
1 change: 1 addition & 0 deletions platforms/aws/main.tf
@@ -132,6 +132,7 @@ module "masters" {
ign_kubelet_service_id = "${module.ignition_masters.kubelet_service_id}"
ign_locksmithd_service_id = "${module.ignition_masters.locksmithd_service_id}"
ign_max_user_watches_id = "${module.ignition_masters.max_user_watches_id}"
ign_rm_assets_path_unit_id = "${module.ignition_masters.rm_assets_path_unit_id}"
ign_rm_assets_service_id = "${module.ignition_masters.rm_assets_service_id}"
ign_s3_puller_id = "${module.ignition_masters.s3_puller_id}"
ign_tectonic_path_unit_id = "${var.tectonic_vanilla_k8s ? "" : module.tectonic.systemd_path_unit_id}"
8 changes: 8 additions & 0 deletions platforms/aws/s3.tf
@@ -15,6 +15,10 @@ resource "aws_s3_bucket" "tectonic" {
"KubernetesCluster", "${var.tectonic_cluster_name}",
"tectonicClusterID", "${module.tectonic.cluster_id}"
), var.tectonic_aws_extra_tags)}"

lifecycle {
ignore_changes = ["*"]
Contributor:

Do we need to ignore lifecycle changes on the bucket too, even though only the one file is changing?

Contributor (Author):

When I experimented locally and removed one file from the bucket, this also affected the bucket itself, hence I had to add this here.


}
}

# Bootkube / Tectonic assets
@@ -34,6 +38,10 @@ resource "aws_s3_bucket_object" "tectonic_assets" {
"KubernetesCluster", "${var.tectonic_cluster_name}",
"tectonicClusterID", "${module.tectonic.cluster_id}"
), var.tectonic_aws_extra_tags)}"

lifecycle {
ignore_changes = ["*"]
}
}

# kubeconfig