This repository was archived by the owner on Feb 5, 2020. It is now read-only.
Merged
158 changes: 158 additions & 0 deletions Documentation/dev/node-bootstrap-flow.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,158 @@
# Node bootstrapping flow

This is a development document which describes the bootstrapping flow for ContainerLinux nodes provisioned by the tectonic-installer as part of a Tectonic cluster.

## Overview

When a cluster node is being bootstrapped from scratch, it goes through several phases in the following order:

1. first-boot OS configuration, via ignition (systemd units, node configuration, etc.)
1. provisioning of additional assets (k8s manifests, TLS material), via either of:
   * pushing from terraform file/remote-exec (SSH)
   * pulling from private cloud stores (S3 buckets)
1. system-wide updates via `k8s-node-bootstrap.service`, which includes:
   * determining the current kubernetes cluster version (when joining an existing cluster)
   * triggering a ContainerLinux update, via update-engine (optional)
   * downloading and deploying the proper docker addon version, via tectonic-torcx
   * writing the `kubelet.env` file
1. if needed, a node reboot is triggered to apply system-wide changes
1. `kubelet.service` picks up the `kubelet.env` file and starts the kubelet as a rkt-fly service.

Additionally, the following kubernetes bootstrapping happens on exactly one of the master nodes:

1. `bootkube.service` is started after `kubelet.service` start
1. a static bootstrapping control-plane is deployed
1. a fully self-hosted control-plane starts and takes over the previous one
1. `bootkube.service` is completed with success
1. `tectonic.service` is started
1. a self-hosted tectonic control-plane is deployed
1. `tectonic.service` is completed with success

## Systemd units

The following systemd units are deployed to a node by tectonic-installer and take part in the bootstrapping process:

* `k8s-node-bootstrap.service` ensures node and assets freshness. It is automatically started on boot, can crash-loop, and it runs only during bootstrap
* `kubelet.service` is the main kubelet daemon. It is automatically started on boot, it crash-loops until `kubelet.env` is populated, and it runs on each boot

Additionally, the following units take part in kubernetes bootstrapping on exactly one of the master nodes:

* `bootkube.service` deploys the initial bootstrapping control-plane. It is started only after `kubelet.service` _is started_. It is a oneshot unit and cannot crash, and it runs only during bootstrap
* `bootkube.path` waits for bootkube assets/scripts to exist on disk and triggers `bootkube.service`
* `tectonic.service` deploys tectonic control-plane. It is started only after `bootkube.service` _has completed_. It is a oneshot unit and cannot crash, and it runs only during bootstrap
* `tectonic.path` waits for tectonic assets/scripts to exist on disk and triggers `tectonic.service`

## Service ordering

Service ordering is enforced via systemd dependencies. This is the rationale for the settings, with relevant snippets:

### `k8s-node-bootstrap.service`

```
ConditionPathExists=!/etc/kubernetes/kubelet.env
Before=kubelet.service
Restart=on-failure
ExecStartPre=[...]
ExecStart=/usr/bin/echo "node components bootstrapped"
WantedBy=multi-user.target kubelet.service
```

This service is enabled by default and can crash-loop until success.
The main logic happens in `ExecStartPre`, before the unit is marked as started, in order to block further services (a synchronous reboot can happen here).

In particular, this blocks kubelet from starting by:
* a `WantedBy=` and `Before=`
* writing the actual `kubelet.env` file on success.

It is skipped on further boots, because `kubelet.env` already exists and the negated path condition is no longer met.
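
The snippet above lists the directives without their section headers. As a rough sketch, assuming the usual systemd section placement, they would be laid out as follows in a unit file. The `ExecStartPre=` steps are elided in the original and are left out here rather than guessed:

```
[Unit]
# Skip the whole unit once kubelet.env has been written (that is, on later boots).
ConditionPathExists=!/etc/kubernetes/kubelet.env
# Order this unit ahead of the kubelet.
Before=kubelet.service

[Service]
# Retry until every bootstrap step succeeds.
Restart=on-failure
# ExecStartPre= steps are elided in the original snippet and omitted here.
ExecStart=/usr/bin/echo "node components bootstrapped"

[Install]
WantedBy=multi-user.target kubelet.service
```

Note that `Before=` only affects ordering. It is the `WantedBy=` entry that pulls this unit in whenever `kubelet.service` is started.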

### `kubelet.service`

```
EnvironmentFile=/etc/kubernetes/kubelet.env
ExecStart=/usr/lib/coreos/kubelet-wrapper [...]
Restart=always
WantedBy=multi-user.target
```

This service is enabled by default and can crash-loop until success.
On first boot, it is initially blocked by `k8s-node-bootstrap.service`.
It crash-loops until the `kubelet.env` file exists.
It is started on every boot.
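
For orientation, `kubelet.env` is a plain `EnvironmentFile=`-style list of `KEY=value` lines. A minimal sketch might look like the following. The two variable names are assumed here to be the ones read by the Container Linux `kubelet-wrapper` script, and both values are placeholders that are not taken from this document:

```
# /etc/kubernetes/kubelet.env (illustrative values only)
# Image repository the wrapper pulls the kubelet from (assumed default).
KUBELET_IMAGE_URL=quay.io/coreos/hyperkube
# Placeholder: the real tag is determined by k8s-node-bootstrap.service.
KUBELET_IMAGE_TAG=<kubernetes-version-tag>
```

Until a file of this shape exists at the `EnvironmentFile=` path, systemd fails to start the unit and `Restart=always` keeps retrying it, which matches the crash-loop described above.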

### `bootkube.path` and `bootkube.service`

```
ConditionPathExists=!/opt/tectonic/init_bootkube.done
Wants=kubelet.service
After=kubelet.service
Type=oneshot
RemainAfterExit=true
ExecStart=/usr/bin/bash /opt/tectonic/bootkube.sh
ExecStartPost=/bin/touch /opt/tectonic/init_bootkube.done
```

Bootkube service unit is not enabled by default. It is instead triggered by a path unit, which waits for assets written synchronously by terraform.

This service waits for kubelet to be *started* via systemd dependency.
It is a oneshot service, thus marked as started only once the script returns successfully.
It is skipped on further boots, as the condition-path exists.
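
The path unit itself is not shown above. As a hedged sketch of what such a trigger could look like, the following `bootkube.path` watches for the script that `bootkube.service` executes. The watched path and the description are assumptions made for illustration, not the installer's actual unit:

```
[Unit]
Description=Trigger bootkube.service once its assets are on disk

[Path]
# Hypothetical watch target: the script run by bootkube.service.
PathExists=/opt/tectonic/bootkube.sh
# Unit= defaults to the service with the same name; spelled out for clarity.
Unit=bootkube.service

[Install]
WantedBy=multi-user.target
```

With this arrangement only the path unit needs to be enabled. The service stays inactive until terraform has finished writing the file, and a `tectonic.path` unit could follow the same pattern.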

### `tectonic.path` and `tectonic.service`

```
ConditionPathExists=!/opt/tectonic/init_tectonic.done
Requires=bootkube.service
After=bootkube.service
Type=oneshot
RemainAfterExit=true
ExecStart=/usr/bin/bash /opt/tectonic/tectonic-rkt.sh
ExecStartPost=/bin/touch /opt/tectonic/init_tectonic.done
```

Tectonic service unit is not enabled by default. It is instead triggered by a path unit, which waits for assets written synchronously by terraform.

This service waits for bootkube process to be *completed* via systemd dependency.
It is a oneshot service, thus marked as started only once the script returns successfully.
It is skipped on further boots, as the condition-path exists.

## Diagram

This is a simplified visual representation of the overall bootstrapping flow.

```bob
Legend:
* TF -> terraform provisioner
* IGN -> ignition
* knb.s -> k8s-node-bootstrap.service
* k.s -> kubelet.service
* b.p -> bootkube.path
* b.s -> bootkube.service
* t.p -> tectonic.path
* t.s -> tectonic.service

.------------------------------------------------------------------------------------------------------------------.
| |
| Provision cloud/userdata +----------+ Provision files |
| ,----------------------------------------------o| TF |o-----------------.------------------------. |
| | +----------+ | | |
| | v v |
| | +----------+ +-----+ +-------+ |
| | .--->| (reboot) |----. | b.p | | t.p | |
| | | +----------+ | +-----+ +-------+ |
| V | | o o |
| +-------+ | v Before +------------+ Before | Trigger Trigger | |
| | IGN | | *---------->| k.s |o--------. | | |
| +-------+ o ^ +------------+ | v v |
| | +----------+ | ^ | | +-----+ Before +-------+ |
| '------>| knb.s |o--------------' | v '--->| b.s |o--------------->| t.s | |
| Enable +----------+ '------' +-----+ +-------+ |
| ^ | |
| | v |
| '----' o o |
| | | |
| * First boot | * Each boot | * First boot |
| * All nodes | * All nodes | * Bootkube master |
| | | |
'----------------------------------------------o----------------------------o--------------------------------------'
```
1 change: 1 addition & 0 deletions Documentation/variables/config.md
@@ -9,6 +9,7 @@ This document gives an overview of variables used in all platforms of the Tecton
| tectonic_admin_email | The e-mail address used to: 1. login as the admin user to the Tectonic Console. 2. generate DNS zones for some providers.<br><br>Note: This field MUST be in all lower-case e-mail address format and set manually prior to creating the cluster. | string | - |
| tectonic_admin_password_hash | The bcrypt hash of admin user password to login to the Tectonic Console. Use the bcrypt-hash tool (https://github.com/coreos/bcrypt-tool/releases/tag/v1.0.0) to generate it.<br><br>Note: This field MUST be set manually prior to creating the cluster. | string | - |
| tectonic_base_domain | The base DNS domain of the cluster. It must NOT contain a trailing period. Some DNS providers will automatically add this if necessary.<br><br>Example: `openstack.dev.coreos.systems`.<br><br>Note: This field MUST be set manually prior to creating the cluster. This applies only to cloud platforms.<br><br>[Azure-specific NOTE] To use Azure-provided DNS, `tectonic_base_domain` should be set to `""` If using DNS records, ensure that `tectonic_base_domain` is set to a properly configured external DNS zone. Instructions for configuring delegated domains for Azure DNS can be found here: https://docs.microsoft.com/en-us/azure/dns/dns-delegate-domain-azure-dns | string | - |
| tectonic_bootstrap_upgrade_cl | (internal) Whether to trigger a ContainerLinux upgrade on node bootstrap. | string | `true` |
| tectonic_ca_cert | (optional) The content of the PEM-encoded CA certificate, used to generate Tectonic Console's server certificate. If left blank, a CA certificate will be automatically generated. | string | `` |
| tectonic_ca_key | (optional) The content of the PEM-encoded CA key, used to generate Tectonic Console's server certificate. This field is mandatory if `tectonic_ca_cert` is set. | string | `` |
| tectonic_ca_key_alg | (optional) The algorithm used to generate tectonic_ca_key. The default value is currently recommended. This field is mandatory if `tectonic_ca_cert` is set. | string | `RSA` |
7 changes: 7 additions & 0 deletions config.tf
@@ -82,6 +82,7 @@ variable "tectonic_container_images" {
tectonic_etcd_operator = "quay.io/coreos/tectonic-etcd-operator:v0.0.2"
tectonic_prometheus_operator = "quay.io/coreos/tectonic-prometheus-operator:v1.6.0"
tectonic_cluo_operator = "quay.io/coreos/tectonic-cluo-operator:v0.2.1"
tectonic_torcx = "quay.io/coreos/tectonic-torcx:installer-latest"
}
}

@@ -445,3 +446,9 @@ WARNING: Enabling an alpha feature means that future updates may become unsuppor
This should only be enabled on clusters that are meant to be short-lived to begin validating the alpha feature.
EOF
}

variable "tectonic_bootstrap_upgrade_cl" {
type = "string"
default = "true"
description = "(internal) Whether to trigger a ContainerLinux upgrade on node bootstrap."
}
7 changes: 4 additions & 3 deletions modules/aws/master-asg/ignition.tf
@@ -1,16 +1,17 @@
data "ignition_config" "main" {
files = [
"${data.ignition_file.detect_master.id}",
"${data.ignition_file.init_assets.id}",
"${var.ign_installer_kubelet_env_id}",
"${var.ign_max_user_watches_id}",
"${var.ign_s3_puller_id}",
"${data.ignition_file.init_assets.id}",
"${data.ignition_file.detect_master.id}",
]

systemd = ["${compact(list(
var.ign_docker_dropin_id,
var.ign_locksmithd_service_id,
var.ign_kubelet_service_id,
var.ign_s3_kubelet_env_service_id,
var.ign_k8s_node_bootstrap_service_id,
data.ignition_systemd_unit.init_assets.id,
var.ign_bootkube_service_id,
var.ign_tectonic_service_id,
4 changes: 2 additions & 2 deletions modules/aws/master-asg/resources/services/init-assets.service
@@ -1,7 +1,7 @@
[Unit]
Description=Download Tectonic Assets
ConditionPathExists=!/opt/init_assets.done
Before=bootkube.service kubelet-env.service
Before=bootkube.service k8s-node-bootstrap.service

[Service]
Type=oneshot
@@ -16,4 +16,4 @@ ExecStartPost=/bin/touch /opt/init_assets.done

[Install]
WantedBy=multi-user.target
RequiredBy=bootkube.service kubelet-env.service
RequiredBy=bootkube.service k8s-node-bootstrap.service
4 changes: 0 additions & 4 deletions modules/aws/master-asg/variables.tf
@@ -62,10 +62,6 @@ variable "extra_tags" {
default = {}
}

variable "ign_s3_kubelet_env_service_id" {
type = "string"
}

variable "ign_s3_puller_id" {
type = "string"
}
5 changes: 3 additions & 2 deletions modules/aws/worker-asg/ignition.tf
@@ -1,13 +1,14 @@
data "ignition_config" "main" {
files = [
"${var.ign_installer_kubelet_env_id}",
"${var.ign_max_user_watches_id}",
"${var.ign_s3_puller_id}",
]

systemd = [
"${var.ign_docker_dropin_id}",
"${var.ign_locksmithd_service_id}",
"${var.ign_k8s_node_bootstrap_service_id}",
"${var.ign_kubelet_service_id}",
"${var.ign_s3_kubelet_env_service_id}",
"${var.ign_locksmithd_service_id}",
]
}
4 changes: 0 additions & 4 deletions modules/aws/worker-asg/variables.tf
@@ -78,7 +78,3 @@ variable "worker_iam_role" {
variable "ign_s3_puller_id" {
type = "string"
}

variable "ign_s3_kubelet_env_service_id" {
type = "string"
}
3 changes: 2 additions & 1 deletion modules/azure/master-as/ignition-master.tf
@@ -1,7 +1,7 @@
data "ignition_config" "master" {
files = [
"${data.ignition_file.kubeconfig.id}",
"${var.ign_kubelet_env_id}",
"${var.ign_installer_kubelet_env_id}",
"${var.ign_azure_udev_rules_id}",
"${var.ign_max_user_watches_id}",
"${data.ignition_file.cloud_provider_config.id}",
@@ -10,6 +10,7 @@ data "ignition_config" "master" {
systemd = ["${compact(list(
var.ign_docker_dropin_id,
var.ign_locksmithd_service_id,
var.ign_k8s_node_bootstrap_service_id,
var.ign_kubelet_service_id,
var.ign_tx_off_service_id,
var.ign_bootkube_service_id,
4 changes: 0 additions & 4 deletions modules/azure/master-as/variables.tf
@@ -23,10 +23,6 @@ variable "ign_azure_udev_rules_id" {
type = "string"
}

variable "ign_kubelet_env_id" {
type = "string"
}

variable "ign_tx_off_service_id" {
type = "string"
}
3 changes: 2 additions & 1 deletion modules/azure/worker-as/ignition-worker.tf
@@ -1,7 +1,7 @@
data "ignition_config" "worker" {
files = [
"${data.ignition_file.kubeconfig.id}",
"${var.ign_kubelet_env_id}",
"${var.ign_installer_kubelet_env_id}",
"${var.ign_azure_udev_rules_id}",
"${var.ign_max_user_watches_id}",
"${data.ignition_file.cloud-provider-config.id}",
@@ -10,6 +10,7 @@ data "ignition_config" "worker" {
systemd = [
"${var.ign_docker_dropin_id}",
"${var.ign_locksmithd_service_id}",
"${var.ign_k8s_node_bootstrap_service_id}",
"${var.ign_kubelet_service_id}",
"${var.ign_tx_off_service_id}",
]
4 changes: 0 additions & 4 deletions modules/azure/worker-as/variables.tf
@@ -29,10 +29,6 @@ variable "ign_azure_udev_rules_id" {
type = "string"
}

variable "ign_kubelet_env_id" {
type = "string"
}

variable "ign_tx_off_service_id" {
type = "string"
}
28 changes: 15 additions & 13 deletions modules/ignition/assets.tf
@@ -48,21 +48,23 @@ data "ignition_systemd_unit" "kubelet" {
content = "${data.template_file.kubelet.rendered}"
}

data "template_file" "kubelet_env_service" {
template = "${file("${path.module}/resources/services/kubelet-env.service")}"
data "template_file" "k8s_node_bootstrap" {
template = "${file("${path.module}/resources/services/k8s-node-bootstrap.service")}"

vars {
kube_version_image_url = "${replace(var.container_images["kube_version"],var.image_re,"$1")}"
kube_version_image_tag = "${replace(var.container_images["kube_version"],var.image_re,"$2")}"
kubelet_image_url = "${replace(var.container_images["hyperkube"],var.image_re,"$1")}"
kubeconfig_fetch_cmd = "${var.kubeconfig_fetch_cmd != "" ? "ExecStartPre=${var.kubeconfig_fetch_cmd}" : ""}"
bootstrap_upgrade_cl = "${var.bootstrap_upgrade_cl}"
kubeconfig_fetch_cmd = "${var.kubeconfig_fetch_cmd != "" ? "ExecStartPre=${var.kubeconfig_fetch_cmd}" : ""}"
tectonic_torcx_image_url = "${replace(var.container_images["tectonic_torcx"],var.image_re,"$1")}"
tectonic_torcx_image_tag = "${replace(var.container_images["tectonic_torcx"],var.image_re,"$2")}"
torcx_skip_setup = "${var.tectonic_vanilla_k8s ? "true" : "false" }"
torcx_store_url = "${var.torcx_store_url}"
}
}

data "ignition_systemd_unit" "kubelet_env" {
name = "kubelet-env.service"
data "ignition_systemd_unit" "k8s_node_bootstrap" {
name = "k8s-node-bootstrap.service"
enable = true
content = "${data.template_file.kubelet_env_service.rendered}"
content = "${data.template_file.k8s_node_bootstrap.rendered}"
}

data "template_file" "s3_puller" {
@@ -88,7 +90,7 @@ data "ignition_systemd_unit" "locksmithd" {
mask = true
}

data "template_file" "kubelet_env" {
data "template_file" "installer_kubelet_env" {
template = "${file("${path.module}/resources/kubernetes/kubelet.env")}"

vars {
@@ -97,13 +99,13 @@ data "template_file" "kubelet_env" {
}
}

data "ignition_file" "kubelet_env" {
data "ignition_file" "installer_kubelet_env" {
filesystem = "root"
path = "/etc/kubernetes/kubelet.env"
path = "/etc/kubernetes/installer/kubelet.env"
mode = 0644

content {
content = "${data.template_file.kubelet_env.rendered}"
content = "${data.template_file.installer_kubelet_env.rendered}"
}
}

8 changes: 8 additions & 0 deletions modules/ignition/outputs.import
@@ -15,3 +15,11 @@ variable "ign_kubelet_service_id" {
variable "ign_locksmithd_service_id" {
type = "string"
}

variable "ign_installer_kubelet_env_id" {
type = "string"
}

variable "ign_k8s_node_bootstrap_service_id" {
type = "string"
}