Commit af59eb0

Author: Grzegorz Lisowski (committed)

- Worker locals/defaults moved to the workers submodule
- Create separate defaults for node groups
- Workers IAM management left outside the module, as both node_groups and worker_groups use it
- Add an option to migrate to the worker group module

1 parent e6d76d0 · commit af59eb0

32 files changed: +1269 additions, −384 deletions
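The motivation for map-managed groups: a map keyed by group name can drive `for_each`, so each worker group keeps a stable resource address and one group can be added or removed without touching the others. The submodule's internals are not part of this page, so the following is only a minimal sketch of the technique; every resource name and default below is illustrative.

```hcl
# Minimal sketch of for_each-keyed worker groups (illustrative, not the
# module's actual code). Addresses look like
# aws_autoscaling_group.workers["spot-1"], so removing one key destroys only
# that group; a count-indexed list would shift indices and churn later groups.
variable "worker_groups" {
  type    = any
  default = {}
}

resource "aws_launch_template" "workers" {
  for_each = var.worker_groups

  name_prefix   = "${each.key}-"
  instance_type = lookup(each.value, "instance_type", "m4.large")
}

resource "aws_autoscaling_group" "workers" {
  for_each = var.worker_groups

  name_prefix         = "${each.key}-"
  min_size            = lookup(each.value, "asg_min_size", 1)
  max_size            = lookup(each.value, "asg_max_size", 3)
  vpc_zone_identifier = ["subnet-abcde012"] # placeholder subnet

  launch_template {
    id      = aws_launch_template.workers[each.key].id
    version = "$Latest"
  }
}
```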

README.md

Lines changed: 11 additions & 8 deletions

````diff
@@ -57,12 +57,12 @@ module "my-cluster" {
   subnets = ["subnet-abcde012", "subnet-bcde012a", "subnet-fghi345a"]
   vpc_id  = "vpc-1234556abcdef"

-  worker_groups = [
-    {
+  worker_groups = {
+    group = {
       instance_type = "m4.large"
       asg_max_size  = 5
     }
-  ]
+  }
 }
 ```

 ## Conditional creation
@@ -161,8 +161,9 @@ Apache 2 Licensed. See [LICENSE](https://github.com/terraform-aws-modules/terraf

 | Name | Source | Version |
 |------|--------|---------|
-| <a name="module_fargate"></a> [fargate](#module\_fargate) | ./modules/fargate |  |
-| <a name="module_node_groups"></a> [node\_groups](#module\_node\_groups) | ./modules/node_groups |  |
+| <a name="module_fargate"></a> [fargate](#module\_fargate) | ./modules/fargate | n/a |
+| <a name="module_node_groups"></a> [node\_groups](#module\_node\_groups) | ./modules/node_groups | n/a |
+| <a name="module_worker_groups"></a> [worker\_groups](#module\_worker\_groups) | ./modules/worker_groups | n/a |

 ## Resources

@@ -266,7 +267,7 @@ Apache 2 Licensed. See [LICENSE](https://github.com/terraform-aws-modules/terraf
 | <a name="input_subnets"></a> [subnets](#input\_subnets) | A list of subnets to place the EKS cluster and workers within. | `list(string)` | n/a | yes |
 | <a name="input_tags"></a> [tags](#input\_tags) | A map of tags to add to all resources. Tags added to launch configuration or templates override these values for ASG Tags only. | `map(string)` | `{}` | no |
 | <a name="input_vpc_id"></a> [vpc\_id](#input\_vpc\_id) | VPC where the cluster and workers will be deployed. | `string` | n/a | yes |
-| <a name="input_wait_for_cluster_timeout"></a> [wait\_for\_cluster\_timeout](#wait\_for\_cluster\_timeout) | Allows for a configurable timeout (in seconds) when waiting for a cluster to come up | `number` | `300` | no |
+| <a name="input_wait_for_cluster_timeout"></a> [wait\_for\_cluster\_timeout](#input\_wait\_for\_cluster\_timeout) | A timeout (in seconds) to wait for cluster to be available. | `number` | `300` | no |
 | <a name="input_worker_additional_security_group_ids"></a> [worker\_additional\_security\_group\_ids](#input\_worker\_additional\_security\_group\_ids) | A list of additional security group ids to attach to worker instances | `list(string)` | `[]` | no |
 | <a name="input_worker_ami_name_filter"></a> [worker\_ami\_name\_filter](#input\_worker\_ami\_name\_filter) | Name filter for AWS EKS worker AMI. If not provided, the latest official AMI for the specified 'cluster\_version' is used. | `string` | `""` | no |
 | <a name="input_worker_ami_name_filter_windows"></a> [worker\_ami\_name\_filter\_windows](#input\_worker\_ami\_name\_filter\_windows) | Name filter for AWS EKS Windows worker AMI. If not provided, the latest official AMI for the specified 'cluster\_version' is used. | `string` | `""` | no |
@@ -275,8 +276,9 @@ Apache 2 Licensed. See [LICENSE](https://github.com/terraform-aws-modules/terraf
 | <a name="input_worker_create_cluster_primary_security_group_rules"></a> [worker\_create\_cluster\_primary\_security\_group\_rules](#input\_worker\_create\_cluster\_primary\_security\_group\_rules) | Whether to create security group rules to allow communication between pods on workers and pods using the primary cluster security group. | `bool` | `false` | no |
 | <a name="input_worker_create_initial_lifecycle_hooks"></a> [worker\_create\_initial\_lifecycle\_hooks](#input\_worker\_create\_initial\_lifecycle\_hooks) | Whether to create initial lifecycle hooks provided in worker groups. | `bool` | `false` | no |
 | <a name="input_worker_create_security_group"></a> [worker\_create\_security\_group](#input\_worker\_create\_security\_group) | Whether to create a security group for the workers or attach the workers to `worker_security_group_id`. | `bool` | `true` | no |
-| <a name="input_worker_groups"></a> [worker\_groups](#input\_worker\_groups) | A list of maps defining worker group configurations to be defined using AWS Launch Configurations. See workers\_group\_defaults for valid keys. | `any` | `[]` | no |
-| <a name="input_worker_groups_launch_template"></a> [worker\_groups\_launch\_template](#input\_worker\_groups\_launch\_template) | A list of maps defining worker group configurations to be defined using AWS Launch Templates. See workers\_group\_defaults for valid keys. | `any` | `[]` | no |
+| <a name="input_worker_groups"></a> [worker\_groups](#input\_worker\_groups) | A map of maps defining worker group configurations to be defined using AWS Launch Templates. See workers\_group\_defaults for valid keys. | `any` | `{}` | no |
+| <a name="input_worker_groups_launch_template_legacy"></a> [worker\_groups\_launch\_template\_legacy](#input\_worker\_groups\_launch\_template\_legacy) | A list of maps defining worker group configurations to be defined using AWS Launch Templates. See workers\_group\_defaults for valid keys. | `any` | `[]` | no |
+| <a name="input_worker_groups_legacy"></a> [worker\_groups\_legacy](#input\_worker\_groups\_legacy) | A list of maps defining worker group configurations to be defined using AWS Launch Configurations. See workers\_group\_defaults for valid keys. | `any` | `[]` | no |
 | <a name="input_worker_security_group_id"></a> [worker\_security\_group\_id](#input\_worker\_security\_group\_id) | If provided, all workers will be attached to this security group. If not given, a security group will be created with necessary ingress/egress to work with the EKS cluster. | `string` | `""` | no |
 | <a name="input_worker_sg_ingress_from_port"></a> [worker\_sg\_ingress\_from\_port](#input\_worker\_sg\_ingress\_from\_port) | Minimum port number from which pods will accept communication. Must be changed to a lower value if some pods in your cluster will expose a port lower than 1025 (e.g. 22, 80, or 443). | `number` | `1025` | no |
 | <a name="input_workers_additional_policies"></a> [workers\_additional\_policies](#input\_workers\_additional\_policies) | Additional policies to be added to workers | `list(string)` | `[]` | no |
@@ -311,6 +313,7 @@ Apache 2 Licensed. See [LICENSE](https://github.com/terraform-aws-modules/terraf
 | <a name="output_node_groups"></a> [node\_groups](#output\_node\_groups) | Outputs from EKS node groups. Map of maps, keyed by var.node\_groups keys |
 | <a name="output_oidc_provider_arn"></a> [oidc\_provider\_arn](#output\_oidc\_provider\_arn) | The ARN of the OIDC Provider if `enable_irsa = true`. |
 | <a name="output_security_group_rule_cluster_https_worker_ingress"></a> [security\_group\_rule\_cluster\_https\_worker\_ingress](#output\_security\_group\_rule\_cluster\_https\_worker\_ingress) | Security group rule responsible for allowing pods to communicate with the EKS cluster API. |
+| <a name="output_worker_groups"></a> [worker\_groups](#output\_worker\_groups) | Outputs from EKS worker groups. Map of maps, keyed by var.worker\_groups keys |
 | <a name="output_worker_iam_instance_profile_arns"></a> [worker\_iam\_instance\_profile\_arns](#output\_worker\_iam\_instance\_profile\_arns) | default IAM instance profile ARN for EKS worker groups |
 | <a name="output_worker_iam_instance_profile_names"></a> [worker\_iam\_instance\_profile\_names](#output\_worker\_iam\_instance\_profile\_names) | default IAM instance profile name for EKS worker groups |
 | <a name="output_worker_iam_role_arn"></a> [worker\_iam\_role\_arn](#output\_worker\_iam\_role\_arn) | default IAM role ARN for EKS worker groups |
````
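Per the new output row above, worker group outputs are keyed by the `var.worker_groups` map keys, so a caller can address one group directly. A hedged usage sketch; the per-group attribute names are not shown in this diff, so treat everything beyond the map keying as hypothetical:

```hcl
# Assumes a module instance named "my-cluster" as in the README example above.
# Only the "map of maps, keyed by var.worker_groups keys" shape is documented;
# the contents of each entry are not part of this diff.
output "default_group" {
  value = module.my-cluster.worker_groups["group"]
}
```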

aws_auth.tf

Lines changed: 9 additions & 4 deletions

```diff
@@ -1,6 +1,9 @@
 locals {
+  ## DEPRECATED section, to be removed once users have finished migrating to
+  ## worker nodes managed via maps. When updating, make the matching update in modules/worker_groups.
+
   auth_launch_template_worker_roles = [
-    for index in range(0, var.create_eks ? local.worker_group_launch_template_count : 0) : {
+    for index in range(0, var.create_eks ? local.worker_group_launch_template_legacy_count : 0) : {
       worker_role_arn = "arn:${data.aws_partition.current.partition}:iam::${data.aws_caller_identity.current.account_id}:role/${element(
         coalescelist(
           aws_iam_instance_profile.workers_launch_template.*.role,
@@ -10,15 +13,15 @@ locals {
         index
       )}"
       platform = lookup(
-        var.worker_groups_launch_template[index],
+        var.worker_groups_launch_template_legacy[index],
         "platform",
         local.workers_group_defaults["platform"]
       )
     }
   ]

   auth_worker_roles = [
-    for index in range(0, var.create_eks ? local.worker_group_count : 0) : {
+    for index in range(0, var.create_eks ? local.worker_group_legacy_count : 0) : {
       worker_role_arn = "arn:${data.aws_partition.current.partition}:iam::${data.aws_caller_identity.current.account_id}:role/${element(
         coalescelist(
           aws_iam_instance_profile.workers.*.role,
@@ -28,18 +31,20 @@ locals {
         index,
       )}"
       platform = lookup(
-        var.worker_groups[index],
+        var.worker_groups_legacy[index],
         "platform",
         local.workers_group_defaults["platform"]
       )
     }
   ]
+  ## ~DEPRECATED

   # Convert to format needed by aws-auth ConfigMap
   configmap_roles = [
     for role in concat(
       local.auth_launch_template_worker_roles,
       local.auth_worker_roles,
+      module.worker_groups.aws_auth_roles,
       module.node_groups.aws_auth_roles,
       module.fargate.aws_auth_roles,
     ) :
```
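Since the submodule output is concatenated with the legacy locals above, each of its elements must share their shape, i.e. an object carrying `worker_role_arn` and `platform`. A sketch of one such entry; the values are placeholders, not taken from this commit:

```hcl
locals {
  # Illustrative only: the shape each submodule's aws_auth_roles output must
  # have so concat() can merge it with the legacy lists above. The account ID
  # and role name are placeholders.
  example_aws_auth_roles = [
    {
      worker_role_arn = "arn:aws:iam::123456789012:role/my-cluster-worker"
      platform        = "linux"
    },
  ]
}
```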

cluster.tf

Lines changed: 1 addition & 0 deletions

```diff
@@ -57,6 +57,7 @@ resource "aws_security_group" "cluster" {
   name_prefix = var.cluster_name
   description = "EKS cluster security group."
   vpc_id      = var.vpc_id
+
   tags = merge(
     var.tags,
     {
```

data.tf

Lines changed: 5 additions & 5 deletions

```diff
@@ -64,23 +64,23 @@ data "aws_iam_policy_document" "cluster_assume_role_policy" {
 }

 data "aws_iam_role" "custom_cluster_iam_role" {
-  count = var.manage_cluster_iam_resources ? 0 : 1
+  count = var.create_eks && !var.manage_cluster_iam_resources ? 1 : 0
   name  = var.cluster_iam_role_name
 }

 data "aws_iam_instance_profile" "custom_worker_group_iam_instance_profile" {
-  count = var.manage_worker_iam_resources ? 0 : local.worker_group_count
+  count = var.create_eks && !var.manage_worker_iam_resources ? local.worker_group_legacy_count : 0
   name = lookup(
-    var.worker_groups[count.index],
+    var.worker_groups_legacy[count.index],
     "iam_instance_profile_name",
     local.workers_group_defaults["iam_instance_profile_name"],
   )
 }

 data "aws_iam_instance_profile" "custom_worker_group_launch_template_iam_instance_profile" {
-  count = var.manage_worker_iam_resources ? 0 : local.worker_group_launch_template_count
+  count = var.create_eks && !var.manage_worker_iam_resources ? local.worker_group_launch_template_legacy_count : 0
   name = lookup(
-    var.worker_groups_launch_template[count.index],
+    var.worker_groups_launch_template_legacy[count.index],
     "iam_instance_profile_name",
     local.workers_group_defaults["iam_instance_profile_name"],
   )
```

docs/faq.md

Lines changed: 10 additions & 19 deletions

````diff
@@ -2,7 +2,7 @@

 ## How do I customize X on the worker group's settings?

-All the options that can be customized for worker groups are listed in [local.tf](https://github.com/terraform-aws-modules/terraform-aws-eks/blob/master/local.tf) under `workers_group_defaults_defaults`.
+All the options that can be customized for worker groups are listed in [local.tf](https://github.com/terraform-aws-modules/terraform-aws-eks/blob/master/modules/worker_groups/local.tf) under `workers_group_defaults_defaults`.

 Please open Issues or PRs if you think something is missing.

@@ -61,12 +61,6 @@ You need to add the tags to the VPC and subnets yourself. See the [basic example

 An alternative is to use the aws provider's [`ignore_tags` variable](https://www.terraform.io/docs/providers/aws/#ignore\_tags-configuration-block). However this can also cause terraform to display a perpetual difference.

-## How do I safely remove old worker groups?
-
-You've added new worker groups. Deleting worker groups from earlier in the list causes Terraform to want to recreate all worker groups. This is a limitation with how Terraform works and the module using `count` to create the ASGs and other resources.
-
-The safest and easiest option is to set `asg_min_size` and `asg_max_size` to 0 on the worker groups to "remove".
-
 ## Why does changing the worker group's desired count not do anything?

 The module is configured to ignore this value. Unfortunately Terraform does not support variables within the `lifecycle` block.
@@ -77,9 +71,9 @@ You can change the desired count via the CLI or console if you're not using the

 If you are not using autoscaling and really want to control the number of nodes via terraform then set the `asg_min_size` and `asg_max_size` instead. AWS will remove a random instance when you scale down. You will have to weigh the risks here.

-## Why are nodes not recreated when the `launch_configuration`/`launch_template` is recreated?
+## Why are nodes not recreated when the `launch_configuration` is recreated?

-By default the ASG is not configured to be recreated when the launch configuration or template changes. Terraform spins up new instances and then deletes all the old instances in one go as the AWS provider team have refused to implement rolling updates of autoscaling groups. This is not good for kubernetes stability.
+By default the ASG is not configured to be recreated when the launch configuration changes. Terraform spins up new instances and then deletes all the old instances in one go as the AWS provider team have refused to implement rolling updates of autoscaling groups. This is not good for kubernetes stability.

 You need to use a process to drain and cycle the workers.

@@ -137,35 +131,32 @@ Amazon EKS clusters must contain one or more Linux worker nodes to run core syst
 1. Build AWS EKS cluster with the next workers configuration (default Linux):

 ```
-worker_groups = [
-  {
-    name = "worker-group-linux"
+worker_groups = {
+  worker-group-linux = {
     instance_type        = "m5.large"
     platform             = "linux"
     asg_desired_capacity = 2
   },
-]
+}
 ```

 2. Apply commands from https://docs.aws.amazon.com/eks/latest/userguide/windows-support.html#enable-windows-support (use tab with name `Windows`)

 3. Add one more worker group for Windows with required field `platform = "windows"` and update your cluster. Worker group example:

 ```
-worker_groups = [
-  {
-    name = "worker-group-linux"
+worker_groups = {
+  worker-group-linux = {
     instance_type        = "m5.large"
     platform             = "linux"
     asg_desired_capacity = 2
   },
-  {
-    name = "worker-group-windows"
+  worker-group-windows = {
     instance_type        = "m5.large"
     platform             = "windows"
     asg_desired_capacity = 1
   },
-]
+}
 ```

 4. With `kubectl get nodes` you can see cluster with mixed (Linux/Windows) nodes support.
````
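The desired-count FAQ above comes down to a static `lifecycle` block on the ASG. A minimal sketch of that pattern, assuming the module's real resource carries many more arguments than shown here (resource names and values are illustrative):

```hcl
resource "aws_launch_template" "workers" {
  name_prefix   = "example-"
  instance_type = "m4.large"
}

resource "aws_autoscaling_group" "workers" {
  name_prefix         = "example-"
  min_size            = 1
  max_size            = 3
  desired_capacity    = 2
  vpc_zone_identifier = ["subnet-abcde012"] # placeholder subnet

  launch_template {
    id      = aws_launch_template.workers.id
    version = "$Latest"
  }

  lifecycle {
    # ignore_changes cannot reference variables, so desired_capacity drift
    # (e.g. scaling done by the cluster autoscaler) is ignored unconditionally.
    create_before_destroy = true
    ignore_changes        = [desired_capacity]
  }
}
```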

docs/spot-instances.md

Lines changed: 5 additions & 43 deletions

````diff
@@ -22,65 +22,27 @@ Notes:
 - There is an AWS blog article about this [here](https://aws.amazon.com/blogs/compute/run-your-kubernetes-workloads-on-amazon-ec2-spot-instances-with-amazon-eks/).
 - Consider using [k8s-spot-rescheduler](https://github.com/pusher/k8s-spot-rescheduler) to move pods from on-demand to spot instances.

-## Using Launch Configuration
-
-Example worker group configuration that uses an ASG with launch configuration for each worker group:
-
-```hcl
-worker_groups = [
-  {
-    name                = "on-demand-1"
-    instance_type       = "m4.xlarge"
-    asg_max_size        = 1
-    kubelet_extra_args  = "--node-labels=node.kubernetes.io/lifecycle=normal"
-    suspended_processes = ["AZRebalance"]
-  },
-  {
-    name                = "spot-1"
-    spot_price          = "0.199"
-    instance_type       = "c4.xlarge"
-    asg_max_size        = 20
-    kubelet_extra_args  = "--node-labels=node.kubernetes.io/lifecycle=spot"
-    suspended_processes = ["AZRebalance"]
-  },
-  {
-    name                = "spot-2"
-    spot_price          = "0.20"
-    instance_type       = "m4.xlarge"
-    asg_max_size        = 20
-    kubelet_extra_args  = "--node-labels=node.kubernetes.io/lifecycle=spot"
-    suspended_processes = ["AZRebalance"]
-  }
-]
-```
-
 ## Using Launch Templates

 Launch Template support is a recent addition to both AWS and this module. It might not be as tried and tested but it's more suitable for spot instances as it allowed multiple instance types in the same worker group:

 ```hcl
-worker_groups = [
-  {
-    name          = "on-demand-1"
+worker_groups = {
+  on-demand-1 = {
     instance_type       = "m4.xlarge"
     asg_max_size        = 10
     kubelet_extra_args  = "--node-labels=spot=false"
     suspended_processes = ["AZRebalance"]
-  }
-]
-
-
-worker_groups_launch_template = [
-  {
-    name = "spot-1"
+  },
+  spot-1 = {
     override_instance_types = ["m5.large", "m5a.large", "m5d.large", "m5ad.large"]
     spot_instance_pools     = 4
     asg_max_size            = 5
     asg_desired_capacity    = 5
     kubelet_extra_args      = "--node-labels=node.kubernetes.io/lifecycle=spot"
     public_ip               = true
   },
-]
+}
 ```

 ## Using Launch Templates With Both Spot and On Demand
````
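For context on the launch template example above: keys like `override_instance_types` and `spot_instance_pools` ultimately drive the ASG's mixed instances policy. A hedged sketch of roughly what gets rendered underneath, assuming the standard AWS provider schema; resource names and values are illustrative, not the module's actual code:

```hcl
resource "aws_launch_template" "workers" {
  name_prefix   = "spot-1-"
  instance_type = "m5.large" # base type; overridden per pool below
}

resource "aws_autoscaling_group" "spot" {
  name_prefix         = "spot-1-"
  min_size            = 0
  max_size            = 5
  desired_capacity    = 5
  vpc_zone_identifier = ["subnet-abcde012"] # placeholder subnet

  mixed_instances_policy {
    instances_distribution {
      # 100% spot, spread across the 4 cheapest pools,
      # mirroring spot_instance_pools = 4 above.
      on_demand_percentage_above_base_capacity = 0
      spot_allocation_strategy                 = "lowest-price"
      spot_instance_pools                      = 4
    }

    launch_template {
      launch_template_specification {
        launch_template_id = aws_launch_template.workers.id
        version            = "$Latest"
      }

      # Mirrors override_instance_types from the example above.
      override { instance_type = "m5.large" }
      override { instance_type = "m5a.large" }
      override { instance_type = "m5d.large" }
      override { instance_type = "m5ad.large" }
    }
  }
}
```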

docs/upgrades.md

Lines changed: 67 additions & 0 deletions

````diff
@@ -58,3 +58,70 @@ Plan: 0 to add, 0 to change, 1 to destroy.
 5. If everything sounds good to you, run `terraform apply`

 After the first apply, we recommend you create a new node group and let the module use the `node_group_name_prefix` (by removing the `name` argument) to generate names and avoid collisions during node group re-creation if needed, because the lifecycle is `create_before_destroy = true`.
+
+## Upgrade module to vXX.X.X for Worker Groups Managed as maps
+
+In this release, we added the ability to manage Worker Groups as maps (not lists), which improves the ability to add and remove worker groups.
+
+> NOTE: The new functionality supports only creating groups using Launch Templates!
+
+1. Run `terraform apply` with the previous module version. Make sure all changes are applied before proceeding.
+
+2. Upgrade your module and configure your worker groups by renaming the existing variables as follows:
+
+```
+worker_groups = [...]                 => worker_groups_legacy = [...]
+
+worker_groups_launch_template = [...] => worker_groups_launch_template_legacy = [...]
+```
+
+Example:
+
+FROM:
+
+```hcl
+worker_groups_launch_template = [
+  {
+    name                 = "worker-group-1"
+    instance_type        = "t3.small"
+    asg_desired_capacity = 2
+    public_ip            = true
+  },
+]
+```
+
+TO:
+
+```hcl
+worker_groups_launch_template_legacy = [
+  {
+    name                 = "worker-group-1"
+    instance_type        = "t3.small"
+    asg_desired_capacity = 2
+    public_ip            = true
+  },
+]
+```
+
+3. Run `terraform plan`. No infrastructure changes are expected.
+
+4. From now on you can define worker groups the new way and migrate your workloads there. Eventually the legacy groups can be deleted.
+
+Example:
+
+```hcl
+worker_groups_launch_template_legacy = [
+  {
+    name                 = "worker-group-1"
+    instance_type        = "t3.small"
+    asg_desired_capacity = 2
+  },
+]
+
+worker_groups = {
+  worker-group-1 = {
+    instance_type        = "t3.small"
+    asg_desired_capacity = 2
+  },
+}
+```
````
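As a hedged illustration of step 4's end state (not part of the upstream docs): once workloads have moved off the legacy group, the legacy list can be emptied so only the map-managed group remains.

```hcl
# Hypothetical final configuration after migration: the legacy list is empty
# and the equivalent group lives only in the worker_groups map.
worker_groups_launch_template_legacy = []

worker_groups = {
  worker-group-1 = {
    instance_type        = "t3.small"
    asg_desired_capacity = 2
  },
}
```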
