Commit 082088d: demo ready emr-eks-karpenter

dalbhanj committed Mar 13, 2023 (1 parent: cdee7a4)

Showing 51 changed files with 41,742 additions and 0 deletions.
75 changes: 75 additions & 0 deletions analytics/terraform/emr-eks-karpenter2/README.md
# Scaling EMR on EKS Spark Jobs with Karpenter Autoscaler
Checkout the [documentation website](https://awslabs.github.io/data-on-eks/docs/amazon-emr-on-eks/emr-eks-karpenter) to deploy this pattern and run sample tests.

<!-- BEGINNING OF PRE-COMMIT-TERRAFORM DOCS HOOK -->
## Requirements

| Name | Version |
|------|---------|
| <a name="requirement_terraform"></a> [terraform](#requirement\_terraform) | >= 1.0.0 |
| <a name="requirement_aws"></a> [aws](#requirement\_aws) | >= 3.72 |
| <a name="requirement_helm"></a> [helm](#requirement\_helm) | >= 2.4.1 |
| <a name="requirement_kubectl"></a> [kubectl](#requirement\_kubectl) | >= 1.14 |
| <a name="requirement_kubernetes"></a> [kubernetes](#requirement\_kubernetes) | >= 2.10 |
| <a name="requirement_null"></a> [null](#requirement\_null) | >= 3.0 |
| <a name="requirement_time"></a> [time](#requirement\_time) | >= 0.7 |

## Providers

| Name | Version |
|------|---------|
| <a name="provider_aws"></a> [aws](#provider\_aws) | >= 3.72 |
| <a name="provider_aws.ecr"></a> [aws.ecr](#provider\_aws.ecr) | >= 3.72 |
| <a name="provider_kubectl"></a> [kubectl](#provider\_kubectl) | >= 1.14 |

## Modules

| Name | Source | Version |
|------|--------|---------|
| <a name="module_eks"></a> [eks](#module\_eks) | terraform-aws-modules/eks/aws | ~> 19.9 |
| <a name="module_eks_blueprints_kubernetes_addons"></a> [eks\_blueprints\_kubernetes\_addons](#module\_eks\_blueprints\_kubernetes\_addons) | github.com/aws-ia/terraform-aws-eks-blueprints//modules/kubernetes-addons | v4.25.0 |
| <a name="module_emr_containers"></a> [emr\_containers](#module\_emr\_containers) | ./modules/emr-eks-containers | n/a |
| <a name="module_karpenter"></a> [karpenter](#module\_karpenter) | terraform-aws-modules/eks/aws//modules/karpenter | ~> 19.9 |
| <a name="module_vpc"></a> [vpc](#module\_vpc) | terraform-aws-modules/vpc/aws | ~> 3.0 |
| <a name="module_vpc_endpoints"></a> [vpc\_endpoints](#module\_vpc\_endpoints) | terraform-aws-modules/vpc/aws//modules/vpc-endpoints | ~> 3.0 |
| <a name="module_vpc_endpoints_sg"></a> [vpc\_endpoints\_sg](#module\_vpc\_endpoints\_sg) | terraform-aws-modules/security-group/aws | ~> 4.0 |

## Resources

| Name | Type |
|------|------|
| [aws_prometheus_workspace.amp](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/prometheus_workspace) | resource |
| [kubectl_manifest.karpenter_provisioner](https://registry.terraform.io/providers/gavinbunney/kubectl/latest/docs/resources/manifest) | resource |
| [aws_ami.eks](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/ami) | data source |
| [aws_availability_zones.available](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/availability_zones) | data source |
| [aws_caller_identity.current](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/caller_identity) | data source |
| [aws_ecrpublic_authorization_token.token](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/ecrpublic_authorization_token) | data source |
| [aws_eks_cluster_auth.this](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/eks_cluster_auth) | data source |
| [kubectl_path_documents.karpenter_provisioners](https://registry.terraform.io/providers/gavinbunney/kubectl/latest/docs/data-sources/path_documents) | data source |

## Inputs

| Name | Description | Type | Default | Required |
|------|-------------|------|---------|:--------:|
| <a name="input_eks_cluster_version"></a> [eks\_cluster\_version](#input\_eks\_cluster\_version) | EKS Cluster version | `string` | `"1.24"` | no |
| <a name="input_enable_yunikorn"></a> [enable\_yunikorn](#input\_enable\_yunikorn) | Enable Apache YuniKorn Scheduler | `bool` | `false` | no |
| <a name="input_name"></a> [name](#input\_name) | Name of the VPC and EKS Cluster | `string` | `"emr-eks-karpenter"` | no |
| <a name="input_private_subnets"></a> [private\_subnets](#input\_private\_subnets) | Private Subnets CIDRs. 32766 Subnet1 and 16382 Subnet2 IPs per Subnet | `list(string)` | <pre>[<br> "10.1.0.0/17",<br> "10.1.128.0/18"<br>]</pre> | no |
| <a name="input_public_subnets"></a> [public\_subnets](#input\_public\_subnets) | Public Subnets CIDRs. 62 IPs per Subnet | `list(string)` | <pre>[<br> "10.1.255.128/26",<br> "10.1.255.192/26"<br>]</pre> | no |
| <a name="input_region"></a> [region](#input\_region) | AWS region to deploy into | `string` | `"us-west-2"` | no |
| <a name="input_tags"></a> [tags](#input\_tags) | Default tags | `map(string)` | `{}` | no |
| <a name="input_vpc_cidr"></a> [vpc\_cidr](#input\_vpc\_cidr) | VPC CIDR | `string` | `"10.1.0.0/16"` | no |
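The per-subnet IP counts quoted in the descriptions above use the conventional n−2 host formula (total addresses minus network and broadcast; note AWS itself reserves five addresses per subnet). A quick sanity check with Python's `ipaddress` module:

```python
import ipaddress

def usable_hosts(cidr: str) -> int:
    """Conventional usable-host count: total addresses minus network and broadcast."""
    return ipaddress.ip_network(cidr).num_addresses - 2

# Defaults from the inputs table above
for cidr in ["10.1.0.0/17", "10.1.128.0/18", "10.1.255.128/26"]:
    print(cidr, usable_hosts(cidr))  # 32766, 16382, and 62 respectively
```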

## Outputs

| Name | Description |
|------|-------------|
| <a name="output_aws_auth_configmap_yaml"></a> [aws\_auth\_configmap\_yaml](#output\_aws\_auth\_configmap\_yaml) | Formatted yaml output for base aws-auth configmap containing roles used in cluster node groups/fargate profiles |
| <a name="output_cluster_arn"></a> [cluster\_arn](#output\_cluster\_arn) | The Amazon Resource Name (ARN) of the cluster |
| <a name="output_cluster_name"></a> [cluster\_name](#output\_cluster\_name) | The name of the EKS cluster |
| <a name="output_configure_kubectl"></a> [configure\_kubectl](#output\_configure\_kubectl) | Configure kubectl: make sure you're logged in with the correct AWS profile and run the following command to update your kubeconfig |
| <a name="output_eks_managed_node_groups"></a> [eks\_managed\_node\_groups](#output\_eks\_managed\_node\_groups) | Map of attribute maps for all EKS managed node groups created |
| <a name="output_eks_managed_node_groups_iam_role_name"></a> [eks\_managed\_node\_groups\_iam\_role\_name](#output\_eks\_managed\_node\_groups\_iam\_role\_name) | List of IAM role names created by EKS managed node groups |
| <a name="output_emr_on_eks"></a> [emr\_on\_eks](#output\_emr\_on\_eks) | EMR on EKS |
| <a name="output_oidc_provider_arn"></a> [oidc\_provider\_arn](#output\_oidc\_provider\_arn) | The ARN of the OIDC Provider if `enable_irsa = true` |
<!-- END OF PRE-COMMIT-TERRAFORM DOCS HOOK -->
188 changes: 188 additions & 0 deletions analytics/terraform/emr-eks-karpenter2/addons.tf

module "eks_blueprints_kubernetes_addons" {
source = "github.com/aws-ia/terraform-aws-eks-blueprints//modules/kubernetes-addons?ref=v4.25.0"

# Wait on the node group(s) before provisioning addons
data_plane_wait_arn = join(",", [for group in module.eks.eks_managed_node_groups : group.node_group_arn])

eks_cluster_id = module.eks.cluster_name
eks_cluster_endpoint = module.eks.cluster_endpoint
eks_oidc_provider = module.eks.oidc_provider
eks_oidc_provider_arn = module.eks.oidc_provider_arn
eks_cluster_version = module.eks.cluster_version

#---------------------------------------
# Amazon EKS Managed Add-ons
#---------------------------------------
enable_amazon_eks_vpc_cni = true
enable_amazon_eks_coredns = true
enable_amazon_eks_kube_proxy = true
enable_amazon_eks_aws_ebs_csi_driver = true

#---------------------------------------
# Kubernetes Add-ons
#---------------------------------------
#---------------------------------------
# Metrics Server
#---------------------------------------
enable_metrics_server = true
metrics_server_helm_config = {
name = "metrics-server"
repository = "https://kubernetes-sigs.github.io/metrics-server/" # (Optional) Repository URL where to locate the requested chart.
chart = "metrics-server"
version = "3.8.2"
namespace = "kube-system"
timeout = "300"
values = [templatefile("${path.module}/helm-values/metrics-server-values.yaml", {})]
}

#---------------------------------------
# Cluster Autoscaler
#---------------------------------------
enable_cluster_autoscaler = true
cluster_autoscaler_helm_config = {
name = "cluster-autoscaler"
repository = "https://kubernetes.github.io/autoscaler" # (Optional) Repository URL where to locate the requested chart.
chart = "cluster-autoscaler"
version = "9.21.0"
namespace = "kube-system"
timeout = "300"
values = [templatefile("${path.module}/helm-values/cluster-autoscaler-values.yaml", {
aws_region = var.region,
eks_cluster_id = local.name
})]
}

#---------------------------------------
# Karpenter Autoscaler for EKS Cluster
#---------------------------------------
enable_karpenter = true
karpenter_enable_spot_termination_handling = true
karpenter_node_iam_instance_profile = module.karpenter.instance_profile_name

karpenter_helm_config = {
name = "karpenter"
chart = "karpenter"
repository = "oci://public.ecr.aws/karpenter"
version = "v0.25.0"
namespace = "karpenter"
repository_username = data.aws_ecrpublic_authorization_token.token.user_name
repository_password = data.aws_ecrpublic_authorization_token.token.password
}

#---------------------------------------
# CloudWatch metrics for EKS
#---------------------------------------
enable_aws_cloudwatch_metrics = true
aws_cloudwatch_metrics_helm_config = {
name = "aws-cloudwatch-metrics"
chart = "aws-cloudwatch-metrics"
repository = "https://aws.github.io/eks-charts"
version = "0.0.7"
namespace = "amazon-cloudwatch"
values = [templatefile("${path.module}/helm-values/aws-cloudwatch-metrics-values.yaml", {
eks_cluster_id = var.name
})]
}

#---------------------------------------
# AWS for FluentBit - DaemonSet
#---------------------------------------
enable_aws_for_fluentbit = true
aws_for_fluentbit_helm_config = {
name = "aws-for-fluent-bit"
chart = "aws-for-fluent-bit"
repository = "https://aws.github.io/eks-charts"
version = "0.1.21"
namespace = "aws-for-fluent-bit"
aws_for_fluent_bit_cw_log_group = "/${var.name}/fluentbit-logs" # Optional
aws_for_fluentbit_cwlog_retention_in_days = 90
values = [templatefile("${path.module}/helm-values/aws-for-fluentbit-values.yaml", {
region = var.region,
aws_for_fluent_bit_cw_log = "/${var.name}/fluentbit-logs"
})]
}

#---------------------------------------
# Kubecost
#---------------------------------------
enable_kubecost = true
kubecost_helm_config = {
name = "kubecost"
repository = "oci://public.ecr.aws/kubecost"
chart = "cost-analyzer"
version = "1.97.0"
namespace = "kubecost"
repository_username = data.aws_ecrpublic_authorization_token.token.user_name
repository_password = data.aws_ecrpublic_authorization_token.token.password
timeout = "300"
values = [templatefile("${path.module}/helm-values/kubecost-values.yaml", {})]
}

#---------------------------------------------------------------
# Apache YuniKorn Add-on
#---------------------------------------------------------------
enable_yunikorn = var.enable_yunikorn
yunikorn_helm_config = {
name = "yunikorn"
repository = "https://apache.github.io/yunikorn-release"
chart = "yunikorn"
version = "1.1.0"
timeout = "300"
values = [templatefile("${path.module}/helm-values/yunikorn-values.yaml", {
image_version = "1.1.0"
})]
}

#---------------------------------------
# Amazon Managed Prometheus
#---------------------------------------
enable_amazon_prometheus = true
amazon_prometheus_workspace_endpoint = aws_prometheus_workspace.amp.prometheus_endpoint

#---------------------------------------
# Prometheus Server Add-on
#---------------------------------------
enable_prometheus = true
prometheus_helm_config = {
name = "prometheus"
repository = "https://prometheus-community.github.io/helm-charts"
chart = "prometheus"
version = "15.10.1"
namespace = "prometheus"
timeout = "300"
values = [templatefile("${path.module}/helm-values/prometheus-values.yaml", {})]
}

tags = local.tags

} # End of EKS Blueprints Add-on module


#---------------------------------------
# Karpenter Provisioners
#---------------------------------------
data "kubectl_path_documents" "karpenter_provisioners" {
pattern = "${path.module}/provisioners/spark-*.yaml"
vars = {
azs = local.region # holds the region string; AZ names are derived from it in the provisioner templates
eks_cluster_id = module.eks.cluster_name
}
}

resource "kubectl_manifest" "karpenter_provisioner" {
for_each = toset(data.kubectl_path_documents.karpenter_provisioners.documents)
yaml_body = each.value

depends_on = [module.eks_blueprints_kubernetes_addons]
}
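The `spark-*.yaml` provisioner files themselves are not part of this excerpt. A minimal sketch of what one might look like for Karpenter v0.25.0 (the `v1alpha5` API); the name, selectors, and requirements below are illustrative, using the `azs` and `eks_cluster_id` template variables passed in above:

```yaml
apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: spark-compute-optimized   # hypothetical name
spec:
  requirements:
    - key: "topology.kubernetes.io/zone"
      operator: In
      values: ["${azs}b"]          # azs carries the region string; a suffix picks the AZ
    - key: "karpenter.sh/capacity-type"
      operator: In
      values: ["spot", "on-demand"]
  provider:
    subnetSelector:
      Name: "${eks_cluster_id}-private*"
    securityGroupSelector:
      karpenter.sh/discovery: ${eks_cluster_id}
  ttlSecondsAfterEmpty: 120
```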

#---------------------------------------------------------------
# Amazon Prometheus Workspace
#---------------------------------------------------------------
resource "aws_prometheus_workspace" "amp" {
alias = format("%s-%s", "amp-ws", local.name)

tags = local.tags
}
31 changes: 31 additions & 0 deletions analytics/terraform/emr-eks-karpenter2/cleanup.sh
#!/bin/bash
set -o errexit
set -o pipefail

read -p "Enter the region: " region
export AWS_DEFAULT_REGION=$region

targets=(
"module.emr_containers"
"module.eks_blueprints_kubernetes_addons"
"module.eks"
)

for target in "${targets[@]}"
do
  # With errexit enabled, a failed `terraform destroy` inside a plain assignment
  # would abort the script before the status check ran. Capturing the exit status
  # inside the `if` condition keeps the FAILED branch reachable.
  if destroy_output=$(terraform destroy -target="$target" -auto-approve 2>&1) && [[ $destroy_output == *"Destroy complete!"* ]]; then
    echo "SUCCESS: Terraform destroy of $target completed successfully"
  else
    echo "FAILED: Terraform destroy of $target failed"
    exit 1
  fi
done

if destroy_output=$(terraform destroy -auto-approve 2>&1) && [[ $destroy_output == *"Destroy complete!"* ]]; then
  echo "SUCCESS: Terraform destroy of all targets completed successfully"
else
  echo "FAILED: Terraform destroy of all targets failed"
  exit 1
fi
22 changes: 22 additions & 0 deletions analytics/terraform/emr-eks-karpenter2/data.tf
data "aws_eks_cluster_auth" "this" {
name = module.eks.cluster_name
}

data "aws_ecrpublic_authorization_token" "token" {
provider = aws.ecr
}

data "aws_availability_zones" "available" {}

data "aws_caller_identity" "current" {}

# This data source can be used to get the latest AMI for Managed Node Groups
data "aws_ami" "eks" {
owners = ["amazon"]
most_recent = true

filter {
name = "name"
values = ["amazon-eks-node-${module.eks.cluster_version}-*"]
}
}
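The `name` filter above is a shell-style glob evaluated server-side by the EC2 DescribeImages API. As a rough local illustration of how such a pattern selects AMI names (the AMI names below are hypothetical, and `fnmatch` only approximates the API's glob semantics):

```python
from fnmatch import fnmatch

# Hypothetical AMI names; the real candidates come from EC2 DescribeImages.
ami_names = [
    "amazon-eks-node-1.24-v20230217",
    "amazon-eks-node-1.23-v20230217",
    "amazon-eks-gpu-node-1.24-v20230217",
]

# What the filter expands to when module.eks.cluster_version is "1.24"
pattern = "amazon-eks-node-1.24-*"
matches = [n for n in ami_names if fnmatch(n, pattern)]
print(matches)  # only the non-GPU 1.24 AMI matches
```

With `most_recent = true`, Terraform then picks the newest of the matching images.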
39 changes: 39 additions & 0 deletions analytics/terraform/emr-eks-karpenter2/emr-eks.tf
module "emr_containers" {
source = "./modules/emr-eks-containers"

eks_cluster_id = module.eks.cluster_name
eks_oidc_provider_arn = module.eks.oidc_provider_arn

emr_on_eks_config = {
# Example of all settings
emr-data-team-a = {
name = format("%s-%s", module.eks.cluster_name, "emr-data-team-a")

create_namespace = true
namespace = "emr-data-team-a"

execution_role_name = format("%s-%s", module.eks.cluster_name, "emr-eks-data-team-a")
execution_iam_role_description = "EMR Execution Role for emr-data-team-a"
execution_iam_role_additional_policies = ["arn:aws:iam::aws:policy/AmazonS3FullAccess"] # Attach additional policies for execution IAM Role

tags = {
Name = "emr-data-team-a"
}
},

emr-data-team-b = {
name = format("%s-%s", module.eks.cluster_name, "emr-data-team-b")

create_namespace = true
namespace = "emr-data-team-b"

execution_role_name = format("%s-%s", module.eks.cluster_name, "emr-eks-data-team-b")
execution_iam_role_description = "EMR Execution Role for emr-data-team-b"
execution_iam_role_additional_policies = ["arn:aws:iam::aws:policy/AmazonS3FullAccess"] # Attach additional policies for execution IAM Role

tags = {
Name = "emr-data-team-b"
}
}
}
}
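Once the virtual clusters and execution roles above exist, Spark jobs are typically submitted through the EMR on EKS `StartJobRun` API. A sketch of assembling the request parameters; all identifiers here are placeholders, and the resulting dict is what you would pass to boto3's `emr-containers` client as `start_job_run(**req)`:

```python
def build_start_job_run_request(virtual_cluster_id: str,
                                execution_role_arn: str,
                                entry_point: str) -> dict:
    """Assemble StartJobRun parameters for a Spark job on EMR on EKS."""
    return {
        "name": "sample-spark-job",
        "virtualClusterId": virtual_cluster_id,
        "executionRoleArn": execution_role_arn,
        "releaseLabel": "emr-6.9.0-latest",  # assumed EMR release label
        "jobDriver": {
            "sparkSubmitJobDriver": {
                "entryPoint": entry_point,
                "sparkSubmitParameters": "--conf spark.executor.instances=2",
            }
        },
    }

req = build_start_job_run_request(
    "abc123",  # placeholder virtual cluster id
    "arn:aws:iam::111122223333:role/emr-eks-data-team-a",  # placeholder role ARN
    "s3://my-bucket/scripts/job.py",  # placeholder script location
)
print(req["name"])
```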