Kubespot (AWS)

AWS EKS Setup for PCI-DSS, SOC2, HIPAA

Kubespot is AWS EKS customized to add security postures around SOC2, HIPAA, and PCI compliance. It is distributed as an open source Terraform module, so you can run it within your own AWS account without lock-in. Kubespot has been developed over half a decade, evolving with the AWS EKS distribution and, before that, kops. It is in use at multiple startups that have scaled from a couple of founders in an apartment to billion-dollar unicorns. By using Kubespot they were able to meet the technical requirements for compliance while still deploying software fast.

Kubespot is a light wrapper around AWS EKS. The primary changes included in Kubespot are:

  • Locked down with security groups, private subnets and other compliance-related requirements.
  • Locked down RDS and ElastiCache if needed.
  • All requests go through a single load balancer to reduce costs.
  • KEDA is used for scaling on event metrics such as queue sizes, user requests, CPU, memory or anything else KEDA supports.
  • Karpenter is used for node autoscaling.
  • Instances are locked down with encryption, and a regular node cycle rate is set.
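
The RDS and ElastiCache items above are switched on through module inputs. The following is a minimal sketch that uses only input names listed in the Inputs section of this README. The module source address, the environment name, the database identifiers, and the two credential variables are assumptions made for illustration, and the exact behavior of each input has not been verified against the module code.

```hcl
# Hypothetical variables: supply real values through your own secrets workflow.
variable "sql_master_username" {
  type      = string
  sensitive = true
}

variable "sql_master_password" {
  type      = string
  sensitive = true
}

module "kubespot" {
  # Assumed source address; replace with wherever you obtain the module.
  source = "github.com/opszero/terraform-aws-kubespot"

  environment_name = "example" # hypothetical environment name

  # Single RDS PostgreSQL instance (disabled by default).
  sql_instance_enabled = true
  sql_instance_engine  = "postgres"
  sql_instance_class   = "db.t4g.micro"
  sql_identifier       = "example-db" # hypothetical identifier
  sql_database_name    = "app"        # hypothetical database name
  sql_master_username  = var.sql_master_username
  sql_master_password  = var.sql_master_password
  sql_encrypted        = true # already the default; shown for clarity

  # ElastiCache Redis (disabled by default).
  redis_enabled   = true
  redis_node_type = "cache.t4g.micro"
  redis_num_nodes = 1
}
```

Keeping the credentials in sensitive variables rather than literals avoids committing them to version control.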

Tools & Setup

brew install kubectl kubernetes-helm awscli terraform

Cluster Usage

If your infrastructure uses the opsZero infrastructure-as-code template, you can access the cluster as follows:

Add your IAM credentials in ~/.aws/credentials.

[profile_name]
aws_access_key_id=<access_key>
aws_secret_access_key=<secret_key>
region=us-west-2

Then generate a kubeconfig for the environment and point kubectl at it:

cd environments/<nameofenv>
make kubeconfig
export KUBECONFIG=./kubeconfig # add to a .zshrc
kubectl get pods

Autoscaler

Kubespot uses Karpenter as the default autoscaler. To configure it, save a manifest like the one below as karpenter.yml and apply it:

kubectl apply -f karpenter.yml

apiVersion: karpenter.sh/v1beta1
kind: NodePool
metadata:
  name: default
spec:
  template:
    spec:
      requirements:
        - key: "karpenter.k8s.aws/instance-category"
          operator: In
          values: ["t", "c", "m"]
        - key: "kubernetes.io/arch"
          operator: In
          values: ["amd64"]
        - key: "karpenter.k8s.aws/instance-cpu"
          operator: In
          values: ["1", "2", "4", "8", "16"]
        - key: "karpenter.k8s.aws/instance-hypervisor"
          operator: In
          values: ["nitro"]
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot", "on-demand"]
      nodeClassRef:
        name: default
  disruption:
    consolidationPolicy: WhenUnderutilized
    expireAfter: 2h # node lifetime before it is replaced; e.g. 720h (30 * 24h) for 30 days
---
apiVersion: karpenter.k8s.aws/v1beta1
kind: EC2NodeClass
metadata:
  name: default
spec:
  amiFamily: Bottlerocket # or AL2 for Amazon Linux 2
  role: "Karpenter-opszero" # replace opszero with the name of your cluster
  subnetSelectorTerms:
    - tags:
        Name: opszero-public
  securityGroupSelectorTerms:
    - tags:
        Name: eks-cluster-sg-opszero-1249901478
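
The manifest above assumes Karpenter is already installed in the cluster. According to the Inputs section, the module installs it only when karpenter_enabled is set, and that input defaults to false. A minimal sketch of the module side follows; the source address is an assumption, and the environment name is chosen only to match the opszero names used in the manifest.

```hcl
module "kubespot" {
  # Assumed source address; replace with wherever you obtain the module.
  source = "github.com/opszero/terraform-aws-kubespot"

  # Hypothetical value, chosen to line up with "Karpenter-opszero" above.
  environment_name = "opszero"

  karpenter_enabled    = true           # default is false
  karpenter_version    = "1.0.1"        # default chart version per the Inputs table
  karpenter_ami_family = "Bottlerocket" # or "AL2"
}
```

Keeping karpenter_ami_family in step with the amiFamily value in the EC2NodeClass avoids mixing node images within one cluster.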

Cluster Setup

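Create the EC2 Spot service-linked role once per AWS account so that Spot instances can be launched:

aws iam create-service-linked-role --aws-service-name spot.amazonaws.com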

CIS Kubernetes Benchmark

Note: PodSecurityPolicy (PSP) is deprecated (and was removed in Kubernetes 1.25); the Pod Security admission controller is its replacement. The CIS Benchmark still refers to PSP, so we have mapped the PSP controls to their equivalents under the new standard.

| Control | Recommendation | Level | Status | Description |
|---|---|---|---|---|
| 1 | Control Plane Components | | | |
| 2 | Control Plane Configuration | | | |
| 2.1 | Logging | | | |
| 2.1.1 | Enable audit logs | L1 | Active | cluster_logging is configured |
| 3 | Worker Nodes | | | |
| 3.1 | Worker Node Configuration Files | | | |
| 3.1.1 | Ensure that the kubeconfig file permissions are set to 644 or more restrictive | L1 | Won't Fix | Use NodeGroups or Fargate |
| 3.1.2 | Ensure that the kubelet kubeconfig file ownership is set to root:root | L1 | Won't Fix | Use NodeGroups or Fargate |
| 3.1.3 | Ensure that the kubelet configuration file has permissions set to 644 or more restrictive | L1 | Won't Fix | Use NodeGroups or Fargate |
| 3.1.4 | Ensure that the kubelet configuration file ownership is set to root:root | L1 | Won't Fix | Use NodeGroups or Fargate |
| 3.2 | Kubelet | | | |
| 3.2.1 | Ensure that the Anonymous Auth is Not Enabled | L1 | Won't Fix | Use NodeGroups or Fargate |
| 3.2.2 | Ensure that the --authorization-mode argument is not set to AlwaysAllow | L1 | Won't Fix | Use NodeGroups or Fargate |
| 3.2.3 | Ensure that a Client CA File is Configured | L1 | Won't Fix | Use NodeGroups or Fargate |
| 3.2.4 | Ensure that the --read-only-port is disabled | L1 | Won't Fix | Use NodeGroups or Fargate |
| 3.2.5 | Ensure that the --streaming-connection-idle-timeout argument is not set to 0 | L1 | Won't Fix | Use NodeGroups or Fargate |
| 3.2.6 | Ensure that the --protect-kernel-defaults argument is set to true | L1 | Won't Fix | Use NodeGroups or Fargate |
| 3.2.7 | Ensure that the --make-iptables-util-chains argument is set to true | L1 | Won't Fix | Use NodeGroups or Fargate |
| 3.2.8 | Ensure that the --hostname-override argument is not set | L1 | Won't Fix | Use NodeGroups or Fargate |
| 3.2.9 | Ensure that the --eventRecordQPS argument is set to 0 or a level which ensures appropriate event capture | L2 | Won't Fix | Use NodeGroups or Fargate |
| 3.2.10 | Ensure that the --rotate-certificates argument is not present or is set to true | L1 | Won't Fix | Use NodeGroups or Fargate |
| 3.2.11 | Ensure that the RotateKubeletServerCertificate argument is set to true | L1 | Won't Fix | Use NodeGroups or Fargate |
| 3.3 | Container Optimized OS | | | |
| 3.3.1 | Prefer using a container-optimized OS when possible | L2 | Active | Bottlerocket ContainerOS is used. |
| 4 | Policies | | | |
| 4.1 | RBAC and Service Accounts | | | |
| 4.1.1 | Ensure that the cluster-admin role is only used where required | L1 | Active | Default Configuration |
| 4.1.2 | Minimize access to secrets | L1 | Active | iam_roles pass limited RBAC |
| 4.1.3 | Minimize wildcard use in Roles and ClusterRoles | L1 | Manual | terraform-kubernetes-rbac Set role |
| 4.1.4 | Minimize access to create pods | L1 | Manual | terraform-kubernetes-rbac Limit role with pod create |
| 4.1.5 | Ensure that default service accounts are not actively used | L1 | Manual | kubectl patch serviceaccount default -p $'automountServiceAccountToken: false' |
| 4.1.6 | Ensure that Service Account Tokens are only mounted where necessary | L1 | Active | tiphys Default set to false |
| 4.1.7 | Avoid use of system:masters group | L1 | Active | Must manually add users and roles to system:masters |
| 4.1.8 | Limit use of the Bind, Impersonate and Escalate permissions in the Kubernetes cluster | L1 | Manual | Limit users with system:masters role |
| 4.2 | Pod Security Policies | | | |
| 4.2.1 | Minimize the admission of privileged containers | L1 | Active | tiphys defaultSecurityContext.allowPrivilegeEscalation=false |
| 4.2.2 | Minimize the admission of containers wishing to share the host process ID namespace | L1 | Active | tiphys hostPID defaults to false |
| 4.2.3 | Minimize the admission of containers wishing to share the host IPC namespace | L1 | Active | tiphys hostIPC defaults to false |
| 4.2.4 | Minimize the admission of containers wishing to share the host network namespace | L1 | Active | tiphys hostNetwork defaults to false |
| 4.2.5 | Minimize the admission of containers with allowPrivilegeEscalation | L1 | Active | tiphys defaultSecurityContext.allowPrivilegeEscalation=false |
| 4.2.6 | Minimize the admission of root containers | L2 | Active | tiphys defaultSecurityContext.[runAsNonRoot=true,runAsUser=1001] |
| 4.2.7 | Minimize the admission of containers with added capabilities | L1 | Active | tiphys defaultSecurityContext.allowPrivilegeEscalation=false |
| 4.2.8 | Minimize the admission of containers with capabilities assigned | L1 | Active | tiphys defaultSecurityContext.capabilities.drop: ALL |
| 4.3 | CNI Plugin | | | |
| 4.3.1 | Ensure CNI plugin supports network policies. | L1 | Manual | calico_enabled=true |
| 4.3.2 | Ensure that all Namespaces have Network Policies defined | L1 | Manual | Add Network Policy manually |
| 4.4 | Secrets Management | | | |
| 4.4.1 | Prefer using secrets as files over secrets as environment variables | L2 | Active | tiphys writes secrets to file |
| 4.4.2 | Consider external secret storage | L2 | Manual | Pull secrets using AWS Secret Manager. |
| 4.5 | Extensible Admission Control | | | |
| 4.6 | General Policies | | | |
| 4.6.1 | Create administrative boundaries between resources using namespaces | L1 | Manual | tiphys deploy on different namespace |
| 4.6.2 | Apply Security Context to Your Pods and Containers | L2 | Active | tiphys defaultSecurityContext is set |
| 4.6.3 | The default namespace should not be used | L2 | Active | tiphys select namespace |
| 5 | Managed services | | | |
| 5.1 | Image Registry and Image Scanning | | | |
| 5.1.1 | Ensure Image Vulnerability Scanning using Amazon ECR image scanning or a third party provider | L1 | Active | Example |
| 5.1.2 | Minimize user access to Amazon ECR | L1 | Active | terraform-aws-mrmgr |
| 5.1.3 | Minimize cluster access to read-only for Amazon ECR | L1 | Active | terraform-aws-mrmgr with OIDC |
| 5.1.4 | Minimize Container Registries to only those approved | L2 | Active | terraform-aws-mrmgr |
| 5.2 | Identity and Access Management (IAM) | | | |
| 5.2.1 | Prefer using dedicated EKS Service Accounts | L1 | Active | terraform-aws-mrmgr with OIDC |
| 5.3 | AWS EKS Key Management Service | | | |
| 5.3.1 | Ensure Kubernetes Secrets are encrypted using Customer Master Keys (CMKs) managed in AWS KMS | L1 | Active | |
| 5.4 | Cluster Networking | | | |
| 5.4.1 | Restrict Access to the Control Plane Endpoint | L1 | Active | Set cluster_public_access_cidrs |
| 5.4.2 | Ensure clusters are created with Private Endpoint Enabled and Public Access Disabled | L2 | Active | Set cluster_private_access = true and cluster_public_access = false |
| 5.4.3 | Ensure clusters are created with Private Nodes | L1 | Active | Set nat_enabled = true and set nodes_in_public_subnet = false |
| 5.4.4 | Ensure Network Policy is Enabled and set as appropriate | L1 | Manual | calico_enabled=true |
| 5.4.5 | Encrypt traffic to HTTPS load balancers with TLS certificates | L2 | Active | terraform-helm-kubespot |
| 5.5 | Authentication and Authorization | | | |
| 5.5.1 | Manage Kubernetes RBAC users with AWS IAM Authenticator for Kubernetes | L2 | Active | iam_users use AWS IAM Authenticator |
| 5.6 | Other Cluster Configurations | | | |
| 5.6.1 | Consider Fargate for running untrusted workloads | L1 | Active | Set the fargate_selector |
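
Several of the controls above map directly onto module inputs. The sketch below collects them in one place, with each setting annotated by the control it relates to. It uses only input names from the Inputs section; the module source address, the environment name, and the example CIDR are assumptions, and the result has not been validated against a benchmark scan.

```hcl
module "kubespot" {
  # Assumed source address; replace with wherever you obtain the module.
  source = "github.com/opszero/terraform-aws-kubespot"

  environment_name = "example" # hypothetical environment name

  # 2.1.1 Enable audit logs
  cluster_logging = ["api", "audit", "authenticator", "controllerManager", "scheduler"]

  # 5.3.1 Encrypt Kubernetes Secrets with a KMS key
  cluster_encryption_config = ["secrets"]

  # 5.4.2 Private endpoint enabled, public access disabled
  cluster_private_access = true
  cluster_public_access  = false

  # 5.4.1 Only relevant if public access stays enabled: restrict it to
  # known networks. The CIDR below is a documentation-range placeholder.
  # cluster_public_access_cidrs = ["203.0.113.0/24"]

  # 5.4.3 Private nodes reach the internet through the NAT gateway
  nat_enabled = true

  # 4.3.1 and 5.4.4 Install a CNI add-on that supports network policies
  calico_enabled = true

  # 5.6.1 Fargate profile for untrusted workloads
  fargate_selector = {
    serverless = {}
  }
}
```

With cluster_public_access set to false, the API server is reachable only from inside the VPC, so kubectl and Terraform need a network path such as a VPN or bastion.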

Providers

| Name | Version |
|---|---|
| aws | n/a |
| helm | n/a |
| http | n/a |
| kubernetes | n/a |
| null | n/a |
| tls | n/a |

Inputs

| Name | Description | Type | Default | Required |
|---|---|---|---|---|
| access_policies | access policies | list | [] | no |
| alb_controller_version | The chart version of the ALB controller helm chart | string | "1.4.4" | no |
| asg_nodes | Map of ASG node configurations | map(object({ instance_type = string, max_instance_lifetime = number, nodes_desired_capacity = number, nodes_max_size = number, nodes_min_size = number, nodes_in_public_subnet = bool, node_disk_size = number, node_enabled_metrics = list(string), spot_price = string, subnet_ids = list(string) })) | {} | no |
| aws_load_balancer_controller_enabled | Enable ALB controller by default | bool | true | no |
| calico_enabled | Whether calico add-on is installed | bool | false | no |
| calico_version | The version of the calico helm chart | string | "v3.26.1" | no |
| cidr_block | The CIDR block used by the VPC | string | "10.2.0.0/16" | no |
| cidr_block_private_subnet | The CIDR block used by the private subnet | list | ["10.2.2.0/24", "10.2.3.0/24"] | no |
| cidr_block_public_subnet | The CIDR block used by the public subnet | list | ["10.2.0.0/24", "10.2.1.0/24"] | no |
| cloudwatch_pod_logs_enabled | Stream EKS pod logs to CloudWatch | bool | false | no |
| cloudwatch_retention_in_days | How long to keep CloudWatch logs in days | number | 30 | no |
| cluster_authentication_mode | Desired Kubernetes authentication. API or API_AND_CONFIG_MAP | string | "API" | no |
| cluster_encryption_config | Cluster Encryption Config Resources to encrypt, e.g. ['secrets'] | list(any) | ["secrets"] | no |
| cluster_kms_policy | Cluster Encryption Config KMS Key Resource argument - key policy | string | null | no |
| cluster_logging | List of the desired control plane logging to enable. https://docs.aws.amazon.com/eks/latest/userguide/control-plane-logs.html | list | ["api", "audit", "authenticator", "controllerManager", "scheduler"] | no |
| cluster_private_access | Whether the Amazon EKS private API server endpoint is enabled | bool | true | no |
| cluster_public_access | Whether the Amazon EKS public API server endpoint is enabled | bool | true | no |
| cluster_public_access_cidrs | List of CIDR blocks. Indicates which CIDR blocks can access the Amazon EKS public API server endpoint when enabled | list | ["0.0.0.0/0"] | no |
| cluster_version | Desired Kubernetes master version | string | "1.30" | no |
| csi_enabled_namespaces | n/a | list(string) | [] | no |
| csi_secrets_store_enabled | Specify whether the CSI driver is enabled on the EKS cluster | bool | false | no |
| csi_secrets_store_version | The version of the CSI store helm chart | string | "1.4.6" | no |
| efs_enabled | Specify whether the EFS is enabled on the EKS cluster | bool | false | no |
| eips | List of Elastic IPs | list | [] | no |
| enable_egress_only_internet_gateway | Create an egress-only Internet gateway for your VPC | bool | false | no |
| enable_ipv6 | Enable an Amazon-provided IPv6 CIDR block with a /56 prefix length for the VPC | bool | false | no |
| environment_name | Name of the environment to create AWS resources | string | n/a | yes |
| fargate_selector | Terraform object to create the EKS fargate profiles | map | { "serverless": {} } | no |
| iam_roles | Terraform object of the IAM roles | map | {} | no |
| iam_users | List of IAM users | list | [] | no |
| karpenter_ami_family | AMI family to use for the EC2 Node Class. Possible values: AL2 or Bottlerocket | string | "Bottlerocket" | no |
| karpenter_enabled | Specify whether the karpenter is enabled | bool | false | no |
| karpenter_version | The version of the karpenter helm chart | string | "1.0.1" | no |
| metrics_server_version | The version of the metric server helm chart | string | "3.11.0" | no |
| nat_enabled | Whether the NAT gateway is enabled | bool | true | no |
| node_group_cpu_threshold | The value of the CPU threshold | string | "70" | no |
| node_groups | Terraform object to create the EKS node groups | map | {} | no |
| node_role_policies | A list of ARNs of the policies you want to attach | list | [] | no |
| redis_enabled | Whether the redis cluster is enabled | bool | false | no |
| redis_engine_version | Version number of the cache engine to be used for the cache clusters in this replication group | string | "7.1" | no |
| redis_node_type | Instance class of the redis cluster to be used | string | "cache.t4g.micro" | no |
| redis_num_nodes | Number of nodes for redis | number | 1 | no |
| s3_csi_bucket_names | The name of the S3 bucket for the CSI driver | list(string) | [""] | no |
| s3_csi_driver_enabled | Enable or disable the S3 CSI driver | bool | false | no |
| sql_cluster_enabled | Whether the sql cluster is enabled | bool | false | no |
| sql_cluster_monitoring_interval | Monitoring Interval for SQL Cluster | any | null | no |
| sql_cluster_monitoring_role_arn | The ARN for the IAM role that permits RDS to send enhanced monitoring metrics to CloudWatch Logs | any | null | no |
| sql_database_name | The name of the database to create when the DB instance is created | string | "" | no |
| sql_encrypted | Specify whether the DB instance is encrypted | bool | true | no |
| sql_engine | The name of the database engine to be used for this DB cluster | string | "aurora-postgresql" | no |
| sql_engine_mode | The database engine mode | string | "provisioned" | no |
| sql_engine_version | The SQL engine version to use | string | "15.3" | no |
| sql_iam_auth_enabled | Specifies whether or not mappings of IAM accounts to database accounts is enabled | bool | true | no |
| sql_identifier | The name of the database | string | "" | no |
| sql_instance_allocated_storage | The allocated storage in gibibytes | number | 20 | no |
| sql_instance_class | The instance type of the RDS instance. | string | "db.t4g.micro" | no |
| sql_instance_enabled | Whether the sql instance is enabled | bool | false | no |
| sql_instance_engine | The database engine to use | string | "postgres" | no |
| sql_instance_max_allocated_storage | The upper limit to which Amazon RDS can automatically scale the storage of the DB instance | number | 200 | no |
| sql_master_password | Password for the master DB user | string | "" | no |
| sql_master_username | Username for the master DB user | string | "" | no |
| sql_node_count | The number of instances to be used for this DB cluster | number | 0 | no |
| sql_parameter_group_name | Name of the DB parameter group to associate | string | "" | no |
| sql_performance_insights_enabled | Specifies whether Performance Insights are enabled. Defaults to false | bool | false | no |
| sql_rds_multi_az | Specify if the RDS instance is enabled multi-AZ | bool | false | no |
| sql_serverless_seconds_until_auto_pause | The time, in seconds, before the DB cluster in serverless mode is paused | number | 300 | no |
| sql_skip_final_snapshot | Determines whether a final DB snapshot is created before the DB instance is deleted. | bool | false | no |
| sql_storage_type | The allocated storage type for DB Instance | string | "gp3" | no |
| sql_subnet_group_include_public | Include public subnets as part of the clusters subnet configuration. | bool | false | no |
| tags | Terraform map to create custom tags for the AWS resources | map | {} | no |
| vpc_flow_logs_enabled | Specify whether the vpc flow log is enabled | bool | false | no |
| zones | AZs for the subnets | list | ["us-west-2a", "us-west-2b"] | no |
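
To show how the inputs above fit together, here is a minimal sketch of a module call. Only environment_name is required; the other values repeat the documented defaults, apart from vpc_flow_logs_enabled, which is switched on as an example. The module source address, the environment name, and the tag are assumptions made for illustration.

```hcl
module "kubespot" {
  # Assumed source address; replace with wherever you obtain the module.
  source = "github.com/opszero/terraform-aws-kubespot"

  # The only required input. "example" is a hypothetical name.
  environment_name = "example"

  cluster_version = "1.30"

  # Network layout: two availability zones with one public and one
  # private subnet each, carved out of the VPC CIDR block.
  zones                     = ["us-west-2a", "us-west-2b"]
  cidr_block                = "10.2.0.0/16"
  cidr_block_public_subnet  = ["10.2.0.0/24", "10.2.1.0/24"]
  cidr_block_private_subnet = ["10.2.2.0/24", "10.2.3.0/24"]

  # Logging
  cloudwatch_retention_in_days = 30
  vpc_flow_logs_enabled        = true # default is false

  tags = {
    Environment = "example" # hypothetical tag
  }
}
```

Each subnet CIDR must fall inside cidr_block, and the subnet lists are presumably matched one-to-one with zones, so keep the three lists the same length.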

Resources

| Name | Type |
|---|---|
| aws_autoscaling_group.asg_nodes | resource |
| aws_cloudwatch_log_group.vpc | resource |
| aws_cloudwatch_metric_alarm.asg_nodes_cpu_threshold | resource |
| aws_cloudwatch_metric_alarm.database_cpu_database | resource |
| aws_cloudwatch_metric_alarm.database_cpu_database-rds | resource |
| aws_cloudwatch_metric_alarm.database_disk_database | resource |
| aws_cloudwatch_metric_alarm.database_free_disk_database | resource |
| aws_cloudwatch_metric_alarm.database_free_disk_database2 | resource |
| aws_cloudwatch_metric_alarm.database_free_disk_database3 | resource |
| aws_cloudwatch_metric_alarm.database_free_disk_database4 | resource |
| aws_cloudwatch_metric_alarm.database_free_disk_database5 | resource |
| aws_cloudwatch_metric_alarm.database_io_mysql | resource |
| aws_cloudwatch_metric_alarm.database_io_postgres | resource |
| aws_cloudwatch_metric_alarm.database_io_rds | resource |
| aws_cloudwatch_metric_alarm.node_group_cpu_threshold | resource |
| aws_db_instance.default | resource |
| aws_db_subnet_group.default | resource |
| aws_egress_only_internet_gateway.egress | resource |
| aws_eip.eips | resource |
| aws_eks_access_entry.entries | resource |
| aws_eks_access_policy_association.policies | resource |
| aws_eks_addon.core | resource |
| aws_eks_cluster.cluster | resource |
| aws_eks_fargate_profile.fargate | resource |
| aws_eks_node_group.node_group | resource |
| aws_elasticache_cluster.default | resource |
| aws_elasticache_subnet_group.default | resource |
| aws_flow_log.vpc | resource |
| aws_iam_instance_profile.node | resource |
| aws_iam_openid_connect_provider.cluster | resource |
| aws_iam_policy.alb | resource |
| aws_iam_policy.ebs | resource |
| aws_iam_policy.eks_pod_logs_to_cloudwatch | resource |
| aws_iam_policy.s3_policy | resource |
| aws_iam_policy.secrets_policy | resource |
| aws_iam_role.cluster | resource |
| aws_iam_role.fargate | resource |
| aws_iam_role.node | resource |
| aws_iam_role.secrets_manager_role | resource |
| aws_iam_role.vpc | resource |
| aws_iam_role_policy.vpc | resource |
| aws_iam_role_policy_attachment.alb | resource |
| aws_iam_role_policy_attachment.cluster-AmazonEKSClusterPolicy | resource |
| aws_iam_role_policy_attachment.cluster-AmazonEKSServicePolicy | resource |
| aws_iam_role_policy_attachment.csi | resource |
| aws_iam_role_policy_attachment.ebs | resource |
| aws_iam_role_policy_attachment.fargate-AmazonEKSFargatePodExecutionRolePolicy | resource |
| aws_iam_role_policy_attachment.node-AmazonEC2ContainerRegistryReadOnly | resource |
| aws_iam_role_policy_attachment.node-AmazonEKSWorkerNodePolicy | resource |
| aws_iam_role_policy_attachment.node-AmazonEKS_CNI_Policy | resource |
| aws_iam_role_policy_attachment.node_eks_pod_logs_to_cloudwatch | resource |
| aws_iam_role_policy_attachment.node_role_policies | resource |
| aws_iam_role_policy_attachment.secrets_manager_attachment | resource |
| aws_internet_gateway.public | resource |
| aws_kms_key.cloudwatch_log | resource |
| aws_kms_key.cluster_secrets | resource |
| aws_launch_configuration.asg_nodes | resource |
| aws_launch_template.encrypted_launch_template | resource |
| aws_nat_gateway.gw | resource |
| aws_rds_cluster.default | resource |
| aws_rds_cluster_instance.cluster_instances | resource |
| aws_route.ig | resource |
| aws_route.ipv6 | resource |
| aws_route.nat | resource |
| aws_route_table.private | resource |
| aws_route_table.public | resource |
| aws_route_table_association.private | resource |
| aws_route_table_association.public | resource |
| aws_security_group.cluster | resource |
| aws_security_group.node | resource |
| aws_security_group_rule.cluster-ingress-node-https | resource |
| aws_security_group_rule.eks | resource |
| aws_security_group_rule.node-ingress-cluster | resource |
| aws_security_group_rule.node-ingress-self | resource |
| aws_security_group_rule.private_subnet | resource |
| aws_security_group_rule.public_subnet | resource |
| aws_subnet.private | resource |
| aws_subnet.public | resource |
| aws_vpc.vpc | resource |
| helm_release.aws_load_balancer | resource |
| helm_release.calico | resource |
| helm_release.csi_secrets_store | resource |
| helm_release.karpenter | resource |
| helm_release.karpenter_crd | resource |
| helm_release.metrics-server | resource |
| kubernetes_config_map.aws_auth | resource |
| kubernetes_config_map.fluent_bit_cluster_info | resource |
| kubernetes_namespace.amazon_cloudwatch | resource |
| kubernetes_service_account.efs_csi_controller_sa | resource |
| kubernetes_service_account.efs_csi_node_sa | resource |
| kubernetes_service_account.main | resource |
| null_resource.csi_secrets_store_aws_provider | resource |
| null_resource.delete_aws_node | resource |
| null_resource.karpenter_ec2_node_class_apply | resource |
| aws_availability_zones.available | data source |
| aws_caller_identity.current | data source |
| aws_eks_cluster_auth.cluster | data source |
| aws_iam_policy.ssm_managed_instance | data source |
| aws_iam_policy_document.cloudwatch | data source |
| aws_iam_policy_document.trust_relationship | data source |
| aws_partition.current | data source |
| aws_region.current | data source |
| aws_ssm_parameter.bottlerocket_ami | data source |
| aws_ssm_parameter.eks_al2_ami | data source |
| aws_ssm_parameter.eks_ami | data source |
| http_http.csi_secrets_store_aws_provider | data source |
| tls_certificate.cluster | data source |

Outputs

| Name | Description |
|---|---|
| eks_cluster | n/a |
| eks_cluster_oidc_provider_arn | n/a |
| eks_cluster_token | n/a |
| internet_gateway_id | n/a |
| nat_gateway_ids | n/a |
| node_role | n/a |
| node_security_group_id | n/a |
| private_route_table | n/a |
| private_subnet_ids | n/a |
| public_route_table | n/a |
| public_subnet_ids | n/a |
| vpc_id | n/a |
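
Because the outputs carry no descriptions, a short sketch of how they might be consumed may help. The example re-exports a few of them from a root configuration so that other stacks can read them. The module source address, the environment name, and the root-level output names are assumptions; the module output names come from the table above.

```hcl
module "kubespot" {
  # Assumed source address; replace with wherever you obtain the module.
  source = "github.com/opszero/terraform-aws-kubespot"

  environment_name = "example" # hypothetical environment name
}

# Re-export selected module outputs from the root configuration.
output "vpc_id" {
  value = module.kubespot.vpc_id
}

output "private_subnet_ids" {
  value = module.kubespot.private_subnet_ids
}

output "node_security_group_id" {
  value = module.kubespot.node_security_group_id
}

output "oidc_provider_arn" {
  value = module.kubespot.eks_cluster_oidc_provider_arn
}
```

Once applied, `terraform output vpc_id` prints the value for use in scripts or in other Terraform configurations.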

🚀 Built by opsZero!

Since 2016 opsZero has been providing Kubernetes expertise to companies of all sizes on any cloud. With a focus on AI and compliance, we can say we have seen it all: whether it is SOC2, HIPAA, PCI-DSS, ITAR, FedRAMP, or CMMC, we have you and your customers covered.

We provide support to organizations in the following ways:

We do this with a high-touch support model where you:

  • Get access to us on Slack, Microsoft Teams or Email
  • Get 24/7 coverage of your infrastructure
  • Get an accelerated migration to Kubernetes

Please schedule a call if you need support.