diff --git a/docs/user/aws/install_existing_vpc_local-zones.md b/docs/user/aws/install_existing_vpc_local-zones.md
new file mode 100644
index 00000000000..b1a61f88dbd
--- /dev/null
+++ b/docs/user/aws/install_existing_vpc_local-zones.md
@@ -0,0 +1,489 @@
+# Cluster Installation in an existing VPC with Local Zones subnets
+
+The steps below describe how to install a cluster in an existing VPC with AWS Local Zones subnets using the Edge Machine Pool, introduced in 4.12.
+
+The Edge Machine Pool creates a pool of workers that run in AWS Local Zones locations. This pool differs from the default compute pool in the following ways, and edge workers are not designed to run regular cluster workloads:
+- Resources in AWS Local Zones are more expensive than those in the normal availability zones.
+- Latency between applications and end users is lower in Local Zones, and it varies by location. Mixing workloads such as routers between Local Zones and the normal availability zones would therefore introduce unbalanced latency.
+- Network Load Balancers do not support subnets in Local Zones.
+- For end users close to the metropolitan region running the workload, the total time to connect to applications running in Local Zones is almost 10x faster than connecting to the parent region.
+
+Table of Contents:
+
+- [Prerequisites](#prerequisites)
+  - [Additional IAM permissions](#prerequisites-iam)
+- [Create the Network stack](#create-network)
+  - [Create the VPC](#create-network-vpc)
+  - [Create the Local Zone subnet](#create-network-subnet)
+    - [Opt-in zone group](#create-network-subnet-optin)
+    - [Creating the Subnet using AWS CloudFormation](#create-network-subnet-cfn)
+- [Install](#install-cluster)
+  - [Create the install-config.yaml](#create-config)
+  - [Setting up the Edge Machine Pool](#create-config-edge-pool)
+    - [Example edge pool created without customization](#create-config-edge-pool-example-default)
+    - [Example edge pool with custom Instance type](#create-config-edge-pool-example-ec2)
+    - [Example edge pool with custom EBS type](#create-config-edge-pool-example-ebs)
+  - [Create the cluster](#create-cluster-run)
+- [Uninstall](#uninstall)
+  - [Destroy the cluster](#uninstall-destroy-cluster)
+  - [Destroy the Local Zone subnet](#uninstall-destroy-subnet)
+  - [Destroy the VPC](#uninstall-destroy-vpc)
+- [Use Cases](#use-cases)
+  - [Example of a sample application deployment](#uc-deployment)
+  - [User-workload ingress traffic](#uc-exposing-ingress)
+___
+
+To install a cluster in an existing VPC with Local Zone subnets, you should provision the network resources and then add the subnet IDs to the `install-config.yaml`.
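+
+Before you begin, it can help to see which Local Zones are available in the
+target Region. A quick way to list them, including zone groups you have not
+opted into yet, is the EC2 `DescribeAvailabilityZones` API; for example:
+
+```bash
+aws ec2 describe-availability-zones \
+  --region us-west-2 \
+  --filters Name=zone-type,Values=local-zone \
+  --all-availability-zones
+```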
+
+## Prerequisites
+
+- [AWS Command Line Interface][aws-cli]
+- [openshift-install >= 4.12][openshift-install]
+- environment variables exported:
+
+```bash
+export CLUSTER_NAME="ipi-localzones"
+
+# AWS Region and extra Local Zone group Information
+export AWS_REGION="us-west-2"
+export ZONE_GROUP_NAME="us-west-2-lax-1"
+export ZONE_NAME="us-west-2-lax-1a"
+
+# VPC Information
+export VPC_CIDR="10.0.0.0/16"
+export VPC_SUBNETS_BITS="10"
+export VPC_SUBNETS_COUNT="3"
+
+# Local Zone Subnet information
+export SUBNET_CIDR="10.0.192.0/22"
+export SUBNET_NAME="${CLUSTER_NAME}-public-usw2-lax-1a"
+```
+
+### Additional IAM permissions
+
+The AWS Local Zone deployment described in this document requires the user creating the cluster to have the additional permission `ec2:ModifyAvailabilityZoneGroup`, which is used to modify the Local Zone group.
+
+Example of a permissive IAM policy that can be attached to the user or role:
+
+```json
+{
+    "Version": "2012-10-17",
+    "Statement": [
+        {
+            "Sid": "Stmt1677614927608",
+            "Action": [
+                "ec2:ModifyAvailabilityZoneGroup"
+            ],
+            "Effect": "Allow",
+            "Resource": "*"
+        }
+    ]
+}
+```
+
+## Create the Network Stack
+
+### Create the VPC
+
+The steps to install a cluster in an existing VPC are [detailed in the official documentation][aws-install-vpc]. You can alternatively use [the CloudFormation templates to create the Network resources][aws-install-cloudformation], which is the approach used in this document.
+
+- Create the Stack:
+
+```bash
+INSTALLER_URL="https://raw.githubusercontent.com/openshift/installer/master"
+TPL_URL="${INSTALLER_URL}/upi/aws/cloudformation/01_vpc.yaml"
+
+aws cloudformation create-stack \
+  --region ${AWS_REGION} \
+  --stack-name ${CLUSTER_NAME}-vpc \
+  --template-body ${TPL_URL} \
+  --parameters \
+    ParameterKey=VpcCidr,ParameterValue=${VPC_CIDR} \
+    ParameterKey=SubnetBits,ParameterValue=${VPC_SUBNETS_BITS} \
+    ParameterKey=AvailabilityZoneCount,ParameterValue=${VPC_SUBNETS_COUNT}
+```
+
+- Wait for the stack to be created: `StackStatus=CREATE_COMPLETE`
+
+```bash
+aws cloudformation wait stack-create-complete \
+  --region ${AWS_REGION} \
+  --stack-name ${CLUSTER_NAME}-vpc
+```
+
+- Export the VPC ID:
+
+```bash
+export VPC_ID=$(aws cloudformation describe-stacks \
+  --region ${AWS_REGION} \
+  --stack-name ${CLUSTER_NAME}-vpc \
+  | jq -r '.Stacks[0].Outputs[] | select(.OutputKey=="VpcId").OutputValue' )
+```
+
+- Extract the subnet IDs to the environment variable array `SUBNETS`:
+
+```bash
+mapfile -t SUBNETS < <(aws cloudformation describe-stacks \
+  --region ${AWS_REGION} \
+  --stack-name ${CLUSTER_NAME}-vpc \
+  | jq -r '.Stacks[0].Outputs[0].OutputValue' | tr ',' '\n')
+mapfile -t -O "${#SUBNETS[@]}" SUBNETS < <(aws cloudformation describe-stacks \
+  --region ${AWS_REGION} \
+  --stack-name ${CLUSTER_NAME}-vpc \
+  | jq -r '.Stacks[0].Outputs[1].OutputValue' | tr ',' '\n')
+```
+
+- Export the Public Route Table ID:
+
+```bash
+export PUBLIC_RTB_ID=$(aws cloudformation describe-stacks \
+  --region ${AWS_REGION} \
+  --stack-name ${CLUSTER_NAME}-vpc \
+  | jq -r '.Stacks[0].Outputs[] | select(.OutputKey=="PublicRouteTableId").OutputValue' )
+```
+
+- Make sure all variables have been correctly set:
+
+```bash
+echo "SUBNETS=${SUBNETS[*]}
+VPC_ID=${VPC_ID}
+PUBLIC_RTB_ID=${PUBLIC_RTB_ID}"
+```
+
+### Create the Local Zone subnet
+
+The following actions are required to create subnets in Local Zones:
+- choose the zone group to be enabled
+- opt in to the zone group
+
+#### Opt-in Zone groups
+
+Opt in to the zone group:
+
+```bash
+aws ec2 modify-availability-zone-group \
+  --region ${AWS_REGION} \
+  --group-name ${ZONE_GROUP_NAME} \
+  --opt-in-status opted-in
+```
+
+#### Creating the Subnet using AWS CloudFormation
+
+- Create the Stack for the Local Zone subnet `us-west-2-lax-1a`:
+
+```bash
+INSTALLER_URL="https://raw.githubusercontent.com/openshift/installer/master"
+TPL_URL="${INSTALLER_URL}/upi/aws/cloudformation/01.99_net_local-zone.yaml"
+
+aws cloudformation create-stack \
+  --region ${AWS_REGION} \
+  --stack-name ${SUBNET_NAME} \
+  --template-body ${TPL_URL} \
+  --parameters \
+    ParameterKey=VpcId,ParameterValue=${VPC_ID} \
+    ParameterKey=ZoneName,ParameterValue=${ZONE_NAME} \
+    ParameterKey=SubnetName,ParameterValue=${SUBNET_NAME} \
+    ParameterKey=PublicSubnetCidr,ParameterValue=${SUBNET_CIDR} \
+    ParameterKey=PublicRouteTableId,ParameterValue=${PUBLIC_RTB_ID}
+```
+
+- Wait for the stack to be created: `StackStatus=CREATE_COMPLETE`
+
+```bash
+aws cloudformation wait stack-create-complete \
+  --region ${AWS_REGION} \
+  --stack-name ${SUBNET_NAME}
+```
+
+- Export the Local Zone subnet ID:
+
+```bash
+export SUBNET_ID=$(aws cloudformation describe-stacks \
+  --region ${AWS_REGION} \
+  --stack-name ${SUBNET_NAME} \
+  | jq -r '.Stacks[0].Outputs[] | select(.OutputKey=="PublicSubnetIds").OutputValue' )
+
+# Append the Local Zone Subnet ID to the Subnet List
+SUBNETS+=(${SUBNET_ID})
+```
+
+- Check the total number of subnets. If you chose 3 AZs when creating the VPC stack, you should have 7 subnets in this list:
+
+```bash
+$ echo ${#SUBNETS[*]}
+7
+```
+
+## Install the cluster
+
+To install the cluster in an existing VPC with subnets in Local Zones, you should:
+- generate the `install-config.yaml`, or provide yours
+- add the subnet IDs by setting the option `platform.aws.subnets`
+- (optional) customize the `edge` compute pool
+
+### Create the install-config.yaml
+
+Create the `install-config.yaml`, providing the subnet IDs just created:
+
+- create the `install-config`:
+
+```bash
+$ ./openshift-install create install-config --dir ${CLUSTER_NAME}
+? SSH Public Key /home/user/.ssh/id_rsa.pub
+? Platform aws
+? Region us-west-2
+? Base Domain devcluster.openshift.com
+? Cluster Name ipi-localzones
+? Pull Secret [? for help] **
+INFO Install-Config created in: ipi-localzones
+```
+
+- Append the subnets to `platform.aws.subnets`:
+
+```bash
+$ echo "    subnets:"; for SB in ${SUBNETS[*]}; do echo "    - $SB"; done
+    subnets:
+    - subnet-0fc845d8e30fdb431
+    - subnet-0a2675b7cbac2e537
+    - subnet-01c0ac400e1920b47
+    - subnet-0fee60966b7a93da6
+    - subnet-002b48c0a91c8c641
+    - subnet-093f00deb44ce81f4
+    - subnet-0f85ae65796e8d107
+```
+
+### Setting up the Edge Machine Pool
+
+Version 4.12 introduces a new compute pool named `edge`, designed for
+remote zones. The `edge` compute pool configuration is shared between
+AWS Local Zone locations, but because Local Zones offer a limited set of
+resources (instance types and sizes), the default instance type created may
+differ from the one used for the traditional worker pool.
+
+The default EBS type for Local Zone locations is `gp2`, unlike the `gp3` default of the worker pool.
+
+The preferred instance types follow the same order as for worker pools; depending
+on availability in the location, one of these instances will be chosen:
+> Note: This list can be updated over time.
+- `m6i.xlarge`
+- `m5.xlarge`
+- `c5d.2xlarge`
+
+The `edge` compute pool also creates new node labels to help developers
+deploy their applications onto those locations.
+The new labels introduced are:
+ - `node-role.kubernetes.io/edge=''`
+ - `machine.openshift.io/zone-type=local-zone`
+ - `machine.openshift.io/zone-group=<zone group name>`
+
+Finally, the MachineSets created by the `edge` compute pool carry a `NoSchedule` taint, which keeps
+regular workloads from spreading onto those machines; user workloads run there
+only when a matching toleration is defined on the pod spec (you can see the example in the following sections).
+
+By default, the `edge` compute pool is created only when AWS Local Zone subnet IDs are added
+to the list of `platform.aws.subnets`.
+
+Below are some examples of `install-config.yaml` with the `edge` compute pool.
+
+#### Example edge pool created without customization
+
+```yaml
+apiVersion: v1
+baseDomain: devcluster.openshift.com
+metadata:
+  name: ipi-localzones
+platform:
+  aws:
+    region: us-west-2
+    subnets:
+    - subnet-0fc845d8e30fdb431
+    - subnet-0a2675b7cbac2e537
+    - subnet-01c0ac400e1920b47
+    - subnet-0fee60966b7a93da6
+    - subnet-002b48c0a91c8c641
+    - subnet-093f00deb44ce81f4
+    - subnet-0f85ae65796e8d107
+pullSecret: '{"auths": ...}'
+sshKey: ssh-ed25519 AAAA...
+```
+
+#### Example edge pool with custom Instance type
+
+Instance type availability may differ between locations. Check the AWS documentation for the instance types available in the Local Zones where the cluster will run.
+
+`install-config.yaml` example customizing the Instance Type for the Edge Machine Pool:
+
+```yaml
+apiVersion: v1
+baseDomain: devcluster.openshift.com
+metadata:
+  name: ipi-localzones
+compute:
+- name: edge
+  platform:
+    aws:
+      type: m5.4xlarge
+platform:
+  aws:
+    region: us-west-2
+    subnets:
+    - subnet-0fc845d8e30fdb431
+    - subnet-0a2675b7cbac2e537
+    - subnet-01c0ac400e1920b47
+    - subnet-0fee60966b7a93da6
+    - subnet-002b48c0a91c8c641
+    - subnet-093f00deb44ce81f4
+    - subnet-0f85ae65796e8d107
+pullSecret: '{"auths": ...}'
+sshKey: ssh-ed25519 AAAA...
+```
+
+#### Example edge pool with custom EBS type
+
+EBS type availability may differ between locations. Check the AWS documentation for the EBS types available in the Local Zones where the cluster will run.
+
+`install-config.yaml` example customizing the EBS Type for the Edge Machine Pool:
+
+```yaml
+apiVersion: v1
+baseDomain: devcluster.openshift.com
+metadata:
+  name: ipi-localzones
+compute:
+- name: edge
+  platform:
+    aws:
+      rootVolume:
+        type: gp3
+        size: 120
+platform:
+  aws:
+    region: us-west-2
+    subnets:
+    - subnet-0fc845d8e30fdb431
+    - subnet-0a2675b7cbac2e537
+    - subnet-01c0ac400e1920b47
+    - subnet-0fee60966b7a93da6
+    - subnet-002b48c0a91c8c641
+    - subnet-093f00deb44ce81f4
+    - subnet-0f85ae65796e8d107
+pullSecret: '{"auths": ...}'
+sshKey: ssh-ed25519 AAAA...
+```
+
+### Create the cluster
+
+```bash
+./openshift-install create cluster --dir ${CLUSTER_NAME}
+```
+
+## Uninstall the cluster
+
+### Destroy the cluster
+
+```bash
+./openshift-install destroy cluster --dir ${CLUSTER_NAME}
+```
+
+### Destroy the Local Zone subnet
+
+```bash
+aws cloudformation delete-stack \
+  --region ${AWS_REGION} \
+  --stack-name ${SUBNET_NAME}
+```
+
+### Destroy the VPC
+
+```bash
+aws cloudformation delete-stack \
+  --region ${AWS_REGION} \
+  --stack-name ${CLUSTER_NAME}-vpc
+```
+
+## Use Cases
+
+### Example of a sample application deployment
+
+The example below deploys a sample application on a node running in the Local Zone, setting the node selector and toleration needed to schedule the pod onto the correct node:
+
+```bash
+cat << EOF | oc create -f -
+apiVersion: v1
+kind: Namespace
+metadata:
+  name: local-zone-demo
+---
+apiVersion: apps/v1
+kind: Deployment
+metadata:
+  name: local-zone-demo-app-nyc-1
+  namespace: local-zone-demo
+spec:
+  selector:
+    matchLabels:
+      app: local-zone-demo-app-nyc-1
+  replicas: 1
+  template:
+    metadata:
+      labels:
+        app: local-zone-demo-app-nyc-1
+        machine.openshift.io/zone-group: ${ZONE_GROUP_NAME}
+    spec:
+      nodeSelector:
+        machine.openshift.io/zone-group: ${ZONE_GROUP_NAME}
+      tolerations:
+      - key: "node-role.kubernetes.io/edge"
+        operator: "Equal"
+        value: ""
+        effect: "NoSchedule"
+      containers:
+        - image: openshift/origin-node
+          command:
+            - "/bin/socat"
+          args:
+            - TCP4-LISTEN:8080,reuseaddr,fork
+            - EXEC:'/bin/bash -c \"printf \\\"HTTP/1.0 200 OK\r\n\r\n\\\"; sed -e \\\"/^\r/q\\\"\"'
+          imagePullPolicy: Always
+          name: echoserver
+          ports:
+            - containerPort: 8080
+---
+apiVersion: v1
+kind: Service
+metadata:
+  name: local-zone-demo-app-nyc-1
+  namespace: local-zone-demo
+spec:
+  ports:
+    - port: 80
+      targetPort: 8080
+      protocol: TCP
+  type: NodePort
+  selector:
+    app: local-zone-demo-app-nyc-1
+EOF
+```
+
+### User-workload ingress traffic
+
+To expose applications to the internet on AWS Local Zones, application developers
+must expose their applications using an external load balancer, for example AWS Application Load Balancers (ALB). The
+[ALB Operator](https://docs.openshift.com/container-platform/4.11/networking/aws_load_balancer_operator/install-aws-load-balancer-operator.html) is available through OLM on 4.11+.
+
+To get the most benefit from deploying applications in AWS Local Zone locations, at least one new
+ALB `Ingress` must be provisioned per location to expose the services deployed in the
+zones.
+
+If the cluster admin decides to share the ALB `Ingress` subnets between different locations,
+traffic routed to backends (compute nodes) placed in a different zone than the one where
+it entered the Ingress/Load Balancer will drastically increase latency for end users.
+
+The ALB deployment is not covered by this documentation.
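+
+To verify that the sample application landed on a Local Zone node, you can
+inspect the edge nodes and the pod placement. A minimal check, using the node
+label shown earlier:
+
+```bash
+# List the edge nodes together with their zone label
+oc get nodes -l node-role.kubernetes.io/edge -L topology.kubernetes.io/zone
+
+# Confirm the demo pod was scheduled onto one of those nodes
+oc -n local-zone-demo get pods -o wide
+```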
+ + +[openshift-install]: https://docs.openshift.com/container-platform/4.11/installing/index.html +[aws-cli]: https://aws.amazon.com/cli/ +[aws-install-vpc]: https://docs.openshift.com/container-platform/4.11/installing/installing_aws/installing-aws-vpc.html +[aws-install-cloudformation]: https://docs.openshift.com/container-platform/4.11/installing/installing_aws/installing-aws-user-infra.html +[aws-local-zones]: https://aws.amazon.com/about-aws/global-infrastructure/localzones +[aws-local-zones-features]: https://aws.amazon.com/about-aws/global-infrastructure/localzones/features diff --git a/pkg/asset/installconfig/aws/availabilityzones.go b/pkg/asset/installconfig/aws/availabilityzones.go index 111234e1ce3..fcc30522bc2 100644 --- a/pkg/asset/installconfig/aws/availabilityzones.go +++ b/pkg/asset/installconfig/aws/availabilityzones.go @@ -7,10 +7,12 @@ import ( "github.com/aws/aws-sdk-go/aws/session" "github.com/aws/aws-sdk-go/service/ec2" "github.com/pkg/errors" + + typesaws "github.com/openshift/installer/pkg/types/aws" ) -// availabilityZones retrieves a list of availability zones for the given region. -func availabilityZones(ctx context.Context, session *session.Session, region string) ([]string, error) { +// describeAvailabilityZones retrieves a list of all zones for the given region. +func describeAvailabilityZones(ctx context.Context, session *session.Session, region string) ([]*ec2.AvailabilityZone, error) { client := ec2.New(session, aws.NewConfig().WithRegion(region)) resp, err := client.DescribeAvailabilityZonesWithContext(ctx, &ec2.DescribeAvailabilityZonesInput{ Filters: []*ec2.Filter{ @@ -25,12 +27,21 @@ func availabilityZones(ctx context.Context, session *session.Session, region str }, }) if err != nil { - return nil, errors.Wrap(err, "fetching availability zones") + return nil, errors.Wrap(err, "fetching zones") } + return resp.AvailabilityZones, nil +} + +// availabilityZones retrieves a list of zones type 'availability-zone' for the region. +func availabilityZones(ctx context.Context, session *session.Session, region string) ([]string, error) { + azs, err := describeAvailabilityZones(ctx, session, region) + if err != nil { + return nil, errors.Wrap(err, "fetching availability zones") + } zones := []string{} - for _, zone := range resp.AvailabilityZones { - if *zone.ZoneType == "availability-zone" { + for _, zone := range azs { + if *zone.ZoneType == typesaws.AvailabilityZoneType { zones = append(zones, *zone.ZoneName) } } diff --git a/pkg/asset/installconfig/aws/metadata.go b/pkg/asset/installconfig/aws/metadata.go index 827fae8ad5e..f0828b3008d 100644 --- a/pkg/asset/installconfig/aws/metadata.go +++ b/pkg/asset/installconfig/aws/metadata.go @@ -18,6 +18,7 @@ type Metadata struct { availabilityZones []string privateSubnets map[string]Subnet publicSubnets map[string]Subnet + edgeSubnets map[string]Subnet vpc string instanceTypes map[string]InstanceType @@ -25,7 +26,8 @@ type Metadata struct { Subnets []string `json:"subnets,omitempty"` Services []typesaws.ServiceEndpoint `json:"services,omitempty"` - mutex sync.Mutex + mutex sync.Mutex + mutexSubnets sync.Mutex } // NewMetadata initializes a new Metadata object. @@ -74,13 +76,22 @@ func (m *Metadata) AvailabilityZones(ctx context.Context) ([]string, error) { return m.availabilityZones, nil } +// EdgeSubnets retrieves subnet metadata indexed by subnet ID, for +// subnets that the cloud-provider logic considers to be edge +// (i.e. Local Zone). 
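+// The edge grouping is derived from the zone type reported by
+// DescribeAvailabilityZones; see the subnets helper in subnet.go.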
+func (m *Metadata) EdgeSubnets(ctx context.Context) (map[string]Subnet, error) {
+	err := m.populateSubnets(ctx)
+	if err != nil {
+		return nil, err
+	}
+
+	return m.edgeSubnets, nil
+}
+
 // PrivateSubnets retrieves subnet metadata indexed by subnet ID, for
 // subnets that the cloud-provider logic considers to be private
 // (i.e. not public).
 func (m *Metadata) PrivateSubnets(ctx context.Context) (map[string]Subnet, error) {
-	m.mutex.Lock()
-	defer m.mutex.Unlock()
-
 	err := m.populateSubnets(ctx)
 	if err != nil {
 		return nil, err
@@ -93,9 +104,6 @@ func (m *Metadata) PrivateSubnets(ctx context.Context) (map[string]Subnet, error
 // subnets that the cloud-provider logic considers to be public
 // (e.g. with suitable routing for hosting public load balancers).
 func (m *Metadata) PublicSubnets(ctx context.Context) (map[string]Subnet, error) {
-	m.mutex.Lock()
-	defer m.mutex.Unlock()
-
 	err := m.populateSubnets(ctx)
 	if err != nil {
 		return nil, err
@@ -113,12 +121,19 @@ func (m *Metadata) populateSubnets(ctx context.Context) error {
 		return errors.New("no subnets configured")
 	}

-	session, err := m.unlockedSession(ctx)
+	m.mutexSubnets.Lock()
+	defer m.mutexSubnets.Unlock()
+
+	session, err := m.Session(ctx)
 	if err != nil {
 		return err
 	}

-	m.vpc, m.privateSubnets, m.publicSubnets, err = subnets(ctx, session, m.Region, m.Subnets)
+	sb, err := subnets(ctx, session, m.Region, m.Subnets)
+	m.vpc = sb.VPC
+	m.privateSubnets = sb.Private
+	m.publicSubnets = sb.Public
+	m.edgeSubnets = sb.Edge
 	return err
 }
diff --git a/pkg/asset/installconfig/aws/subnet.go b/pkg/asset/installconfig/aws/subnet.go
index 4de700ec3a7..ac15be172b1 100644
--- a/pkg/asset/installconfig/aws/subnet.go
+++ b/pkg/asset/installconfig/aws/subnet.go
@@ -11,10 +11,15 @@ import (
 	"github.com/aws/aws-sdk-go/service/ec2"
 	"github.com/pkg/errors"
 	"github.com/sirupsen/logrus"
+
+	typesaws "github.com/openshift/installer/pkg/types/aws"
 )

 // Subnet holds metadata for a subnet.
 type Subnet struct {
+	// ID is the subnet's Identifier.
+	ID string
+
 	// ARN is the subnet's Amazon Resource Name.
 	ARN string

@@ -23,13 +28,47 @@ type Subnet struct {

 	// CIDR is the subnet's CIDR block.
 	CIDR string
+
+	// ZoneType is the type of the subnet's availability zone.
+	// The valid values are availability-zone and local-zone.
+	ZoneType string
+
+	// ZoneGroupName is the AWS zone group name.
+	// For Availability Zones, this parameter has the same value as the Region name.
+	//
+	// For Local Zones, the name of the associated group, for example us-west-2-lax-1.
+	ZoneGroupName string
+
+	// Public indicates whether the subnet is public.
+	Public bool
+
+	// PreferredEdgeInstanceType is the preferred instance type for the subnet's zone.
+	// It is used for edge pools, since zone groups do not all offer the same
+	// instance types.
+	PreferredEdgeInstanceType string
+}
+
+// Subnets is a map of Subnet metadata.
+type Subnets map[string]Subnet
+
+// SubnetGroups holds the subnets used by the installer, grouped by role.
+type SubnetGroups struct {
+	Public  Subnets
+	Private Subnets
+	Edge    Subnets
+	VPC     string
 }

 // subnets retrieves metadata for the given subnet(s).
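+// The result groups the subnets into public, private, and edge (Local Zone)
+// sets, along with the VPC ID they belong to.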
-func subnets(ctx context.Context, session *session.Session, region string, ids []string) (vpc string, private map[string]Subnet, public map[string]Subnet, err error) {
+func subnets(ctx context.Context, session *session.Session, region string, ids []string) (subnetGroups SubnetGroups, err error) {
 	metas := make(map[string]Subnet, len(ids))
-	private = map[string]Subnet{}
-	public = map[string]Subnet{}
+	zoneNames := make([]*string, 0, len(ids))
+	availabilityZones := make(map[string]*ec2.AvailabilityZone, len(ids))
+	subnetGroups = SubnetGroups{
+		Public:  make(map[string]Subnet, len(ids)),
+		Private: make(map[string]Subnet, len(ids)),
+		Edge:    make(map[string]Subnet, len(ids)),
+	}
+
 	var vpcFromSubnet string
 	client := ec2.New(session, aws.NewConfig().WithRegion(region))

@@ -60,19 +99,22 @@ func subnets(ctx context.Context, session *session.Session, region string, ids [
 				return false
 			}

-			if vpc == "" {
-				vpc = *subnet.VpcId
+			if subnetGroups.VPC == "" {
+				subnetGroups.VPC = *subnet.VpcId
 				vpcFromSubnet = *subnet.SubnetId
-			} else if *subnet.VpcId != vpc {
-				lastError = errors.Errorf("all subnets must belong to the same VPC: %s is from %s, but %s is from %s", *subnet.SubnetId, *subnet.VpcId, vpcFromSubnet, vpc)
+			} else if *subnet.VpcId != subnetGroups.VPC {
+				lastError = errors.Errorf("all subnets must belong to the same VPC: %s is from %s, but %s is from %s", *subnet.SubnetId, *subnet.VpcId, vpcFromSubnet, subnetGroups.VPC)
 				return false
 			}

 			metas[*subnet.SubnetId] = Subnet{
-				ARN:  *subnet.SubnetArn,
-				Zone: *subnet.AvailabilityZone,
-				CIDR: *subnet.CidrBlock,
+				ID:     *subnet.SubnetId,
+				ARN:    *subnet.SubnetArn,
+				Zone:   *subnet.AvailabilityZone,
+				CIDR:   *subnet.CidrBlock,
+				Public: false,
 			}
+			zoneNames = append(zoneNames, subnet.AvailabilityZone)
 		}
 		return !lastPage
 	},
@@ -81,7 +123,7 @@ func subnets(ctx context.Context, session *session.Session, region string, ids [
 		err = lastError
 	}
 	if err != nil {
-		return vpc, nil, nil, errors.Wrap(err, "describing subnets")
+		return subnetGroups, errors.Wrap(err, "describing subnets")
 	}

 	var routeTables []*ec2.RouteTable
 	err = client.DescribeRouteTablesPagesWithContext(
 		ctx,
 		&ec2.DescribeRouteTablesInput{
 			Filters: []*ec2.Filter{{
 				Name:   aws.String("vpc-id"),
-				Values: []*string{aws.String(vpc)},
+				Values: []*string{aws.String(subnetGroups.VPC)},
 			}},
 		},
 		func(results *ec2.DescribeRouteTablesOutput, lastPage bool) bool {
@@ -99,7 +141,15 @@ func subnets(ctx context.Context, session *session.Session, region string, ids [
 	)
 	if err != nil {
-		return vpc, nil, nil, errors.Wrap(err, "describing route tables")
+		return subnetGroups, errors.Wrap(err, "describing route tables")
+	}
+
+	azs, err := client.DescribeAvailabilityZonesWithContext(ctx, &ec2.DescribeAvailabilityZonesInput{ZoneNames: zoneNames})
+	if err != nil {
+		return subnetGroups, errors.Wrap(err, "describing availability zones")
+	}
+	for _, az := range azs.AvailabilityZones {
+		availabilityZones[*az.ZoneName] = az
 	}

 	publicOnlySubnets := os.Getenv("OPENSHIFT_INSTALL_AWS_PUBLIC_ONLY") != ""
@@ -107,30 +157,43 @@ func subnets(ctx context.Context, session *session.Session, region string, ids [
 	for _, id := range ids {
 		meta, ok := metas[id]
 		if !ok {
-			return vpc, nil, nil, errors.Errorf("failed to find %s", id)
+			return subnetGroups, errors.Errorf("failed to find %s", id)
 		}

 		isPublic, err := isSubnetPublic(routeTables, id)
 		if err != nil {
-			return vpc, nil, nil, err
+			return subnetGroups, err
 		}
-		if isPublic {
-			public[id] = meta
-		} else {
-			private[id] = meta
+
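+		// Attach the zone attributes (type and group name) to the subnet
+		// metadata before grouping it below.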
+		meta.Public = isPublic
+		meta.ZoneType = *availabilityZones[meta.Zone].ZoneType
+		meta.ZoneGroupName = *availabilityZones[meta.Zone].GroupName
+
+		// AWS Local Zones are grouped as Edge subnets
+		if meta.ZoneType == typesaws.LocalZoneType {
+			// Local Zones are supported only in public subnets
+			if !meta.Public {
+				return subnetGroups, errors.Errorf("subnet type local-zone must be associated with public route tables: subnet %s from availability zone %s[%s] is public[%v]", id, meta.Zone, meta.ZoneType, meta.Public)
+			}
+			subnetGroups.Edge[id] = meta
+			continue
 		}
-
-		// Let public subnets work as if they were private. This allows us to
-		// have clusters with public-only subnets without having to introduce a
-		// lot of changes in the installer. Such clusters can be used in a
-		// NAT-less GW scenario, therefore decreasing costs in cases where node
-		// security is not a concern (e.g, ephemeral clusters in CI)
-		if publicOnlySubnets && isPublic {
-			private[id] = meta
+		if meta.Public {
+			subnetGroups.Public[id] = meta
+
+			// Let public subnets work as if they were private. This allows us to
+			// have clusters with public-only subnets without having to introduce a
+			// lot of changes in the installer. Such clusters can be used in a
+			// NAT-less GW scenario, therefore decreasing costs in cases where node
+			// security is not a concern (e.g, ephemeral clusters in CI)
+			if publicOnlySubnets {
+				subnetGroups.Private[id] = meta
+			}
+			continue
 		}
+		// Subnet is grouped by default as private
+		subnetGroups.Private[id] = meta
 	}
-
-	return vpc, private, public, nil
+	return subnetGroups, nil
 }

 // https://github.com/kubernetes/kubernetes/blob/9f036cd43d35a9c41d7ac4ca82398a6d0bef957b/staging/src/k8s.io/legacy-cloud-providers/aws/aws.go#L3376-L3419
diff --git a/pkg/asset/installconfig/aws/validation.go b/pkg/asset/installconfig/aws/validation.go
index a7ecf9ca487..43191c8bcd3 100644
--- a/pkg/asset/installconfig/aws/validation.go
+++ b/pkg/asset/installconfig/aws/validation.go
@@ -49,12 +49,30 @@ func Validate(ctx context.Context, meta *Metadata, config *types.InstallConfig)
 	allErrs = append(allErrs, validatePlatform(ctx, meta, field.NewPath("platform", "aws"), config.Platform.AWS, config.Networking, config.Publish)...)

 	if config.ControlPlane != nil && config.ControlPlane.Platform.AWS != nil {
-		allErrs = append(allErrs, validateMachinePool(ctx, meta, field.NewPath("controlPlane", "platform", "aws"), config.Platform.AWS, config.ControlPlane.Platform.AWS, controlPlaneReq)...)
+		allErrs = append(allErrs, validateMachinePool(ctx, meta, field.NewPath("controlPlane", "platform", "aws"), config.Platform.AWS, config.ControlPlane.Platform.AWS, controlPlaneReq, "")...)
 	}
+
 	for idx, compute := range config.Compute {
 		fldPath := field.NewPath("compute").Index(idx)
+
+		// Pool-specific validation.
+		// Edge compute pool: AWS Local Zones are only valid when installing into an existing VPC.
+		if compute.Name == types.MachinePoolEdgeRoleName {
+			if len(config.Platform.AWS.Subnets) == 0 {
+				return errors.New(field.Required(fldPath, "invalid install config. edge machine pool is only valid when installing into an existing VPC").Error())
+			}
+			edgeSubnets, err := meta.EdgeSubnets(ctx)
+			if err != nil {
+				errMsg := fmt.Sprintf("%s pool. %v", compute.Name, err.Error())
+				return errors.New(field.Invalid(field.NewPath("platform", "aws", "subnets"), config.Platform.AWS.Subnets, errMsg).Error())
+			}
+			if len(edgeSubnets) == 0 {
+				return errors.New(field.Required(fldPath, "invalid install config. There are no valid subnets for edge machine pool").Error())
+			}
+		}
+
 		if compute.Platform.AWS != nil {
-			allErrs = append(allErrs, validateMachinePool(ctx, meta, fldPath.Child("platform", "aws"), config.Platform.AWS, compute.Platform.AWS, computeReq)...)
+			allErrs = append(allErrs, validateMachinePool(ctx, meta, fldPath.Child("platform", "aws"), config.Platform.AWS, compute.Platform.AWS, computeReq, compute.Name)...)
 		}
 	}
 	return allErrs.ToAggregate()
@@ -74,7 +92,7 @@ func validatePlatform(ctx context.Context, meta *Metadata, fldPath *field.Path,
 		allErrs = append(allErrs, validateSubnets(ctx, meta, fldPath.Child("subnets"), platform.Subnets, networking, publish)...)
 	}
 	if platform.DefaultMachinePlatform != nil {
-		allErrs = append(allErrs, validateMachinePool(ctx, meta, fldPath.Child("defaultMachinePlatform"), platform, platform.DefaultMachinePlatform, controlPlaneReq)...)
+		allErrs = append(allErrs, validateMachinePool(ctx, meta, fldPath.Child("defaultMachinePlatform"), platform, platform.DefaultMachinePlatform, controlPlaneReq, "")...)
 	}
 	return allErrs
 }
@@ -153,10 +171,22 @@ func validateSubnets(ctx context.Context, meta *Metadata, fldPath *field.Path, s
 		}
 	}

+	edgeSubnets, err := meta.EdgeSubnets(ctx)
+	if err != nil {
+		return append(allErrs, field.Invalid(fldPath, subnets, err.Error()))
+	}
+	edgeSubnetsIdx := map[string]int{}
+	for idx, id := range subnets {
+		if _, ok := edgeSubnets[id]; ok {
+			edgeSubnetsIdx[id] = idx
+		}
+	}
+
 	allErrs = append(allErrs, validateSubnetCIDR(fldPath, privateSubnets, privateSubnetsIdx, networking.MachineNetwork)...)
 	allErrs = append(allErrs, validateSubnetCIDR(fldPath, publicSubnets, publicSubnetsIdx, networking.MachineNetwork)...)
 	allErrs = append(allErrs, validateDuplicateSubnetZones(fldPath, privateSubnets, privateSubnetsIdx, "private")...)
 	allErrs = append(allErrs, validateDuplicateSubnetZones(fldPath, publicSubnets, publicSubnetsIdx, "public")...)
+	allErrs = append(allErrs, validateDuplicateSubnetZones(fldPath, edgeSubnets, edgeSubnetsIdx, "edge")...)
privateZones := sets.NewString() publicZones := sets.NewString() @@ -174,16 +204,23 @@ func validateSubnets(ctx context.Context, meta *Metadata, fldPath *field.Path, s return allErrs } -func validateMachinePool(ctx context.Context, meta *Metadata, fldPath *field.Path, platform *awstypes.Platform, pool *awstypes.MachinePool, req resourceRequirements) field.ErrorList { +func validateMachinePool(ctx context.Context, meta *Metadata, fldPath *field.Path, platform *awstypes.Platform, pool *awstypes.MachinePool, req resourceRequirements, poolName string) field.ErrorList { allErrs := field.ErrorList{} if len(pool.Zones) > 0 { availableZones := sets.String{} if len(platform.Subnets) > 0 { - privateSubnets, err := meta.PrivateSubnets(ctx) + var err error + var subnets Subnets + switch poolName { + case types.MachinePoolEdgeRoleName: + subnets, err = meta.EdgeSubnets(ctx) + default: + subnets, err = meta.PrivateSubnets(ctx) + } if err != nil { return append(allErrs, field.InternalError(fldPath, err)) } - for _, subnet := range privateSubnets { + for _, subnet := range subnets { availableZones.Insert(subnet.Zone) } } else { diff --git a/pkg/asset/installconfig/aws/validation_test.go b/pkg/asset/installconfig/aws/validation_test.go index 20b5089a87c..808598b1d0d 100644 --- a/pkg/asset/installconfig/aws/validation_test.go +++ b/pkg/asset/installconfig/aws/validation_test.go @@ -4,6 +4,7 @@ import ( "context" "fmt" "os" + "sort" "testing" "github.com/aws/aws-sdk-go/service/route53" @@ -75,6 +76,7 @@ func validInstallConfig() *types.InstallConfig { }, }, Compute: []types.MachinePool{{ + Name: types.MachinePoolComputeRoleName, Architecture: types.ArchitectureAMD64, Replicas: pointer.Int64Ptr(3), Platform: types.MachinePoolPlatform{ @@ -89,10 +91,33 @@ func validInstallConfig() *types.InstallConfig { } } +func validInstallConfigEdge() *types.InstallConfig { + ic := validInstallConfig() + edgeSubnets := validEdgeSubnets() + for subnet := range edgeSubnets { + ic.Platform.AWS.Subnets = append(ic.Platform.AWS.Subnets, subnet) + } + ic.Compute = append(ic.Compute, types.MachinePool{ + Name: types.MachinePoolEdgeRoleName, + Platform: types.MachinePoolPlatform{ + AWS: &aws.MachinePool{}, + }, + }) + return ic +} + func validAvailZones() []string { return []string{"a", "b", "c"} } +func validAvailZonesWithEdge() []string { + return []string{"a", "b", "c", "edge-a", "edge-b", "edge-c"} +} + +func validAvailZonesOnlyEdge() []string { + return []string{"edge-a", "edge-b", "edge-c"} +} + func validPrivateSubnets() map[string]Subnet { return map[string]Subnet{ "valid-private-subnet-a": { @@ -127,6 +152,23 @@ func validPublicSubnets() map[string]Subnet { } } +func validEdgeSubnets() map[string]Subnet { + return map[string]Subnet{ + "valid-public-subnet-edge-a": { + Zone: "edge-a", + CIDR: "10.0.7.0/24", + }, + "valid-public-subnet-edge-b": { + Zone: "edge-b", + CIDR: "10.0.8.0/24", + }, + "valid-public-subnet-edge-c": { + Zone: "edge-c", + CIDR: "10.0.9.0/24", + }, + } +} + func validServiceEndpoints() []aws.ServiceEndpoint { return []aws.ServiceEndpoint{{ Name: "ec2", @@ -211,6 +253,7 @@ func TestValidate(t *testing.T) { availZones []string privateSubnets map[string]Subnet publicSubnets map[string]Subnet + edgeSubnets map[string]Subnet instanceTypes map[string]InstanceType proxy string expectErr string @@ -244,6 +287,13 @@ func TestValidate(t *testing.T) { availZones: validAvailZones(), privateSubnets: validPrivateSubnets(), publicSubnets: validPublicSubnets(), + }, { + name: "valid byo", + installConfig: 
validInstallConfigEdge(),
+			availZones:     validAvailZones(),
+			privateSubnets: validPrivateSubnets(),
+			publicSubnets:  validPublicSubnets(),
+			edgeSubnets:    validEdgeSubnets(),
 		}, {
 			name: "valid byo",
 			installConfig: func() *types.InstallConfig {
@@ -431,6 +481,66 @@ func TestValidate(t *testing.T) {
 				return s
 			}(),
 			expectErr: `^platform\.aws\.subnets\[6\]: Invalid value: \"valid-public-zone-c-2\": public subnet valid-public-subnet-c is also in zone c$`,
+		}, {
+			name: "invalid multiple public edge in same zone",
+			installConfig: func() *types.InstallConfig {
+				c := validInstallConfigEdge()
+				c.Platform.AWS.Subnets = append(c.Platform.AWS.Subnets, "valid-public-zone-edge-c-2")
+				return c
+			}(),
+			availZones:     validAvailZonesWithEdge(),
+			privateSubnets: validPrivateSubnets(),
+			publicSubnets:  validPublicSubnets(),
+			edgeSubnets: func() map[string]Subnet {
+				s := validEdgeSubnets()
+				s["valid-public-zone-edge-c-2"] = Subnet{
+					Zone:     "edge-c",
+					CIDR:     "10.0.9.0/24",
+					ZoneType: aws.LocalZoneType,
+				}
+				return s
+			}(),
+			expectErr: `^platform\.aws\.subnets\[9\]: Invalid value: \"valid-public-zone-edge-c-2\": edge subnet valid-public-subnet-edge-c is also in zone edge-c$`,
+		}, {
+			name: "invalid edge pool missing subnets",
+			installConfig: func() *types.InstallConfig {
+				c := validInstallConfigEdge()
+				c.Platform.AWS.Subnets = []string{}
+				return c
+			}(),
+			availZones:     validAvailZonesWithEdge(),
+			privateSubnets: validPrivateSubnets(),
+			publicSubnets:  validPublicSubnets(),
+			edgeSubnets:    validEdgeSubnets(),
+			expectErr:      `^compute\[1\]: Required value: invalid install config\. edge machine pool is only valid when installing into an existing VPC$`,
+		}, {
+			name: "invalid edge pool missing edge subnets",
+			installConfig: func() *types.InstallConfig {
+				c := validInstallConfigEdge()
+				return c
+			}(),
+			availZones:     validAvailZonesWithEdge(),
+			privateSubnets: validPrivateSubnets(),
+			publicSubnets:  validPublicSubnets(),
+			edgeSubnets:    map[string]Subnet{},
+			expectErr:      `^compute\[1\]: Required value: invalid install config\. There are no valid subnets for edge machine pool$`,
+		}, {
+			name: "invalid edge pool missing subnets on regular zones",
+			installConfig: func() *types.InstallConfig {
+				c := validInstallConfigEdge()
+				c.Platform.AWS.Subnets = []string{}
+				edgeSubnets := validEdgeSubnets()
+				for subnet := range edgeSubnets {
+					c.Platform.AWS.Subnets = append(c.Platform.AWS.Subnets, subnet)
+				}
+				sort.Strings(c.Platform.AWS.Subnets)
+				return c
+			}(),
+			availZones:     validAvailZonesOnlyEdge(),
+			privateSubnets: map[string]Subnet{},
+			publicSubnets:  map[string]Subnet{},
+			edgeSubnets:    validEdgeSubnets(),
+			expectErr: `^platform\.aws\.subnets: Invalid value: \[\]string{\"valid-public-subnet-edge-a\", \"valid-public-subnet-edge-b\", \"valid-public-subnet-edge-c\"}: edge pool. 
no subnets configured$`, }, { name: "invalid no subnet for control plane zones", installConfig: func() *types.InstallConfig { @@ -637,6 +747,7 @@ func TestValidate(t *testing.T) { availabilityZones: test.availZones, privateSubnets: test.privateSubnets, publicSubnets: test.publicSubnets, + edgeSubnets: test.edgeSubnets, instanceTypes: test.instanceTypes, } if test.proxy != "" { diff --git a/pkg/asset/installconfig/installconfig.go b/pkg/asset/installconfig/installconfig.go index 9049b1cc3cf..3ffda12e6a3 100644 --- a/pkg/asset/installconfig/installconfig.go +++ b/pkg/asset/installconfig/installconfig.go @@ -2,6 +2,7 @@ package installconfig import ( "context" + "fmt" "github.com/pkg/errors" metav1 "k8s.io/apimachinery/pkg/apis/meta/v1" @@ -124,9 +125,32 @@ func (a *InstallConfig) Load(f asset.FileFetcher) (found bool, err error) { return found, err } +// finishAWS set defaults for AWS Platform before the config validation. +func (a *InstallConfig) finishAWS() error { + // Set the Default Edge Compute pool when the subnets are defined. + // Edge Compute Pool/AWS Local Zones is supported only when installing in existing VPC. + if len(a.Config.Platform.AWS.Subnets) > 0 { + edgeSubnets, err := a.AWS.EdgeSubnets(context.TODO()) + if err != nil { + return errors.Wrap(err, fmt.Sprintf("unable to load edge subnets: %v", err)) + } + totalEdgeSubnets := int64(len(edgeSubnets)) + if totalEdgeSubnets == 0 { + return nil + } + if edgePool := defaults.CreateEdgeMachinePoolDefaults(a.Config.Compute, a.Config.Platform.Name(), totalEdgeSubnets); edgePool != nil { + a.Config.Compute = append(a.Config.Compute, *edgePool) + } + } + return nil +} + func (a *InstallConfig) finish(filename string) error { if a.Config.AWS != nil { a.AWS = aws.NewMetadata(a.Config.Platform.AWS.Region, a.Config.Platform.AWS.Subnets, a.Config.AWS.ServiceEndpoints) + if err := a.finishAWS(); err != nil { + return err + } } if a.Config.AlibabaCloud != nil { a.AlibabaCloud = alibabacloud.NewMetadata(a.Config.AlibabaCloud.Region, a.Config.AlibabaCloud.VSwitchIDs) diff --git a/pkg/asset/machines/aws/machines.go b/pkg/asset/machines/aws/machines.go index 72c06e8aa48..4156966020c 100644 --- a/pkg/asset/machines/aws/machines.go +++ b/pkg/asset/machines/aws/machines.go @@ -18,6 +18,21 @@ import ( "github.com/openshift/installer/pkg/types/aws" ) +type machineProviderInput struct { + clusterID string + region string + subnet string + instanceType string + osImage string + zone string + role string + userDataSecret string + root *aws.EC2RootVolume + imds aws.EC2Metadata + userTags map[string]string + publicSubnet bool +} + // Machines returns a list of machines for a machinepool. 
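+// For the control plane, it also returns the ControlPlaneMachineSet built
+// from the same provider configuration.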
func Machines(clusterID string, region string, subnets map[string]string, pool *types.MachinePool, role, userDataSecret string, userTags map[string]string) ([]machineapi.Machine, *machinev1.ControlPlaneMachineSet, error) { if poolPlatform := pool.Platform.Name(); poolPlatform != aws.Name { @@ -37,19 +52,20 @@ func Machines(clusterID string, region string, subnets map[string]string, pool * if len(subnets) > 0 && !ok { return nil, nil, errors.Errorf("no subnet for zone %s", zone) } - provider, err := provider( - clusterID, - region, - subnet, - mpool.InstanceType, - &mpool.EC2RootVolume, - mpool.EC2Metadata, - mpool.AMIID, - zone, - role, - userDataSecret, - userTags, - ) + provider, err := provider(&machineProviderInput{ + clusterID: clusterID, + region: region, + subnet: subnet, + instanceType: mpool.InstanceType, + osImage: mpool.AMIID, + zone: zone, + role: role, + userDataSecret: userDataSecret, + root: &mpool.EC2RootVolume, + imds: mpool.EC2Metadata, + userTags: userTags, + publicSubnet: false, + }) if err != nil { return nil, nil, errors.Wrap(err, "failed to create provider") } @@ -98,7 +114,7 @@ func Machines(clusterID string, region string, subnets map[string]string, pool * }} } else { domain.Subnet.Type = machinev1.AWSIDReferenceType - domain.Subnet.ID = pointer.StringPtr(subnet) + domain.Subnet.ID = pointer.String(subnet) } failureDomains = append(failureDomains, domain) } @@ -153,8 +169,8 @@ func Machines(clusterID string, region string, subnets map[string]string, pool * return machines, controlPlaneMachineSet, nil } -func provider(clusterID string, region string, subnet string, instanceType string, root *aws.EC2RootVolume, imds aws.EC2Metadata, osImage string, zone, role, userDataSecret string, userTags map[string]string) (*machineapi.AWSMachineProviderConfig, error) { - tags, err := tagsFromUserTags(clusterID, userTags) +func provider(in *machineProviderInput) (*machineapi.AWSMachineProviderConfig, error) { + tags, err := tagsFromUserTags(in.clusterID, in.userTags) if err != nil { return nil, errors.Wrap(err, "failed to create machineapi.TagSpecifications from UserTags") } @@ -163,51 +179,59 @@ func provider(clusterID string, region string, subnet string, instanceType strin APIVersion: "machine.openshift.io/v1beta1", Kind: "AWSMachineProviderConfig", }, - InstanceType: instanceType, + InstanceType: in.instanceType, BlockDevices: []machineapi.BlockDeviceMappingSpec{ { EBS: &machineapi.EBSBlockDeviceSpec{ - VolumeType: pointer.StringPtr(root.Type), - VolumeSize: pointer.Int64Ptr(int64(root.Size)), - Iops: pointer.Int64Ptr(int64(root.IOPS)), - Encrypted: pointer.BoolPtr(true), - KMSKey: machineapi.AWSResourceReference{ARN: pointer.StringPtr(root.KMSKeyARN)}, + VolumeType: pointer.String(in.root.Type), + VolumeSize: pointer.Int64(int64(in.root.Size)), + Iops: pointer.Int64(int64(in.root.IOPS)), + Encrypted: pointer.Bool(true), + KMSKey: machineapi.AWSResourceReference{ARN: pointer.String(in.root.KMSKeyARN)}, }, }, }, - Tags: tags, - IAMInstanceProfile: &machineapi.AWSResourceReference{ID: pointer.StringPtr(fmt.Sprintf("%s-%s-profile", clusterID, role))}, - UserDataSecret: &corev1.LocalObjectReference{Name: userDataSecret}, - CredentialsSecret: &corev1.LocalObjectReference{Name: "aws-cloud-credentials"}, - Placement: machineapi.Placement{Region: region, AvailabilityZone: zone}, + Tags: tags, + IAMInstanceProfile: &machineapi.AWSResourceReference{ + ID: pointer.String(fmt.Sprintf("%s-%s-profile", in.clusterID, in.role)), + }, + UserDataSecret: &corev1.LocalObjectReference{Name: 
in.userDataSecret}, + CredentialsSecret: &corev1.LocalObjectReference{Name: "aws-cloud-credentials"}, + Placement: machineapi.Placement{Region: in.region, AvailabilityZone: in.zone}, SecurityGroups: []machineapi.AWSResourceReference{{ Filters: []machineapi.Filter{{ Name: "tag:Name", - Values: []string{fmt.Sprintf("%s-%s-sg", clusterID, role)}, + Values: []string{fmt.Sprintf("%s-%s-sg", in.clusterID, in.role)}, }}, }}, } - if subnet == "" { + subnetName := fmt.Sprintf("%s-private-%s", in.clusterID, in.zone) + if in.publicSubnet { + config.PublicIP = pointer.Bool(in.publicSubnet) + subnetName = fmt.Sprintf("%s-public-%s", in.clusterID, in.zone) + } + + if in.subnet == "" { config.Subnet.Filters = []machineapi.Filter{{ Name: "tag:Name", - Values: []string{fmt.Sprintf("%s-private-%s", clusterID, zone)}, + Values: []string{subnetName}, }} } else { - config.Subnet.ID = pointer.StringPtr(subnet) + config.Subnet.ID = pointer.String(in.subnet) } - if osImage == "" { + if in.osImage == "" { config.AMI.Filters = []machineapi.Filter{{ Name: "tag:Name", - Values: []string{fmt.Sprintf("%s-ami-%s", clusterID, region)}, + Values: []string{fmt.Sprintf("%s-ami-%s", in.clusterID, in.region)}, }} } else { - config.AMI.ID = pointer.StringPtr(osImage) + config.AMI.ID = pointer.String(in.osImage) } - if imds.Authentication != "" { - config.MetadataServiceOptions.Authentication = machineapi.MetadataServiceAuthentication(imds.Authentication) + if in.imds.Authentication != "" { + config.MetadataServiceOptions.Authentication = machineapi.MetadataServiceAuthentication(in.imds.Authentication) } return config, nil diff --git a/pkg/asset/machines/aws/machinesets.go b/pkg/asset/machines/aws/machinesets.go index 2d3f006fb93..ded52fd70d6 100644 --- a/pkg/asset/machines/aws/machinesets.go +++ b/pkg/asset/machines/aws/machinesets.go @@ -5,16 +5,18 @@ import ( "fmt" "github.com/pkg/errors" + corev1 "k8s.io/api/core/v1" metav1 "k8s.io/apimachinery/pkg/apis/meta/v1" "k8s.io/apimachinery/pkg/runtime" machineapi "github.com/openshift/api/machine/v1beta1" + icaws "github.com/openshift/installer/pkg/asset/installconfig/aws" "github.com/openshift/installer/pkg/types" "github.com/openshift/installer/pkg/types/aws" ) // MachineSets returns a list of machinesets for a machinepool. 
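+// For the edge pool, each MachineSet adds the edge node-role and zone labels
+// to the nodes and applies a NoSchedule taint, so only workloads with a
+// matching toleration are scheduled onto Local Zone machines.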
-func MachineSets(clusterID string, region string, subnets map[string]string, pool *types.MachinePool, role, userDataSecret string, userTags map[string]string) ([]*machineapi.MachineSet, error) { +func MachineSets(clusterID string, region string, subnets icaws.Subnets, pool *types.MachinePool, role, userDataSecret string, userTags map[string]string) ([]*machineapi.MachineSet, error) { if poolPlatform := pool.Platform.Name(); poolPlatform != aws.Name { return nil, fmt.Errorf("non-AWS machine-pool: %q", poolPlatform) } @@ -32,28 +34,62 @@ func MachineSets(clusterID string, region string, subnets map[string]string, poo if int64(idx) < total%numOfAZs { replicas++ } - subnet, ok := subnets[az] if len(subnets) > 0 && !ok { return nil, errors.Errorf("no subnet for zone %s", az) } - provider, err := provider( - clusterID, - region, - subnet, - mpool.InstanceType, - &mpool.EC2RootVolume, - mpool.EC2Metadata, - mpool.AMIID, - az, - role, - userDataSecret, - userTags, - ) + + publicSubnet := subnet.Public + instanceType := mpool.InstanceType + nodeLabels := make(map[string]string, 3) + nodeTaints := []corev1.Taint{} + + if pool.Name == types.MachinePoolEdgeRoleName { + // edge pools typically do not receive the same workloads between + // different zoneGroups, thus the installer will discover preferred + // instance based on the installer's preferred instance lookup. + if subnet.PreferredEdgeInstanceType != "" { + instanceType = subnet.PreferredEdgeInstanceType + } + nodeLabels = map[string]string{ + "node-role.kubernetes.io/edge": "", + "machine.openshift.io/zone-type": subnet.ZoneType, + "machine.openshift.io/zone-group": subnet.ZoneGroupName, + } + nodeTaints = append(nodeTaints, corev1.Taint{ + Key: "node-role.kubernetes.io/edge", + Effect: "NoSchedule", + }) + } + + provider, err := provider(&machineProviderInput{ + clusterID: clusterID, + region: region, + subnet: subnet.ID, + instanceType: instanceType, + osImage: mpool.AMIID, + zone: az, + role: "worker", + userDataSecret: userDataSecret, + root: &mpool.EC2RootVolume, + imds: mpool.EC2Metadata, + userTags: userTags, + publicSubnet: publicSubnet, + }) if err != nil { return nil, errors.Wrap(err, "failed to create provider") } name := fmt.Sprintf("%s-%s-%s", clusterID, pool.Name, az) + spec := machineapi.MachineSpec{ + ProviderSpec: machineapi.ProviderSpec{ + Value: &runtime.RawExtension{Object: provider}, + }, + ObjectMeta: machineapi.ObjectMeta{ + Labels: nodeLabels, + }, + Taints: nodeTaints, + } + mset := &machineapi.MachineSet{ TypeMeta: metav1.TypeMeta{ APIVersion: "machine.openshift.io/v1beta1", @@ -83,12 +119,8 @@ func MachineSets(clusterID string, region string, subnets map[string]string, poo "machine.openshift.io/cluster-api-machine-type": role, }, }, - Spec: machineapi.MachineSpec{ - ProviderSpec: machineapi.ProviderSpec{ - Value: &runtime.RawExtension{Object: provider}, - }, - // we don't need to set Versions, because we control those via cluster operators. - }, + Spec: spec, + // we don't need to set Versions, because we control those via cluster operators. 
}, }, } diff --git a/pkg/asset/machines/master.go b/pkg/asset/machines/master.go index 93ee392272f..ba089dcc3ad 100644 --- a/pkg/asset/machines/master.go +++ b/pkg/asset/machines/master.go @@ -15,6 +15,7 @@ import ( "k8s.io/apimachinery/pkg/runtime" "k8s.io/apimachinery/pkg/runtime/serializer" + configv1 "github.com/openshift/api/config/v1" machinev1 "github.com/openshift/api/machine/v1" machinev1alpha1 "github.com/openshift/api/machine/v1alpha1" machinev1beta1 "github.com/openshift/api/machine/v1beta1" @@ -46,6 +47,7 @@ import ( "github.com/openshift/installer/pkg/types" alibabacloudtypes "github.com/openshift/installer/pkg/types/alibabacloud" awstypes "github.com/openshift/installer/pkg/types/aws" + awsdefaults "github.com/openshift/installer/pkg/types/aws/defaults" azuretypes "github.com/openshift/installer/pkg/types/azure" azuredefaults "github.com/openshift/installer/pkg/types/azure/defaults" baremetaltypes "github.com/openshift/installer/pkg/types/baremetal" @@ -112,12 +114,6 @@ const ( controlPlaneMachineSetFileName = "99_openshift-machine-api_master-control-plane-machine-set.yaml" ) -// AWS specific constants. -const ( - defaultAWSInstanceSize = "xlarge" - defaultAWSSingleNodeControlPlaneInstanceSize = "2xlarge" -) - var ( secretFileNamePattern = fmt.Sprintf(secretFileName, "*") networkConfigSecretFileNamePattern = fmt.Sprintf(networkConfigSecretFileName, "*") @@ -209,7 +205,7 @@ func (m *Master) Generate(dependencies asset.Parents) error { } } - mpool := defaultAWSMachinePoolPlatform() + mpool := defaultAWSMachinePoolPlatform("master") osImage := strings.SplitN(string(*rhcosImage), ",", 2) osImageID := osImage[0] @@ -236,21 +232,14 @@ func (m *Master) Generate(dependencies asset.Parents) error { } if mpool.InstanceType == "" { - // If the control plane is single node, we need to use a larger - // instance type for that node, as the minimum requirement for - // single-node control-plane nodes is 8 cores, and xlarge only has - // 4. 
Unfortunately 2xlarge has twice as much RAM as we need, but - // we default to it because AWS doesn't offer an 8-core 16GiB - // instance type - instanceSize := defaultAWSInstanceSize + topology := configv1.HighlyAvailableTopologyMode if pool.Replicas != nil && *pool.Replicas == 1 { - instanceSize = defaultAWSSingleNodeControlPlaneInstanceSize + topology = configv1.SingleReplicaTopologyMode } - - mpool.InstanceType, err = aws.PreferredInstanceType(ctx, installConfig.AWS, awsDefaultMachineTypes(installConfig.Config.Platform.AWS.Region, installConfig.Config.ControlPlane.Architecture, instanceSize), mpool.Zones) + mpool.InstanceType, err = aws.PreferredInstanceType(ctx, installConfig.AWS, awsdefaults.InstanceTypes(installConfig.Config.Platform.AWS.Region, installConfig.Config.ControlPlane.Architecture, topology), mpool.Zones) if err != nil { logrus.Warn(errors.Wrap(err, "failed to find default instance type")) - mpool.InstanceType = awsDefaultMachineTypes(installConfig.Config.Platform.AWS.Region, installConfig.Config.ControlPlane.Architecture, instanceSize)[0] + mpool.InstanceType = awsdefaults.InstanceTypes(installConfig.Config.Platform.AWS.Region, installConfig.Config.ControlPlane.Architecture, topology)[0] } } diff --git a/pkg/asset/machines/worker.go b/pkg/asset/machines/worker.go index e26578215df..a38c8ad4a2b 100644 --- a/pkg/asset/machines/worker.go +++ b/pkg/asset/machines/worker.go @@ -14,6 +14,7 @@ import ( "k8s.io/apimachinery/pkg/runtime/serializer" "k8s.io/apimachinery/pkg/util/intstr" + configv1 "github.com/openshift/api/config/v1" machinev1 "github.com/openshift/api/machine/v1" machinev1alpha1 "github.com/openshift/api/machine/v1alpha1" machinev1beta1 "github.com/openshift/api/machine/v1beta1" @@ -26,6 +27,7 @@ import ( "github.com/openshift/installer/pkg/asset" "github.com/openshift/installer/pkg/asset/ignition/machine" "github.com/openshift/installer/pkg/asset/installconfig" + icaws "github.com/openshift/installer/pkg/asset/installconfig/aws" icazure "github.com/openshift/installer/pkg/asset/installconfig/azure" "github.com/openshift/installer/pkg/asset/machines/alibabacloud" "github.com/openshift/installer/pkg/asset/machines/aws" @@ -87,10 +89,18 @@ var ( _ asset.WritableAsset = (*Worker)(nil) ) -func defaultAWSMachinePoolPlatform() awstypes.MachinePool { +func defaultAWSMachinePoolPlatform(poolName string) awstypes.MachinePool { + defaultEBSType := awstypes.VolumeTypeGp3 + + // gp3 is not offered in all local-zones locations used by Edge Pools. + // Once it is available, it can be used as default for all machine pools. + // https://aws.amazon.com/about-aws/global-infrastructure/localzones/features + if poolName == types.MachinePoolEdgeRoleName { + defaultEBSType = awstypes.VolumeTypeGp2 + } return awstypes.MachinePool{ EC2RootVolume: awstypes.EC2RootVolume{ - Type: "gp3", + Type: defaultEBSType, Size: decimalRootVolumeSize, }, } @@ -182,14 +192,25 @@ func defaultNutanixMachinePoolPlatform() nutanixtypes.MachinePool { } } -func awsDefaultMachineTypes(region string, arch types.Architecture, instanceSize string) []string { - classes := awsdefaults.InstanceClasses(region, arch) +// awsDiscoveryPreferredEdgeInstanceByZone discover supported instanceType for each subnet's +// zone using the preferred list of instances allowed for OCP. 
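+// The subnets map is keyed by zone name; each entry is updated in place with
+// the instance type selected for its zone.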
+func awsDiscoveryPreferredEdgeInstanceByZone(ctx context.Context, defaultTypes []string, meta *icaws.Metadata, subnets icaws.Subnets) (ok bool, err error) {
+	for zone, subnet := range subnets {
+		preferredType, err := aws.PreferredInstanceType(ctx, meta, defaultTypes, []string{zone})
+		if err != nil {
+			logrus.Warn(errors.Wrap(err, fmt.Sprintf("unable to select an instanceType for the zone[%v] from the preferred list: %v. You must update the MachineSet manifest", zone, defaultTypes)))
+			continue
+		}
+
+		subnet.PreferredEdgeInstanceType = preferredType
+		subnets[zone] = subnet
 	}
-	types := make([]string, len(classes))
-	for i, c := range classes {
-		types[i] = fmt.Sprintf("%s.%s", c, instanceSize)
-	}
-	return types
+	return true, nil
 }

 // Worker generates the machinesets for `worker` machine pool.
@@ -305,18 +326,31 @@ func (w *Worker) Generate(dependencies asset.Parents) error {
 			machineSets = append(machineSets, set)
 		}
 	case awstypes.Name:
-		subnets := map[string]string{}
+		subnets := icaws.Subnets{}
 		if len(ic.Platform.AWS.Subnets) > 0 {
-			subnetMeta, err := installConfig.AWS.PrivateSubnets(ctx)
-			if err != nil {
-				return err
+			var subnetsMeta icaws.Subnets
+			switch pool.Name {
+			case types.MachinePoolEdgeRoleName:
+				subnetsMeta, err = installConfig.AWS.EdgeSubnets(ctx)
+				if err != nil {
+					return err
+				}
+				if *pool.Replicas == 0 {
+					sbCount := int64(len(subnetsMeta))
+					pool.Replicas = &sbCount
+				}
+			default:
+				subnetsMeta, err = installConfig.AWS.PrivateSubnets(ctx)
+				if err != nil {
+					return err
+				}
 			}
-			for id, subnet := range subnetMeta {
-				subnets[subnet.Zone] = id
+			for _, subnet := range subnetsMeta {
+				subnets[subnet.Zone] = subnet
 			}
 		}

-		mpool := defaultAWSMachinePoolPlatform()
+		mpool := defaultAWSMachinePoolPlatform(pool.Name)

 		osImage := strings.SplitN(string(*rhcosImage), ",", 2)
 		osImageID := osImage[0]
@@ -343,11 +377,24 @@ func (w *Worker) Generate(dependencies asset.Parents) error {
 		}

 		if mpool.InstanceType == "" {
-			instanceSize := defaultAWSInstanceSize
-			mpool.InstanceType, err = aws.PreferredInstanceType(ctx, installConfig.AWS, awsDefaultMachineTypes(installConfig.Config.Platform.AWS.Region, installConfig.Config.ControlPlane.Architecture, instanceSize), mpool.Zones)
-			if err != nil {
-				logrus.Warn(errors.Wrap(err, "failed to find default instance type"))
-				mpool.InstanceType = awsDefaultMachineTypes(installConfig.Config.Platform.AWS.Region, installConfig.Config.ControlPlane.Architecture, instanceSize)[0]
+			instanceTypes := awsdefaults.InstanceTypes(installConfig.Config.Platform.AWS.Region, installConfig.Config.ControlPlane.Architecture, configv1.HighlyAvailableTopologyMode)
+
+			switch pool.Name {
+			case types.MachinePoolEdgeRoleName:
+				ok, err := awsDiscoveryPreferredEdgeInstanceByZone(ctx, instanceTypes, installConfig.AWS, subnets)
+				if err != nil {
+					return errors.Wrap(err, "failed to find a default instance type for the edge pool; you must define the instance type in the compute pool")
+				}
+				if !ok {
+					logrus.Warn(errors.Wrap(err, "failed to find a preferred instance type for the edge pool, using the default"))
+					mpool.InstanceType = instanceTypes[0]
+				}
+			default:
+				mpool.InstanceType, err = aws.PreferredInstanceType(ctx, installConfig.AWS, instanceTypes, mpool.Zones)
+				if err != nil {
+					logrus.Warn(errors.Wrap(err, "failed to find default instance type"))
+					mpool.InstanceType = instanceTypes[0]
+				}
 			}
 		}
 		// if the list of zones is the default we need to try to filter the list in case there are some zones where the instance might not be available
to filter the list in case there are some zones where the instance might not be available
@@ -364,7 +411,7 @@ func (w *Worker) Generate(dependencies asset.Parents) error {
 			installConfig.Config.Platform.AWS.Region,
 			subnets,
 			&pool,
-			"worker",
+			pool.Name,
 			workerUserDataSecretName,
 			installConfig.Config.Platform.AWS.UserTags,
 		)
diff --git a/pkg/asset/machines/worker_test.go b/pkg/asset/machines/worker_test.go
index 4abb12eec7b..88ffeceeb92 100644
--- a/pkg/asset/machines/worker_test.go
+++ b/pkg/asset/machines/worker_test.go
@@ -233,3 +233,52 @@ func TestComputeIsNotModified(t *testing.T) {
 		t.Fatalf("compute in the install config has been modified")
 	}
 }
+
+func TestDefaultAWSMachinePoolPlatform(t *testing.T) {
+	type testCase struct {
+		name                string
+		poolName            string
+		expectedMachinePool awstypes.MachinePool
+		assert              func(tc *testCase)
+	}
+	cases := []testCase{
+		{
+			name:     "default EBS type for compute pool",
+			poolName: types.MachinePoolComputeRoleName,
+			expectedMachinePool: awstypes.MachinePool{
+				EC2RootVolume: awstypes.EC2RootVolume{
+					Type: awstypes.VolumeTypeGp3,
+					Size: decimalRootVolumeSize,
+				},
+			},
+			assert: func(tc *testCase) {
+				mp := defaultAWSMachinePoolPlatform(tc.poolName)
+				want := tc.expectedMachinePool.EC2RootVolume.Type
+				got := mp.EC2RootVolume.Type
+				assert.Equal(t, want, got, "unexpected EBS type")
+			},
+		},
+		{
+			name:     "default EBS type for edge pool",
+			poolName: types.MachinePoolEdgeRoleName,
+			expectedMachinePool: awstypes.MachinePool{
+				EC2RootVolume: awstypes.EC2RootVolume{
+					Type: awstypes.VolumeTypeGp2,
+					Size: decimalRootVolumeSize,
+				},
+			},
+			assert: func(tc *testCase) {
+				mp := defaultAWSMachinePoolPlatform(tc.poolName)
+				want := tc.expectedMachinePool.EC2RootVolume.Type
+				got := mp.EC2RootVolume.Type
+				assert.Equal(t, want, got, "unexpected EBS type")
+			},
+		},
+	}
+	for i := range cases {
+		tc := cases[i]
+		t.Run(tc.name, func(t *testing.T) {
+			tc.assert(&tc)
+		})
+	}
+}
diff --git a/pkg/asset/manifests/network.go b/pkg/asset/manifests/network.go
index 06279c4b966..f693e43b59d 100644
--- a/pkg/asset/manifests/network.go
+++ b/pkg/asset/manifests/network.go
@@ -9,16 +9,22 @@ import (
 	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
 
 	configv1 "github.com/openshift/api/config/v1"
+	operatorv1 "github.com/openshift/api/operator/v1"
 	"github.com/openshift/installer/pkg/asset"
 	"github.com/openshift/installer/pkg/asset/installconfig"
 	"github.com/openshift/installer/pkg/asset/templates/content/openshift"
+	"github.com/openshift/installer/pkg/types"
+	"github.com/openshift/installer/pkg/types/aws"
 	"github.com/openshift/installer/pkg/types/powervs"
 )
 
 var (
-	noCrdFilename   = filepath.Join(manifestDir, "cluster-network-01-crd.yml")
-	noCfgFilename   = filepath.Join(manifestDir, "cluster-network-02-config.yml")
-	ovnKubeFilename = filepath.Join(manifestDir, "cluster-network-03-config.yml")
+	noCrdFilename  = filepath.Join(manifestDir, "cluster-network-01-crd.yml")
+	noCfgFilename  = filepath.Join(manifestDir, "cluster-network-02-config.yml")
+	cnoCfgFilename = filepath.Join(manifestDir, "cluster-network-03-config.yml")
+	// Cluster Network MTU for AWS Local Zone deployments on edge machine pools.
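+	// Assumed derivation (not spelled out in this patch): the Local Zone link
+	// supports an MTU of 1300; OVN-Kubernetes Geneve encapsulation costs 100
+	// bytes (1300-100=1200) and OpenShiftSDN VXLAN costs 50 bytes (1300-50=1250).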
+	ovnKNetworkMtuEdge   uint32 = 1200
+	ocpSDNNetworkMtuEdge uint32 = 1250
 )
 
 // We need to manually create our CRDs first, so we can create the
@@ -119,6 +125,18 @@ func (no *Networking) Generate(dependencies asset.Parents) error {
 	}
 
 	switch installConfig.Config.Platform.Name() {
+	case aws.Name:
+		cnoDefCfg, exists, err := no.generateDefaultNetworkConfigAWSEdge(installConfig)
+		if err != nil {
+			return err
+		}
+		if exists {
+			no.FileList = append(no.FileList, &asset.File{
+				Filename: cnoCfgFilename,
+				Data:     cnoDefCfg,
+			})
+		}
+
 	case powervs.Name:
 		if netConfig.NetworkType == "OVNKubernetes" {
 			ovnConfig, err := OvnKubeConfig(clusterNet, serviceNet, true)
@@ -126,7 +144,7 @@ func (no *Networking) Generate(dependencies asset.Parents) error {
 				return errors.Wrapf(err, "cannot marshal Power VS OVNKube Config")
 			}
 			no.FileList = append(no.FileList, &asset.File{
-				Filename: ovnKubeFilename,
+				Filename: cnoCfgFilename,
 				Data:     ovnConfig,
 			})
 		}
@@ -145,3 +163,75 @@ func (no *Networking) Files() []*asset.File {
 func (no *Networking) Load(f asset.FileFetcher) (bool, error) {
 	return false, nil
 }
+
+// Generates the defaultNetwork for Cluster Network Operator configuration.
+// The defaultNetwork is the "default" network that all pods will receive.
+func (no *Networking) generateDefaultNetworkConfig(defaultNetwork *operatorv1.DefaultNetworkDefinition) ([]byte, error) {
+	dnConfig := operatorv1.Network{
+		TypeMeta: metav1.TypeMeta{
+			APIVersion: operatorv1.SchemeGroupVersion.String(),
+			Kind:       "Network",
+		},
+		ObjectMeta: metav1.ObjectMeta{
+			Name: "cluster",
+		},
+		Spec: operatorv1.NetworkSpec{
+			OperatorSpec:   operatorv1.OperatorSpec{ManagementState: operatorv1.Managed},
+			DefaultNetwork: *defaultNetwork,
+		},
+	}
+
+	return yaml.Marshal(dnConfig)
+}
+
+// generateDefaultNetworkConfigAWSEdge checks whether an edge machine pool is
+// defined and, if so, generates the CNO object that sets the DefaultNetwork
+// for the CNI plugin with a custom MTU.
+// EC2 instances in AWS Local Zones are limited to an MTU of 1300 when
+// communicating with instances in the parent region's zones; the
+// ovnKNetworkMtuEdge and ocpSDNNetworkMtuEdge constants subtract the network
+// plugin's encapsulation overhead from that limit.
+// https://docs.aws.amazon.com/local-zones/latest/ug/how-local-zones-work.html
+func (no *Networking) generateDefaultNetworkConfigAWSEdge(ic *installconfig.InstallConfig) ([]byte, bool, error) {
+	var (
+		hasEdgePool = false
+		defNetCfg   *operatorv1.DefaultNetworkDefinition
+	)
+
+	netConfig := ic.Config.Networking
+
+	// Set up the defaultNetwork only for edge deployments on AWS
+	for _, mp := range ic.Config.Compute {
+		if mp.Name == types.MachinePoolEdgeRoleName {
+			hasEdgePool = true
+			break
+		}
+	}
+	if !hasEdgePool {
+		return nil, false, nil
+	}
+
+	switch netConfig.NetworkType {
+	case string(operatorv1.NetworkTypeOVNKubernetes):
+		defNetCfg = &operatorv1.DefaultNetworkDefinition{
+			Type: operatorv1.NetworkTypeOVNKubernetes,
+			OVNKubernetesConfig: &operatorv1.OVNKubernetesConfig{
+				MTU: &ovnKNetworkMtuEdge,
+			},
+		}
+
+	case string(operatorv1.NetworkTypeOpenShiftSDN):
+		defNetCfg = &operatorv1.DefaultNetworkDefinition{
+			Type: operatorv1.NetworkTypeOpenShiftSDN,
+			OpenShiftSDNConfig: &operatorv1.OpenShiftSDNConfig{
+				MTU: &ocpSDNNetworkMtuEdge,
+			},
+		}
+	default:
+		return nil, true, errors.Errorf("unable to set the DefaultNetworkConfig for network type %s", netConfig.NetworkType)
+	}
+
+	cnoConfig, err := no.generateDefaultNetworkConfig(defNetCfg)
+	if err != nil {
+		return nil, true, errors.Wrapf(err, "cannot marshal DefaultNetworkConfig for %s", netConfig.NetworkType)
+	}
+
+	return cnoConfig, true, nil
+}
diff --git a/pkg/types/aws/availabilityzones.go b/pkg/types/aws/availabilityzones.go
new file mode 100644
index 00000000000..81d8daa145d
--- /dev/null
+++ b/pkg/types/aws/availabilityzones.go
@@ -0,0 +1,8 @@
+package aws
+
+const (
+	// AvailabilityZoneType is the zone type for standard Availability Zones within the region.
+	AvailabilityZoneType = "availability-zone"
+	// LocalZoneType is the zone type for Local Zones placed in metropolitan areas.
+	LocalZoneType = "local-zone"
+)
diff --git a/pkg/types/aws/defaults/platform.go b/pkg/types/aws/defaults/platform.go
index aad6c23d541..49195725eaa 100644
--- a/pkg/types/aws/defaults/platform.go
+++ b/pkg/types/aws/defaults/platform.go
@@ -1,19 +1,27 @@
 package defaults
 
 import (
+	"fmt"
+
+	configv1 "github.com/openshift/api/config/v1"
 	"github.com/openshift/installer/pkg/types"
 	"github.com/openshift/installer/pkg/types/aws"
 )
 
+const (
+	defaultInstanceSizeHighAvailabilityTopology = "xlarge"
+	defaultInstanceSizeSingleReplicaTopology    = "2xlarge"
+)
+
 var (
-	defaultMachineClass = map[types.Architecture]map[string][]string{
+	defaultMachineTypes = map[types.Architecture]map[string][]string{
 		types.ArchitectureAMD64: {
 			// Example region default machine class override for AMD64:
-			// "ap-east-1": {"m5", "m4"},
+			// "ap-east-1": {"m6i.xlarge", "m5.xlarge"},
 		},
 		types.ArchitectureARM64: {
 			// Example region default machine class override for ARM64:
-			// "us-east-1": {"m6g", "m6gd"},
+			// "us-east-1": {"m6g.xlarge", "m6gd.xlarge"},
 		},
 	}
 )
@@ -22,19 +30,44 @@ var (
 func SetPlatformDefaults(p *aws.Platform) {
 }
 
-// InstanceClasses returns a list of instance "class", in decreasing priority order, which we should use for a given
-// region. Default is m6i then m5 unless a region override is defined in defaultMachineClass.
-func InstanceClasses(region string, arch types.Architecture) []string {
-	if classesForArch, ok := defaultMachineClass[arch]; ok {
+// InstanceTypes returns a list of instance types, in decreasing priority order, which we should use for a given
+// region.
 Default is m6i.xlarge, then m5.xlarge, and lastly c5d.2xlarge, unless a region override
+// is defined in defaultMachineTypes.
+// c5d.2xlarge is the type offered in the most Local Zone locations.
+// https://aws.amazon.com/about-aws/global-infrastructure/localzones/features
+// https://aws.amazon.com/ec2/pricing/on-demand/
+func InstanceTypes(region string, arch types.Architecture, topology configv1.TopologyMode) []string {
+	if classesForArch, ok := defaultMachineTypes[arch]; ok {
 		if classes, ok := classesForArch[region]; ok {
 			return classes
 		}
 	}
+	instanceSize := defaultInstanceSizeHighAvailabilityTopology
+	// If the control plane is single node, we need to use a larger
+	// instance type for that node, as the minimum requirement for
+	// single-node control-plane nodes is 8 cores, and xlarge only has
+	// 4. Unfortunately 2xlarge has twice as much RAM as we need, but
+	// we default to it because AWS doesn't offer an 8-core 16GiB
+	// instance type.
+	if topology == configv1.SingleReplicaTopologyMode {
+		instanceSize = defaultInstanceSizeSingleReplicaTopology
+	}
+
 	switch arch {
 	case types.ArchitectureARM64:
-		return []string{"m6g"}
+		return []string{
+			fmt.Sprintf("m6g.%s", instanceSize),
+		}
 	default:
-		return []string{"m6i", "m5"}
+		return []string{
+			fmt.Sprintf("m6i.%s", instanceSize),
+			fmt.Sprintf("m5.%s", instanceSize),
+			// For Local Zone compatibility
+			fmt.Sprintf("r5.%s", instanceSize),
+			"c5.2xlarge",
+			"m5.2xlarge",
+			"c5d.2xlarge",
+		}
 	}
 }
diff --git a/pkg/types/aws/defaults/platform_test.go b/pkg/types/aws/defaults/platform_test.go
new file mode 100644
index 00000000000..7c2e659d094
--- /dev/null
+++ b/pkg/types/aws/defaults/platform_test.go
@@ -0,0 +1,67 @@
+package defaults
+
+import (
+	"testing"
+
+	"github.com/stretchr/testify/assert"
+
+	configv1 "github.com/openshift/api/config/v1"
+	"github.com/openshift/installer/pkg/types"
+)
+
+func TestInstanceTypes(t *testing.T) {
+	type testCase struct {
+		name         string
+		region       string
+		architecture types.Architecture
+		topology     configv1.TopologyMode
+		expected     []string
+		assert       func(*testCase)
+	}
+	cases := []testCase{
+		{
+			name:     "default instance types for AMD64, highly-available topology",
+			topology: configv1.HighlyAvailableTopologyMode,
+			expected: []string{"m6i.xlarge", "m5.xlarge", "r5.xlarge", "c5.2xlarge", "m5.2xlarge", "c5d.2xlarge"},
+			assert: func(tc *testCase) {
+				instances := InstanceTypes(tc.region, tc.architecture, tc.topology)
+				assert.Equal(t, tc.expected, instances, "unexpected instance type for AMD64")
+			},
+		},
+		{
+			name:     "default instance types for AMD64, single-replica topology",
+			topology: configv1.SingleReplicaTopologyMode,
+			expected: []string{"m6i.2xlarge", "m5.2xlarge", "r5.2xlarge", "c5.2xlarge", "m5.2xlarge", "c5d.2xlarge"},
+			assert: func(tc *testCase) {
+				instances := InstanceTypes(tc.region, tc.architecture, tc.topology)
+				assert.Equal(t, tc.expected, instances, "unexpected instance type for AMD64")
+			},
+		},
+		{
+			name:         "default instance types for ARM64, highly-available topology",
+			architecture: types.ArchitectureARM64,
+			topology:     configv1.HighlyAvailableTopologyMode,
+			expected:     []string{"m6g.xlarge"},
+			assert: func(tc *testCase) {
+				instances := InstanceTypes(tc.region, tc.architecture, tc.topology)
+				assert.Equal(t, tc.expected, instances, "unexpected instance type for ARM64")
+			},
+		},
+		{
+			name:         "default instance types for ARM64, single-replica topology",
+			architecture: types.ArchitectureARM64,
+			topology:     configv1.SingleReplicaTopologyMode,
+			expected:     []string{"m6g.2xlarge"},
+			assert: func(tc *testCase) {
+				instances := InstanceTypes(tc.region, tc.architecture, tc.topology)
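+				// Single-replica ARM64 is expected to resolve to the lone
+				// m6g.2xlarge entry, since ARM64 defines no Local Zone fallbacks.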
+				assert.Equal(t, tc.expected, instances, "unexpected instance type for ARM64")
+			},
+		},
+	}
+	for i := range cases {
+		tc := cases[i]
+		t.Run(tc.name, func(t *testing.T) {
+			tc.assert(&tc)
+		})
+	}
+}
diff --git a/pkg/types/aws/platform.go b/pkg/types/aws/platform.go
index b2ef8c9e5e7..44f932cfcc6 100644
--- a/pkg/types/aws/platform.go
+++ b/pkg/types/aws/platform.go
@@ -6,6 +6,13 @@ import (
 	configv1 "github.com/openshift/api/config/v1"
 )
 
+const (
+	// VolumeTypeGp2 is the type of EBS volume for General Purpose SSD gp2.
+	VolumeTypeGp2 = "gp2"
+	// VolumeTypeGp3 is the type of EBS volume for General Purpose SSD gp3.
+	VolumeTypeGp3 = "gp3"
+)
+
 // Platform stores all the global configuration that all machinesets
 // use.
 type Platform struct {
diff --git a/pkg/types/defaults/installconfig.go b/pkg/types/defaults/installconfig.go
index 64f455917f9..54237c7a15a 100644
--- a/pkg/types/defaults/installconfig.go
+++ b/pkg/types/defaults/installconfig.go
@@ -72,8 +72,16 @@ func SetInstallConfigDefaults(c *types.InstallConfig) {
 	}
 	c.ControlPlane.Name = "master"
 	SetMachinePoolDefaults(c.ControlPlane, c.Platform.Name())
-	if len(c.Compute) == 0 {
-		c.Compute = []types.MachinePool{{Name: "worker"}}
+
+	defaultComputePoolUndefined := true
+	for _, compute := range c.Compute {
+		if compute.Name == types.MachinePoolComputeRoleName {
+			defaultComputePoolUndefined = false
+			break
+		}
+	}
+	if defaultComputePoolUndefined {
+		c.Compute = append(c.Compute, types.MachinePool{Name: types.MachinePoolComputeRoleName})
 	}
 	for i := range c.Compute {
 		SetMachinePoolDefaults(&c.Compute[i], c.Platform.Name())
diff --git a/pkg/types/defaults/installconfig_test.go b/pkg/types/defaults/installconfig_test.go
index 65cf2731367..38976d972eb 100644
--- a/pkg/types/defaults/installconfig_test.go
+++ b/pkg/types/defaults/installconfig_test.go
@@ -44,6 +44,12 @@ func defaultInstallConfig() *types.InstallConfig {
 	}
 }
 
+func defaultInstallConfigWithEdge() *types.InstallConfig {
+	c := defaultInstallConfig()
+	c.Compute = append(c.Compute, *defaultMachinePool("edge"))
+	return c
+}
+
 func defaultAWSInstallConfig() *types.InstallConfig {
 	c := defaultInstallConfig()
 	c.Platform.AWS = &aws.Platform{}
@@ -219,11 +225,25 @@ func TestSetInstallConfigDefaults(t *testing.T) {
 		{
 			name: "Compute present",
 			config: &types.InstallConfig{
-				Compute: []types.MachinePool{{Name: "test-compute"}},
+				Compute: []types.MachinePool{{Name: "worker"}},
 			},
 			expected: func() *types.InstallConfig {
 				c := defaultInstallConfig()
-				c.Compute = []types.MachinePool{*defaultMachinePool("test-compute")}
+				c.Compute = []types.MachinePool{*defaultMachinePool("worker")}
+				return c
+			}(),
+		},
+		{
+			name: "Edge Compute present",
+			config: &types.InstallConfig{
+				Compute: []types.MachinePool{{Name: "worker"}, {Name: "edge"}},
+			},
+			expected: func() *types.InstallConfig {
+				c := defaultInstallConfigWithEdge()
+				c.Compute = []types.MachinePool{
+					*defaultMachinePool("worker"),
+					*defaultEdgeMachinePool("edge"),
+				}
 				return c
 			}(),
 		},
diff --git a/pkg/types/defaults/machinepools.go b/pkg/types/defaults/machinepools.go
index 7896ae451d3..0e4e6a2ba41 100644
--- a/pkg/types/defaults/machinepools.go
+++ b/pkg/types/defaults/machinepools.go
@@ -12,6 +12,9 @@ func SetMachinePoolDefaults(p *types.MachinePool, platform string) {
 	if platform == libvirt.Name {
 		defaultReplicaCount = 1
 	}
+	if p.Name == types.MachinePoolEdgeRoleName {
+		defaultReplicaCount = 0
+	}
 	if p.Replicas == nil {
 		p.Replicas = &defaultReplicaCount
 	}
@@ -22,3 +25,22 @@ func SetMachinePoolDefaults(p
*types.MachinePool, platform string) {
 		p.Architecture = version.DefaultArch()
 	}
 }
+
+// CreateEdgeMachinePoolDefaults creates the edge compute pool when it is not already defined.
+func CreateEdgeMachinePoolDefaults(pools []types.MachinePool, platform string, replicas int64) *types.MachinePool {
+	edgePoolDefined := false
+	for _, compute := range pools {
+		if compute.Name == types.MachinePoolEdgeRoleName {
+			edgePoolDefined = true
+		}
+	}
+	if edgePoolDefined {
+		return nil
+	}
+	pool := &types.MachinePool{
+		Name:     types.MachinePoolEdgeRoleName,
+		Replicas: &replicas,
+	}
+	SetMachinePoolDefaults(pool, platform)
+	return pool
+}
diff --git a/pkg/types/defaults/machinepools_test.go b/pkg/types/defaults/machinepools_test.go
index 9eb1e43fd16..e33b64a97d4 100644
--- a/pkg/types/defaults/machinepools_test.go
+++ b/pkg/types/defaults/machinepools_test.go
@@ -4,21 +4,29 @@ import (
 	"testing"
 
 	"github.com/stretchr/testify/assert"
-	"k8s.io/utils/pointer"
 
 	"github.com/openshift/installer/pkg/types"
 )
 
 func defaultMachinePool(name string) *types.MachinePool {
+	repCount := int64(3)
 	return &types.MachinePool{
 		Name:           name,
-		Replicas:       pointer.Int64Ptr(3),
+		Replicas:       &repCount,
 		Hyperthreading: types.HyperthreadingEnabled,
 		Architecture:   types.ArchitectureAMD64,
 	}
}
 
+func defaultEdgeMachinePool(name string) *types.MachinePool {
+	pool := defaultMachinePool(name)
+	defaultEdgeReplicaCount := int64(0)
+	pool.Replicas = &defaultEdgeReplicaCount
+	return pool
+}
+
 func TestSetMahcinePoolDefaults(t *testing.T) {
+	defaultEdgeReplicaCount := int64(0)
 	cases := []struct {
 		name     string
 		pool     *types.MachinePool
@@ -30,21 +38,48 @@ func TestSetMahcinePoolDefaults(t *testing.T) {
 			pool:     &types.MachinePool{},
 			expected: defaultMachinePool(""),
 		},
+		{
+			name:     "empty with preset zero replicas",
+			pool:     &types.MachinePool{Replicas: &defaultEdgeReplicaCount},
+			expected: defaultEdgeMachinePool(""),
+		},
 		{
 			name:     "default",
 			pool:     defaultMachinePool("test-name"),
 			expected: defaultMachinePool("test-name"),
 		},
+		{
+			name:     "default edge pool",
+			pool:     defaultEdgeMachinePool("test-name"),
+			expected: defaultEdgeMachinePool("test-name"),
+		},
 		{
 			name: "non-default replicas",
 			pool: func() *types.MachinePool {
 				p := defaultMachinePool("test-name")
-				p.Replicas = pointer.Int64Ptr(5)
+				repCount := int64(5)
+				p.Replicas = &repCount
 				return p
 			}(),
 			expected: func() *types.MachinePool {
 				p := defaultMachinePool("test-name")
-				p.Replicas = pointer.Int64Ptr(5)
+				repCount := int64(5)
+				p.Replicas = &repCount
+				return p
+			}(),
+		},
+		{
+			name: "non-default replicas for edge pool",
+			pool: func() *types.MachinePool {
+				p := defaultEdgeMachinePool("test-name")
+				repCount := int64(5)
+				p.Replicas = &repCount
+				return p
+			}(),
+			expected: func() *types.MachinePool {
+				p := defaultEdgeMachinePool("test-name")
+				repCount := int64(5)
+				p.Replicas = &repCount
 				return p
 			}(),
 		},
@@ -54,7 +89,8 @@ func TestSetMahcinePoolDefaults(t *testing.T) {
 			platform: "libvirt",
 			expected: func() *types.MachinePool {
 				p := defaultMachinePool("")
-				p.Replicas = pointer.Int64Ptr(1)
+				repCount := int64(1)
+				p.Replicas = &repCount
 				return p
 			}(),
 		},
@@ -71,6 +107,19 @@ func TestSetMahcinePoolDefaults(t *testing.T) {
 			pool: func() *types.MachinePool {
 				p := defaultMachinePool("test-name")
 				p.Hyperthreading = types.HyperthreadingMode("test-hyperthreading")
 				return p
 			}(),
 			expected: func() *types.MachinePool {
 				p := defaultMachinePool("test-name")
 				p.Hyperthreading = types.HyperthreadingMode("test-hyperthreading")
 				return p
 			}(),
 		},
+		{
+			name: "non-default hyperthreading for edge pool",
+			pool: func() *types.MachinePool {
+				p := defaultEdgeMachinePool("test-name")
+				p.Hyperthreading = types.HyperthreadingMode("test-hyperthreading")
+				return p
+			}(),
+			expected: func() *types.MachinePool {
+				p := defaultEdgeMachinePool("test-name")
+				p.Hyperthreading = types.HyperthreadingMode("test-hyperthreading")
+				return p
+			}(),
+		},
 	}
 	for _, tc := range
cases { t.Run(tc.name, func(t *testing.T) { diff --git a/pkg/types/installconfig.go b/pkg/types/installconfig.go index d9bd01dff11..48a0c7313aa 100644 --- a/pkg/types/installconfig.go +++ b/pkg/types/installconfig.go @@ -27,8 +27,7 @@ const ( // InstallConfigVersion is the version supported by this package. // If you bump this, you must also update the list of convertable values in // pkg/types/conversion/installconfig.go - InstallConfigVersion = "v1" - workerMachinePoolName = "worker" + InstallConfigVersion = "v1" ) var ( @@ -478,7 +477,8 @@ type Capabilities struct { // WorkerMachinePool retrieves the worker MachinePool from InstallConfig.Compute func (c *InstallConfig) WorkerMachinePool() *MachinePool { for _, machinePool := range c.Compute { - if machinePool.Name == workerMachinePoolName { + switch machinePool.Name { + case MachinePoolComputeRoleName, MachinePoolEdgeRoleName: return &machinePool } } diff --git a/pkg/types/machinepools.go b/pkg/types/machinepools.go index c481fd1feac..4b9b79757ef 100644 --- a/pkg/types/machinepools.go +++ b/pkg/types/machinepools.go @@ -16,9 +16,11 @@ import ( ) const ( - // MachinePoolComputeRoleName name associated with the compute machinepool + // MachinePoolComputeRoleName name associated with the compute machinepool. MachinePoolComputeRoleName = "worker" - // MachinePoolControlPlaneRoleName name associated with the control plane machinepool + // MachinePoolEdgeRoleName name associated with the compute edge machinepool. + MachinePoolEdgeRoleName = "edge" + // MachinePoolControlPlaneRoleName name associated with the control plane machinepool. MachinePoolControlPlaneRoleName = "master" ) diff --git a/pkg/types/validation/installconfig.go b/pkg/types/validation/installconfig.go index d88ad85496b..7e596631ecc 100644 --- a/pkg/types/validation/installconfig.go +++ b/pkg/types/validation/installconfig.go @@ -430,14 +430,28 @@ func validateControlPlane(platform *types.Platform, pool *types.MachinePool, fld return allErrs } +func validateComputeEdge(platform *types.Platform, pName string, fldPath *field.Path, pfld *field.Path) field.ErrorList { + allErrs := field.ErrorList{} + if platform.Name() != aws.Name { + allErrs = append(allErrs, field.NotSupported(pfld.Child("name"), pName, []string{types.MachinePoolComputeRoleName})) + } + + return allErrs +} + func validateCompute(platform *types.Platform, control *types.MachinePool, pools []types.MachinePool, fldPath *field.Path) field.ErrorList { allErrs := field.ErrorList{} poolNames := map[string]bool{} for i, p := range pools { poolFldPath := fldPath.Index(i) - if p.Name != types.MachinePoolComputeRoleName { - allErrs = append(allErrs, field.NotSupported(poolFldPath.Child("name"), p.Name, []string{"worker"})) + switch p.Name { + case types.MachinePoolComputeRoleName: + case types.MachinePoolEdgeRoleName: + allErrs = append(allErrs, validateComputeEdge(platform, p.Name, poolFldPath, poolFldPath)...) 
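+			// Note: the edge pool name is accepted only on AWS; validateComputeEdge
+			// rejects it for any other platform.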
+		default:
+			allErrs = append(allErrs, field.NotSupported(poolFldPath.Child("name"), p.Name, []string{types.MachinePoolComputeRoleName, types.MachinePoolEdgeRoleName}))
 		}
+
 		if poolNames[p.Name] {
 			allErrs = append(allErrs, field.Duplicate(poolFldPath.Child("name"), p.Name))
 		}
diff --git a/upi/aws/cloudformation/01.99_net_local-zone.yaml b/upi/aws/cloudformation/01.99_net_local-zone.yaml
new file mode 100644
index 00000000000..f1af3cf2331
--- /dev/null
+++ b/upi/aws/cloudformation/01.99_net_local-zone.yaml
@@ -0,0 +1,48 @@
+AWSTemplateFormatVersion: 2010-09-09
+Description: Template for creating the public Local Zone subnet
+
+Parameters:
+  VpcId:
+    Description: VPC ID
+    Type: String
+  ZoneName:
+    Description: Local Zone name (Example us-west-2-lax-1a)
+    Type: String
+  SubnetName:
+    Description: Subnet name (Example cluster-usw2-lax-1a)
+    Type: String
+  PublicRouteTableId:
+    Description: Public Route Table ID to associate the Local Zone subnet
+    Type: String
+  PublicSubnetCidr:
+    # yamllint disable-line rule:line-length
+    AllowedPattern: ^(([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])\.){3}([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])(\/(1[6-9]|2[0-4]))$
+    ConstraintDescription: CIDR block parameter must be in the form x.x.x.x/16-24.
+    Default: 10.0.128.0/20
+    Description: CIDR block for Public Subnet
+    Type: String
+
+Resources:
+  PublicSubnet:
+    Type: "AWS::EC2::Subnet"
+    Properties:
+      VpcId: !Ref VpcId
+      CidrBlock: !Ref PublicSubnetCidr
+      AvailabilityZone: !Ref ZoneName
+      Tags:
+      - Key: Name
+        Value: !Ref SubnetName
+      - Key: kubernetes.io/cluster/unmanaged
+        Value: "true"
+
+  PublicSubnetRouteTableAssociation:
+    Type: "AWS::EC2::SubnetRouteTableAssociation"
+    Properties:
+      SubnetId: !Ref PublicSubnet
+      RouteTableId: !Ref PublicRouteTableId
+
+Outputs:
+  PublicSubnetIds:
+    Description: Subnet IDs of the public subnets.
+    Value:
+      !Join ["", [!Ref PublicSubnet]]
diff --git a/upi/aws/cloudformation/01_vpc.yaml b/upi/aws/cloudformation/01_vpc.yaml
index de55a49b2fc..c2f8cdc4065 100644
--- a/upi/aws/cloudformation/01_vpc.yaml
+++ b/upi/aws/cloudformation/01_vpc.yaml
@@ -286,3 +286,6 @@ Outputs:
       ",",
       [!Ref PrivateSubnet, !If [DoAz2, !Ref PrivateSubnet2, !Ref "AWS::NoValue"], !If [DoAz3, !Ref PrivateSubnet3, !Ref "AWS::NoValue"]]
     ]
+  PublicRouteTableId:
+    Description: Public Route Table ID
+    Value: !Ref PublicRouteTable
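
For reviewers trying the new template by hand, here is a minimal sketch of creating the Local Zone subnet stack with it, reusing the environment variables exported in the documentation above. The stack name `${CLUSTER_NAME}-subnet-lz` and the local template path are illustrative assumptions, not part of this change:

```bash
# Sketch only: create the Local Zone subnet stack from the new template.
# Assumes AWS_REGION, CLUSTER_NAME, VPC_ID, ZONE_NAME, SUBNET_NAME,
# PUBLIC_RTB_ID, and SUBNET_CIDR are exported as shown earlier.
aws cloudformation create-stack \
  --region ${AWS_REGION} \
  --stack-name ${CLUSTER_NAME}-subnet-lz \
  --template-body file://upi/aws/cloudformation/01.99_net_local-zone.yaml \
  --parameters \
    ParameterKey=VpcId,ParameterValue=${VPC_ID} \
    ParameterKey=ZoneName,ParameterValue=${ZONE_NAME} \
    ParameterKey=SubnetName,ParameterValue=${SUBNET_NAME} \
    ParameterKey=PublicRouteTableId,ParameterValue=${PUBLIC_RTB_ID} \
    ParameterKey=PublicSubnetCidr,ParameterValue=${SUBNET_CIDR}

# Wait until the subnet stack reports CREATE_COMPLETE.
aws cloudformation wait stack-create-complete \
  --region ${AWS_REGION} \
  --stack-name ${CLUSTER_NAME}-subnet-lz
```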