|
| 1 | +--- |
| 2 | +title: Networking for ROSA HCP |
| 3 | +authors: |
| 4 | + - "@mzazrivec" |
| 5 | +reviewers: |
| 6 | + - |
| 7 | +creation-date: 2025-02-24 |
| 8 | +last-updated: 2025-07-11 |
| 9 | +status: provisional |
| 10 | +--- |
| 11 | + |
| 12 | +# Networking for ROSA HCP |
| 13 | + |
| 14 | +## Table of Contents |
| 15 | + |
| 16 | +<!-- START doctoc generated TOC please keep comment here to allow auto update --> |
| 17 | +<!-- DON'T EDIT THIS SECTION, INSTEAD RE-RUN doctoc TO UPDATE --> |
| 18 | + |
| 19 | +- [Glossary](#glossary) |
| 20 | +- [Summary](#summary) |
| 21 | +- [Motivation](#motivation) |
| 22 | + - [Goals](#goals) |
| 23 | + - [Non-Goals/Future Work](#non-goalsfuture-work) |
| 24 | +- [Proposal](#proposal) |
| 25 | + - [User Stories](#user-stories) |
| 26 | + - [Functional Requirements](#functional-requirements) |
| 27 | +- [Alternatives](#alternatives) |
| 28 | +- [Upgrade Strategy](#upgrade-strategy) |
| 29 | + |
| 30 | +<!-- END doctoc generated TOC please keep comment here to allow auto update --> |
| 31 | + |
| 32 | +## Glossary |
| 33 | + |
| 34 | +Refer to the [Cluster API Book Glossary](https://cluster-api.sigs.k8s.io/reference/glossary.html). |
| 35 | + |
| 36 | +## Summary |
| 37 | + |
| 38 | +This proposal defines the implementation of networking infrastructure in CAPA for ROSA Hosted Control Plane (HCP). |
| 39 | + |
| 40 | +## Motivation |
| 41 | + |
| 42 | +To provision a new ROSA HCP Kubernetes cluster using CAPA, one first has to create and set up the underlying network infrastructure: a VPC, public and private subnets, an internet gateway, routing tables for both subnets and an elastic IP address. |
| 43 | + |
| 44 | +All of the above can currently be provisioned and configured via the AWS CLI, the AWS Management Console or Terraform. The motivation for this work is to be able to provision and configure the network infrastructure for ROSA HCP using CAPI. |
| 45 | + |
| 46 | +### Goals |
| 47 | + |
| 48 | +1. Implement a new namespaced custom resource `ROSANetwork` representing the networking stack for ROSA HCP. |
| 49 | +2. Make it possible to reference the new `ROSANetwork` resource from the ROSA control plane resource. |
| 50 | +3. Implement creation and deletion for the new `ROSANetwork` resource. |
| 51 | +4. Support the same networking scenarios as [ROSA CLI](https://github.com/openshift/rosa) using the same embedded AWS CloudFormation template that ROSA CLI uses. |
| 52 | + |
| 53 | +### Non-Goals/Future Work |
| 54 | + |
| 55 | +- Modify current networking code in AWS / EKS clusters. |
| 56 | +- Support custom CloudFormation template. |
| 57 | + |
| 58 | +## Proposal |
| 59 | + |
| 60 | +The goal of this proposal is to be able to provision the networking infrastructure required for a ROSA HCP cluster. |
| 61 | + |
| 62 | +[ROSA CLI](https://github.com/openshift/rosa) supports creation of the networking infrastructure for ROSA HCP and uses an [AWS CloudFormation](https://aws.amazon.com/cloudformation/) template under the hood. The [CloudFormation template used by ROSA CLI](https://github.com/openshift/rosa/blob/master/cmd/create/network/templates/rosa-quickstart-default-vpc/cloudformation.yaml) accepts five parameters: the CloudFormation stack name, the AZ count or, alternatively, a list of availability zones, the region and the CIDR block for the VPC. The created CloudFormation stack then contains a VPC, public and private subnets (each pair created in a separate AZ), an internet gateway attached to the VPC, elastic IPs, NAT gateways, public and private routing tables and a security group. |
| 63 | + |
| 64 | +Adopting the CloudFormation template used by ROSA CLI would mean that CAPA and the `ROSANetwork` custom resource rely on a mechanism that is known to work well, and any changes or fixes implemented in ROSA CLI would be picked up automatically in CAPA. |
| 65 | + |
| 66 | +In practical terms, implementation of the proposal would mean: |
| 67 | +1. A new namespaced custom resource definition `ROSANetwork` in CAPA with five attributes: name, AZ count, list of availability zones, region and CIDR block for the VPC. `availabilityZoneCount`, `availabilityZones`, `region` and `cidrBlock` will form the `spec` part of the new `ROSANetwork` type; the name of the CloudFormation stack will be the same as `metadata.name`. |
| 68 | + |
| 69 | + `ROSANetwork` spec example: |
| 70 | + ``` |
| 71 | + kind: ROSANetwork |
| 72 | + metadata: |
| 73 | + name: rosa-network-01 |
| 74 | + namespace: default |
| 75 | + spec: |
| 76 | + availabilityZoneCount: 3 |
| 77 | + region: us-west-2 |
| 78 | + cidrBlock: 10.0.0.0/16 |
| 79 | + ``` |
| 80 | +
|
| 81 | + `ROSANetwork` spec example with specified availability zones: |
| 82 | + ``` |
| 83 | + kind: ROSANetwork |
| 84 | + metadata: |
| 85 | + name: rosa-network-01 |
| 86 | + namespace: default |
| 87 | + spec: |
| 88 | + availabilityZones: |
| 89 | + - us-west-2a |
| 90 | + - us-west-2d |
| 91 | + region: us-west-2 |
| 92 | + cidrBlock: 10.0.0.0/16 |
| 93 | + ``` |
| 94 | +
|
| 95 | +1. A new reconciler for the new custom resource, implementing creation and deletion. The reconciler will be using an existing [CloudFormation template from ROSA CLI](https://github.com/openshift/rosa/blob/master/cmd/create/network/templates/rosa-quickstart-default-vpc/cloudformation.yaml) and will use [AWS CloudFormation API](https://pkg.go.dev/github.com/aws/aws-sdk-go-v2/service/cloudformation) to [create](https://pkg.go.dev/github.com/aws/aws-sdk-go-v2/service/cloudformation#Client.CreateStack) and [delete](https://pkg.go.dev/github.com/aws/aws-sdk-go-v2/service/cloudformation#Client.DeleteStack) the AWS CloudFormation stack. |
| 96 | +
|
| 97 | + Outputs and resources created in the CloudFormation stack will be tracked under `status` of the `ROSANetwork` type. In particular, the `status` will contain the public and private subnets, grouped by availability zone. |
| 98 | +
|
| 99 | + Example: |
| 100 | + ``` |
| 101 | + kind: ROSANetwork |
| 102 | + metadata: |
| 103 | + name: rosa-network-01 |
| 104 | + namespace: default |
| 105 | + status: |
| 106 | + subnets: |
| 107 | + - availabilityZone: us-west-2a |
| 108 | + privateSubnet: subnet-1d9f28ba992a83514 |
| 109 | + publicSubnet: subnet-0d9f28ba991b93514 |
| 110 | + - availabilityZone: us-west-2b |
| 111 | + privateSubnet: subnet-2d7f58c09f1b43512 |
| 112 | + publicSubnet: subnet-2d7f18c09f1b43512 |
| 113 | + - availabilityZone: us-west-2c |
| 114 | + privateSubnet: subnet-7d7e19c0af1f4d57f |
| 115 | + publicSubnet: subnet-1d7e19c0af1c4c57f |
| 116 | + ``` |
| 117 | +
|
| 118 | + All resources created in the CloudFormation stack will be tracked under the `status.resources` array: |
| 119 | + ``` |
| 120 | + kind: ROSANetwork |
| 121 | + metadata: |
| 122 | + name: rosa-network-01 |
| 123 | + namespace: default |
| 124 | + status: |
| 125 | + resources: |
| 126 | + - logicalId: AttachGateway |
| 127 | + physicalId: IGW|vpc-0b3efe540b42d3561 |
| 128 | + reason: "" |
| 129 | + resource: AWS::EC2::VPCGatewayAttachment |
| 130 | + status: CREATE_COMPLETE |
| 131 | + - logicalId: EC2VPCEndpoint |
| 132 | + physicalId: vpce-0a361ac65e48031e5 |
| 133 | + reason: Resource creation Initiated |
| 134 | + resource: AWS::EC2::VPCEndpoint |
| 135 | + status: CREATE_IN_PROGRESS |
| 136 | + - logicalId: EcrApiVPCEndpoint |
| 137 | + physicalId: vpce-09f346abadcc09f61 |
| 138 | + reason: Resource creation Initiated |
| 139 | + resource: AWS::EC2::VPCEndpoint |
| 140 | + status: CREATE_IN_PROGRESS |
| 141 | + ``` |
| 142 | + and will reflect the values coming from the [AWS CloudFormation API](https://pkg.go.dev/github.com/aws/aws-sdk-go-v2/service/cloudformation#Client.DescribeStackEvents) (`resource`, `logicalId`, `physicalId`, `reason` and `status`). |
| 143 | +
|
| 144 | + `status.conditions` of the `ROSANetwork` resource will be consistent with the CAPA conventions. Example of a successful network stack creation: |
| 145 | + ``` |
| 146 | + kind: ROSANetwork |
| 147 | + metadata: |
| 148 | + name: rosa-network-01 |
| 149 | + namespace: default |
| 150 | + status: |
| 151 | + conditions: |
| 152 | + - lastTransitionTime: "2025-07-11T13:51:40Z" |
| 153 | + reason: Created |
| 154 | + severity: Info |
| 155 | + status: "True" |
| 156 | + type: ROSANetworkReady |
| 157 | + ``` |
| 158 | + Example of creation in progress: |
| 159 | + ``` |
| 160 | + kind: ROSANetwork |
| 161 | + metadata: |
| 162 | + name: rosa-network-01 |
| 163 | + namespace: default |
| 164 | + status: |
| 165 | + conditions: |
| 166 | + - lastTransitionTime: "2025-07-11T13:51:40Z" |
| 167 | + reason: Creating |
| 168 | + severity: Info |
| 169 | + status: "False" |
| 170 | + type: ROSANetworkReady |
| 171 | + ``` |
| 172 | + Example of failed network stack creation: |
| 173 | + ``` |
| 174 | + kind: ROSANetwork |
| 175 | + metadata: |
| 176 | + name: rosa-network-01 |
| 177 | + namespace: default |
| 178 | + status: |
| 179 | + conditions: |
| 180 | + - lastTransitionTime: "2025-03-18T13:25:16Z" |
| 181 | + status: "False" |
| 182 | + type: ROSANetworkReady |
| 183 | + severity: Error |
| 184 | + reason: Failed |
| 185 | + ``` |
| 186 | + Failed deletion: |
| 187 | + ``` |
| 188 | + kind: ROSANetwork |
| 189 | + metadata: |
| 190 | + name: rosa-network-01 |
| 191 | + namespace: default |
| 192 | + status: |
| 193 | + conditions: |
| 194 | + - lastTransitionTime: "2025-03-18T13:25:16Z" |
| 195 | + status: "False" |
| 196 | + type: ROSANetworkReady |
| 197 | + severity: Error |
| 198 | + reason: DeletionFailed |
| 199 | + message: ... |
| 200 | + ``` |
| 201 | + |
| 202 | +1. Modifications in the ROSA control plane CRD & reconciler so that it would be possible to reference the `ROSANetwork` resource from control plane: |
| 203 | + ``` |
| 204 | + kind: ROSAControlPlane |
| 205 | + metadata: |
| 206 | + name: hcp01-control-plane |
| 207 | + namespace: default |
| 208 | + spec: |
| 209 | + rosaNetworkRef: |
| 210 | + name: hcp01-rosa-network |
| 211 | + ``` |
| 212 | + If the ROSA control plane CR contains a reference to a ROSA network, the reconciler will read the AWS region, availability zones and subnet IDs from the ROSA network CR. The ROSA control plane should also be validated through a webhook so that it does not contain both the reference to `ROSANetwork` and explicit subnet IDs and / or availability zones. |
| 213 | +
|
| 214 | +1. New tests. |
| 215 | +
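The condition examples in the proposal above could be derived from the CloudFormation stack status roughly as follows. This is a minimal sketch, not the CAPA implementation: the `Condition` struct and the `ConditionForStackStatus` helper are hypothetical names introduced here, and the choice of which stack statuses map to which reason is an assumption. Only the condition type `ROSANetworkReady` and the reasons `Created`, `Creating`, `Failed` and `DeletionFailed` come from the examples above.

```go
package main

// Condition is a hypothetical, simplified stand-in for a CAPA condition.
type Condition struct {
	Type     string
	Status   string
	Severity string
	Reason   string
}

// ConditionForStackStatus maps a CloudFormation stack status string to the
// ROSANetworkReady condition used in the proposal's examples. The second
// return value is false for statuses this sketch does not cover.
func ConditionForStackStatus(stackStatus string) (Condition, bool) {
	c := Condition{Type: "ROSANetworkReady", Status: "False"}
	switch stackStatus {
	case "CREATE_COMPLETE":
		c.Status, c.Severity, c.Reason = "True", "Info", "Created"
	case "CREATE_IN_PROGRESS":
		c.Severity, c.Reason = "Info", "Creating"
	case "CREATE_FAILED", "ROLLBACK_IN_PROGRESS", "ROLLBACK_COMPLETE", "ROLLBACK_FAILED":
		// Assumption: a rollback after a failed creation is reported as Failed.
		c.Severity, c.Reason = "Error", "Failed"
	case "DELETE_FAILED":
		c.Severity, c.Reason = "Error", "DeletionFailed"
	default:
		return Condition{}, false
	}
	return c, true
}
```

Statuses outside this mapping (for example a deletion in progress) are deliberately left unhandled, since the proposal does not define a reason for them.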
|
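The webhook rule described in the proposal (a control plane must not set both a `ROSANetwork` reference and explicit subnets or availability zones) can be sketched as a plain validation function. This is an illustration under stated assumptions: `ControlPlaneNetworkInput` and `ValidateNetworkSource` are hypothetical names, and the struct is a trimmed-down stand-in for the real `ROSAControlPlane` spec, not its actual Go type.

```go
package main

import (
	"errors"
)

// ControlPlaneNetworkInput is a hypothetical, reduced view of the
// network-related fields of a ROSAControlPlane spec.
type ControlPlaneNetworkInput struct {
	RosaNetworkRef    string   // name of the referenced ROSANetwork, empty if unset
	Subnets           []string // explicitly listed subnet IDs
	AvailabilityZones []string // explicitly listed availability zones
}

// ValidateNetworkSource rejects inputs that combine a ROSANetwork reference
// with explicit subnets and / or availability zones.
func ValidateNetworkSource(in ControlPlaneNetworkInput) error {
	if in.RosaNetworkRef == "" {
		// No reference: explicit subnets and zones are allowed.
		return nil
	}
	if len(in.Subnets) > 0 || len(in.AvailabilityZones) > 0 {
		return errors.New("rosaNetworkRef cannot be combined with explicit subnets or availability zones")
	}
	return nil
}
```

In a real admission webhook this check would run on both create and update, so that a reference cannot be added later to a control plane that already lists subnets.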
| 216 | +### User Stories |
| 217 | +
|
| 218 | +1. As a CAPA user, I want to be able to provision the network infrastructure for ROSA HCP. |
| 219 | +2. As a CAPA user, I want to be able to use the provisioned network infrastructure in ROSA HCP control plane. |
| 220 | +3. As a CAPA user, I want to be able to delete the network infrastructure previously provisioned by CAPA. |
| 221 | +
|
| 222 | +#### Functional Requirements |
| 223 | +
|
| 224 | +1. Ability to create a new namespaced custom resource `ROSANetwork` with five attributes: name, AZ count, list of availability zones, region and CIDR block for the VPC. |
| 225 | +2. Reconciler implementing creation and deletion of the `ROSANetwork` resource. |
| 226 | +3. Ability to reference the new custom resource from ROSA HCP control plane. |
| 227 | +
|
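To make the first requirement concrete, the `spec` fields named in the proposal could be modelled and checked as below. This is a sketch only: the Go field names mirror the YAML keys from the examples, while the `ValidateSpec` helper and its rules (AZ count and AZ list treated as mutually exclusive, zones expected to start with the region name) are assumptions made for illustration and are not confirmed by the proposal.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"strings"
)

// ROSANetworkSpec mirrors the spec keys shown in the proposal's YAML examples.
type ROSANetworkSpec struct {
	AvailabilityZoneCount int
	AvailabilityZones     []string
	Region                string
	CIDRBlock             string
}

// ValidateSpec is a hypothetical helper applying a few basic sanity checks.
func ValidateSpec(s ROSANetworkSpec) error {
	if s.Region == "" {
		return errors.New("region must be set")
	}
	if _, _, err := net.ParseCIDR(s.CIDRBlock); err != nil {
		return fmt.Errorf("invalid cidrBlock %q: %w", s.CIDRBlock, err)
	}
	// Assumption: the AZ count and the explicit AZ list are alternatives.
	if s.AvailabilityZoneCount > 0 && len(s.AvailabilityZones) > 0 {
		return errors.New("set either availabilityZoneCount or availabilityZones, not both")
	}
	// Assumption: zone names are the region name plus a suffix, e.g. us-west-2a.
	for _, az := range s.AvailabilityZones {
		if !strings.HasPrefix(az, s.Region) {
			return fmt.Errorf("availability zone %q is not in region %q", az, s.Region)
		}
	}
	return nil
}
```

Both spec examples from the proposal (one with `availabilityZoneCount: 3`, one with `us-west-2a` and `us-west-2d`) pass these checks.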
| 228 | +## Alternatives |
| 229 | +
|
| 230 | +1. Implement CRDs and reconcilers for each of the atoms of network infrastructure (VPCs, subnets, etc.). |
| 231 | +2. Implement the network infrastructure similar to EKS, the network parameters being attributes of the EKS control plane. |
| 232 | +3. Not implement anything and rely purely on AWS CLI or Terraform. |
| 233 | +
|
| 234 | +## Upgrade Strategy |
| 235 | +
|
| 236 | +The implementation will not affect CAPA upgrades. |
| 237 | +
|
| 238 | +<!-- Links --> |
| 239 | +[community meeting]: https://docs.google.com/document/d/1ushaVqAKYnZ2VN_aa3GyKlS4kEd6bSug13xaXOakAQI/edit#heading=h.pxsq37pzkbdq |