OSDOCS-986 GCP UPI shared VPC #22332
Conversation
@jstuever, will you PTAL?
@shellyyang1989, will you PTAL?
Although the shared VPC and the host project should be pre-existing resources, IMO we still need to provide an end-to-end solution so that customers can deploy OCP from scratch. Therefore, I suggest removing this line and adding a section that describes how to configure the host project and the shared VPC.
@shellyyang1989 or @jstuever, do you have more details about the host project and VPC configuration?
To learn more about how to enable shared VPC, see https://cloud.google.com/vpc/docs/provisioning-shared-vpc#setting_up. Once shared VPC is enabled, all of the VPC networks in the host project can be accessed by the service project.
The host project is dedicated to hosting the VPC network and DNS, which do not differ much from those in a stand-alone project.
The host project should have these:
- A service account with these roles enabled:
  - Compute Network User
  - Compute Security Admin
  - Deployment Manager Editor
  - DNS Administrator
  - Security Admin
  - Network Management Admin
- The user-managed service account of the service project granted the 'Compute Network User' role on the host project
- The Google-managed service account of the service project, i.e. [email protected], granted the 'Compute Network User' role on the host project
- A public DNS zone for external DNS resolution
- A VPC network
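The role grants described above could be sketched with gcloud roughly as follows; the project and service account names are hypothetical placeholders, not values from this PR:

```shell
# Hypothetical project and account names -- substitute your own.
# Run these as an admin of the host project.
gcloud projects add-iam-policy-binding example-host-project \
  --member="serviceAccount:installer@example-service-project.iam.gserviceaccount.com" \
  --role="roles/compute.networkUser"

# The Google-managed service account of the service project also needs the role.
gcloud projects add-iam-policy-binding example-host-project \
  --member="serviceAccount:123456789012@cloudservices.gserviceaccount.com" \
  --role="roles/compute.networkUser"
```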
In the shared VPC scenario, there are two kinds of projects: the host project, which hosts the shared VPC network, and the service project, which hosts the OCP cluster. I think you'd like to describe the service project in this section, so I suggest updating the title to Configuring your GCP service project to make it clearer. To learn more about shared VPC, see https://cloud.google.com/vpc/docs/shared-vpc#shared_vpc_networks.
Again, "service" is missing before "project".
As mentioned before, add a "Configuring your GCP host project" section here.
In the shared VPC scenario, the host project hosts the DNS zone for the cluster, which means DNS is in a different project from the OCP cluster. Please update the content of this section accordingly.
A service account in the host project is also required. The required roles are:
- Compute Network User
- Compute Security Admin
- Deployment Manager Editor
- DNS Administrator
- Security Admin
- Network Management Admin
network, controlPlaneSubnet, and computeSubnet are not required in install-config.yaml. If you specify them, you get the error below, because the network is in the host project instead of the service project:
FATAL failed to fetch Master Machines: failed to load asset "Install Config": platform.gcp.network: Invalid value: "aos-qe-network": failed to get network aos-qe-network: googleapi: Error 404: The resource 'projects/openshift-qe/global/networks/aos-qe-network' was not found, notFound
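For illustration, the platform stanza without those keys might look like this (a sketch; the project ID and region values are hypothetical):

```yaml
# Hypothetical values. Note there are no network, controlPlaneSubnet, or
# computeSubnet keys: the VPC lives in the host project, so the installer
# cannot resolve those names in the service project.
platform:
  gcp:
    projectID: example-service-project
    region: us-central1
```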
Regarding this: Required. The installation program prompts you for this value.
Please state explicitly that this is the public DNS zone that resides in the host project.
The replica count of the worker nodes should be 0.
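That is, the compute stanza of install-config.yaml would be set along these lines (a sketch; only the replicas value is the point here):

```yaml
# Sketch: worker machines are created manually in UPI, so replicas is 0.
compute:
- hyperthreading: Enabled
  name: worker
  replicas: 0
```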
Hi @jstuever, please help confirm: in the shared VPC scenario, is creating the worker nodes manually the only supported approach?
In a UPI installation in a standalone project, the cluster can provision the worker nodes; does that work for shared VPC? I actually tried it, but the worker nodes were not created, and I am not sure if that is expected.
This file is suffixed '.yaml', not '.yml'. Also, an absolute path would be better.
The value of network-project-id is the ID of the host project.
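As a sketch, the relevant fragment of the GCE cloud provider config might read as follows; both project IDs are hypothetical:

```ini
# Hypothetical project IDs: project-id is the service project that hosts the
# cluster; network-project-id is the host project that hosts the shared VPC.
[global]
project-id         = example-service-project
network-project-id = example-host-project
```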
For an external cluster, we need to change the load balancer to external before creating the ignition files:
1. Open the <installation_directory>/manifests/cluster-ingress-default-ingresscontroller.yaml file.
2. Replace the value of the 'scope' parameter with External.
The content of this file looks like this:
apiVersion: operator.openshift.io/v1
kind: IngressController
metadata:
  creationTimestamp: null
  name: default
  namespace: openshift-ingress-operator
spec:
  endpointPublishingStrategy:
    loadBalancer:
      scope: External
    type: LoadBalancerService
status:
  availableReplicas: 0
  domain: ''
  selector: ''
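The scope replacement described above could be scripted as a small helper; the manifest path is the one discussed in this comment and is passed in by the caller:

```shell
# Sketch: switch the default IngressController to an external load balancer
# before creating the ignition files.
set_ingress_scope_external() {
  # $1: path to cluster-ingress-default-ingresscontroller.yaml
  sed -i 's/scope: Internal/scope: External/' "$1"
}
```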
I think an absolute path would be better.
For $ jq -r .infraID /<installation_directory>/metadata.json
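A minimal sketch of that extraction as a reusable helper; the metadata.json path is whatever your installation directory contains:

```shell
# Extract the cluster's infrastructure ID from the installer-generated
# metadata.json. INFRA_ID is used as a prefix for the deployment names.
extract_infra_id() {
  # $1: path to metadata.json
  jq -r .infraID "$1"
}
```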
Regarding 01_vpc.py, it's recommended to use an auto_only network. Please get the latest VPC deployment config here: https://github.com/openshift/installer/blob/master/upi/gcp/01_vpc.py. You haven't completed the VPC network creation section, right? As the VPC network is hosted by the host project, we need to use --project and --account to specify the host project and the service account of the host project, e.g.
In the Creating networking and load balancing components in GCP section:
gcloud deployment-manager deployments create ${INFRA_ID}-dns --config ${DIR}/02_dns.yaml --project ${HOST_PROJECT} --account ${HOST_PROJECT_ACCOUNT}
For an external cluster, both an external LB and an internal LB are required. To create the LBs:
gcloud deployment-manager deployments create ${INFRA_ID}-lb --config 02_lb.yaml
export CLUSTER_IP=$(gcloud compute addresses describe ${INFRA_ID}-cluster-ip --region=${REGION} --format json | jq -r .address)
For an internal cluster, only an internal LB is required. To create the internal LB only:
gcloud deployment-manager deployments create ${INFRA_ID}-lb --config ${DIR}/02_lb.yaml || exit 2
export CLUSTER_IP=$(gcloud compute addresses describe ${INFRA_ID}-cluster-ip --region=${REGION} --format json | jq -r .address)
There is a redundant space between > and '.
Sorry for misleading you. As the variable INFRA_ID is not available here, we can change ${INFRA_ID}-vpc to <vpc_deployment_name> or something else. And we need to define HOST_PROJECT and HOST_PROJECT_ACCOUNT before this command line.
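For example, the two variables could be defined like this before running the command; the values shown are hypothetical and must be replaced with your own:

```shell
# Hypothetical example values -- replace with your host project ID and the
# service account that holds the required roles in the host project.
export HOST_PROJECT="example-host-project"
export HOST_PROJECT_ACCOUNT="installer@example-host-project.iam.gserviceaccount.com"
```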
Please change control plane to compute.
As we define REGION in the Creating the VPC section, please add a note below to explain that this region should be consistent with REGION. That is, the region in which the cluster will be installed should be the same as the region that the VPC network resides in.
There is a redundant space in HOST_ PROJECT.
This is not required since master-subnet is not used.
As HOST_PROJECT_CONTROL_SUBNET and HOST_PROJECT_COMPUTE_SUBNET are defined in the Creating the VPC section, please remove them from this section.
I only see where these variables are used in this file now, so if you're looking through the preview, it might not have refreshed right. I'm uploading a fresh preview now.
worker-subnet is not used, so please remove this line.
Adding "also" before "copy" might make it clearer.
Adding "also" before "export" might make it clearer.
Adding "also" before "add" might make it clearer.
This section seems to duplicate Creating cluster-wide firewall rules for a shared VPC in GCP. Please check.
@shellyyang1989, it looks like the options in the commands are different, and I am not sure if they're trying to do exactly the same thing or not.
I don't work with the devs who work on ingress, so you know more about this than I do. What can I do today to make this work for GA, and will you file a bug for what you think we need to figure out to make this section right?
@kalexand-rh I don't find that the command lines are different. If you're seeing any difference, please let me know.
Per the upstream doc, there are two approaches to adding firewall rules: one is adding firewall rules based on cluster events, and the other is adding cluster-wide firewall rules. So I suggest changing the section names as below:
Adding the ingress firewall rules
Adding firewall rules based on cluster events
Adding cluster-wide firewall rules (alternative)
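As an illustration of the cluster-wide alternative, a single ingress rule in the host project's shared VPC might be created like this; all names and ranges are hypothetical:

```shell
# Hypothetical names -- a cluster-wide rule allowing ingress traffic to the
# workers, created in the host project because the shared VPC lives there.
gcloud compute firewall-rules create example-cluster-ingress \
  --project=example-host-project \
  --network=example-network \
  --allow=tcp:80,tcp:443 \
  --source-ranges=0.0.0.0/0 \
  --target-tags=example-cluster-worker
```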
I checked with the author of the dev docs, and dev's preference is to use the event-based rules to better mimic the IPI functionality. I structured the docs to mimic that.
I see. What I meant was that we need to instruct that adding cluster-wide firewall rules is an alternative to the event-based rules: if there is any issue with events, the user can add cluster-wide firewall rules instead.
lamek left a comment
LGTM. I left a few quick comments.
Missing word? "... firewall rules in Google Cloud..."
Repeated word "that".
Slight reword suggestion:
"While making the service account an Owner of the project is the easiest way to gain the required permissions, this gives the service account complete control over the project. You must determine if the risk that comes from offering that power is acceptable."
Missing article: "...have a project that hosts a shared VPC network..."
LGTM except for the copy buttons and the zone example for bootstrap, master, worker.
Great! I am going to fix the copy buttons in a separate PR because it's going to involve a lot of code block changes. I fixed the zones to use us-central1. Per the GCP docs, us-central1 uses a,b,c,f instead of b,c,d like the us-east1 examples, so I'm adjusting those references, too.
@vikram-redhat approved merging this change with the final amendments.
/cherrypick enterprise-4.5
@kalexand-rh: new pull request created: #23628
See https://issues.redhat.com/browse/OSDOCS-986