diff --git a/docs/design/openstack/networking-infrastructure.md b/docs/design/openstack/networking-infrastructure.md
new file mode 100644
index 00000000000..9242d2f4a40
--- /dev/null
+++ b/docs/design/openstack/networking-infrastructure.md
@@ -0,0 +1,68 @@
+# OpenStack IPI Networking Infrastructure
+
+The `OpenStack` platform installer uses an internal networking solution that
+is based heavily on the [baremetal networking infrastructure](../baremetal/networking-infrastructure.md).
+For an overview of the quotas required and the entrypoints created when
+you build an OpenStack IPI cluster, see the [user docs](../../user/openstack/README.md).
+
+## Load-balanced control plane access
+
+Access to the Kubernetes API (port 6443) from clients both external
+and internal to the cluster, and access to ignition configs (port 22623) from clients within the
+cluster, is load-balanced across control plane machines.
+These services are initially hosted by the bootstrap node until the control
+plane is up. Then, control is pivoted to the control plane machines. We will go into further detail on
+that process in the [Virtual IPs section](#virtual-ips).
+
+## CoreDNS-mDNS
+
+https://github.com/openshift/CoreDNS-mdns/
+
+The `mDNS` plugin for `CoreDNS` was developed to perform DNS lookups
+based on discoverable information from mDNS. This plugin resolves both the
+`etcd-NNN` records and the `_etcd-server-ssl._tcp.` SRV record. It is also
+able to resolve the names of the nodes.
+
+The list of `etcd` hosts included in the SRV record is based on the list of
+control plane nodes currently running.
+
+The IP addresses that the `etcd-NNN` host records resolve to come from the
+mDNS advertisement sent out by the `mDNS-publisher` on that control plane node.
+
+## Virtual IPs
+
+We use virtual IP addresses (VIPs), managed by Keepalived,
+to provide highly available access to essential APIs and services. For more information
+on how this works, please read about [Keepalived](https://www.keepalived.org/) and
+the underlying [VRRP protocol](https://en.wikipedia.org/wiki/Virtual_Router_Redundancy_Protocol)
+that it runs. In our current implementation, we manage 3 highly available VIPs:
+the Ingress VIP handles requests to services managed by OpenShift, the DNS VIP handles internal DNS requests, and the
+API VIP handles requests to the OpenShift API. The VIP addresses are chosen and validated from the nodes' subnet by the OpenShift
+installer; however, the services we run to manage the internal networking infrastructure, such as Keepalived,
+DNS, and the load balancers, are managed by the
+[machine-config-operator (MCO)](https://github.com/openshift/machine-config-operator/tree/master/docs).
+The MCO is configured to run static pods on the bootstrap, master, and worker nodes that
+run our internal networking infrastructure. Files run on the bootstrap node can be found
+[here](https://github.com/openshift/machine-config-operator/tree/master/manifests/openstack).
+Files run on both master and worker nodes can be found
+[here](https://github.com/openshift/machine-config-operator/tree/master/templates/common/openstack/files).
+Files run only on master nodes can be found
+[here](https://github.com/openshift/machine-config-operator/tree/master/templates/master/00-master/openstack/files).
+Lastly, files run only on worker nodes can be found
+[here](https://github.com/openshift/machine-config-operator/tree/master/templates/worker/00-worker/openstack/files).
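+
+The [Infrastructure Walkthrough](#infrastructure-walkthrough) below describes how Keepalived
+weights each node based on local health checks. As a concrete illustration, a check for the
+API VIP could look roughly like the sketch below; the endpoint, flags, and timeout are
+assumptions, not the exact checks rendered by the MCO.
+
+```sh
+#!/usr/bin/env bash
+# Hypothetical API VIP health check: succeed only when the OpenShift API answers locally.
+# While this check passes, Keepalived raises the node's VRRP priority for the API VIP.
+curl --fail --silent --insecure --max-time 2 https://localhost:6443/healthz
+```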
+
+## Infrastructure Walkthrough
+
+The bootstrap node is responsible for running temporary networking infrastructure while the master
+nodes are still coming up. The bootstrap node runs a CoreDNS instance, as well as
+Keepalived. While the bootstrap node is up, it has priority running the API and DNS
+VIPs.
+
+The master nodes run DHCP, HAProxy, CoreDNS, mDNS-publisher, and Keepalived. HAProxy load-balances incoming requests
+to the API across all running masters. It also runs a stats and health check server. Keepalived manages all 3 VIPs on the masters, where each
+master has an equal chance of being assigned one of the VIPs. Initially, the bootstrap node has the highest priority for hosting the DNS
+and API VIPs, so they will point to addresses there at startup. Meanwhile, the master nodes work to bring up the control plane and the OpenShift API. Keepalived implements periodic health checks for each VIP that are used to determine the weight assigned to each server. The server with the highest weight is assigned the VIP. Keepalived has two separate health checks that attempt to reach the OpenShift API and CoreDNS on the localhost of each master node. When the API or DNS on a master node is reachable, Keepalived substantially increases its weight for that VIP, making its priority higher than that of the bootstrap node and any node that does not yet have that service running. This ensures that nodes that are incapable of serving DNS records or the OpenShift API do not get assigned the respective VIP. The Ingress VIP is also managed by a health check that queries for an OCP router HAProxy health check, not the HAProxy we stand up in static pods for the API. This makes sure that the Ingress VIP points to a server that is running the necessary OpenShift Ingress Operator resources to enable external access to the node.
+
+The worker nodes run DHCP, CoreDNS, mDNS-publisher, and Keepalived. On workers, Keepalived is only responsible for managing
+the Ingress VIP. Its algorithm is the same as the one run on the masters.
\ No newline at end of file
diff --git a/docs/user/openstack/README.md b/docs/user/openstack/README.md
index 29f835ad7fb..5eec7ac83d2 100644
--- a/docs/user/openstack/README.md
+++ b/docs/user/openstack/README.md
@@ -5,14 +5,18 @@ Support for launching clusters on OpenStack is **experimental**.
This document discusses the requirements, current expected behavior, and how to
try out what exists so far.

-## OpenStack Requirements
+## OpenStack Credentials
+
+There are two ways to pass your credentials to the installer: with a `clouds.yaml` file or with environment variables. You can also use a combination of the two, but be aware that the `clouds.yaml` file takes precedence over the environment variables you set.

-The installer assumes the following about the OpenStack cloud you run against:
+The installer will look for a `clouds.yaml` file in the following locations, in order:
+
+1. The path set in the `OS_CLIENT_CONFIG_FILE` environment variable (see the example below)
+2. The current directory
+3. The Unix-specific user config directory (`~/.config/openstack/clouds.yaml`)
+4. The Unix-specific site config directory (`/etc/openstack/clouds.yaml`)

-* You must create a `clouds.yaml` file with the auth URL and credentials
-  necessary to access the OpenStack cloud you want to use. Information on
-  this file can be found at
-  https://docs.openstack.org/openstacksdk/latest/user/config/configuration.html#config-files
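+
+For example, a minimal sketch of the first option in the list above; the path is an
+assumption, not a required location:
+
+```sh
+# Point the clouds.yaml lookup at an explicit file instead of the default locations.
+export OS_CLIENT_CONFIG_FILE="$HOME/my-cluster/clouds.yaml"
+```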
+In many OpenStack distributions, you can get a `clouds.yaml` file through Horizon. If you can't, you can make a `clouds.yaml` file yourself. Information on
+this file can be found at https://docs.openstack.org/openstacksdk/latest/user/config/configuration.html#config-files and it looks like:
```
clouds:
@@ -33,7 +37,43 @@ clouds:
      auth_url: 'https://10.10.14.22:5001/v2.0'
```
-* Swift must be enabled. The user must have `swiftoperator` permissions and
+If you choose to use environment variables in place of a `clouds.yaml` file, or alongside it, consult the following documentation:
+https://www.terraform.io/docs/providers/openstack/#configuration-reference
+
+## OpenStack Requirements
+
+### Recommended Minimums
+
+To run the latest version of the installer in OpenStack, you need at least the following quota to run a *default* cluster. While it is possible to run the cluster with fewer resources than this, it is not recommended. Certain edge cases, such as deploying [without FIPs](#without-floating-ips) or deploying with an [external load balancer](#using-an-external-load-balancer), are documented below and are not included in the scope of this recommendation.
+
+ * OpenStack Quota
+   * Floating IPs: 3
+   * Security Groups: 3
+   * Security Group Rules: 60
+   * Routers: 1
+   * Subnets: 1
+   * RAM: 112 GB
+   * VCPU: 28
+   * Volume Storage: 175 GB
+   * Instances: 7
+
+#### Master Nodes
+
+The default deployment stands up 3 master nodes, the minimum number required for a cluster. For each master node, you will need 1 instance and 1 port available in your quota. They should be assigned a flavor with at least 16 GB RAM, 4 VCPUs, and a 25 GB disk. It is theoretically possible to run with a smaller flavor, but be aware that if it takes too long to stand up services, or certain essential services crash, the installer could time out, leading to a failed install.
+
+#### Worker Nodes
+
+The default deployment stands up 3 worker nodes. In our testing, we determined that 2 is the minimum number of workers needed for a successful install, but we don't recommend running with that few. Worker nodes host the apps you run on OpenShift, so it is in your best interest to have more of them. See [here](https://docs.openshift.com/enterprise/3.0/architecture/infrastructure_components/kubernetes_infrastructure.html#node) for more information. The flavor assigned to the worker nodes should have at least 2 VCPUs, 8 GB RAM, and a 25 GB disk. However, if you are experiencing `Out Of Memory` issues, or your installs are timing out, you should increase the size of your flavor to match the masters: 4 VCPUs and 16 GB RAM.
+
+#### Bootstrap Node
+
+The bootstrap node is a temporary node that is responsible for standing up the control plane on the masters. Only one bootstrap node will be stood up, and it requires 1 instance and 1 port. We recommend a flavor with a minimum of 16 GB RAM, 4 VCPUs, and a 25 GB disk.
+
+### Swift
+
+Swift must be enabled. The user must have `swiftoperator` permissions and
  `temp-url` support must be enabled. As an OpenStack admin:
  * `openstack role add --user <user> --project <project> swiftoperator`
  * `openstack object store account set --property Temp-URL-Key=superkey`
@@ -42,9 +82,14 @@ clouds:
  enough to store the ignition config files, so they are served by swift instead.
* You may need to increase the security group related quotas from their default
-  values. For example (as an OpenStack admin) `openstack quota set --secgroups 100 --secgroup-rules 1000 <project>`
+  values. For example (as an OpenStack admin) `openstack quota set --secgroups 8 --secgroup-rules 100 <project>`
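+
+Before installing, it can also help to compare your project's current quota and usage against
+the recommended minimums above. A rough sketch using standard `openstack` CLI commands (not
+part of the installer flow):
+
+```sh
+# Show the quotas and current absolute usage for the project you plan to install into.
+openstack quota show
+openstack limits show --absolute
+```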
+
+### RHCOS Image
+
-* The installer requires a proper RHCOS image in the OpenStack cluster or project:
+If you do not have a Red Hat CoreOS (RHCOS) image already, or are looking for the latest one,
+you can [download it here](https://mirror.openshift.com/pub/openshift-v4/dependencies/rhcos/pre-release/latest/).
+
+The installer requires a proper RHCOS image in the OpenStack cluster or project:
  `openstack image create --container-format=bare --disk-format=qcow2 --file rhcos-${RHCOSVERSION}-openstack.qcow2 rhcos`

**NOTE:** Depending on your OpenStack environment you can upload the RHCOS image
@@ -72,37 +117,27 @@ documentation in that repository for further details.

* https://github.com/shiftstack-dev-tools/ocp-doit

-## OpenShift API Access
-
-All the OpenShift nodes are created in an OpenStack tenant network and as such, can't be accessed directly. The installer does not create any floating IP addresses.
-
-However, the installer does need access to the OpenShift's API as it is being deployed.
+## API Access
-There are two ways you can handle this.
+
+All the OpenShift nodes are created in an OpenStack tenant network and as such can't be accessed directly in most OpenStack deployments. The installer does not create any floating IP addresses, but it does need access to the OpenShift API as it is being deployed. We will briefly explain how to set up access to the OpenShift API with and without floating IP addresses.
+
-### Bring Your Own Cluster IP
+### Using Floating IPs
+
-We recommend you create a floating IP address ahead of time, add the
-API record to your own DNS and let the installer use that address.
+This method allows you to attach two floating IP addresses to endpoints in OpenShift.
+
-First, create the floating IP:
+First, create a floating IP address for the API:

    $ openstack floating ip create <external network>

-Note the actual IP address. We will use `10.19.115.117` throughout this
-document.
-
Next, add the `api.<cluster name>.<base domain>.` and `*.apps.<cluster name>.<base domain>.` name records
pointing to that floating IP to your DNS:

-    api.ostest.shiftstack.com IN A 10.19.115.117
-    *.apps.ostest.shiftstack.com IN A 10.19.115.117
+    api.example.shiftstack.com IN A <API floating IP>

If you don't have a DNS server under your control, you finish the installation by adding the following to your `/etc/hosts`:

-    10.19.115.117 api.ostest.shiftstack.com
-    10.19.115.117 console-openshift-console.apps.ostest.shiftstack.com
+    <API floating IP> api.example.shiftstack.com

**NOTE:** *this will make the API accessible only to you. This is fine for your own testing (and it is enough for the installation to succeed), but it is not
@@ -122,30 +157,48 @@ platform:
    cloud: standalone
    externalNetwork: public
    region: regionOne
-    computeFlavor: m1.medium
-    lbFloatingIP: "10.19.115.117"
+    computeFlavor: m1.large
+    lbFloatingIP: "<API floating IP>"
```
-This will let you do a fully unattended end to end deployment.
+At the time of writing, you will have to create a second floating IP and attach it to the ingress port if you want to be able to reach `*.apps` externally.
+This can be done in three steps after the install completes:
+
+Get the ID of the ingress port:
+
+```sh
+openstack port list | grep "ingress-port"
+```
-### No Floating IP
+
+Create and associate a floating IP to the ingress port:
-If you don't want to pre-create a floating IP address, you will still want to create the API DNS record or the installer will fail waiting for the API.
+
+```sh
+openstack floating ip create --port <ingress port ID>
+```
+
+Lastly, add an A record for `*.apps` pointing to that floating IP in your DNS:
+
+```
+*.apps.example.shiftstack.com IN A <Ingress floating IP>
+```
+
+OR add an A record in `/etc/hosts`:
+
+```
+<Ingress floating IP> console-openshift-console.apps.example.shiftstack.com
+```
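+
+If you created real DNS records (rather than `/etc/hosts` entries), a quick sanity check is
+to confirm that both names resolve to the floating IPs you created. This is a sketch, so
+substitute your own cluster domain:
+
+```sh
+dig +short api.example.shiftstack.com
+dig +short console-openshift-console.apps.example.shiftstack.com
+```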
+
+### Without Floating IPs
+
+If you don't want to pre-create a floating IP address, you will still want to create the API DNS record, or the installer will fail waiting for the API.

Without the floating IP, you won't know the right IP address of the server ahead of time, so you will have to wait for it to come up and create the DNS records then:

    $ watch openstack server list

Wait for the `<cluster name>-api` server comes up and you can make your changes then.
-
**WARNING:** The installer will fail if it can't reach the bootstrap OpenShift API in 30 minutes.
-
Even if the installer times out, the OpenShift cluster should still come up. Once the bootstrapping process is in place, it should all run to completion.
-
So you should be able to deploy OpenShift without any floating IP addresses and DNS records and create everything yourself after the cluster is up.
-
## Current Expected Behavior

As mentioned, OpenStack support is still experimental. Currently:
@@ -160,51 +213,6 @@ OpenShift API and as an internal DNS for the instances
The installer should finish successfully, though it is still undergoing development and things might break from time to time.

-### Workarounds
-
-#### External DNS
-
-While deploying the cluster, the installer will hang trying to reach the API as
-the node running the installer cannot resolve the service VM (the cluster
-should still come up successfully within the isolated network).
-
-You can add the service VM floating IP address at the top of your `/etc/resolv.conf`:
-
-```
-$ cat /etc/resolv.conf
-# Generated by NetworkManager
-search example.com
-# OpenShift Service VM DNS:
-nameserver 10.19.115.117
-
-# Your previous DNS servers:
-nameserver 83.240.0.215
-nameserver 83.240.0.136
-```
-
-(the service VM floating IP is `10.19.115.117` in this example)
-
-If you don't want to update your DNS config, you can add a couple of entries in your `/etc/hosts` file instead:
-
-```
-$ cat /etc/hosts
-127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
-::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
-10.19.115.117 <cluster name>-api.<domain>
-10.19.115.117 console-openshift-console.apps.<cluster name>.<domain>
-```
-
-If you do expose the cluster, the installer should complete successfully.
-
-It will print the console URL, username and password and you should be able to go there and log in.
-
-```
-INFO Install complete!
-INFO To access the cluster as the system:admin user when using 'oc', run 'export KUBECONFIG=/path/to/installer/auth/kubeconfig'
-INFO Access the OpenShift web-console here: https://console-openshift-console.apps.ostest.shiftstack.com
-INFO Login to the console with user: kubeadmin, password: 5char-5char-5char-5char
-```
-
## Using an External Load Balancer

This documents how to shift from the API VM load balancer, which is