diff --git a/docs/img/deploy-a-cluster/teleport-ha-architecture.png b/docs/img/deploy-a-cluster/teleport-ha-architecture.png index b52b9df25508f..0e8e78c00fe90 100644 Binary files a/docs/img/deploy-a-cluster/teleport-ha-architecture.png and b/docs/img/deploy-a-cluster/teleport-ha-architecture.png differ diff --git a/docs/pages/deploy-a-cluster/high-availability.mdx b/docs/pages/deploy-a-cluster/high-availability.mdx index cd25044e08bbe..e9d635c2e9ef7 100644 --- a/docs/pages/deploy-a-cluster/high-availability.mdx +++ b/docs/pages/deploy-a-cluster/high-availability.mdx @@ -1,6 +1,7 @@ --- title: "Deploying a High Availability Teleport Cluster" description: "Deploying a High Availability Teleport Cluster" +tocDepth: 3 --- When deploying Teleport in production, you should design your deployment to @@ -16,14 +17,17 @@ deployment. ## Overview -A high-availability Teleport cluster revolves around a group of redundant -`teleport` processes, each of which runs the Auth Service and Proxy Service, -plus the infrastructure required to support them. - -This includes: - -- **A Layer 4 load balancer** to direct traffic from users and services to an - available `teleport` process. +A high-availability Teleport cluster revolves around two pools of redundant +`teleport` processes, one running the Auth Service and one running the Proxy +Service, plus the infrastructure required to support each pool. + +Infrastructure components include: +- A **public Layer 4 load balancer** to direct traffic from users and services + to an available Proxy Service instance. +- A **private Layer 4 load balancer** to direct traffic from the Proxy Service + to the Auth Service's gRPC API, which is how Teleport manages the Auth + Service's backend state and provides credentials to users and services in your + cluster. - A **cluster state backend**. This is a key-value store for cluster state and audit events that all Auth Service instances can access. 
This requires permissions for Auth Service instances to manage records within the key-value @@ -44,21 +48,32 @@ This includes: ![Diagram of a high-availability Teleport architecture](../../img/deploy-a-cluster/teleport-ha-architecture.png) -## Layer 4 load balancer +## Layer 4 load balancers -The load balancer forwards traffic from users and services to an available -Teleport instance. This must not terminate TLS, and must transparently forward -the TCP traffic it receives. In other words, this must be a Layer 4 load -balancer, not a Layer 7 (e.g., HTTP) load balancer. +High-availability Teleport clusters require two load balancers: +- **Proxy Service load balancer:** A load balancer to receive traffic from + outside the network where your Teleport cluster is running and forward it to + an available Proxy Service instance. This load balancer handles TCP traffic + from users and services in a variety of application-layer protocols. +- **Auth Service load balancer:** A load balancer to forward traffic from a + Proxy Service instance to an available Auth Service instance. This handles TLS + traffic to the Auth Service's gRPC endpoint. -We recommend configuring your load balancer to route traffic across multiple +Both load balancers must transparently forward the TCP traffic they receive, +without terminating TLS. In other words, these must be Layer 4 load balancers, +not Layer 7 (e.g., HTTP). + +We recommend configuring your load balancers to route traffic across multiple zones (if using a cloud provider) or data centers (if using an on-premise solution) to ensure availability. -### TLS Routing +### Configuring the Proxy Service load balancer + +#### TLS Routing -Your load balancer configuration depends on whether you will enable [TLS -Routing](../management/operations/tls-routing.mdx) in your Teleport cluster. 
+The way you configure the Proxy Service load balancer depends on whether you +will enable [TLS Routing](../management/operations/tls-routing.mdx) in your +Teleport cluster. With TLS Routing, the Teleport Proxy Service uses application-layer protocol negotiation (ALPN) to handle all communication with users and services via the @@ -76,29 +91,30 @@ The approach we describe in this guide uses only a Layer 4 load balancer to minimize the infrastructure you will deploy, but users that require a separate load balancer for HTTPS traffic should disable TLS Routing. -### Configuring the load balancer +#### Open ports -Configure the load balancer to forward traffic from the following ports on the -load balancer to the corresponding port on an available Teleport instance. The -configuration depends on whether you will enable TLS Routing: +Configure the Proxy Service load balancer to forward traffic from the following +ports on the load balancer to the corresponding port on an available Proxy +Service instance. The configuration depends on whether you will enable TLS +Routing: -| Port | Description | -| - | - | -| `443` | ALPN port for TLS Routing. | +| Load Balancer Port | Proxy Service Port | Description | +| - | - | - | +| `443` | `3080` | ALPN port for TLS Routing. | These ports are required: -| Port | Description | -| - | - | -| `3023` | SSH port for clients connect to. | -| `3024` | SSH port used to create reverse SSH tunnels from behind-firewall environments. | -| `443` | HTTPS connections to authenticate `tsh` users into the cluster. The same connection is used to serve a Web UI. | +| Load Balancer Port | Proxy Service Port | Description | +| - | - | - | +| `3023` | `3023` | SSH port for clients to connect to. | +| `3024` | `3024` | SSH port used to create reverse SSH tunnels from behind-firewall environments. | +| `443` | `3080` | HTTPS connections to authenticate `tsh` users into the cluster. The same connection is used to serve a Web UI. 
| You can leave these ports closed if you are not using their corresponding services: @@ -112,6 +128,13 @@ services: +### Configuring the Auth Service load balancer + +The Auth Service load balancer must forward traffic to the Auth Service's gRPC +port. In this guide, we are assuming that you have configured the Auth Service +load balancer to forward traffic from port `3025` to port `3025` on an available +Auth Service instance. + ## Cluster state backend The Teleport Auth Service stores cluster state (such as dynamic configuration @@ -256,23 +279,30 @@ records. ## Teleport instances -Run the Teleport Auth Service and Proxy Service as a scalable group of compute -resources, for example, a Kubernetes `Deployment` or AWS Auto Scaling group. -This requires running the `teleport` binary on each Kubernetes pod or virtual -machine or in your group. +Run the Teleport Auth Service and Proxy Service as two scalable groups of +compute resources, for example, using Kubernetes Deployments or AWS Auto +Scaling groups. This requires running the `teleport` binary on each Kubernetes +pod or virtual machine in your group. + + + +If you plan to run Teleport on Kubernetes, the `teleport-cluster` Helm chart +deploys the Auth Service and Proxy Service pools for you. To see how to use this +Helm chart, read our [Helm Deployments](helm-deployments.mdx) documentation. + + You should deploy your Teleport instances across multiple zones (if using a cloud provider) or data centers (if using an on-premise solution) to ensure availability. -In the [Configuration](#configuration) section, we will show you how to -configure each binary for high availability. +### Proxy Service pool -### Open ports +#### Open ports -Ensure that, on each Teleport instance, the following ports allow traffic from -the load balancer. The Proxy Service uses these ports to communicate with -Teleport users and services. 
+Ensure that, on each Proxy Service instance, the following ports allow traffic +from the Proxy Service load balancer. The Proxy Service uses these ports to +communicate with Teleport users and services. As with your load balancer configuration, the ports you should open on your Teleport instances depend on whether you will enable TLS Routing: @@ -282,7 +312,7 @@ Teleport instances depend on whether you will enable TLS Routing: | Port | Description | | - | - | -| `443` | ALPN port for TLS Routing. | +| `3080` | ALPN port for TLS Routing. | @@ -293,7 +323,7 @@ These ports are required: | - | - | | `3023` | SSH port for clients connect to. | | `3024` | SSH port used to create reverse SSH tunnels from behind-firewall environments. | -| `443` | HTTPS connections to authenticate `tsh` users into the cluster. The same connection is used to serve a Web UI. | +| `3080` | HTTPS connections to authenticate `tsh` users into the cluster. The same connection is used to serve a Web UI. | You can leave these ports closed if you are not using their corresponding services: @@ -309,51 +339,19 @@ services: *This is the same table of ports you used to configure the load balancer.* -### License file - -If you are deploying Teleport Enterprise, you need to download a license file -and make it available to your Teleport Auth Service instances. - -To obtain your license file, visit the [Teleport customer -dashboard](https://dashboard.gravitational.com/web/login) and log in. Click -"DOWNLOAD LICENSE KEY". You will see your current Teleport Enterprise account -permissions and the option to download your license file: - -![License File modal](../../img/enterprise/license.png) - -The license file must be available to each Teleport Auth Service instance at -`/var/lib/teleport/license.pem`. - -### Configuration - -Create a configuration file and provide it to each of your Teleport instances at -`/etc/teleport.yaml`. 
We will explain the required configuration fields for a -high-availability Teleport deployment below. These are the minimum requirements, -and when planning your high-availability deployment, you will want to follow a -more specific [deployment guide](introduction.mdx) for your environment. - -#### `storage` - -The first configuration section to write is the `storage` section, which -configures the cluster state backend and session recording backend for the -Teleport Auth Service: - -```yaml -version: v3 -teleport: - storage: - # ... -``` +#### Configuration -Consult our [Backends Reference](../reference/backends.mdx) for the configuration -fields you should set in the `storage` section. +Create a configuration file and provide it to each of your Proxy Service +instances at `/etc/teleport.yaml`. We will explain the required configuration +fields for a high-availability Teleport deployment below. These are the minimum +requirements, and when planning your high-availability deployment, you will want +to follow a more specific [deployment guide](introduction.mdx) for your +environment. -#### `auth_service` and `proxy_service` +#### `proxy_service` and `auth_service` -The `auth_service` and `proxy_service` sections configure the Auth Service and -Proxy Service, which we will run together on each Teleport instance. The -configuration will depend on whether you are enabling TLS Routing in your -cluster: +The `proxy_service` section configures the Proxy Service. The configuration will +depend on whether you are enabling TLS Routing in your cluster: @@ -363,14 +361,8 @@ Teleport configuration: ```yaml version: v3 -teleport: - storage: - # ... 
auth_service: - enabled: true - cluster_name: "mycluster.example.com" - # Remove this if not using Teleport Enterprise - license_file: "/var/lib/license/license.pem" + enabled: false proxy_service: enabled: true public_addr: "mycluster.example.com:443" @@ -390,15 +382,9 @@ Teleport configuration: ```yaml version: v3 -teleport: - storage: - # ... auth_service: proxy_listener_mode: separate - enabled: true - cluster_name: "mycluster.example.com" - # Remove this if not using Teleport Enterprise - license_file: "/var/lib/license/license.pem" + enabled: false proxy_service: enabled: true listen_addr: 0.0.0.0:3023 @@ -416,21 +402,17 @@ reverse tunnel port (`tunnel_listen_addr`) for the Proxy Service. -The `auth_service` and `proxy_service` configurations above have the following -required settings for a high-availability Teleport deployment: +In the `proxy_service` section, we have enabled the Teleport Proxy Service +(`enabled`) and instructed it to find its TLS credentials in the +`/etc/teleport-tls` directory (`https_keypairs`). -- In the `auth_service` section, we have enabled the Teleport Auth Service - (`enabled`) and instructed it to find an Enterprise license file at - `/var/lib/license/license.pem` (`license_file`). Remove the `license_file` - field if you are deploying the open source edition of Teleport. -- In the `proxy_service` section, we have enabled the Teleport Proxy Service - (`enabled`) and instructed it to find its TLS credentials in the - `/etc/teleport-tls` directory (`https_keypairs`). +We have set `auth_service.enabled` to `false` to disable the Auth Service, which +is enabled by default, on each Proxy Service instance. #### `ssh_service` -You can disable the SSH Service on each Teleport instance by adding the -following to each instance's configuration file: +The SSH Service is enabled by default. 
You can disable the SSH Service on each +Teleport instance by adding the following to each instance's configuration file: ```yaml version: v3 @@ -451,6 +433,96 @@ should not have direct access to the underlying node. If you are deploying Teleport on a cluster of virtual machines, remove this line to run the SSH Service and enable secure access to the host. +### Auth Service pool + +#### Open ports + +Ensure that, on each Auth Service instance, the following ports are open: + +| Port | Description | +| - | - | +| `3025` | gRPC port to open to Proxy Service instances. | + +#### License file + +If you are deploying Teleport Enterprise, you need to download a license file +and make it available to your Teleport Auth Service instances. + +(!docs/pages/includes/enterprise/obtainlicense.mdx!) + +The license file must be available to each Teleport Auth Service instance at +`/var/lib/teleport/license.pem`. + +#### Configuration + +Create a configuration file and provide it to each of your Auth Service +instances at `/etc/teleport.yaml`. We will explain the required configuration +fields for a high-availability Teleport deployment below. These are the minimum +requirements, and when planning your high-availability deployment, you will want +to follow a more specific [deployment guide](introduction.mdx) for your +environment. + +#### `storage` + +The first configuration section to write is the `storage` section, which +configures the cluster state backend and session recording backend for the Auth +Service: + +```yaml +version: v3 +teleport: + storage: + # ... +``` + +Consult our [Backends Reference](../reference/backends.mdx) for the configuration +fields you should set in the `storage` section. + +#### `auth_service` and `proxy_service` + +The `auth_service` section configures the Auth Service: + +```yaml +version: v3 +teleport: + storage: + # ... 
+auth_service: + enabled: true + cluster_name: "mycluster.example.com" + # Remove this if not using Teleport Enterprise + license_file: "/var/lib/teleport/license.pem" +proxy_service: + enabled: false +``` + +In the `auth_service` section, we have enabled the Teleport Auth Service +(`enabled`) and instructed it to find an Enterprise license file at +`/var/lib/teleport/license.pem` (`license_file`). Remove the `license_file` field +if you are deploying the open source edition of Teleport. + +Since we are running Proxy Service instances in a dedicated pool, we have +disabled the Proxy Service on our Auth Service instances by setting +`proxy_service.enabled` to `false`. + +#### `ssh_service` + +As with the Proxy Service pool, you can disable the SSH Service on each Teleport +instance by adding the following to each instance's configuration file: + +```yaml +version: v3 +teleport: + storage: + # ... +auth_service: +# ... +proxy_service: +# ... +ssh_service: + enabled: false +``` + ## Next steps ### Refine your plan