**Merged** — 35 commits

| Commit | Message | Author | Date |
| - | - | - | - |
| `8d5948c` | Create aws-gslb-proxy-peering-ha-deployment.mdx | WilliamLoy | Apr 5, 2023 |
| `f7d74f6` | adding chart for new guide | WilliamLoy | Apr 5, 2023 |
| `41c829c` | adding image for new guide | WilliamLoy | Apr 5, 2023 |
| `d5f9d02` | Delete aws-gslb-proxy-peering-ha-deployment.png | WilliamLoy | Apr 5, 2023 |
| `110e020` | Addressing reviewer comments | WilliamLoy | Apr 6, 2023 |
| `0c7d7e7` | Update aws-gslb-proxy-peering-ha-deployment.mdx | WilliamLoy | Apr 6, 2023 |
| `3a730ba` | Update docs/pages/deploy-a-cluster/aws-gslb-proxy-peering-ha-deployme… | WilliamLoy | Apr 6, 2023 |
| `32dc2f6` | updated architecture diagram | WilliamLoy | Apr 6, 2023 |
| `df809ed` | Update aws-gslb-proxy-peering-ha-deployment.mdx | WilliamLoy | Apr 7, 2023 |
| `f6e656c` | addressing review comments | WilliamLoy | Apr 14, 2023 |
| `b9f27d9` | Update aws-gslb-proxy-peering-ha-deployment.mdx | WilliamLoy | Apr 24, 2023 |
| `e3808b9` | Delete aws-gslb-proxy-peering-ha-deployment.png | WilliamLoy | Apr 26, 2023 |
| `72b3858` | Add files via upload | WilliamLoy | Apr 26, 2023 |
| `e62ec1c` | Update docs/pages/deploy-a-cluster/aws-gslb-proxy-peering-ha-deployme… | WilliamLoy | Apr 27, 2023 |
| `35b67e6` | Update docs/pages/deploy-a-cluster/aws-gslb-proxy-peering-ha-deployme… | WilliamLoy | Apr 27, 2023 |
| `224eb92` | Update docs/pages/deploy-a-cluster/aws-gslb-proxy-peering-ha-deployme… | WilliamLoy | Apr 27, 2023 |
| `31d3290` | Update docs/pages/deploy-a-cluster/aws-gslb-proxy-peering-ha-deployme… | WilliamLoy | May 1, 2023 |
| `12e42db` | Update docs/pages/deploy-a-cluster/aws-gslb-proxy-peering-ha-deployme… | WilliamLoy | May 1, 2023 |
| `f5dfbe0` | Apply suggestions from code review | WilliamLoy | May 1, 2023 |
| `0d2805d` | Update aws-gslb-proxy-peering-ha-deployment.mdx | WilliamLoy | May 5, 2023 |
| `3267ac9` | Update docs/pages/deploy-a-cluster/aws-gslb-proxy-peering-ha-deployme… | pschisa | May 16, 2023 |
| `03cc94b` | Apply suggestions from code review | pschisa | May 16, 2023 |
| `5b07c4d` | Update aws-gslb-proxy-peering-ha-deployment.mdx | pschisa | May 16, 2023 |
| `14b2408` | Update aws-gslb-proxy-peering-ha-deployment.mdx | pschisa | May 16, 2023 |
| `b7f5303` | Update aws-gslb-proxy-peering-ha-deployment.mdx | pschisa | May 16, 2023 |
| `f689e4d` | Merge branch 'master' into Will/aws-gslb-proxy-peer-guide | pschisa | May 16, 2023 |
| `b175389` | Update aws-gslb-proxy-peering-ha-deployment.mdx | pschisa | May 16, 2023 |
| `10bcff3` | Apply suggestions from code review | pschisa | May 22, 2023 |
| `6a51f78` | Update aws-gslb-proxy-peering-ha-deployment.mdx | pschisa | May 22, 2023 |
| `3007583` | Update config.json | pschisa | May 22, 2023 |
| `2bf9b4a` | Update deployments.mdx | pschisa | May 22, 2023 |
| `90902e2` | Update introduction.mdx | pschisa | May 22, 2023 |
| `1b8b640` | Rename docs/pages/deploy-a-cluster/aws-gslb-proxy-peering-ha-deployme… | pschisa | May 22, 2023 |
| `c421fd8` | Merge branch 'master' into Will/aws-gslb-proxy-peer-guide | pschisa | May 22, 2023 |
| `50ff120` | Minor copy edits and linter fixes | ptgott | May 22, 2023 |
**docs/config.json** — 8 additions, 0 deletions

```diff
@@ -211,6 +211,14 @@
       "enterprise"
     ]
   },
+  {
+    "title": "AWS Multi-Region Proxy Deployment",
+    "slug": "/deploy-a-cluster/deployments/aws-gslb-proxy-peering-ha-deployment/",
+    "forScopes": [
+      "oss",
+      "enterprise"
+    ]
+  },
   {
     "title": "GCP",
     "slug": "/deploy-a-cluster/deployments/gcp/",
```
**docs/cspell.json** — 2 additions, 0 deletions

```diff
@@ -66,6 +66,8 @@
     "Gbps",
     "Goland",
     "Grafana's",
+    "gslb",
+    "GSLB",
     "Gtczk",
     "HSTS",
     "Hqlo",
```
(Binary file — the architecture diagram image cannot be displayed in the diff view.)
**docs/pages/deploy-a-cluster/deployments.mdx** — 6 additions, 1 deletion

```diff
@@ -7,6 +7,11 @@ layout: tocless-doc
 These guides show you how to set up a full self-hosted Teleport deployment on
 the platform of your choice.
 
-- [AWS Terraform](./deployments/aws-terraform.mdx): Deploy HA Teleport with Terraform Provider on AWS.
+- [AWS Terraform](./deployments/aws-terraform.mdx): Deploy HA Teleport with
+  Terraform Provider on AWS.
+- [AWS Multi-Region Proxy
+  Deployment](./deployments/aws-gslb-proxy-peering-ha-deployment.mdx): Deploy HA
+  Teleport with Proxy Service instances in multiple regions for low-latency
+  access.
 - [GCP](./deployments/gcp.mdx): Deploy HA Teleport on GCP.
 - [IBM Cloud](./deployments/ibm.mdx): Deploy HA Teleport on IBM cloud.
```
**aws-gslb-proxy-peering-ha-deployment.mdx** — new file, 247 additions
---
title: "AWS Multi-Region High Availability Deployment Guide"
description: "Deploying a high-availability Teleport cluster using Proxy Peering and Route 53 to create global server load balancing."
---

This deployment architecture features two important design decisions:

1. AWS Route 53 latency-based routing is used for global server load balancing
([GSLB](https://www.cloudflare.com/learning/cdn/glossary/global-server-load-balancing-gslb/)).
This allows for efficient distribution of traffic across resources that are globally distributed.
2. Teleport's [Proxy Peering](../../architecture/proxy-peering.mdx) is used to reduce the total number of tunnel connections in the Teleport cluster.

This deployment architecture isn't recommended for use cases where your users or resources are
clustered in a single region, or for Managed Service Providers that need to provide separate
clusters to customers.

This architecture is best suited for globally distributed resources and end-users that prefer a single point of
entry while also ensuring minimal latency when accessing connected resources.

## Key deployment components

- Deployed exclusively in the AWS ecosystem
- High-availability Auto Scaling group of Auth Service instances that must remain in a single region
- High-availability Auto Scaling group of Proxy Service instances deployed across multiple regions
- [AWS Route 53 latency-based routing](https://docs.aws.amazon.com/Route53/latest/DeveloperGuide/routing-policy-latency.html)
- [GSLB](https://www.cloudflare.com/learning/cdn/glossary/global-server-load-balancing-gslb/)
- [Teleport TLS Routing](https://goteleport.com/docs/architecture/tls-routing/) to reduce the number of ports needed to use Teleport
- [Teleport Proxy Peering](https://goteleport.com/docs/architecture/proxy-peering/) for reducing the number of resource connections
- [AWS Network Load Balancing](https://docs.aws.amazon.com/elasticloadbalancing/latest/network/introduction.html)
- [AWS DynamoDB](https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Introduction.html) for cluster state storage
- [AWS S3](https://docs.aws.amazon.com/AmazonS3/latest/userguide/Welcome.html) for session recording storage

## Advantages of this deployment architecture

- Eliminates the complexity and cost of maintaining multiple Teleport clusters across multiple regions.
- Uses the lowest-latency path to connect users to resources.
- Provides a highly resilient, redundant HA architecture for Teleport that can quickly
scale with an organization's needs.
- All required Teleport components can be provisioned within the AWS ecosystem.
- Using load balancers for the Proxy and Auth Services allows for increased availability
during Teleport cluster upgrades.

## Disadvantages of this deployment architecture

- When Teleport Auth Service instances are limited to a single region, there is a higher likelihood
of decreased availability during an AWS regional outage.
- More technically complex to deploy than a single-region Teleport cluster.

![Diagram showing this Teleport
architecture](../../../img/deploy-a-cluster/aws-gslb-proxy-peering-ha-deployment.png)


## AWS Network Load Balancer (NLB)

AWS NLBs are required for this high-availability deployment architecture.
The NLB forwards traffic from users and services to an available Teleport Proxy Service instance.
The load balancer must not terminate TLS, and must transparently forward the TCP
traffic it receives. In other words, it must be a Layer 4 load balancer, not a
Layer 7 (e.g., HTTP) load balancer.

<Admonition
type="warning"
title="Note"
>
Cross-zone load balancing is required in both the Auth Service and Proxy Service NLB configurations
so that traffic is routed across multiple zones. This improves resiliency against localized AWS
Availability Zone outages.
</Admonition>

### Configure the Proxy Service NLBs

Configure the load balancer to forward traffic from the following ports to the
corresponding port on an available Teleport Proxy Service instance.

| Port | Description |
| - | - |
| `443` | ALPN port for TLS Routing, HTTPS connections to authenticate `tsh` users into the cluster, and to serve Teleport's Web UI |
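As an illustrative sketch only, the Proxy Service NLB could be provisioned with the AWS CLI roughly as follows. All names, subnet IDs, and `<...>` values are placeholders, and your own tooling (Terraform, CloudFormation) may differ:

```shell
# Create a Layer 4 (TCP) Network Load Balancer for the Proxy Service
# in one region. Subnet IDs are placeholders.
aws elbv2 create-load-balancer \
  --name teleport-proxy-us-west-1 \
  --type network \
  --scheme internet-facing \
  --subnets subnet-0aaa subnet-0bbb

# Enable cross-zone load balancing, as required by this architecture.
aws elbv2 modify-load-balancer-attributes \
  --load-balancer-arn <proxy-nlb-arn> \
  --attributes Key=load_balancing.cross_zone.enabled,Value=true

# Forward TCP 443 to Proxy Service instances without terminating TLS.
aws elbv2 create-target-group \
  --name teleport-proxy-targets \
  --protocol TCP \
  --port 443 \
  --vpc-id <vpc-id>

aws elbv2 create-listener \
  --load-balancer-arn <proxy-nlb-arn> \
  --protocol TCP \
  --port 443 \
  --default-actions Type=forward,TargetGroupArn=<target-group-arn>
```

Using a TCP listener (rather than TLS) keeps the NLB at Layer 4, so the Proxy Service instances terminate TLS themselves.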

### Configure the Auth Service NLB

Configure the load balancer to forward traffic from the following ports to the
corresponding port on an available Teleport Auth Service instance.

<Admonition
type="warning"
title="Note"
>
Proxies must have network access to the Auth Service NLB. In the Route 53 GSLB architecture,
you can accomplish this using [VPC Peering](https://docs.aws.amazon.com/vpc/latest/peering/what-is-vpc-peering.html)
or [Transit Gateways](https://docs.aws.amazon.com/vpc/latest/tgw/what-is-transit-gateway.html).
</Admonition>
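As one hedged example, a VPC peering connection between a Proxy Service VPC and the Auth Service VPC can be requested with the AWS CLI (VPC IDs and the region are placeholders; a Transit Gateway setup would look different):

```shell
# Request a peering connection from a Proxy Service VPC to the
# Auth Service VPC in another region (IDs are placeholders).
aws ec2 create-vpc-peering-connection \
  --vpc-id <proxy-vpc-id> \
  --peer-vpc-id <auth-vpc-id> \
  --peer-region us-west-1

# The connection must then be accepted in the Auth Service VPC's region.
aws ec2 accept-vpc-peering-connection \
  --vpc-peering-connection-id <peering-connection-id>
```

You would also need route table entries and security group rules in both VPCs permitting TCP 3025 to reach the Auth Service NLB.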

Internal NLB Auth Service ports

| Port | Description |
| - | - |
| `3025` | TLS port used by the Auth Service to serve its API to Proxies in a cluster |

## TLS credential provisioning

High-availability Teleport deployments require a system to fetch TLS
credentials from a certificate authority like Let's Encrypt, AWS Certificate
Manager, Digicert, or a trusted internal authority. The system must then
provision Teleport Proxy Service instances with these credentials and renew them
periodically.

For high-availability deployments that use Let's Encrypt to supply TLS
credentials to Teleport instances running behind a load balancer, you need
to use the [ACME
DNS-01](https://letsencrypt.org/docs/challenge-types/#dns-01-challenge)
challenge to demonstrate domain name ownership to Let's Encrypt. In this
challenge, your TLS credential provisioning system creates a DNS TXT record with
a value expected by Let's Encrypt.
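As one illustrative option (not the only one), certbot's Route 53 plugin can complete the DNS-01 challenge automatically, given IAM permissions on the hosted zone; the domain names below are placeholders:

```shell
# Request a certificate covering the cluster domain and its wildcard.
# The dns-route53 plugin creates and removes the TXT record needed
# for the DNS-01 challenge.
certbot certonly \
  --dns-route53 \
  -d teleport.example.com \
  -d '*.teleport.example.com'
```

The resulting credentials must then be distributed to each Proxy Service instance and renewed before they expire.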

## Global Server Load Balancing with Route 53

[Latency-based routing](https://docs.aws.amazon.com/Route53/latest/DeveloperGuide/routing-policy-latency.html)
in a public hosted zone must be used to ensure that traffic from Teleport
resources is routed to the lowest-latency Proxy Service NLB, based on the region of
the VPC the resource connects from.

To create GSLB routing, create a CNAME record for each region where you have VPCs containing Teleport-connected resources.
If you plan to register applications with Teleport, it is recommended to also add a
wildcard record for every region.

The following CNAME record values need to be set:
- **Value:** The domain name of the NLB that should receive traffic from Teleport resources located in that region
- **Routing policy:** Latency
- **Region:** The AWS region from which traffic should be routed to the NLB listed in **Value**
- **Health Check ID:** Recommended, so that traffic is always routed to a healthy NLB

The following is an example hosted zone that uses AWS Route 53 latency routing to create GSLB:

### Root GSLB record for Teleport

|Record name|Type|Value|
|---|---|---|
|`*.teleport.example.com`|CNAME|AWS Route 53 nameservers|

### Teleport Proxy DNS records for GSLB

|Record name|Type|Routing Policy|Region|Value|
|---|---|---|---|---|
|`teleport.example.com`|CNAME|Latency|us-west-1|`elb.us-west-1.amazonaws.com`|
|`*.teleport.example.com`|CNAME|Latency|us-west-1|`elb.us-west-1.amazonaws.com`|
|`teleport.example.com`|CNAME|Latency|eu-central-1|`elb.eu-central-1.amazonaws.com`|
|`*.teleport.example.com`|CNAME|Latency|eu-central-1|`elb.eu-central-1.amazonaws.com`|
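Records like these could be created with the AWS CLI, for example (the hosted zone ID, health check ID, and NLB domain name are placeholders):

```shell
# Upsert one latency-based CNAME per region; us-west-1 is shown here.
# SetIdentifier and Region are what make this a latency-routing record.
aws route53 change-resource-record-sets \
  --hosted-zone-id <hosted-zone-id> \
  --change-batch '{
    "Changes": [
      {
        "Action": "UPSERT",
        "ResourceRecordSet": {
          "Name": "teleport.example.com",
          "Type": "CNAME",
          "SetIdentifier": "us-west-1",
          "Region": "us-west-1",
          "TTL": 60,
          "HealthCheckId": "<health-check-id>",
          "ResourceRecords": [{"Value": "elb.us-west-1.amazonaws.com"}]
        }
      }
    ]
  }'
```

Repeat the command for each region and for the wildcard record, changing `SetIdentifier`, `Region`, and the NLB value accordingly.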

<Admonition title="Required permissions">

If you are using Let's Encrypt to provide TLS credentials to your Teleport
instances, the TLS credential system we mentioned earlier needs permissions to
manage Route 53 DNS records in order to satisfy Let's Encrypt's DNS-01 challenge.

</Admonition>

### Teleport resource agent configuration for GSLB

To facilitate latency-based routing, resource agents must be configured to point `proxy_server` to
the GSLB domain configured in Route 53, **not** to a specific Proxy Service NLB address.

For example:

```yaml
version: v3
teleport:
  nodename: ssh-node
  # ...
  proxy_server: teleport.example.com:443
  # ...
ssh_service:
  enabled: yes
  # ...
```
Review the [configuration reference](https://goteleport.com/docs/reference/config/) page for
additional settings.

## Configure Proxy Peering

In this deployment architecture, [Proxy Peering](https://goteleport.com/docs/architecture/proxy-peering/) is used to reduce the number of connections made from
resources to Proxy Service instances in the Teleport cluster.

This guide covers the necessary Proxy Peering settings for deploying an HA Teleport cluster that
routes resource traffic with GSLB.

### Auth Service Proxy Peering configuration

The Teleport Auth Service must be configured to use the `proxy_peering` tunnel strategy as shown in the example below:

```yaml
auth_service:
  # ...
  tunnel_strategy:
    type: proxy_peering
    agent_connection_count: 2
```
See the [Auth Service configuration reference](https://goteleport.com/docs/reference/config/#auth-service)
for additional settings.

### Proxy Service Proxy Peering configuration

Proxies must advertise a peer address so that proxy peers can establish connections to each other.
The ports exposed on the Teleport Proxy Service instances depend on whether you route Proxy Peering
traffic over the public internet:

<Tabs>
<TabItem label="Public Proxy Peering ports">

| Port | Description |
| - | - |
| `443` | ALPN port for TLS Routing, HTTPS connections to authenticate `tsh` users into the cluster, and to serve Teleport's Web UI |
| `3021`| Proxy Peering gRPC Stream |

</TabItem>
<TabItem label="VPC peering Proxy Peering ports">

| Port | Description |
| - | - |
| `443` | ALPN port for TLS Routing, HTTPS connections to authenticate `tsh` users into the cluster, and to serve Teleport's Web UI |

</TabItem>
</Tabs>
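If peering traffic stays on private networks, the peering port can be opened between Proxy Service instances with, for example, a self-referencing security group rule (the group ID is a placeholder):

```shell
# Allow Proxy Peering gRPC traffic (TCP 3021) between instances that
# share the same security group.
aws ec2 authorize-security-group-ingress \
  --group-id <proxy-security-group-id> \
  --protocol tcp \
  --port 3021 \
  --source-group <proxy-security-group-id>
```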

Set `peer_public_addr` to an address specific to each Proxy Service instance. This is the
recommended method for the lowest-latency and most reliable connections.

```yaml
version: v3
teleport:
  # ...
proxy_service:
  # ...
  peer_public_addr: teleport-proxy-eu-west-1-host1.example.com:3021
  # ...
```
<Admonition
type="warning"
title="Note"
>
`agent_connection_count` on the Auth Service should be set to a value of at least 2 to decrease
the likelihood of agents being unavailable.
</Admonition>

See the [Proxy Service configuration reference](https://goteleport.com/docs/reference/config/#proxy-service)
for additional settings.
**docs/pages/deploy-a-cluster/introduction.mdx** — 1 addition, 0 deletions

```diff
@@ -28,5 +28,6 @@ In these guides, we will show you how to deploy a high-availability, VM-based
 Teleport cluster on your cloud:
 
 - [AWS with Terraform](./deployments/aws-terraform.mdx)
+- [AWS Multi-Region Proxy Deployment](./deployments/aws-gslb-proxy-peering-ha-deployment.mdx)
 - [Google Cloud](./deployments/gcp.mdx)
 - [IBM Cloud](./deployments/ibm.mdx)
```