---
authors: Hugo Hervieux (hugo.hervieux@goteleport.com)
state: draft
---

# RFD 0096 - Helm chart revamping

## Required approvers

* Engineering: @r0mant && @tigrato && ( @gus || @programmerq )
* Product: @klizhentas || @xinding33
* Security: @reedloden

## What

This proposal describes structural changes made to the Teleport Helm charts to
achieve the following goals:

- deploy auth and proxies separately
- reduce the time to deploy in the most common setups (`aws` and `standalone`)
- always support the latest Teleport features by default (reduce time-to-market)
- reduce the cost of chart maintenance
- ensure seamless updates between Teleport versions
- ensure the out-of-the-box configuration supports large-scale deployments

## Why

Most self-hosted Teleport setups rely either on Helm charts or Terraform to
deploy and operate Teleport. We want those two methods to become the reference
ways of deploying Teleport, providing the most secure and available setup out
of the box.

Helm charts should allow users to easily benefit from the best Teleport
deployment they can have. This includes, but is not limited to:

- security
- maintainability
- availability
- scalability

In its current state, the Helm chart deploys an all-in-one set of pods assuming
the proxy, auth, and kubernetes-access roles. Splitting responsibilities across
multiple sets of pods would increase availability and scalability, and reduce
the attack surface.

Helm charts are also lagging behind upstream Teleport in terms of features. The
`teleport-cluster` chart configuration exposes a subset of the supported
`teleport.yaml` values, but under different names. This causes unnecessary
friction for users and increases the cost of maintaining the chart
configuration template.

## Details

This proposal starts by discussing the chart structure and deployed resources.
The second part is dedicated to the chart values, configuration format, and
backward compatibility. The third part addresses new update-strategy
constraints between major Teleport versions.

### Chart structure and deployed resources

The resources in the chart would be split into two subdirectories,
`templates/auth/` and `templates/proxy/`, to clearly identify which resource is
used by which Teleport node. Common resources should be put in `templates/`.

The chart would deploy two Deployments: one for the proxies and one for the
auth servers.

- the `teleport-proxy` Deployment: Those pods are stateless by default and can
  be scaled up even in standalone mode. Deploying those nodes with a Deployment
  means we cannot mount persistent storage on them. As Teleport does not
  support graceful shutdown with session-recording shipping, users might lose
  active session recordings during a rollout when using the `proxy` session
  recording mode. Proxy pods rely on `kube` ProvisionTokens to join the auth
  servers on startup (see RFD-0094).
- the `teleport-auth` Deployment: Those pods cannot be replicated without a
  remote backend for state and audit logs. When persistence is enabled, a
  single volume will be mounted on those pods and the update strategy will be
  `Recreate` (see the sketch below). For setups in which auth pods are
  stateless, the Deployment can be scaled up.
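For illustration, here is a minimal sketch of how the auth Deployment could
derive its replica count and update strategy from the persistence settings.
The value names follow the existing chart; the template layout and resource
name are assumptions, not the final chart API:

```yaml
# templates/auth/deployment.yaml (excerpt, sketch only)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ .Release.Name }}-auth
spec:
  {{- if .Values.persistence.enabled }}
  # A single PersistentVolume cannot follow rolling pods:
  # pin one replica and re-create it on update.
  replicas: 1
  strategy:
    type: Recreate
  {{- else }}
  # Stateless auth (remote backend): safe to replicate and roll.
  replicas: {{ .Values.highAvailability.replicaCount }}
  strategy:
    type: RollingUpdate
  {{- end }}
```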
The main LB service should send traffic to the proxies; two additional services
for in-cluster communication should be created: one for the proxies and one for
the auth servers.

The trust between auth and proxy should be bootstrapped by
[creating a provisionToken on start](https://github.com/gravitational/teleport/pull/19009).

#### Labels and selectors

Deploying different pod sets requires a way to discriminate between them. The
only label currently set is `app: {{ .Release.Name }}`. We should follow the
[Helm label recommendations](https://helm.sh/docs/chart_best_practices/labels/):

| Label | Value | Purpose |
|------------------------------|---------------------------------------|-----------|
| app.kubernetes.io/name | `{{- default .Chart.Name .Values.nameOverride \| trunc 63 \| trimSuffix "-" }}` | Identifies the application. |
| helm.sh/chart | `{{ .Chart.Name }}-{{ .Chart.Version \| replace "+" "_" }}` | The chart name and version. |
| app.kubernetes.io/managed-by | `{{ .Release.Service }}` | Finds all resources managed by Helm. |
| app.kubernetes.io/instance | `{{ .Release.Name }}` | Differentiates between instances of the same application. |
| app.kubernetes.io/version | `{{ .Chart.AppVersion }}` | The version of the app. |
| app.kubernetes.io/component | Name of the main Teleport service: `auth`, `proxy`, `kube` | Describes which Teleport component is deployed. |

Those labels should be applied to all deployed resources when applicable. This
includes, but is not limited to, Pods, Deployments, ConfigMaps, Secrets and
Services.

Note: if multiple components are deployed in the same pod (e.g. auth and kube),
only the main component should appear in `app.kubernetes.io/component`. This
prevents label selectors from changing when services are added or removed.

The `app: {{ .Release.Name }}` label should stay on the auth pods for
compatibility reasons.

#### Monitoring

A single optional
[`PodMonitor`](https://github.com/prometheus-operator/prometheus-operator/blob/main/Documentation/api.md#podmonitor)
should be deployed per Helm release, selecting all pods based on
`app.kubernetes.io/name`.

#### Custom Resources

It was initially planned to allow deployment of custom resources through the
chart. Unfortunately, Helm does not support deploying both a CRD and its CRs in
the same release (it checks whether the API is supported before deploying).
This section has been removed from the RFD during implementation.

### Values and Teleport configuration

#### Generating `teleport.yaml`

The Helm chart would still expose modes (`aws`, `gcp`, `standalone`, `custom`),
but allow users to pass arbitrary additional configuration or perform specific
overrides. This way, users would not have to leave the happy path if they need
to set one specific value. Manually implementing all configuration knobs in
Helm adds no value and brings confusion, as some values are not supported or
are not named the same way as the `teleport.yaml` fields they set.

By leveraging Helm's templating functions `toYaml` and `fromYaml`, and sprig's
`mustMergeOverwrite`, the charts would merge their automatically generated
`teleport.yaml` with the user-provided one.
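A minimal helper sketch of this merge follows. The named templates are
illustrative, not the final chart layout, and `auth.teleportConfig` is assumed
to default to `{}` in `values.yaml`:

```yaml
{{- /* templates/auth/_config.tpl (sketch): merge user-provided
      overrides over the chart-generated configuration */ -}}
{{- define "teleport-cluster.auth.config" -}}
{{- $generated := include "teleport-cluster.auth.config.standalone" . | fromYaml -}}
{{- $overrides := .Values.auth.teleportConfig | default dict -}}
{{- /* mustMergeOverwrite gives precedence to the user overrides */ -}}
{{- mustMergeOverwrite $generated $overrides | toYaml -}}
{{- end -}}
```

Rendering this helper into the auth ConfigMap would give users control over any
`teleport.yaml` field without requiring chart changes.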
A user deploying the chart in `standalone` mode who wants to set key exchange
algorithms, remove extra log fields, and override the kube cluster's name would
use the following values:

```yaml
clusterName: my-cluster
chartMode: standalone
auth:
  teleportConfig:
    teleport:
      kex_algos:
        - ecdh-sha2-nistp256
        - ecdh-sha2-nistp384
        - ecdh-sha2-nistp521
      log:
        format:
          extra_fields: ~
    kubernetes_service:
      kube_cluster_name: my-override
```

The generated chart configuration for the `standalone` mode is:

```yaml
teleport:
  log:
    severity: INFO
    output: stderr
    format:
      output: text
      extra_fields: ["timestamp","level","component","caller"]
auth_service:
  enabled: true
  cluster_name: my-cluster
  authentication:
    type: "local"
    local_auth: true
    second_factor: "otp"
kubernetes_service:
  enabled: true
  listen_addr: 0.0.0.0:3027
  kube_cluster_name: my-cluster
proxy_service:
  enabled: false
ssh_service:
  enabled: false
```

Once merged with the custom user configuration, the resulting configuration is:

```yaml
auth_service:
  authentication:
    local_auth: true
    second_factor: otp
    type: local
  cluster_name: my-cluster
  enabled: true
kubernetes_service:
  enabled: true
  kube_cluster_name: my-override
  listen_addr: 0.0.0.0:3027
proxy_service:
  enabled: false
ssh_service:
  enabled: false
teleport:
  kex_algos:
    - ecdh-sha2-nistp256
    - ecdh-sha2-nistp384
    - ecdh-sha2-nistp521
  log:
    format:
      extra_fields: null
      output: text
    output: stderr
    severity: INFO
```

The proof-of-concept code [can be found
here](https://github.com/hugoShaka/teleport-helm-config-poc).

The main drawback of this approach is that comments and value ordering are lost
during the round-trip. The approach could be extended to support multiple
configuration syntaxes, for example after a breaking change in the
`teleport.yaml` format.

`custom` should be removed in favor of a new `scratch` mode. Compared to the
previous Helm chart, users would not provide an external ConfigMap but would
pass the custom configuration through the values. This is a breaking change for
them, but by the nature of the auth/proxy split it is not possible to remain
backward compatible with the `custom` mode.

To mitigate the risk of building an invalid configuration, the chart should run
pre-install and pre-upgrade hooks validating the configuration (a hook sketch
is shown below, after the `teleportConfig` examples).

#### Backward compatibility

Splitting auth and proxies will imply breaking some logic; we will try to
provide backward compatibility as much as possible. This includes staying
compatible with the previous installation guides and seamlessly upgrading
setups created from those guides.

The `teleport-cluster` revamp should ensure the IP of the service stays the
same; this requires the load-balancing Service to remain the same resource.

#### Teleport-specific configuration values

This proposal introduces two new values for users to edit the `teleport.yaml`
config: `auth.teleportConfig` and `proxy.teleportConfig`. The content of those
values should be merged with the generated configuration, as described [in the
previous section](#generating-teleportyaml).

For example:

```yaml
auth:
  teleportConfig:
    auth_service:
      authentication:
        connector_name: "my-connector"
proxy:
  teleportConfig:
    proxy_service:
      acme:
        enabled: true
        email: foo@example.com
```
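Since `teleportConfig` overrides are merged without validation at render time,
a typo would otherwise only surface when pods fail to start. Below is a
minimal sketch of the pre-install/pre-upgrade validation hook mentioned
earlier. The Job name, ConfigMap name, and validation command are assumptions;
Teleport may need a dedicated configuration-check entrypoint for this to work:

```yaml
# Sketch of a config-validation hook. Names and the validation
# command are assumptions, not the final implementation.
apiVersion: batch/v1
kind: Job
metadata:
  name: {{ .Release.Name }}-config-check
  annotations:
    "helm.sh/hook": pre-install,pre-upgrade
    "helm.sh/hook-delete-policy": before-hook-creation,hook-succeeded
spec:
  template:
    spec:
      restartPolicy: Never
      volumes:
        - name: config
          configMap:
            name: {{ .Release.Name }}-auth
      containers:
        - name: config-check
          # Reuses the existing `image` chart value, tagged with the app version.
          image: "{{ .Values.image }}:{{ .Chart.AppVersion }}"
          # Placeholder: assumes a config-validation subcommand exists.
          command: ["teleport", "configure", "--test", "/etc/teleport/teleport.yaml"]
          volumeMounts:
            - name: config
              mountPath: /etc/teleport
              readOnly: true
```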
The following values are core values: users must set them for the chart to work
properly, and they support the happy path. The proposal should not change them,
as doing so would harm backward compatibility and user experience.

- `clusterName`
- `publicAddr`
- `chartMode`
- `aws`
- `gcp`
- `enterprise`
- `operator`

The following values are used to generate Teleport's configuration. We must
continue to support them for backward compatibility, but using
`*.teleportConfig` should be preferred:

- `kubeClusterName`
- `authentication`
- `authenticationType`
- `authenticationSecondFactor`
- `proxyListenerMode`
- `sessionRecording`
- `separatePostgresListener`
- `separateMongoListener`
- `kubePublicAddr`
- `mongoPublicAddr`
- `mysqlPublicAddr`
- `postgresPublicAddr`
- `sshPublicAddr`
- `tunnelPublicAddr`
- `acme`
- `acmeEmail`
- `acmeURI`
- `log`

#### Chart-specific configuration values

Some values are used to configure the Kubernetes resources deploying Teleport.
When specified, they should apply to both the auth and proxy Deployments. Those
values are:

- `podSecurityPolicy`
- `labels`
- `highAvailability`
- `tls`
- `image`
- `enterpriseImage`
- `affinity`
- `annotations`
- `extraArgs`
- `extraEnv`
- `extraVolumes`
- `extraVolumeMounts`
- `imagePullPolicy`
- `initContainers`
- `postStart`
- `securityContext`
- `priorityClassName`
- `tolerations`
- `probeTimeoutSeconds`
- `teleportVersionOverride`
- `resources`

A few values will have to be treated differently:

- `persistence` will only apply to the `auth` Deployment
- `service` will only apply to the `proxy` Service
- `serviceAccount.name` will apply to the `auth` Deployment; the proxy service
  account name should be the auth one suffixed with `-proxy`

Some users will need to set different values for auth and proxy pods, so the
following values should also be available under the `auth` and `proxy`
sections. Those deployment-specific values take precedence over the ones at the
root (see the example after this list):

- `labels`
- `highAvailability` (except the certManager section)
- `affinity`
- `annotations`
- `extraArgs`
- `extraEnv`
- `extraVolumes`
- `extraVolumeMounts`
- `initContainers`
- `postStart`
- `tolerations`
- `teleportVersionOverride`
- `resources`
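For instance, a values file combining a shared root value with a proxy-specific
override could look like this (the keys follow this proposal; the toleration
values are illustrative):

```yaml
# Applies to both the auth and proxy Deployments by default.
tolerations:
  - key: "dedicated"
    operator: "Equal"
    value: "teleport"
    effect: "NoSchedule"

# Proxy-specific override: takes precedence over the root
# value for the proxy pods only.
proxy:
  tolerations:
    - key: "dedicated"
      operator: "Equal"
      value: "teleport-proxy"
      effect: "NoSchedule"
```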
### Configuration examples

As this RFD brings numerous value changes and adds several ways of doing the
same thing, users should be provided with full working examples covering the
most common setups.

Such examples would complement the documentation by demonstrating the best
practices and capabilities of the chart.

Those examples should also be used to lint the chart.

### Update strategy between major versions

Auth pods have to be updated before proxies, but Helm does not support applying
resources in a specific order.

Both auth and proxy rollouts will be triggered at the same time, but the proxy
rollout should be held until all auth pods are rolled out. Not waiting for the
full auth rollout would cause the load to spread unevenly across auth pods,
which is harmful at scale.

Proxies will have an initContainer checking whether all auth pods from the
previous version have been removed. Checking versions through the Teleport gRPC
API (`PingResponse`) would require valid credentials to connect to Teleport. To
work around this issue, we can rely on Kubernetes DNS-based service discovery
to find out which versions the running pods have (see the sketch at the end of
this section):

- the chart labels auth pods with their major Teleport version
- the chart creates two headless services:
  - one selecting pods with the current major version: `teleport-auth-v11`
  - one selecting pods with the previous major version: `teleport-auth-v10`
- proxy pods have an initContainer
- the `v11` initContainer resolves `teleport-auth-v10` every 5 seconds until no
  IP is returned
- the initContainer exits and the proxy starts
- this unblocks the proxy Deployment rollout

Headless services selecting auth pods of a specific version should also publish
not-ready endpoints, to ensure the rollout happens only when all
previous-version pods are completely terminated. This means setting
`spec.publishNotReadyAddresses: true`.

This rollout approach might take some time on the largest Teleport deployments.
This is not an issue per se, but it has to be documented, as users running with
`--atomic` or `--wait` might have to increase their Helm timeouts.

Note: Teleport does not officially support multiple auth nodes running under
different major versions. The recommended update approach is to scale down to a
single node, update, and scale back up. In practice, most Teleport versions are
backward compatible with the previous major version, and running multiple auth
pods is rarely an issue. This potential issue relates more to Teleport than to
the deployment method, so it is considered out of scope for this RFD for the
sake of simplicity.
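To illustrate the mechanism above, here is a minimal sketch of the
previous-version headless Service and of the proxy initContainer. The resource
names, the version label key, and the busybox-based wait loop are assumptions,
not the final implementation:

```yaml
# Headless service selecting auth pods from the previous major version.
apiVersion: v1
kind: Service
metadata:
  name: teleport-auth-v10
spec:
  clusterIP: None                  # headless: DNS resolves to pod IPs directly
  publishNotReadyAddresses: true   # keep terminating pods resolvable until gone
  selector:
    app.kubernetes.io/component: auth
    teleport.dev/majorVersion: "10"   # assumed version label set by the chart
---
# Excerpt of the proxy pod spec: block startup until no v10 auth pod resolves.
spec:
  initContainers:
    - name: wait-auth-update
      image: busybox:1.36
      command:
        - sh
        - -c
        # Loop while the previous-version name still resolves;
        # exit once DNS returns no records.
        - while nslookup teleport-auth-v10; do sleep 5; done
```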