RFD 0096: Helm chart revamp#18274

Merged
hugoShaka merged 10 commits into master from rfd/0096-helm-chart-revamp on Jan 11, 2023

Conversation

@hugoShaka
Contributor

@hugoShaka hugoShaka commented Nov 8, 2022

Rendered version

This RFD is part of our Q4 efforts to improve the Teleport Helm experience.

@hugoShaka hugoShaka added the rfd Request for Discussion label Nov 8, 2022
@hugoShaka hugoShaka force-pushed the rfd/0096-helm-chart-revamp branch 2 times, most recently from 4045fb0 to f56ff5d Compare November 9, 2022 22:01
@hugoShaka
Contributor Author

hugoShaka commented Nov 9, 2022

We might also want folks from the scalability team to take a look at this once we're happy with the design.

Comment on lines +316 to +388
Proxies will have an initContainer checking if all auth pods from the past
version were removed. A version check via the Teleport gRPC API (`PingResponse`)
requires valid credentials to connect to Teleport. To work around this issue we
can rely on Kubernetes's DNS-based service discovery to discover how many
pods are running which version:

- the chart labels auth pods with their major Teleport version
- the chart creates two headless services:
  - one selecting pods with the current major version `teleport-auth-v11`
  - one selecting pods with the previous major version `teleport-auth-v10`
- proxy pods have an initContainer:
  - the `v11` initContainer resolves `teleport-auth-v10` every 5 seconds until no
    IP is returned
  - the initContainer exits, the proxy starts
  - this unlocks the proxy deployment rollout
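The two headless Services described above could be sketched as plain Kubernetes manifests (the Service names come from the RFD; the label keys and values are assumptions for illustration):

```yaml
# Sketch: headless Services selecting auth pods by major Teleport version.
apiVersion: v1
kind: Service
metadata:
  name: teleport-auth-v10
spec:
  clusterIP: None                 # headless: DNS returns matching pod IPs
  selector:
    app.kubernetes.io/name: teleport-auth    # label keys are assumptions
    teleport.dev/major-version: "10"
---
apiVersion: v1
kind: Service
metadata:
  name: teleport-auth-v11
spec:
  clusterIP: None
  selector:
    app.kubernetes.io/name: teleport-auth
    teleport.dev/major-version: "11"
```

The initContainer then only needs an unprivileged DNS lookup against `teleport-auth-v10`, sidestepping the credential requirement of the gRPC `PingResponse` check.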
Contributor Author

@hugoShaka hugoShaka Nov 10, 2022


This is unsafe with a StatefulSet because a pod killed during a rollout will be re-created with the latest version. The Deployment implementation is based on ReplicaSets, which ensures this does not happen, so we cannot suffer a capacity loss because of a stuck rollout.

I wonder if it would be possible for the chart to refuse sessionRecording: proxy. In the last chart, running in proxy mode was almost identical to running in auth mode (almost because if users manually added proxies it would have had a different behaviour).

I believe that in order to improve the overall experience we have to provide a more opinionated deployment of Teleport, one that works great in 99% of the use-cases. This could be one of those changes.

Contributor Author


I discussed with @r0mant and it seems that not supporting the proxy session recording mode does not align with our strategy. However, we could gracefully handle emptying the session recording queue. This would mean proxy session recording would only lose data on hardware failure, not on controlled disruptions like autoscaling events or version updates/configuration rollouts.

Contributor Author

@hugoShaka hugoShaka Nov 14, 2022


After discussing with multiple people, it appears graceful shutdown is not currently supported by the chart, and Teleport does not even ship the recordings before exiting. Working around this with a graceful restart also raises issues: the restart will hang until user sessions end, but the proxy does not preemptively close user sessions.

Contributor


Might be good to investigate if we can use the lifecycle.preStop hook.
The exec plugin could force teleport to close connections and ship the data volume to other proxy nodes to avoid the loss of session recording inventory.

With a StatefulSet it's possible to know whether a replica will be scaled down, using the number of replicas and the pod name. For cases where a scale-down will happen, we can ship the payload to another proxy.
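A rough shape of the suggestion, as a container spec fragment (the drain script is hypothetical — Teleport has no such flush mechanism at this point, which is what this thread is debating):

```yaml
containers:
  - name: proxy
    image: public.ecr.aws/gravitational/teleport   # image path is an assumption
    lifecycle:
      preStop:
        exec:
          # Hypothetical drain step: close connections and ship pending
          # session recordings to another proxy before this pod stops.
          command: ["/scripts/drain-recordings.sh"]
```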

Contributor Author

@hugoShaka hugoShaka Nov 15, 2022


> Might be good to investigate if we can use the lifecycle.preStop hook. The exec plugin could force teleport to close connections and ship the data volume to other proxy nodes to avoid the loss of session recording inventory.

> Using a statefulset it's possible to know if your replica will be scaled down -> using the number of replicas and the pod name. For cases where scale down will happen, we can ship the payload to another proxy

As far as I understand, the proxy storage is temporary, even in proxy mode. The recordings will eventually be shipped to an auth server for archiving purposes. In a permanent downscale case we would need a way to forcefully empty the queue. I initially assumed people using proxy mode would not downscale, but that assumption was unfounded and this case is valid.

My biggest concern is what happens if we're in a stuck proxy rollout (auths are slowly rolling out, proxies are waiting) and we lose one or many proxies. With a StatefulSet they will come back up as v11 and not v10, and will block waiting for the auth rollout to finish instead of serving. This will cause a capacity loss, and maybe a full blackout if all proxies get rescheduled (e.g. a GKE automated nodepool update that happens at the same time). With a Deployment, a v10 pod getting deleted would be replaced by another v10 pod, which is what we want.

Contributor Author

@hugoShaka hugoShaka Nov 16, 2022


I changed back to a Deployment; this means users might lose active session recordings during a pod rollout when using the proxy recording mode.

Contributor


IIRC the way we do this in Cloud is to rewrite the default container's SIGTERM (15) signal to SIGQUIT (3) using dumb-init so that we do in fact ask the pods to shut down gracefully: /usr/bin/dumb-init --rewrite 15:3 -- teleport start -c /etc/teleport/teleport.yaml

(from https://goteleport.com/docs/reference/signals/):
> `QUIT`: Graceful shutdown. The daemon will wait until connections are dropped.
> `TERM`, `INT` or `KILL`: Immediate non-graceful shutdown. All existing connections will be dropped.

I've never proposed this change in the Teleport Docker container itself because I feel it would be too much of a breaking change, but doing manually inside the Helm chart with args/command might work and be a good compromise?

I know that someone on the Teleport side was investigating the ability for Teleport to properly flush its session recording queue as part of a graceful shutdown but I'm unsure what progress they made on that. Cloud doesn't support proxy recording mode so there isn't a ton of prior art unfortunately.
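As a sketch, the Helm chart could apply the same signal rewrite by overriding the container entrypoint (the field names follow the standard Kubernetes container spec; whether and how the chart exposes them is an assumption):

```yaml
containers:
  - name: teleport
    image: public.ecr.aws/gravitational/teleport   # image path is an assumption
    # Rewrite SIGTERM (15) to SIGQUIT (3) so Kubernetes-initiated shutdowns
    # become graceful shutdowns, as done in Cloud.
    command: ["/usr/bin/dumb-init", "--rewrite", "15:3", "--"]
    args: ["teleport", "start", "-c", "/etc/teleport/teleport.yaml"]
```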

Contributor Author

@hugoShaka hugoShaka Nov 24, 2022


Yeah, @espadolini took a lot of time to explain this to me (thank you ❤️ ). I don't think I want to/can jump into this rabbit hole right now: SIGQUIT waits for sessions to end, which can take several hours as we don't have a session eviction/migration mechanism. In Cloud this kind of works because we can afford slow rollouts with a 30-minute grace period. I'd love to raise this issue again once we have done the main refactors.

Collaborator

@r0mant r0mant left a comment


I think this RFD needs to include details about additional updates we wanted to make to the charts such as

  • Deployment resources always named as release name
  • Lack of standard static labels on Teleport deployed resources e.g. app.kubernetes.io/component
  • Including auth connectors
  • Service monitors for metrics

Comment thread rfd/0096-helm-chart-revamp.md
Comment on lines +205 to +263
auth:
  teleportConfig:
    auth_service:
      authentication:
        connector_name: "my-connector"
proxy:
  teleportConfig:
    proxy_service:
      acme:
        enabled: true
        email: foo@example.com
Collaborator


@webvictim Do you recall why our initial Helm chart design took the approach with exposing only certain fields via Helm chart values rather than mirroring the config structure as-is and merging like proposed here?

This approach definitely makes it easier to reconcile Helm values with config but I wonder if it makes it easy for users to shoot themselves in the foot. E.g. what happens if you add "auth_service" config section to the "proxy" value section? Just thinking out loud.
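For context, the merging approach under discussion could be sketched in a Helm template roughly like this (a sketch only — `mustMergeOverwrite` is the sprig function name; the value paths and defaults are assumptions, not the chart's actual layout):

```yaml
{{- /* Merge chart-managed defaults with the user-supplied auth.teleportConfig,
       letting user-provided keys win. */ -}}
{{- $defaults := dict "auth_service" (dict "enabled" true) -}}
{{- $config := mustMergeOverwrite $defaults (.Values.auth.teleportConfig | default dict) -}}
apiVersion: v1
kind: ConfigMap
metadata:
  name: {{ .Release.Name }}-auth
data:
  teleport.yaml: |
    {{- toYaml $config | nindent 4 }}
```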

Contributor


@r0mant Essentially it was because of all the complexity it would add. To be honest though, I didn't know that mustMergeOverride even existed, or I'd probably have written it that way in the first place 😁

One notable upside of doing it the old way with specific named values is that we could use enums in the values.yaml.json schema to set acceptable values for the chart and make sure that it's pretty difficult for people to break their Teleport config. When we take the guard rails off it'll be much easier to make the config file fail to validate, but I think the benefits of not needing to use chartMode: custom for more advanced deployments will definitely outweigh the downsides.

Another reason was that toYaml can be a bit of a blunt object and will rewrite lists in a certain way which made config files harder to read. That isn't a huge problem though.

@hugoShaka I wonder whether there's a way for us to do some kind of config file validation via a pre-deploy hook or InitContainer where we template out the config file that the user has specified and run teleport configure --test /etc/teleport/teleport.yaml against it to at least check whether Teleport is able to parse the file correctly. It would offload the validation to the Teleport binary itself and could maybe be used to abort a rollout with a failed configuration before taking the pods offline?
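Such a validation hook could be sketched as a Helm pre-install/pre-upgrade Job (the Job name, image path and ConfigMap reference are assumptions, not the chart's actual layout):

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: teleport-config-check
  annotations:
    "helm.sh/hook": pre-install,pre-upgrade
    "helm.sh/hook-delete-policy": hook-succeeded
spec:
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: validate
          image: public.ecr.aws/gravitational/teleport   # image path is an assumption
          command: ["teleport", "configure", "--test", "/etc/teleport/teleport.yaml"]
          volumeMounts:
            - name: config
              mountPath: /etc/teleport
      volumes:
        - name: config
          configMap:
            name: teleport-config   # illustrative ConfigMap name
```

If the Job fails, Helm aborts the release before the workload pods are touched.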

Contributor Author

@hugoShaka hugoShaka Nov 23, 2022


+1 for a pre-deploy hook validating the configuration.

This would also fit perfectly with a suggestion from @programmerq: validate backend reachability/S3 permissions on startup, as this is the most complex part of our setup and would allow users to catch broken setups way more easily. I don't think `configure --test` has this behaviour yet, but it sounds like a nice little RFD.

Contributor Author


Added configure --test hooks to the RFD: ec8823c

Comment thread rfd/0096-helm-chart-revamp.md Outdated
but it would have required a remote backend anyway (because our SQLite backend
does not support replication).

The main LB service should send traffic to the proxies, two additional services
Contributor


If we want a graceful shutdown, we have to have a readiness probe that marks the container as not ready as soon as the container receives the signal to start a graceful shutdown. This is necessary so that the LB does not continue to send traffic to a container that is shutting down.

Contributor Author

@hugoShaka hugoShaka Nov 15, 2022


I don't think this is a requirement, see the note in the Kubernetes docs. Kubernetes will mark the pod's endpoints as not ready as soon as the pod deletion process starts, even before the transition to the Terminated state. If we don't want to lose requests, we have to keep the pod running for at least 30 seconds. After 30 seconds, all kube-proxies will have reconciled and no traffic will be routed to the pod through native kube Services. This is usually done with a preStop hook waiting 30 seconds.

Users wanting to ensure 100% seamless rollouts would also have to rely on their LB readiness-gates. Readiness gate injection can be controlled through an annotation on the EKS namespace, or on the GKE service.
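The 30-second wait described above is typically a trivial preStop hook (a sketch; the image path is an assumption and the grace period must exceed the wait):

```yaml
spec:
  terminationGracePeriodSeconds: 60   # must exceed the preStop wait
  containers:
    - name: proxy
      image: public.ecr.aws/gravitational/teleport   # image path is an assumption
      lifecycle:
        preStop:
          exec:
            command: ["sleep", "30"]  # assumes a sleep binary in the image
```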

rollout will cause the load to spread unevenly across auth pods, which will be
harmful at scale.

Proxies will have an initContainer checking if all auth pods from the past
Contributor


Will the helm upgrade wait for the full operation to end? If yes, we need to check the timing. By default it waits just 5 minutes, but in operations where new nodes are spawned it could take longer.

Contributor Author


By default no; if you run with --wait it will. Otherwise it's fire and forget: we're only relying on Kubernetes' logic and hoping we reach the desired state at some point.

Contributor


I would assume that most of them would run with --atomic
> `--atomic`: if set, upgrade process rolls back changes made in case of failed upgrade. The `--wait` flag will be set automatically if `--atomic` is used

Contributor Author


In this case yes, we must warn users that the rollout can take some time and that they should adjust their timeouts. Releases will take longer than before if we set up disruption mitigations like preStop hooks, graceful shutdowns, and sequential auth/proxy updates.
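For illustration, a user running an atomic upgrade would then raise the timeout explicitly (the release name is an example):

```shell
# --atomic implies --wait, and the default 5m timeout may be too short for a
# sequential auth-then-proxy rollout, so raise it explicitly.
helm upgrade teleport teleport/teleport-cluster \
  --atomic \
  --timeout 30m \
  -f values.yaml
```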

Contributor Author


Added a section about documenting the possible rollout time and timeout tuning in the RFD: ec8823c

Proxies will have an initContainer checking if all auth pods from the past
version were removed. A version check via the Teleport gRPC API (`PingResponse`)
requires valid credentials to connect to Teleport. To work around this issue we
can rely on Kubernetes's DNS-based service discovery to discover how many
Contributor


The DNS server will just reply with the ready pods. To show not-ready pods, you must configure the service with Service.spec.publishNotReadyAddresses: true.

In situations where you upgrade from 1 auth to 3 auths, teleport-auth-v10 will return no endpoints after the first upgrade and you still have 2 pods that are not yet ready/running. It will result in every proxy connecting to a single auth server.

The initContainer must be aware of the number of auth servers. In situations where auth pods are moved to other nodes (cluster upgrades or rebalancing), it should also wait until every auth server is ready.
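The first point maps to a one-line Service field; a sketch of the previous-version Service with it set (label keys are assumptions):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: teleport-auth-v10
spec:
  clusterIP: None                  # headless
  publishNotReadyAddresses: true   # DNS also returns pods that are not Ready yet
  selector:
    app.kubernetes.io/name: teleport-auth    # label keys are assumptions
    teleport.dev/major-version: "10"
```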

Contributor Author


> The DNS server will just reply with the ready pods. To show not-ready pods, you must configure the service with `Service.spec.publishNotReadyAddresses: true`.

👍

> In situations where you upgrade from 1 auth to 3 auths, `teleport-auth-v10` will return no endpoints after the first upgrade and you still have 2 pods that are not yet ready/running. It will result in every proxy connecting to a single auth server.

Interesting edge case. I think this is an acceptable temporary state, as all proxies were previously connected to a single auth instance. This might be mitigated by doing surge-only updates, which is generally not a bad idea to avoid losing capacity during rollouts.

> the init container must be aware of the number of auth servers. In situations where auth pods are moved to other nodes - cluster upgrades or rebalancing - it should also wait until every auth server is ready

I don't know yet what to think about this. It makes sense, but I wonder if it is really necessary, as it would likely involve adding some kind of env variable/cmdline argument. An auth upscale would cause a proxy rollout; plus, in certain cases we have stateless auth (full AWS mode) that users might want to upscale/downscale without Helm.

Contributor Author


The more I think about those cases, the more I just want an operator to take care of the updates instead of hacking through Kubernetes 🙃

Contributor Author


Added the publishNotReadyAddresses in the RFD: 4c1ca75


@hugoShaka hugoShaka requested a review from r0mant November 15, 2022 17:24
@hugoShaka
Contributor Author

hugoShaka commented Nov 15, 2022

> I think this RFD needs to include details about additional updates we wanted to make to the charts such as
>
>   • Deployment resources always named as release name
>   • Lack of standard static labels on Teleport deployed resources e.g. app.kubernetes.io/component
>   • Including auth connectors
>   • Service monitors for metrics

@r0mant I added a commit covering labels, deploying CRs and serviceMonitor. Resources are already named based on the release name.

@klizhentas
Contributor

I will review, but would be more helpful if @webvictim and @bcg62 would take a look

@bcg62
Contributor

bcg62 commented Nov 16, 2022

> I will review, but would be more helpful if @webvictim and @bcg62 would take a look

@gclawes and I will provide feedback.

@hugoShaka hugoShaka marked this pull request as ready for review November 21, 2022 13:09
@github-actions github-actions Bot requested review from jakule and lxea November 21, 2022 13:10
Contributor

@webvictim webvictim left a comment


I really like the proposal overall and think that both decoupling the auth/proxy pods and offloading full config validation to Teleport itself rather than implementing a small subset in Helm will make customers' lives a lot easier. I have zero doubt that people have already built their own changes around this and the migration will likely be a breaking change for them, but unfortunately a little friction is necessary to make the journey smoother for everyone else.


hugoShaka added a commit that referenced this pull request Jan 5, 2023
Part of [RFD-0096](#18274)

This PR adds helm hooks deploying a test configuration job and running `teleport configure --test` to validate the `teleport.yaml` configuration is sane.
Comment thread rfd/0096-helm-chart-revamp.md Outdated
Contributor


Don't we use PodMonitor?

Comment thread rfd/0096-helm-chart-revamp.md Outdated
Contributor


It seems like app.kubernetes.io/component should have a value that indicates which teleport components are active. auth+kube or proxy+kube, etc...

If not in that label, is there another label that would better indicate the teleport component(s) that are active?

It would make the helm chart feel more in tune with the product. It'd be great if this could be generated based on the rendered teleport.yaml in the configmap for the deployment, if possible.

Contributor Author

@hugoShaka hugoShaka Jan 9, 2023


Fair, currently it would be "proxy" and "auth+kube". However, I'd like to move the kube service out of auth pods soon; I kept this off the main PR in a failed attempt to limit the number of changes. Eventually, this should be proxy, auth, and kube. I'll update the RFD to reflect this.

However, I think it will be easier to just write "auth" instead of "auth+kube" today: we rely on those labels as selectors, and Kubernetes does not handle label edits well, so the label change would require another downtime when we move the kube service out of the auth pods.

Comment thread rfd/0096-helm-chart-revamp.md Outdated
Contributor


Since helm can't deploy the CRD and its CRs in the same release, would it be possible to support bootstrap.yaml in the helm chart? That would hopefully allow folks to do a one-shot initial deploy that doesn't require them to separately create custom resources or exec into the pod.

Contributor


Technically you can do bootstrap.yaml with the Helm chart currently, using extraVolumes, extraVolumeMounts and extraArgs. Maybe we could just document how to do this better, as it's intended to be a one-off thing?
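The workaround being described could look roughly like this in values.yaml (a sketch — the ConfigMap must be created separately beforehand, which is exactly the two-step friction discussed here):

```yaml
extraArgs:
  - "--bootstrap"
  - "/etc/teleport-bootstrap/bootstrap.yaml"
extraVolumes:
  - name: bootstrap
    configMap:
      name: teleport-bootstrap   # user-created ConfigMap holding bootstrap.yaml
extraVolumeMounts:
  - name: bootstrap
    mountPath: /etc/teleport-bootstrap
    readOnly: true
```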

Contributor


extraVolume/extraVolumeMounts/extraArgs also require that the end-user do a two-step operation to get that initial data in place. Allowing the chart to natively support as much of this initialization process as possible all from the chart values will reduce the need for external steps/scripts/processes.

Contributor Author

@hugoShaka hugoShaka Jan 10, 2023


From what I read, --bootstrap does not support all resources (only User, CA, TrustedCluster, Role, AuthConnectors) and allows you to import severely broken states (it does some validation of the CA material but does not check if a user's roles exist, for example). Therefore, the flag should only be used for restoring an existing state and must not be used as a substitute for the CR/CRD mechanism.

We do need to make populating resources easier, but I think `--bootstrap` is not the adequate way.

Comment thread rfd/0096-helm-chart-revamp.md Outdated
Comment on lines 221 to 229
Contributor


Are we intending to add a migration guide from a custom config to the new format in the docs? I know a lot of people are already using custom and so it'd be nice to guide them through the process to ease the pain of a breaking change.

Contributor Author

@hugoShaka hugoShaka Jan 9, 2023


Like this guide? https://github.com/gravitational/teleport/pull/19881/files#diff-c7dd7c2ab1f43e582d7662823d997dc4317c42d334fb179a4b3a5cd4dbc911dd

Feel free to jump in the PR if you have ideas on how to make the migration less painful

hugoShaka added a commit that referenced this pull request Jan 10, 2023
Part of #18274

This commit introduces a new hidden `wait` CLI subcommand:

- `teleport wait no-resolve <domain-name>` resolves a domain name and exits only when no IPs are resolved. This CLI command should be used in the Helm chart, as an init-container, to block proxies from rolling out until all auth pods have been successfully rolled-out.

- `teleport wait duration 30s` has the same behaviour as `sleep 30`. Due to image hardening we won't have `sleep` available, but waiting 30 seconds in a preStop hook is required to ensure a 100% seamless pod rollout on kube-proxy-based clusters.
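In the chart, the `no-resolve` subcommand would presumably be wired into the proxy initContainer roughly like this (a sketch — the Service FQDN and image path are assumptions):

```yaml
initContainers:
  - name: wait-auth-update
    image: public.ecr.aws/gravitational/teleport   # image path is an assumption
    command:
      - teleport
      - wait
      - no-resolve
      - teleport-auth-v10.teleport.svc.cluster.local   # previous-major headless Service
```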
@hugoShaka hugoShaka force-pushed the rfd/0096-helm-chart-revamp branch from 9a07a29 to 03f3eb4 Compare January 11, 2023 14:48
Comment thread rfd/0096-helm-chart-revamp.md Outdated
hugoShaka and others added 2 commits January 11, 2023 10:40
@hugoShaka hugoShaka enabled auto-merge (squash) January 11, 2023 16:03
@hugoShaka hugoShaka merged commit c5dfffc into master Jan 11, 2023
@hugoShaka hugoShaka deleted the rfd/0096-helm-chart-revamp branch January 11, 2023 16:09
hugoShaka added a commit that referenced this pull request Jan 11, 2023
This commit refactors the `teleport-cluster` Helm chart to deploy proxy and auth pods separately.
It allows users to pass raw teleport configuration to the deployed Teleport nodes.
Finally, it removes the `custom` chart mode as the mode was broken by the split. A new `scratch` mode has been introduced.

See [the corresponding RFD](#18274) describing the design.
hugoShaka added a commit that referenced this pull request Jan 13, 2023
Part of [RFD-096](#18274): managing the major upgrades safely

This commit's main purpose is to block proxies running a new Teleport major version from connecting to auth pods running an old Teleport version.

This commit does 3 things:
- adding initContainers and preStop hooks to the `teleport-cluster` Helm chart (initContainers were designed in RFD 096, preStop was a nice addition coming from [the wait PR](#19277))
- fixing a bug in the `wait` command (the DNS error was not properly unwrapped and not recognized as a DNS error)
- fixing missing override support on some auth Deployment values. As a rule of thumb for future review, we should not use .Values directly and prefer using $auth and $proxy