| 2. Notify Admins via Cluster Alerts when Unaccompanied Agents are detected. The | ||
| Cluster Alert should also present a docs link informing Admins how to | ||
| properly add an auto upgrader service. |
There was a problem hiding this comment.
By design, we can't really know if this is the case. The updater is Teleport-independent so that the agent can fail and still be updated. The agent only knows whether it should export its maintenance schedule, not whether an updater is reading it.
We can detect unaccompanied agents in two ways:
- The agent is not configured to export its maintenance schedule. As described earlier, this is not 100% equivalent to being unaccompanied: an agent can be accompanied without a maintenance schedule (update ASAP), or it can export the schedule while its updater is broken, suspended, or missing entirely.
- The agent is not running the right version. This second approach might be closer to what we want to achieve: not having to deal with version skew.
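A minimal sketch of these two signals, for illustration only. The `Agent` record and its `exports_schedule` field are hypothetical names, not Teleport's actual data model, and the version comparison ignores pre-release ordering:

```python
from dataclasses import dataclass


@dataclass
class Agent:
    hostname: str
    version: str            # e.g. "13.0.1"
    exports_schedule: bool  # hypothetical flag: agent exports its maintenance schedule


def parse_version(version: str) -> tuple:
    """Turn "13.0.1" (or "13.0.1-dev.1") into (13, 0, 1); the pre-release suffix is dropped."""
    core = version.split("-", 1)[0]
    return tuple(int(part) for part in core.split("."))


def possibly_unaccompanied(agent: Agent) -> bool:
    """Signal 1: no schedule export. Only a hint, since an updater may still be present."""
    return not agent.exports_schedule


def is_outdated(agent: Agent, target_version: str) -> bool:
    """Signal 2: the agent runs a version older than the target version."""
    return parse_version(agent.version) < parse_version(target_version)
```

The second check needs no knowledge of the updater at all, which is why it sidesteps the detection problem described above.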
There was a problem hiding this comment.
@xinding33 Can we just update our existing "your agents are outdated" alerts with a link to docs on how to enable auto-upgrades for Cloud customers?
There was a problem hiding this comment.
@hugoShaka You're right, we likely don't need to actually identify Unaccompanied Agents. Admins will be happy as long as we solve the following:
- Notify Admins when Agents are not running the latest version.
- Help Admins identify exactly which Agent(s) is/are problematic.
- Help Admins "fix" said Agent(s) once and for all by providing instructions on 1) how to upgrade the Agent and 2) how to add an Agent Auto Upgrader Service to the Agent.
There was a problem hiding this comment.
@r0mant IMO, that's not sufficient because it doesn't satisfy 2 and 3 of the comment above.
There was a problem hiding this comment.
@xinding33 we've started discussing implementation of this RFD and need some clarification on a few points:
> Help Admins identify exactly which Agent(s) is/are problematic.
Do we actually want users to be able to list all agents that are older than a given version and/or aren't configured to use an upgrader, or do we just want the alert message to include some examples of offending agents to help users catch agents that fell through the cracks?
Actually being able to list offending agents is obviously nicer, but is also a much more complex ask since it requires us to either implement the logic for tracking and filtering on upgrader status for each service type (brittle and annoying to maintain), or to enable the per-agent "instance" heartbeat which we switched to disabled by default when we opted to use the more weakly coordinated upgrade system (known to cause scalability issues for some etcd-based deployments). Both options are doable, but neither is really ideal.
Including a few hostnames in an alert is comparatively easy, since each auth server can directly observe information about the agents connected to it, and write them to a resource. But that has the downside of only really being useful in the case where agents are generally OK, but one or two fell through the cracks.
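As a rough illustration of the "few hostnames in an alert" option, the sketch below builds an alert message from a list of outdated hostnames. The function name, message wording, and `max_examples` parameter are assumptions made for this example, not an existing Teleport API:

```python
from typing import List, Optional


def build_outdated_alert(outdated_hostnames: List[str], max_examples: int = 3) -> Optional[str]:
    """Return an alert message naming a few outdated agents, or None if there are none."""
    if not outdated_hostnames:
        return None
    # Sort so the message is stable across runs, then keep only a few examples.
    examples = sorted(outdated_hostnames)[:max_examples]
    remaining = len(outdated_hostnames) - len(examples)
    message = (
        f"{len(outdated_hostnames)} agent(s) are running an outdated version, "
        f"e.g. {', '.join(examples)}"
    )
    if remaining > 0:
        message += f" (and {remaining} more)"
    return message
```

With a large number of offenders the message only hints at the problem, which matches the limitation noted above: it helps most when one or two agents fell through the cracks.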
> Help Admins "fix" said Agent(s) once and for all by providing instructions on 1) how to upgrade the Agent and 2) how to add an Agent Auto Upgrader Service to the Agent.
Are you thinking of something more than just pointing a user at the docs? If so what?
There was a problem hiding this comment.
> Do we actually want users to be able to list all agents that are older than a given version and/or aren't configured to use an upgrader, or do we just want the alert message to include some examples of offending agents to help users catch agents that fell through the cracks?
Yes, we want users to be able to list all agents that are out-of-date. That's where the real value is because then users can take that list and fix everything. I don't have opinions on how we should go about the implementation.
> Are you thinking of something more than just pointing a user at the docs? If so what?
We definitely want to link to docs but if we can provide a one-liner that helps users fix the problem 90% of the time, that'd be an excellent addition.
| * (Teleport) Agent: The `teleport` process which can run one more more Teleport | ||
| services. |
There was a problem hiding this comment.
nit: this definition covers teleport processes running auth and proxy, even if cloud users are not supposed to run them.
| reduce the significant workload associated with manually managing and upgrading | ||
| a fleet of Teleport Agents. | ||
|
|
||
| ## How |
There was a problem hiding this comment.
This section doesn't differentiate between tenants enrolled in automatic updates and tenants who are not. Even if in the end we want to enroll everyone, we might want to explain that this only applies to the tenants enrolled in this new mode.
| * An easy (ideally one command) way to add an auto upgrader service to an | ||
| Unaccompanied Teleport Agent. |
There was a problem hiding this comment.
I think this is already doable: `apt install teleport-ent-updater` and `helm upgrade --reuse-values my-agent-release teleport/teleport-kube-agent --set updater.enabled=true`.
My biggest question is how we expose this command to the user. Through Teleport Discover? Docs only? Can we have something Teleport Assist-style that runs the command on behalf of users?
There was a problem hiding this comment.
@fspmarshall @fheinecke Will installing an updater package if you already have teleport installed be sufficient for enabling auto-upgrades?
@hugoShaka I think we should include this in the guides you're working on, for starters.
There was a problem hiding this comment.
Yes, installing the updater package on an existing teleport agent is all you need to do. The updater package's install process automatically configures teleport to start exporting schedules and restarts it.
| * Any commands exposed in the Teleport Web UI, including those in Teleport | ||
| Discover. |
There was a problem hiding this comment.
I think most of the work was already done in: #22731, especially for Helm.
The only remaining thing would be to change the package name for apt/yum installs.
| services. | ||
| * (Teleport) Cluster Alerts: Messages in the Teleport Web UI and appropriate | ||
| Teleport CLIs that alert Practitioners of relevant concerns. | ||
| * Unaccompanied Agent: A Teleport Agent deployed without an accompanying auto upgrader |
There was a problem hiding this comment.
It feels like we've defined this as the inverse of an accompanied agent, but we haven't yet defined what an accompanied agent or an auto upgrader service is.
| * Unaccompanied Agent: A Teleport Agent deployed without an accompanying auto upgrader | ||
| service. | ||
|
|
||
| ## Why |
There was a problem hiding this comment.
I would mention the driver for making it easy to keep agents up to date: so that customers' agents stay compatible with the control plane that we host.
There was a problem hiding this comment.
Are we happy with a customer who doesn't rely on our automatic updates but has strong automation that quickly aligns their agents' version with our target version?
There was a problem hiding this comment.
@hugoShaka Theoretically, we're ok with that. But in practice, very few customers actually do that. We want to make it easy for customers to adopt best practice with the least amount of friction, so we want them to do it our way unless they know exactly what they're doing.
|
|
||
| ## How | ||
|
|
||
| There are two key tactical UX change that will increase the adoption of Agent |
There was a problem hiding this comment.
| There are two key tactical UX change that will increase the adoption of Agent | |
| There are two key tactical UX changes that will increase the adoption of Agent |
|
|
||
| The first tactical change requires modifications to: | ||
|
|
||
| * https://goteleport.com/download/. |
There was a problem hiding this comment.
Have we decided how we want to determine whether it's a cloud or self-hosted user that's downloading from the Downloads page? IIRC that was the main question. Do we want to add "scopes" to the Downloads page similar to how we have scopes in the documentation? Or something else?
There was a problem hiding this comment.
@r0mant IMO, we shouldn't add scope to the Downloads page. Rather, we should add required options/flags in the package manager commands that differentiate between cloud and self-hosted.
|
|
||
| 1. Where possible, push Admins to deploy Teleport Agents via supported package | ||
| managers (i.e. `helm`, `apt`, and `yum`) as this will deploy the | ||
| `teleport-ent-cloud-updater` package alongside the `teleport` package. It's |
There was a problem hiding this comment.
| `teleport-ent-cloud-updater` package alongside the `teleport` package. It's | |
| `teleport-ent-cloud-updater` package alongside the `teleport-ent` package. It's |
| 1. Where possible, push Admins to deploy Teleport Agents via supported package | ||
| managers (i.e. `helm`, `apt`, and `yum`) as this will deploy the | ||
| `teleport-ent-cloud-updater` package alongside the `teleport` package. It's |
There was a problem hiding this comment.
> as this will deploy the `teleport-ent-cloud-updater` package alongside the `teleport` package
To be clear, installing teleport still only installs teleport, but installing the updater causes teleport to also be installed if it wasn't already.
I am going to close this one out since it's been superseded by Managed Updates.
Teleport will add support for Agent Auto Upgrades in v13.0.0. This PR adds an RFD that details tactical UX changes to expose the Agent Auto Upgrade capabilities to Teleport Cloud Admins.