feat: Bot instance upgrade status dashboard#60019
nicholasmarais1158 merged 25 commits into master
Rebased on top of master after #59896 was merged.
    {error ? (
      <Alert kind="danger" m={3}>
        {error.message}
      </Alert>
    ) : undefined}

Suggested change:

    {error && (
      <Alert kind="danger" m={3}>
        {error.message}
      </Alert>
    )}
This is a style I've used throughout the frontend for bots. It's not the first challenge I've had, so I'll replace them in a separate PR dedicated to just that.
    return (
      <UpgradeStatusContainer>
        <ChartTitleText>Upgrade Status</ChartTitleText>
        <BarsContainer>
There may be a (perhaps future) opportunity to unify and reuse some code between this one and UsageSummary from our enterprise code.
Where do I find the Usage Dashboard in the product?
I think it appears in the dashboard mode, when you launch it as a cloud server. There were some additional prerequisites, but I frankly don't remember. It's probably easiest to just set up a cloud tenant.
Haven't looked at the code, but a couple thoughts from the demo video:
…pgrade # Conflicts: # web/packages/teleport/src/BotInstances/List/BotInstancesList.tsx
@nicholasmarais1158 See the table below for backport results.
* Add plumbing for new metrics endpoint
* Align version compatibility logic
* Fix mocked responses in stories
* Add new dashboard component
* Wire-in dashboard component
* Fix lint
* Explain dynamic `refetchInterval`
* docs: `onFilterSelected`
* Use typography components from design package
* Fix `onFilterSelected` naming inconsistencies
* A better nbsp
* Remove "control plane" terminology
* Refactor `GetBotInstanceMetricsResponse` type
* Handle out-of-date proxy
* Make instance list messaging filter aware
* Update chart title to "version compatibility"
* Keep "Last updated x minutes ago" label current
* Oops, forgot to update the test
* Remove unused `TitleText`
* Change dashboard title to "insights"
* Version compatibility design changes
* Fix tests after copy change, oops
* docs(rfd): Bot Instances at Scale (RFD0222) (#57888) * docs(rfd): Bot Instance at Scale (RFD0222) * Add Notion link * First draft for review * fix: Spell check * Fix protobuf code block * Make the document structure slightly clearer * Add a section on data aggregation * Add missing end of code block * Add a bit more context on bots and instances * Add plan for `tctl` * Add an explanation for each UX example * Rename semver functions * Remove activity visualisation and required complexity * cspell * Pre-populate isn't the correct term * Minor tweak * Test plan additions * Remove reference to activity visualization * Clarify use of pagination in `tctl bots instances ls` and remove filter summary * Refine predicate language functions * Restructure protos * Refine plans for resource storage and quantities * Add webapi support for config, health and notices * Expand backwards compatibility * cspell * State `tbot` config size limit * Reduce config max size to 32Kb * Add failsafe env var * Reduce max notices to 10 * Explain updating related record expiry in-line with the instance * Revert extracting service health records * Move notices to earlier in the delivery plan * Remove notices & config and expand the why and what * Expand aggregate data and metrics, and rearrange sections * Fix `version.between` snippet * Updated approach to calculating bot instance counts * Document `newer_than` predicate language function --------- Co-authored-by: Dan Upton <daniel.upton@goteleport.com> * feat: List bot instances version and hostname sort (#59263) * Add version and hostname indexes to cache * Add `ListBotInstancesV2` rpc and use request options * Add v2 bot instance list endpoint * Use v2 endpoint in web UI * Pass signal through to support aborting requests * Fix comment typo * Rename util func * Deprecate `ListBotInstances` rpc * Encode hostname in cache key * Address pre-release sorting in version numbers * Rename bot instance cache utils * Fix lint deprecation warnings * 
Extract filter fields to message * Replace `fmt.Sprintf("%06d", ...)` * Update invalid sort field error * Fallback to v1 endpoint if possible * Use `strcase` for case-insensitive compare * Backend results are filtered by bot name so no need to re-filter in `MatchBotInstance` * Revert "Replace `fmt.Sprintf("%06d", ...)`" This reverts commit 2fbd797. * feat: Bot instances advanced filter (#59374) * Add version and hostname indexes to cache * Add `ListBotInstancesV2` rpc and use request options * Add v2 bot instance list endpoint * Use v2 endpoint in web UI * Pass signal through to support aborting requests * Fix comment typo * Rename util func * Add expression parser * Contribute `to_string` function to default parser * Add API support for `query` filter * Fix `SearchPanel` submit with advanced toggle * Add advanced search to web UI * Deprecate `ListBotInstances` rpc * Encode hostname in cache key * Address pre-release sorting in version numbers * Rename bot instance cache utils * Fix lint deprecation warnings * Extract filter fields to message * Replace `fmt.Sprintf("%06d", ...)` * Update invalid sort field error * Fallback to v1 endpoint if possible * Use `strcase` for case-insensitive compare * Backend results are filtered by bot name so no need to re-filter in `MatchBotInstance` * Use `t.Context()` * Remove expression methods * Remove unnecessary fallback comments * Return early if only bot name filter is required (backend only) * Fix lint * replace `to_string` with `equals` (version type only) * Fix comment * Remove unnecessary `to_string` tests * Switch to a true equals function # Conflicts: # lib/cache/bot_instance.go # web/packages/teleport/src/BotInstances/List/BotInstancesList.tsx # Conflicts: # web/packages/teleport/src/BotInstances/List/BotInstancesList.tsx * fix: Rename bot instance version expression functions (#59819) * feat(webui): New bot instances experience (#59655) * Make `disableSearch` props optional for SearchPanel component * Add a shared mock 
for TextEditor * Make instance items selectable and include bot name * Remove old bot instance details page * Add new bot instances UI * Add stories * Add and amend tests * Switch to arrow function * Remove `null` from item selected callback * Fix info guide wording # Conflicts: # web/packages/teleport/src/BotInstances/Details/BotInstanceDetails.test.tsx # web/packages/teleport/src/BotInstances/Details/BotInstanceDetails.tsx # web/packages/teleport/src/BotInstances/List/BotInstancesList.tsx * feat: Bot instances design review (#59897) * Alternate sort menu icons * Titles and close button * Yaml background colour * Spacing * Keyboard selectable list items * Fix selected list item padding * Default scroll bars for list * Clarify delete bot messaging * Simplify `onClick` * Use `FlexProps` type * Revert "Alternate sort menu icons" This reverts commit 4212dcd. * feat: Add filtering and sort to `tctl bots instances ls` (#60273) * Fix missing `--format` flag * Use v2 rpc * Add `--search` flag * Add `--query` flag * Add `--sort-index` and `--sort-order` flags * Remove `generation` and add `version` fields to output * Allow enabling the auth cache for the test process * Add list bot instances tests * Sync join method access logic between tctl and web * Access `authentication.JoinMethod` safely * Unhide `--format` flag * Simplify version header label * Fallback to v1 ListBotInstances * Refactor to remove use of `authclient.ClientI` * A way better fallback implementation 🙌 * typo 🙄 * Refactor to single interface # Conflicts: # tool/tctl/common/bots_command.go * docs: Add filter, sort and format flags to `tctl bots instances ls` reference (#60508) * docs: Add filter, sort and format fields to `tctl bots instances ls` reference * Using consistent capitalization Co-authored-by: Paul Gottschling <paul.gottschling@goteleport.com> --------- Co-authored-by: Paul Gottschling <paul.gottschling@goteleport.com> * feat: Add service health to `tctl bots instances ls|show` (#60316) * Add 
`service_health` to bot instance protos * Add aggregated service health to `show` * Add services section to `show` * Add health status column to `ls` * Extra tabs are not welcome * Handle zero services when aggregating health status # Conflicts: # tool/tctl/common/bots_command.go * MWI: Add `/webapi/.../machine-id/bot-instance/metrics` endpoint (#59896) * Add `autoupdate_bot_instance_report` to the editor role preset * Add `/webapi/.../machine-id/bot-instance/metrics` endpoint * Add missing error check in test * Better error message when metrics aren't ready * Allow users with `bot_instance:list` to read the `autoupdate_bot_instance_report` * Move update timestamp onto upgrade statuses object * Fix predicate language function names * Remove erroneous comment * Fix tests * Add `refresh_after_seconds` to metrics response * Return an empty `upgrade_statuses` if there is no report * Replace `exact_version` helper with simple `==` operator * Use `trace.Aggregate` to return both auth errors * feat: Bot instance upgrade status dashboard (#60019) * Add plumbing for new metrics endpoint * Align version compatibility logic * Fix mocked responses in stories * Add new dashboard component * Wire-in dashboard component * Fix lint * Explain dynamic `refetchInterval` * docs: `onFilterSelected` * Use typography components from design package * Fix `onFilterSelected` naming inconsistencies * A better nbsp * Remove "control plane" terminology * Refactor `GetBotInstanceMetricsResponse` type * Handle out-of-date proxy * Make instance list messaging filter aware * Update chart title to "version compatibility" * Keep "Last updated x minutes ago" label current * Oops, forgot to update the test * Remove unused `TitleText` * Change dashboard title to "insights" * Version compatibility design changes * Fix tests after copy change, oops * feat: Bot instance service health (#60133) * Add tabs to instance details * Add `kind` to bot instance heartbeat proto * Extend `GetBotInstanceResponse` 
type * Add `InfoTab` component for Overview tab * Add `HealthTab` component for Services tab * Wire-up tabs content * Use `join_attrs.meta` for join token fields * Fix links style * Fix handling of unspecified health status * Fix tab spacing * Remove tab tooltips * Replace service item background * Add zero services story * Support tctl instance kind * Fix styled links * Fix test * Fix bot instances story * feat: Allow instances to be selectable from bot details (#60717) * Make instance items selectable from bot details * Add test * Fix mocked calls in stories * fix: Bot instance health status dot inconsistency (#60786) --------- Co-authored-by: Dan Upton <daniel.upton@goteleport.com> Co-authored-by: Paul Gottschling <paul.gottschling@goteleport.com>
Summary
This change adds a dashboard to the bot instances list. It is displayed when no instance item is selected. A breakdown of instance counts by upgrade status gives a high-level overview, and clicking an item on the dashboard filters the list to show the instances with that status. The metrics data is periodically refetched based on a value (`refresh_after_seconds`) from the API (or every 1 min, as a fallback).
Updates: #55926
Depends on: #59896
Changelog: Added a dashboard to visualize bot instances by their upgrade status
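The refetch behaviour described in the summary could be sketched roughly as follows. This is only an illustration, not the code from this PR: the `MetricsResponse` interface and the `refetchIntervalMs` helper are hypothetical names, and only the `refresh_after_seconds` field and the 1 minute fallback come from the description above.

```typescript
// Hypothetical response shape: only `refresh_after_seconds` is named in the
// PR summary; the real type carries more fields.
interface MetricsResponse {
  refresh_after_seconds?: number;
}

// Fallback used when the API gives no usable hint ("every 1 min").
const FALLBACK_INTERVAL_MS = 60 * 1000;

// Returns how long to wait, in milliseconds, before refetching the metrics.
// Missing, non-finite, zero, or negative hints fall back to one minute.
function refetchIntervalMs(data: MetricsResponse | undefined): number {
  const seconds = data?.refresh_after_seconds;
  if (typeof seconds !== 'number' || !Number.isFinite(seconds) || seconds <= 0) {
    return FALLBACK_INTERVAL_MS;
  }
  return seconds * 1000;
}
```

A helper like this would typically be called with the most recent response each time the polling interval is recomputed, so the interval follows whatever the API last suggested.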
Changes
Demo
Screen.Recording.2025-10-07.at.17.12.10.mov
Reviewer notes
Here's a script to insert a bunch of bot instances to make testing easier: bot_instances.sql
Be sure to restart your cluster afterwards; the cache will not be notified of the changes.