feat: List bot instances version and hostname sort#59263

Merged: nicholasmarais1158 merged 21 commits into master from nicholasmarais1158/feat/mwi-bot-instances-v2 on Sep 25, 2025
nicholasmarais1158 commented Sep 17, 2025

Summary

This change adds two extra sort fields for bot instances: version and hostname. As is already the case, sorting by anything other than name (for bot instances, that means bot name and instance ID) is powered by the in-memory cache and its indexes.

In addition, I've taken this opportunity to refactor how sort and filter params are passed. For backwards compatibility, a /v2 incarnation supersedes the existing list bot instances webapi endpoint. Similarly, the ListBotInstancesV2 RPC takes the same parameters, but does so in an options struct, which makes it easier to extend with new filters.
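
As a rough illustration of why an options struct is easier to extend than positional parameters, here is a minimal Go sketch. Every name in it (`ListOptions`, its fields, `normalize`, the accepted sort field values and the default page size) is hypothetical and chosen for illustration; it is not the actual `ListBotInstancesV2Request` definition.

```go
package main

import "fmt"

// ListOptions is a hypothetical stand-in for the options carried by a
// V2-style list request. Field names are illustrative assumptions.
type ListOptions struct {
	PageSize         int
	PageToken        string
	SortField        string // assumed values: "bot_name", "version", "host_name"
	SortDesc         bool
	FilterBotName    string
	FilterSearchTerm string
}

// validSortFields lists the sort fields this sketch accepts.
var validSortFields = map[string]bool{
	"bot_name":  true,
	"version":   true,
	"host_name": true,
}

// normalize applies defaults and rejects unknown sort fields. Adding a new
// filter later means adding a field to ListOptions, without changing any
// function signature.
func normalize(opts ListOptions) (ListOptions, error) {
	if opts.PageSize <= 0 {
		opts.PageSize = 20 // arbitrary default for the sketch
	}
	if opts.SortField == "" {
		opts.SortField = "bot_name"
	}
	if !validSortFields[opts.SortField] {
		return ListOptions{}, fmt.Errorf("unsupported sort field %q", opts.SortField)
	}
	return opts, nil
}

func main() {
	opts, err := normalize(ListOptions{SortField: "version", SortDesc: true})
	fmt.Println(opts.SortField, opts.PageSize, err)

	_, err = normalize(ListOptions{SortField: "uptime"})
	fmt.Println(err)
}
```

Callers that only care about one option set just that field and leave the rest at their zero values.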

The new endpoint is used in the web UI and the additional sorting has been made available. This is a small step towards rebuilding the bot instances page completely as part of RFD0222.

Updates: MWI: Instances at Scale > [Phase 1] New instance UI, and version filters

Changes

  • Add sorting by version and hostname to web UI
  • Refactor RPC sort and filter params
  • Pass the abort signal from Tanstack Query to fetch - used to cancel requests when sort/filter changes or when leaving the page

Demo

Screen.Recording.2025-09-17.at.14.31.23.mov

Reviewer notes

Creating bot instances (as well as heartbeat records) requires running tbot to enrol an instance. To make testing this feature easier, I've used a SQL script to insert instance resources into the backend directly. Here's the script:
bot_instances.sql

$ sqlite3 <your cluster's data dir>/backend/sqlite.db
sqlite> .read bot_instances.sql

nicholasmarais1158 force-pushed the nicholasmarais1158/feat/mwi-bot-instances-v2 branch from 5be9170 to ce44caf September 18, 2025 08:08
nicholasmarais1158 changed the title from "feat: List bot instances extra sorting and refactor" to "feat: List bot instances refactor sort and filter" Sep 18, 2025
nicholasmarais1158 changed the title from "feat: List bot instances refactor sort and filter" to "feat: List bot instances version and hostname sort" Sep 18, 2025
nicholasmarais1158 added the no-changelog label (indicates that a PR does not require a changelog entry) Sep 18, 2025
nicholasmarais1158 marked this pull request as ready for review September 18, 2025 09:01
Comment on lines 85 to +113
// ListBotInstances returns a page of BotInstance resources.
rpc ListBotInstances(ListBotInstancesRequest) returns (ListBotInstancesResponse);
// ListBotInstancesV2 returns a page of BotInstance resources.
rpc ListBotInstancesV2(ListBotInstancesV2Request) returns (ListBotInstancesResponse);
Contributor

Do we want to mark the old one as deprecated?

Contributor Author

Yes, thanks. I'll update the uses in tctl as part of a later task.
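
Callers that move from the deprecated RPC to the V2 one may still talk to servers that predate it; the commit list on this PR includes "Fallback to v1 endpoint if possible". The Go sketch below shows one way such a fallback could look. The `lister` interface, the `errNotImplemented` sentinel, and the rule that v1 can only serve name-sorted requests are all assumptions made for the example, not the code in this PR.

```go
package main

import (
	"errors"
	"fmt"
)

// errNotImplemented is a hypothetical sentinel standing in for whatever
// "unimplemented" error the real transport returns from an older server.
var errNotImplemented = errors.New("not implemented")

// lister is a hypothetical client interface exposing both list calls.
type lister interface {
	ListV2(sortField string) ([]string, error)
	ListV1() ([]string, error)
}

// listWithFallback prefers the V2 call and falls back to V1 only when the
// server does not implement V2 and the request needs nothing V1 lacks.
func listWithFallback(c lister, sortField string) ([]string, error) {
	items, err := c.ListV2(sortField)
	if err == nil {
		return items, nil
	}
	if !errors.Is(err, errNotImplemented) {
		return nil, err
	}
	// Assumption: V1 can only sort by name, so other sort fields cannot
	// be served by the fallback.
	if sortField != "" && sortField != "bot_name" {
		return nil, fmt.Errorf("sorting by %q requires the v2 endpoint: %w", sortField, err)
	}
	return c.ListV1()
}

// oldServer simulates a server that predates the V2 call.
type oldServer struct{}

func (oldServer) ListV2(string) ([]string, error) { return nil, errNotImplemented }
func (oldServer) ListV1() ([]string, error)       { return []string{"bot-a/instance-1"}, nil }

func main() {
	items, err := listWithFallback(oldServer{}, "bot_name")
	fmt.Println(items, err)

	_, err = listWithFallback(oldServer{}, "version")
	fmt.Println(err)
}
```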

Comment thread lib/cache/bot_instance.go
Comment on lines 187 to 189
if heartbeat != nil {
	values = append(values, heartbeat.Hostname, heartbeat.JoinMethod, heartbeat.Version, "v"+heartbeat.Version)
	hostname = heartbeat.GetHostname()
}
Contributor

Should we base32 this, or can we rely on hostnames not ever containing a slash?

Contributor Author

I don't think hostname can have a slash, but I'll encode it to be safe.
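
To make the concern concrete: if the hostname is one segment of a slash-separated index key, a stray slash would shift the later segments. A minimal Go sketch of encoding the hostname segment follows. The key layout, the function names, and the choice of unpadded base32 with the "extended hex" alphabet (which RFC 4648 describes as preserving sort order) are assumptions for illustration; the PR's actual encoding may differ.

```go
package main

import (
	"encoding/base32"
	"fmt"
	"strings"
)

// keyEncoding is unpadded base32hex. Its alphabet (0-9, A-V) contains no
// slash, so an encoded segment can never collide with the key separator.
var keyEncoding = base32.HexEncoding.WithPadding(base32.NoPadding)

// hostnameIndexKey builds a hypothetical "hostname/bot/instance" index key
// with the hostname segment encoded.
func hostnameIndexKey(hostname, botName, instanceID string) string {
	return strings.Join([]string{
		keyEncoding.EncodeToString([]byte(hostname)),
		botName,
		instanceID,
	}, "/")
}

// decodeHostname recovers the original hostname from the first key segment.
func decodeHostname(key string) (string, error) {
	encoded, _, _ := strings.Cut(key, "/")
	raw, err := keyEncoding.DecodeString(encoded)
	if err != nil {
		return "", fmt.Errorf("decoding hostname segment: %w", err)
	}
	return string(raw), nil
}

func main() {
	key := hostnameIndexKey("odd/host", "my-bot", "instance-1")
	fmt.Println(key)

	hostname, err := decodeHostname(key)
	fmt.Println(hostname, err)
}
```

The trade-off is that encoded keys are no longer human-readable, so any debugging output needs to decode the segment first.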

Comment thread lib/cache/bot_instance.go
Comment on lines +174 to 180
version = fmt.Sprintf("%06d.%06d.%06d", sv.Major, sv.Minor, sv.Patch)
if sv.PreRelease != "" {
	version = version + "-" + string(sv.PreRelease)
}
if sv.Metadata != "" {
	version = version + "+" + sv.Metadata
}
Contributor

I don't know how much we care, but the ordering of semver is more complicated than just zero-padding the triplet of version numbers (1.2.3 is greater than 1.2.3-a, 1.1.1-beta.4 is smaller than 1.1.1-beta.32).

Contributor Author

We don't tend to use the pre-release part in tbot versioning other than during major releases or local/custom builds. So, we'll keep the naive implementation. I've supported 1.2.3 being greater than 1.2.3-a, but not 1.1.1-beta.4 being smaller than 1.1.1-beta.32.

Thanks for raising this - I was not aware of the ordering logic applicable to the pre-release part.
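
The trade-off discussed in this thread can be sketched as a string sort key. The Go example below is one possible way to get "1.2.3 sorts after 1.2.3-a" from plain string comparison; the `version` struct, the `sortKey` name, and the trailing `~` marker for releases are assumptions made for the example and are not taken from the PR. It deliberately keeps the naive behaviour for numeric pre-release identifiers.

```go
package main

import (
	"fmt"
	"sort"
)

// version is a minimal stand-in for a parsed semantic version.
type version struct {
	Major, Minor, Patch int64
	PreRelease          string
}

// sortKey returns a string whose byte-wise order approximates semver order.
// The numeric triplet is zero-padded so that 1.10.0 sorts after 1.2.0.
// Releases get a trailing "~", which sorts after the "-" that introduces a
// pre-release, so 1.2.3 sorts after 1.2.3-a.
func sortKey(v version) string {
	key := fmt.Sprintf("%06d.%06d.%06d", v.Major, v.Minor, v.Patch)
	if v.PreRelease != "" {
		return key + "-" + v.PreRelease
	}
	return key + "~"
}

func main() {
	versions := []version{
		{Major: 1, Minor: 2, Patch: 3},
		{Major: 1, Minor: 2, Patch: 3, PreRelease: "a"},
		{Major: 1, Minor: 10, Patch: 0},
		{Major: 1, Minor: 1, Patch: 1, PreRelease: "beta.32"},
		{Major: 1, Minor: 1, Patch: 1, PreRelease: "beta.4"},
	}
	sort.Slice(versions, func(i, j int) bool {
		return sortKey(versions[i]) < sortKey(versions[j])
	})
	for _, v := range versions {
		fmt.Println(sortKey(v))
	}
}
```

Because pre-release identifiers are compared as text, `beta.4` still sorts after `beta.32` here, which is the limitation accepted above.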

Comment thread lib/cache/bot_instance.go Outdated
Comment on lines +179 to +187
zeroPad := func(num int) string {
	length := 6
	s := strconv.Itoa(num)
	var b strings.Builder
	b.Grow(length)
	b.WriteString(strings.Repeat("0", length-len(s)))
	b.WriteString(s)
	return b.String()
}
Contributor

This is almost surely worse than just using fmt.Sprintf.

Contributor Author

Reverted.
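
The reviewer does not say why the hand-rolled helper is worse. One concrete difference, shown in the Go sketch below, is behaviour on inputs wider than the pad length: `strings.Repeat` panics on a negative count, while `fmt.Sprintf("%06d", ...)` simply prints the number unpadded. The function names are made up for the example, and the `recover` wrapper exists only so the demonstration can report the panic as an error.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// zeroPadSprintf pads to six digits using the fmt verb.
func zeroPadSprintf(n int) string {
	return fmt.Sprintf("%06d", n)
}

// zeroPadManual mirrors the hand-rolled approach. The deferred recover turns
// the panic from a negative Repeat count into an error for demonstration.
func zeroPadManual(n int) (s string, err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("zeroPadManual(%d): %v", n, r)
		}
	}()
	digits := strconv.Itoa(n)
	return strings.Repeat("0", 6-len(digits)) + digits, nil
}

func main() {
	fmt.Println(zeroPadSprintf(42))
	fmt.Println(zeroPadSprintf(1234567))

	s, err := zeroPadManual(42)
	fmt.Println(s, err)

	_, err = zeroPadManual(1234567)
	fmt.Println(err)
}
```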

nicholasmarais1158 added this pull request to the merge queue Sep 25, 2025
github-merge-queue bot removed this pull request from the merge queue due to failed status checks Sep 25, 2025
nicholasmarais1158 added this pull request to the merge queue Sep 25, 2025
github-merge-queue bot removed this pull request from the merge queue due to failed status checks Sep 25, 2025
nicholasmarais1158 added this pull request to the merge queue Sep 25, 2025
Merged via the queue into master with commit a45e9b2 Sep 25, 2025
47 of 48 checks passed
nicholasmarais1158 deleted the nicholasmarais1158/feat/mwi-bot-instances-v2 branch September 25, 2025 12:24
@backport-bot-workflows

@nicholasmarais1158 See the table below for backport results.

Branch      Result
branch/v17  Failed
branch/v18  Failed

nicholasmarais1158 added a commit that referenced this pull request Oct 23, 2025
* Add version and hostname indexes to cache

* Add `ListBotInstancesV2` rpc and use request options

* Add v2 bot instance list endpoint

* Use v2 endpoint in web UI

* Pass signal through to support aborting requests

* Fix comment typo

* Rename util func

* Deprecate `ListBotInstances` rpc

* Encode hostname in cache key

* Address pre-release sorting in version numbers

* Rename bot instance cache utils

* Fix lint deprecation warnings

* Extract filter fields to message

* Replace `fmt.Sprintf("%06d", ...)`

* Update invalid sort field error

* Fallback to v1 endpoint if possible

* Use `strcase` for case-insensitive compare

* Backend results are filtered by bot name so no need to re-filter in `MatchBotInstance`

* Revert "Replace `fmt.Sprintf("%06d", ...)`"

This reverts commit 2fbd797.
nicholasmarais1158 added a commit that referenced this pull request Oct 23, 2025
# Conflicts:
#	web/packages/teleport/src/BotInstances/BotInstances.tsx
nicholasmarais1158 added a commit that referenced this pull request Oct 29, 2025
nicholasmarais1158 added a commit that referenced this pull request Oct 30, 2025
rhammonds-teleport pushed a commit that referenced this pull request Nov 6, 2025
github-merge-queue bot pushed a commit that referenced this pull request Nov 13, 2025
* docs(rfd): Bot Instances at Scale (RFD0222) (#57888)

* docs(rfd): Bot Instance at Scale (RFD0222)

* Add Notion link

* First draft for review

* fix: Spell check

* Fix protobuf code block

* Make the document structure slightly clearer

* Add a section on data aggregation

* Add missing end of code block

* Add a bit more context on bots and instances

* Add plan for `tctl`

* Add an explanation for each UX example

* Rename semver functions

* Remove activity visualisation and required complexity

* cspell

* Pre-populate isn't the correct term

* Minor tweak

* Test plan additions

* Remove reference to activity visualization

* Clarify use of pagination in `tctl bots instances ls` and remove filter summary

* Refine predicate language functions

* Restructure protos

* Refine plans for resource storage and quantities

* Add webapi support for config, health and notices

* Expand backwards compatibility

* cspell

* State `tbot` config size limit

* Reduce config max size to 32Kb

* Add failsafe env var

* Reduce max notices to 10

* Explain updating related record expiry in-line with the instance

* Revert extracting service health records

* Move notices to earlier in the delivery plan

* Remove notices & config and expand the why and what

* Expand aggregate data and metrics, and rearrange sections

* Fix `version.between` snippet

* Updated approach to calculating bot instance counts

* Document `newer_than` predicate language function

---------

Co-authored-by: Dan Upton <daniel.upton@goteleport.com>

* feat: List bot instances version and hostname sort (#59263)

* Add version and hostname indexes to cache

* Add `ListBotInstancesV2` rpc and use request options

* Add v2 bot instance list endpoint

* Use v2 endpoint in web UI

* Pass signal through to support aborting requests

* Fix comment typo

* Rename util func

* Deprecate `ListBotInstances` rpc

* Encode hostname in cache key

* Address pre-release sorting in version numbers

* Rename bot instance cache utils

* Fix lint deprecation warnings

* Extract filter fields to message

* Replace `fmt.Sprintf("%06d", ...)`

* Update invalid sort field error

* Fallback to v1 endpoint if possible

* Use `strcase` for case-insensitive compare

* Backend results are filtered by bot name so no need to re-filter in `MatchBotInstance`

* Revert "Replace `fmt.Sprintf("%06d", ...)`"

This reverts commit 2fbd797.

* feat: Bot instances advanced filter (#59374)

* Add version and hostname indexes to cache

* Add `ListBotInstancesV2` rpc and use request options

* Add v2 bot instance list endpoint

* Use v2 endpoint in web UI

* Pass signal through to support aborting requests

* Fix comment typo

* Rename util func

* Add expression parser

* Contribute `to_string` function to default parser

* Add API support for `query` filter

* Fix `SearchPanel` submit with advanced toggle

* Add advanced search to web UI

* Deprecate `ListBotInstances` rpc

* Encode hostname in cache key

* Address pre-release sorting in version numbers

* Rename bot instance cache utils

* Fix lint deprecation warnings

* Extract filter fields to message

* Replace `fmt.Sprintf("%06d", ...)`

* Update invalid sort field error

* Fallback to v1 endpoint if possible

* Use `strcase` for case-insensitive compare

* Backend results are filtered by bot name so no need to re-filter in `MatchBotInstance`

* Use `t.Context()`

* Remove expression methods

* Remove unnecessary fallback comments

* Return early if only bot name filter is required (backend only)

* Fix lint

* replace `to_string` with `equals` (version type only)

* Fix comment

* Remove unnecessary `to_string` tests

* Switch to a true equals function
# Conflicts:
#	lib/cache/bot_instance.go
#	web/packages/teleport/src/BotInstances/List/BotInstancesList.tsx

# Conflicts:
#	web/packages/teleport/src/BotInstances/List/BotInstancesList.tsx

* fix: Rename bot instance version expression functions (#59819)

* feat(webui): New bot instances experience (#59655)

* Make `disableSearch` props optional for SearchPanel component

* Add a shared mock for TextEditor

* Make instance items selectable and include bot name

* Remove old bot instance details page

* Add new bot instances UI

* Add stories

* Add and amend tests

* Switch to arrow function

* Remove `null` from item selected callback

* Fix info guide wording
# Conflicts:
#	web/packages/teleport/src/BotInstances/Details/BotInstanceDetails.test.tsx
#	web/packages/teleport/src/BotInstances/Details/BotInstanceDetails.tsx
#	web/packages/teleport/src/BotInstances/List/BotInstancesList.tsx

* feat: Bot instances design review (#59897)

* Alternate sort menu icons

* Titles and close button

* Yaml background colour

* Spacing

* Keyboard selectable list items

* Fix selected list item padding

* Default scroll bars for list

* Clarify delete bot messaging

* Simplify `onClick`

* Use `FlexProps` type

* Revert "Alternate sort menu icons"

This reverts commit 4212dcd.

* feat: Add filtering and sort to `tctl bots instances ls` (#60273)

* Fix missing `--format` flag

* Use v2 rpc

* Add `--search` flag

* Add `--query` flag

* Add `--sort-index` and `--sort-order` flags

* Remove `generation` and add `version` fields to output

* Allow enabling the auth cache for the test process

* Add list bot instances tests

* Sync join method access logic between tctl and web

* Access `authentication.JoinMethod` safely

* Unhide `--format` flag

* Simplify version header label

* Fallback to v1 ListBotInstances

* Refactor to remove use of `authclient.ClientI`

* A way better fallback implementation 🙌

* typo 🙄

* Refactor to single interface
# Conflicts:
#	tool/tctl/common/bots_command.go

* docs: Add filter, sort and format flags to `tctl bots instances ls` reference (#60508)

* docs: Add filter, sort and format fields to `tctl bots instances ls` reference

* Using consistent capitalization

Co-authored-by: Paul Gottschling <paul.gottschling@goteleport.com>

---------

Co-authored-by: Paul Gottschling <paul.gottschling@goteleport.com>

* feat: Add service health to `tctl bots instances ls|show` (#60316)

* Add `service_health` to bot instance protos

* Add aggregated service health to `show`

* Add services section to `show`

* Add health status column to `ls`

* Extra tabs are not welcome

* Handle zero services when aggregating health status
# Conflicts:
#	tool/tctl/common/bots_command.go

* MWI: Add `/webapi/.../machine-id/bot-instance/metrics` endpoint (#59896)

* Add `autoupdate_bot_instance_report` to the editor role preset

* Add `/webapi/.../machine-id/bot-instance/metrics` endpoint

* Add missing error check in test

* Better error message when metrics aren't ready

* Allow users with `bot_instance:list` to read the `autoupdate_bot_instance_report`

* Move update timestamp onto upgrade statuses object

* Fix predicate language function names

* Remove erroneous comment

* Fix tests

* Add `refresh_after_seconds` to metrics response

* Return an empty `upgrade_statuses` if there is no report

* Replace `exact_version` helper with simple `==` operator

* Use `trace.Aggregate` to return both auth errors

* feat: Bot instance upgrade status dashboard (#60019)

* Add plumbing for new metrics endpoint

* Align version compatibility logic

* Fix mocked responses in stories

* Add new dashboard component

* Wire-in dashboard component

* Fix lint

* Explain dynamic `refetchInterval`

* docs: `onFilterSelected`

* Use typography components from design package

* Fix `onFilterSelected` naming inconsistencies

* A better nbsp

* Remove "control plane" terminology

* Refactor `GetBotInstanceMetricsResponse` type

* Handle out-of-date proxy

* Make instance list messaging filter aware

* Update chart title to "version compatibility"

* Keep "Last updated x minutes ago" label current

* Oops, forgot to update the test

* Remove unused `TitleText`

* Change dashboard title to "insights"

* Version compatibility design changes

* Fix tests after copy change, oops

* feat: Bot instance service health (#60133)

* Add tabs to instance details

* Add `kind` to bot instance heartbeat proto

* Extend `GetBotInstanceResponse` type

* Add `InfoTab` component for Overview tab

* Add `HealthTab` component for Services tab

* Wire-up tabs content

* Use `join_attrs.meta` for join token fields

* Fix links style

* Fix handling of unspecified health status

* Fix tab spacing

* Remove tab tooltips

* Replace service item background

* Add zero services story

* Support tctl instance kind

* Fix styled links

* Fix test

* Fix bot instances story

* feat: Allow instances to be selectable from bot details (#60717)

* Make instance items selectable from bot details

* Add test

* Fix mocked calls in stories

* fix: Bot instance health status dot inconsistency (#60786)

---------

Co-authored-by: Dan Upton <daniel.upton@goteleport.com>
Co-authored-by: Paul Gottschling <paul.gottschling@goteleport.com>
nicholasmarais1158 added a commit that referenced this pull request Nov 18, 2025
# Conflicts:
#	api/gen/proto/go/teleport/machineid/v1/bot_instance_service.pb.go
#	web/packages/teleport/src/BotInstances/BotInstances.tsx