Combine `ClustersService` logout functions (#59539)
```typescript
   */
  /** Logs out of the cluster. */
  async logout(clusterUri: uri.RootClusterUri) {
    // TODO(gzdunek): logout and removeCluster should be combined into a single action in tshd
```
> To preserve the current behavior, I'd need to move both the `logout` and `removeCluster` methods. That separation feels like a poor API design and made me revisit the original PR #24978 that introduced it.
As far as I remember that PR didn't introduce `logout` & `removeCluster`. There are two separate RPCs for that because in the alpha version of Connect it was possible to log out of a cluster without removing it from the list of clusters in the app. Similar to how you can disconnect a gateway and only then remove it from the connections.
Though I think I see what you mean in the context of #24978 splitting the logout sequence into first changing connected to false and actually removing the cluster from the state at the very end.
> Even today, if you remove a profile from disk and then call `ClustersService.syncRootClustersAndCatchErrors`, the app will likely crash because the cluster is suddenly missing.
I understand this becomes a larger concern when ~/.tsh sharing gets implemented, right? Because at the moment I don't think there are many opportunities to trigger `ClustersService.syncRootClustersAndCatchErrors` beyond the app start, but looking at its call sites, what you described is technically possible.
> Most usages already perform a null check, so it was more consistent and easier to apply the same pattern to the remaining cases.
That was surprising to me because if I had to bet I wouldn't have said that this is the case. 😅 I think in my head I've always assumed that we've had this sweet little invariant where the existence of a workspace at least implies that a root cluster is available.
I don't know, I'm not entirely opposed to this change, it just feels like a big departure from something I've always assumed was invariant, so I do feel a bit uneasy about it.
Perhaps we should document functions returning Cluster from ClustersService to note that they might return no cluster?
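Such documentation (and, with `strictNullChecks`, the type itself) could make the nullability explicit at the source. A minimal sketch; this `ClustersService` shape is a simplified stand-in for illustration, not the real class:

```typescript
interface Cluster {
  uri: string;
  name: string;
  connected: boolean;
}

class ClustersService {
  constructor(private clusters: Map<string, Cluster> = new Map()) {}

  /**
   * Returns the cluster with the given URI, or undefined when it's not in
   * the store (e.g. because its profile was removed from disk).
   * Callers must handle the undefined case.
   */
  findCluster(clusterUri: string): Cluster | undefined {
    return this.clusters.get(clusterUri);
  }
}
```

With `strictNullChecks` enabled, the compiler would then force every caller to narrow the result before use.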
> As far as I remember that PR didn't introduce `logout` & `removeCluster`. There are two separate RPCs for that because in the alpha version of Connect it was possible to log out of a cluster without removing it from the list of clusters in the app.
Yeah, I meant the methods in ClustersService, not the tshd RPCs. Before that PR, the logout sequence started from the logout in tshd and removing the ClustersService state. In the PR, we switched to removing the state at the very end.
Tbh, even the fix that we added could have been done more simply. In the comment for `ClustersService.logout` we said:

> Code that operates on that intermediate state is in `useClusterLogout.tsx`.

After invoking `logout()`, it looks for the next workspace to switch to. If we hadn't marked the cluster as disconnected, the method might have returned us the same cluster we wanted to log out of.
We could as well explicitly filter out that cluster when looking for the next connected workspace in `useClusterLogout` :)
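That alternative fix could look roughly like this; `getNextWorkspaceUri` and the `Workspace` shape here are hypothetical simplifications, not the actual app code:

```typescript
interface Workspace {
  rootClusterUri: string;
  connected: boolean;
}

/**
 * Picks the next workspace to switch to after logging out of
 * `loggedOutClusterUri`, explicitly skipping that cluster instead of
 * relying on it having been marked as disconnected beforehand.
 */
function getNextWorkspaceUri(
  workspaces: Workspace[],
  loggedOutClusterUri: string
): string | undefined {
  return workspaces.find(
    w => w.connected && w.rootClusterUri !== loggedOutClusterUri
  )?.rootClusterUri;
}
```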
> I understand this becomes a larger concern when ~/.tsh sharing gets implemented, right? Because at the moment I don't think there are many opportunities to trigger `ClustersService.syncRootClustersAndCatchErrors` beyond the app start, but looking at its call sites what you described is technically possible.
This change isn't strictly necessary for sharing ~/.tsh, but having a single method helps make the logic a bit cleaner.
I assume that in the ideal world, it would work like this:
1. The profile watcher detects a logout.
2. It calls logout on the cluster service (in the main process) to update the internal state.
3. It sends a request to the renderer to clean up its local state (or to multiple renderers in theory).
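The steps above could be sketched roughly like this; every name here is hypothetical, and the real code would involve IPC between the main and renderer processes rather than direct calls:

```typescript
interface ClustersService {
  logout(clusterUri: string): Promise<void>;
}

interface RendererClient {
  /** Asks the renderer(s) to remove the workspace and other local state. */
  cleanUpWorkspace(clusterUri: string): Promise<void>;
}

/**
 * Reacts to a logout detected by the profile watcher (step 1): the main
 * process updates its own state first (step 2), then asks the renderer(s)
 * to clean up theirs (step 3).
 */
async function onProfileLoggedOut(
  clusterUri: string,
  clustersService: ClustersService,
  renderer: RendererClient
): Promise<void> {
  await clustersService.logout(clusterUri);
  await renderer.cleanUpWorkspace(clusterUri);
}
```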
It could still be four steps (as we have it today), where step 2 only calls tshd and sets `.connected = false`, and a separate step 4 actually removes the cluster, but that's tighter coupling between the renderer and main process than seems necessary.
One alternative is switching steps 2 and 3, so the cluster is removed at the very end. That way, it would be removed after the workspace.
However, this still doesn't address the other issue: `ClustersService.syncRootClustersAndCatchErrors` can be triggered beyond just the app start, which could cause a mismatch between workspaces and clusters.
So maybe it’s cleaner to have the null checks, unless we can guarantee that these stores are always in sync (or alternatively, prevent this function from being called beyond app initialization).
But hmm, now that I think of it, maybe it actually makes more sense to switch the steps? So the renderer first needs to remove the workspace and other dependencies, and then we attempt to log out in tsh and remove the cluster (and we can forget about `ClustersService.syncRootClustersAndCatchErrors`).
> That was surprising to me because if I had to bet I wouldn't have said that this is the case. 😅
Maybe it wasn't the majority, but we did have 17 places with null checks, plus roughly that many more that I added in this PR.
After discussing it through DMs with Rafał, we decided to switch the steps and perform the logout at the end of the logout sequence.
It makes more sense this way, since a cluster can exist without a workspace, but not the other way around.
If the logout in tshd fails, the app will remain usable. The cluster will still appear in the profile selector, allowing the user to retry the logout or open a new workspace for it.
To address the type issues, we should pass a cluster through the workspace context, so that we won't need all these null checks.
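The idea is to resolve the cluster once at the workspace boundary so consumers get a non-nullable value. A framework-free sketch of that pattern; `makeWorkspaceContext` and the shapes here are hypothetical, and the real code would use a React context:

```typescript
interface Cluster {
  uri: string;
  name: string;
}

interface WorkspaceContext {
  rootClusterUri: string;
  // Resolved once when the workspace context is created, so consumers
  // get a non-nullable Cluster instead of re-running findCluster() and
  // null-checking everywhere.
  rootCluster: Cluster;
}

function makeWorkspaceContext(
  rootClusterUri: string,
  findCluster: (uri: string) => Cluster | undefined
): WorkspaceContext {
  const rootCluster = findCluster(rootClusterUri);
  if (!rootCluster) {
    // Fail fast at the boundary instead of deep inside UI code.
    throw new Error(`Cluster ${rootClusterUri} is missing`);
  }
  return { rootClusterUri, rootCluster };
}
```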
When it comes to `ClustersService.syncRootClustersAndCatchErrors`, it should be called only once, before creating the workspaces. I left a TODO item to fix the one incorrect usage.
ravicious left a comment
Tested a couple of scenarios with it and it seems to work fine.
```typescript
}

/** Disposes cluster-related resources and then logs out. */
export async function logoutWithCleanup(
```
Would it make sense to move this to `web/packages/teleterm/src/ui/ClusterLogout/logoutWithCleanup.ts`? I know it'll be used by more than just `ClusterLogout`, but `appContext` doesn't feel like the best place for it, and it really is just related to logging out, so it'd be hard to find a better place for it than `ClusterLogout`.
I think we can move it there.
* Remove clusters immediately after a logout, move `useClusterLogout` to `AppContext`
* Review callsites to ensure cluster is properly checked before being accessed
* Revert "Review callsites to ensure cluster is properly checked before being accessed" This reverts commit 8343c3c.
* Switch to removing the cluster at the end of logout sequence
* Lint
* Move `logoutWithCleanup` to `ui/ClusterLogout`
* Combine `ClustersService` logout functions (#59539)
  * Remove clusters immediately after a logout, move `useClusterLogout` to `AppContext`
  * Review callsites to ensure cluster is properly checked before being accessed
  * Revert "Review callsites to ensure cluster is properly checked before being accessed" This reverts commit 8343c3c.
  * Switch to removing the cluster at the end of logout sequence
  * Lint
  * Move `logoutWithCleanup` to `ui/ClusterLogout`
  (cherry picked from commit de6b4ed)
* Enable sending messages from main to renderer with acknowledgments (#59642)
  * Create awaitable sender
  * Review comments
  * Fix test and lint
  (cherry picked from commit 5dc76fe)
* Move cluster state to main process (#59643)
  * Create `ClusterStore` that manages cluster state
  * Fix tests that mocked tshd directly
  * Remove IPC to notify the main process about cluster list changes
  * Load immer plugins in `MainProcess`
  * Improve comments
  * Refactor `useSender`
  * Get rid of unnecessary Map and try/catch around send
  * Get rid of `MainProcess.create`
  * Do not return early `c.proxyHost` is falsy
  * Add more context to test
  * Add missing logout handler in main process
  * Fix applying patches
  * Adjust `subscribeToClusterStore` to updated `startAwaitableSenderListener`
  * Crash window when sending state update fails
  * Extract WebContents navigation handlers and add tests for opening links
  * Improve error message
  * Initialize `ClusterStore` synchronously
  * Convert `lazyTshdClient` field to `getTshdClient` function, add docs
  * Remove unused eslint directive
  (cherry picked from commit a41d021)
* Connect: make logout function idempotent (#60553)
  * Remove `ClusterRemove` RPC, make logging out idempotent
  * Move calling `removeKubeConfig` and `maybeRemoveAppUpdatesManagingCluster` to main process. The main process should not depend on the renderer to clean up its own resources.
  * Remove cleaning up kube dir
  * Lint
  (cherry picked from commit 2d1bc7b)
* Connect: add profile watcher (#60622)
  * Add profile watcher
  * Move `makeClusterWithOnlyProfileProperties` to `profileWatcher.ts`, improve test
  * Handle watched directory removal
  * Improve comments
  * Make tests faster, pass abort signal everywhere
  * Improve docs
  * Make `removing tsh directory does not break watcher` easier to understand
  * Make test dir per test
  * Improve timing in tests
  * Add a limit of how many events can be emitted by `fs.watch` (to break the endless stream of events on Windows when watched dir is removed), go into the polling mode only when it's expected that the watched dir was removed
  * Use `expect().rejects.toThrow` correctly
  * Deflake 'max file system events count is restricted'
  * Replace `makeClusterWithOnlyProfileProperties` with `mergeClusterProfileWithDetails`, move it back to `cluster.ts`
  * Attempt to fix tests
  * Clarify comment
  (cherry picked from commit d4e6f19)
* Initialize tshdClients in MainProcess constructor (#61044) (cherry picked from commit c7a4233)
* Connect: react to tsh actions by watching tsh dir (#60884)
  * Add `ClusterLifecycleManager`
  * Register handlers for adding, removing and logging out from cluster
  * Provide `rootCluster` in `useWorkspaceContext`. The handlers in the profile watcher will proceed with updating the cluster store, even if the renderer handlers returned errors. This check protects us from a runtime error if the renderer fails to remove the workspace.
  * Improve docs
  * Move processing queue to listener
  * Make `will-` operations always interrupt main process actions
  * Improve error messages
  * Do not remove managing cluster when **only** logging out. The app updater displays all clusters, not just those the user is logged into.
  * Revert "Provide `rootCluster` in `useWorkspaceContext`" This reverts commit cf76d2b.
  * Rename `logoutWithCleanup` to `cleanUpBeforeLogout`
  * Do not pass `AbortSignal` to `this.mainProcessClient.syncRootClusters`
  * Lint
  * Fix types issues
  * Do not stack watcher notifications
  (cherry picked from commit 5fa8249)
* Connect: close cluster clients when profile changes (#61090)
  * Include expiration time in `LoggedInUser`. This will allow the profile watcher to detect when the user relogged.
  * Display expiration time in UI
  * Add `ClearStaleClusterClients` RPC
  * Implement `ClearStaleClusterClients`
  * Clear stale clients when profile changes
  * Improve session expiration component
  * Move refresh button back to top
  * `ClearCachedStaleClientsForRoot` -> `ClearStaleCachedClientsForRoot`
  * `unchanged` -> `stale`
  * Make "closing stale clients" a subtest
  * Add `clientcache` test
  * Remove `getProfile` error wrapping
  * Improve comment
  * Convert story to controls
  (cherry picked from commit 6615e42)
* Gracefully handle missing `current-profile` and respect `TELEPORT_PROXY` in `tsh status` (#61295)
  * Respect `TELEPORT_PROXY` env var in `tsh status`
  * Enable listing profiles if there is no active profile
  * Add test
  * Define `err` within the block where it's actually used
  * Handle missing current profile in `tsh logout`
  * Make check more explicit
  * Revert mistakenly commited change
  (cherry picked from commit 95bec3a)
* Connect: switch tsh home directory to ~/.tsh (#61352)
  * Switch tsh home directory to ~/.tsh
  * Migrate old tsh home to new location, disallow updating fields outside the `state` key in app_state.json from the renderer process
  * Show banner about migrated tsh home
  * `promoteMigratedTshHome` -> `showTshHomeMigrationBanner`
  * `MigratedTshHomeBanner` -> `TshHomeMigrationBanner`
  * 'Profiles are' -> 'Profiles are now', remove unnecessary space
  * Fix assigning colors for new workspaces
  * Improve logs
  (cherry picked from commit 54b5f6c)
* Connect: refresh resources when access changes and add tests for `ClusterLifecycleManager` (#61479)
  * Detect when user's access changes
  * Refresh resources in UI when `did-change-access` is received
  * Add tests for `ClusterLifecycleManager`
  * Add better docs for ClusterLifecycleEvent
  * Test assuming requests too
  * Improve test names
  (cherry picked from commit 4b00520)
* Set up deep links as soon as possible (#61668) (cherry picked from commit 0b5ab6b)
* Serialize IPC errors (#61665)
  * Serialize all enumerable error fields
  * Add wrappers around `ipcMain.handle` and `ipcRenderer.invoke`
  * Fix `Method Error.prototype.toString called on incompatible receiver undefined`
  * Improve docs
  * Lint
  (cherry picked from commit a1f2ae0)
* Fix unrecoverable ssh cert errors in tsh/Connect (#61322)
  * Initialize default Username/HostLogin only in tsh
  * Move `Username()` from `api.go` to `tsh.go`
  * Remove wrong `Profile.SiteName` default
  * Remove resetting `SiteName`. Not sure why it was needed. Perhaps to clear the default that we just removed? But even if we add the default back and remove this fix, everything works.
  * Gracefully handle missing SSH/TLS certs
  * Remove unused `TeleportClient.LoadKeyForClusterWithReissue`
  * Revert "Move `Username()` from `api.go` to `tsh.go`" This reverts commit f7ff0ff.
  * Revert "Initialize default Username/HostLogin only in tsh" This reverts commit ed38bab.
  * When any of SSH/TLS cert is missing, return partial profile
  * Only log non-nil errors
  * Revert "Remove wrong `Profile.SiteName` default"
  * Revert "Remove resetting `SiteName`" This reverts commit f54ab3f.
  * Set `SiteName` when adding cluster
  * Improve comments
  * Add test
  * Fix test
  * Add myself to TODO
  * Add test for logging out with missing SSH cert
  * Lint
  (cherry picked from commit cd3c8f8)
* Connect: update docs for sharing ~/.tsh directory (#61467)
  * Update docs for sharing ~/.tsh directory
  * Review comments
  * Lint
  (cherry picked from commit 19533bf)

---------

Co-authored-by: ravicious <rafal.cieslak@goteleport.com>
While working on moving the cluster state to the main process, I got stuck on logging out.
To preserve the current behavior, I'd need to move both the `logout` and `removeCluster` methods. That separation feels like a poor API design and made me revisit the original PR #24978 that introduced it. I now believe it was more of a workaround than a proper fix. The real issue was that several parts of the code assumed that every workspace always has an associated cluster.

That assumption is incorrect; clusters and workspaces are managed separately, and there's no guarantee they'll always stay in sync.
Even today, if you remove a profile from disk and then call `ClustersService.syncRootClustersAndCatchErrors`, the app will likely crash because the cluster is suddenly missing.

If we had `strictNullChecks` enabled, each call to `ClustersService.findCluster()` would return `Cluster | undefined`, requiring the caller to explicitly handle the undefined case.

There are two ways we could address this:

1. Guarantee that a cluster is always available to the code that needs it (e.g., by providing it through `useWorkspaceContext`).
2. Handle a missing cluster explicitly with null checks.

I chose the second approach. Most usages already perform a null check, so it was more consistent and easier to apply the same pattern to the remaining cases.
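The call-site null-check pattern could look like this; `getClusterLabel` and the shapes here are hypothetical illustrations of the pattern, not code from the PR:

```typescript
interface Cluster {
  uri: string;
  name: string;
}

type FindCluster = (uri: string) => Cluster | undefined;

/**
 * Returns a display label for the cluster, falling back to the URI when
 * the cluster is missing from the store (e.g. its profile was removed
 * from disk between syncs).
 */
function getClusterLabel(findCluster: FindCluster, uri: string): string {
  const cluster = findCluster(uri);
  if (!cluster) {
    return uri;
  }
  return cluster.name;
}
```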
I also moved the logout code from `useClusterLogout` to the app context so that the upcoming profile watcher can call it.