diff --git a/apps/desktop/plans/done/20260507-host-service-bundled-version-pin.md b/apps/desktop/plans/done/20260507-host-service-bundled-version-pin.md new file mode 100644 index 00000000000..9b3ec85d9cb --- /dev/null +++ b/apps/desktop/plans/done/20260507-host-service-bundled-version-pin.md @@ -0,0 +1,71 @@ +# Pin host-service adoption to bundled version + +## Problem + +Today `host-service-coordinator.tryAdopt()` adopts any running host-service whose version satisfies `>= MIN_HOST_SERVICE_VERSION` — a manually-bumped floor in `packages/shared/src/host-version.ts`. After an Electron auto-update, the new desktop happily adopts a stale host-service as long as it clears the floor, so non-breaking host-service changes (patches, additive features) never reach users until someone remembers to bump the floor. + +We want auto-updated desktops to always end up running the host-service binary that shipped with them. + +## Why respawn is safe + +Pty-daemon — the only process holding durable session state — has its own manifest + supervisor (`packages/host-service/src/daemon/DaemonSupervisor.ts`) and is adopted independently. Killing host-service drops only: + +- tunnel-client WebSockets (auto-reconnect) +- the in-memory terminal sessions Map (rebuilt on re-attach via daemon-client) +- cached env + +Real PTY state stays in the daemon across the swap. + +## Change + +Replace the floor check with an equality check against the host-service bundled into this Electron build. Reuse the existing kill+respawn path. + +1. `packages/host-service/package.json#version` is the single source of truth, mirroring the `EXPECTED_DAEMON_VERSION` pattern at `packages/host-service/src/daemon/expected-version.ts`. Both `host.info` (runtime, returned to the desktop on adoption probe) and the desktop coordinator import the package.json directly with `with { type: "json" }` and read `.version`. Bumping `packages/host-service/package.json` automatically bumps both — no shared constant to keep in sync. + + To enable the cross-package import on the desktop side, `packages/host-service/package.json` adds `"./package.json": "./package.json"` to its `exports` map (same as `packages/pty-daemon/package.json`). The package.json version is also bumped from `0.1.0` to `0.8.0` to reconcile a pre-existing drift: `host.info` had been hard-coding `"0.8.0"` for several breaking-change ratchets while `package.json` was never bumped past `0.1.0`. After this PR, `package.json` becomes load-bearing — bumping it kills every running host-service on the next launch. +2. In `apps/desktop/src/main/lib/host-service-coordinator.ts:289-308`, replace + ```ts + !semver.satisfies(version, `>=${MIN_HOST_SERVICE_VERSION}`) + ``` + with strict equality: + ```ts + version !== BUNDLED_HOST_SERVICE_VERSION + ``` + The Electron build is the source of truth for which host-service runs against it. Any drift — older or newer — gets killed and respawned from the bundled binary. This keeps the cloud/desktop deploy contract tight (only one host-service version is ever live alongside a given desktop) and avoids the "hand-rolled newer host-service sticks around indefinitely" failure mode. +3. Keep `MIN_HOST_SERVICE_VERSION` only for the renderer-side **remote** host gate at `apps/desktop/src/renderer/routes/_authenticated/_dashboard/v2-workspace/hooks/useRemoteHostStatus/useRemoteHostStatus.ts:91`. We can't kill remote hosts — a floor is still the right shape there. + +Everything else is unchanged: `detached: true` spawn, manifest re-adoption on next start, crash circuit breaker, daemon lifecycle. + +## Behavior after change + +- **Auto-update ships new host-service version** → next Electron launch: adoption check fails, manifest PID gets SIGTERM, fresh host-service spawns from the bundled binary. One brief tunnel reconnect. +- **Auto-update ships same host-service version** → adoption succeeds, no respawn. +- **Dev: locally newer host-service** → killed and replaced with the bundled version. If you need to test against a newer daemon in dev, point the desktop at a build that bundles it; don't hand-roll the running process. +- **Pty-daemon** → never killed by this path; survives across host-service swaps as designed. + +## Open questions + +- **Drain before SIGTERM?** Today line 304 is a hard kill. If respawn cadence becomes user-visible (tunnel reconnect storms), add a short drain — stop accepting new connections, wait N seconds — before SIGTERM. Not needed in v1. + +## Out of scope + +- Pty-daemon version pinning (already handled by `DaemonSupervisor` + `EXPECTED_DAEMON_VERSION`). +- Changing `MIN_HOST_SERVICE_VERSION` semantics for the remote-host renderer gate. +- Any auto-update lifecycle changes — host-service and pty-daemon still must not be torn down by the auto-updater itself; the respawn happens on the *next* launch via the existing adoption path. + +## Outcomes & Retrospective + +The plan above describes a host-service version pin (steps 2–3). That approach was implemented and then **simplified out** — see the retrospective below for what actually shipped. + +**Shipped:** +- **App-version pin in `tryAdopt`.** The host-service manifest gains `spawnedByAppVersion`; the child writes it from `SUPERSET_APP_VERSION` (passed by the coordinator from `app.getVersion()`). On adoption the coordinator requires `manifest.spawnedByAppVersion === app.getVersion()` — any mismatch kills the manifest PID and falls through to `spawn()` from the bundled binary. Pre-existing manifests without the field are coerced to empty string by `readManifest` so the old PID is still found and killed (no orphan). +- **Why app-version, not host-service version.** Host-service ships with the desktop, so `app.getVersion()` strictly subsumes the host-service version pin for the auto-update use case. Bumping the host-service version on a host-service code change is no longer load-bearing for adoption. +- **Host-service version still flows through `host.info`** (`packages/host-service/src/trpc/router/host/host.ts`), now derived from `packages/host-service/package.json` via a `with { type: "json" }` import — kept for telemetry/debugging and because the renderer-side **remote**-host gate (`useRemoteHostStatus`) still compares it against `MIN_HOST_SERVICE_VERSION`. +- `packages/host-service/package.json` was bumped `0.1.0` → `0.8.0` (reconciles the historical drift where `host.info` hard-coded `"0.8.0"` while `package.json` lagged) and then `0.8.0` → `0.8.1`. Added `"./package.json": "./package.json"` to its `exports` so the import works cross-package. +- `semver` import dropped from the coordinator — no longer needed there. + +**Tried and removed:** +- A `BUNDLED_HOST_SERVICE_VERSION` pin in `tryAdopt` (compile-time read of `@superset/host-service/package.json#version`, strict-equality check against `host.info.version`). Implemented, but redundant once the app-version pin landed: in our build flow host-service ships with the desktop, so the app-version pin catches every case the host-service version pin caught — plus the case where someone forgets to bump the host-service version on a host-service code change. Removed to avoid carrying a second concept and a per-adoption HTTP probe that no longer mattered. + +**Deferred:** +- Drain-before-SIGTERM. The hard kill is unchanged; revisit if respawn cadence becomes user-visible. diff --git a/apps/desktop/src/main/host-service/env.ts b/apps/desktop/src/main/host-service/env.ts index 7641208ca13..51c45864b93 100644 --- a/apps/desktop/src/main/host-service/env.ts +++ b/apps/desktop/src/main/host-service/env.ts @@ -12,6 +12,7 @@ export const env = createEnv({ ORGANIZATION_ID: z.string().min(1), DESKTOP_VITE_PORT: z.coerce.number().int().positive(), RELAY_URL: z.string().url().optional(), + SUPERSET_APP_VERSION: z.string().min(1), }, runtimeEnv: process.env, emptyStringAsUndefined: true, diff --git a/apps/desktop/src/main/host-service/index.ts b/apps/desktop/src/main/host-service/index.ts index e67d14bbec3..d7443cfcbbd 100644 --- a/apps/desktop/src/main/host-service/index.ts +++ b/apps/desktop/src/main/host-service/index.ts @@ -74,6 +74,7 @@ async function main(): Promise { authToken: env.HOST_SERVICE_SECRET, startedAt, organizationId: env.ORGANIZATION_ID, + spawnedByAppVersion: env.SUPERSET_APP_VERSION, }); } catch (error) { console.error("[host-service] Failed to write manifest:", error); diff --git a/apps/desktop/src/main/lib/host-service-coordinator.ts b/apps/desktop/src/main/lib/host-service-coordinator.ts index 454a0761386..e32c75759f5 100644 --- a/apps/desktop/src/main/lib/host-service-coordinator.ts +++ b/apps/desktop/src/main/lib/host-service-coordinator.ts @@ -5,10 +5,8 @@ import * as fs from "node:fs"; import path from "node:path"; import { settings } from "@superset/local-db"; import { getHostId, getHostName } from "@superset/shared/host-info"; -import { MIN_HOST_SERVICE_VERSION } from "@superset/shared/host-version"; import { app } from "electron"; import { env } from "main/env.main"; -import semver from "semver"; import { env as sharedEnv } from "shared/env.shared"; import { getProcessEnvWithShellPath } from "../../lib/trpc/routers/workspaces/utils/shell-env"; import { SUPERSET_HOME_DIR } from "./app-environment"; @@ -286,17 +284,11 @@ export class HostServiceCoordinator extends EventEmitter { const url = new URL(manifest.endpoint); const port = Number(url.port); - const version = await this.fetchHostVersion( - manifest.endpoint, - manifest.authToken, - ); - if ( - !version || - !semver.satisfies(version, `>=${MIN_HOST_SERVICE_VERSION}`) - ) { - const reason = version - ? `version ${version} < ${MIN_HOST_SERVICE_VERSION}` - : "version unknown"; + const currentAppVersion = app.getVersion(); + if (manifest.spawnedByAppVersion !== currentAppVersion) { + const reason = manifest.spawnedByAppVersion + ? `spawned by app ${manifest.spawnedByAppVersion} != current ${currentAppVersion}` + : "no recorded app version (pre-upgrade manifest)"; console.log( `[host-service:${organizationId}] Adopted service ${reason}, killing`, ); @@ -322,27 +314,6 @@ export class HostServiceCoordinator extends EventEmitter { return { port, secret: manifest.authToken, machineId: this.machineId }; } - private async fetchHostVersion( - endpoint: string, - secret: string, - ): Promise { - try { - const controller = new AbortController(); - const timeout = setTimeout(() => controller.abort(), 3_000); - const response = await fetch(`${endpoint}/trpc/host.info`, { - signal: controller.signal, - headers: { Authorization: `Bearer ${secret}` }, - }); - clearTimeout(timeout); - if (!response.ok) return null; - const data = await response.json(); - const result = data?.result?.data; - return result?.json?.version ?? result?.version ?? null; - } catch { - return null; - } - } - private readAndValidateManifest( organizationId: string, ): HostServiceManifest | null { @@ -492,6 +463,7 @@ export class HostServiceCoordinator extends EventEmitter { SUPERSET_HOME_DIR: SUPERSET_HOME_DIR, SUPERSET_AGENT_HOOK_PORT: String(sharedEnv.DESKTOP_NOTIFICATIONS_PORT), SUPERSET_AGENT_HOOK_VERSION: HOOK_PROTOCOL_VERSION, + SUPERSET_APP_VERSION: app.getVersion(), AUTH_TOKEN: config.authToken, SUPERSET_API_URL: config.cloudApiUrl, }); diff --git a/apps/desktop/src/main/lib/host-service-manifest.ts b/apps/desktop/src/main/lib/host-service-manifest.ts index 57b55fda736..8e1d61901fc 100644 --- a/apps/desktop/src/main/lib/host-service-manifest.ts +++ b/apps/desktop/src/main/lib/host-service-manifest.ts @@ -15,6 +15,15 @@ export interface HostServiceManifest { authToken: string; startedAt: number; organizationId: string; + /** + * Desktop app version that spawned this host-service. Compared against + * the current `app.getVersion()` on adoption — any mismatch triggers a + * kill + respawn so every Electron auto-update lands on a freshly + * spawned host-service, even when the host-service version pin alone + * would have allowed adoption (e.g. host-service code changed but its + * `package.json#version` was not bumped). + */ + spawnedByAppVersion: string; } export function manifestDir(organizationId: string): string { @@ -60,6 +69,14 @@ export function readManifest( return null; } + // `spawnedByAppVersion` is required going forward, but pre-existing + // manifests on upgraded users won't have it. Coerce to empty string so + // `tryAdopt` still finds the old PID, then trip the app-version pin + // (current version !== "") so the stale daemon gets killed and respawned. + if (typeof data.spawnedByAppVersion !== "string") { + data.spawnedByAppVersion = ""; + } + return data as HostServiceManifest; } catch { return null; diff --git a/bun.lock b/bun.lock index c17af3d7927..a6941f5fcba 100644 --- a/bun.lock +++ b/bun.lock @@ -775,7 +775,7 @@ }, "packages/host-service": { "name": "@superset/host-service", - "version": "0.1.0", + "version": "0.8.1", "dependencies": { "@hono/node-server": "^1.14.1", "@hono/node-ws": "^1.3.0", diff --git a/packages/host-service/package.json b/packages/host-service/package.json index d48953e4b65..fd032f2ff1d 100644 --- a/packages/host-service/package.json +++ b/packages/host-service/package.json @@ -1,6 +1,6 @@ { "name": "@superset/host-service", - "version": "0.1.0", + "version": "0.8.1", "private": true, "type": "module", "exports": { @@ -39,7 +39,8 @@ "./attachments": { "types": "./src/trpc/router/attachments/index.ts", "default": "./src/trpc/router/attachments/index.ts" - } + }, + "./package.json": "./package.json" }, "scripts": { "clean": "git clean -xdf .cache .turbo dist node_modules", diff --git a/packages/host-service/src/trpc/router/host/host.ts b/packages/host-service/src/trpc/router/host/host.ts index f49b99423f4..a73ad330854 100644 --- a/packages/host-service/src/trpc/router/host/host.ts +++ b/packages/host-service/src/trpc/router/host/host.ts @@ -1,30 +1,17 @@ import os from "node:os"; +import hostServicePackageJson from "@superset/host-service/package.json" with { + type: "json", +}; import { getHostId, getHostName } from "@superset/shared/host-info"; import { TRPCError } from "@trpc/server"; import type { ApiClient } from "../../../types"; import { protectedProcedure, router } from "../../index"; -// 0.4.0: terminal launch moved from `terminal.ensureSession` to -// `terminal.launchSession` plus WebSocket attach params. -// 0.3.0: cloud `device.*` router renamed to `host.*`; `device.ensureV2Host` -// is now `host.ensure`, host registrations are keyed on (orgId, machineId) -// composite, and `targetHostId`/`v2_workspaces.host_id` are machineId text -// not uuid. Older host-service binaries call the now-removed `device.*` -// procedures and fail at registration. -// 0.2.0: `workspaceCreation.adopt` accepts optional `worktreePath`. -// 0.5.0: pty-daemon supervision moved into host-service. New -// `terminal.daemon` tRPC namespace; existing 0.4.x host-services -// don't expose it, so the desktop coordinator must refuse to adopt -// them on upgrade and respawn with the new bundle. Adopting in -// place would leave the new desktop talking to old code with no -// `terminal.daemon.*` routes, breaking Settings → Manage daemon. -// 0.7.0: canonical `workspaces.create` flow + `settings.hostAgentConfigs` -// router (PR1, #3893). 0.6.x host-services don't expose either, so -// adopting one in place would break new-project creation and the -// agent-config settings UI. -// 0.8.0: terminal creation moved to `terminal.createSession`; WebSocket -// `/terminal/:terminalId` is attach-only. -const HOST_SERVICE_VERSION = "0.8.0"; +// Auto-derived from this package's package.json so a host-service version +// bump automatically flows through to `host.info` and the desktop's +// strict-equality adoption check (see host-service-coordinator.tryAdopt). +const HOST_SERVICE_VERSION: string = hostServicePackageJson.version; + const ORGANIZATION_CACHE_TTL_MS = 60 * 60 * 1000; let cachedOrganization: { diff --git a/packages/shared/src/host-version.ts b/packages/shared/src/host-version.ts index b028017228f..0e5f5b0c2c2 100644 --- a/packages/shared/src/host-version.ts +++ b/packages/shared/src/host-version.ts @@ -1,8 +1,9 @@ /** - * Minimum host-service version this app can work with. Bumping this forces - * the desktop coordinator to kill + respawn any adopted local service older - * than this, and gates v2 workspace UIs from mounting against a remote host - * whose CLI is still on an older version. + * Minimum host-service version a v2 workspace UI can work with against a + * **remote** host whose binary we don't control (gates renderer mounting + * via `useRemoteHostStatus`). For the local host-service we bundle, the + * desktop coordinator pins to the bundled version exactly (read from + * `@superset/host-service/package.json`) — this floor does not apply. * * 0.4.0: terminal launch moved from `terminal.ensureSession` to * `terminal.launchSession` plus WebSocket attach params. @@ -13,17 +14,12 @@ * * 0.5.0 — pty-daemon supervision migrated into host-service. New * `terminal.daemon` tRPC namespace; older 0.4.x host-services don't - * expose it. Adopting one in place would leave the new desktop - * talking to old code: Settings → Manage daemon would silently - * fail, and the v2 PTY survival promise is broken. + * expose it. * * 0.7.0 — canonical `workspaces.create` flow + `settings.hostAgentConfigs` - * router (PR1, #3893). Older 0.6.x host-services don't expose either, - * so adopting one in place would break new-project creation and the - * agent-config settings UI. + * router (PR1, #3893). Older 0.6.x host-services don't expose either. * * 0.8.0 — v2 terminal creation moved to `terminal.createSession`; the - * WebSocket route is attach-only by `terminalId`. Older host-services would - * reject the renderer's creation call and still expect socket-side startup. + * WebSocket route is attach-only by `terminalId`. */ export const MIN_HOST_SERVICE_VERSION = "0.8.0";