Skip to content

feat: pty-daemon fd-handoff — sessions survive daemon binary upgrades#3971

Merged
Kitenite merged 31 commits intomainfrom
cake-saguaro
May 2, 2026
Merged

feat: pty-daemon fd-handoff — sessions survive daemon binary upgrades#3971
Kitenite merged 31 commits intomainfrom
cake-saguaro

Conversation

@Kitenite
Copy link
Copy Markdown
Collaborator

@Kitenite Kitenite commented May 2, 2026

Summary

Phase 2 of the pty-daemon migration: PTY sessions now survive daemon-binary upgrades. Pre-Phase-2, "Restart and update" SIGTERMs the daemon and all shells die in the gap. After this PR, the user clicks Update (or the upgrade happens automatically) and the new daemon binary takes over via fd inheritance — original shell PIDs survive untouched.

Phase 1 (PR #3896) made daemon survive host-service restarts. Phase 2 makes it survive its own binary swap.

Manually verified in dev today: opened 3 terminal panes, clicked Update three times in succession (chain 62492 → 51479 → 9048 → 62648), all 3 sessions preserved at every step with the same shell PIDs.

How it works

┌─ supervisor (host-service) ───────────────────────────────────────┐
│  update(orgId)                                                     │
│   │ ① wire-message "prepare-upgrade" to running daemon            │
│   ▼                                                                │
│   ┌─ daemon A (running) ──────────────────────────────────────┐   │
│   │ ② write snapshot.json (sessions, fdIndex, ring buffers)   │   │
│   │ ③ spawn daemon B with stdio:                              │   │
│   │    fd 3: 'ipc'  (Node IPC for upgrade-ack)                │   │
│   │    fd 4..N: PTY master fds, one per live session          │   │
│   │   argv: --handoff --snapshot=... --socket=...             │   │
│   └────────────────────────┬──────────────────────────────────┘   │
│   ┌─ daemon B (successor) ▼──────────────────────────────────┐    │
│   │ ④ read snapshot, adoptFromFd for each PTY                │    │
│   │ ⑤ process.send({ type: "upgrade-ack", successorPid })    │    │
│   │ ⑥ wait for IPC `disconnect` (= predecessor exited)       │    │
│   │ ⑦ listen() on the now-free socket path                   │    │
│   └────────────────────────┬─────────────────────────────────┘    │
│   ┌─ daemon A ◀────────────▼─────────────────────────────────┐    │
│   │ ⑧ ack received → reply to supervisor → close + exit(0)   │    │
│   └────────────────────────┬─────────────────────────────────┘    │
│  ⑨ supervisor's adopted-liveness sees A dead → adopts B           │
└────────────────────────────────────────────────────────────────────┘

The kernel's fd refcount holds the master fds alive across the predecessor's exit (stdio inheritance is a dup, not a transfer). Shells never see SIGHUP.

Auto-update on host-service upgrade

When host-service boots and adopts a daemon whose runningVersion < EXPECTED_DAEMON_VERSION (i.e. host-service was just upgraded but the detached daemon is still on the old binary), the supervisor automatically calls update() in the background. Sessions survive on success. On failure, the old daemon stays running and the user retries via the Update button — we never silently force-destroy sessions.

Concurrency-coalesced via per-org updateInFlight map: a manual Update click and the auto-update can race; the second caller observes the same outcome via the cached promise.

Decisions (full /decide walkthrough in apps/desktop/plans/20260501-pty-daemon-phase2-implementation.md)

# Decision Choice
1 What new daemon inherits PTY master fds only (listener inheritance dropped after spike)
2 Old daemon exit timing Ack-then-exit (await upgrade-ack over IPC, then exit)
3 Session metadata transport Snapshot file + manifest pointer
4 Restart UX "Update" is default; "Force restart" stays as opt-in
5 Handoff failure mode Surface to user with dialog (Force update or Cancel)
6 Handoff protocol Separate IPC; client wire stays at v1

Commits (17)

ee5dde53c feat(pty-daemon): expose master fd from Pty adapter
9388ff820 feat(pty-daemon): handoff control-fd protocol
f5028f1f2 feat(pty-daemon): handoff snapshot writer/reader
caf53c119 feat(host-service): manifest fields for handoff state
5a643ebc5 feat(pty-daemon): adopt PTY sessions from inherited master fd
6b53b9776 feat(pty-daemon): cross-process daemon-binary handoff
476245daf feat(host-service): DaemonSupervisor.update() with fd-handoff
6d305cd94 feat(host-service): terminal.daemon.update tRPC procedure
489a53863 feat(desktop): wire Update button to fd-handoff flow
766d95166 test(host-service): supervisor.update() real-spawn handoff coverage
e2dfa34d1 feat(desktop): always show Update daemon button when daemon is up
70810fd55 fix(pty-daemon): surface cwd / shell in spawn errors
0ecb2b550 fix(pty-daemon): switch handoff signal to argv — bundlers DCE env checks
a2d43c236 test(desktop): smoke-check pty-daemon bundle for handoff markers
82c337058 feat(host-service): kill stale daemon on dev-mode startup
9274281f6 feat(pty-daemon): TIOCSWINSZ on adopted PTY fds via stty
a1eeb2c9e feat(host-service): auto-update daemon on adopt with version drift

Notable bugs caught + fixed during dev verification

Bundler DCE eliminated runHandoffReceiver (commit 0ecb2b550). Both Bun's bundler AND electron-vite's esbuild statically inline process.env.X references at build time, constant-fold the unset-at-build-time comparison, and dead-code-eliminate the entire receiver branch. The test suite ran source-direct or against the Bun-built package bundle; neither covered the desktop shim's electron-vite output where the failure mode lived. Fix: handoff signal moves from env to --handoff argv flag (fully dynamic, can't be statically analyzed). The desktop shim was a separate copy of pre-Phase-2 main() that needed to mirror the package's main; it's now in sync.

Regression-armored (commit a2d43c236): apps/desktop/scripts/check-pty-daemon-bundle.ts runs after compile:app and greps the desktop bundle for runHandoffReceiver, --handoff (≥2), upgrade-ack, adoptSnapshot, adoptFromFd. Build fails fast if any marker is missing.

Test plan

  • bun test passes in packages/pty-daemon (34 tests, 6 files) and packages/host-service (450+ tests, 50 files)
  • node --experimental-strip-types --test test/handoff.test.ts in packages/pty-daemon (real cross-process daemon spawn, asserts shell PID survives)
  • node --experimental-strip-types --test src/daemon/DaemonSupervisor.node-test.ts in packages/host-service (9 real-spawn tests including auto-update)
  • bun run check:pty-daemon-bundle in apps/desktop after compile:app (5/5 markers present)
  • Manual QA: bun dev, open several v2 terminal panes, Settings → Manage daemon → Update daemon. Sessions stay alive; toast confirms.
  • Manual QA: Force restart still works (closes sessions, spawns fresh — the destructive escape hatch).

Out of scope

  • Cross-platform. macOS arm64 validated. macOS x86_64 + Linux not yet exercised in this repo (Phase 0 harness apps/desktop/plans/pty-handoff-experiment/ exists for future stress runs).
  • EXPECTED_DAEMON_VERSION bump. This PR adds handoff plumbing but doesn't change daemon wire behavior, so no version bump. Bump when the next change actually changes the daemon's protocol surface.
  • Phase 0 Linux + high-N + SIGKILL stress. Test harness already in apps/desktop/plans/pty-handoff-experiment/; ship validation lives there.

Documentation trail

  • apps/desktop/plans/20260501-pty-daemon-phase2-audit.md — current-state audit before design (on the design-doc branch)
  • apps/desktop/plans/20260501-pty-daemon-phase2-implementation.md — decision log + step sequencing (on the design-doc branch)
  • apps/desktop/plans/pty-handoff-experiment/listener-handoff/ — Phase 2 spike that forced D1 revision (on the design-doc branch)
  • packages/host-service/DAEMON_SUPERVISION.md — Phase 1 reference, unchanged here (will need a follow-up edit to mention Phase 2)

Summary by cubic

Preserve PTY sessions across daemon binary upgrades using fd-handoff. Also fixes the post-upgrade version-reporting bug that kept “Update available” after successful updates.

  • Refactors

    • Versioning: single source of truth — EXPECTED_DAEMON_VERSION and DAEMON_PACKAGE_VERSION now derive from @superset/pty-daemon/package.json; package exports add ./package.json; bumped @superset/pty-daemon to 0.2.0.
    • Snapshots: switch to wire framing (encodeFrame/FrameDecoder), .snap files; snapshot helpers (clearSnapshot, readSnapshot, writeSnapshot) exported for the desktop shim.
    • Test coverage: adds byte-fidelity, cross-process handoff, replay-on-reconnect, and guards that auto-update failures do not disrupt sessions or overwrite the predecessor instance.
    • Desktop build: post-build smoke check verifies fd-handoff markers and uses fileURLToPath for Windows path compatibility.
  • Bug Fixes

    • Fix “Update daemon” stuck as UPDATE AVAILABLE: successor ignores inherited SUPERSET_PTY_DAEMON_VERSION in handoff mode and reads its own bundle version; supervisor retries version probe after predecessor exit to cover the bind gap.
    • Robust handoff: wrap readSnapshot() in successor try/catch and reply upgrade-nak on failure; unlink snapshot on failed handoff; add an error listener to AdoptedPty writer; clearer spawn error by moving the cwd-not-directory check outside the catch.
    • Supervisor + env: host-service strips SUPERSET_PTY_DAEMON_VERSION from the successor’s env during handoff, waits for the predecessor PID to fully exit before probing the new socket version, and now bails with ok:false on timeout instead of probing anyway.
    • Renderer + host-service: prevent duplicate terminal output on daemon swap by skipping ring-buffer replay on WS reconnect, keyed off the first received byte (not just socket open) and plumbed as replayOnAdoption.

Written for commit 8416696. Summary will update on new commits.

Summary by CodeRabbit

  • New Features

    • Live daemon update (handoff) to swap daemon binary without closing active terminal sessions
    • UI: "Update daemon" button with failure dialog and "Force update" fallback
    • Option to avoid replaying terminal scrollback on reconnect; transport honors reconnect flag
    • Supervisor can auto-apply bundled daemon updates opportunistically
    • End-to-end snapshot/handoff to transfer live session state and adopt PTY fds
  • Chores

    • Build scripts split to add an explicit post-build bundle validation step
    • Daemon package version bumped
  • Tests

    • Expanded integration and unit tests for handoff, snapshots, FD adoption, replay behavior, and concurrency guards

Kitenite added 17 commits May 1, 2026 14:45
Phase 2 fd-handoff requires passing the kernel master fd of each PTY to
the successor daemon process via stdio inheritance. node-pty does not
publicly expose this fd; reach into its private `_fd` property and
assert at spawn time so a future node-pty bump can't silently break
handoff months from now.

node-pty stays pinned to 1.1.0 exactly — Phase 0 harness validated this
version's `_fd` shape; the assert in NodePtyAdapter.getMasterFd surfaces
a clear error if the contract ever shifts.
Tiny protocol for the daemon-to-successor control channel during
fd-handoff. The successor sends `upgrade-ack { successorPid }` once it
has adopted the inherited PTY master fds, or `upgrade-nak { reason }`
on a soft failure. Old daemon awaits one or times out.

Travels over a dedicated stdio fd in the successor's inherited stdio
array — NOT exposed to clients. Reuses the existing length-prefixed
JSON framing so it shares encodeFrame / FrameDecoder.

Client wire protocol (messages.ts, version.ts) is untouched.

Also extends mock Pty implementations in handlers.test.ts and
SessionStore.test.ts to satisfy the new getMasterFd() interface
member added in the previous commit.
The handoff snapshot is the on-disk bookkeeping the successor reads to
rebuild SessionStore state. Kernel-side state (PTY master fds) flows
through the spawned successor's stdio array; this snapshot carries the
session id ↔ fdIndex mapping plus per-session metadata and replay buffer.

Atomic write via tmp+rename so the successor only ever sees a complete
file. Exited sessions are filtered out — they have no live PTY fd to
inherit, and their exit event has already been delivered to renderers
(see Server.onExit's delete-on-exit behavior).

`version: 1` is forward-compat scaffolding only — snapshots are written
and consumed by the same daemon-pair on the order of milliseconds, so we
don't need cross-version compatibility yet.
Adds three optional fields to PtyDaemonManifest to track in-flight
fd-handoff state:

- handoffInProgress: true between snapshot-write and successor's bind
- handoffSnapshotPath: where the successor reads its bookkeeping
- handoffSuccessorPid: the spawned successor's pid

All optional + lenient parsing — older host-service builds that don't
know these fields ignore them, and a malformed handoff field doesn't
make the whole manifest unreadable. The handoff state is advisory; if
it gets corrupted, the supervisor can always fall back to the existing
adopt-or-spawn path.
AdoptedPty wraps a PTY master fd inherited from a predecessor daemon
via stdio. The successor doesn't have a node-pty IPty for these
sessions (forkpty was never run; the fd already existed), so we build
a thin adapter directly on the fd:

- read via fs.createReadStream (autoClose: false — supervisor owns)
- write via fs.createWriteStream
- kill via process.kill(pid)
- onExit: read-stream end/error OR PID-liveness poll, whichever first

Resize on adopted sessions is a known gap — TIOCSWINSZ requires native
ioctl access. Today resize() updates meta cols/rows but leaves kernel
window size untouched. Visible only if a user resizes a session that
was carried across an upgrade. Follow-up: koffi or N-API helper.

Test coverage: API-surface checks. End-to-end I/O on an adopted fd
will be validated in the cross-process handoff integration test
(Step 10) — testing in-process is unreliable because node-pty's native
worker reads from the master fd in parallel with adoptFromFd's stream.
Phase 2 sender + receiver: a daemon process spawns a successor that
inherits all live PTY master fds via stdio, adopts them via the
already-built `adoptFromFd`, then takes over the socket. The original
shell PIDs survive the daemon binary swap.

**Wire (client-facing):** new `prepare-upgrade` request and
`upgrade-prepared { result }` reply. The supervisor calls
`DaemonClient.prepareUpgrade()` to drive the flow; result carries the
successor's pid on success or a reason string on failure. Client
protocol stays at v1 — the new pair is additive.

**Daemon-to-successor:** Node IPC channel (stdio[3]='ipc'). The
predecessor sends nothing on the IPC; the successor sends one of
`upgrade-ack { successorPid }` or `upgrade-nak { reason }`. Predecessor
times out after 5s.

**Sequencing landmines this commit handles:**

1. server.close() unlinks the bound socket path. The successor must
   bind only AFTER the predecessor has unlinked, otherwise the
   predecessor's unlink yanks the path entry from under the successor's
   chmod, raising ENOENT. We wait for the IPC channel to disconnect
   (= predecessor process.exit) before binding.
2. The dispatch handler that called prepareUpgrade is responsible for
   sending upgrade-prepared back over the supervisor's wire. Closing
   the server inside prepareUpgrade beat the reply and timed out the
   supervisor. We schedule close+exit via `setImmediate` after the
   dispatcher has had its turn.
3. process.execArgv (--experimental-strip-types in dev/tests) must
   forward to the successor or it can't load .ts files.

**Tests:** test/handoff.test.ts spawns a real daemon binary, opens a
session that sleeps 30s, sends prepare-upgrade, verifies the successor
binary lists the same session id with the same shell pid still alive.
This is the core Phase 2 success criterion.
Wraps the daemon-side prepare-upgrade wire call with the supervisor-side
bookkeeping needed to swap in the successor cleanly:

- Suppress crash-respawn for the predecessor's imminent exit
  (`stopping` set + stop adopted-liveness)
- Mark `handoffInProgress: true` in the manifest so a host-service
  crash mid-handoff is debuggable
- On `upgrade-prepared { ok: true }`: probe successor version, swap
  manifest + instances entry to the new pid, restart adopted-liveness
  on the successor (we don't have a child handle for it)
- On failure: restore manifest, clear `stopping`, re-arm liveness on
  the still-alive predecessor

Distinct from `restart()` — restart kills sessions, update preserves them.
The renderer wires "Update" to this path and falls back to restart() only
on failure.
Wires the supervisor's `update()` to the renderer surface as a
mutation that mirrors `restart()` shape but preserves sessions on
success. The renderer keeps `restart` as the explicit force-restart
fallback; `update` is the new default UX.
Renderer changes for Phase 2's "preserve sessions across upgrades" UX:

- New primary "Update daemon" button when update is pending. Calls
  terminal.daemon.update; on success, sessions stay alive and a toast
  reports the version bump.
- Existing "Restart daemon" stays as the secondary, force-close path
  ("Force restart" when an update is pending). The existing confirmation
  dialog now nudges users toward Update for the non-destructive path.
- New "Update couldn't preserve sessions" failure dialog. Surfaces the
  reason from supervisor.update (snapshot write failed, successor never
  acked, transport error). Two paths from there: Force update (calls
  restart) or Cancel — keeps sessions alive on the predecessor.

This is decision D5 from the Phase 2 /decide walkthrough: surface
handoff failure to the user, let them choose force-update or cancel,
never silently destroy sessions on a soft failure.
End-to-end supervisor coverage at the integration level:

1. update() preserves live sessions across daemon binary swap
   - Spawn fresh daemon via supervisor
   - Open a session over the wire, capture the shell pid
   - Call sup.update(orgId)
   - Verify successor pid differs from predecessor, predecessor is
     dead, successor is alive
   - Reconnect to the (same) socket and assert the session is still
     listed with its original shell pid

2. update() returns ok:false with "no daemon running" when no instance
   exists — guards against the renderer firing update() before
   bootstrap completes.

Also fixes an AdoptedPty teardown leak: with `autoClose: false` on the
read/write streams, the inherited fd would keep the event loop alive
after the shell exits. AdoptedPty's onExit now destroys both streams
and closes the fd explicitly. (Without this, normal daemon shutdown
of an adopted-session daemon stalls waiting for stale handles to drain.)
Update is non-destructive — there's no reason to gate it behind an
`updatePending` flag. Even with no version difference, clicking Update
spawns a fresh successor daemon process that adopts existing sessions
via fd-handoff. Useful in dev for verifying handoff plumbing without
faking a version mismatch, and arguably better UX in prod too — Update
always works, Force restart is the explicit destructive opt-in.

The version-pending badge in the status row stays gated on
updatePending — that signal is meaningful UI; the button gate isn't.
node-pty's "posix_spawnp failed" message swallows errno entirely,
leaving callers with zero context for diagnosis. Pre-flight the
most-common failure mode (missing cwd) and annotate any other native
failure with the cwd + shell that triggered it.

Common cause: a workspace's worktreePath gets deleted (manual rm,
worktree pruned) but the workspace record stays in the local DB.
Opening a terminal in it sends the missing cwd to the daemon. With
this change, the renderer's terminal-error toast says "spawn: cwd
does not exist: /path/to/dead-worktree" instead of the generic native
error.
Phase 2 handoff worked end-to-end against source-direct integration
tests and Bun-built bundles, but failed in the deployed Electron build.
Root cause: both Bun's bundler AND electron-vite's esbuild statically
inline `process.env.<KEY>` references at build time, constant-fold the
unset-at-build-time comparison `undefined === "1"`, then dead-code-
eliminate the entire receiver branch — including the runHandoffReceiver
function. Bracket notation `process.env["KEY"]` doesn't help; both
bundlers see through it.

Manually verified end-to-end in a real dev build today: spawned 3
sessions, clicked Update three times in succession, all sessions
preserved (same shell PIDs, no flicker beyond the WS reconnect).

Changes:
- Mode signal moves from `SUPERSET_PTY_DAEMON_HANDOFF=1` env to
  `--handoff` argv flag. Snapshot path + socket path also move to argv
  (`--snapshot=...` `--socket=...`). argv is fully dynamic and survives
  every bundler we run.
- The desktop daemon entry shim at apps/desktop/src/main/pty-daemon/
  index.ts was an ENTIRELY SEPARATE copy of the package's old main.ts
  that electron-vite bundles independently — so it had none of the
  Phase 2 receiver code. Rewritten to mirror the package's main.ts with
  full handoff support.
- Snapshot helpers (clearSnapshot, readSnapshot, writeSnapshot) and
  related types are now exported from @superset/pty-daemon for the
  desktop shim to consume.
- Diagnostic stderr writes in both predecessor's prepareUpgrade and
  successor's runHandoffReceiver. Kept in production code: cheap, and
  this whole class of bug was invisible without them.

Test gap noted: integration tests ran the .ts source via
`node --experimental-strip-types` (no bundling) or the Bun-built
package output (different DCE behavior than esbuild). Neither covered
apps/desktop/dist/main/pty-daemon.js, where the actual production
failure mode lives. Follow-up: add a CI-runnable smoke that builds the
desktop bundle and greps for `runHandoffReceiver` + `--handoff`
markers, so future bundler regressions fail fast.
Phase 2's fd-handoff broke in the deployed Electron build because
both Bun's bundler and electron-vite's esbuild did dead-code-elimination
on `process.env.X === "Y"` checks at build time, stripping the handoff
receiver branch from `apps/desktop/dist/main/pty-daemon.js`. The bug
was invisible to existing tests (they ran source-direct or the Bun
bundle, never the desktop-shim bundle).

Adds a post-`compile:app` smoke that greps the bundle for runtime
markers Phase 2 depends on:
- `runHandoffReceiver` — the successor entry function
- `--handoff` (≥2 times) — argv flag at the spawn + receiver sites
- `upgrade-ack` — IPC handshake message type
- `adoptSnapshot` — Server's snapshot replay path
- `adoptFromFd` — Pty wrapping for an inherited fd

If any marker is missing, the build fails fast with a diagnostic
pointing at the runtime conditions that need to survive bundling.
This makes future bundler regressions visible at build time, not
manual-QA time.

Wired into apps/desktop's `compile:app` so it runs in production
builds and CI without an extra step. Standalone via
`bun run check:pty-daemon-bundle` for ad-hoc verification.
Adoption is the right behavior in production — host-service can crash
or restart and pick up its still-running daemon, preserving every PTY
session. In dev (bun dev), it's a footgun: a leftover detached daemon
from the previous bun dev session has the OLD bundle code, so devs
edit pty-daemon source, restart bun dev, and silently still talk to
the stale daemon. Code changes that ride only on the daemon binary
(handoff plumbing, snapshot writes, etc.) appear to do nothing.

NODE_ENV != "production" → kill any existing daemon for the org
(SIGTERM, escalate to SIGKILL after 1s if still alive) before
adopt-or-spawn. Manifest gets removed too. Next bootstrap always
spawns fresh from the latest bundle. Idempotent when no daemon is
running.

Production path is unchanged.
AdoptedPty.resize previously only updated meta cols/rows but left the
kernel-side window size unchanged — visible to users as stale dims
after dragging the terminal pane border on a session that was carried
across a daemon upgrade. node-pty issues TIOCSWINSZ via its native
binding, but adopted sessions don't go through that path.

Workaround without adding a native FFI dependency: spawn `stty` with
the master fd as its stdin. stty(1) issues TIOCSWINSZ on its own stdin
by default, so the master inherits the resize. One spawn per resize is
fine — resize is rare (xterm.js throttles, users only resize via
drag).

Best-effort: if stty isn't available or the syscall fails, meta stays
correct for next reattach; the kernel-side just stays stale until then.
When host-service boots and adopts a daemon whose runningVersion <
EXPECTED_DAEMON_VERSION (i.e. host-service was just upgraded but the
detached daemon is still on the old binary), kick off `supervisor.update()`
in the background. Sessions survive on success — the new daemon adopts
all PTY masters via fd-handoff. On failure, leave the old daemon
running and let the user retry via the Update button. No fallback to
force-restart; we never auto-destroy sessions.

Concurrency:
- New `updateInFlight` map coalesces concurrent `update()` calls per
  org. Both auto-update and a manual Update button click hit the same
  entry point — without this, two simultaneous calls would each try
  to handoff a daemon that's already mid-handoff. Second caller gets
  the cached promise and observes the same outcome.

Telemetry events: `pty_daemon_auto_update_attempt`, `pty_daemon_auto_update_ok`,
`pty_daemon_auto_update_failed { reason }`.

Constructor option `autoUpdate?: boolean` (default true) for tests
that intentionally adopt a stale daemon and don't want a real handoff
racing the assertion.

Tests:
- 3 unit tests for the in-flight cache (coalesce, recycle after settle,
  per-org separation)
- 1 real-spawn integration test: pin a daemon to v0.0.1, fresh
  supervisor adopts, asserts pid swap to a successor of the bundled
  version within 8s, predecessor dead within another 2s.
- Existing real-spawn tests now force NODE_ENV=production in the test
  fixture so the dev-mode-kill behavior added earlier doesn't preempt
  the adoption paths these tests assert.
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 2, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Implements a Phase‑2 PTY‑daemon binary upgrade: snapshot + fd handoff, successor/adopter startup modes, protocol messages, snapshot I/O, PTY fd adoption, supervisor update API/auto‑update, renderer replay suppression, UI update flow, bundle smoke‑check, and end‑to‑end tests. (50 words)

Changes

Daemon upgrade + handoff (single DAG)

Layer / File(s) Summary
Protocol & Data Shapes
packages/pty-daemon/src/protocol/messages.ts, packages/pty-daemon/src/protocol/handoff.ts, packages/pty-daemon/src/SessionStore/snapshot.ts, packages/host-service/src/daemon/manifest.ts
Adds prepare-upgrade / upgrade-prepared, upgrade-ack / upgrade-nak, snapshot format (SNAPSHOT_VERSION, SerializedSession, HandoffSnapshot), snapshot I/O (serialize/read/write/clear), and manifest handoff fields (handoffInProgress, handoffSnapshotPath, handoffSuccessorPid).
PTY API & Adoption
packages/pty-daemon/src/Pty/Pty.ts, packages/pty-daemon/src/Pty/index.ts
Adds Pty.getMasterFd(), AdoptedPty implementation, adoptFromFd() and AdoptOptions; NodePtyAdapter exposes private fd, spawn validates cwd and fd availability, and adopted PTYs wrap inherited master fds.
Server: handoff orchestration
packages/pty-daemon/src/Server/Server.ts, packages/pty-daemon/src/main.ts, apps/desktop/src/main/pty-daemon/index.ts
Adds prepareUpgrade(), finalizeHandoff(), adoptSnapshot(), listenWithRetry(), waitForHandoffAck(), changes close() to close({killSessions?:boolean}), and splits entrypoint into fresh vs --handoff receiver flows that read/clear snapshots and coordinate IPC ack/nak.
Client & Supervisor coordination
packages/host-service/src/terminal/DaemonClient/DaemonClient.ts, packages/host-service/src/daemon/DaemonSupervisor.ts
Adds DaemonClient.prepareUpgrade() and DaemonSupervisor.update(orgId) with per-org in‑flight coalescing, manifest handoff marking, predecessor‑exit gating (waitForPidExit), instance swap and manifest update, autoUpdate option/auto kickoff, and background liveness polling.
Host API surface
packages/host-service/src/trpc/router/terminal/terminal.ts, packages/host-service/src/trpc/router/terminal/terminal.daemon.test.ts
Adds daemon.update protected mutation delegating to supervisor.update and tests verifying success/failure propagation.
Renderer replay control
packages/host-service/src/terminal/terminal.ts, apps/desktop/src/renderer/lib/terminal/terminal-ws-transport.ts, packages/host-service/src/terminal/terminal.adoption.node-test.ts
Adds replayOnAdoption option to terminal session creation (default true), reads ?replay=0 from WS to disable replay, and makes renderer transport append replay=0 after first open to suppress subsequent replay on reconnect; adds E2E test verifying suppression.
UI: Desktop settings & bundle check
apps/desktop/src/renderer/.../V2SessionsSection.tsx, apps/desktop/package.json, apps/desktop/scripts/check-pty-daemon-bundle.ts
Adds “Update daemon” flow invoking terminal.daemon.update, captures soft failure reason to show AlertDialog with “Force update” fallback; adds post-build smoke-check script check-pty-daemon-bundle.ts and new npm script check:pty-daemon-bundle (and updated compile:app to run the check).
Public exports
packages/pty-daemon/src/protocol/index.ts, packages/pty-daemon/src/SessionStore/index.ts, packages/pty-daemon/src/index.ts, packages/pty-daemon/src/Pty/index.ts
Re-exports handoff/protocol and snapshot types/functions and adoptFromFd/AdoptOptions, expanding the module public API.
Tests & Integration
packages/pty-daemon/test/*, packages/pty-daemon/src/*/test.ts, packages/host-service/src/*/test.ts, apps/desktop/*
Adds/updates unit and integration tests covering snapshot serialization, framing, cross‑process handoff integration, PTY fd validation/adoption, supervisor update concurrency and node tests, tRPC delegation, WS/replay behavior, and desktop bundle check invocation.
Version bump / expectations
packages/pty-daemon/package.json, packages/host-service/src/daemon/expected-version.ts
Bumps pty‑daemon package to 0.2.0 and updates EXPECTED_DAEMON_VERSION to "0.2.0".

Sequence Diagrams

sequenceDiagram
    participant UI as Desktop UI
    participant Supervisor as DaemonSupervisor
    participant Client as DaemonClient
    participant Pred as Predecessor Daemon
    participant Succ as Successor Daemon

    UI->>Supervisor: update(orgId)
    activate Supervisor
    Supervisor->>Supervisor: coalesce concurrent calls, mark handoffInProgress
    Supervisor->>Client: connect(socket) / prepareUpgrade()
    activate Client
    Client->>Pred: "prepare-upgrade"
    activate Pred
    Pred->>Pred: serializeSessions & writeSnapshot
    Pred->>Succ: spawn with --handoff + inherited master fds
    Pred->>Succ: pass snapshotPath + socket via IPC/stdio
    Pred->>Pred: wait for upgrade-ack/nak or timeout
    activate Succ
    Succ->>Succ: readSnapshot, adoptSnapshot()
    alt adoption ok
        Succ->>Pred: upgrade-ack(successorPid)
    else adoption fail
        Succ->>Pred: upgrade-nak(reason)
        Succ->>Succ: exit
    end
    deactivate Succ
    Pred->>Pred: finalizeHandoff & exit (on ack)
    Client-->>Supervisor: { ok:true, successorPid } or { ok:false, reason }
    deactivate Client
    Supervisor->>Supervisor: swap instance tracking, start liveness polling
    Supervisor-->>UI: result
    deactivate Supervisor
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Suggested reviewers

  • AviPeltz

Poem

🐰 I hopped through bytes and file,

Passing fds with nimble style,
Snapshots tucked, successors wake,
Sessions hold — no break, no quake,
Handoff done, the rabbits smile.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 11.36% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately summarizes the main objective of the PR: implementing fd-handoff so PTY sessions survive daemon binary upgrades. It is specific, clear, and reflects the primary change.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Description check ✅ Passed The pull request includes a comprehensive description covering summary, how it works, decisions, commits, test plan, and out-of-scope items.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch cake-saguaro

Tip

💬 Introducing [Slack Agent](https://www.coderabbit.ai/agent): Turn conversations into code.

Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.

  • Generate code and open pull requests
  • Plan features and break down work
  • Investigate incidents and troubleshoot customer tickets together
  • Automate recurring tasks and respond to alerts with triggers
  • Summarize progress and report instantly

Built for teams:

  • Shared memory across your entire org—no repeating context
  • Per-thread sandboxes to safely plan and execute work
  • Governance built-in—scoped access, auditability, and budget controls

One agent for your entire SDLC. Right inside Slack.

👉 Get your free trial and get 200 agent minutes per Slack user (a $50 value).


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@greptile-apps
Copy link
Copy Markdown

greptile-apps Bot commented May 2, 2026

Greptile Summary

This PR implements Phase 2 of the pty-daemon migration: PTY sessions survive daemon binary upgrades via fd-handoff — the predecessor writes a snapshot, spawns the successor with inherited PTY master fds over stdio, awaits an IPC ack, then exits cleanly. Two P1 issues need attention before merging:

  • Successor reports the wrong version after handoff. prepareUpgrade() spawns the successor with env: process.env, which carries the predecessor's pinned SUPERSET_PTY_DAEMON_VERSION. The successor reads that env var and advertises the old version in its hello-ack. The version probe in DaemonSupervisor.runUpdate() therefore records the predecessor's version on the successor instance, setting updatePending: true. On every subsequent host-service restart, tryAdopt re-probes, sees the stale version, and kickoffAutoUpdate fires again — creating an infinite handoff loop. Fix: delete SUPERSET_PTY_DAEMON_VERSION from the env passed to the successor so it falls back to readPackageVersion().

  • Version probe races against predecessor teardown. probeDaemonVersion is called immediately after prepareUpgrade() resolves, but the predecessor's finalizeHandoff() is only scheduled via setImmediate — the predecessor is still listening when the probe connects. The probe may return the old version or null rather than the successor's. Adding a short waitForSocket poll before probing would guarantee the connection lands on the successor.

Confidence Score: 3/5

Not safe to merge as-is — the env-var inheritance bug causes infinite auto-update loops after every host-service restart following a handoff.

Two P1 findings: the SUPERSET_PTY_DAEMON_VERSION inheritance creates an infinite auto-update loop, and the version probe timing races against the predecessor. Both stem from the same handoff sequence in Server.ts and DaemonSupervisor.ts. The fd-handoff architecture itself is sound and the broader session-survival goal works correctly.

packages/pty-daemon/src/Server/Server.ts (env forwarding in prepareUpgrade), packages/host-service/src/daemon/DaemonSupervisor.ts (version probe timing in runUpdate)

Important Files Changed

Filename Overview
packages/pty-daemon/src/Server/Server.ts Core handoff sender: spawns successor with env: process.env, propagating the predecessor's pinned SUPERSET_PTY_DAEMON_VERSION to the successor and causing it to advertise the wrong version post-handoff; also leaves snapshot files on disk after failure.
packages/host-service/src/daemon/DaemonSupervisor.ts New update()/runUpdate() orchestration with coalescing, manifest tagging, and liveness handover — well-structured, but version probe races against predecessor socket and will record stale version data on the successor instance.
packages/pty-daemon/src/main.ts Handoff receiver entrypoint: correctly uses argv flag to avoid bundler DCE, handles disconnect-wait with 1s fallback, and clears snapshot on success. Version inherited from env (the P1 source).
packages/pty-daemon/src/Pty/Pty.ts Adds getMasterFd() to NodePtyAdapter (private _fd accessor with version-guard assertion) and new AdoptedPty class with stty-based resize workaround; implementation is sound with appropriate best-effort fallbacks.
packages/pty-daemon/src/SessionStore/snapshot.ts Atomic snapshot write (tmp+rename), typed reader with version guard and shape validator, clearSnapshot helper — clean and correct.
packages/host-service/src/daemon/manifest.ts Adds Phase 2 optional fields to manifest schema with graceful parse-and-drop for unknown values; handoffSnapshotPath and handoffSuccessorPid are declared and parsed but never written during the actual handoff flow.
packages/host-service/src/terminal/DaemonClient/DaemonClient.ts Adds prepareUpgrade() with generous 15s timeout and proper error routing; clean extension of existing request/response pattern.
apps/desktop/scripts/check-pty-daemon-bundle.ts Post-build bundle integrity check that greps for 5 handoff markers; well-targeted regression guard against bundler DCE of the handoff path.
apps/desktop/src/main/pty-daemon/index.ts Desktop shim mirrors package's main.ts logic for electron-vite build; uses hardcoded "0.1.0" fallback version instead of readPackageVersion(), minor divergence from the package entrypoint.
packages/host-service/src/trpc/router/terminal/terminal.ts Adds update tRPC mutation that delegates to supervisor.update(); correctly awaits daemon-ready before acting.

Sequence Diagram

sequenceDiagram
    participant SUP as DaemonSupervisor
    participant A as Daemon A (predecessor)
    participant B as Daemon B (successor)
    participant FS as Filesystem

    SUP->>A: connect + send prepare-upgrade
    A->>FS: writeSnapshot(sessions, fdIndexes)
    A->>B: spawn(--handoff, --snapshot, --socket, stdio=[ipc, fd4..fdN])
    B->>FS: readSnapshot()
    B->>B: adoptFromFd() per session
    B-->>A: IPC upgrade-ack {successorPid}
    A-->>SUP: upgrade-prepared {ok:true, successorPid}
    Note over A: setImmediate → finalizeHandoff()
    SUP->>A: probeDaemonVersion(socketPath) ⚠️ may still hit A
    A->>A: server.close() + unlink socket
    A->>A: process.exit(0)
    Note over B: IPC disconnect received
    B->>B: server.listenWithRetry()
    B->>FS: clearSnapshot()
    SUP->>B: startAdoptedLivenessCheck(successorPid)
    Note over B: Sessions live, same shell PIDs
Loading
Prompt To Fix All With AI
Fix the following 4 code review issues. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 4
packages/pty-daemon/src/Server/Server.ts:189-203
**Successor inherits predecessor's `SUPERSET_PTY_DAEMON_VERSION`, causing infinite auto-update loop**

`prepareUpgrade()` spawns the successor with `env: process.env`, which includes the predecessor's pinned `SUPERSET_PTY_DAEMON_VERSION`. The successor reads this env var in `runHandoffReceiver()` and reports the old version in its `hello-ack`. As a result, the version probe in `DaemonSupervisor.runUpdate()` returns the predecessor's version for the successor, setting `successorInstance.updatePending = true`. On the next host-service restart, `tryAdopt` probes the socket, sees the still-stale version, and `kickoffAutoUpdate` fires again — this repeats indefinitely across restarts.

The fix is to strip or override the version pin when spawning the successor so it falls back to `readPackageVersion()` on the new binary:

```
const successorEnv = { ...process.env };
delete successorEnv["SUPERSET_PTY_DAEMON_VERSION"];

const child = childProcess.spawn(
    process.execPath,
    [...],
    {
        stdio,
        env: successorEnv,
        detached: false,
    },
);
```

### Issue 2 of 4
packages/host-service/src/daemon/DaemonSupervisor.ts:274-278
**Version probe may connect to predecessor, not successor, recording stale version**

`probeDaemonVersion(instance.socketPath, ...)` is called immediately after `prepareUpgrade()` returns. At that instant the predecessor is still alive — its `finalizeHandoff()` is only scheduled via `setImmediate`, meaning the predecessor's server is still listening and the successor hasn't bound yet. If the probe connects to the (still-alive) predecessor, it returns the old version, setting `successorInstance.updatePending = true` erroneously. Combined with the env-var inheritance bug in `Server.ts`, this makes the stale-version state sticky across every probe. A short `waitForSocket` poll before probing would guarantee the connection lands on the successor.

### Issue 3 of 4
packages/host-service/src/daemon/manifest.ts:26-31
**`handoffSnapshotPath` and `handoffSuccessorPid` are declared but never written**

`PtyDaemonManifest` defines `handoffSnapshotPath` and `handoffSuccessorPid`, and `readPtyDaemonManifest` parses them, but `DaemonSupervisor.runUpdate()` only ever writes `handoffInProgress: true` — neither field is populated during the actual handoff. A host-service crash mid-handoff therefore gets no extra recovery hints from the manifest beyond the boolean flag. These fields are either dead code from the design phase or an incomplete implementation.

### Issue 4 of 4
packages/pty-daemon/src/Server/Server.ts:212-222
**Snapshot file not deleted after a failed handoff**

When `waitForHandoffAck` returns `{ok: false}`, the code kills the child but explicitly leaves `snapshotPath` in place ("for diagnostics"). There is no cleanup on any subsequent upgrade attempt, and no TTL. Repeated handoff failures will accumulate `pty-daemon-handoff-*.json` files in `/tmp`. A best-effort `fs.rm` on the old snapshot at the top of `prepareUpgrade()` (or at the start of `runUpdate()`) would cap the footprint.

Reviews (1): Last reviewed commit: "feat(host-service): auto-update daemon o..." | Re-trigger Greptile

Comment on lines +189 to +203
const child = childProcess.spawn(
process.execPath,
[
...process.execArgv,
scriptPath,
"--handoff",
`--snapshot=${snapshotPath}`,
`--socket=${this.opts.socketPath}`,
],
{
stdio,
env: process.env,
detached: false,
},
);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Successor inherits predecessor's SUPERSET_PTY_DAEMON_VERSION, causing infinite auto-update loop

prepareUpgrade() spawns the successor with env: process.env, which includes the predecessor's pinned SUPERSET_PTY_DAEMON_VERSION. The successor reads this env var in runHandoffReceiver() and reports the old version in its hello-ack. As a result, the version probe in DaemonSupervisor.runUpdate() returns the predecessor's version for the successor, setting successorInstance.updatePending = true. On the next host-service restart, tryAdopt probes the socket, sees the still-stale version, and kickoffAutoUpdate fires again — this repeats indefinitely across restarts.

The fix is to strip or override the version pin when spawning the successor so it falls back to readPackageVersion() on the new binary:

const successorEnv = { ...process.env };
delete successorEnv["SUPERSET_PTY_DAEMON_VERSION"];

const child = childProcess.spawn(
    process.execPath,
    [...],
    {
        stdio,
        env: successorEnv,
        detached: false,
    },
);
Prompt To Fix With AI
This is a comment left during a code review.
Path: packages/pty-daemon/src/Server/Server.ts
Line: 189-203

Comment:
**Successor inherits predecessor's `SUPERSET_PTY_DAEMON_VERSION`, causing infinite auto-update loop**

`prepareUpgrade()` spawns the successor with `env: process.env`, which includes the predecessor's pinned `SUPERSET_PTY_DAEMON_VERSION`. The successor reads this env var in `runHandoffReceiver()` and reports the old version in its `hello-ack`. As a result, the version probe in `DaemonSupervisor.runUpdate()` returns the predecessor's version for the successor, setting `successorInstance.updatePending = true`. On the next host-service restart, `tryAdopt` probes the socket, sees the still-stale version, and `kickoffAutoUpdate` fires again — this repeats indefinitely across restarts.

The fix is to strip or override the version pin when spawning the successor so it falls back to `readPackageVersion()` on the new binary:

```
const successorEnv = { ...process.env };
delete successorEnv["SUPERSET_PTY_DAEMON_VERSION"];

const child = childProcess.spawn(
    process.execPath,
    [...],
    {
        stdio,
        env: successorEnv,
        detached: false,
    },
);
```

How can I resolve this? If you propose a fix, please make it concise.

Comment on lines +274 to +278
const probedVersion = await probeDaemonVersion(
instance.socketPath,
VERSION_PROBE_TIMEOUT_MS,
);
const runningVersion = probedVersion ?? "unknown";
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Version probe may connect to predecessor, not successor, recording stale version

probeDaemonVersion(instance.socketPath, ...) is called immediately after prepareUpgrade() returns. At that instant the predecessor is still alive — its finalizeHandoff() is only scheduled via setImmediate, meaning the predecessor's server is still listening and the successor hasn't bound yet. If the probe connects to the (still-alive) predecessor, it returns the old version, setting successorInstance.updatePending = true erroneously. Combined with the env-var inheritance bug in Server.ts, this makes the stale-version state sticky across every probe. A short waitForSocket poll before probing would guarantee the connection lands on the successor.

Prompt To Fix With AI
This is a comment left during a code review.
Path: packages/host-service/src/daemon/DaemonSupervisor.ts
Line: 274-278

Comment:
**Version probe may connect to predecessor, not successor, recording stale version**

`probeDaemonVersion(instance.socketPath, ...)` is called immediately after `prepareUpgrade()` returns. At that instant the predecessor is still alive — its `finalizeHandoff()` is only scheduled via `setImmediate`, meaning the predecessor's server is still listening and the successor hasn't bound yet. If the probe connects to the (still-alive) predecessor, it returns the old version, setting `successorInstance.updatePending = true` erroneously. Combined with the env-var inheritance bug in `Server.ts`, this makes the stale-version state sticky across every probe. A short `waitForSocket` poll before probing would guarantee the connection lands on the successor.

How can I resolve this? If you propose a fix, please make it concise.

Comment on lines +26 to 31
handoffInProgress?: boolean;
/** Path of the on-disk handoff snapshot the successor will read. */
handoffSnapshotPath?: string;
/** PID of the spawned successor; pre-bind, supervisor uses this to track. */
handoffSuccessorPid?: number;
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 handoffSnapshotPath and handoffSuccessorPid are declared but never written

PtyDaemonManifest defines handoffSnapshotPath and handoffSuccessorPid, and readPtyDaemonManifest parses them, but DaemonSupervisor.runUpdate() only ever writes handoffInProgress: true — neither field is populated during the actual handoff. A host-service crash mid-handoff therefore gets no extra recovery hints from the manifest beyond the boolean flag. These fields are either dead code from the design phase or an incomplete implementation.

Prompt To Fix With AI
This is a comment left during a code review.
Path: packages/host-service/src/daemon/manifest.ts
Line: 26-31

Comment:
**`handoffSnapshotPath` and `handoffSuccessorPid` are declared but never written**

`PtyDaemonManifest` defines `handoffSnapshotPath` and `handoffSuccessorPid`, and `readPtyDaemonManifest` parses them, but `DaemonSupervisor.runUpdate()` only ever writes `handoffInProgress: true` — neither field is populated during the actual handoff. A host-service crash mid-handoff therefore gets no extra recovery hints from the manifest beyond the boolean flag. These fields are either dead code from the design phase or an incomplete implementation.

How can I resolve this? If you propose a fix, please make it concise.

Comment on lines +212 to +222
const result = await waitForHandoffAck(child);
if (!result.ok) {
// Successor never acked. Leave snapshot for diagnostics; the
// supervisor will clean stale handoff state on next ensure().
try {
child.kill("SIGKILL");
} catch {
// already dead, ignore
// already gone
}
return result;
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Snapshot file not deleted after a failed handoff

When waitForHandoffAck returns {ok: false}, the code kills the child but explicitly leaves snapshotPath in place ("for diagnostics"). There is no cleanup on any subsequent upgrade attempt, and no TTL. Repeated handoff failures will accumulate pty-daemon-handoff-*.json files in /tmp. A best-effort fs.rm on the old snapshot at the top of prepareUpgrade() (or at the start of runUpdate()) would cap the footprint.

Prompt To Fix With AI
This is a comment left during a code review.
Path: packages/pty-daemon/src/Server/Server.ts
Line: 212-222

Comment:
**Snapshot file not deleted after a failed handoff**

When `waitForHandoffAck` returns `{ok: false}`, the code kills the child but explicitly leaves `snapshotPath` in place ("for diagnostics"). There is no cleanup on any subsequent upgrade attempt, and no TTL. Repeated handoff failures will accumulate `pty-daemon-handoff-*.json` files in `/tmp`. A best-effort `fs.rm` on the old snapshot at the top of `prepareUpgrade()` (or at the start of `runUpdate()`) would cap the footprint.

How can I resolve this? If you propose a fix, please make it concise.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 2, 2026

🧹 Preview Cleanup Complete

The following preview resources have been cleaned up:

  • ✅ Neon database branch

Thank you for your contribution! 🎉

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 9

🧹 Nitpick comments (1)
packages/host-service/src/trpc/router/terminal/terminal.daemon.test.ts (1)

119-140: ⚡ Quick win

Block ensure() in this test so it proves the bootstrap ordering.

ensureMock resolves immediately here, so this only checks org-id wiring. If terminal.daemon.update() stopped awaiting bootstrap, the test would still pass. Holding ensureMock unresolved until after asserting updateMock has not run would cover the intended contract.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/host-service/src/trpc/router/terminal/terminal.daemon.test.ts`
around lines 119 - 140, Replace the immediate-resolving ensureMock with a
deferred (pending) promise so the test proves bootstrap ordering: create a
deferred promise and set ensureMock to return that promise, start the call to
terminal.daemon.update() without awaiting it (or call and keep its promise),
assert updateMock has not been called yet, then resolve the deferred promise so
ensure() completes, await the update call result, and finally assert updateMock
was called with "00000000-0000-4000-8000-000000000000" and the returned value
matches; reference ensureMock, updateMock, getSupervisor, makeCaller, and
terminal.daemon.update in the change.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@apps/desktop/src/main/pty-daemon/index.ts`:
- Around line 50-58: The current parsing for --buffer-bytes (the branch where
arg.startsWith("--buffer-bytes=")) uses Number.parseInt(raw) which accepts
prefixes like "64kb" or "1e6"; update the validation to require the entire token
to be an integer string (e.g., validate raw against /^\d+$/) before converting,
then parse (Number.parseInt or Number) and keep the existing checks
(Number.isFinite(parsed) && parsed > 0) and assignment to args.bufferBytes to
reject partially numeric values.
- Around line 153-157: The call to clearSnapshot(snapshotPath) after await
server.listenWithRetry() can throw non-ENOENT filesystem errors and must not
cause the daemon to exit; wrap the clearSnapshot(...) invocation in a try/catch
inside main (the block following server.listenWithRetry()), catch any error that
is not an ENOENT, log it as a warning (use the same logger used for log(...) or
processLogger), and swallow the error so startup continues; retain existing
behavior for ENOENT but ensure main().catch() no longer receives cleanup errors
from clearSnapshot.

In `@packages/host-service/src/daemon/DaemonSupervisor.ts`:
- Around line 129-141: The current Map updateInFlight only deduplicates update()
calls; restart() and stop() can race with runUpdate() handoff and must be
serialized per orgId. Fix by using the same per-org lock for all lifecycle
mutations: either (A) extend updateInFlight to represent any in-flight lifecycle
operation and have restart(orgId)/stop(orgId) await the existing Promise before
proceeding, or (B) introduce a shared per-org lock (e.g., lifecycleInFlight Map
keyed by orgId) and make update(), restart(), and stop() acquire that lock
(await existing entry or set a new Promise) so only one mutation runs at a time;
ensure runUpdate() clears the lock on completion or error and that
restart()/stop() either await or cancel the in-flight update before sending
signals (refer to updateInFlight, update(), restart(), stop(), and runUpdate()).

In `@packages/pty-daemon/src/main.ts`:
- Around line 119-137: The snapshot read is happening outside the handoff
try/catch so read failures never send an "upgrade-nak"; wrap or move the call to
readSnapshot(snapshotPath) into the same error-handling flow that calls
server.adoptSnapshot(snapshot) (or add a surrounding try/catch) so any error
from readSnapshot or adoptSnapshot is caught, logged via log(...), and results
in sending the HandoffMessage { type: "upgrade-nak", reason: `adopt failed:
${err.message}` } followed by the existing delayed process.exit; reference
readSnapshot, Server, adoptSnapshot and the existing NAK creation/processing
block to implement this change.

In `@packages/pty-daemon/src/Pty/Pty.ts`:
- Around line 46-59: getMasterFd() currently throws unconditionally if the
private _fd is missing which will break every spawn() call when node-pty
changes; change the implementation to provide a safe feature-detection/fallback:
add a non-throwing probe (e.g. hasMasterFd() or have getMasterFd() return number
| undefined) that checks (this.term as any)._fd for a valid integer and returns
undefined instead of throwing, then update spawn() to call that probe and only
attempt fd-handoff when a valid fd is present (or fall back to the existing
non-handoff code path), and ensure all callers of getMasterFd()/hasMasterFd()
handle the undefined case gracefully rather than assuming _fd exists.
- Around line 177-178: The adopted shell can exit before onExit is registered
and writes can target destroyed streams; add an exitFired boolean and an
exitResult (e.g., { code, signal }) on the Pty instance, set them when the
underlying reader/writer emit 'close'/'exit' (where reader/writer are created in
the constructor), update onExit to immediately invoke the subscriber with
exitResult if exitFired is already true (replaying the notification), and guard
the write(...) method to no-op (or fail fast) when exitFired is true so no
writes go to destroyed streams. Ensure the state is set exactly once when the
streams close and that all places that register exit listeners (onExit) check
exitFired and use the cached exitResult.

In `@packages/pty-daemon/src/Server/Server.ts`:
- Around line 83-98: listenWithRetry is retrying by calling listen(), but
listen() always unlinks this.opts.socketPath which can remove a live
predecessor's socket during a handoff; change listenWithRetry to call a new
bind-only path that skips stale-socket cleanup on retries. Add a new method
(e.g., bindOnly or listenBind) alongside listen that performs the bind without
unlinking this.opts.socketPath or calling unlinkSync, and have listenWithRetry
call that bind-only method inside the retry loop while preserving the original
listen() behavior for the non-retry path; ensure errors other than EADDRINUSE
are rethrown and lastErr is propagated as before.

In `@packages/pty-daemon/src/SessionStore/snapshot.ts`:
- Around line 106-126: The isHandoffSnapshot type guard is too permissive:
tighten it to fully validate each session's SessionMeta shape and ensure numeric
fields are sane before accepting a snapshot. Update isHandoffSnapshot to check
session.id is a non-empty string, session.pid and session.fdIndex are safe
integers within expected ranges (e.g., Number.isSafeInteger and pid > 0, fdIndex
>= 0 and within a reasonable max), and explicitly validate required keys and
types on session.meta to match the SessionMeta interface (check all required
properties and their types). This prevents readSnapshot() from returning
corrupted data that later causes adoptSnapshot() to fail; reference the
isHandoffSnapshot function, HandoffSnapshot type, SessionMeta shape, and the
downstream readSnapshot/adoptSnapshot flow when making the changes.

In `@packages/pty-daemon/test/handoff.test.ts`:
- Around line 132-137: After sending SIGTERM to the successor
(process.kill(successorPid, "SIGTERM")) in the after() teardown, wait for the
PID to actually disappear before finishing the test: implement an async wait
loop that repeatedly checks process existence (e.g., using
process.kill(successorPid, 0) inside try/catch) with a short delay, a total
timeout, and a fallback (send SIGKILL if the process still exists after the
timeout); make the after() hook async and await this wait routine so inherited
fds/sockets are not leaked into later tests.

---

Nitpick comments:
In `@packages/host-service/src/trpc/router/terminal/terminal.daemon.test.ts`:
- Around line 119-140: Replace the immediate-resolving ensureMock with a
deferred (pending) promise so the test proves bootstrap ordering: create a
deferred promise and set ensureMock to return that promise, start the call to
terminal.daemon.update() without awaiting it (or call and keep its promise),
assert updateMock has not been called yet, then resolve the deferred promise so
ensure() completes, await the update call result, and finally assert updateMock
was called with "00000000-0000-4000-8000-000000000000" and the returned value
matches; reference ensureMock, updateMock, getSupervisor, makeCaller, and
terminal.daemon.update in the change.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: d6670384-f3ec-408c-aebe-bfab9db8596e

📥 Commits

Reviewing files that changed from the base of the PR and between 58a3f7e and a1eeb2c.

📒 Files selected for processing (28)
  • apps/desktop/package.json
  • apps/desktop/scripts/check-pty-daemon-bundle.ts
  • apps/desktop/src/main/pty-daemon/index.ts
  • apps/desktop/src/renderer/routes/_authenticated/settings/terminal/components/TerminalSettings/components/V2SessionsSection/V2SessionsSection.tsx
  • packages/host-service/src/daemon/DaemonSupervisor.node-test.ts
  • packages/host-service/src/daemon/DaemonSupervisor.test.ts
  • packages/host-service/src/daemon/DaemonSupervisor.ts
  • packages/host-service/src/daemon/manifest.test.ts
  • packages/host-service/src/daemon/manifest.ts
  • packages/host-service/src/terminal/DaemonClient/DaemonClient.ts
  • packages/host-service/src/trpc/router/terminal/terminal.daemon.test.ts
  • packages/host-service/src/trpc/router/terminal/terminal.ts
  • packages/pty-daemon/src/Pty/Pty.ts
  • packages/pty-daemon/src/Pty/index.ts
  • packages/pty-daemon/src/Server/Server.ts
  • packages/pty-daemon/src/SessionStore/SessionStore.test.ts
  • packages/pty-daemon/src/SessionStore/index.ts
  • packages/pty-daemon/src/SessionStore/snapshot.test.ts
  • packages/pty-daemon/src/SessionStore/snapshot.ts
  • packages/pty-daemon/src/handlers/handlers.test.ts
  • packages/pty-daemon/src/index.ts
  • packages/pty-daemon/src/main.ts
  • packages/pty-daemon/src/protocol/handoff.test.ts
  • packages/pty-daemon/src/protocol/handoff.ts
  • packages/pty-daemon/src/protocol/index.ts
  • packages/pty-daemon/src/protocol/messages.ts
  • packages/pty-daemon/test/handoff.test.ts
  • packages/pty-daemon/test/integration.test.ts

Comment on lines +50 to +58
} else if (arg.startsWith("--buffer-bytes=")) {
const raw = arg.slice("--buffer-bytes=".length);
const parsed = Number.parseInt(raw, 10);
if (!Number.isFinite(parsed) || parsed <= 0) {
throw new Error(
`--buffer-bytes must be a positive integer, got: ${raw}`,
);
}
args.bufferBytes = parsed;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Reject partially numeric --buffer-bytes values.

Number.parseInt() will accept prefixes here, so inputs like --buffer-bytes=64kb or --buffer-bytes=1e6 are treated as valid and silently configure the wrong buffer size. Validate the whole token before converting it.

Suggested fix
 		} else if (arg.startsWith("--buffer-bytes=")) {
 			const raw = arg.slice("--buffer-bytes=".length);
-			const parsed = Number.parseInt(raw, 10);
-			if (!Number.isFinite(parsed) || parsed <= 0) {
+			if (!/^[1-9]\d*$/.test(raw)) {
 				throw new Error(
 					`--buffer-bytes must be a positive integer, got: ${raw}`,
 				);
 			}
+			const parsed = Number(raw);
 			args.bufferBytes = parsed;
 		}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/desktop/src/main/pty-daemon/index.ts` around lines 50 - 58, The current
parsing for --buffer-bytes (the branch where arg.startsWith("--buffer-bytes="))
uses Number.parseInt(raw) which accepts prefixes like "64kb" or "1e6"; update
the validation to require the entire token to be an integer string (e.g.,
validate raw against /^\d+$/) before converting, then parse (Number.parseInt or
Number) and keep the existing checks (Number.isFinite(parsed) && parsed > 0) and
assignment to args.bufferBytes to reject partially numeric values.

Comment on lines +153 to +157
await server.listenWithRetry();
log(`bound and listening`);

clearSnapshot(snapshotPath);
wireShutdown(server);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Inspect clearSnapshot/readSnapshot implementations and callers.
fd 'snapshot\.(ts|js)$' packages apps | xargs -r rg -n -C3 '\b(clearSnapshot|readSnapshot)\b'

Repository: superset-sh/superset

Length of output: 1130


🏁 Script executed:

# Get full clearSnapshot implementation and surrounding context
cat -n packages/pty-daemon/src/SessionStore/snapshot.ts | sed -n '98,120p'

Repository: superset-sh/superset

Length of output: 1005


🏁 Script executed:

# Check if the file in the review exists and view lines 153-157
if [ -f apps/desktop/src/main/pty-daemon/index.ts ]; then
  cat -n apps/desktop/src/main/pty-daemon/index.ts | sed -n '150,160p'
else
  echo "File not found at apps/desktop/src/main/pty-daemon/index.ts"
  # Search for similar path patterns
  fd 'index\.ts$' apps/desktop | head -20
fi

Repository: superset-sh/superset

Length of output: 362


🏁 Script executed:

# Also search for all usages of clearSnapshot to understand call patterns
rg -n 'clearSnapshot' apps desktop --type ts --type tsx

Repository: superset-sh/superset

Length of output: 92


🏁 Script executed:

# View the main function to understand error handling context
cat -n apps/desktop/src/main/pty-daemon/index.ts | sed -n '1,50p'

Repository: superset-sh/superset

Length of output: 2393


🏁 Script executed:

# Look for main function definition and catch handler
rg -n 'main|catch' apps/desktop/src/main/pty-daemon/index.ts | head -20

Repository: superset-sh/superset

Length of output: 413


🏁 Script executed:

# View the broader context of the main function to see try/catch structure
cat -n apps/desktop/src/main/pty-daemon/index.ts | sed -n '67,170p'

Repository: superset-sh/superset

Length of output: 3995


🏁 Script executed:

# View the top-level catch handler
cat -n apps/desktop/src/main/pty-daemon/index.ts | sed -n '175,185p'

Repository: superset-sh/superset

Length of output: 410


Wrap post-bind snapshot cleanup to prevent fatal exit on cleanup errors.

The clearSnapshot() call at line 156 is not wrapped in error handling. While clearSnapshot() has internal try/catch logic for ENOENT, it will still throw on other filesystem errors (permissions, I/O failures). Since the socket is already successfully bound and listening (line 153) and sessions have been adopted (line 122), a cleanup failure here causes the daemon to exit fatally via main().catch() despite a successful handoff. Wrap this call in try/catch, log any non-ENOENT errors as warnings, and continue startup.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/desktop/src/main/pty-daemon/index.ts` around lines 153 - 157, The call
to clearSnapshot(snapshotPath) after await server.listenWithRetry() can throw
non-ENOENT filesystem errors and must not cause the daemon to exit; wrap the
clearSnapshot(...) invocation in a try/catch inside main (the block following
server.listenWithRetry()), catch any error that is not an ENOENT, log it as a
warning (use the same logger used for log(...) or processLogger), and swallow
the error so startup continues; retain existing behavior for ENOENT but ensure
main().catch() no longer receives cleanup errors from clearSnapshot.

Comment on lines +129 to +141
/**
* In-flight `update()` promises per orgId. Both auto-update (on adopt
* with version drift) and manual update via the renderer hit the same
* supervisor.update() entry point — without this guard, two concurrent
* calls would both try to handoff a daemon that's already mid-handoff.
* The second caller returns the cached promise and gets the same result.
*/
private readonly updateInFlight = new Map<
string,
Promise<
{ ok: true; successorPid: number } | { ok: false; reason: string }
>
>();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Serialize restart() / stop() with handoff too.

updateInFlight only dedupes update() callers. It does not stop a concurrent restart() or stop() from killing the predecessor while runUpdate() is mid-handoff, which can turn the “preserve sessions” path into a destructive race or leave supervisor state half-swapped. Please put all per-org lifecycle mutations behind the same lock, or have restart() / stop() await or cancel the in-flight update before sending signals.

Also applies to: 199-215

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/host-service/src/daemon/DaemonSupervisor.ts` around lines 129 - 141,
The current Map updateInFlight only deduplicates update() calls; restart() and
stop() can race with runUpdate() handoff and must be serialized per orgId. Fix
by using the same per-org lock for all lifecycle mutations: either (A) extend
updateInFlight to represent any in-flight lifecycle operation and have
restart(orgId)/stop(orgId) await the existing Promise before proceeding, or (B)
introduce a shared per-org lock (e.g., lifecycleInFlight Map keyed by orgId) and
make update(), restart(), and stop() acquire that lock (await existing entry or
set a new Promise) so only one mutation runs at a time; ensure runUpdate()
clears the lock on completion or error and that restart()/stop() either await or
cancel the in-flight update before sending signals (refer to updateInFlight,
update(), restart(), stop(), and runUpdate()).

Comment thread packages/pty-daemon/src/main.ts Outdated
Comment on lines +46 to +59
getMasterFd(): number {
// node-pty 1.1.x exposes the master fd as the private property `_fd`.
// Pinned to "1.1.0" in package.json so a future bump can't break this
// silently — assert here so a missing/changed field surfaces at the
// first spawn, not when the user clicks "Update" months later.
const fd = (this.term as unknown as { _fd?: unknown })._fd;
if (typeof fd !== "number" || !Number.isInteger(fd) || fd < 0) {
throw new Error(
`node-pty master fd unavailable (got ${typeof fd}: ${fd}). ` +
`Phase 2 fd-handoff depends on node-pty's private _fd property — ` +
`pin node-pty to 1.1.x or update Pty.ts to match the new shape.`,
);
}
return fd;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

version="$(
python - <<'PY'
import json, os, subprocess, sys

paths = ["package.json"]
try:
    out = subprocess.check_output(["fd", "-a", "^package\\.json$"], text=True)
    paths.extend([p for p in out.splitlines() if p])
except Exception:
    pass

seen = set()
for path in paths:
    if path in seen or not os.path.exists(path):
        continue
    seen.add(path)
    try:
        with open(path) as f:
            pkg = json.load(f)
    except Exception:
        continue
    deps = {}
    for key in ("dependencies", "devDependencies", "peerDependencies", "optionalDependencies"):
        deps.update(pkg.get(key, {}))
    if "node-pty" in deps:
        print(deps["node-pty"].lstrip("^~"))
        sys.exit(0)

raise SystemExit("node-pty dependency not found")
PY
)"

tmp="$(mktemp -d)"
trap 'rm -rf "$tmp"' EXIT

curl -Ls "https://registry.npmjs.org/node-pty/-/node-pty-${version}.tgz" | tar -xz -C "$tmp"

echo "node-pty version: $version"
echo
echo "Occurrences of _fd in the published package:"
rg -n "_fd" "$tmp/package" || true

echo
echo "Platform-specific files mentioning Windows/ConPTY/winpty:"
rg -n "Windows|ConPTY|conpty|winpty" "$tmp/package" -g '!**/*.map' || true

Repository: superset-sh/superset

Length of output: 50377


🏁 Script executed:

# Check for getMasterFd usage and spawn context
rg -n "getMasterFd|fd-handoff|Phase 2" packages/pty-daemon/src --type ts

# Check spawn() implementation to see if platform guards exist
rg -n "spawn\(" packages/pty-daemon/src/Pty/Pty.ts -A 3 -B 1 --type ts

# Look for any platform checks in the codebase
rg -n "process\.platform|darwin|win32|linux" packages/pty-daemon/src --type ts

Repository: superset-sh/superset

Length of output: 2619


🏁 Script executed:

# Check the PR context - look for supportedPlatforms or similar
fd -e ts -e tsx -e json packages/pty-daemon | xargs rg -l "platform|darwin|arm64" | head -10

Repository: superset-sh/superset

Length of output: 937


🏁 Script executed:

# Get context around getMasterFd() call at line 141 in spawn()
sed -n '95,155p' packages/pty-daemon/src/Pty/Pty.ts

Repository: superset-sh/superset

Length of output: 2003


🏁 Script executed:

# Check for any feature flags or conditions around fd-handoff
rg -n "handoff|upgrade|PHASE|feature" packages/pty-daemon/src/Pty/Pty.ts packages/pty-daemon/src/Server/Server.ts -B 2 -A 2

Repository: superset-sh/superset

Length of output: 9543


🏁 Script executed:

# Check platform detection and supported platforms
rg -n "process\.platform|darwin|win32|linux" packages/pty-daemon/src --type ts -B 2 -A 2 | head -50

Repository: superset-sh/superset

Length of output: 46


The unconditional getMasterFd() assert can fail if node-pty removes or changes the _fd property.

The private _fd field is currently available across all node-pty 1.1.0 implementations (Unix, Windows ConPTY, and winpty), so the platform concern in the original comment was incorrect. However, spawn() calls getMasterFd() unconditionally at line 141 with no fallback. Since the PR pins node-pty to "1.1.0" specifically to work around this private-API dependency, a future minor version bump that changes _fd's shape or removes it would break every normal session open, not just handoff scenarios. Consider adding an optional fallback path or explicit feature detection.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/pty-daemon/src/Pty/Pty.ts` around lines 46 - 59, getMasterFd()
currently throws unconditionally if the private _fd is missing which will break
every spawn() call when node-pty changes; change the implementation to provide a
safe feature-detection/fallback: add a non-throwing probe (e.g. hasMasterFd() or
have getMasterFd() return number | undefined) that checks (this.term as any)._fd
for a valid integer and returns undefined instead of throwing, then update
spawn() to call that probe and only attempt fd-handoff when a valid fd is
present (or fall back to the existing non-handoff code path), and ensure all
callers of getMasterFd()/hasMasterFd() handle the undefined case gracefully
rather than assuming _fd exists.

Comment on lines +177 to +178
this.reader = fs.createReadStream("", { fd, autoClose: false });
this.writer = fs.createWriteStream("", { fd, autoClose: false });
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

find . -type f -name "Pty.ts" | head -20

Repository: superset-sh/superset

Length of output: 101


🏁 Script executed:

wc -l ./packages/pty-daemon/src/Pty/Pty.ts

Repository: superset-sh/superset

Length of output: 105


🏁 Script executed:

cat -n ./packages/pty-daemon/src/Pty/Pty.ts

Repository: superset-sh/superset

Length of output: 10950


Cache adopted-session exit state to replay notifications to late subscribers and prevent post-exit writes.

If the adopted shell exits before the server registers onExit, the notification is dropped permanently. In that same window, write() can still target destroyed streams. Store the exit result and replay it to late-arriving subscribers; guard writes with an exitFired check to make them harmless after termination.

Suggested fix
 class AdoptedPty implements Pty {
 	readonly pid: number;
 	meta: SessionMeta;
 	private readonly fd: number;
 	private readonly reader: fs.ReadStream;
 	private readonly writer: fs.WriteStream;
 	private exitFired = false;
+	private exitInfo: { code: number | null; signal: number | null } | null =
+		null;
 	private livenessTimer: NodeJS.Timeout | null = null;
 	private exitCallbacks: PtyOnExit[] = [];

 	constructor(fd: number, pid: number, meta: SessionMeta) {
@@
 		const onExit = (info: { code: number | null; signal: number | null }) => {
 			if (this.exitFired) return;
 			this.exitFired = true;
+			this.exitInfo = info;
 			if (this.livenessTimer) clearInterval(this.livenessTimer);
@@
 		};
 		this.reader.on("end", () => onExit({ code: null, signal: null }));
 		this.reader.on("error", () => onExit({ code: null, signal: null }));
+		this.writer.on("error", () => onExit({ code: null, signal: null }));
 		this.livenessTimer = setInterval(() => {
 			if (!isPidAlive(this.pid)) onExit({ code: null, signal: null });
 		}, 1000);
@@
 	write(data: Buffer): void {
+		if (this.exitFired) return;
 		this.writer.write(data);
 	}
@@
 	onExit(cb: PtyOnExit): void {
+		if (this.exitInfo) {
+			cb(this.exitInfo);
+			return;
+		}
 		this.exitCallbacks.push(cb);
 	}
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/pty-daemon/src/Pty/Pty.ts` around lines 177 - 178, The adopted shell
can exit before onExit is registered and writes can target destroyed streams;
add an exitFired boolean and an exitResult (e.g., { code, signal }) on the Pty
instance, set them when the underlying reader/writer emit 'close'/'exit' (where
reader/writer are created in the constructor), update onExit to immediately
invoke the subscriber with exitResult if exitFired is already true (replaying
the notification), and guard the write(...) method to no-op (or fail fast) when
exitFired is true so no writes go to destroyed streams. Ensure the state is set
exactly once when the streams close and that all places that register exit
listeners (onExit) check exitFired and use the cached exitResult.

Comment on lines +83 to +98
async listenWithRetry(timeoutMs = 5_000): Promise<void> {
const start = Date.now();
let lastErr: unknown = null;
while (Date.now() - start < timeoutMs) {
try {
session.pty.kill("SIGKILL");
await this.listen();
return;
} catch (err) {
lastErr = err;
const code = (err as NodeJS.ErrnoException).code;
if (code !== "EADDRINUSE") throw err;
await new Promise((r) => setTimeout(r, 50));
}
}
throw lastErr ?? new Error("listenWithRetry timed out");
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Don't retry handoff binds through listen().

listen() always unlinks this.opts.socketPath first. In the handoff path that means a retry can remove the predecessor's still-live socket entry, let the successor bind early, and then have the predecessor's later unlinkSync() delete the successor's new socket path out from under it. The retry loop needs a bind-only path that skips stale-socket cleanup.

One way to split the behaviors
 async listen(): Promise<void> {
 	const dir = path.dirname(this.opts.socketPath);
 	fs.mkdirSync(dir, { recursive: true });
-	// Stale-socket cleanup: remove any prior socket file at this path.
-	try {
-		fs.unlinkSync(this.opts.socketPath);
-	} catch (err) {
-		if ((err as NodeJS.ErrnoException).code !== "ENOENT") throw err;
-	}
-	await new Promise<void>((resolve, reject) => {
-		this.server.once("error", reject);
-		this.server.listen(this.opts.socketPath, () => {
-			this.server.off("error", reject);
-			resolve();
-		});
-	});
+	try {
+		fs.unlinkSync(this.opts.socketPath);
+	} catch (err) {
+		if ((err as NodeJS.ErrnoException).code !== "ENOENT") throw err;
+	}
+	await this.bindSocket();
 	fs.chmodSync(this.opts.socketPath, 0o600);
 }
 
+private async bindSocket(): Promise<void> {
+	await new Promise<void>((resolve, reject) => {
+		this.server.once("error", reject);
+		this.server.listen(this.opts.socketPath, () => {
+			this.server.off("error", reject);
+			resolve();
+		});
+	});
+}
+
 async listenWithRetry(timeoutMs = 5_000): Promise<void> {
 	const start = Date.now();
 	let lastErr: unknown = null;
 	while (Date.now() - start < timeoutMs) {
 		try {
-			await this.listen();
+			await this.bindSocket();
+			fs.chmodSync(this.opts.socketPath, 0o600);
 			return;
 		} catch (err) {
 			lastErr = err;
 			const code = (err as NodeJS.ErrnoException).code;
 			if (code !== "EADDRINUSE") throw err;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/pty-daemon/src/Server/Server.ts` around lines 83 - 98,
listenWithRetry is retrying by calling listen(), but listen() always unlinks
this.opts.socketPath which can remove a live predecessor's socket during a
handoff; change listenWithRetry to call a new bind-only path that skips
stale-socket cleanup on retries. Add a new method (e.g., bindOnly or listenBind)
alongside listen that performs the bind without unlinking this.opts.socketPath
or calling unlinkSync, and have listenWithRetry call that bind-only method
inside the retry loop while preserving the original listen() behavior for the
non-retry path; ensure errors other than EADDRINUSE are rethrown and lastErr is
propagated as before.

Comment thread packages/pty-daemon/src/SessionStore/snapshot.ts Outdated
Comment on lines +132 to +137
// Reap the successor for the after() hook.
try {
process.kill(successorPid, "SIGTERM");
} catch {
// already gone
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Wait for the successor to actually exit during teardown.

This only sends SIGTERM and returns. If the successor exits asynchronously, it can keep the socket or inherited fds around long enough to bleed into later tests in the same worker. Please wait until the PID disappears before leaving the test.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/pty-daemon/test/handoff.test.ts` around lines 132 - 137, After
sending SIGTERM to the successor (process.kill(successorPid, "SIGTERM")) in the
after() teardown, wait for the PID to actually disappear before finishing the
test: implement an async wait loop that repeatedly checks process existence
(e.g., using process.kill(successorPid, 0) inside try/catch) with a short delay,
a total timeout, and a fallback (send SIGKILL if the process still exists after
the timeout); make the after() hook async and await this wait routine so
inherited fds/sockets are not leaked into later tests.

Copy link
Copy Markdown
Contributor

@cubic-dev-ai cubic-dev-ai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

7 issues found across 28 files

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="apps/desktop/scripts/check-pty-daemon-bundle.ts">

<violation number="1" location="apps/desktop/scripts/check-pty-daemon-bundle.ts:64">
P2: Use a proper file-path conversion for `import.meta.url` instead of `.pathname`; the current code can resolve an invalid path on Windows.

(Based on your team's feedback about using cross-platform path utilities instead of string/path parsing shortcuts.) [FEEDBACK_USED]</violation>
</file>

<file name="packages/pty-daemon/src/main.ts">

<violation number="1" location="packages/pty-daemon/src/main.ts:8">
P2: Header comment documents env-var usage (`SUPERSET_PTY_DAEMON_HANDOFF`, `SUPERSET_PTY_DAEMON_SNAPSHOT`, `SUPERSET_PTY_DAEMON_SOCKET`) but the implementation uses argv flags (`--handoff`, `--snapshot=`, `--socket=`). The code itself explains env vars don't survive bundler DCE. Update the comment to match the actual invocation.</violation>
</file>

<file name="packages/pty-daemon/src/Pty/Pty.ts">

<violation number="1" location="packages/pty-daemon/src/Pty/Pty.ts:107">
P2: The `throw` for "not a directory" is inside the try block so it's immediately caught by the sibling catch, producing a confusing wrapped message like `"spawn: cwd not accessible: /path (spawn: cwd is not a directory: /path)"`. Move the directory check after the try/catch or re-throw before entering the catch path.</violation>

<violation number="2" location="packages/pty-daemon/src/Pty/Pty.ts:178">
P1: Missing `'error'` listener on `this.writer`. If a write fails (e.g., shell just exited), the stream emits `'error'` with no handler, crashing the daemon with an uncaught exception. Add an error listener that triggers the `onExit` path (similar to the reader).</violation>
</file>

<file name="packages/pty-daemon/src/Server/Server.ts">

<violation number="1" location="packages/pty-daemon/src/Server/Server.ts:200">
P1: The successor inherits `SUPERSET_PTY_DAEMON_VERSION` from the predecessor's environment, causing it to report the old version. This makes the supervisor think the upgrade didn't take effect (`updatePending` stays true), and on the next host-service restart the auto-update fires again — repeating indefinitely.

Strip or delete this env var before spawning the successor so it falls back to `readPackageVersion()` on the new binary.</violation>

<violation number="2" location="packages/pty-daemon/src/Server/Server.ts:501">
P2: Missing `disconnect` listener in `waitForHandoffAck`: if the child's IPC channel closes without the process exiting (e.g., child calls `process.disconnect()` before sending ack), the function waits the full 5s timeout instead of settling immediately. Consider adding a `disconnect` handler to settle with a failure.</violation>
</file>

<file name="apps/desktop/src/main/pty-daemon/index.ts">

<violation number="1" location="apps/desktop/src/main/pty-daemon/index.ts:116">
P1: `readSnapshot()` is outside the `try/catch` that sends `upgrade-nak`. If the snapshot is missing or corrupt, the error propagates to `main().catch()` without notifying the predecessor, which then burns the full 5s ack timeout instead of getting an immediate failure reason. Move this call inside the try block that sends the nak.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.

Comment thread packages/pty-daemon/src/Pty/Pty.ts
Comment thread packages/pty-daemon/src/Server/Server.ts Outdated
Comment thread apps/desktop/src/main/pty-daemon/index.ts Outdated
Comment thread apps/desktop/scripts/check-pty-daemon-bundle.ts Outdated
Comment thread packages/pty-daemon/src/main.ts
Comment thread packages/pty-daemon/src/Pty/Pty.ts Outdated
reason: `successor exited before ack (code=${code} signal=${signal})`,
});
};
child.on("message", onMessage);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2: Missing disconnect listener in waitForHandoffAck: if the child's IPC channel closes without the process exiting (e.g., child calls process.disconnect() before sending ack), the function waits the full 5s timeout instead of settling immediately. Consider adding a disconnect handler to settle with a failure.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At packages/pty-daemon/src/Server/Server.ts, line 501:

<comment>Missing `disconnect` listener in `waitForHandoffAck`: if the child's IPC channel closes without the process exiting (e.g., child calls `process.disconnect()` before sending ack), the function waits the full 5s timeout instead of settling immediately. Consider adding a `disconnect` handler to settle with a failure.</comment>

<file context>
@@ -253,6 +459,56 @@ export class Server {
+				reason: `successor exited before ack (code=${code} signal=${signal})`,
+			});
+		};
+		child.on("message", onMessage);
+		child.on("exit", onExit);
+		const timer = setTimeout(() => {
</file context>
Suggested change
child.on("message", onMessage);
child.on("message", onMessage);
child.on("disconnect", () => {
settle({ ok: false, reason: "successor IPC channel disconnected before ack" });
});

Net: 212 deletions, 87 insertions across 10 files. No behavior changes;
all 453 unit + 12 daemon-integration + 7 adoption-e2e tests still pass.

What was trimmed:
- Call-site comments that re-explained their helper's docstring
  (waitForPidExit, probeDaemonVersionWithRetry, replayOnAdoption JSDoc).
- Wishful comments that didn't match the code (Server.ts: "log child
  stderr separately" before code that just logs exit code).
- Multi-paragraph history-of-the-test in terminal.adoption.node-test.ts
  — the new assertion stands on its own; the obsolete-approach
  postmortem was clutter.
- "auto-update fires on every adopt with version drift / heavy path"
  said three different ways across describe block, test header, and
  inline.
- The stale "scriptPath points to a missing file" comment in the
  auto-update-fail integration test (the test uses runUpdate override,
  not scriptPath injection — comment lied).

What was extracted:
- seedPredecessor + getInstancePid helpers in DaemonSupervisor.test.ts:
  removed two ~20-line `(sup as unknown as ...).instances.set(...)`
  blocks per test; both auto-update-failure tests now read as 8-10 lines
  of the actual assertion they make.

What was kept:
- Real why-comments: env-strip rationale (1-line), readPackageVersion
  in handoff mode (1-line), writer-error listener (1-line), atomic-
  rename writeSnapshot, snapshot file-format header.
- Critical-contract assertions in the auto-update-failure test.
Three places hand-edited together for every daemon version bump:
  - packages/pty-daemon/package.json#version
  - packages/pty-daemon/src/index.ts#DAEMON_PACKAGE_VERSION
    (inlined into the desktop bundle by electron-vite — package.json
    can't be readFileSync'd at runtime in a bundled binary)
  - packages/host-service/src/daemon/expected-version.ts#EXPECTED_DAEMON_VERSION
    (what host-service compares running daemon against; drift here =
    false updatePending or false "running == expected")

The 0.1.0 → 0.2.0 bump in this PR was nearly the exact mistake this
test catches: hand-edit one, ship without the others, the bundle
silently reports the wrong version. Verified the test fails when any
of the three sources drifts.

Also tightened expected-version.ts's lockstep comment now that the
test replaces the "rely on PR review" hand-wave.
Three hand-edited copies of "what version is the daemon" collapse to
one: pty-daemon/package.json. The other two now derive at compile time
via JSON imports.

  packages/pty-daemon/src/index.ts
    DAEMON_PACKAGE_VERSION = packageJson.version  (was: "0.2.0" literal)

  packages/host-service/src/daemon/expected-version.ts
    EXPECTED_DAEMON_VERSION = ptyDaemonPackageJson.version  (was: literal)

  packages/pty-daemon/src/main.ts
    DAEMON_VERSION = packageJson.version  (replaces readPackageVersion()
    helper — 11 lines + fileURLToPath/fs/path imports gone)

Bundlers (Bun for the standalone, electron-vite for desktop) inline
the JSON at build time, so the runtime cost is identical to the
literals. Verified: all 453 host-service unit + 12 daemon integration
+ 7 adoption-e2e tests still pass; daemon bundle still reports 0.2.0
end-to-end.

Net effect for future bumps: edit pty-daemon/package.json, rebuild,
done. The lockstep test (added in the prior commit) is dropped because
drift is now structurally impossible — the constants don't exist as
independent values to drift from.

Tradeoff acknowledged: every daemon-version bump now forces an
auto-update for users on the older version (no more "ship a quiet
daemon patch"). Discussed and accepted — daemon bumps are expected to
be rare and significant in this codebase, so the simpler model wins.

Required pty-daemon/package.json#exports to add "./package.json":
"./package.json" so host-service can import it (Node's exports field
is deny-by-default for subpaths).
@Kitenite Kitenite merged commit 4ba7750 into main May 2, 2026
15 checks passed
@Kitenite Kitenite deleted the cake-saguaro branch May 2, 2026 21:37
Kitenite added a commit that referenced this pull request May 2, 2026
Phase 1 (PR #3896) and Phase 2 (PR #3971) are both shipped — moved the
planning docs from apps/desktop/plans/ into apps/desktop/plans/done/
per repo convention. Three plans that were untracked in the working
worktree (Phase 1 host-service-migration, implementation-report,
version-detection proposal) are committed at the same time so future
readers can trace the design journey.

Refreshed two long-lived reference docs to match shipped reality:

- packages/host-service/DAEMON_SUPERVISION.md
  * EXPECTED_DAEMON_VERSION is no longer hand-edited — derives from
    pty-daemon/package.json#version at compile time.
  * "Phase 2 deferred — daemon upgrades currently kill sessions"
    section replaced with the Phase-2-shipped walkthrough (handoff
    flow, restoreOnFailure contract, the argv-not-env mode signal,
    bundle-marker check script).
  * "Bumping the daemon version" instructions now describe the
    single-source-of-truth path.

- packages/pty-daemon/README.md
  * Header references both Phase 1 and Phase 2 plans.
  * Layout diagram and test-suite descriptions match what's actually
    in test/ today (byte-fidelity, handoff, signal-recovery, no-encoding-hops).
  * "Out of scope" lists the real outstanding gap (since-byte-N
    cursor) instead of Phase 2.

Excluded **/plans from biome's lint/format scope. plans/ holds docs +
occasional scratch code (the pty-handoff-experiment harness uses
patterns biome flags); production-quality lint expectations don't
apply there.
Kitenite added a commit that referenced this pull request May 2, 2026
Phase 1 (PR #3896) and Phase 2 (PR #3971) are both shipped — moved the
planning docs from apps/desktop/plans/ into apps/desktop/plans/done/
per repo convention. Three plans that were untracked in the working
worktree (Phase 1 host-service-migration, implementation-report,
version-detection proposal) are committed at the same time so future
readers can trace the design journey.

Refreshed two long-lived reference docs to match shipped reality:

- packages/host-service/DAEMON_SUPERVISION.md
  * EXPECTED_DAEMON_VERSION is no longer hand-edited — derives from
    pty-daemon/package.json#version at compile time.
  * "Phase 2 deferred — daemon upgrades currently kill sessions"
    section replaced with the Phase-2-shipped walkthrough (handoff
    flow, restoreOnFailure contract, the argv-not-env mode signal,
    bundle-marker check script).
  * "Bumping the daemon version" instructions now describe the
    single-source-of-truth path.

- packages/pty-daemon/README.md
  * Header references both Phase 1 and Phase 2 plans.
  * Layout diagram and test-suite descriptions match what's actually
    in test/ today (byte-fidelity, handoff, signal-recovery, no-encoding-hops).
  * "Out of scope" lists the real outstanding gap (since-byte-N
    cursor) instead of Phase 2.

Excluded **/plans from biome's lint/format scope. plans/ holds docs +
occasional scratch code (the pty-handoff-experiment harness uses
patterns biome flags); production-quality lint expectations don't
apply there.
Kitenite added a commit that referenced this pull request May 2, 2026
…pty-daemon README (#3851)

* docs(desktop): survey of PTY-survival architectures across host-service upgrades

Exhaustive survey of options for keeping node-pty shells alive when the
host-service binary upgrades. Covers serialize+replay (VS Code style),
SCM_RIGHTS fd-passing (with node-pty and direct forkpty), long-lived
pty-daemon, tmux-as-backend, exec() in-place, OS supervisor, CRIU, and
mosh-style sync. Includes findings from a working PoC of the SCM_RIGHTS
path at ~/workplace/pty-handoff-poc/, comparison table, and a phased
recommendation (visual-replay floor now, daemon + fd-passing later).

No code changes; doc only.

* docs(desktop): pty-daemon implementation plan + Phase 0 reliability harness

Adds a concrete implementation plan for Architecture E (pty-daemon +
fd-handoff for daemon upgrades) selected by the prior survey, plus the
working Phase 0 reliability harness whose results justify the choice
of runtime and primitive.

- 20260429-pty-daemon-implementation.md: scoped Phase 1 (~2 weeks) +
  Phase 2 (~1 week) plan. Locked decisions (Node + node-pty runtime,
  Unix-socket protocol with versioned handshake, ring buffer on
  daemon, fd inheritance for handoff). Package layout, manifest
  design, host-service integration points, telemetry, open decisions.
- 20260428-pty-handoff-survival-architectures.md: revised survey with
  empirical Phase 0 results (Go + creack/pty: 20k handoffs, 100%
  survival, 0 byte loss on macOS arm64) and a Phase 0 follow-up that
  disproves the original C1 conclusion (node-pty + macOS works for
  handoff; the real failure mode is master fd refcount, not the
  spawn-helper). Recommendation now defaults to a Node daemon.
- pty-handoff-experiment/: Go harness (creack/pty + UnixRights) and
  Node test scripts (node-pty + stdio fd inheritance) used to validate
  the primitive on macOS arm64.

* docs(desktop): pty-daemon plan — explicit v1-bloat avoidance

Adds a "Lessons from v1 (do not repeat)" section calling out the
specific responsibilities that bloated v1's daemon to ~1330 LOC and
mapping each to its v2 home (host-service) so they don't creep back in.

Concrete additions:
- Locked-decisions table reinforced: daemon is "stateless from the
  client's perspective"; ~500–700 LOC budget; in-memory ring buffer
  only (no SQLite-backed history).
- Itemized v1-vs-v2 anti-pattern table: HistoryWriter, cold restore,
  killed-session tombstones, workspace auth in daemon, port scanner,
  analytics, EventEmitter subscriber model, request dedup, priority
  semaphore, ANSI parsing, sticky cross-call state, deferred cleanup.
- "Where the v1 daemon's bloat lands in v2" mapping each removed
  concern to the upgrade-frequent layer (host-service) where state
  belongs.
- Patterns from v1 worth keeping called out (per-session snapshot on
  attach, resize bounds, signal abstraction, AbortSignal cancellation,
  simple FIFO concurrency cap).
- References to v1 daemon files (daemon-manager.ts:1009 LOC,
  history-manager.ts:245 LOC, priority-semaphore.ts:76 LOC) so
  reviewers can see what we're leaving behind.

* docs(desktop): listener-fd handoff spike for Phase 2

Validates that a parent process can dup an AF_UNIX listening fd into a
child via stdio array, the child can `accept()` on it via
`server.listen({ fd })`, and after the parent exits the kernel keeps the
in-kernel socket alive via the child's fd refcount.

PASSes on macOS arm64 / Node 24. Forced two architectural revisions to
the Phase 2 plan:

1. Old daemon spawns the successor — supervisor never holds the listener
   fd. Avoids the daemon→supervisor fd transfer problem entirely.
2. Don't call `server.close()` on the handoff path. Node's
   net.Server.close() unlinks the path's directory entry; even though
   the successor holds a dup'd fd, clients see ENOENT on connect.
   `process.exit(0)` skips Node's close handlers, preserves the path.

Spike led to dropping listener inheritance from D1 entirely (PTY fds
are the only thing that genuinely cannot blink); see updated plan.

* docs(desktop): Phase 2 audit + implementation plan

Audit (20260501-...-audit.md): surveys post-Phase-1 elastic-lens for
fd-handoff readiness. Confirms no critical blockers — Phase 0 already
proved master fd inheritance, and the supervisor's `restart()` is a
clean integration point.

Plan (20260501-...-implementation.md): bakes in 6 decisions from a
/decide walkthrough, sketches the daemon-spawns-successor architecture
that the listener-handoff spike forced, and sequences 10 steps across
9 commits.

D1 ("what gets inherited") was revised after the spike + user pushback
to PTY master fds only. host-service is the only client of the daemon
socket and already reconnects after host-service-restart adoptions —
adding listener inheritance just to avoid a ~100ms reconnect blip
wasn't worth the foot-guns.

* docs(daemon): archive shipped plans + refresh post-Phase-2 references

Phase 1 (PR #3896) and Phase 2 (PR #3971) are both shipped — moved the
planning docs from apps/desktop/plans/ into apps/desktop/plans/done/
per repo convention. Three plans that were untracked in the working
worktree (Phase 1 host-service-migration, implementation-report,
version-detection proposal) are committed at the same time so future
readers can trace the design journey.

Refreshed two long-lived reference docs to match shipped reality:

- packages/host-service/DAEMON_SUPERVISION.md
  * EXPECTED_DAEMON_VERSION is no longer hand-edited — derives from
    pty-daemon/package.json#version at compile time.
  * "Phase 2 deferred — daemon upgrades currently kill sessions"
    section replaced with the Phase-2-shipped walkthrough (handoff
    flow, restoreOnFailure contract, the argv-not-env mode signal,
    bundle-marker check script).
  * "Bumping the daemon version" instructions now describe the
    single-source-of-truth path.

- packages/pty-daemon/README.md
  * Header references both Phase 1 and Phase 2 plans.
  * Layout diagram and test-suite descriptions match what's actually
    in test/ today (byte-fidelity, handoff, signal-recovery, no-encoding-hops).
  * "Out of scope" lists the real outstanding gap (since-byte-N
    cursor) instead of Phase 2.

Excluded **/plans from biome's lint/format scope. plans/ holds docs +
occasional scratch code (the pty-handoff-experiment harness uses
patterns biome flags); production-quality lint expectations don't
apply there.
MocA-Love pushed a commit to MocA-Love/superset that referenced this pull request May 8, 2026
Recorded as integrated via -s ours after batch PRs #455-#464.

Taken via individual PRs:
- PR  1 (#455): v2 polish 前半 safe set (9 commits)
- PR  2 (#456): v2/host-service polish 中盤 (12 commits)
- PR  3 (#457): sidebar polish + jwt API (5 commits)
- PR  4 (#458): host-service tRPC retry/cache/timeout (3 commits)
- PR  5 (#459): v2 diff pane / file pane polish (2 commits)
- PR  7 (#462): host-service v2 canonical workspace.create + attachment store (PR1 superset-sh#3893 + PR2 superset-sh#3916)
- PR 11 (#463): agents API + onboarding (7 commits + 1 cleanup)
- PR 12 (#464): v1→v2 import flow rewrite (11 commits + 2 follow-ups)
- PR 13 (#460): host-service shell env probe + typo (2 commits)
- PR 16 (#461): marketplace 19 themes (1 commit)

Skipped / deferred (recorded as integrated for behind=0):
- PR  6: CLI v1 launch (superset-sh#3898 + 30+ CLI/SDK followups) — defer to dedicated migration
- PR  9: v2 PR3 (superset-sh#3940) + revert (superset-sh#4017) — net-zero pair
- PR 10: pty-daemon (superset-sh#3896, superset-sh#3971, superset-sh#4054) — fork keeps its terminal-host
- PR 14: Slack MCP-v2 (superset-sh#4197, superset-sh#4208) — depends on mcp-v2/sdk divergence
- PR 15: onboarding remaining (superset-sh#4115, superset-sh#4125, superset-sh#4214, superset-sh#4213, superset-sh#4222, superset-sh#4225) — depends on fork's deleted setup pages

Behind: 0 after this merge.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant