[codex] Harden PTY daemon auto-update#4460
Conversation
|
Capy auto-review is paused for this organization because the monthly auto-review limit has been reached. Increase the limit or turn it off in billing settings to resume automatic reviews. |
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (3)
🚧 Files skipped from review as they are similar to previous changes (3)
📝 WalkthroughWalkthroughDaemonSupervisor refactors auto-update to defer when live PTY sessions are active, adds session-aware guards before destructive restarts, improves adoption probing with retries and cleanup, introduces dev adoption control, switches PTY reader to ChangesAuto-update session awareness and deferral
Adoption lifecycle improvements
PTY daemon reader type migration
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Greptile SummaryThis PR hardens the PTY daemon lifecycle around two root causes: adopted master fds were wrapped with
Confidence Score: 4/5Safe to merge; the session-guard and probe-rejection logic is well-tested and the changes do not touch any data path outside the daemon lifecycle. The
|
| Filename | Overview |
|---|---|
| packages/host-service/src/daemon/DaemonSupervisor.ts | Core logic change: auto-update now checks for live sessions before updating/force-restarting; adoption rejects and terminates daemons that cannot answer the version probe. One minor redundant guard (!!probed) remains after the early-return. |
| packages/pty-daemon/src/Pty/Pty.ts | Fixes TTY buffering bug by replacing fs.createReadStream with tty.ReadStream for adopted PTY master fds. A double-close pattern is introduced (stream's libuv handle + explicit fs.closeSync), which is guarded but worth clarifying. |
| packages/host-service/src/daemon/DaemonSupervisor.test.ts | Adds unit tests for live-session deferral, force-restart skipping on late-arriving sessions, and adoption rejection on failed version probe. |
| packages/host-service/src/daemon/DaemonSupervisor.node-test.ts | Real-spawn integration tests updated: the force-restart test is replaced with live-session deferral, and dropSupervisorInstance helper is extracted. |
| packages/pty-daemon/package.json | Version bump from 0.2.2 to 0.2.3, consistent with bun.lock. |
| bun.lock | Lock file updated to reflect pty-daemon version bump to 0.2.3. |
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[ensure orgId] --> B{Instance cached?}
B -- Yes --> Z[Return cached instance]
B -- No --> C[tryAdopt]
C --> D{Manifest exists & PID alive?}
D -- No --> SPAWN[spawn]
D -- Yes --> E{Socket reachable?}
E -- No --> KILL1[Terminate + remove manifest] --> SPAWN
E -- Yes --> F{probeDaemonVersionWithRetry 3s with retries}
F -- null/timeout --> REJECT[Log adopt_rejected, Terminate + remove manifest] --> SPAWN
F -- version string --> ADOPTED[Return adopted instance]
ADOPTED --> H[kickoffAutoUpdate fire-and-forget]
H --> I{listSessions up to 1.5s}
I -- null timeout --> DEFER[Defer: session_list_unavailable]
I -- alive sessions > 0 --> DEFER2[Defer: live_sessions_present]
I -- no live sessions --> J[startUpdate / runUpdate]
J -- ok:true --> OK[Log auto_update_ok]
J -- ok:false --> K{canAutoUpdateForceRestart}
K --> L{listSessions up to 1.5s, any alive?}
L -- alive sessions > 0 --> SKIP[Skip force-restart: live_sessions_present]
L -- none --> FRESTART[forceRestart]
Comments Outside Diff (1)
-
packages/pty-daemon/src/Pty/Pty.ts, line 237-247 (link)Potential fd double-close with
tty.ReadStreamownershiptty.ReadStream(vianet.Socket) registers the fd with a libuv TTY handle viauv_tty_init. Whendestroy()is called, libuv queues an asyncuv_closethat will callclose(fd)one event-loop tick later. The synchronousfs.closeSync(this.fd)runs first and closes the fd correctly, but then libuv's deferred close fires on the now-closed fd, producing anEBADFthat libuv handles internally. This is safe in practice, but if any file-open call races between the two close events and reuses the same fd number, libuv would silently close the wrong descriptor. Thetry/catchcomment says "already closed" but the actual close order iscloseSync → libuv EBADF, which is the reverse of what the comment implies. Worth either removing thefs.closeSyncand relying ontty.ReadStream's own cleanup, or settingautoClose: falseexplicitly as the old code did.Prompt To Fix With AI
This is a comment left during a code review. Path: packages/pty-daemon/src/Pty/Pty.ts Line: 237-247 Comment: **Potential fd double-close with `tty.ReadStream` ownership** `tty.ReadStream` (via `net.Socket`) registers the fd with a libuv TTY handle via `uv_tty_init`. When `destroy()` is called, libuv queues an async `uv_close` that will call `close(fd)` one event-loop tick later. The synchronous `fs.closeSync(this.fd)` runs first and closes the fd correctly, but then libuv's deferred close fires on the now-closed fd, producing an `EBADF` that libuv handles internally. This is safe in practice, but if any file-open call races between the two close events and reuses the same fd number, libuv would silently close the wrong descriptor. The `try/catch` comment says "already closed" but the actual close order is `closeSync → libuv EBADF`, which is the reverse of what the comment implies. Worth either removing the `fs.closeSync` and relying on `tty.ReadStream`'s own cleanup, or setting `autoClose: false` explicitly as the old code did. How can I resolve this? If you propose a fix, please make it concise.
Prompt To Fix All With AI
Fix the following 2 code review issues. Work through them one at a time, proposing concise fixes.
---
### Issue 1 of 2
packages/host-service/src/daemon/DaemonSupervisor.ts:876-878
After the `if (!probed)` early-return guard on line 865, `probed` is guaranteed to be a non-null, non-empty string — `!!probed` is unconditionally `true`. The guard is harmless but misleads the reader into thinking a falsy `probed` could reach this line.
```suggestion
const runningVersion = probed;
const updatePending =
!semver.satisfies(probed, `>=${EXPECTED_DAEMON_VERSION}`);
```
### Issue 2 of 2
packages/pty-daemon/src/Pty/Pty.ts:237-247
**Potential fd double-close with `tty.ReadStream` ownership**
`tty.ReadStream` (via `net.Socket`) registers the fd with a libuv TTY handle via `uv_tty_init`. When `destroy()` is called, libuv queues an async `uv_close` that will call `close(fd)` one event-loop tick later. The synchronous `fs.closeSync(this.fd)` runs first and closes the fd correctly, but then libuv's deferred close fires on the now-closed fd, producing an `EBADF` that libuv handles internally. This is safe in practice, but if any file-open call races between the two close events and reuses the same fd number, libuv would silently close the wrong descriptor. The `try/catch` comment says "already closed" but the actual close order is `closeSync → libuv EBADF`, which is the reverse of what the comment implies. Worth either removing the `fs.closeSync` and relying on `tty.ReadStream`'s own cleanup, or setting `autoClose: false` explicitly as the old code did.
Reviews (1): Last reviewed commit: "harden pty daemon auto-update" | Re-trigger Greptile
| const runningVersion = probed; | ||
| const updatePending = | ||
| !!probed && !semver.satisfies(probed, `>=${EXPECTED_DAEMON_VERSION}`); |
There was a problem hiding this comment.
After the
if (!probed) early-return guard on line 865, probed is guaranteed to be a non-null, non-empty string — !!probed is unconditionally true. The guard is harmless but misleads the reader into thinking a falsy probed could reach this line.
| const runningVersion = probed; | |
| const updatePending = | |
| !!probed && !semver.satisfies(probed, `>=${EXPECTED_DAEMON_VERSION}`); | |
| const runningVersion = probed; | |
| const updatePending = | |
| !semver.satisfies(probed, `>=${EXPECTED_DAEMON_VERSION}`); |
Prompt To Fix With AI
This is a comment left during a code review.
Path: packages/host-service/src/daemon/DaemonSupervisor.ts
Line: 876-878
Comment:
After the `if (!probed)` early-return guard on line 865, `probed` is guaranteed to be a non-null, non-empty string — `!!probed` is unconditionally `true`. The guard is harmless but misleads the reader into thinking a falsy `probed` could reach this line.
```suggestion
const runningVersion = probed;
const updatePending =
!semver.satisfies(probed, `>=${EXPECTED_DAEMON_VERSION}`);
```
How can I resolve this? If you propose a fix, please make it concise.There was a problem hiding this comment.
🧹 Nitpick comments (1)
packages/host-service/src/daemon/DaemonSupervisor.node-test.ts (1)
713-724: ⚡ Quick winSimplify the mock implementation for clarity.
The current mock uses an IIFE within the return object literal, which is unnecessarily complex and makes the code harder to understand. The IIFE executes when the async function body runs (when constructing the return value), which is correct but non-idiomatic.
♻️ Clearer implementation
let runUpdateCalled = false; ( sup as unknown as { runUpdate: () => Promise<{ ok: false; reason: string }>; } - ).runUpdate = async () => ({ - ok: false as const, - reason: (() => { - runUpdateCalled = true; - return "auto-update should have deferred before runUpdate"; - })(), - }); + ).runUpdate = async () => { + runUpdateCalled = true; + return { + ok: false as const, + reason: "auto-update should have deferred before runUpdate" + }; + };🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/host-service/src/daemon/DaemonSupervisor.node-test.ts` around lines 713 - 724, The mock of sup.runUpdate is overcomplicated by using an IIFE to set runUpdateCalled inside the returned object; simplify it by making runUpdate an async function that sets runUpdateCalled = true and then returns { ok: false, reason: "auto-update should have deferred before runUpdate" } directly. Locate the assignment to (sup as unknown as { runUpdate: ... }).runUpdate and replace the IIFE-based return with a straightforward async function body that flips runUpdateCalled and returns the literal reason.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@packages/host-service/src/daemon/DaemonSupervisor.node-test.ts`:
- Around line 713-724: The mock of sup.runUpdate is overcomplicated by using an
IIFE to set runUpdateCalled inside the returned object; simplify it by making
runUpdate an async function that sets runUpdateCalled = true and then returns {
ok: false, reason: "auto-update should have deferred before runUpdate" }
directly. Locate the assignment to (sup as unknown as { runUpdate: ...
}).runUpdate and replace the IIFE-based return with a straightforward async
function body that flips runUpdateCalled and returns the literal reason.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 706bab64-e9a5-4728-9f06-dd34c595eac3
⛔ Files ignored due to path filters (1)
bun.lockis excluded by!**/*.lock
📒 Files selected for processing (5)
packages/host-service/src/daemon/DaemonSupervisor.node-test.tspackages/host-service/src/daemon/DaemonSupervisor.test.tspackages/host-service/src/daemon/DaemonSupervisor.tspackages/pty-daemon/package.jsonpackages/pty-daemon/src/Pty/Pty.ts
🧹 Preview Cleanup CompleteThe following preview resources have been cleaned up:
Thank you for your contribution! 🎉 |
* harden pty daemon auto-update * address pty daemon review comments
Summary
tty.ReadStreaminstead of a generic fs read stream.@superset/pty-daemonto0.2.3and add unit/real-spawn coverage for the new paths.Root Cause
The typing/buffering issue came from the fd-handoff path introduced in #3971. Adopted PTY master fds were being read through
fs.createReadStream, which is not the right wrapper for TTY fds after daemon handoff. The recent auto-update path made production exercise that latent issue more often.Behavior
When the daemon is version-skewed after an app update, host-service now probes the adopted daemon. If it is compatible and idle, it updates in the background. If live sessions exist, it keeps them running and leaves the update available for explicit user action. If the daemon is reachable but cannot speak the current probe protocol, adoption is rejected and a fresh daemon is spawned.
Validation
bun test packages/host-service/src/daemon/DaemonSupervisor.test.tsnode --experimental-strip-types --test packages/host-service/src/daemon/DaemonSupervisor.node-test.tsnode --experimental-strip-types --test packages/pty-daemon/test/handoff.test.tsbun test src/protocol src/SessionStore src/handlers src/Pty/Pty.test.ts test/no-encoding-hops.test.tsinpackages/pty-daemonbun run typecheckinpackages/host-servicebun run typecheckinpackages/pty-daemonbun run lintgit diff --checkAfter the real-spawn integration run, no test daemon processes remained from this worktree.
Summary by cubic
Hardens PTY daemon auto‑update and adoption to protect live sessions, replace incompatible daemons, and fix adopted PTY read semantics with
tty.ReadStream. Bumps@superset/pty-daemonto0.2.3with new tests.New Features
SUPERSET_PTY_DAEMON_ADOPT_IN_DEV=1.Bug Fixes
tty.ReadStreamfor inherited master FDs to restore TTY semantics and fix input buffering.Written for commit 784a7ae. Summary will update on new commits.
Summary by CodeRabbit
Bug Fixes
Refactor
Chores
Tests