Bun.spawn: don't close user-provided extra stdio fds#29621
Conversation
When a raw file descriptor number is passed in the stdio array at an index > 2, that fd is owned by the caller, not by Bun.spawn. Previously the fd was appended to the subprocess's extra_pipes list and closed in finalizeStreams() when the Subprocess was garbage-collected, leading to EBADF on subsequent use of the fd (or silent corruption if the slot was reused). Store invalid_fd in the slot instead so the positional entry is preserved for subprocess.stdio but finalizeStreams() skips the close. This matches the existing behaviour for raw fds at stdio[0..2] and the Windows path for extra fds.
|
a fix for this is already merged on main |
|
Caution Review failedThe pull request is closed. ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: ASSERTIVE Plan: Pro Run ID: 📒 Files selected for processing (2)
Disabled knowledge base sources:
WalkthroughModified POSIX spawn handling to prevent Bun from taking ownership of extra file descriptors when stdio is Changes
Possibly related PRs
Suggested reviewers
Comment |
| try actions.dup2(fd, fileno); | ||
|
|
||
| try extra_fds.append(fd); | ||
| // This file descriptor was passed in by the user (e.g. a | ||
| // number in the stdio array). It is not owned by Bun.spawn, | ||
| // so we must not close it in finalizeStreams. Store | ||
| // invalid_fd to keep the positional slot for | ||
| // subprocess.stdio without taking ownership. | ||
| try extra_fds.append(bun.invalid_fd); |
There was a problem hiding this comment.
🟡 This change makes proc.stdio[n] return null instead of the fd number for user-provided extra fds on POSIX (since getStdio at subprocess.zig:488-492 now sees invalid_fd). This is arguably the correct behavior — it matches Node.js and the existing Windows path — but it's an observable change worth calling out, and the TypeScript type readonly stdio: [null, null, null, ...number[]] at packages/bun-types/bun.d.ts:7165 should be updated to allow null in the rest tuple.
Extended reasoning...
What changed
Subprocess.getStdio (src/bun.js/api/bun/subprocess.zig:480-493) iterates stdio_pipes and, on POSIX, pushes JSValue.jsNumber(item.cast()) when item.isValid() and null otherwise. Before this PR, when a raw fd number was passed at stdio[3+], spawnProcessPosix stored that fd directly in extra_fds → stdio_pipes, so proc.stdio[3] returned the fd number. After this PR, bun.invalid_fd is stored instead, so item.isValid() is false and proc.stdio[3] returns null.
Step-by-step proof
- User calls
Bun.spawn({ stdio: ['ignore', 'ignore', 'ignore', 7] }). - The fd
7is mapped toPosixSpawnOptions.Stdio{ .pipe = 7 }inoptions.extra_fds[0]. spawnProcessPosix(process.zig:1509-1517) hits the.pipearm and now appendsbun.invalid_fdtoextra_fds(previously appended7).extra_fdsbecomesspawned.extra_pipes, which becomesSubprocess.stdio_pipes.- JS reads
proc.stdio→getStdioruns → for index 3,item == bun.invalid_fd,item.isValid()is false → pushes.null. - Before:
proc.stdio[3] === 7. After:proc.stdio[3] === null.
Why existing code doesn't prevent it
Nothing in getStdio special-cases "user-provided fd that we don't own"; it only checks isValid(). The PR intentionally uses invalid_fd as the sentinel for "slot exists but Bun doesn't own it", and getStdio maps that to null.
Addressing the refutation ("this is intentional, not a bug")
The refuter is right that this is the intended direction of the fix and aligns with both Node.js (which returns null for non-'pipe' stdio slots) and Bun's own Windows path (which already reported .unavailable → null). The previous return value was also dangerous: it echoed back an fd that Bun was about to close on GC, so any code relying on it was already racy/broken. And no information is lost — the user already has the fd they passed in. I agree this should not block the PR.
However, two things make it worth a comment rather than silence:
- The TypeScript type is now wrong.
packages/bun-types/bun.d.ts:7165declaresreadonly stdio: [null, null, null, ...number[]], and the type-test fixture attest/integration/bun-types/fixture/spawn.ts:74expectsnumber | undefinedfor indices ≥ 3. With this change, those slots can benullon every platform, so the type should become something like[null, null, null, ...(number | null)[]]. - It's a real observable change on POSIX. The PR description says the positional entry is "preserved" but doesn't mention the value changes from the fd number to
null. Worth a one-line note for anyone scanning the changelog.
Impact
Low. Existing POSIX code that did const fd = proc.stdio[3] to read back the fd it just passed in will now get null. Since the user already has that fd, the workaround is trivial (use the value you passed in). The type mismatch is the more concrete actionable item.
Suggested fix
Update packages/bun-types/bun.d.ts:7165 to readonly stdio: [null, null, null, ...(number | null)[]] (and the corresponding type-test fixture), and add a sentence to the PR description noting that proc.stdio[n] now returns null for raw-fd slots on POSIX, matching Windows and Node.js.
| expect(stdout.trim()).toBe("ok"); | ||
| expect(exitCode).toBe(0); | ||
| }, | ||
| 20000, |
There was a problem hiding this comment.
🟡 Nit: per test/CLAUDE.md ("CRITICAL: Do not set a timeout on tests. Bun already has timeouts."), the explicit 20000 timeout argument should be removed — the harness already manages per-test timeouts.
Extended reasoning...
What
The new test at test/js/bun/spawn/spawn-extra-fd-ownership.test.ts:67 passes 20000 as the third argument to test(), setting an explicit 20-second timeout.
Why this violates repo conventions
test/CLAUDE.md:120 states:
CRITICAL: Do not set a timeout on tests. Bun already has timeouts.
The test harness in this repo manages timeouts centrally, and explicit per-test timeouts have historically caused friction (drift between local and CI, masking hangs vs. genuine slowness, etc.). None of the other tests in test/js/bun/spawn/ set explicit timeouts, so this also breaks local consistency.
Step-by-step
test(name, fn, 20000)is called at line 67.bun:testrecords 20000 ms as the per-test timeout, overriding the harness default.- The repo guidance explicitly forbids step 2.
Impact
This is purely a convention/style issue — the test itself is functionally correct and the 20s value is generous enough that it won't cause flakes. Filing as a nit.
Fix
Drop the third argument:
},
);instead of
},
20000,
);
What
When a raw file descriptor number is passed in the
stdioarray at an index > 2 (e.g.stdio: ['ignore', 'ignore', 'ignore', fd]), that fd is owned by the caller, not byBun.spawn. Previously the fd was appended to the subprocess'sextra_pipeslist and closed infinalizeStreams()when theSubprocesswas garbage-collected.Repro
Cause
spawnProcessPosixstored the user's fd directly inextra_fds→stdio_pipes, andSubprocess.finalizeStreams()closes every valid fd in that list. For stdio[0..2] a raw fd is handled viaReadable/Writablewhich explicitly do not close the.fdvariant; the extra-fd path was inconsistent with that.Fix
Store
bun.invalid_fdin the slot instead, so the positional entry is preserved forsubprocess.stdiobutfinalizeStreams()skips the close. This matches:.unavailableVerification
test/js/bun/spawn/spawn-extra-fd-ownership.test.tsfails on main withEBADF: bad file descriptor, posix_spawnand passes with this change.bun bd test test/js/bun/spawn/spawn.ipc.test.tspasses (IPC uses the sameextra_fdsarray via socketpair — those fds are still owned and closed correctly).bun run zig:check-allpasses on all targets.