Don't forward SIGPWR in spawnSync signal handling on Linux#31104
Don't forward SIGPWR in spawnSync signal handling on Linux#31104robobun wants to merge 1 commit into
Conversation
On Linux, JSC uses SIGPWR for GC thread suspend/resume (see WTF ThreadingPOSIX.cpp). The spawnSync signal-forwarding list was copied from npm and included SIGPWR, so Bun__registerSignalsForForwarding would replace JSC's handler with a forwarding handler marked SA_RESETHAND. Bun.openInEditor() calls sync::spawn from a detached background thread, which meant multiple editor threads could race on the global previous_actions[] array and leave SIGPWR at SIG_DFL after the last unregister. The next GC-driven SIGPWR then terminated the process with signal 30.
|
Updated 4:15 PM PT - May 19th, 2026
❌ @robobun, your commit 15ed2e2 has 9 failures in
🧪 To try this PR locally: bunx bun-pr 31104That installs a local version of the PR into your bun-31104 --bun |
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: ASSERTIVE Plan: Pro Run ID: 📒 Files selected for processing (2)
WalkthroughThis PR addresses signal handler conflicts on Linux by removing ChangesSignal Handler Stability Fix
🚥 Pre-merge checks | ✅ 4✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Comment |
|
This PR may be a duplicate of:
🤖 Generated with Claude Code |
|
Duplicate of #30956, which has the same SIGPWR removal plus a mutex/depth guard around |
| stdout: "ignore", | ||
| stderr: "ignore", | ||
| }); | ||
| const exitCode = await proc.exited; | ||
| expect(proc.signalCode).toBeNull(); | ||
| expect(exitCode).toBe(0); |
There was a problem hiding this comment.
🟡 nit: Consider piping stderr (and surfacing it before the exitCode assertion) instead of "ignore". Per the repo convention in CLAUDE.md, spawned-process tests should print/assert output before asserting exitCode so that an unrelated failure (e.g. a debug assertion or future openInEditor regression) doesn't reduce to an opaque "expected 0, received 1" in CI. The signalCode assertion already covers the SIGPWR case nicely, but stderr-on-failure would help debug everything else.
Extended reasoning...
What the issue is
The new test spawns a subprocess with stdout: "ignore", stderr: "ignore" and then asserts only on proc.signalCode and exitCode:
await using proc = Bun.spawn({
cmd: [bunExe(), "-e", script],
env: { ...bunEnv, PATH: "/nonexistent" },
stdout: "ignore",
stderr: "ignore",
});
const exitCode = await proc.exited;
expect(proc.signalCode).toBeNull();
expect(exitCode).toBe(0);The repo's testing convention (root CLAUDE.md, line ~128) explicitly says: "When spawning processes, tests should expect(stdout).toBe(...) BEFORE expect(exitCode).toBe(0). This gives you a more useful error message on test failure." By discarding both streams, this test loses the diagnostic context that convention is designed to preserve.
How it would manifest
Walk through a concrete failure: suppose a future change introduces a debug-build assertion inside Bun.openInEditor, or the -e script throws an uncaught error before reaching the GC loop. The subprocess writes the assertion/stack trace to stderr and exits with code 1 — not via a signal. In CI you would see:
expect(proc.signalCode).toBeNull()→ passes (it exited normally, not via signal).expect(exitCode).toBe(0)→ fails withexpected 0, received 1.
That's the entire output. Whoever triages it has to reproduce locally to find out what actually happened, because stderr was sent to /dev/null.
Why existing assertions don't cover it
The expect(proc.signalCode).toBeNull() line is nicely diagnostic for the specific regression this PR fixes — if SIGPWR comes back, you'll see expected null, received "SIGPWR". But it does nothing for non-signal failures, and the exitCode assertion alone carries no payload.
Why "ignore" was probably chosen
It's understandable: the script fires 30 openInEditor calls with PATH=/nonexistent, so stderr is likely full of non-deterministic "editor not found"/ENOENT noise from racing background spawns. Asserting expect(stderr).toBe("") would be flaky.
Suggested fix
Pipe stderr and include it in the failure message without asserting exact equality, e.g.:
stdout: "ignore",
stderr: "pipe",
...
const stderr = await proc.stderr.text();
expect(proc.signalCode).toBeNull();
expect(exitCode, stderr).toBe(0);(or just stderr: "inherit" so CI logs capture it). This keeps the test tolerant of the expected editor-spawn noise while still giving a future maintainer something to read when exitCode !== 0 for an unrelated reason.
Severity
This is purely a test-quality / convention nit — it does not affect correctness of the fix or the test's ability to catch the SIGPWR regression. Not blocking.
| // SIGPWR is intentionally excluded: JSC uses it for GC thread suspend/resume | ||
| // on Linux (see WTF ThreadingPOSIX.cpp), so replacing its handler here would | ||
| // break the collector and can terminate the process with signal 30. |
There was a problem hiding this comment.
🟣 Worth noting as a follow-up: the PR description correctly identifies that Bun.openInEditor() runs sync::spawn on detached threads, which means the "We only ever use bun.spawnSync on the main thread" comment at line 912 is now known to be false, and the underlying race on previous_actions[] + the unconditional memset still applies to every other signal in FOR_EACH_SIGNAL (e.g. the SIGINT/SIGTERM → onExitSignal handler that restores TTY state on Ctrl-C, or any user process.on('SIGHUP'/...) handler, can still be left at SIG_DFL). Removing SIGPWR is the right call regardless and fixes the crash, so this is a pre-existing issue and not a blocker — but it'd be good to at least update the stale comment, and ideally track a follow-up to add locking/refcounting around previous_actions[] or stop openInEditor from calling sync::spawn off the main thread.
Extended reasoning...
What this is
The PR's own root-cause analysis (in the description) establishes two things that the code change doesn't fully address:
- The comment at
c-bindings.cpp:912—// Note: We only ever use bun.spawnSync on the main thread.— is the stated justification forprevious_actions[]andBun__currentSyncPIDbeing unsynchronized process-globals, and it is now known to be false. - The race on
previous_actions[]that the description spells out applies to every signal inFOR_EACH_SIGNAL, not justSIGPWR. RemovingSIGPWReliminates the GC-driven crash but leaves the race intact for the remaining ~15 signals.
Code path
Bun.openInEditor→Editor::open(src/runtime/cli/open.rs:378) doesstd::thread::Builder::spawn(auto_close)— a detached thread.auto_close()(open.rs:518) callssync::spawn(...).sync::spawn→spawn_posix(src/spawn/process.rs:3133) constructsSignalForwarding::register(), which callsBun__registerSignalsForForwarding(); itsDropimpl callsBun__unregisterSignalsForForwarding()thencrash_handler::reset_on_posix().reset_on_posix()only re-installs the crash-class handlers (SIGSEGV/SIGILL/SIGBUS/SIGFPE), none of which are inFOR_EACH_SIGNAL, so it does not repair damage toSIGINT/SIGTERM/SIGHUP/etc.
So previous_actions[NSIG] and the memset(previous_actions, 0, ...) in Bun__unregisterSignalsForForwarding() are reached from multiple detached threads concurrently with no locking — exactly what the line-912 comment says cannot happen.
Why existing code doesn't prevent it
There is no mutex, refcount, or thread check around Bun__registerSignalsForForwarding / Bun__unregisterSignalsForForwarding. The only "protection" is the comment asserting main-thread-only use, which the PR description itself disproves. The reset_on_posix() call in SignalForwarding::Drop covers a disjoint set of signals from FOR_EACH_SIGNAL, so it doesn't help here.
Step-by-step proof (remaining race)
Take SIGINT as the example. bun_initialize_process() installs onExitSignal (which calls bun_restore_stdio()) for SIGINT/SIGTERM when any stdio is a TTY.
- Two concurrent
Bun.openInEditor()calls spawn detached threads A and B. - A:
register—sigaction(SIGINT, forwarder, &previous_actions[SIGINT])savesonExitSignalintoprevious_actions[SIGINT]. - B:
register— saves the forwarder (now the current handler) intoprevious_actions[SIGINT], overwritingonExitSignal. - A:
unregister—sigaction(SIGINT, &previous_actions[SIGINT], NULL)installs the forwarder (from step 3), thenmemset(previous_actions, 0, sizeof(previous_actions))zeroes the whole array. - B:
unregister—sigaction(SIGINT, &previous_actions[SIGINT], NULL)installs a zeroedstruct sigaction, i.e.sa_handler = 0 = SIG_DFL.
After this settles, SIGINT is at SIG_DFL. The next Ctrl-C terminates the process without running bun_restore_stdio(), leaving the terminal in whatever state Bun (or a child) put it in. The same applies to user-installed process.on('SIGHUP'/'SIGUSR2'/...) handlers.
Impact
Far less severe than the SIGPWR crash this PR fixes — GC fires SIGPWR during normal operation, whereas the remaining signals only matter on external delivery, and "multiple concurrent Bun.openInEditor() calls" is a fuzzer-pathological pattern. The likely worst case is a missed TTY restore on Ctrl-C or a silently dropped user signal handler. So: real, pre-existing, strictly improved by this PR, and reasonable to defer.
Suggested fix / follow-up
- In this PR (cheap): delete or correct the comment at line 912 — the PR's own analysis directly contradicts it, and leaving it tells future readers no synchronization is needed.
- Follow-up: either guard
previous_actions[]/ thememsetwith a mutex + nesting refcount, or stopEditor::openfrom callingsync::spawnon a detached thread (use a non-signal-forwarding spawn for the editor process).
| #define FOR_EACH_LINUX_ONLY_SIGNAL(M) \ | ||
| M(SIGPOLL); \ | ||
| M(SIGPWR); \ | ||
| M(SIGSTKFLT); |
There was a problem hiding this comment.
🟣 Pre-existing, but since this PR is auditing this exact list: SIGPOLL is an alias for SIGIO (29) on Linux, and SIGIOT is an alias for SIGABRT (6). Because REGISTER_SIGNAL runs twice for the same numeric signal, the second sigaction() saves the just-installed forwarder into previous_actions[N], so unregister "restores" the forwarder instead of the real prior handler — same handler-clobbering class as the SIGPWR bug. Dropping M(SIGPOLL) here (and M(SIGIOT) from FOR_EACH_POSIX_SIGNAL) would fix it.
Extended reasoning...
What the bug is
On Linux, several signal names are aliases for the same numeric signal. From <asm-generic/signal.h>:
SIGIOTis#defined to6, identical toSIGABRTSIGPOLLis#defined toSIGIO, signal29
FOR_EACH_POSIX_SIGNAL lists both M(SIGABRT) and M(SIGIOT), and on Linux FOR_EACH_LINUX_ONLY_SIGNAL adds M(SIGPOLL) on top of the M(SIGIO) already in the POSIX list. So FOR_EACH_SIGNAL expands the same numeric signal twice for both 6 and 29.
Code path
REGISTER_SIGNAL expands to:
sigaction(SIG, &sa, &previous_actions[SIG]);When this runs twice for the same numeric SIG, the first call correctly saves the real prior disposition into previous_actions[SIG] and installs the forwarder. The second call then installs the forwarder again — and writes the currently installed handler (the forwarder it just put there) back into previous_actions[SIG], overwriting the real saved handler.
UNREGISTER_SIGNAL then does sigaction(SIG, &previous_actions[SIG], NULL) twice, both times "restoring" the forwarder. The original disposition is never put back.
Why nothing prevents it
This is the same npm-derived signal list the PR is fixing. npm's JS-level loop is order-insensitive because Node deduplicates by signal number internally, but here the C++ macro expansion calls sigaction() directly, once per name, with no dedup. Unlike the multi-thread race described in the PR body, this self-overwrite is deterministic — it happens on a single thread, on every spawnSync, no concurrency required. crash_handler::reset_on_posix() (called after unregister) only reinstalls SIGSEGV/SIGILL/SIGBUS/SIGFPE, so it does not repair SIGABRT or SIGIO.
Step-by-step proof (SIGIO/SIGPOLL, signal 29)
- User does
process.on('SIGIO', handler)→ libuv installs a handler for signal 29. Bun.spawnSync(...)→Bun__registerSignalsForForwarding():M(SIGIO):sigaction(29, &forwarder, &previous_actions[29])→ saves libuv handler, installs forwarder.M(SIGPOLL):sigaction(29, &forwarder, &previous_actions[29])→ saves forwarder (overwriting libuv handler), installs forwarder.
- Child exits →
Bun__unregisterSignalsForForwarding():M(SIGIO): restoresprevious_actions[29]= forwarder.M(SIGPOLL): restoresprevious_actions[29]= forwarder again.memset(previous_actions, 0, ...).
- Signal 29 now has the forwarder installed (with
SA_RESETHAND, andBun__currentSyncPID == 0). Next SIGIO delivery hits the forwarder, which just stashes it inBun__pendingSignalToSendand returns — the user's handler never fires — andSA_RESETHANDthen drops the disposition toSIG_DFL.
The identical sequence applies to signal 6 via SIGABRT/SIGIOT.
Impact
Any user-installed SIGABRT or SIGIO handler is permanently lost after a single spawnSync. Far less severe than SIGPWR — no internal Bun/JSC component depends on these the way GC depends on SIGPWR — so this won't crash the process on its own. But it is exactly the same handler-clobbering defect class this PR is eliminating, and M(SIGPOLL) sits one line away from the line being deleted.
Fix
Drop the redundant aliases:
- Remove
M(SIGIOT);fromFOR_EACH_POSIX_SIGNAL(it'sSIGABRTeverywhere). - Remove
M(SIGPOLL);fromFOR_EACH_LINUX_ONLY_SIGNAL(it'sSIGIOon Linux).
Alternatively, guard REGISTER_SIGNAL to skip a signal whose previous_actions[SIG].sa_handler is already the forwarder, but de-duplicating the list is simpler and matches the spirit of this PR.
Fuzzilli found a flaky crash (fingerprint
5b0730bc208b3b6c) where the process is terminated by signal 30 (SIGPWR).Root cause
On Linux, JavaScriptCore uses SIGPWR for GC thread suspend/resume (see
WTF/wtf/posix/ThreadingPOSIX.cppunderUSE(BUN_JSC_ADDITIONS)— upstream uses SIGUSR1, Bun switched to SIGPWR so npm packages can intercept SIGUSR1).The
bun.spawnSyncsignal-forwarding list inc-bindings.cppwas copied from npm's list and includes SIGPWR.Bun__registerSignalsForForwarding()therefore replaces JSC's SIGPWR handler with a forwarding handler markedSA_RESETHAND, andBun__unregisterSignalsForForwarding()restores the previous handler from a process-globalprevious_actions[]array.Bun.openInEditor()spawns a detached background thread which callssync::spawn→Bun__registerSignalsForForwarding/Bun__unregisterSignalsForForwarding. Multiple concurrent editor threads race onprevious_actions[]:memsetsprevious_actionsto zeroSIG_DFLAfter the race settles, SIGPWR is left at
SIG_DFL. The next time GC sends SIGPWR to suspend a thread, the process dies with signal 30.Even without the overlap race, temporarily replacing JSC's SIGPWR handler while JS is running concurrently is unsafe: a concurrent GC that fires during the window hits the wrong handler and never posts the suspend semaphore.
Fix
Remove SIGPWR from
FOR_EACH_LINUX_ONLY_SIGNAL. It's an obscure "power failure" signal that supervisors don't send tobun run, and JSC has explicitly reserved it for GC on Linux.Repro
Before the fix, this dies with SIGPWR 100/100 on a debug build:
After: 0/100.