Don't forward SIGPWR in spawnSync signal handling on Linux #31104

Closed
robobun wants to merge 1 commit into main from farm/bb96feb1/sigpwr-signal-forwarding

robobun (Collaborator) commented May 19, 2026

Fuzzilli found a flaky crash (fingerprint 5b0730bc208b3b6c) where the process is terminated by signal 30 (SIGPWR).

Root cause

On Linux, JavaScriptCore uses SIGPWR for GC thread suspend/resume (see WTF/wtf/posix/ThreadingPOSIX.cpp under USE(BUN_JSC_ADDITIONS) — upstream uses SIGUSR1, Bun switched to SIGPWR so npm packages can intercept SIGUSR1).

The bun.spawnSync signal-forwarding list in c-bindings.cpp was copied from npm's list and includes SIGPWR. Bun__registerSignalsForForwarding() therefore replaces JSC's SIGPWR handler with a forwarding handler marked SA_RESETHAND, and Bun__unregisterSignalsForForwarding() restores the previous handler from a process-global previous_actions[] array.

Bun.openInEditor() spawns a detached background thread which calls sync::spawn → Bun__registerSignalsForForwarding / Bun__unregisterSignalsForForwarding. Multiple concurrent editor threads race on previous_actions[]:

  1. Thread A: register → saves JSC handler, installs forwarder
  2. Thread B: register → saves forwarder as "previous", installs forwarder
  3. Thread A: unregister → restores forwarder, then memsets previous_actions to zero
  4. Thread B: unregister → restores zeroed struct = SIG_DFL

After the race settles, SIGPWR is left at SIG_DFL. The next time GC sends SIGPWR to suspend a thread, the process dies with signal 30.
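
The interleaving above can be replayed deterministically on a single thread. The following is a minimal sketch, not the code in c-bindings.cpp: jsc_handler, forwarder, register_forwarding, unregister_forwarding and replay_leaves_default are hypothetical names, and it relies on SIG_DFL being the zero handler value, which holds on Linux.

```cpp
#include <cassert>
#include <cstring>
#include <signal.h>

// Hypothetical stand-ins: jsc_handler plays JSC's suspend handler,
// forwarder plays the spawnSync forwarding handler.
static void jsc_handler(int) {}
static void forwarder(int) {}

// Process-global save area, shaped like previous_actions[] in the description.
static struct sigaction previous_actions[NSIG];

static void register_forwarding(int sig) {
    assert(sig > 0 && sig < NSIG);
    struct sigaction sa;
    std::memset(&sa, 0, sizeof(sa));
    sa.sa_handler = forwarder;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESETHAND;
    // Saves whatever is currently installed, even if that is the forwarder.
    sigaction(sig, &sa, &previous_actions[sig]);
}

static void unregister_forwarding(int sig) {
    assert(sig > 0 && sig < NSIG);
    sigaction(sig, &previous_actions[sig], nullptr);
    // A zeroed struct sigaction has sa_handler == 0, which is SIG_DFL on Linux.
    std::memset(previous_actions, 0, sizeof(previous_actions));
}

// Replays steps 1-4 in order on one thread; returns true if `sig` ends at SIG_DFL.
bool replay_leaves_default(int sig) {
    struct sigaction original;
    std::memset(&original, 0, sizeof(original));
    original.sa_handler = jsc_handler;
    sigemptyset(&original.sa_mask);
    sigaction(sig, &original, nullptr);

    register_forwarding(sig);   // 1. thread A: saves jsc_handler
    register_forwarding(sig);   // 2. thread B: saves forwarder
    unregister_forwarding(sig); // 3. thread A: restores forwarder, zeroes array
    unregister_forwarding(sig); // 4. thread B: restores zeroed struct

    struct sigaction current;
    sigaction(sig, nullptr, &current);
    return current.sa_handler == SIG_DFL;
}
```

Calling replay_leaves_default(SIGUSR2) from a test program exercises the same sequence on a signal that JSC does not depend on, so the sketch can be run without touching SIGPWR.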

Even without the overlap race, temporarily replacing JSC's SIGPWR handler while JS is running concurrently is unsafe: a concurrent GC that fires during the window hits the wrong handler and never posts the suspend semaphore.

Fix

Remove SIGPWR from FOR_EACH_LINUX_ONLY_SIGNAL. It's an obscure "power failure" signal that supervisors don't send to bun run, and JSC has explicitly reserved it for GC on Linux.
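
As a rough illustration of what the list looks like once SIGPWR is gone, here is a sketch. It assumes the remaining Linux-only entries are the SIGPOLL and SIGSTKFLT lines quoted in the review thread, and it drops the trailing semicolons from the quoted macro for brevity. is_linux_only_forwarded, MATCH_SIGNAL and sigpwr_is_excluded are hypothetical helpers that do not exist in Bun.

```cpp
#include <signal.h>

// Assumed shape of the list after the fix: the entries quoted in the review
// minus SIGPWR. Non-Linux platforms get an empty list in this sketch.
#if defined(__linux__)
#define FOR_EACH_LINUX_ONLY_SIGNAL(M) \
    M(SIGPOLL)                        \
    M(SIGSTKFLT)
#else
#define FOR_EACH_LINUX_ONLY_SIGNAL(M)
#endif

// Hypothetical helper: true if `sig` appears in the Linux-only forwarding list.
bool is_linux_only_forwarded(int sig) {
#define MATCH_SIGNAL(SIG) if (sig == SIG) return true;
    FOR_EACH_LINUX_ONLY_SIGNAL(MATCH_SIGNAL)
#undef MATCH_SIGNAL
    return false;
}

// True when SIGPWR is left alone (trivially true where SIGPWR does not exist).
bool sigpwr_is_excluded() {
#if defined(SIGPWR)
    return !is_linux_only_forwarded(SIGPWR);
#else
    return true;
#endif
}
```

A check like sigpwr_is_excluded() could back a unit test that fails if someone re-adds SIGPWR to the list later.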

Repro

Before the fix, this dies with SIGPWR in 100/100 runs on a debug build:

for (let i = 0; i < 30; i++) {
  try { Bun.openInEditor("/tmp/x" + i); } catch {}
}
await Bun.sleep(50);
for (let i = 0; i < 200; i++) {
  new Uint8Array(10000);
  Bun.gc(true);
}

After: 0/100.

On Linux, JSC uses SIGPWR for GC thread suspend/resume (see
WTF ThreadingPOSIX.cpp). The spawnSync signal-forwarding list was
copied from npm and included SIGPWR, so Bun__registerSignalsForForwarding
would replace JSC's handler with a forwarding handler marked
SA_RESETHAND.

Bun.openInEditor() calls sync::spawn from a detached background thread,
which meant multiple editor threads could race on the global
previous_actions[] array and leave SIGPWR at SIG_DFL after the last
unregister. The next GC-driven SIGPWR then terminated the process with
signal 30.

robobun (Collaborator, Author) commented May 19, 2026
Updated 4:15 PM PT - May 19th, 2026

@robobun, your commit 15ed2e2 has 9 failures in Build #56267 (All Failures):


🧪   To try this PR locally:

bunx bun-pr 31104

That installs a local version of the PR into your bun-31104 executable, so you can run:

bun-31104 --bun

coderabbitai (Bot, Contributor) commented May 19, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 59e58942-365d-42f6-b565-c99325acca01

📥 Commits

Reviewing files that changed from the base of the PR and between b96cc3f and 15ed2e2.

📒 Files selected for processing (2)
  • src/jsc/bindings/c-bindings.cpp
  • test/js/bun/util/openInEditor-gc.test.ts

Walkthrough

This PR addresses signal handler conflicts on Linux by removing SIGPWR from the signal forwarding list in bun.spawnSync, since JSC uses SIGPWR internally for GC thread suspend/resume. A new regression test validates the fix by spawning a process that repeatedly calls Bun.openInEditor() under GC stress without signal termination.

Changes

Signal Handler Stability Fix

| Layer / File(s) | Summary |
|---|---|
| Signal forwarding exclusion (src/jsc/bindings/c-bindings.cpp) | Linux-specific signal macro is modified to exclude SIGPWR from parent→child forwarding in bun.spawnSync, retaining only SIGPOLL, with a comment explaining JSC's internal use of SIGPWR. |
| Regression test for GC signal handler (test/js/bun/util/openInEditor-gc.test.ts) | Linux-only test spawns a bun process that repeatedly calls Bun.openInEditor() and performs GC stress, asserting the process exits cleanly without signal termination. |

🚥 Pre-merge checks | ✅ 4
✅ Passed checks (4 passed)
| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title clearly and specifically describes the main change: removing SIGPWR from signal forwarding in spawnSync on Linux, which directly matches the code modification. |
| Description check | ✅ Passed | The description comprehensively addresses both required template sections with detailed root cause analysis, fix explanation, and a concrete repro case demonstrating the issue and verification. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.

github-actions (Contributor) commented

This PR may be a duplicate of:

  1. spawnSync: make signal-forwarding register/unregister safe for concurrent callers #30956 - Also removes SIGPWR from FOR_EACH_LINUX_ONLY_SIGNAL in c-bindings.cpp to fix the same race condition where concurrent spawnSync signal forwarding leaves SIGPWR at SIG_DFL

🤖 Generated with Claude Code

robobun (Collaborator, Author) commented May 19, 2026

Duplicate of #30956, which has the same SIGPWR removal plus a mutex/depth guard around previous_actions[] and the SIGIOT/SIGPOLL alias cleanup. Closing in favor of that one.

robobun closed this May 19, 2026
robobun deleted the farm/bb96feb1/sigpwr-signal-forwarding branch May 19, 2026 23:14

Comment on lines +26 to +31
stdout: "ignore",
stderr: "ignore",
});
const exitCode = await proc.exited;
expect(proc.signalCode).toBeNull();
expect(exitCode).toBe(0);

🟡 nit: Consider piping stderr (and surfacing it before the exitCode assertion) instead of "ignore". Per the repo convention in CLAUDE.md, spawned-process tests should print/assert output before asserting exitCode so that an unrelated failure (e.g. a debug assertion or future openInEditor regression) doesn't reduce to an opaque "expected 0, received 1" in CI. The signalCode assertion already covers the SIGPWR case nicely, but stderr-on-failure would help debug everything else.

Extended reasoning...

What the issue is

The new test spawns a subprocess with stdout: "ignore", stderr: "ignore" and then asserts only on proc.signalCode and exitCode:

await using proc = Bun.spawn({
  cmd: [bunExe(), "-e", script],
  env: { ...bunEnv, PATH: "/nonexistent" },
  stdout: "ignore",
  stderr: "ignore",
});
const exitCode = await proc.exited;
expect(proc.signalCode).toBeNull();
expect(exitCode).toBe(0);

The repo's testing convention (root CLAUDE.md, line ~128) explicitly says: "When spawning processes, tests should expect(stdout).toBe(...) BEFORE expect(exitCode).toBe(0). This gives you a more useful error message on test failure." By discarding both streams, this test loses the diagnostic context that convention is designed to preserve.

How it would manifest

Walk through a concrete failure: suppose a future change introduces a debug-build assertion inside Bun.openInEditor, or the -e script throws an uncaught error before reaching the GC loop. The subprocess writes the assertion/stack trace to stderr and exits with code 1 — not via a signal. In CI you would see:

  1. expect(proc.signalCode).toBeNull() → passes (it exited normally, not via signal).
  2. expect(exitCode).toBe(0) → fails with expected 0, received 1.

That's the entire output. Whoever triages it has to reproduce locally to find out what actually happened, because stderr was sent to /dev/null.

Why existing assertions don't cover it

The expect(proc.signalCode).toBeNull() line is nicely diagnostic for the specific regression this PR fixes — if SIGPWR comes back, you'll see expected null, received "SIGPWR". But it does nothing for non-signal failures, and the exitCode assertion alone carries no payload.

Why "ignore" was probably chosen

It's understandable: the script fires 30 openInEditor calls with PATH=/nonexistent, so stderr is likely full of non-deterministic "editor not found"/ENOENT noise from racing background spawns. Asserting expect(stderr).toBe("") would be flaky.

Suggested fix

Pipe stderr and include it in the failure message without asserting exact equality, e.g.:

stdout: "ignore",
stderr: "pipe",
...
const stderr = await proc.stderr.text();
expect(proc.signalCode).toBeNull();
expect(exitCode, stderr).toBe(0);

(or just stderr: "inherit" so CI logs capture it). This keeps the test tolerant of the expected editor-spawn noise while still giving a future maintainer something to read when exitCode !== 0 for an unrelated reason.

Severity

This is purely a test-quality / convention nit — it does not affect correctness of the fix or the test's ability to catch the SIGPWR regression. Not blocking.

Comment on lines +936 to +938
// SIGPWR is intentionally excluded: JSC uses it for GC thread suspend/resume
// on Linux (see WTF ThreadingPOSIX.cpp), so replacing its handler here would
// break the collector and can terminate the process with signal 30.

🟣 Worth noting as a follow-up: the PR description correctly identifies that Bun.openInEditor() runs sync::spawn on detached threads, which means the "We only ever use bun.spawnSync on the main thread" comment at line 912 is now known to be false. The underlying race on previous_actions[] plus the unconditional memset still applies to every other signal in FOR_EACH_SIGNAL: the SIGINT/SIGTERM onExitSignal handler that restores TTY state on Ctrl-C, or any user process.on('SIGHUP'/...) handler, can still be left at SIG_DFL. Removing SIGPWR is the right call regardless and fixes the crash, so this is a pre-existing issue and not a blocker, but it'd be good to at least update the stale comment, and ideally track a follow-up to add locking/refcounting around previous_actions[] or stop openInEditor from calling sync::spawn off the main thread.

Extended reasoning...

What this is

The PR's own root-cause analysis (in the description) establishes two things that the code change doesn't fully address:

  1. The comment at c-bindings.cpp:912 (// Note: We only ever use bun.spawnSync on the main thread.) is the stated justification for previous_actions[] and Bun__currentSyncPID being unsynchronized process-globals, and it is now known to be false.
  2. The race on previous_actions[] that the description spells out applies to every signal in FOR_EACH_SIGNAL, not just SIGPWR. Removing SIGPWR eliminates the GC-driven crash but leaves the race intact for the remaining ~15 signals.

Code path

  • Bun.openInEditor → Editor::open (src/runtime/cli/open.rs:378) does std::thread::Builder::spawn(auto_close), which creates a detached thread.
  • auto_close() (open.rs:518) calls sync::spawn(...).
  • sync::spawn → spawn_posix (src/spawn/process.rs:3133) constructs SignalForwarding::register(), which calls Bun__registerSignalsForForwarding(); its Drop impl calls Bun__unregisterSignalsForForwarding() then crash_handler::reset_on_posix().
  • reset_on_posix() only re-installs the crash-class handlers (SIGSEGV/SIGILL/SIGBUS/SIGFPE), none of which are in FOR_EACH_SIGNAL, so it does not repair damage to SIGINT/SIGTERM/SIGHUP/etc.

So previous_actions[NSIG] and the memset(previous_actions, 0, ...) in Bun__unregisterSignalsForForwarding() are reached from multiple detached threads concurrently with no locking — exactly what the line-912 comment says cannot happen.

Why existing code doesn't prevent it

There is no mutex, refcount, or thread check around Bun__registerSignalsForForwarding / Bun__unregisterSignalsForForwarding. The only "protection" is the comment asserting main-thread-only use, which the PR description itself disproves. The reset_on_posix() call in SignalForwarding::Drop covers a disjoint set of signals from FOR_EACH_SIGNAL, so it doesn't help here.

Step-by-step proof (remaining race)

Take SIGINT as the example. bun_initialize_process() installs onExitSignal (which calls bun_restore_stdio()) for SIGINT/SIGTERM when any stdio is a TTY.

  1. Two concurrent Bun.openInEditor() calls spawn detached threads A and B.
  2. A: register → sigaction(SIGINT, forwarder, &previous_actions[SIGINT]) saves onExitSignal into previous_actions[SIGINT].
  3. B: register → saves the forwarder (now the current handler) into previous_actions[SIGINT], overwriting onExitSignal.
  4. A: unregister → sigaction(SIGINT, &previous_actions[SIGINT], NULL) installs the forwarder (from step 3), then memset(previous_actions, 0, sizeof(previous_actions)) zeroes the whole array.
  5. B: unregister → sigaction(SIGINT, &previous_actions[SIGINT], NULL) installs a zeroed struct sigaction, i.e. sa_handler = 0 = SIG_DFL.

After this settles, SIGINT is at SIG_DFL. The next Ctrl-C terminates the process without running bun_restore_stdio(), leaving the terminal in whatever state Bun (or a child) put it in. The same applies to user-installed process.on('SIGHUP'/'SIGUSR2'/...) handlers.

Impact

Far less severe than the SIGPWR crash this PR fixes — GC fires SIGPWR during normal operation, whereas the remaining signals only matter on external delivery, and "multiple concurrent Bun.openInEditor() calls" is a fuzzer-pathological pattern. The likely worst case is a missed TTY restore on Ctrl-C or a silently dropped user signal handler. So: real, pre-existing, strictly improved by this PR, and reasonable to defer.

Suggested fix / follow-up

  • In this PR (cheap): delete or correct the comment at line 912 — the PR's own analysis directly contradicts it, and leaving it tells future readers no synchronization is needed.
  • Follow-up: either guard previous_actions[] / the memset with a mutex + nesting refcount, or stop Editor::open from calling sync::spawn on a detached thread (use a non-signal-forwarding spawn for the editor process).
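
One possible shape for the mutex + nesting refcount follow-up is sketched below. This is an assumption-laden illustration, not the change in #30956: forwarding_lock, forwarding_depth, register_guarded, unregister_guarded and guarded_replay_restores_original are hypothetical names, and the sketch assumes every caller registers the same signal.

```cpp
#include <cassert>
#include <cstring>
#include <mutex>
#include <signal.h>

// Hypothetical stand-ins for the pre-existing handler and the forwarder.
static void original_handler(int) {}
static void forwarder(int) {}

static std::mutex forwarding_lock;
static int forwarding_depth = 0; // number of callers currently registered
static struct sigaction previous_actions[NSIG];

void register_guarded(int sig) {
    assert(sig > 0 && sig < NSIG);
    std::lock_guard<std::mutex> guard(forwarding_lock);
    if (forwarding_depth++ > 0) return; // forwarder already installed
    struct sigaction sa;
    std::memset(&sa, 0, sizeof(sa));
    sa.sa_handler = forwarder;
    sigemptyset(&sa.sa_mask);
    // Only the outermost caller saves the real previous disposition.
    sigaction(sig, &sa, &previous_actions[sig]);
}

void unregister_guarded(int sig) {
    assert(sig > 0 && sig < NSIG);
    std::lock_guard<std::mutex> guard(forwarding_lock);
    assert(forwarding_depth > 0);
    if (--forwarding_depth > 0) return; // another caller still needs the forwarder
    sigaction(sig, &previous_actions[sig], nullptr);
    std::memset(previous_actions, 0, sizeof(previous_actions));
}

// Replays the A/B interleaving; returns true if the original handler is back.
bool guarded_replay_restores_original(int sig) {
    struct sigaction original;
    std::memset(&original, 0, sizeof(original));
    original.sa_handler = original_handler;
    sigemptyset(&original.sa_mask);
    sigaction(sig, &original, nullptr);

    register_guarded(sig);   // A: register
    register_guarded(sig);   // B: register (depth 2, no sigaction call)
    unregister_guarded(sig); // A: unregister (depth 1, nothing restored yet)
    unregister_guarded(sig); // B: unregister (depth 0, original restored)

    struct sigaction current;
    sigaction(sig, nullptr, &current);
    return current.sa_handler == original_handler;
}
```

With the depth counter, only the first register saves a handler and only the last unregister restores one, so the same interleaving that previously ended at SIG_DFL ends with the original handler in place. The lock does not address the separate concern of swapping a handler that another subsystem relies on while it is live.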

Comment on lines 939 to 941
#define FOR_EACH_LINUX_ONLY_SIGNAL(M) \
M(SIGPOLL); \
M(SIGPWR); \
M(SIGSTKFLT);

🟣 Pre-existing, but since this PR is auditing this exact list: SIGPOLL is an alias for SIGIO (29) on Linux, and SIGIOT is an alias for SIGABRT (6). Because REGISTER_SIGNAL runs twice for the same numeric signal, the second sigaction() saves the just-installed forwarder into previous_actions[N], so unregister "restores" the forwarder instead of the real prior handler — same handler-clobbering class as the SIGPWR bug. Dropping M(SIGPOLL) here (and M(SIGIOT) from FOR_EACH_POSIX_SIGNAL) would fix it.

Extended reasoning...

What the bug is

On Linux, several signal names are aliases for the same numeric signal. From <asm-generic/signal.h>:

  • SIGIOT is #defined to 6, identical to SIGABRT
  • SIGPOLL is #defined to SIGIO, signal 29

FOR_EACH_POSIX_SIGNAL lists both M(SIGABRT) and M(SIGIOT), and on Linux FOR_EACH_LINUX_ONLY_SIGNAL adds M(SIGPOLL) on top of the M(SIGIO) already in the POSIX list. So FOR_EACH_SIGNAL expands the same numeric signal twice for both 6 and 29.

Code path

REGISTER_SIGNAL expands to:

sigaction(SIG, &sa, &previous_actions[SIG]);

When this runs twice for the same numeric SIG, the first call correctly saves the real prior disposition into previous_actions[SIG] and installs the forwarder. The second call then installs the forwarder again — and writes the currently installed handler (the forwarder it just put there) back into previous_actions[SIG], overwriting the real saved handler.

UNREGISTER_SIGNAL then does sigaction(SIG, &previous_actions[SIG], NULL) twice, both times "restoring" the forwarder. The original disposition is never put back.

Why nothing prevents it

This is the same npm-derived signal list the PR is fixing. npm's JS-level loop is order-insensitive because Node deduplicates by signal number internally, but here the C++ macro expansion calls sigaction() directly, once per name, with no dedup. Unlike the multi-thread race described in the PR body, this self-overwrite is deterministic — it happens on a single thread, on every spawnSync, no concurrency required. crash_handler::reset_on_posix() (called after unregister) only reinstalls SIGSEGV/SIGILL/SIGBUS/SIGFPE, so it does not repair SIGABRT or SIGIO.

Step-by-step proof (SIGIO/SIGPOLL, signal 29)

  1. User does process.on('SIGIO', handler) → libuv installs a handler for signal 29.
  2. Bun.spawnSync(...) → Bun__registerSignalsForForwarding():
    • M(SIGIO): sigaction(29, &forwarder, &previous_actions[29]) → saves libuv handler, installs forwarder.
    • M(SIGPOLL): sigaction(29, &forwarder, &previous_actions[29]) → saves forwarder (overwriting libuv handler), installs forwarder.
  3. Child exits → Bun__unregisterSignalsForForwarding():
    • M(SIGIO): restores previous_actions[29] = forwarder.
    • M(SIGPOLL): restores previous_actions[29] = forwarder again.
    • memset(previous_actions, 0, ...).
  4. Signal 29 now has the forwarder installed (with SA_RESETHAND, and Bun__currentSyncPID == 0). Next SIGIO delivery hits the forwarder, which just stashes it in Bun__pendingSignalToSend and returns — the user's handler never fires — and SA_RESETHAND then drops the disposition to SIG_DFL.

The identical sequence applies to signal 6 via SIGABRT/SIGIOT.
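
The self-overwrite can be reproduced without Bun at all. The sketch below is a simplified model rather than the actual REGISTER_SIGNAL / UNREGISTER_SIGNAL macros: user_handler, forwarder, register_signal, unregister_signal and alias_loses_user_handler are hypothetical names, and the second list entry stands in for SIGPOLL by reusing SIGIO's number.

```cpp
#include <cassert>
#include <cstring>
#include <signal.h>

#if defined(__linux__)
// The aliasing the comment above relies on, checked at compile time.
static_assert(SIGPOLL == SIGIO, "SIGPOLL aliases SIGIO on Linux");
static_assert(SIGIOT == SIGABRT, "SIGIOT aliases SIGABRT on Linux");
#endif

// Hypothetical stand-ins for the user's handler and the forwarder.
static void user_handler(int) {}
static void forwarder(int) {}

static struct sigaction previous_actions[NSIG];

static void register_signal(int sig) {
    assert(sig > 0 && sig < NSIG);
    struct sigaction sa;
    std::memset(&sa, 0, sizeof(sa));
    sa.sa_handler = forwarder;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESETHAND;
    sigaction(sig, &sa, &previous_actions[sig]);
}

static void unregister_signal(int sig) {
    assert(sig > 0 && sig < NSIG);
    sigaction(sig, &previous_actions[sig], nullptr);
}

// Returns true if the forwarder, not user_handler, is installed afterwards.
bool alias_loses_user_handler() {
    const int names[] = { SIGIO, SIGIO }; // second entry models the SIGPOLL alias

    struct sigaction user;
    std::memset(&user, 0, sizeof(user));
    user.sa_handler = user_handler;
    sigemptyset(&user.sa_mask);
    sigaction(SIGIO, &user, nullptr);

    for (int sig : names) register_signal(sig);   // second call saves the forwarder
    for (int sig : names) unregister_signal(sig); // both calls "restore" the forwarder
    std::memset(previous_actions, 0, sizeof(previous_actions));

    struct sigaction current;
    sigaction(SIGIO, nullptr, &current);
    const bool lost = current.sa_handler == forwarder;

    signal(SIGIO, SIG_DFL); // leave no sketch handler behind
    return lost;
}
```

No threads are involved here, which matches the point that this path is deterministic on every spawnSync.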

Impact

Any user-installed SIGABRT or SIGIO handler is permanently lost after a single spawnSync. Far less severe than SIGPWR — no internal Bun/JSC component depends on these the way GC depends on SIGPWR — so this won't crash the process on its own. But it is exactly the same handler-clobbering defect class this PR is eliminating, and M(SIGPOLL) sits one line away from the line being deleted.

Fix

Drop the redundant aliases:

  • Remove M(SIGIOT); from FOR_EACH_POSIX_SIGNAL (it's SIGABRT everywhere).
  • Remove M(SIGPOLL); from FOR_EACH_LINUX_ONLY_SIGNAL (it's SIGIO on Linux).

Alternatively, guard REGISTER_SIGNAL to skip a signal whose previous_actions[SIG].sa_handler is already the forwarder, but de-duplicating the list is simpler and matches the spirit of this PR.
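
If the list is de-duplicated, one way to keep it that way is to expand it into a switch: two names with the same number become duplicate case labels, which the compiler rejects. A sketch under stated assumptions follows; FOR_EACH_DEDUPED_SIGNAL is a hypothetical, shortened list rather than Bun's real one, and forwarded_name and NAME_CASE are illustrative helpers.

```cpp
#include <signal.h>

// Hypothetical, shortened list with the aliases removed: SIGIOT is covered by
// SIGABRT, and SIGPOLL is covered by SIGIO on Linux.
#define FOR_EACH_DEDUPED_SIGNAL(M) \
    M(SIGABRT)                     \
    M(SIGHUP)                      \
    M(SIGINT)                      \
    M(SIGIO)                       \
    M(SIGTERM)

// One `case` per name. Two names sharing a number would be duplicate case
// labels, which is a compile error, so aliases cannot be re-added unnoticed.
const char* forwarded_name(int sig) {
    switch (sig) {
#define NAME_CASE(SIG) case SIG: return #SIG;
        FOR_EACH_DEDUPED_SIGNAL(NAME_CASE)
#undef NAME_CASE
        default: return nullptr;
    }
}
```

Adding M(SIGIOT) back to this list would stop the file from compiling wherever SIGIOT equals SIGABRT, which is a cheaper guard than a runtime check in REGISTER_SIGNAL.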
