Skip to content
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 3 additions & 1 deletion src/jsc/bindings/c-bindings.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -933,10 +933,12 @@
M(SIGIO);

#if OS(LINUX)
// SIGPWR is intentionally excluded: JSC uses it for GC thread suspend/resume
// on Linux (see WTF ThreadingPOSIX.cpp), so replacing its handler here would
// break the collector and can terminate the process with signal 30.

Check notice on line 938 in src/jsc/bindings/c-bindings.cpp

View check run for this annotation

Claude / Claude Code Review

previous_actions[] race remains for other signals; main-thread-only comment now known false

Worth noting as a follow-up: the PR description correctly identifies that `Bun.openInEditor()` runs `sync::spawn` on detached threads, which means the "We only ever use bun.spawnSync on the main thread" comment at line 912 is now known to be false, and the underlying race on `previous_actions[]` + the unconditional `memset` still applies to every other signal in `FOR_EACH_SIGNAL` (e.g. the `SIGINT`/`SIGTERM` → `onExitSignal` handler that restores TTY state on Ctrl-C, or any user `process.on('SIG
Comment on lines +936 to +938

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟣 Worth noting as a follow-up: the PR description correctly identifies that Bun.openInEditor() runs sync::spawn on detached threads, which means the "We only ever use bun.spawnSync on the main thread" comment at line 912 is now known to be false, and the underlying race on previous_actions[] + the unconditional memset still applies to every other signal in FOR_EACH_SIGNAL (e.g. the SIGINT/SIGTERMonExitSignal handler that restores TTY state on Ctrl-C, or any user process.on('SIGHUP'/...) handler, can still be left at SIG_DFL). Removing SIGPWR is the right call regardless and fixes the crash, so this is a pre-existing issue and not a blocker — but it'd be good to at least update the stale comment, and ideally track a follow-up to add locking/refcounting around previous_actions[] or stop openInEditor from calling sync::spawn off the main thread.

Extended reasoning...

What this is

The PR's own root-cause analysis (in the description) establishes two things that the code change doesn't fully address:

  1. The comment at c-bindings.cpp:912// Note: We only ever use bun.spawnSync on the main thread. — is the stated justification for previous_actions[] and Bun__currentSyncPID being unsynchronized process-globals, and it is now known to be false.
  2. The race on previous_actions[] that the description spells out applies to every signal in FOR_EACH_SIGNAL, not just SIGPWR. Removing SIGPWR eliminates the GC-driven crash but leaves the race intact for the remaining ~15 signals.

Code path

  • Bun.openInEditorEditor::open (src/runtime/cli/open.rs:378) does std::thread::Builder::spawn(auto_close) — a detached thread.
  • auto_close() (open.rs:518) calls sync::spawn(...).
  • sync::spawnspawn_posix (src/spawn/process.rs:3133) constructs SignalForwarding::register(), which calls Bun__registerSignalsForForwarding(); its Drop impl calls Bun__unregisterSignalsForForwarding() then crash_handler::reset_on_posix().
  • reset_on_posix() only re-installs the crash-class handlers (SIGSEGV/SIGILL/SIGBUS/SIGFPE), none of which are in FOR_EACH_SIGNAL, so it does not repair damage to SIGINT/SIGTERM/SIGHUP/etc.

So previous_actions[NSIG] and the memset(previous_actions, 0, ...) in Bun__unregisterSignalsForForwarding() are reached from multiple detached threads concurrently with no locking — exactly what the line-912 comment says cannot happen.

Why existing code doesn't prevent it

There is no mutex, refcount, or thread check around Bun__registerSignalsForForwarding / Bun__unregisterSignalsForForwarding. The only "protection" is the comment asserting main-thread-only use, which the PR description itself disproves. The reset_on_posix() call in SignalForwarding::Drop covers a disjoint set of signals from FOR_EACH_SIGNAL, so it doesn't help here.

Step-by-step proof (remaining race)

Take SIGINT as the example. bun_initialize_process() installs onExitSignal (which calls bun_restore_stdio()) for SIGINT/SIGTERM when any stdio is a TTY.

  1. Two concurrent Bun.openInEditor() calls spawn detached threads A and B.
  2. A: registersigaction(SIGINT, forwarder, &previous_actions[SIGINT]) saves onExitSignal into previous_actions[SIGINT].
  3. B: register — saves the forwarder (now the current handler) into previous_actions[SIGINT], overwriting onExitSignal.
  4. A: unregistersigaction(SIGINT, &previous_actions[SIGINT], NULL) installs the forwarder (from step 3), then memset(previous_actions, 0, sizeof(previous_actions)) zeroes the whole array.
  5. B: unregistersigaction(SIGINT, &previous_actions[SIGINT], NULL) installs a zeroed struct sigaction, i.e. sa_handler = 0 = SIG_DFL.

After this settles, SIGINT is at SIG_DFL. The next Ctrl-C terminates the process without running bun_restore_stdio(), leaving the terminal in whatever state Bun (or a child) put it in. The same applies to user-installed process.on('SIGHUP'/'SIGUSR2'/...) handlers.

Impact

Far less severe than the SIGPWR crash this PR fixes — GC fires SIGPWR during normal operation, whereas the remaining signals only matter on external delivery, and "multiple concurrent Bun.openInEditor() calls" is a fuzzer-pathological pattern. The likely worst case is a missed TTY restore on Ctrl-C or a silently dropped user signal handler. So: real, pre-existing, strictly improved by this PR, and reasonable to defer.

Suggested fix / follow-up

  • In this PR (cheap): delete or correct the comment at line 912 — the PR's own analysis directly contradicts it, and leaving it tells future readers no synchronization is needed.
  • Follow-up: either guard previous_actions[] / the memset with a mutex + nesting refcount, or stop Editor::open from calling sync::spawn on a detached thread (use a non-signal-forwarding spawn for the editor process).

#define FOR_EACH_LINUX_ONLY_SIGNAL(M) \
M(SIGPOLL); \
M(SIGPWR); \
M(SIGSTKFLT);

Check notice on line 941 in src/jsc/bindings/c-bindings.cpp

View check run for this annotation

Claude / Claude Code Review

SIGPOLL/SIGIO and SIGIOT/SIGABRT aliases self-overwrite previous_actions[]

Pre-existing, but since this PR is auditing this exact list: `SIGPOLL` is an alias for `SIGIO` (29) on Linux, and `SIGIOT` is an alias for `SIGABRT` (6). Because `REGISTER_SIGNAL` runs twice for the same numeric signal, the second `sigaction()` saves the just-installed forwarder into `previous_actions[N]`, so unregister "restores" the forwarder instead of the real prior handler — same handler-clobbering class as the SIGPWR bug. Dropping `M(SIGPOLL)` here (and `M(SIGIOT)` from `FOR_EACH_POSIX_SIG
Comment on lines 939 to 941

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟣 Pre-existing, but since this PR is auditing this exact list: SIGPOLL is an alias for SIGIO (29) on Linux, and SIGIOT is an alias for SIGABRT (6). Because REGISTER_SIGNAL runs twice for the same numeric signal, the second sigaction() saves the just-installed forwarder into previous_actions[N], so unregister "restores" the forwarder instead of the real prior handler — same handler-clobbering class as the SIGPWR bug. Dropping M(SIGPOLL) here (and M(SIGIOT) from FOR_EACH_POSIX_SIGNAL) would fix it.

Extended reasoning...

What the bug is

On Linux, several signal names are aliases for the same numeric signal. From <asm-generic/signal.h>:

  • SIGIOT is #defined to 6, identical to SIGABRT
  • SIGPOLL is #defined to SIGIO, signal 29

FOR_EACH_POSIX_SIGNAL lists both M(SIGABRT) and M(SIGIOT), and on Linux FOR_EACH_LINUX_ONLY_SIGNAL adds M(SIGPOLL) on top of the M(SIGIO) already in the POSIX list. So FOR_EACH_SIGNAL expands the same numeric signal twice for both 6 and 29.

Code path

REGISTER_SIGNAL expands to:

sigaction(SIG, &sa, &previous_actions[SIG]);

When this runs twice for the same numeric SIG, the first call correctly saves the real prior disposition into previous_actions[SIG] and installs the forwarder. The second call then installs the forwarder again — and writes the currently installed handler (the forwarder it just put there) back into previous_actions[SIG], overwriting the real saved handler.

UNREGISTER_SIGNAL then does sigaction(SIG, &previous_actions[SIG], NULL) twice, both times "restoring" the forwarder. The original disposition is never put back.

Why nothing prevents it

This is the same npm-derived signal list the PR is fixing. npm's JS-level loop is order-insensitive because Node deduplicates by signal number internally, but here the C++ macro expansion calls sigaction() directly, once per name, with no dedup. Unlike the multi-thread race described in the PR body, this self-overwrite is deterministic — it happens on a single thread, on every spawnSync, no concurrency required. crash_handler::reset_on_posix() (called after unregister) only reinstalls SIGSEGV/SIGILL/SIGBUS/SIGFPE, so it does not repair SIGABRT or SIGIO.

Step-by-step proof (SIGIO/SIGPOLL, signal 29)

  1. User does process.on('SIGIO', handler) → libuv installs a handler for signal 29.
  2. Bun.spawnSync(...)Bun__registerSignalsForForwarding():
    • M(SIGIO): sigaction(29, &forwarder, &previous_actions[29]) → saves libuv handler, installs forwarder.
    • M(SIGPOLL): sigaction(29, &forwarder, &previous_actions[29]) → saves forwarder (overwriting libuv handler), installs forwarder.
  3. Child exits → Bun__unregisterSignalsForForwarding():
    • M(SIGIO): restores previous_actions[29] = forwarder.
    • M(SIGPOLL): restores previous_actions[29] = forwarder again.
    • memset(previous_actions, 0, ...).
  4. Signal 29 now has the forwarder installed (with SA_RESETHAND, and Bun__currentSyncPID == 0). Next SIGIO delivery hits the forwarder, which just stashes it in Bun__pendingSignalToSend and returns — the user's handler never fires — and SA_RESETHAND then drops the disposition to SIG_DFL.

The identical sequence applies to signal 6 via SIGABRT/SIGIOT.

Impact

Any user-installed SIGABRT or SIGIO handler is permanently lost after a single spawnSync. Far less severe than SIGPWR — no internal Bun/JSC component depends on these the way GC depends on SIGPWR — so this won't crash the process on its own. But it is exactly the same handler-clobbering defect class this PR is eliminating, and M(SIGPOLL) sits one line away from the line being deleted.

Fix

Drop the redundant aliases:

  • Remove M(SIGIOT); from FOR_EACH_POSIX_SIGNAL (it's SIGABRT everywhere).
  • Remove M(SIGPOLL); from FOR_EACH_LINUX_ONLY_SIGNAL (it's SIGIO on Linux).

Alternatively, guard REGISTER_SIGNAL to skip a signal whose previous_actions[SIG].sa_handler is already the forwarder, but de-duplicating the list is simpler and matches the spirit of this PR.


#endif

Expand Down
33 changes: 33 additions & 0 deletions test/js/bun/util/openInEditor-gc.test.ts
Original file line number Diff line number Diff line change
@@ -0,0 +1,33 @@
import { expect, test } from "bun:test";
import { bunEnv, bunExe } from "harness";

// Bun.openInEditor spawns a detached thread that runs sync::spawn, which
// installs signal-forwarding handlers. On Linux the forwarding list used to
// include SIGPWR, which JSC uses for GC thread suspend/resume. Overlapping
// register/unregister calls from multiple editor threads could leave the
// SIGPWR disposition at SIG_DFL, so the next GC-driven SIGPWR killed the
// process with signal 30.
test.skipIf(process.platform !== "linux")(
"Bun.openInEditor does not clobber the GC thread-suspend signal handler",
async () => {
const script = `
for (let i = 0; i < 30; i++) {
try { Bun.openInEditor("/tmp/bun-open-in-editor-gc-" + i); } catch {}
}
await Bun.sleep(50);
for (let i = 0; i < 100; i++) {
new Uint8Array(10000);
Bun.gc(true);
}
`;
await using proc = Bun.spawn({
cmd: [bunExe(), "-e", script],
env: { ...bunEnv, PATH: "/nonexistent" },
stdout: "ignore",
stderr: "ignore",
});
const exitCode = await proc.exited;
expect(proc.signalCode).toBeNull();
expect(exitCode).toBe(0);

Check warning on line 31 in test/js/bun/util/openInEditor-gc.test.ts

View check run for this annotation

Claude / Claude Code Review

Test ignores stderr, leaving no diagnostics on unrelated exitCode failure

nit: Consider piping `stderr` (and surfacing it before the `exitCode` assertion) instead of `"ignore"`. Per the repo convention in CLAUDE.md, spawned-process tests should print/assert output before asserting `exitCode` so that an unrelated failure (e.g. a debug assertion or future `openInEditor` regression) doesn't reduce to an opaque "expected 0, received 1" in CI. The `signalCode` assertion already covers the SIGPWR case nicely, but stderr-on-failure would help debug everything else.
Comment on lines +26 to +31

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 nit: Consider piping stderr (and surfacing it before the exitCode assertion) instead of "ignore". Per the repo convention in CLAUDE.md, spawned-process tests should print/assert output before asserting exitCode so that an unrelated failure (e.g. a debug assertion or future openInEditor regression) doesn't reduce to an opaque "expected 0, received 1" in CI. The signalCode assertion already covers the SIGPWR case nicely, but stderr-on-failure would help debug everything else.

Extended reasoning...

What the issue is

The new test spawns a subprocess with stdout: "ignore", stderr: "ignore" and then asserts only on proc.signalCode and exitCode:

await using proc = Bun.spawn({
  cmd: [bunExe(), "-e", script],
  env: { ...bunEnv, PATH: "/nonexistent" },
  stdout: "ignore",
  stderr: "ignore",
});
const exitCode = await proc.exited;
expect(proc.signalCode).toBeNull();
expect(exitCode).toBe(0);

The repo's testing convention (root CLAUDE.md, line ~128) explicitly says: "When spawning processes, tests should expect(stdout).toBe(...) BEFORE expect(exitCode).toBe(0). This gives you a more useful error message on test failure." By discarding both streams, this test loses the diagnostic context that convention is designed to preserve.

How it would manifest

Walk through a concrete failure: suppose a future change introduces a debug-build assertion inside Bun.openInEditor, or the -e script throws an uncaught error before reaching the GC loop. The subprocess writes the assertion/stack trace to stderr and exits with code 1 — not via a signal. In CI you would see:

  1. expect(proc.signalCode).toBeNull() → passes (it exited normally, not via signal).
  2. expect(exitCode).toBe(0) → fails with expected 0, received 1.

That's the entire output. Whoever triages it has to reproduce locally to find out what actually happened, because stderr was sent to /dev/null.

Why existing assertions don't cover it

The expect(proc.signalCode).toBeNull() line is nicely diagnostic for the specific regression this PR fixes — if SIGPWR comes back, you'll see expected null, received "SIGPWR". But it does nothing for non-signal failures, and the exitCode assertion alone carries no payload.

Why "ignore" was probably chosen

It's understandable: the script fires 30 openInEditor calls with PATH=/nonexistent, so stderr is likely full of non-deterministic "editor not found"/ENOENT noise from racing background spawns. Asserting expect(stderr).toBe("") would be flaky.

Suggested fix

Pipe stderr and include it in the failure message without asserting exact equality, e.g.:

stdout: "ignore",
stderr: "pipe",
...
const stderr = await proc.stderr.text();
expect(proc.signalCode).toBeNull();
expect(exitCode, stderr).toBe(0);

(or just stderr: "inherit" so CI logs capture it). This keeps the test tolerant of the expected editor-spawn noise while still giving a future maintainer something to read when exitCode !== 0 for an unrelated reason.

Severity

This is purely a test-quality / convention nit — it does not affect correctness of the fix or the test's ability to catch the SIGPWR regression. Not blocking.

},
);
Loading