Fix crash when starting a compiled executable in a deleted cwd#31496
Conversation
A standalone `bun build --compile` binary bypasses `Arguments::parse` (no global flags, no baked exec-argv), so `absolute_working_dir` stays unset and the first `getcwd` of the entire startup happens inside `Transpiler::init` (FileSystem::init). When the cwd has been deleted that `getcwd` returns ENOENT. The per-VM init hook `init_runtime_state` had no error channel, so it swallowed the error, logged "Transpiler: init failed", and left `vm.transpiler` as the zeroed bytes the allocator handed out. The next read of that field (`configure_defines` -> `run_env_loader` -> `env_mut`) dereferenced a null pointer and the process crashed — the segfault seen when launching a compiled CLI from a directory that had been removed. Widen `init_runtime_state` to return `Result<_, Error>`. On `Transpiler::init` failure it now unwinds the per-VM state it set up (the RuntimeState box, its thread-local slot, and the AST stores) and returns the error, which `VirtualMachine::init` propagates with `?`. The CLI then turns it into a clean "ENOENT" message and exits 1 — the same path the plain `bun` CLI already took, and what the reference (`try Transpiler.init(...)`) intended. Test compiles a standalone binary and runs it from a deleted cwd, asserting a clean ENOENT + exit 1 rather than a crash.
|
Warning Review limit reached
More reviews will be available in 22 minutes and 28 seconds. Learn how PR review limits work. Your organization has run out of usage credits. Purchase more in the billing tab. ⌛ How to resolve this issue?After more reviews become available, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available. Please see our Fair Usage Limits Policy for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: ASSERTIVE Plan: Pro Run ID: 📒 Files selected for processing (1)
WalkthroughThis PR refactors runtime state initialization to propagate errors instead of silently failing. ChangesVM initialization error propagation
🚥 Pre-merge checks | ✅ 4✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Comment |
StatusRoot cause: standalone Fix: hook now returns Gate: verified test fails on the unfixed build (SIGILL, exit 132) and passes with the fix (exit 1) under Waiting on CI. |
|
Updated 12:15 AM PT - May 28th, 2026
❌ @robobun, your commit 0dc9ca6 has 1 failures in
Add 🧪 To try this PR locally: bunx bun-pr 31496That installs a local version of the PR into your bun-31496 --bun |
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/jsc/VirtualMachine.rs`:
- Around line 1656-1663: VirtualMachine::init currently installs
VM/MAIN_THREAD_VM and allocates pre-hook resources (log, console) before calling
the unsafe init_runtime_state hook, and on Err it returns without undoing those
installs; introduce an RAII guard struct (e.g., InitGuard) created immediately
after VM/MAIN_THREAD_VM and pre-hook allocations that on Drop clears the
thread-local/global VM pointers (MAIN_THREAD_VM/VM) and frees/releases the
log/console allocations, then convert the call to init_runtime_state to use
map_err or let-binding so that on Err the guard runs and cleanup happens before
propagating the error; apply the same pattern for the second failing path
referenced (the other init() section around the other hook).
In `@test/bundler/bun-build-compile.test.ts`:
- Around line 563-575: Replace the shell wrapper that runs "/bin/sh -c" (the
Bun.spawn call creating proc with cmd ["/bin/sh", "-c", ...]) with an inline
Bun-based helper: use bunExe() with "-e" to run a short script that does
process.chdir(gone) then removes the directory, then spawn the compiled binary
(outfile) directly (e.g. Bun.spawnSync([outfile]) so the child inherits the
deleted cwd) instead of invoking a shell; update the variables currently used
(proc, bunEnv, outfile, gone) accordingly and remove reliance on
shell-quoting-sensitive command strings.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: bbbff726-114b-48e8-a82b-c4751b3d629d
📒 Files selected for processing (3)
src/jsc/VirtualMachine.rssrc/runtime/jsc_hooks.rstest/bundler/bun-build-compile.test.ts
CI statusBuild #58582: 72 lanes green (including The diff is ~30 lines across two source files (a Pushed one |
The widened `init_runtime_state` makes `VirtualMachine::init` abort via `?` when `Transpiler::init` fails, so a VM with a zeroed `transpiler` never gets returned. `load_preloads` only runs via `reload_entry_point*` on an already-constructed VM, so the `vm.transpiler.fs.is_null()` branch (a temporary workaround for the old swallow-the-error behavior, which referenced the now-removed `TODO(b2)`) is unreachable. Remove it and fix the stale comment.
Review round addressedPushed 81bc75f:
All three review threads resolved. Test still passes under ASAN ( |
The comment above the `match Transpiler::init(...)` still claimed it "returns `Err(Error::TODO)` until those surface, so the `Err` arm below is the live path for now" — stale on two counts: `Transpiler::init` no longer returns `Error::TODO` (it succeeds on the happy path), and this change rewrote that `Err` arm to abort init, so the sentence now misreads as "every VM init aborts". Drop the stale sentence; keep the still-accurate `ptr::write` load-bearing note. Comment-only.
There was a problem hiding this comment.
Both of my earlier nits are addressed and I didn't find any further issues, but since this touches VirtualMachine::init with unsafe pointer/TLS unwind logic on a hot startup path, I'd like a maintainer to sign off on the error-path cleanup ordering and the (well-argued) decision to skip an RAII guard around the pre-hook VM globals.
Extended reasoning...
Overview
This PR widens the RuntimeHooks::init_runtime_state hook signature from -> RuntimeState to -> Result<RuntimeState, bun_core::Error>, rewrites the Transpiler::init Err arm in src/runtime/jsc_hooks.rs to unwind per-VM state (clear RUNTIME_STATE TLS, heap::take the RuntimeState box, Store::deinit() the AST stores) and return Err, and has VirtualMachine::init propagate via ?. It also removes a now-dead fs.is_null() guard in load_preloads, refreshes two stale comments, and adds a POSIX-gated regression test in test/bundler/bun-build-compile.test.ts that compiles a standalone binary and execs it from a deleted cwd.
Security risks
None identified. The change converts a segfault into a clean ENOENT + exit 1. No new inputs, no auth/crypto/permissions surface, no user-controlled data flows into the new code path. The shell wrapper in the test interpolates only tempDir-generated paths, which the harness controls.
Level of scrutiny
High. VirtualMachine::init is on the startup path of every Bun process and every compiled binary. The new Err arm is hand-written unwind logic in unsafe Rust (raw-pointer heap::take, TLS slot manipulation, AST store teardown) that must mirror deinit_runtime_state correctly. The author also made a judgment call — declining CodeRabbit's RAII-guard suggestion for pre-hook VM/MAIN_THREAD_VM/log/console state — on the grounds of Zig-reference parity and the process-fatal nature of init failure. The reasoning is sound and well-cited, but it's the kind of architectural call a maintainer should confirm.
Other factors
- Both of my prior inline nits (stale
load_preloadsguard referencing the deletedTODO(b2), and the stale "Err arm is the live path" comment above the rewritten match) were addressed in 81bc75f and 0dc9ca6; all four review threads on the PR are resolved. - The bug-hunting system found no issues.
- The change resolves an explicit in-tree
TODO(b2): widen init_runtime_state return to Resultand matches the Zig reference'stry Transpiler.init(...)semantics. - Regression test was verified to fail (SIGILL/exit 132) on the unfixed build and pass (exit 1, ENOENT in stderr) with the fix, including under ASAN.
- CI: 72 lanes green; the musl-build red lanes are an unrelated LTO/data-layout toolchain failure ("Linking two modules of different data layouts"), not caused by this ~30-line diff.
The change looks correct to me, but "unsafe manual unwind in VM init" is squarely in the "critical code path" category where I should defer rather than approve unilaterally.
StatusAll four review threads are resolved; both claude[bot] nits fixed (81bc75f, 0dc9ca6), both coderabbit suggestions declined with rationale (and withdrawn by the bot). claude[bot]'s last note requests a maintainer sign-off on the CI build #58597 (latest push) is in progress with no failures so far — the ASAN/musl Rust+C++ compile lanes are green; the final The diff is a ~30-line source change (a |
CI is green except a flaky
|
What
Starting a
bun build --compilestandalone executable from a directory that has been deleted crashes with a segfault instead of a clean error:Reproduction
A process can keep a deleted directory as its cwd until the last fd to it closes, and
getcwd()then returnsENOENT. (POSIX-only — Windows refuses to remove a directory that is a process's cwd.)Cause
A standalone compiled binary bypasses
Arguments::parse(no global flags, no baked exec-argv), soabsolute_working_dirstays unset and the firstgetcwdof the entire startup happens insideTranspiler::init→FileSystem::init(src/resolver/lib.rs:316). When the cwd is gone thatgetcwdreturnsENOENT.The per-VM init hook
init_runtime_state(src/runtime/jsc_hooks.rs) had no error channel, so it swallowed the error, loggedTranspiler: init failed, and leftvm.transpileras the zeroed bytes the allocator handed out. The next read of that field (configure_defines→run_env_loader→Transpiler::env_mut) dereferenced a null pointer → crash.The plain
bun <file>/bun -e/bun runpaths don't hit this:Arguments::parsecallsgetcwdfirst and surfaces the ENOENT cleanly, masking the bug — so only compiled executables crash.Fix
Widen
init_runtime_stateto returnResult<RuntimeState, bun_core::Error>. OnTranspiler::initfailure it unwinds the per-VM state it set up (theRuntimeStatebox, its thread-local slot, and the AST stores — mirroringdeinit_runtime_state) and returns the error, whichVirtualMachine::initpropagates with?. The CLI then turns it into a cleanENOENTmessage and exits 1 — the path the plainbunCLI already used, and what the reference (try Transpiler.init(...), VirtualMachine.zig:1241) intended. This also resolves the in-treeTODO(b2): widen init_runtime_state return to Result.Verification
New test in
test/bundler/bun-build-compile.test.tscompiles a standalone binary and runs it from a deleted cwd.boot_standalone→Transpiler::env_mut, terminates via SIGILL (exit 132).stdoutempty,stderrcontainsENOENT, exit code 1.