14 changes: 7 additions & 7 deletions src/jsc/ModuleLoader.rs
@@ -57,14 +57,14 @@ impl ModuleLoader {
/// `VirtualMachine`, so passing both would alias (PORTING.md §Forbidden).
/// Access `module_loader` through `jsc_vm` instead.
pub fn reset_arena(jsc_vm: &mut VirtualMachine) {
// Spec ModuleLoader.zig:24-29: `if (smol) reset() else
// reset(.{.retain_with_limit = 8M})`. The port collapses both arms to
// `reset()` — `MimallocArena` is not a bump allocator, so there is no
// capacity to retain (see `MimallocArena::reset_retain_with_limit`
// PORT NOTE); mimalloc's per-thread segment cache already provides the
// warm-page reuse Zig's `.retain_with_limit` was after.
// Single owner of the reset — `transpile_source_code` only parks the Box.
let smol = jsc_vm.smol;
if let Some(arena) = jsc_vm.module_loader.transpile_source_code_arena.as_mut() {
arena.reset();
if smol {
arena.reset();
} else {
arena.reset_retain_with_limit(8 * 1024 * 1024);
}
}
}
}
10 changes: 5 additions & 5 deletions src/jsc/ZigException.rs
@@ -6,7 +6,6 @@ use crate::schema_api as api;
use bun_core::String;
use bun_url::URL as ZigURL;

use crate::module_loader::ModuleLoader;
use crate::virtual_machine::VirtualMachine;
use crate::{
Exception, JSErrorCode, JSGlobalObject, JSRuntimeType, JSValue, ZigStackFrame, ZigStackTrace,
@@ -181,10 +180,11 @@ impl Holder {
self.loaded = false;
}
if self.need_to_clear_parser_arena_on_deinit {
// PORT NOTE: reshaped for borrowck — Zig `vm.module_loader.resetArena(vm)`
// would borrow `vm` twice; the Rust port made `reset_arena` an
// associated fn on `ModuleLoader` taking only `&mut VirtualMachine`.
ModuleLoader::reset_arena(vm);
// One-off PrintSource fetch — full-reset; `reset_arena`'s
// retain-with-limit is for the hot per-module transpile loop.
if let Some(arena) = vm.module_loader.transpile_source_code_arena.as_mut() {
arena.reset();
}
}
}

165 changes: 51 additions & 114 deletions src/runtime/jsc_hooks.rs
@@ -1931,6 +1931,50 @@ fn create_if_different(s: &bun_core::String, other: &[u8]) -> bun_core::String {
bun_core::String::clone_utf8(other)
}

/// Takes the per-VM arena, binds `AST_HEAP` to it; `Drop` clears `AST_HEAP`
/// and parks the Box back. `ModuleLoader::reset_arena` owns the reset.
struct ActiveTranspilerArena {
vm: *mut VirtualMachine,
arena: Option<Box<bun_alloc::Arena>>,
}

impl ActiveTranspilerArena {
fn take(vm: *mut VirtualMachine) -> Self {
// SAFETY: `vm` is the live per-thread VM.
let arena = unsafe { (*vm).module_loader.transpile_source_code_arena.take() }
.unwrap_or_else(|| Box::new(bun_alloc::Arena::new()));
bun_alloc::ast_alloc::set_thread_heap(arena.heap_ptr());
Self {
vm,
arena: Some(arena),
}
}

#[inline]
fn arena(&self) -> &bun_alloc::Arena {
self.arena.as_deref().unwrap()
}

fn into_arena_for_async_module(mut self) -> Box<bun_alloc::Arena> {
bun_alloc::ast_alloc::set_thread_heap(core::ptr::null_mut());
self.arena.take().unwrap()
}
}

impl Drop for ActiveTranspilerArena {
fn drop(&mut self) {
let Some(arena) = self.arena.take() else {
return;
};
bun_alloc::ast_alloc::set_thread_heap(core::ptr::null_mut());
// SAFETY: `self.vm` is the live per-thread VM.
let slot = unsafe { &mut (*self.vm).module_loader.transpile_source_code_arena };
if slot.is_none() {
*slot = Some(arena);
}
}
}
Comment on lines +1964 to +1976

🔴 When the slot is occupied, this drops the current frame's arena — but in the re-entrant + outer-ParseError case, that arena owns the source bytes the log spans point into, and process_fetch_log hasn't run yet. The old !give_back branch parked the arena unconditionally (overwriting any occupant) precisely to keep those bytes alive; with give_back_arena removed, the outer arena is now mi_heap_destroy'd here and transpile_file reads freed memory at :4290. Fix: when the slot is occupied, swap (park self.arena, drop the previous occupant) so the current frame's arena always survives for the caller's ArenaResetGuard.

Extended reasoning...

What the bug is

ActiveTranspilerArena::drop parks self.arena back into transpile_source_code_arena only when slot.is_none(); otherwise it lets the Box drop, which is MimallocArena::drop → mi_heap_destroy. The new comment at jsc_hooks.rs:2125-2134 claims this is safe because "the caller's ArenaResetGuard … runs after process_fetch_log, so log spans pointing into arena-owned source bytes stay valid on the ParseError path", but that statement only holds when the arena was actually parked. In the re-entrant case the slot is occupied, so the arena is destroyed instead, and the caller's ArenaResetGuard resets a different arena.

This is a regression introduced by removing the give_back_arena flag. Compare the three versions:

  • Zig spec (ModuleLoader.zig:145-165): on ParseError, give_back_arena = false (set at :293 and :339) makes the entire defer block a no-op — the arena is neither parked nor deinit()'d. It is intentionally leaked so its bytes outlive processFetchLog. The :161-163 destroy branch the new comment cites is only reachable when give_back_arena == true.
  • Old Rust port (removed by this PR): on !give_back, did *slot = Some(arena); return; unconditionally — overwriting any occupant. The deleted comment explicitly documented why: "reads log entries whose spans point into arena-owned source bytes. Freeing here would be a use-after-free."
  • New Rust port: no give_back flag; the same if slot.is_none() { park } else { drop } logic now applies to all paths including ParseError. This matches neither the spec nor the old port on the parse-error path.

Code path that triggers it

transpile_source_code_inner is re-entrant; the spec says so explicitly at ModuleLoader.zig:130 ("This code is potentially re-entrant"). The concrete trigger is macro evaluation during parse: parse_maybe_return_file_only (jsc_hooks.rs:2468) → parser visits a macro call → MacroContext::call → vm.load_macro_entry_point → JSModuleLoader::load_and_evaluate_module → JSC → Bun__transpileFile → transpile_file (creates an ArenaResetGuard at jsc_hooks.rs:4210) → transpile_source_code_inner again on the same VM/thread.

The source bytes that log spans point into do live in this arena: transpile_source_code_inner calls parse_maybe_return_file_only::<false> → parse_maybe_return_file_only_allow_shared_buffer::<_, false> (transpiler.rs:1390), so USE_SHARED_BUFFER=false and the file is read via read_file_with_allocator(..., Some(arena)) (transpiler.rs:1502-1508). The PORT NOTE at transpiler.rs:1528-1533 confirms: "bytes live … in this_parse.arena … bulk-freed by mi_heap_destroy when the per-call arena is recycled".

Why existing code doesn't prevent it

The new comment at the else branch says "drop the fresh Box (spec :161-163)", assuming the dropped Box is always the inner re-entrant frame's freshly-allocated arena. But re-entry is stack-shaped, and the inner frame returns first — it parks its fresh arena (slot was None at that point) and its ArenaResetGuard resets it in place via as_mut() (ModuleLoader.rs:68-75), leaving slot = Some(arena_Y). When the outer frame later returns with a ParseError, its Drop is the one that sees the occupied slot — and the Box it drops is arena_X, the outer frame's source-bearing arena, not a "fresh" one.

Step-by-step proof

  1. Level 0: transpile_file creates _reset_arena_0: ArenaResetGuard (jsc_hooks.rs:4210), calls transpile_source_code_inner. It take()s arena_X from the slot (slot → None), builds arena_guard_0 = ActiveTranspilerArena { arena: Some(arena_X) }, and reads the file into arena_X.
  2. During parse/visit, a macro call evaluates and import()s a module → Bun__transpileFile → level 1 transpile_file creates _reset_arena_1, calls transpile_source_code_inner. Slot is None, so it allocates fresh arena_Y and builds arena_guard_1.
  3. Level 1 completes successfully. arena_guard_1::drop sees slot.is_none() → parks arena_Y (slot → Some(arena_Y)). _reset_arena_1::drop calls reset_arena, which does slot.as_mut().reset_retain_with_limit(...) — slot stays Some(arena_Y).
  4. Level 0's parse continues and hits an error (e.g. a syntax error after the macro call, or the macro itself logged an error so log.errors > 0 at jsc_hooks.rs:2546) → return Err(err!("ParseError")).
  5. Level 0's locals drop. arena_guard_0::drop sees slot = Some(arena_Y), so the if slot.is_none() branch is false → arena_X falls through and is Box::drop'd → mi_heap_destroy(heap_X).
  6. Back in level 0's transpile_file, the Err arm calls process_fetch_log(..., &mut log, ...) at jsc_hooks.rs:4290. This walks log.msgs and calls BuildMessage::create per msg (VirtualMachine.rs:2553-2585), reading Location line-text spans that point into arena_X's now-destroyed heap. UAF.
  7. _reset_arena_0::drop then resets arena_Y — the wrong arena.
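
The ownership hand-offs in steps 1-5 can be replayed with a toy model. This is only a sketch under stated assumptions: `ToyArena`, `park_if_empty`, and `replay_reentrant_parse_error` are hypothetical names, not Bun code, and the "arena" merely records its name when dropped instead of owning a mimalloc heap. It shows which arena the park-only-if-empty policy destroys before the outer caller would read its log.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Toy stand-in for an arena (hypothetical): records its name on drop.
struct ToyArena {
    name: &'static str,
    destroyed: Rc<RefCell<Vec<&'static str>>>,
}

impl Drop for ToyArena {
    fn drop(&mut self) {
        self.destroyed.borrow_mut().push(self.name);
    }
}

/// Mirrors the reviewed policy: park only when the slot is empty,
/// otherwise let the arena fall out of scope and be dropped.
fn park_if_empty(slot: &mut Option<Box<ToyArena>>, arena: Box<ToyArena>) {
    if slot.is_none() {
        *slot = Some(arena);
    }
    // else: `arena` is dropped at the end of this function
}

/// Replays steps 1-5 and returns the arenas destroyed before the outer
/// caller would reach its log-processing step.
fn replay_reentrant_parse_error() -> Vec<&'static str> {
    let destroyed = Rc::new(RefCell::new(Vec::new()));
    let mut slot: Option<Box<ToyArena>> = Some(Box::new(ToyArena {
        name: "arena_X",
        destroyed: destroyed.clone(),
    }));

    // Step 1: level 0 takes arena_X; the slot is now empty.
    let outer = slot.take().unwrap();
    // Step 2: level 1 finds the slot empty and allocates arena_Y.
    let inner = slot.take().unwrap_or_else(|| {
        Box::new(ToyArena {
            name: "arena_Y",
            destroyed: destroyed.clone(),
        })
    });
    // Step 3: level 1 returns first and parks arena_Y.
    park_if_empty(&mut slot, inner);
    // Steps 4-5: level 0 hits a parse error; its guard sees an occupied slot.
    park_if_empty(&mut slot, outer);

    // Snapshot taken where the outer caller would start reading log spans.
    let snapshot = destroyed.borrow().clone();
    snapshot
}

fn main() {
    println!("{:?}", replay_reentrant_parse_error());
}
```

In this model the only arena destroyed before the snapshot is `arena_X`, the one whose bytes the outer frame's log would still reference.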

Impact

Use-after-free on the JS thread when a module that uses a bundle-time macro (or otherwise re-enters the transpiler during parse) subsequently fails to parse. The freed bytes are read to build the user-visible error message, so this would typically manifest as garbage in the diagnostic location/line-text or a crash, depending on whether mimalloc has unmapped the page. The trigger is narrow (macro re-entry + outer parse error) but reachable from user code without any special flags.

Fix

Restore the old !give_back invariant that the current frame's arena always survives for the caller's process_fetch_log + ArenaResetGuard. The minimal change is to swap rather than drop when the slot is occupied:

let slot = unsafe { &mut (*self.vm).module_loader.transpile_source_code_arena };
// Park this frame's arena unconditionally so log spans stay valid for
// process_fetch_log; if a re-entrant frame already parked one, drop *that*
// (it has already been reset by the inner ArenaResetGuard).
let _prev = core::mem::replace(slot, Some(arena));

This preserves the PR's single-owner-reset goal (still no reset here) while matching the old port's parse-error semantics.
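
As a quick check of the swap semantics this fix relies on, the sketch below uses a hypothetical `ToyArena` and `park_unconditionally` helper (neither is Bun code) to show that `core::mem::replace` always leaves the new value in the slot and hands back the previous occupant, which is then dropped when it goes out of scope.

```rust
use core::mem;

/// Toy stand-in for an arena (hypothetical), identified by name only.
#[derive(Debug, PartialEq)]
struct ToyArena(&'static str);

/// Always stores `arena` in the slot and returns whatever was there before.
fn park_unconditionally(
    slot: &mut Option<Box<ToyArena>>,
    arena: Box<ToyArena>,
) -> Option<Box<ToyArena>> {
    mem::replace(slot, Some(arena))
}

fn main() {
    // The inner re-entrant frame already parked (and reset) arena_Y.
    let mut slot = Some(Box::new(ToyArena("arena_Y")));

    // The outer frame parks arena_X; arena_Y is evicted rather than arena_X dropped.
    let evicted = park_unconditionally(&mut slot, Box::new(ToyArena("arena_X")));

    assert_eq!(slot.as_deref(), Some(&ToyArena("arena_X")));
    assert_eq!(evicted.as_deref(), Some(&ToyArena("arena_Y")));
    // `evicted` is dropped here, after the outer arena is safely parked.
}
```

The design point is that the choice of which arena dies becomes explicit: the evicted value is the previous occupant, never the current frame's arena.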


/// `ModuleLoader.transpileSourceCode(...)` — the runtime-transpiler path.
/// Port of `src/jsc/ModuleLoader.zig:85-826`: read file → `Transpiler::parse`
/// → `js_printer::print` → `ResolvedSource`.
@@ -2067,113 +2111,8 @@ fn transpile_source_code_inner(
let (main, main_hash) = unsafe { ((*jsc_vm).main(), (*jsc_vm).main_hash) };
let is_main = main.len() == path.text.len() && main_hash == hash && main == path.text;

// ── Arena take/give-back ────────────────────────────────────────
// Spec :128-165. Reuse the per-VM arena when free; allocate a
// fresh boxed one otherwise. `give_back_arena` is cleared on the
// ParseError / AsyncModule paths (which hand the arena to the
// async queue or leak it intentionally for the caller to inspect).
// SAFETY: per fn contract.
let mut arena: Box<bun_alloc::Arena> =
unsafe { (*jsc_vm).module_loader.transpile_source_code_arena.take() }
.unwrap_or_else(|| Box::new(bun_alloc::Arena::new()));
// Stable heap address (Box interior); survives the move into
// `arena_guard` and into the VM slot on give-back.
let arena_ptr: *const bun_alloc::Arena = &raw const *arena;
// Route `AstAlloc` to `arena`'s `mi_heap_t*` (see the
// `reset_store` note above). `_ast_scope.enter()` already nulled
// `AST_HEAP`; this rebinds it to the heap that the parser scratch
// and printer arena allocations also use.
bun_alloc::ast_alloc::set_thread_heap(arena.heap_ptr());
let mut give_back_arena = true;
// PORT NOTE: reshaped for borrowck — Zig's `defer` block becomes a
// scopeguard so `?`-early-returns still run it.
let mut arena_guard = scopeguard::guard(
(jsc_vm, arena, give_back_arena, args.flags),
|(jsc_vm, mut arena, give_back, flags)| {
// `AST_HEAP` was bound to `arena.heap_ptr()` for this
// transpile; clear it before `reset()` (which is
// `mi_heap_destroy` + `mi_heap_new`) so it never dangles.
// `_ast_scope.exit()` (drops after this guard) restores
// the surrounding scope's heap regardless.
bun_alloc::ast_alloc::set_thread_heap(core::ptr::null_mut());
// SAFETY: `jsc_vm` is the live per-thread VM (closure runs
// on the same thread, before the hook returns).
let slot = unsafe { &mut (*jsc_vm).module_loader.transpile_source_code_arena };
if !give_back {
// Spec :146-165 — when `give_back_arena == false` the
// Zig `defer` is a no-op because ownership was already
// transferred (to the AsyncModule queue, or held past
// `processFetchLog` so log spans pointing into it stay
// valid). The ParseError path that flips
// `give_back=false` is LIVE (not gated): the caller
// (`transpile_file` → `process_fetch_log`, spec
// :1112-1114) reads `log` entries whose spans point
// into arena-owned source bytes. Freeing here would be
// a use-after-free.
//
// PORT NOTE: we can't widen `TranspileExtra` (lower
// tier) to carry the `Box<Arena>` back, so park it in
// the per-VM slot UN-reset. `transpile_file`'s
// `_reset_arena` guard (`ModuleLoader::reset_arena`,
// spec :1083) runs after `process_fetch_log` and
// resets/reclaims it then — matching the spec lifetime.
// TODO(b2-cycle): once AsyncModule un-gates, the
// enqueue site must `ScopeGuard::into_inner` and hand
// the `Box<Arena>` to the queue instead of reaching
// here.
*slot = Some(arena);
return;
}
if slot.is_none() {
if flags != FetchFlags::PrintSource {
// SAFETY: per fn contract — `jsc_vm` is the live
// per-thread VM (closure runs on the same thread,
// before the hook returns).
if unsafe { (*jsc_vm).smol } {
arena.reset();
} else {
// Spec ModuleLoader.zig:155
// `.reset(.{.retain_with_limit = 8M})`.
// See `MimallocArena::reset_retain_with_limit`
// for why this is a no-op-until-limit rather
// than a bump-pointer reset (each fresh
// `mi_heap`'s first alloc pays
// `mi_arena_pages_alloc` → bitmap memset).
//
// PERF NOTE: the over-limit branch of this is
// `MimallocArena::reset()` = `mi_heap_destroy`
// + `mi_heap_new`, and `mi_heap_destroy` is
// the costly half (per-page free-list/bitmap
// teardown, plus `_mi_stats_merge_from`'s
// `mi_stats_t` walk when stats are compiled in).
// Because `AstAlloc::deallocate` is a no-op (the
// AST graph is abandoned, not freed — see the
// `Expr::Data::clone_in` aliasing invariant in
// `ast_alloc.rs`), this heap's footprint only
// *grows* across retained modules, so a tight
// cap means a `mi_heap_destroy` every few
// modules — and `next lint` transpiles a few
// hundred. `mi_heap_collect` can't substitute:
// it only returns *empty* pages, and there are
// none while the dead AST blocks pin them. So
// the lever is the cap: raise it to the spec's
// 8 MB (matching every other
// `reset_retain_with_limit` call site) so the
// common case retains the warm heap and the
// destroy fires ~4× less often. This re-adds the
// ~6 MB anon-rw mid-run footprint that commit
// bfe6056b1e8e shaved off by going to 2 MB —
// accepted: the lint/create-next RSS budget has
// headroom vs the Zig baseline, and the
// per-destroy CPU is the bigger lever.
arena.reset_retain_with_limit(8 * 1024 * 1024);
}
}
*slot = Some(arena);
}
// else: drop the fresh Box (spec :161-163).
},
);
let arena_guard = ActiveTranspilerArena::take(jsc_vm);
let arena_ptr: *const bun_alloc::Arena = &raw const *arena_guard.arena();
// ── Watcher fd / package_json lookup ────────────────────────────
// Spec :170-176.
let mut fd: Option<bun_sys::Fd> = None;
@@ -2304,8 +2243,8 @@ fn transpile_source_code_inner(
let mut input_file_fd = bun_sys::Fd::INVALID;
// Spec :251-256 `defer { if (should_close_input_file_fd and
// input_file_fd != .invalid) input_file_fd.close(); }` — this
// `defer` is unconditional in Zig (independent of `give_back_arena`)
// and must fire on every exit path: parse failure, JSON early
// `defer` is unconditional in Zig and must fire on every exit
// path: parse failure, JSON early
// return, `disable_transpilying`, already_bundled, empty `.cjs`,
// cache-hit, AsyncModule, the wasm recurse, and the print error.
// PORT NOTE: reshaped for borrowck — capture raw pointers so the
@@ -2533,7 +2472,6 @@ fn transpile_source_code_inner(
package_json,
);
}
arena_guard.2 = false; // give_back_arena = false
return Err(bun_core::err!("ParseError"));
};

@@ -2583,7 +2521,6 @@ fn transpile_source_code_inner(

// Spec :338-341.
if unsafe { (*(*jsc_vm).transpiler.log).errors > 0 } {
arena_guard.2 = false;
return Err(bun_core::err!("ParseError"));
}

@@ -2852,8 +2789,8 @@ fn transpile_source_code_inner(
core::mem::forget(fs_cache.reset_shared_buffer(buf));
}

// Hand `arena` ownership to the queue (defuse the give-back guard).
let (_, arena, _, _) = scopeguard::ScopeGuard::into_inner(arena_guard);
// Hand `arena` ownership to the queue.
let arena = arena_guard.into_arena_for_async_module();
// SAFETY: per fn contract — `jsc_vm` / `global_object` are the live
// per-thread VM / global; `package_json` is the opaque watcher
// forward-decl of `bun_resolver::package_json::PackageJSON`.
@@ -2940,7 +2877,7 @@ fn transpile_source_code_inner(
// built `parse_result.ast` from — the printer's
// rope-flattening scratch belongs in it, not in
// the per-VM `transpiler_arena`.
&arena_guard.1,
arena_guard.arena(),
parse_result,
&mut *(*extra).source_code_printer,
bun_js_printer::Format::EsmAscii,
43 changes: 43 additions & 0 deletions test/cli/run/require-cache.test.ts
@@ -335,4 +335,47 @@ describe.concurrent("require.cache", () => {
isWindows ? 60000 : 30000,
);
});

test("synchronous transpile arena retain across many distinct modules", async () => {
const N = 200;
const files: Record<string, string> = {
"entry.cjs": `
const N = ${N};
let mismatches = 0;
for (let i = 0; i < N; i++) {
const m = require("./mod" + i + ".cjs");
if (m.id !== i) mismatches++;
if (m.tag !== "module-number-" + i) mismatches++;
if (m.values.length !== 8 || m.values[7] !== i * 7) mismatches++;
if (m.padded.length !== 4096 + String(i).length) mismatches++;
}
for (let i = 0; i < N; i++) {
const m = require("./mod" + i + ".cjs");
if (m.tag !== "module-number-" + i) mismatches++;
}
console.log(JSON.stringify({ mismatches, rss: process.memoryUsage.rss() }));
`,
};
for (let i = 0; i < N; i++) {
files[`mod${i}.cjs`] =
`// ${Buffer.alloc(2048, 120).toString()}\n` +
`exports.id = ${i};\n` +
`exports.tag = "module-number-" + ${i};\n` +
`exports.values = [${Array.from({ length: 8 }, (_, k) => i * k).join(", ")}];\n` +
`exports.padded = ${JSON.stringify(Buffer.alloc(4096, 46).toString() + i)};\n`;
}
const dir = tempDirWithFiles("transpile-arena-retain", files);

🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick win

Use tempDir for this multi-file fixture instead of tempDirWithFiles.

On Line 378, please switch to tempDir(...) (with the fixture tree as the second argument) so this follows test harness conventions and disposable temp-dir patterns.

As per coding guidelines: “for multi-file tests, create a temporary directory using tempDir from harness.”

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@test/cli/run/require-cache.test.ts` at line 378, Replace the use of
tempDirWithFiles with the harness helper tempDir for the multi-file fixture:
locate the test code that calls tempDirWithFiles (the line creating const dir =
tempDirWithFiles("transpile-arena-retain", files)) and change it to call tempDir
with the fixture tree as the second argument (i.e., call
tempDir("transpile-arena-retain", files)); ensure the variable name (dir) and
the fixture object (files) are passed unchanged so the test continues to use the
disposable temp directory pattern required by the harness.


await using proc = Bun.spawn({
cmd: [bunExe(), "run", join(dir, "entry.cjs")],
env: { ...bunEnv, BUN_RUNTIME_TRANSPILER_CACHE_PATH: "0" },
stdout: "pipe",
stderr: "pipe",
});
const [stdout, stderr, exitCode] = await Promise.all([proc.stdout.text(), proc.stderr.text(), proc.exited]);
expect(stderr).toBe("");

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Filter known ASAN startup noise before asserting empty stderr.

On Line 387, expect(stderr).toBe("") can fail on ASAN runs when the subprocess boots the VM. Please split/filter stderr lines and ignore lines starting with "WARNING: ASAN interferes" before asserting emptiness.

Suggested patch
-    expect(stderr).toBe("");
+    const stderrLines = stderr
+      .split(/\r?\n/)
+      .filter(line => line.length > 0)
+      .filter(line => !line.startsWith("WARNING: ASAN interferes"));
+    expect(stderrLines).toEqual([]);

Based on learnings: when spawning subprocesses with bunEnv, use the repo’s standard ASAN warning-line filter before empty-stderr assertions.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@test/cli/run/require-cache.test.ts` at line 387, The test currently asserts
expect(stderr).toBe("") which can fail due to ASAN startup noise; update the
assertion by splitting stderr into lines, filtering out any lines that start
with "WARNING: ASAN interferes" (and trim whitespace/newlines), then assert the
filtered array is empty (or join back and expect("")). Locate the stderr
variable and the expect(stderr).toBe("") assertion in the test and replace it
with the filtered-lines check so ASAN warning lines are ignored before asserting
no other stderr output.

const out = JSON.parse(stdout.trim());
expect(out.mismatches).toBe(0);
expect(exitCode).toBe(0);
});
});