
resolver: restore package_manager.log after resolve to avoid dangling stack pointer #31020

Merged
Jarred-Sumner merged 4 commits into main from farm/7ccf4dd4/fix-pm-log-dangling
May 19, 2026

Conversation

@robobun

@robobun robobun commented May 19, 2026

Collaborator

What

resolveMaybeNeedsTrailingSlash swaps vm.log / resolver.log to a stack-local Log for the duration of _resolve, then restores them via a drop guard. The Zig original also swaps and restores transpiler.linker.log and resolver.package_manager.log; the Rust port had those behind a TODO(b2-cycle) and only handled vm.log + resolver.log.

When auto-install is enabled and the resolver lazily creates the PackageManager during _resolve, Resolver::get_package_manager seeds pm.log from resolver.log — which at that point is the stack-local Log. Because the restore guard never touched pm.log, it was left pointing into a dead stack frame after the function returned. The next resolve at a different stack depth that routes through the auto-install task runner dereferenced that stale pointer in Log::add_error_fmt, tripping ASAN's stack-use-after-scope (or segfaulting / executing garbage in release builds).
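The sequence above can be modeled in a few lines. This is a minimal sketch, not Bun's code: `Log`, `PackageManager`, `Resolver`, and `resolve_buggy` are hypothetical stand-ins that keep only the pointer plumbing. It compares addresses and never dereferences the stale pointer, so it is safe to run.

```rust
use std::ptr;

// Hypothetical stand-ins for illustration; not Bun's real types.
#[derive(Default)]
struct Log {
    errors: Vec<String>,
}

struct PackageManager {
    log: *mut Log,
}

struct Resolver {
    log: *mut Log,
    package_manager: Option<PackageManager>,
}

impl Resolver {
    // Lazily creates the package manager, seeding its log from whatever
    // resolver.log points at right now.
    fn get_package_manager(&mut self) -> &mut PackageManager {
        let log = self.log;
        self.package_manager
            .get_or_insert_with(|| PackageManager { log })
    }
}

// Models the pre-fix behavior: resolver.log is swapped to a stack-local Log
// and restored, but pm.log is never touched. Returns the address of the
// stack-local Log so the caller can compare against it.
fn resolve_buggy(resolver: &mut Resolver) -> *mut Log {
    let mut stack_log = Log::default();
    let stack_ptr: *mut Log = &mut stack_log;
    let old_log = resolver.log;
    resolver.log = stack_ptr;
    resolver.get_package_manager(); // auto-install creates the PM mid-resolve
    resolver.log = old_log; // restore covers resolver.log only
    stack_ptr
}

fn main() {
    let mut vm_log = Log::default();
    let vm_ptr: *mut Log = &mut vm_log;
    let mut resolver = Resolver { log: vm_ptr, package_manager: None };

    let dead_frame = resolve_buggy(&mut resolver);
    let pm_log = resolver.package_manager.as_ref().unwrap().log;

    // resolver.log was restored, but pm.log still holds the address of a
    // stack slot whose frame has already returned.
    assert!(ptr::eq(resolver.log, vm_ptr));
    assert!(ptr::eq(pm_log, dead_frame));
    assert!(!ptr::eq(pm_log, vm_ptr));
    println!("pm.log is stale: {:p} (vm log is {:p})", pm_log, vm_ptr);
}
```

Any later write through `pm.log` in this model would be the use-after-scope described above.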

Stack at the fault:

#0  bun_ast::Log::add_formatted_msg
#1  bun_ast::Log::add_error_fmt
#2  bun_install::…::run_tasks
#7  bun_install::…::enqueue_dependency_to_root
#9  bun_resolver::Resolver::enqueue_dependency_to_resolve
#14 bun_resolver::Resolver::resolve_and_auto_install
#15 bun_jsc::VirtualMachine::_resolve
#16 bun_jsc::VirtualMachine::resolve_maybe_needs_trailing_slash::<true>

Fix

Swap and restore linker.log and (when present) package_manager.log in both copies of the resolve log guard (VirtualMachine::resolve_maybe_needs_trailing_slash and jsc_hooks::resolve_hook), matching VirtualMachine.zig. The restore re-checks resolver.package_manager at drop time so a PM that was lazily created during _resolve is also pointed back at the VM log.
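The guard shape described here can be sketched as follows. This is an illustration under stated assumptions, not the code in this PR: `Vm`, `Linker`, `RestoreLogs`, and `resolve_with_temp_log` are hypothetical names, and the guard holds a plain `&mut Vm` rather than the raw pointers the real guards use. The point it shows is the re-check of `package_manager` at drop time.

```rust
use std::ptr;

// Hypothetical stand-ins for illustration; not Bun's real types.
#[derive(Default)]
struct Log {
    errors: Vec<String>,
}
struct Linker {
    log: *mut Log,
}
struct PackageManager {
    log: *mut Log,
}
struct Resolver {
    log: *mut Log,
    package_manager: Option<PackageManager>,
}
struct Vm {
    log: *mut Log,
    linker: Linker,
    resolver: Resolver,
}

// Drop guard that points every log alias back at the saved log.
struct RestoreLogs<'a> {
    vm: &'a mut Vm,
    old_log: *mut Log,
}

impl Drop for RestoreLogs<'_> {
    fn drop(&mut self) {
        let old = self.old_log;
        self.vm.log = old;
        self.vm.linker.log = old;
        self.vm.resolver.log = old;
        // Re-check at drop time: the PM may not have existed at swap time.
        if let Some(pm) = self.vm.resolver.package_manager.as_mut() {
            pm.log = old;
        }
    }
}

// Swaps all aliases to a stack-local Log, runs a stand-in for _resolve, and
// returns how many errors that resolve logged.
fn resolve_with_temp_log(vm: &mut Vm) -> usize {
    let mut stack_log = Log::default();
    let stack_ptr: *mut Log = &mut stack_log;
    let old_log = vm.log;
    vm.log = stack_ptr;
    vm.linker.log = stack_ptr;
    vm.resolver.log = stack_ptr;
    if let Some(pm) = vm.resolver.package_manager.as_mut() {
        pm.log = stack_ptr;
    }
    // Declared after stack_log, so the guard drops before stack_log does.
    let mut guard = RestoreLogs { vm, old_log };

    // Stand-in for _resolve: lazily create the PM (seeded from resolver.log)
    // and report a failure through pm.log.
    let seed = guard.vm.resolver.log;
    let pm = guard
        .vm
        .resolver
        .package_manager
        .get_or_insert_with(|| PackageManager { log: seed });
    // SAFETY: pm.log points at stack_log, which is still alive here.
    let log = unsafe { &mut *pm.log };
    log.errors.push("package not found".to_string());
    log.errors.len()
}

fn main() {
    let mut vm_log = Log::default();
    let vm_log_ptr: *mut Log = &mut vm_log;
    let mut vm = Vm {
        log: vm_log_ptr,
        linker: Linker { log: vm_log_ptr },
        resolver: Resolver { log: vm_log_ptr, package_manager: None },
    };

    // First call creates the PM mid-resolve; second call finds it present.
    assert_eq!(resolve_with_temp_log(&mut vm), 1);
    assert_eq!(resolve_with_temp_log(&mut vm), 1);

    let pm = vm.resolver.package_manager.as_ref().expect("PM was created");
    assert!(ptr::eq(pm.log, vm_log_ptr));
    assert!(ptr::eq(vm.linker.log, vm_log_ptr));
    assert!(ptr::eq(vm.resolver.log, vm_log_ptr));
    assert!(ptr::eq(vm.log, vm_log_ptr));
}
```

Doing the `package_manager` check inside `drop` rather than capturing it at construction is what covers the lazily created case: at swap time there is nothing to capture yet.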

Also adds the missing <cassert> include in wtf-bindings.cpp, which stopped being pulled in transitively.

Repro

// run from an empty dir with
// BUN_CONFIG_INSTALL=fallback BUN_CONFIG_REGISTRY=http://127.0.0.1:1
const realm = new ShadowRealm();
const variants = [
  () => realm.importValue("pkg-not-found-a", "x"),
  () => (() => realm.importValue("pkg-not-found-b", "x"))(),
  () => (() => (() => realm.importValue("pkg-not-found-c", "x"))())(),
  () => import("pkg-not-found-f"),
];
for (let i = 0; i < 100; i++)
  for (const v of variants) try { v()?.catch?.(() => {}); } catch {}

Segfaults on main, clean after this change.

Fixes #14432
Fixes #22407

… stack pointer

resolveMaybeNeedsTrailingSlash swaps the resolver log to a stack-local
Log while _resolve runs. If the auto-install path lazily initializes the
PackageManager during that call, pm.log is seeded from resolver.log and
ends up pointing at the stack Log. The Zig original swaps and restores
pm.log (and linker.log) alongside the other log aliases; the Rust port
had a TODO and only restored vm.log / resolver.log, so pm.log was left
pointing into a dead stack frame. A subsequent resolve at a different
stack depth that hits the auto-install task runner dereferences that
stale pointer in Log::add_error_fmt.

Swap/restore linker.log and (if present) package_manager.log in both
copies of the resolve log guard, matching the Zig spec. The restore
re-checks resolver.package_manager at drop time so a PM created during
_resolve is covered too.

Also add the missing <cassert> include in wtf-bindings.cpp now that it
is no longer pulled in transitively.
@robobun

robobun commented May 19, 2026

Collaborator Author
Updated 10:30 PM PT - May 18th, 2026

@robobun, your commit 4583009d060254ca3b4fc8eb10af238fe1984320 passed in Build #55961! 🎉


🧪   To try this PR locally:

bunx bun-pr 31020

That installs a local version of the PR into your bun-31020 executable, so you can run:

bun-31020 --bun

@coderabbitai

coderabbitai Bot commented May 19, 2026

Contributor


No actionable comments were generated in the recent review. 🎉


📥 Commits

Reviewing files that changed from the base of the PR and between 448fad0 and 1d93629.

📒 Files selected for processing (2)
  • src/runtime/jsc_hooks.rs
  • test/js/bun/resolve/resolve-autoinstall-log-dangling.test.ts

Walkthrough

Synchronizes package_manager.log with temporary stack-log swaps/restores in VM and resolve hook paths, adds #include <cassert> for compilation, and adds an integration test that repeatedly triggers failing resolves to ensure no dangling package-manager log pointer remains.

Changes

Dangling Log Pointer Fix

Layer (File(s)): Summary

  • VirtualMachine log-swap initialization and restore (src/jsc/VirtualMachine.rs): The resolve path's log-swap setup now updates linker.log and package_manager.log (if present) to point at the temporary stack log, and the guard's drop phase restores both back to old_log.
  • jsc_hooks log-swap updates (src/runtime/jsc_hooks.rs): transpile_source_code_inner and resolve_hook temporary-log swaps now cast/update transpiler.resolver.package_manager.log when a package manager exists, and restore it in the scopeguard paths.
  • Compilation support, cassert include (src/jsc/bindings/wtf-bindings.cpp): Adds #include <cassert> so assert(...) is declared in the translation unit.
  • Integration test for dangling-log fix (test/js/bun/resolve/resolve-autoinstall-log-dangling.test.ts): New test spawns bun with forced auto-install fallback, executes repeated failing imports at varying JS stack depths (inside a ShadowRealm and other wrappers), forces GC, and asserts a clean exit with no stderr and stdout "ok".

Possibly related PRs

  • oven-sh/bun#30880: Both PRs modify resolver/VM logging swap logic in src/runtime/jsc_hooks.rs and related VM resolve paths to address dangling log-pointer issues.
🚥 Pre-merge checks | ✅ 4 passed

  • Title check: ✅ Passed. The title clearly and concisely describes the main fix: restoring package_manager.log after resolve to avoid a dangling stack pointer.
  • Description check: ✅ Passed. The description thoroughly covers both required sections: 'What' explains the bug and fix with stack traces and details, and 'How' provides a concrete reproducer demonstrating the issue.
  • Linked Issues check: ✅ Passed. Check skipped because no linked issues were found for this pull request.
  • Out of Scope Changes check: ✅ Passed. Check skipped because no linked issues were found for this pull request.


@github-actions

Contributor

Found 2 issues this PR may fix:

  1. auto-install crashes when an error is thrown #14432 - Segfault in Log.addErrorFmt via PackageManager.enqueueDependencyToRoot during auto-install resolve — stack trace shows the exact dangling pm.log pointer path this PR fixes
  2. Bun segmentation fault #22407 - Segfault during PackageManager.runTasks via resolveAndAutoInstall / resolveMaybeNeedsTrailingSlash: the same auto-install resolve path, with a corrupted or dangling pointer

If this is helpful, copy the block below into the PR description to auto-close these issues on merge.

Fixes #14432
Fixes #22407

🤖 Generated with Claude Code

@github-actions

Contributor

This PR may be a duplicate of:

  1. fix: sync PackageManager.log with resolver.log in resolveMaybeNeedsTrailingSlash #28310 - Same fix: syncs PackageManager.log with resolver.log in resolveMaybeNeedsTrailingSlash to prevent dangling stack pointer after resolve returns
  2. fix: prevent segfault in auto-install due to log allocator mismatch #26031 - Same root cause (pm.log pointing at dead stack frame during auto-install) with a different fix strategy (avoids swapping PM log entirely)

🤖 Generated with Claude Code

Comment thread src/runtime/jsc_hooks.rs
Comment thread test/js/bun/resolve/resolve-autoinstall-log-dangling.test.ts Outdated

@claude claude Bot left a comment

Contributor

Both follow-ups from my earlier review are addressed in 1d93629 (third log-swap site in transpile_source_code now matches, and the test uses --install=fallback). No further issues found, but this is unsafe raw-pointer plumbing in the core resolver/VM path that leans on the "sole dyn AutoInstaller impl" invariant for the .cast::<PackageManager>() downcast — worth a human pair of eyes before merge.

Extended reasoning...

Overview

Fixes a stack-use-after-scope in the Rust port of the runtime resolver: resolveMaybeNeedsTrailingSlash / resolve_hook / transpile_source_code swap several log pointers to a stack-local Log for the duration of _resolve, but the restore guards weren't touching transpiler.linker.log or resolver.package_manager.log. When auto-install lazily created the PackageManager mid-resolve, pm.log was seeded from the stack-local Log and never restored, leaving a dangling pointer that the next auto-install task dereferenced. The fix swaps/restores linker.log and pm.log at all three sites to match the Zig original, adds a regression test, and includes a trivial <cassert> header fix.

Security risks

This is memory-safety code: three new unsafe blocks doing pm.cast::<bun_install::PackageManager>().as_ptr() to downcast a NonNull<dyn AutoInstaller> and write through it. The cast's soundness rests on the comment-documented invariant that PackageManager is the sole AutoInstaller impl. If that invariant is correct (and it appears to be — the Zig source does the equivalent unconditionally), the change strictly removes a UAF; but the downcast pattern itself is the kind of thing a maintainer should sign off on. No injection/auth/data-exposure surface.
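The downcast under discussion can be sketched in isolation. The types below are hypothetical stand-ins (the real `AutoInstaller` trait and `PackageManager` are far larger), and `set_installer_log` and `retarget_and_check` are illustrative names, not functions from the PR. The sketch assumes, as the review does, that `PackageManager` is the only type ever stored behind the trait object.

```rust
use std::ptr::NonNull;

// Hypothetical stand-ins for illustration; not Bun's real types.
#[derive(Default)]
struct Log {
    errors: Vec<String>,
}

trait AutoInstaller {
    fn log_ptr(&self) -> *mut Log;
}

struct PackageManager {
    log: *mut Log,
}

impl AutoInstaller for PackageManager {
    fn log_ptr(&self) -> *mut Log {
        self.log
    }
}

/// Points the installer's log at `new_log`.
///
/// # Safety
/// The trait object must really be a `PackageManager` (the "sole impl"
/// invariant), must be live, and must not be borrowed elsewhere.
unsafe fn set_installer_log(installer: NonNull<dyn AutoInstaller>, new_log: *mut Log) {
    // NonNull::cast drops the vtable and keeps the data pointer, which is
    // the address of the concrete value behind the trait object.
    let pm: *mut PackageManager = installer.cast::<PackageManager>().as_ptr();
    unsafe {
        (*pm).log = new_log;
    }
}

// Retargets a heap-allocated installer's log and confirms, through dynamic
// dispatch, that the write landed on the same object.
fn retarget_and_check() -> bool {
    let mut vm_log = Log::default();
    let vm_log_ptr: *mut Log = &mut vm_log;

    let boxed: Box<dyn AutoInstaller> = Box::new(PackageManager {
        log: std::ptr::null_mut(),
    });
    let erased: NonNull<dyn AutoInstaller> =
        NonNull::new(Box::into_raw(boxed)).expect("Box pointers are never null");

    // SAFETY: `erased` was just created from a Box<PackageManager> and
    // nothing else references it.
    unsafe { set_installer_log(erased, vm_log_ptr) };
    let seen = unsafe { erased.as_ref().log_ptr() };

    // SAFETY: reclaim the allocation made by Box::into_raw above.
    drop(unsafe { Box::from_raw(erased.as_ptr()) });
    std::ptr::eq(seen, vm_log_ptr)
}

fn main() {
    assert!(retarget_and_check());
}
```

If a second `AutoInstaller` implementation were ever stored in that slot, the same cast would compile unchanged and write through a pointer of the wrong type, which is why the invariant has to be upheld by review rather than by the compiler.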

Level of scrutiny

High. This is hot-path runtime resolver code in VirtualMachine.rs and jsc_hooks.rs, with hand-rolled drop guards manipulating raw pointers. The fix is conceptually simple (mirror the Zig swap/restore) and well-explained, but "simple unsafe pointer fix in the module loader" is exactly the category that benefits from a human reviewer confirming the NonNull::cast is layout-valid and that re-reading resolver.package_manager at drop time can't race or alias anything else.

Other factors

  • My two earlier inline comments (third unfixed copy in transpile_source_code; dead BUN_CONFIG_INSTALL env var in the test) were both addressed in commit 1d93629, and the current diff reflects those fixes.
  • A regression test is included that exercises the auto-install path at varying stack depths against an unroutable registry; it now uses --install=fallback so it won't silently go inert if the auto default changes.
  • The bug-hunting system found nothing on this revision.
  • github-actions flagged two possibly-duplicate PRs (#28310, #26031) tackling the same root cause — a maintainer should confirm this is the variant they want to land.

@claude claude Bot left a comment

Contributor

Both earlier comments are addressed in 1d93629 and I have nothing further — deferring final sign-off since this is unsafe raw-pointer plumbing in the VM resolver path and Build #55953 is still showing red.

Extended reasoning...

Overview

The PR completes the Zig→Rust port of the log-swap/restore pattern at three sites: VirtualMachine::resolve_maybe_needs_trailing_slash, resolve_hook, and (after my earlier comment) transpile_source_code_inner. Each site now also swaps transpiler.linker.log and resolver.package_manager.log to the stack-local Log and restores them in the drop guard, with the restore re-checking package_manager so a PM lazily created during _resolve is also pointed back at the VM log. There's a one-line <cassert> include fix in wtf-bindings.cpp and a new regression test that hammers failing auto-install resolves at varying stack depths.

Prior feedback

I raised two points on the first revision: (1) the third copy of the swap in transpile_source_code was left behind a stale TODO(blocked_on), and (2) the test relied on a no-op BUN_CONFIG_INSTALL env var. Commit 1d93629 applied the same cast/swap/restore at the third site and switched the test to --install=fallback. Both threads are resolved and the current diff reflects the fixes; the bug-hunting pass on this revision found nothing.

Security risks

None in the conventional sense (no auth/crypto/input-parsing surface). The risk class here is memory safety: the change writes through pm.cast::<bun_install::PackageManager>().as_ptr() inside unsafe blocks, relying on the invariant that the dyn AutoInstaller stored on the resolver is always the concrete PackageManager. I confirmed that PackageManager is the sole AutoInstaller implementation in the tree (src/install/auto_installer.rs), so the downcast is sound today, and the fix strictly removes a stack-use-after-scope rather than adding new lifetime hazards.

Level of scrutiny

Moderate-to-high. The diff is small and mechanical — it mirrors VirtualMachine.zig / ModuleLoader.zig line-for-line and applies the identical 4-line pattern at each site — but it's unsafe raw-pointer plumbing in the core runtime resolver path where a mistake is UB rather than a thrown error. That, plus the still-red Build #55953 on the latest commit, is enough that I'd rather a maintainer give it the final nod than auto-approve.

Other factors

No CODEOWNERS gate on these paths. The added regression test is well-targeted (varying stack depths + unroutable registry to force manager.log_mut().add_error_fmt). CodeRabbit had no actionable comments.

@robobun

robobun commented May 19, 2026

Collaborator Author

CI: the only failure across both builds (#55953, #55961) is darwin-14-aarch64-test-bun with status Expired — the job never ran because no agent was available. All 72 test jobs that actually executed passed, including debian-13-x64-asan-test-bun (the sanitizer suite most relevant to this fix), darwin-14-x64-test-bun, and darwin-26-aarch64-test-bun. This looks like a darwin-14 arm64 agent availability issue rather than anything in this change.

@Jarred-Sumner Jarred-Sumner merged commit 2017304 into main May 19, 2026
77 checks passed
@Jarred-Sumner Jarred-Sumner deleted the farm/7ccf4dd4/fix-pm-log-dangling branch May 19, 2026 07:31

Development

Successfully merging this pull request may close these issues.

  • Bun segmentation fault
  • auto-install crashes when an error is thrown

2 participants