Skip to content

Replace HiveArrayFallback's uninit-pointer API with safe construction#30879

Open
Jarred-Sumner wants to merge 1 commit into
mainfrom
claude/hive-safe-pool-api
Open

Replace HiveArrayFallback's uninit-pointer API with safe construction#30879
Jarred-Sumner wants to merge 1 commit into
mainfrom
claude/hive-safe-pool-api

Conversation

@Jarred-Sumner

Copy link
Copy Markdown
Collaborator

What

Deletes HiveArrayFallback::{get, try_get, get_and_see_if_new} — they return a raw *mut T to uninitialized pool memory, which is an unsound shape: forming &mut T over the uninit slot is immediate UB for any T with niche-bearing fields, and an early return between get() and the caller's ptr::write leaves the slot claimed-but-uninit so a later put() drops garbage.

Every caller is migrated to the safe entry points (get_init / claim()) that encode "a used slot is always fully initialized" in the type system:

  • renamer / h2_frame_parser / windows_event_loop: pool.get() + ptr::write(value)pool.get_init(value). The Windows path mirrors the posix event loop, which already used safe construction.
  • install enqueue: build the Task value first (owns no borrow of this), then get_init — removes the borrow conflict the uninit-slot pattern was working around.
  • get_network_task: claim() + forget — the slot stays claimed for the existing put() path, but no uninit *mut T escapes the pool's deprecated API. Documented as the one remaining "uninit pointer across a boundary" site.
  • server request context: invert RequestContext::create(*mut Self) to ::new() -> Self and RequestCtxOps::create_in to ::create() -> Self; call sites use safe Fallback::get_init. The constructor already built the literal by value and ptr::write'd it — get_init does exactly that, with no unsafe at the call site.
  • DevServer::defer_request: restructure into a fallible/self-reentrant phase (runs before any slot is claimed — early returns no longer have a slot to recycle) and an infallible placement-init phase via claim(). Also closes a latent slot leak: the old code only recycled the slot on the explicit None branch, not on the ?-error path.

The remaining hive APIs are get_init, emplace, claim(), put, put_raw — all encode the initialization invariant, none hand out uninit pointers.

Verification

  • cargo check --workspace clean (the deleted methods have zero remaining callers)
  • bun run rust:check-all (all 6 target triples — the Windows path is cfg-gated)
  • bun bd test on serve.test.ts, bake/dev-and-prod, http2, bundler_minify, bun-install
  • 2 separate bughunt sweeps found no logic or memory-safety regressions

@robobun

robobun commented May 16, 2026

Copy link
Copy Markdown
Collaborator
Updated 9:09 PM PT - May 17th, 2026

@Jarred-Sumner, your commit f2e0f2e34dfd1cbea4bd80774d5fc21892a89204 passed in Build #55610! 🎉


🧪   To try this PR locally:

bunx bun-pr 30879

That installs a local version of the PR into your bun-30879 executable, so you can run:

bun-30879 --bun

@coderabbitai

coderabbitai Bot commented May 16, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review

Walkthrough

Removes deprecated two-phase pool-allocation APIs and migrates callers to get_init()/get_owned()/claim()+into_owned() flows; introduces HiveOwned handles; updates enqueue, runTasks, server request-context, DevServer, H2 parser, JS renamer, io, threading, and tests to the new ownership/initialization patterns.

Changes

Pool API safety migration

Layer / File(s) Summary
Remove deprecated HiveArray/Fallback unsafe APIs and update safety docs
src/collections/hive_array.rs
Deleted deprecated HiveArray::get() and legacy Fallback get-family methods; added HiveSlot::into_owned, HiveOwned, Fallback::get_owned, updated safety docs and unit tests to use get_init()/emplace()/claim() only.
UnboundedQueue/Batch ownership helpers
src/threading/unbounded_queue.rs
Added Batch::next_owned (unsafe) and UnboundedQueue::push_owned to transfer pool ownership using HiveOwned<T> without dropping the handle.
Task type and request payload updates
src/install/PackageManagerTask.rs, src/install/TarballStream.rs, src/install/lib.rs
Removed lifetime parameter from Task; converted request payloads to own HiveOwned<NetworkTask>; updated intrusive link types, callback accessors, and resolve task enqueue to use non-generic Task.
Task enqueue refactoring to construct-then-init pattern
src/install/PackageManager/PackageManagerEnqueue.rs
Refactored enqueue/init flows (e.g., enqueue_parse_npm_package, init_extract_task, enqueue_git_checkout) to build full Task values then call preallocated_resolve_tasks.get_init(value); signatures updated to accept HiveOwned<NetworkTask> and removed explicit 'static lifetimes.
runTasks: iterate owned slots and return HiveOwned from get_network_task
src/install/PackageManager/runTasks.rs
Drain loop now uses next_owned() and passes HiveOwned<NetworkTask> handles into enqueue APIs; get_network_task now returns HiveOwned<NetworkTask> (claim+into_owned). Pointer captures and error callbacks read via as_ref/as_ptr.
PopulateManifestCache: use HiveOwned handle
src/install/PackageManager/PopulateManifestCache.rs
start_manifest_task stores the claimed HiveOwned<NetworkTask> and derives net_ptr via as_ptr() before initialization/enqueue.
Store::get_init and FilePoll init
src/io/windows_event_loop.rs
Added Store::get_init delegating to HiveArray::get_init; refactored FilePoll::init_with_owner to construct a FilePoll value then call get_init.
NumberScope allocation refactoring in renamer
src/js_printer/renamer.rs
Replaced manual get() + unsafe write(...) with number_scope_pool.get_init(...) in assign_names_recursive flows and updated safety/cleanup comments.
H2FrameParser pool initialization with get_init
src/runtime/api/bun/h2_frame_parser.rs
Changed pooled-path construction to use pool.get_init(init).as_ptr() and updated related comments.
DevServer::defer_request refactor for slot lifecycle safety
src/runtime/bake/DevServer.rs
Refactored defer_request to perform fallible work before reserving a deferred-request slot; claim+init and link node only on success, removing early-slot claims and manual recycling on error paths.
RequestContext migration from in-place create to returning new()
src/runtime/server/RequestContext.rs
Replaced placement create(this: &mut MaybeUninit) with RequestContext::new(...) -> Self; callers now initialize pool slots via Fallback::get_init(...).
Server module and RequestCtxOps pool initialization updates
src/runtime/server/mod.rs, src/runtime/server/server_body.rs
Updated RequestCtxOps to return Self from create(...); refactored prepare_js_request_context and websocket/context fallback paths to use Fallback::get_init(...) with the new factory API.
Small tests and pointer adjustments
src/collections/*, src/install/NetworkTask.rs
Minor test signature simplifications, turbofish init forms, and removal of explicit 'static from NetworkTask.streaming_extract_task and related private type aliases.

Possibly related PRs:

  • oven-sh/bun#30901: Yes—both PRs change src/js_printer/renamer.rs to allocate NumberScope via number_scope_pool.get_init(...) instead of the deprecated HiveArrayFallback::get + unsafe write pattern.
🚥 Pre-merge checks | ✅ 4
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly and accurately describes the main change: replacing unsafe uninitialized-pointer APIs with safe construction in HiveArrayFallback.
Description check ✅ Passed The description is thorough, detailing what was removed, why (unsoundness), how each call site was migrated, and verification steps performed. It matches the template structure with 'What' and verification content.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/collections/hive_array.rs (1)

679-709: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Update the in-file tests to the new pool API.

This test module still uses the removed HiveArray::get() API, and Line 696 resets a.used with IntegerBitSet::init_empty() even though the field is now HiveBitSet<CAPACITY>. As written, cargo test for this file should no longer compile after the API removal.

Suggested migration
-            let b = a.get().unwrap();
-            // SAFETY: `b` points into `a.buffer` and was just unpoisoned by `get()`.
-            unsafe { *b = 0 };
-            assert!(a.get().unwrap() != b);
+            let b = a.get_init(0).unwrap().as_ptr();
+            assert!(a.get_init(0).unwrap().as_ptr() != b);
             assert_eq!(a.index_of(b), Some(0));
             // SAFETY: `b` is a fully-initialized hive slot.
             assert!(unsafe { a.put(b) });
-            assert!(a.get().unwrap() == b);
-            let c = a.get().unwrap();
-            // SAFETY: `c` points into `a.buffer` and was just unpoisoned by `get()`.
-            unsafe { *c = 123 };
+            assert!(a.get_init(0).unwrap().as_ptr() == b);
+            let c = a.get_init(123).unwrap().as_ptr();
             let mut d: Int = 12345;
             // SAFETY: `&mut d` is foreign — `put` returns `false` and drops nothing.
             assert!(unsafe { a.put(&mut d) } == false);
             assert!(a.r#in(&d) == false);
         }

-        a.used = IntegerBitSet::init_empty();
+        a.used = HiveBitSet::init_empty();
         {
             for i in 0..SIZE {
-                let b = a.get().unwrap();
-                // SAFETY: `b` points into `a.buffer` and was just unpoisoned by `get()`.
-                unsafe { *b = 0 };
+                let b = a.get_init(0).unwrap().as_ptr();
                 assert_eq!(a.index_of(b), Some(u32::try_from(i).expect("int cast")));
                 // SAFETY: `b` is a fully-initialized hive slot.
                 assert!(unsafe { a.put(b) });
-                assert!(a.get().unwrap() == b);
+                assert!(a.get_init(0).unwrap().as_ptr() == b);
             }
             for _ in 0..SIZE {
-                assert!(a.get().is_none());
+                assert!(a.get_init(0).is_none());
             }
         }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/collections/hive_array.rs` around lines 679 - 709, Tests still call the
removed HiveArray::get() and reset a.used with the old
IntegerBitSet::init_empty(); update the tests to use the new pool API (replace
every a.get().unwrap() / a.get().is_none() with the new allocator/borrow method
introduced by the pool refactor — e.g., acquire()/borrow()/alloc()/pop() as
appropriate for your change) and update the used reset to construct the new
bitset type (replace a.used = IntegerBitSet::init_empty() with a.used =
HiveBitSet::<CAPACITY>::init_empty() or the correct HiveBitSet constructor).
Keep the rest of the assertions (index_of, put, r#in) but target the new
get-equivalent return type when unwrapping and checking pointer equality.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/collections/hive_array.rs`:
- Around line 573-575: The doc for Fallback::put currently implies callers may
pass a raw pointer obtained directly from claim(), which is unsafe because
claim() yields a HiveSlot (an uninitialized slot) not a valid *mut T; update the
safety contract and docstring for pub unsafe fn put(&mut self, value: *mut T) in
Fallback to require the pointer be a valid, initialized T (e.g., produced by
calling claim().write(...) or by calling assume_init() on the slot) and
explicitly mention HiveSlot and the methods claim().write(...) and assume_init()
as the correct way to initialize the slot before calling put; make the wording
unambiguous that a bare claimed but uninitialized slot must not be passed to
put.

In `@src/install/PackageManager/runTasks.rs`:
- Around line 1718-1722: get_network_task currently permanently claims a slot
(via this.preallocated_network_tasks.claim()) and forgets it immediately, which
leaks the slot and leaves entries in network_dedupe_map if subsequent fallible
initialization (e.g., callers for_manifest(...) and for_tarball(...)) fails;
change get_network_task to return an RAII guard/token (e.g., ClaimGuard) that
holds the slot and implements Drop to release it, or return the slot wrapper
instead of forgetting it, then have callers take ownership and only call
mem::forget / finalize the claim after successful NetworkTask initialization (or
explicitly call a release on the guard on error) so that failed initializations
automatically roll back the claim and remove any dedupe map entry.

---

Outside diff comments:
In `@src/collections/hive_array.rs`:
- Around line 679-709: Tests still call the removed HiveArray::get() and reset
a.used with the old IntegerBitSet::init_empty(); update the tests to use the new
pool API (replace every a.get().unwrap() / a.get().is_none() with the new
allocator/borrow method introduced by the pool refactor — e.g.,
acquire()/borrow()/alloc()/pop() as appropriate for your change) and update the
used reset to construct the new bitset type (replace a.used =
IntegerBitSet::init_empty() with a.used = HiveBitSet::<CAPACITY>::init_empty()
or the correct HiveBitSet constructor). Keep the rest of the assertions
(index_of, put, r#in) but target the new get-equivalent return type when
unwrapping and checking pointer equality.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 341e5e8a-642f-465c-956c-cfbac0d45388

📥 Commits

Reviewing files that changed from the base of the PR and between 2a3d0e7 and ff6d7e8.

📒 Files selected for processing (10)
  • src/collections/hive_array.rs
  • src/install/PackageManager/PackageManagerEnqueue.rs
  • src/install/PackageManager/runTasks.rs
  • src/io/windows_event_loop.rs
  • src/js_printer/renamer.rs
  • src/runtime/api/bun/h2_frame_parser.rs
  • src/runtime/bake/DevServer.rs
  • src/runtime/server/RequestContext.rs
  • src/runtime/server/mod.rs
  • src/runtime/server/server_body.rs

Comment thread src/collections/hive_array.rs Outdated
Comment thread src/install/PackageManager/runTasks.rs Outdated
Comment thread src/collections/hive_array.rs Outdated
Comment thread src/collections/hive_array.rs Outdated
Comment thread src/runtime/server/RequestContext.rs

@claude claude Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for addressing the test/doc/ctx_log feedback in 9de69f5 — all three look resolved. I'm still deferring to a human here given the scope: this restructures unsafe pool-slot lifecycle across the request-context hot path, DevServer::defer_request, and the install task pools, which is more than I'm comfortable auto-approving.

Extended reasoning...

Overview

This PR deletes HiveArray::get() and Fallback::{get, try_get, get_and_see_if_new} (the uninit-*mut T family) and migrates 12 files to get_init / claim(). Touched surfaces: the core pool collection, the HTTP RequestContext constructor (inverted from in-place create(&mut MaybeUninit<Self>) to by-value new() -> Self), server_body.rs / mod.rs request-context allocation sites, a ~130-line two-phase restructure of DevServer::defer_request, three install-enqueue task constructors, get_network_task, the Windows event-loop FilePoll store, the renamer's NumberScope pool, and the H2 frame-parser pool. A follow-up commit (9de69f5) migrated the #[cfg(test)] hive_array test, fixed stale intra-doc links, and re-added the ctx_log!("create ({:p})") traces at all three get_init call sites.

Security risks

No new attack surface — this is an internal memory-safety refactor. The risk class is correctness/UB rather than security: a mistake in the DevServer::defer_request two-phase init or the RequestContext constructor inversion could produce use-of-uninit or double-free under load, but the change is strictly removing an unsound API shape, not adding one.

Level of scrutiny

High. This is hot-path unsafe Rust: per-request pool allocation for Bun.serve, the DevServer deferred-request list (with self-referential abort-callback registration against a slot address taken before the slot is fully written), and the install network/resolve task pools. The defer_request restructure in particular reorders fallible work relative to slot claim and rewires raw addr_of_mut! field placement — correct by my reading, but exactly the kind of change that benefits from a second pair of eyes.

Other factors

  • All three of my prior inline comments (broken #[cfg(test)] build, stale Self::get intra-doc links, dropped ctx_log! create trace) are resolved in 9de69f5.
  • CodeRabbit's doc nit on Fallback::put is also resolved; its remaining comment on get_network_task rollback describes pre-existing behavior (the old get() had the same leak-on-? shape) and the PR explicitly documents it as the one remaining TODO site.
  • The bug-hunting system found nothing on this revision.
  • Buildkite reported failures on the earlier commit ff6d7e8; status on 9de69f5 isn't visible in the timeline.
  • Net: the change looks sound, but the combination of breadth (12 files), depth (unsafe slot-lifecycle invariants), and hot-path criticality puts it outside what I'll auto-approve.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/install/PackageManagerTask.rs`:
- Around line 24-27: Task exposes a public Request/NonNull<NetworkTask>
invariant that can be violated; hide the raw-pointer invariant by making the
Request union and any payload types non-public (pub(crate)) or by adding an
unsafe constructor on Task that requires the caller to supply a validated
NonNull<NetworkTask> with documentation of the lifetime/uniqueness contract;
update callers to use the new crate-private types or the unsafe Task::new(...)
initializer and ensure Task::callback still dereferences via as_mut() under that
invariant.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: a97f62dd-4356-4762-99da-c920e6bb7aaf

📥 Commits

Reviewing files that changed from the base of the PR and between ff6d7e8 and 04adbb8.

📒 Files selected for processing (12)
  • src/collections/bit_set.rs
  • src/collections/hive_array.rs
  • src/collections/linear_fifo.rs
  • src/install/NetworkTask.rs
  • src/install/PackageManager.rs
  • src/install/PackageManager/PackageManagerEnqueue.rs
  • src/install/PackageManager/runTasks.rs
  • src/install/PackageManagerTask.rs
  • src/install/TarballStream.rs
  • src/install/lib.rs
  • src/runtime/server/mod.rs
  • src/runtime/server/server_body.rs

Comment thread src/install/PackageManagerTask.rs
Comment thread src/runtime/server/mod.rs Outdated
Comment thread src/install/PackageManager/runTasks.rs Outdated

@claude claude Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for addressing all the earlier feedback (test migration, doc links, ctx_log! restoration, and the REQUEST_CONTEXT_POOL_CAPACITY constant) — I didn't find any remaining issues. Deferring to a human reviewer given the scope: this introduces a new HiveOwned abstraction and restructures pool-slot lifecycle across the server request path, install task queue, and DevServer::defer_request, which is more than I'm comfortable auto-approving.

Extended reasoning...

Overview

This PR deletes the unsound HiveArrayFallback::{get, try_get, get_and_see_if_new} API (which returned *mut T to uninitialized pool memory) and migrates all 19 call sites to safe construction via get_init / claim(). It also introduces a new HiveOwned<T> sealed handle type, removes the 'a lifetime parameter from install::Task, inverts RequestContext::create(*mut Self) to ::new() -> Self, and restructures DevServer::defer_request into a fallible-first / claim-second shape that closes a latent slot leak on the ?-error path.

Security risks

None identified. This is a memory-safety hardening refactor; it removes UB hazards rather than introducing attack surface. No auth, crypto, network parsing, or user-input handling is touched.

Level of scrutiny

High. While the intent (encode "used slot ⇒ initialized" in the type system) is sound and the per-site migrations are mostly mechanical, several changes are not:

  • DevServer::defer_request reorders fallible work (prepare_and_save_js_request_context) relative to slot claiming and abort-callback registration — a control-flow change in a path with self-referential pointers and weak-ref bookkeeping.
  • The new HiveOwned type and assume_owned escape hatch establish a new ownership convention that future code will build on.
  • Task<'a>Task plus HiveOwned<NetworkTask> changes how the install task queue threads pool ownership across threads.
  • The server request-context path (prepare_js_request_context_for, on_upgrade) is the hottest allocation path in Bun.serve.

Other factors

All five of my prior inline comments (🔴 test compilation break from deleted HiveArray::get, 🟡 stale intra-doc links, 🟡 dropped ctx_log!("create") trace, 🟡 hardcoded 2048 capacity literals, 🟣 pre-existing leak note) have been addressed in follow-up commits 4a91b9c–ff5aec1a. The bug-hunting system found nothing on this revision. The CI failures in the robobun comment (unused import in expr.rs, ambiguous glob re-exports in paths/lib.rs, musl build-cpp failures) appear unrelated to this PR's changes. The new hive_owned_lifecycle test exercises the new type. Given the breadth and the hot-path touchpoints, a human should sign off.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

♻️ Duplicate comments (1)
src/install/PackageManager/runTasks.rs (1)

1723-1733: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Reintroduce rollback for claimed NetworkTask slots.

get_network_task() still permanently claims the pool slot before later fallible setup runs. In this file, for_tarball(...) ? can still error after the claim, and HiveOwned is documented here as dropping to a no-op, so that path still strands the slot and leaves the network_dedupe_map entry wedged for the same task_id. Please keep a real claim guard alive until initialization succeeds, and only convert/relinquish ownership once the task is fully set up. As per coding guidelines "Prefer RAII (Drop) over manual cleanup in Rust code".

Also applies to: 1836-1882

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/install/PackageManager/runTasks.rs` around lines 1723 - 1733,
get_network_task currently calls this.preallocated_network_tasks.claim() and
immediately converts the slot into a permanently-held HiveOwned (via
into_owned), which can strand the slot if subsequent fallible setup (e.g.
for_tarball(...) ?) fails; instead keep a temporary claim guard (the raw claim
token returned by preallocated_network_tasks.claim()) alive and only call unsafe
slot.into_owned() after NetworkTask is fully initialized via
NetworkTask::write_init; implement the guard as an RAII Drop that releases the
claim on panic/error so the slot and the network_dedupe_map entry are not
wedged, and apply the same pattern to the other claim sites mentioned (around
lines 1836–1882) so ownership is only relinquished when initialization succeeds.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/collections/hive_array.rs`:
- Around line 469-484: Add the #[must_use] attribute to the HiveOwned<T> type to
warn when a claimed slot token is discarded; update the declaration for the
HiveOwned struct (the current pub struct HiveOwned<T>(NonNull<T>)) so that the
compiler emits a warning if a value returned by functions like get_owned() or
into_owned() is dropped without use.

In `@src/runtime/server/RequestContext.rs`:
- Around line 1309-1324: The constructor RequestContext::new currently accepts a
nullable raw pointer for server but other code assumes server is always present
(see server(), expect("infallible: server bound") and
debug_assert!(self.server.is_some())); change the constructor to enforce
non-null at the boundary by rejecting or converting the pointer to a non-null
type: either change the parameter to accept a NonNull/BackRef directly or call
NonNull::new(server.cast_mut()).map(bun_ptr::BackRef::from).expect("infallible:
server bound") (or panic with a clear message) instead of mapping to Option, and
store the guaranteed BackRef in the server field so downstream debug_asserts and
server() remain valid.

In `@src/runtime/server/server_body.rs`:
- Around line 3123-3157: The newly-claimed pool slot (created via
(*pool).get_init(Ctx::create(...)) and referenced by ctx_slot/ctx) must be held
by an RAII guard until PreparedRequestFor is fully constructed (i.e., until
to_js_for_bake() succeeds); create a small guard type that owns the pool slot
pointer and on Drop runs the existing cleanup logic that releases/deinitializes
the RequestContext (mirroring what manual cleanup would do), instantiate it
immediately after get_init(...) and before calling to_js_for_bake(), and call
guard.dismiss() (or drop it explicitly) when PreparedRequestFor construction
succeeds so the guard only cleans up on the failure path where to_js_for_bake()
returns None. Ensure the guard invokes the same deinit/return-to-pool operations
used elsewhere so pending-request bookkeeping and signal/body refs are properly
released.

---

Duplicate comments:
In `@src/install/PackageManager/runTasks.rs`:
- Around line 1723-1733: get_network_task currently calls
this.preallocated_network_tasks.claim() and immediately converts the slot into a
permanently-held HiveOwned (via into_owned), which can strand the slot if
subsequent fallible setup (e.g. for_tarball(...) ?) fails; instead keep a
temporary claim guard (the raw claim token returned by
preallocated_network_tasks.claim()) alive and only call unsafe slot.into_owned()
after NetworkTask is fully initialized via NetworkTask::write_init; implement
the guard as an RAII Drop that releases the claim on panic/error so the slot and
the network_dedupe_map entry are not wedged, and apply the same pattern to the
other claim sites mentioned (around lines 1836–1882) so ownership is only
relinquished when initialization succeeds.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 6367a5be-5dda-4e74-8434-2e92b72d09a6

📥 Commits

Reviewing files that changed from the base of the PR and between 04adbb8 and e49e914.

📒 Files selected for processing (10)
  • src/collections/hive_array.rs
  • src/collections/lib.rs
  • src/install/PackageManager/PackageManagerEnqueue.rs
  • src/install/PackageManager/PopulateManifestCache.rs
  • src/install/PackageManager/runTasks.rs
  • src/install/PackageManagerTask.rs
  • src/runtime/server/RequestContext.rs
  • src/runtime/server/mod.rs
  • src/runtime/server/server_body.rs
  • src/threading/unbounded_queue.rs

Comment thread src/collections/hive_array.rs
Comment thread src/runtime/server/RequestContext.rs
Comment on lines +3123 to 3157
// Both pools are layout-identical across the `H3` const (it only affects
// associated consts/types, not layout), so the server-owned pool pointer
// can be reinterpreted to `Ctx`'s monomorphization with a safe
// `<*mut _>::cast()`. Construction is the safe `Fallback::get_init`
// (claim + `MaybeUninit::write`): no caller forms `&mut`/`*mut` over
// uninitialized storage, and a panic in `Ctx::create` cannot leave a
// half-initialized `RequestContext` (its drop glue is the heaviest in
// the codebase). The only `unsafe` is the server-owned-pool deref.
let pool: *mut bun_collections::hive_array::Fallback<
Ctx,
{ super::request_context::REQUEST_CONTEXT_POOL_CAPACITY },
> = if Ctx::IS_H3 {
debug_assert!(
!self.h3_request_pool.is_null(),
"H3 request dispatched but h3_request_pool was never allocated (listen() H3 path not taken)"
);
self.h3_request_pool.cast()
} else {
self.request_pool.cast()
};
// SAFETY: `pool` is the live, server-owned request-context pool.
let ctx_slot: *mut Ctx = unsafe {
if Ctx::IS_H3 {
debug_assert!(
!self.h3_request_pool.is_null(),
"H3 request dispatched but h3_request_pool was never allocated (listen() H3 path not taken)"
);
let slot = (*self.h3_request_pool).claim();
Ctx::create_in(
slot.addr().as_ptr().cast(),
self_ptr,
req,
resp,
should_deinit_context,
method,
);
// SAFETY: `create_in` fully initialized the slot via `MaybeUninit::write`.
slot.assume_init().as_ptr().cast()
} else {
let slot = (*self.request_pool).claim();
Ctx::create_in(
slot.addr().as_ptr().cast(),
(*pool)
.get_init(Ctx::create(
self_ptr,
req,
resp,
should_deinit_context,
method,
);
// SAFETY: `create_in` fully initialized the slot via `MaybeUninit::write`.
slot.assume_init().as_ptr().cast()
}
))
.as_ptr()
};
// SAFETY: ctx_slot was just initialized by create_in.
ctx_log!("create<d> ({:p})<r>", ctx_slot);
// SAFETY: `ctx_slot` points at the freshly initialized pool slot.
let ctx = unsafe { &mut *ctx_slot };

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Guard the claimed request context until the Bake JS wrapper succeeds.

This slot is now claimed and fully initialized before the later to_js_for_bake() branch, which can still return None. On that path the RequestContext, leaked Request, signal/body refs, and pending-request bookkeeping are left live, so repeated Bake failures can exhaust the pool and block shutdown. Keep a cleanup guard around the new context until PreparedRequestFor is fully constructed.

As per coding guidelines "Prefer RAII (Drop) over manual cleanup in Rust code".

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/runtime/server/server_body.rs` around lines 3123 - 3157, The
newly-claimed pool slot (created via (*pool).get_init(Ctx::create(...)) and
referenced by ctx_slot/ctx) must be held by an RAII guard until
PreparedRequestFor is fully constructed (i.e., until to_js_for_bake() succeeds);
create a small guard type that owns the pool slot pointer and on Drop runs the
existing cleanup logic that releases/deinitializes the RequestContext (mirroring
what manual cleanup would do), instantiate it immediately after get_init(...)
and before calling to_js_for_bake(), and call guard.dismiss() (or drop it
explicitly) when PreparedRequestFor construction succeeds so the guard only
cleans up on the failure path where to_js_for_bake() returns None. Ensure the
guard invokes the same deinit/return-to-pool operations used elsewhere so
pending-request bookkeeping and signal/body refs are properly released.

Comment thread src/install/PackageManagerTask.rs Outdated
Comment on lines +3141 to 3157
self.request_pool.cast()
};
// SAFETY: `pool` is the live, server-owned request-context pool.
let ctx_slot: *mut Ctx = unsafe {
if Ctx::IS_H3 {
debug_assert!(
!self.h3_request_pool.is_null(),
"H3 request dispatched but h3_request_pool was never allocated (listen() H3 path not taken)"
);
let slot = (*self.h3_request_pool).claim();
Ctx::create_in(
slot.addr().as_ptr().cast(),
self_ptr,
req,
resp,
should_deinit_context,
method,
);
// SAFETY: `create_in` fully initialized the slot via `MaybeUninit::write`.
slot.assume_init().as_ptr().cast()
} else {
let slot = (*self.request_pool).claim();
Ctx::create_in(
slot.addr().as_ptr().cast(),
(*pool)
.get_init(Ctx::create(
self_ptr,
req,
resp,
should_deinit_context,
method,
);
// SAFETY: `create_in` fully initialized the slot via `MaybeUninit::write`.
slot.assume_init().as_ptr().cast()
}
))
.as_ptr()
};
// SAFETY: ctx_slot was just initialized by create_in.
ctx_log!("create<d> ({:p})<r>", ctx_slot);
// SAFETY: `ctx_slot` points at the freshly initialized pool slot.
let ctx = unsafe { &mut *ctx_slot };

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟣 Pre-existing: if to_js_for_bake() returns Err(_) (~line 3285), the function does return None after the pool slot has already been committed by get_init(), leaking the RequestContext slot, the body_hive ref, the signal ref, and the leaked Request object. This PR did not introduce this — the old claim() + create_in() + assume_init() path also permanently committed the slot before to_js_for_bake() ran — so it is not a blocker, but worth tracking (responding to the unresolved CodeRabbit thread above with the corrected severity).

Extended reasoning...

What the issue is

CodeRabbit's unresolved comment (id 3252701937) flags that in prepare_js_request_context_for, the pool slot is now claimed and fully initialized via (*pool).get_init(Ctx::create(...)) (lines 3144–3153) before the later to_js_for_bake() call at ~line 3282, which can return Err(_)return None at line 3285. On that error path the RequestContext slot, the body_hive ref, the signal.pending_activity_ref(), the Box::leak'd request_object, and the pending_requests += 1 bookkeeping are all left live with no cleanup.

That observation is correct. However, CodeRabbit's framing — "This slot is now claimed and fully initialized before…" — implies the PR introduced this. It did not.

Why this is byte-for-byte pre-existing

The pre-PR code at this site (visible in the diff's removed lines) was:

let slot = (*self.request_pool).claim();
Ctx::create_in(slot.addr().as_ptr().cast(), self_ptr, req, resp, ...);
slot.assume_init().as_ptr().cast()

HiveSlot::assume_init() is implemented as let this = ManuallyDrop::new(self); this.slot.cast::<T>() — wrapping the token in ManuallyDrop suppresses HiveSlot::drop, permanently committing the claim. The new code is (*pool).get_init(Ctx::create(...)), where get_init is self.claim().write(value) and HiveSlot::write() is also ManuallyDrop::new(self) + write. Both paths consume the HiveSlot token at exactly the same point in control flow — ~130 lines before to_js_for_bake() runs. Everything between the slot commit and line 3285 (body_hive allocation, signal ref, request_object leak, ctx.arm_on_data(resp), pending_requests += 1) is unchanged by this PR.

Step-by-step proof of equivalence

  1. Before PR: claim() reserves a slot and returns a HiveSlot token. create_in() writes the value via MaybeUninit::write. assume_init() wraps the token in ManuallyDropHiveSlot::drop never runs → slot stays claimed. Control reaches line ~3285.
  2. After PR: Ctx::create() builds the value on the stack. get_init()claim().write(value)write() wraps the token in ManuallyDrop and moves the value in → HiveSlot::drop never runs → slot stays claimed. Control reaches line ~3285.
  3. In both cases, if request_object.to_js_for_bake(global_this) then returns Err(e) (any non-OOM JsError), the match arm does Err(_) => return None. The function returns; the slot stays marked used in the bitset, body_hive is never put, the signal's pending-activity ref is never released, pending_requests is never decremented, and the Box::leak'd Request is never reclaimed.

The PR's only behavioral change in this region is that a panic inside Ctx::create() now happens before any slot is claimed (the value is built on the stack first, then claim() runs). For the Err(_) early-return path that CodeRabbit flagged, behavior is identical.

Why nothing prevents it

There is no RAII guard around ctx_slot between its commit and the PreparedRequestFor construction. The function has multiple side-effecting steps (hive alloc, signal ref, request leak, arm_on_data, pending-request counter) that all run after the slot commit and before the fallible to_js_for_bake(), and none of them are unwound on the Err branch. This is exactly the shape CodeRabbit's suggested fix describes — and it is the same shape the code had before this PR.

Impact

Real but narrow: to_js_for_bake() must return a non-OOM JsError during Bake request setup. Each such failure leaks one 2048-slot-pool entry (or a heap Box once the pool spills), one Body::Value hive slot, one signal ref, and one Request heap allocation, and leaves pending_requests permanently inflated (which can block server shutdown). Repeated failures during dev iteration could exhaust the inline pool and degrade to heap allocation per request.

Suggested follow-up (not for this PR)

CodeRabbit's suggested shape is right: hold a small RAII guard from immediately after get_init() through PreparedRequestFor construction, whose Drop runs the same cleanup sequence as the normal teardown path (ctx.deinit() → pool put, decrement pending_requests, release body_hive/signal). Disarm it on success. That is a separate change with its own blast radius (it touches the request teardown contract) and is appropriately deferred from this refactor PR.

@Jarred-Sumner Jarred-Sumner force-pushed the claude/hive-safe-pool-api branch from 706fb36 to 2219c60 Compare May 17, 2026 21:11

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (1)
src/install/PackageManager/runTasks.rs (1)

1720-1730: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Rollback claimed NetworkTask slots on for_* errors.

Line 1727 converts the claim into a HiveOwned, and the comment at Lines 1723-1725 says that path does not recycle the slot automatically. Line 1879 then does for_tarball(...) ? after taking that handle, so an error leaves the pool slot claimed and the dedupe entry from has_created_network_task() behind. The same helper is now used by start_manifest_task() for for_manifest(...) ?, so the leak/wedge still exists on both fallible init paths. Please keep the claim under an RAII rollback guard until the task is fully initialized and enqueued, or explicitly put() the slot and remove the dedupe entry on error. As per coding guidelines "Prefer RAII (Drop) over manual cleanup in Rust code".

Also applies to: 1833-1880

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/install/PackageManager/runTasks.rs` around lines 1720 - 1730, The claimed
pool slot returned by get_network_task (preallocated_network_tasks.claim() →
into_owned()) must be protected by an RAII rollback guard until
NetworkTask::write_init(...) completes and the task is enqueued; otherwise
errors from for_tarball(...) / for_manifest(...) leave the slot and the
has_created_network_task() dedupe entry leaked. Wrap the claim in a small guard
type (or use an existing scope guard) that on Drop calls the pool put() and
clears the dedupe entry, and dismisses itself when initialization/enqueue
succeeds; alternatively ensure every fallible path after get_network_task (e.g.
in start_manifest_task, for_tarball, for_manifest) explicitly calls put() and
removes the dedupe entry before returning an Err, but prefer the RAII guard
approach per guidelines.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Duplicate comments:
In `@src/install/PackageManager/runTasks.rs`:
- Around line 1720-1730: The claimed pool slot returned by get_network_task
(preallocated_network_tasks.claim() → into_owned()) must be protected by an RAII
rollback guard until NetworkTask::write_init(...) completes and the task is
enqueued; otherwise errors from for_tarball(...) / for_manifest(...) leave the
slot and the has_created_network_task() dedupe entry leaked. Wrap the claim in a
small guard type (or use an existing scope guard) that on Drop calls the pool
put() and clears the dedupe entry, and dismisses itself when
initialization/enqueue succeeds; alternatively ensure every fallible path after
get_network_task (e.g. in start_manifest_task, for_tarball, for_manifest)
explicitly calls put() and removes the dedupe entry before returning an Err, but
prefer the RAII guard approach per guidelines.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 44390edb-3c31-490c-a669-b453ee214a8e

📥 Commits

Reviewing files that changed from the base of the PR and between 706fb36 and 2219c60.

📒 Files selected for processing (13)
  • src/collections/bit_set.rs
  • src/collections/hive_array.rs
  • src/collections/lib.rs
  • src/collections/linear_fifo.rs
  • src/install/NetworkTask.rs
  • src/install/PackageManager.rs
  • src/install/PackageManager/PackageManagerEnqueue.rs
  • src/install/PackageManager/PopulateManifestCache.rs
  • src/install/PackageManager/runTasks.rs
  • src/install/PackageManagerTask.rs
  • src/install/TarballStream.rs
  • src/install/lib.rs
  • src/io/windows_event_loop.rs

Jarred-Sumner added a commit that referenced this pull request May 17, 2026
**ThreadPool::init dead generic.** `bundler::ThreadPool::init` (and its
helper `init_with_pool`) were generic over `V2` because, during the
phased Zig→Rust port, `bundle_v2.rs` carried two `BundleV2` definitions
— the canonical one plus a `bv2_impl` draft module — and both needed to
call `init`. The PORT NOTE on the function explicitly said it should
collapse to `&BundleV2<'_>` once `bv2_impl` was dropped. That module is
gone (`grep -rn "struct BundleV2" src/bundler/` finds exactly one
definition, `bundle_v2.rs:75`), so this collapses the generic, names the
concrete type, and drops the now-stale PORT NOTE. The body was already
storing a type-erased raw pointer, so the monomorphised code is
identical.

**cpuid cfg-gate mismatch.** `perf/hw_timer.rs` defines `struct
CpuidResult` and `fn cpuid()` under `#[cfg(target_arch = "x86_64")]`,
but their only callers live inside a `#[cfg(not(any(target_os = "macos",
target_os = "freebsd")))]` block (macOS/FreeBSD read the boot-time TSC
frequency from `sysctl` instead of probing CPUID). On
`x86_64-apple-darwin` and `x86_64-unknown-freebsd` the helpers were
therefore compiled with no callers, producing `never constructed` /
`never used` warnings. The cfg gates now mirror the callers' conditions.

Verified with `cargo check --workspace --keep-going` (clean), `bun run
rust:check-all` (10/10 targets), and `cargo fmt -p bun_bundler -p
bun_perf --check`. Confirmed the dead-code warnings on
`x86_64-apple-darwin` are present without this change and gone with it.
File set is disjoint from #30879.
Comment thread src/collections/hive_array.rs Outdated
Comment thread src/install/PackageManager/runTasks.rs
Squash of the original 14-commit history, rebased onto main after the
HiveArray UnsafeCell/aliasing rework merged. Both safety hardenings are
kept: the UnsafeCell buffer + HiveBox/box_at/ptr_at/alloc + miri infra
from main, and the HiveOwned/get_owned/into_owned API from this branch.

- Delete HiveArray::get / Fallback::{get, try_get, get_and_see_if_new}
  (and their now-orphaned test). They claim a slot then return *mut T to
  uninitialized storage; an early return between claim and ptr::write
  leaves a claimed-but-uninit slot for a later put() to drop. The
  UnsafeCell buffer change addressed the aliasing hazard but not the
  uninit one. claim()/get_init()/emplace() are the replacements and they
  had no remaining callers.
- Add HiveOwned<T>: sealed (no public ctor), move-only, #[must_use],
  lifetime-erased pool-slot handle for tasks that outlive the claiming
  scope. No Drop (recycle is the explicit Fallback::put step).
- Add Fallback::get_owned(), HiveSlot::into_owned(),
  HiveOwned::from_raw_for_queue() (queue ingress back-door),
  UnboundedQueue::push_owned/pop_owned/next_owned.
- get_owned() takes &self to match the rest of Fallback's interior-
  mutability receivers.
- Migrate Task.network / NetworkTask / DevServer / RequestContext /
  pool-task callers off raw uninit pointers and forged 'static muts.
- Migrate hive_array tests to claim()/get_init() and add a
  hive_owned_lifecycle test (passes under cargo miri -Zmiri-tree-borrows).
@Jarred-Sumner Jarred-Sumner force-pushed the claude/hive-safe-pool-api branch from f80f1ff to f2e0f2e Compare May 17, 2026 23:51

@claude claude Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All prior review nits are now addressed and this pass found nothing new — deferring to a human given the scope (new HiveOwned abstraction, Task<'a> lifetime removal, DevServer::defer_request control-flow restructure, and the server create→new reshape across 18 files).

Extended reasoning...

Overview

This PR deletes the unsound HiveArrayFallback::{get, try_get, get_and_see_if_new} uninit-pointer API and migrates every caller to safe construction. It touches 18 files across five subsystems: the hive_array collection itself (deletes deprecated entry points, adds HiveSlot::into_owned and a new HiveOwned<T> lifetime-erased handle type, adds Fallback::get_owned, rewrites tests), the install package manager (Task<'a>Task with HiveOwned<NetworkTask> payloads, enqueue/runTasks/PopulateManifestCache reshaped to build-then-get_init, get_network_task now returns HiveOwned), the HTTP server (RequestContext::create(&mut MaybeUninit)::new() -> Self, RequestCtxOps::create_in::create() -> Self, three call sites switched to get_init with a new REQUEST_CONTEXT_POOL_CAPACITY const), DevServer::defer_request (split into a fallible pre-claim phase and an infallible placement-init phase, closing a latent slot leak on the ? path), and UnboundedQueue (new push_owned/next_owned for round-tripping HiveOwned through the intrusive queue). Plus mechanical get_init migrations in renamer/h2_frame_parser/windows_event_loop and a sweep of stale get() doc references.

Security risks

None identified. This is an internal memory-safety refactor of pool-slot allocation; it does not touch auth, crypto, network parsing, or user-input handling. The change reduces unsafety surface by removing an API shape that handed out *mut T to uninitialized memory.

Level of scrutiny

High. The diff is large (~700 net lines across 18 files) and sits squarely in memory-safety-critical and hot-path code: the RequestContext pool is on every HTTP request, the install task pools are on every bun install, and UnboundedQueue is the cross-thread task channel. The PR introduces a new public abstraction (HiveOwned<T> with a #[doc(hidden)] from_raw_for_queue back-door), removes a lifetime parameter from a widely-used struct, and restructures control flow in DevServer::defer_request in a way that changes when side effects (on_aborted registration, server_handler.ctx.ref_()) run relative to slot claiming. These are exactly the kind of design and ordering decisions that benefit from a human reviewer who knows the surrounding invariants.

Other factors

Over multiple bot-review rounds the author has addressed every finding I raised (test-cfg build break, broken intra-doc links, dropped ctx_log! create trace, hardcoded 2048 capacity, unused NonNull import, into_owned Safety-doc wording, install-crate stale get() references) plus the actionable CodeRabbit items (#[must_use] on HiveOwned, NonNull server param, put() Safety wording). The one remaining unresolved CodeRabbit thread (server_body.rs to_js_for_bake leak) I previously analysed as byte-for-byte pre-existing, not introduced here. CI shows one Windows build-rust job failure that is an agent-provisioning error, not a compile failure. No new issues surfaced on this pass — the PR looks clean from the automated side, but its scope and design surface put it well outside the auto-approval bar.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants