Replace HiveArrayFallback's uninit-pointer API with safe construction#30879
Replace HiveArrayFallback's uninit-pointer API with safe construction#30879Jarred-Sumner wants to merge 1 commit into
Conversation
|
Updated 9:09 PM PT - May 17th, 2026
✅ @Jarred-Sumner, your commit f2e0f2e34dfd1cbea4bd80774d5fc21892a89204 passed in 🧪 To try this PR locally: bunx bun-pr 30879That installs a local version of the PR into your bun-30879 --bun |
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
WalkthroughRemoves deprecated two-phase pool-allocation APIs and migrates callers to get_init()/get_owned()/claim()+into_owned() flows; introduces HiveOwned handles; updates enqueue, runTasks, server request-context, DevServer, H2 parser, JS renamer, io, threading, and tests to the new ownership/initialization patterns. ChangesPool API safety migration
Possibly related PRs:
🚥 Pre-merge checks | ✅ 4✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/collections/hive_array.rs (1)
679-709:⚠️ Potential issue | 🟠 Major | ⚡ Quick winUpdate the in-file tests to the new pool API.
This test module still uses the removed
HiveArray::get()API, and Line 696 resetsa.usedwithIntegerBitSet::init_empty()even though the field is nowHiveBitSet<CAPACITY>. As written,cargo testfor this file should no longer compile after the API removal.Suggested migration
- let b = a.get().unwrap(); - // SAFETY: `b` points into `a.buffer` and was just unpoisoned by `get()`. - unsafe { *b = 0 }; - assert!(a.get().unwrap() != b); + let b = a.get_init(0).unwrap().as_ptr(); + assert!(a.get_init(0).unwrap().as_ptr() != b); assert_eq!(a.index_of(b), Some(0)); // SAFETY: `b` is a fully-initialized hive slot. assert!(unsafe { a.put(b) }); - assert!(a.get().unwrap() == b); - let c = a.get().unwrap(); - // SAFETY: `c` points into `a.buffer` and was just unpoisoned by `get()`. - unsafe { *c = 123 }; + assert!(a.get_init(0).unwrap().as_ptr() == b); + let c = a.get_init(123).unwrap().as_ptr(); let mut d: Int = 12345; // SAFETY: `&mut d` is foreign — `put` returns `false` and drops nothing. assert!(unsafe { a.put(&mut d) } == false); assert!(a.r#in(&d) == false); } - a.used = IntegerBitSet::init_empty(); + a.used = HiveBitSet::init_empty(); { for i in 0..SIZE { - let b = a.get().unwrap(); - // SAFETY: `b` points into `a.buffer` and was just unpoisoned by `get()`. - unsafe { *b = 0 }; + let b = a.get_init(0).unwrap().as_ptr(); assert_eq!(a.index_of(b), Some(u32::try_from(i).expect("int cast"))); // SAFETY: `b` is a fully-initialized hive slot. assert!(unsafe { a.put(b) }); - assert!(a.get().unwrap() == b); + assert!(a.get_init(0).unwrap().as_ptr() == b); } for _ in 0..SIZE { - assert!(a.get().is_none()); + assert!(a.get_init(0).is_none()); } }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/collections/hive_array.rs` around lines 679 - 709, Tests still call the removed HiveArray::get() and reset a.used with the old IntegerBitSet::init_empty(); update the tests to use the new pool API (replace every a.get().unwrap() / a.get().is_none() with the new allocator/borrow method introduced by the pool refactor — e.g., acquire()/borrow()/alloc()/pop() as appropriate for your change) and update the used reset to construct the new bitset type (replace a.used = IntegerBitSet::init_empty() with a.used = HiveBitSet::<CAPACITY>::init_empty() or the correct HiveBitSet constructor). Keep the rest of the assertions (index_of, put, r#in) but target the new get-equivalent return type when unwrapping and checking pointer equality.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/collections/hive_array.rs`:
- Around line 573-575: The doc for Fallback::put currently implies callers may
pass a raw pointer obtained directly from claim(), which is unsafe because
claim() yields a HiveSlot (an uninitialized slot) not a valid *mut T; update the
safety contract and docstring for pub unsafe fn put(&mut self, value: *mut T) in
Fallback to require the pointer be a valid, initialized T (e.g., produced by
calling claim().write(...) or by calling assume_init() on the slot) and
explicitly mention HiveSlot and the methods claim().write(...) and assume_init()
as the correct way to initialize the slot before calling put; make the wording
unambiguous that a bare claimed but uninitialized slot must not be passed to
put.
In `@src/install/PackageManager/runTasks.rs`:
- Around line 1718-1722: get_network_task currently permanently claims a slot
(via this.preallocated_network_tasks.claim()) and forgets it immediately, which
leaks the slot and leaves entries in network_dedupe_map if subsequent fallible
initialization (e.g., callers for_manifest(...) and for_tarball(...)) fails;
change get_network_task to return an RAII guard/token (e.g., ClaimGuard) that
holds the slot and implements Drop to release it, or return the slot wrapper
instead of forgetting it, then have callers take ownership and only call
mem::forget / finalize the claim after successful NetworkTask initialization (or
explicitly call a release on the guard on error) so that failed initializations
automatically roll back the claim and remove any dedupe map entry.
---
Outside diff comments:
In `@src/collections/hive_array.rs`:
- Around line 679-709: Tests still call the removed HiveArray::get() and reset
a.used with the old IntegerBitSet::init_empty(); update the tests to use the new
pool API (replace every a.get().unwrap() / a.get().is_none() with the new
allocator/borrow method introduced by the pool refactor — e.g.,
acquire()/borrow()/alloc()/pop() as appropriate for your change) and update the
used reset to construct the new bitset type (replace a.used =
IntegerBitSet::init_empty() with a.used = HiveBitSet::<CAPACITY>::init_empty()
or the correct HiveBitSet constructor). Keep the rest of the assertions
(index_of, put, r#in) but target the new get-equivalent return type when
unwrapping and checking pointer equality.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 341e5e8a-642f-465c-956c-cfbac0d45388
📒 Files selected for processing (10)
src/collections/hive_array.rssrc/install/PackageManager/PackageManagerEnqueue.rssrc/install/PackageManager/runTasks.rssrc/io/windows_event_loop.rssrc/js_printer/renamer.rssrc/runtime/api/bun/h2_frame_parser.rssrc/runtime/bake/DevServer.rssrc/runtime/server/RequestContext.rssrc/runtime/server/mod.rssrc/runtime/server/server_body.rs
There was a problem hiding this comment.
Thanks for addressing the test/doc/ctx_log feedback in 9de69f5 — all three look resolved. I'm still deferring to a human here given the scope: this restructures unsafe pool-slot lifecycle across the request-context hot path, DevServer::defer_request, and the install task pools, which is more than I'm comfortable auto-approving.
Extended reasoning...
Overview
This PR deletes HiveArray::get() and Fallback::{get, try_get, get_and_see_if_new} (the uninit-*mut T family) and migrates 12 files to get_init / claim(). Touched surfaces: the core pool collection, the HTTP RequestContext constructor (inverted from in-place create(&mut MaybeUninit<Self>) to by-value new() -> Self), server_body.rs / mod.rs request-context allocation sites, a ~130-line two-phase restructure of DevServer::defer_request, three install-enqueue task constructors, get_network_task, the Windows event-loop FilePoll store, the renamer's NumberScope pool, and the H2 frame-parser pool. A follow-up commit (9de69f5) migrated the #[cfg(test)] hive_array test, fixed stale intra-doc links, and re-added the ctx_log!("create ({:p})") traces at all three get_init call sites.
Security risks
No new attack surface — this is an internal memory-safety refactor. The risk class is correctness/UB rather than security: a mistake in the DevServer::defer_request two-phase init or the RequestContext constructor inversion could produce use-of-uninit or double-free under load, but the change is strictly removing an unsound API shape, not adding one.
Level of scrutiny
High. This is hot-path unsafe Rust: per-request pool allocation for Bun.serve, the DevServer deferred-request list (with self-referential abort-callback registration against a slot address taken before the slot is fully written), and the install network/resolve task pools. The defer_request restructure in particular reorders fallible work relative to slot claim and rewires raw addr_of_mut! field placement — correct by my reading, but exactly the kind of change that benefits from a second pair of eyes.
Other factors
- All three of my prior inline comments (broken
#[cfg(test)]build, staleSelf::getintra-doc links, droppedctx_log!create trace) are resolved in 9de69f5. - CodeRabbit's doc nit on
Fallback::putis also resolved; its remaining comment onget_network_taskrollback describes pre-existing behavior (the oldget()had the same leak-on-?shape) and the PR explicitly documents it as the one remaining TODO site. - The bug-hunting system found nothing on this revision.
- Buildkite reported failures on the earlier commit ff6d7e8; status on 9de69f5 isn't visible in the timeline.
- Net: the change looks sound, but the combination of breadth (12 files), depth (unsafe slot-lifecycle invariants), and hot-path criticality puts it outside what I'll auto-approve.
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/install/PackageManagerTask.rs`:
- Around line 24-27: Task exposes a public Request/NonNull<NetworkTask>
invariant that can be violated; hide the raw-pointer invariant by making the
Request union and any payload types non-public (pub(crate)) or by adding an
unsafe constructor on Task that requires the caller to supply a validated
NonNull<NetworkTask> with documentation of the lifetime/uniqueness contract;
update callers to use the new crate-private types or the unsafe Task::new(...)
initializer and ensure Task::callback still dereferences via as_mut() under that
invariant.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: a97f62dd-4356-4762-99da-c920e6bb7aaf
📒 Files selected for processing (12)
src/collections/bit_set.rssrc/collections/hive_array.rssrc/collections/linear_fifo.rssrc/install/NetworkTask.rssrc/install/PackageManager.rssrc/install/PackageManager/PackageManagerEnqueue.rssrc/install/PackageManager/runTasks.rssrc/install/PackageManagerTask.rssrc/install/TarballStream.rssrc/install/lib.rssrc/runtime/server/mod.rssrc/runtime/server/server_body.rs
There was a problem hiding this comment.
Thanks for addressing all the earlier feedback (test migration, doc links, ctx_log! restoration, and the REQUEST_CONTEXT_POOL_CAPACITY constant) — I didn't find any remaining issues. Deferring to a human reviewer given the scope: this introduces a new HiveOwned abstraction and restructures pool-slot lifecycle across the server request path, install task queue, and DevServer::defer_request, which is more than I'm comfortable auto-approving.
Extended reasoning...
Overview
This PR deletes the unsound HiveArrayFallback::{get, try_get, get_and_see_if_new} API (which returned *mut T to uninitialized pool memory) and migrates all 19 call sites to safe construction via get_init / claim(). It also introduces a new HiveOwned<T> sealed handle type, removes the 'a lifetime parameter from install::Task, inverts RequestContext::create(*mut Self) to ::new() -> Self, and restructures DevServer::defer_request into a fallible-first / claim-second shape that closes a latent slot leak on the ?-error path.
Security risks
None identified. This is a memory-safety hardening refactor; it removes UB hazards rather than introducing attack surface. No auth, crypto, network parsing, or user-input handling is touched.
Level of scrutiny
High. While the intent (encode "used slot ⇒ initialized" in the type system) is sound and the per-site migrations are mostly mechanical, several changes are not:
DevServer::defer_requestreorders fallible work (prepare_and_save_js_request_context) relative to slot claiming and abort-callback registration — a control-flow change in a path with self-referential pointers and weak-ref bookkeeping.- The new
HiveOwnedtype andassume_ownedescape hatch establish a new ownership convention that future code will build on. Task<'a>→TaskplusHiveOwned<NetworkTask>changes how the install task queue threads pool ownership across threads.- The server request-context path (
prepare_js_request_context_for,on_upgrade) is the hottest allocation path inBun.serve.
Other factors
All five of my prior inline comments (🔴 test compilation break from deleted HiveArray::get, 🟡 stale intra-doc links, 🟡 dropped ctx_log!("create") trace, 🟡 hardcoded 2048 capacity literals, 🟣 pre-existing leak note) have been addressed in follow-up commits 4a91b9c–ff5aec1a. The bug-hunting system found nothing on this revision. The CI failures in the robobun comment (unused import in expr.rs, ambiguous glob re-exports in paths/lib.rs, musl build-cpp failures) appear unrelated to this PR's changes. The new hive_owned_lifecycle test exercises the new type. Given the breadth and the hot-path touchpoints, a human should sign off.
There was a problem hiding this comment.
Actionable comments posted: 3
♻️ Duplicate comments (1)
src/install/PackageManager/runTasks.rs (1)
1723-1733:⚠️ Potential issue | 🟠 Major | 🏗️ Heavy liftReintroduce rollback for claimed
NetworkTaskslots.
get_network_task()still permanently claims the pool slot before later fallible setup runs. In this file,for_tarball(...) ?can still error after the claim, andHiveOwnedis documented here as dropping to a no-op, so that path still strands the slot and leaves thenetwork_dedupe_mapentry wedged for the sametask_id. Please keep a real claim guard alive until initialization succeeds, and only convert/relinquish ownership once the task is fully set up. As per coding guidelines "Prefer RAII (Drop) over manual cleanup in Rust code".Also applies to: 1836-1882
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/install/PackageManager/runTasks.rs` around lines 1723 - 1733, get_network_task currently calls this.preallocated_network_tasks.claim() and immediately converts the slot into a permanently-held HiveOwned (via into_owned), which can strand the slot if subsequent fallible setup (e.g. for_tarball(...) ?) fails; instead keep a temporary claim guard (the raw claim token returned by preallocated_network_tasks.claim()) alive and only call unsafe slot.into_owned() after NetworkTask is fully initialized via NetworkTask::write_init; implement the guard as an RAII Drop that releases the claim on panic/error so the slot and the network_dedupe_map entry are not wedged, and apply the same pattern to the other claim sites mentioned (around lines 1836–1882) so ownership is only relinquished when initialization succeeds.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/collections/hive_array.rs`:
- Around line 469-484: Add the #[must_use] attribute to the HiveOwned<T> type to
warn when a claimed slot token is discarded; update the declaration for the
HiveOwned struct (the current pub struct HiveOwned<T>(NonNull<T>)) so that the
compiler emits a warning if a value returned by functions like get_owned() or
into_owned() is dropped without use.
In `@src/runtime/server/RequestContext.rs`:
- Around line 1309-1324: The constructor RequestContext::new currently accepts a
nullable raw pointer for server but other code assumes server is always present
(see server(), expect("infallible: server bound") and
debug_assert!(self.server.is_some())); change the constructor to enforce
non-null at the boundary by rejecting or converting the pointer to a non-null
type: either change the parameter to accept a NonNull/BackRef directly or call
NonNull::new(server.cast_mut()).map(bun_ptr::BackRef::from).expect("infallible:
server bound") (or panic with a clear message) instead of mapping to Option, and
store the guaranteed BackRef in the server field so downstream debug_asserts and
server() remain valid.
In `@src/runtime/server/server_body.rs`:
- Around line 3123-3157: The newly-claimed pool slot (created via
(*pool).get_init(Ctx::create(...)) and referenced by ctx_slot/ctx) must be held
by an RAII guard until PreparedRequestFor is fully constructed (i.e., until
to_js_for_bake() succeeds); create a small guard type that owns the pool slot
pointer and on Drop runs the existing cleanup logic that releases/deinitializes
the RequestContext (mirroring what manual cleanup would do), instantiate it
immediately after get_init(...) and before calling to_js_for_bake(), and call
guard.dismiss() (or drop it explicitly) when PreparedRequestFor construction
succeeds so the guard only cleans up on the failure path where to_js_for_bake()
returns None. Ensure the guard invokes the same deinit/return-to-pool operations
used elsewhere so pending-request bookkeeping and signal/body refs are properly
released.
---
Duplicate comments:
In `@src/install/PackageManager/runTasks.rs`:
- Around line 1723-1733: get_network_task currently calls
this.preallocated_network_tasks.claim() and immediately converts the slot into a
permanently-held HiveOwned (via into_owned), which can strand the slot if
subsequent fallible setup (e.g. for_tarball(...) ?) fails; instead keep a
temporary claim guard (the raw claim token returned by
preallocated_network_tasks.claim()) alive and only call unsafe slot.into_owned()
after NetworkTask is fully initialized via NetworkTask::write_init; implement
the guard as an RAII Drop that releases the claim on panic/error so the slot and
the network_dedupe_map entry are not wedged, and apply the same pattern to the
other claim sites mentioned (around lines 1836–1882) so ownership is only
relinquished when initialization succeeds.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 6367a5be-5dda-4e74-8434-2e92b72d09a6
📒 Files selected for processing (10)
src/collections/hive_array.rssrc/collections/lib.rssrc/install/PackageManager/PackageManagerEnqueue.rssrc/install/PackageManager/PopulateManifestCache.rssrc/install/PackageManager/runTasks.rssrc/install/PackageManagerTask.rssrc/runtime/server/RequestContext.rssrc/runtime/server/mod.rssrc/runtime/server/server_body.rssrc/threading/unbounded_queue.rs
| // Both pools are layout-identical across the `H3` const (it only affects | ||
| // associated consts/types, not layout), so the server-owned pool pointer | ||
| // can be reinterpreted to `Ctx`'s monomorphization with a safe | ||
| // `<*mut _>::cast()`. Construction is the safe `Fallback::get_init` | ||
| // (claim + `MaybeUninit::write`): no caller forms `&mut`/`*mut` over | ||
| // uninitialized storage, and a panic in `Ctx::create` cannot leave a | ||
| // half-initialized `RequestContext` (its drop glue is the heaviest in | ||
| // the codebase). The only `unsafe` is the server-owned-pool deref. | ||
| let pool: *mut bun_collections::hive_array::Fallback< | ||
| Ctx, | ||
| { super::request_context::REQUEST_CONTEXT_POOL_CAPACITY }, | ||
| > = if Ctx::IS_H3 { | ||
| debug_assert!( | ||
| !self.h3_request_pool.is_null(), | ||
| "H3 request dispatched but h3_request_pool was never allocated (listen() H3 path not taken)" | ||
| ); | ||
| self.h3_request_pool.cast() | ||
| } else { | ||
| self.request_pool.cast() | ||
| }; | ||
| // SAFETY: `pool` is the live, server-owned request-context pool. | ||
| let ctx_slot: *mut Ctx = unsafe { | ||
| if Ctx::IS_H3 { | ||
| debug_assert!( | ||
| !self.h3_request_pool.is_null(), | ||
| "H3 request dispatched but h3_request_pool was never allocated (listen() H3 path not taken)" | ||
| ); | ||
| let slot = (*self.h3_request_pool).claim(); | ||
| Ctx::create_in( | ||
| slot.addr().as_ptr().cast(), | ||
| self_ptr, | ||
| req, | ||
| resp, | ||
| should_deinit_context, | ||
| method, | ||
| ); | ||
| // SAFETY: `create_in` fully initialized the slot via `MaybeUninit::write`. | ||
| slot.assume_init().as_ptr().cast() | ||
| } else { | ||
| let slot = (*self.request_pool).claim(); | ||
| Ctx::create_in( | ||
| slot.addr().as_ptr().cast(), | ||
| (*pool) | ||
| .get_init(Ctx::create( | ||
| self_ptr, | ||
| req, | ||
| resp, | ||
| should_deinit_context, | ||
| method, | ||
| ); | ||
| // SAFETY: `create_in` fully initialized the slot via `MaybeUninit::write`. | ||
| slot.assume_init().as_ptr().cast() | ||
| } | ||
| )) | ||
| .as_ptr() | ||
| }; | ||
| // SAFETY: ctx_slot was just initialized by create_in. | ||
| ctx_log!("create<d> ({:p})<r>", ctx_slot); | ||
| // SAFETY: `ctx_slot` points at the freshly initialized pool slot. | ||
| let ctx = unsafe { &mut *ctx_slot }; |
There was a problem hiding this comment.
Guard the claimed request context until the Bake JS wrapper succeeds.
This slot is now claimed and fully initialized before the later to_js_for_bake() branch, which can still return None. On that path the RequestContext, leaked Request, signal/body refs, and pending-request bookkeeping are left live, so repeated Bake failures can exhaust the pool and block shutdown. Keep a cleanup guard around the new context until PreparedRequestFor is fully constructed.
As per coding guidelines "Prefer RAII (Drop) over manual cleanup in Rust code".
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/runtime/server/server_body.rs` around lines 3123 - 3157, The
newly-claimed pool slot (created via (*pool).get_init(Ctx::create(...)) and
referenced by ctx_slot/ctx) must be held by an RAII guard until
PreparedRequestFor is fully constructed (i.e., until to_js_for_bake() succeeds);
create a small guard type that owns the pool slot pointer and on Drop runs the
existing cleanup logic that releases/deinitializes the RequestContext (mirroring
what manual cleanup would do), instantiate it immediately after get_init(...)
and before calling to_js_for_bake(), and call guard.dismiss() (or drop it
explicitly) when PreparedRequestFor construction succeeds so the guard only
cleans up on the failure path where to_js_for_bake() returns None. Ensure the
guard invokes the same deinit/return-to-pool operations used elsewhere so
pending-request bookkeeping and signal/body refs are properly released.
| self.request_pool.cast() | ||
| }; | ||
| // SAFETY: `pool` is the live, server-owned request-context pool. | ||
| let ctx_slot: *mut Ctx = unsafe { | ||
| if Ctx::IS_H3 { | ||
| debug_assert!( | ||
| !self.h3_request_pool.is_null(), | ||
| "H3 request dispatched but h3_request_pool was never allocated (listen() H3 path not taken)" | ||
| ); | ||
| let slot = (*self.h3_request_pool).claim(); | ||
| Ctx::create_in( | ||
| slot.addr().as_ptr().cast(), | ||
| self_ptr, | ||
| req, | ||
| resp, | ||
| should_deinit_context, | ||
| method, | ||
| ); | ||
| // SAFETY: `create_in` fully initialized the slot via `MaybeUninit::write`. | ||
| slot.assume_init().as_ptr().cast() | ||
| } else { | ||
| let slot = (*self.request_pool).claim(); | ||
| Ctx::create_in( | ||
| slot.addr().as_ptr().cast(), | ||
| (*pool) | ||
| .get_init(Ctx::create( | ||
| self_ptr, | ||
| req, | ||
| resp, | ||
| should_deinit_context, | ||
| method, | ||
| ); | ||
| // SAFETY: `create_in` fully initialized the slot via `MaybeUninit::write`. | ||
| slot.assume_init().as_ptr().cast() | ||
| } | ||
| )) | ||
| .as_ptr() | ||
| }; | ||
| // SAFETY: ctx_slot was just initialized by create_in. | ||
| ctx_log!("create<d> ({:p})<r>", ctx_slot); | ||
| // SAFETY: `ctx_slot` points at the freshly initialized pool slot. | ||
| let ctx = unsafe { &mut *ctx_slot }; |
There was a problem hiding this comment.
🟣 Pre-existing: if to_js_for_bake() returns Err(_) (~line 3285), the function does return None after the pool slot has already been committed by get_init(), leaking the RequestContext slot, the body_hive ref, the signal ref, and the leaked Request object. This PR did not introduce this — the old claim() + create_in() + assume_init() path also permanently committed the slot before to_js_for_bake() ran — so it is not a blocker, but worth tracking (responding to the unresolved CodeRabbit thread above with the corrected severity).
Extended reasoning...
What the issue is
CodeRabbit's unresolved comment (id 3252701937) flags that in prepare_js_request_context_for, the pool slot is now claimed and fully initialized via (*pool).get_init(Ctx::create(...)) (lines 3144–3153) before the later to_js_for_bake() call at ~line 3282, which can return Err(_) → return None at line 3285. On that error path the RequestContext slot, the body_hive ref, the signal.pending_activity_ref(), the Box::leak'd request_object, and the pending_requests += 1 bookkeeping are all left live with no cleanup.
That observation is correct. However, CodeRabbit's framing — "This slot is now claimed and fully initialized before…" — implies the PR introduced this. It did not.
Why this is byte-for-byte pre-existing
The pre-PR code at this site (visible in the diff's removed lines) was:
let slot = (*self.request_pool).claim();
Ctx::create_in(slot.addr().as_ptr().cast(), self_ptr, req, resp, ...);
slot.assume_init().as_ptr().cast()HiveSlot::assume_init() is implemented as let this = ManuallyDrop::new(self); this.slot.cast::<T>() — wrapping the token in ManuallyDrop suppresses HiveSlot::drop, permanently committing the claim. The new code is (*pool).get_init(Ctx::create(...)), where get_init is self.claim().write(value) and HiveSlot::write() is also ManuallyDrop::new(self) + write. Both paths consume the HiveSlot token at exactly the same point in control flow — ~130 lines before to_js_for_bake() runs. Everything between the slot commit and line 3285 (body_hive allocation, signal ref, request_object leak, ctx.arm_on_data(resp), pending_requests += 1) is unchanged by this PR.
Step-by-step proof of equivalence
- Before PR:
claim()reserves a slot and returns aHiveSlottoken.create_in()writes the value viaMaybeUninit::write.assume_init()wraps the token inManuallyDrop→HiveSlot::dropnever runs → slot stays claimed. Control reaches line ~3285. - After PR:
Ctx::create()builds the value on the stack.get_init()→claim().write(value)→write()wraps the token inManuallyDropand moves the value in →HiveSlot::dropnever runs → slot stays claimed. Control reaches line ~3285. - In both cases, if
request_object.to_js_for_bake(global_this)then returnsErr(e)(any non-OOMJsError), the match arm doesErr(_) => return None. The function returns; the slot stays markedusedin the bitset,body_hiveis neverput, the signal's pending-activity ref is never released,pending_requestsis never decremented, and theBox::leak'dRequestis never reclaimed.
The PR's only behavioral change in this region is that a panic inside Ctx::create() now happens before any slot is claimed (the value is built on the stack first, then claim() runs). For the Err(_) early-return path that CodeRabbit flagged, behavior is identical.
Why nothing prevents it
There is no RAII guard around ctx_slot between its commit and the PreparedRequestFor construction. The function has multiple side-effecting steps (hive alloc, signal ref, request leak, arm_on_data, pending-request counter) that all run after the slot commit and before the fallible to_js_for_bake(), and none of them are unwound on the Err branch. This is exactly the shape CodeRabbit's suggested fix describes — and it is the same shape the code had before this PR.
Impact
Real but narrow: to_js_for_bake() must return a non-OOM JsError during Bake request setup. Each such failure leaks one 2048-slot-pool entry (or a heap Box once the pool spills), one Body::Value hive slot, one signal ref, and one Request heap allocation, and leaves pending_requests permanently inflated (which can block server shutdown). Repeated failures during dev iteration could exhaust the inline pool and degrade to heap allocation per request.
Suggested follow-up (not for this PR)
CodeRabbit's suggested shape is right: hold a small RAII guard from immediately after get_init() through PreparedRequestFor construction, whose Drop runs the same cleanup sequence as the normal teardown path (ctx.deinit() → pool put, decrement pending_requests, release body_hive/signal). Disarm it on success. That is a separate change with its own blast radius (it touches the request teardown contract) and is appropriately deferred from this refactor PR.
706fb36 to
2219c60
Compare
There was a problem hiding this comment.
♻️ Duplicate comments (1)
src/install/PackageManager/runTasks.rs (1)
1720-1730:⚠️ Potential issue | 🟠 Major | 🏗️ Heavy liftRollback claimed
NetworkTaskslots onfor_*errors.Line 1727 converts the claim into a
HiveOwned, and the comment at Lines 1723-1725 says that path does not recycle the slot automatically. Line 1879 then doesfor_tarball(...) ?after taking that handle, so an error leaves the pool slot claimed and the dedupe entry fromhas_created_network_task()behind. The same helper is now used bystart_manifest_task()forfor_manifest(...) ?, so the leak/wedge still exists on both fallible init paths. Please keep the claim under an RAII rollback guard until the task is fully initialized and enqueued, or explicitlyput()the slot and remove the dedupe entry on error. As per coding guidelines "Prefer RAII (Drop) over manual cleanup in Rust code".Also applies to: 1833-1880
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/install/PackageManager/runTasks.rs` around lines 1720 - 1730, The claimed pool slot returned by get_network_task (preallocated_network_tasks.claim() → into_owned()) must be protected by an RAII rollback guard until NetworkTask::write_init(...) completes and the task is enqueued; otherwise errors from for_tarball(...) / for_manifest(...) leave the slot and the has_created_network_task() dedupe entry leaked. Wrap the claim in a small guard type (or use an existing scope guard) that on Drop calls the pool put() and clears the dedupe entry, and dismisses itself when initialization/enqueue succeeds; alternatively ensure every fallible path after get_network_task (e.g. in start_manifest_task, for_tarball, for_manifest) explicitly calls put() and removes the dedupe entry before returning an Err, but prefer the RAII guard approach per guidelines.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Duplicate comments:
In `@src/install/PackageManager/runTasks.rs`:
- Around line 1720-1730: The claimed pool slot returned by get_network_task
(preallocated_network_tasks.claim() → into_owned()) must be protected by an RAII
rollback guard until NetworkTask::write_init(...) completes and the task is
enqueued; otherwise errors from for_tarball(...) / for_manifest(...) leave the
slot and the has_created_network_task() dedupe entry leaked. Wrap the claim in a
small guard type (or use an existing scope guard) that on Drop calls the pool
put() and clears the dedupe entry, and dismisses itself when
initialization/enqueue succeeds; alternatively ensure every fallible path after
get_network_task (e.g. in start_manifest_task, for_tarball, for_manifest)
explicitly calls put() and removes the dedupe entry before returning an Err, but
prefer the RAII guard approach per guidelines.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 44390edb-3c31-490c-a669-b453ee214a8e
📒 Files selected for processing (13)
src/collections/bit_set.rssrc/collections/hive_array.rssrc/collections/lib.rssrc/collections/linear_fifo.rssrc/install/NetworkTask.rssrc/install/PackageManager.rssrc/install/PackageManager/PackageManagerEnqueue.rssrc/install/PackageManager/PopulateManifestCache.rssrc/install/PackageManager/runTasks.rssrc/install/PackageManagerTask.rssrc/install/TarballStream.rssrc/install/lib.rssrc/io/windows_event_loop.rs
**ThreadPool::init dead generic.** `bundler::ThreadPool::init` (and its helper `init_with_pool`) were generic over `V2` because, during the phased Zig→Rust port, `bundle_v2.rs` carried two `BundleV2` definitions — the canonical one plus a `bv2_impl` draft module — and both needed to call `init`. The PORT NOTE on the function explicitly said it should collapse to `&BundleV2<'_>` once `bv2_impl` was dropped. That module is gone (`grep -rn "struct BundleV2" src/bundler/` finds exactly one definition, `bundle_v2.rs:75`), so this collapses the generic, names the concrete type, and drops the now-stale PORT NOTE. The body was already storing a type-erased raw pointer, so the monomorphised code is identical. **cpuid cfg-gate mismatch.** `perf/hw_timer.rs` defines `struct CpuidResult` and `fn cpuid()` under `#[cfg(target_arch = "x86_64")]`, but their only callers live inside a `#[cfg(not(any(target_os = "macos", target_os = "freebsd")))]` block (macOS/FreeBSD read the boot-time TSC frequency from `sysctl` instead of probing CPUID). On `x86_64-apple-darwin` and `x86_64-unknown-freebsd` the helpers were therefore compiled with no callers, producing `never constructed` / `never used` warnings. The cfg gates now mirror the callers' conditions. Verified with `cargo check --workspace --keep-going` (clean), `bun run rust:check-all` (10/10 targets), and `cargo fmt -p bun_bundler -p bun_perf --check`. Confirmed the dead-code warnings on `x86_64-apple-darwin` are present without this change and gone with it. File set is disjoint from #30879.
Squash of the original 14-commit history, rebased onto main after the
HiveArray UnsafeCell/aliasing rework merged. Both safety hardenings are
kept: the UnsafeCell buffer + HiveBox/box_at/ptr_at/alloc + miri infra
from main, and the HiveOwned/get_owned/into_owned API from this branch.
- Delete HiveArray::get / Fallback::{get, try_get, get_and_see_if_new}
(and their now-orphaned test). They claim a slot then return *mut T to
uninitialized storage; an early return between claim and ptr::write
leaves a claimed-but-uninit slot for a later put() to drop. The
UnsafeCell buffer change addressed the aliasing hazard but not the
uninit one. claim()/get_init()/emplace() are the replacements and they
had no remaining callers.
- Add HiveOwned<T>: sealed (no public ctor), move-only, #[must_use],
lifetime-erased pool-slot handle for tasks that outlive the claiming
scope. No Drop (recycle is the explicit Fallback::put step).
- Add Fallback::get_owned(), HiveSlot::into_owned(),
HiveOwned::from_raw_for_queue() (queue ingress back-door),
UnboundedQueue::push_owned/pop_owned/next_owned.
- get_owned() takes &self to match the rest of Fallback's interior-
mutability receivers.
- Migrate Task.network / NetworkTask / DevServer / RequestContext /
pool-task callers off raw uninit pointers and forged 'static muts.
- Migrate hive_array tests to claim()/get_init() and add a
hive_owned_lifecycle test (passes under cargo miri -Zmiri-tree-borrows).
f80f1ff to
f2e0f2e
Compare
There was a problem hiding this comment.
All prior review nits are now addressed and this pass found nothing new — deferring to a human given the scope (new HiveOwned abstraction, Task<'a> lifetime removal, DevServer::defer_request control-flow restructure, and the server create→new reshape across 18 files).
Extended reasoning...
Overview
This PR deletes the unsound HiveArrayFallback::{get, try_get, get_and_see_if_new} uninit-pointer API and migrates every caller to safe construction. It touches 18 files across five subsystems: the hive_array collection itself (deletes deprecated entry points, adds HiveSlot::into_owned and a new HiveOwned<T> lifetime-erased handle type, adds Fallback::get_owned, rewrites tests), the install package manager (Task<'a> → Task with HiveOwned<NetworkTask> payloads, enqueue/runTasks/PopulateManifestCache reshaped to build-then-get_init, get_network_task now returns HiveOwned), the HTTP server (RequestContext::create(&mut MaybeUninit) → ::new() -> Self, RequestCtxOps::create_in → ::create() -> Self, three call sites switched to get_init with a new REQUEST_CONTEXT_POOL_CAPACITY const), DevServer::defer_request (split into a fallible pre-claim phase and an infallible placement-init phase, closing a latent slot leak on the ? path), and UnboundedQueue (new push_owned/next_owned for round-tripping HiveOwned through the intrusive queue). Plus mechanical get_init migrations in renamer/h2_frame_parser/windows_event_loop and a sweep of stale get() doc references.
Security risks
None identified. This is an internal memory-safety refactor of pool-slot allocation; it does not touch auth, crypto, network parsing, or user-input handling. The change reduces unsafety surface by removing an API shape that handed out *mut T to uninitialized memory.
Level of scrutiny
High. The diff is large (~700 net lines across 18 files) and sits squarely in memory-safety-critical and hot-path code: the RequestContext pool is on every HTTP request, the install task pools are on every bun install, and UnboundedQueue is the cross-thread task channel. The PR introduces a new public abstraction (HiveOwned<T> with a #[doc(hidden)] from_raw_for_queue back-door), removes a lifetime parameter from a widely-used struct, and restructures control flow in DevServer::defer_request in a way that changes when side effects (on_aborted registration, server_handler.ctx.ref_()) run relative to slot claiming. These are exactly the kind of design and ordering decisions that benefit from a human reviewer who knows the surrounding invariants.
Other factors
Over multiple bot-review rounds the author has addressed every finding I raised (test-cfg build break, broken intra-doc links, dropped ctx_log! create trace, hardcoded 2048 capacity, unused NonNull import, into_owned Safety-doc wording, install-crate stale get() references) plus the actionable CodeRabbit items (#[must_use] on HiveOwned, NonNull server param, put() Safety wording). The one remaining unresolved CodeRabbit thread (server_body.rs to_js_for_bake leak) I previously analysed as byte-for-byte pre-existing, not introduced here. CI shows one Windows build-rust job failure that is an agent-provisioning error, not a compile failure. No new issues surfaced on this pass — the PR looks clean from the automated side, but its scope and design surface put it well outside the auto-approval bar.
What
Deletes
HiveArrayFallback::{get, try_get, get_and_see_if_new}— they return a raw*mut Tto uninitialized pool memory, which is an unsound shape: forming&mut Tover the uninit slot is immediate UB for anyTwith niche-bearing fields, and an early return betweenget()and the caller'sptr::writeleaves the slot claimed-but-uninit so a laterput()drops garbage.Every caller is migrated to the safe entry points (
get_init/claim()) that encode "a used slot is always fully initialized" in the type system:pool.get()+ptr::write(value)→pool.get_init(value). The Windows path mirrors the posix event loop, which already used safe construction.Taskvalue first (owns no borrow ofthis), thenget_init— removes the borrow conflict the uninit-slot pattern was working around.get_network_task:claim()+forget— the slot stays claimed for the existingput()path, but no uninit*mut Tescapes the pool's deprecated API. Documented as the one remaining "uninit pointer across a boundary" site.RequestContext::create(*mut Self)to::new() -> SelfandRequestCtxOps::create_into::create() -> Self; call sites use safeFallback::get_init. The constructor already built the literal by value andptr::write'd it —get_initdoes exactly that, with nounsafeat the call site.self-reentrant phase (runs before any slot is claimed — early returns no longer have a slot to recycle) and an infallible placement-init phase viaclaim(). Also closes a latent slot leak: the old code only recycled the slot on the explicitNonebranch, not on the?-error path.The remaining hive APIs are
get_init,emplace,claim(),put,put_raw— all encode the initialization invariant, none hand out uninit pointers.Verification
cargo check --workspaceclean (the deleted methods have zero remaining callers)bun run rust:check-all(all 6 target triples — the Windows path is cfg-gated)bun bd teston serve.test.ts, bake/dev-and-prod, http2, bundler_minify, bun-install