fix(socket): don't drop an unprotected Handlers on handler validation errors#31859
fix(socket): don't drop an unprotected Handlers on handler validation errors#31859robobun wants to merge 1 commit into
Conversation
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: ASSERTIVE Plan: Pro Run ID: 📒 Files selected for processing (2)
Walkthrough
ChangesSocket Handler Callback Validation
🚥 Pre-merge checks | ✅ 4✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Comment |
|
Updated 6:12 AM PT - Jun 17th, 2026
❌ @robobun, your commit 4fb1950 has 1 failures in 🧪 To try this PR locally: bunx bun-pr 31859That installs a local version of the PR into your bun-31859 --bun |
|
This PR may be a duplicate of:
🤖 Generated with Claude Code |
There was a problem hiding this comment.
LGTM — clean reordering so validation errors return before a Handlers (and its unconditional Drop → unprotect) ever exists.
Extended reasoning...
Overview
This PR fixes a fuzzer-found debug-build panic (assertion failed: self.protection_count > 0) in Handlers::from_generated (src/runtime/socket/Handlers.rs). Previously the Handlers struct was constructed first, then callbacks were validated; an early return Err(...) on a non-callable handler or missing data/drain dropped a Handlers that had never been protect()ed, and Drop unconditionally calls unprotect(). The fix moves all fallible validation into local variables via a validated_callback! macro before constructing the struct, so every constructed Handlers is guaranteed to reach protect() (the only remaining steps between construction and protect() — with_async_context_if_needed and protect itself — are infallible). Validation order, error messages, and success-path behavior are unchanged. A subprocess-based regression test in test/js/bun/net/socket.test.ts exercises both error paths through both Bun.connect and Bun.listen.
Security risks
None introduced. The change actually closes a latent release-build hazard where unbalanced gcUnprotect calls could steal protection from a shared callback on another live socket. No new inputs, no auth/crypto/permissions surface.
Level of scrutiny
Moderate — this is GC-protection / memory-safety code in the socket runtime, but the change is a pure reordering of existing logic within a single ~40-line function. The macro is a straightforward refactor of the old assign_callback! that yields a value instead of mutating a field. I traced that JSValue::ZERO ↔ is_empty() semantics match the old code, that field iteration order is identical, and that nothing between struct construction and protect() can early-return.
Other factors
The PR description is precise about both the debug assert and the release-build unbalanced-unprotect consequence, and correctly notes this is the sole construction site for Handlers. The new test asserts exact error strings and exit code 0, so a regression to the panic would fail it on debug builds. No CODEOWNERS apply to this path, and the bug-hunting system found no issues.
f54c42e to
8b498d7
Compare
There was a problem hiding this comment.
LGTM — straightforward reordering so validation happens before Handlers is constructed, eliminating the unprotected-Drop path.
Extended reasoning...
Overview
This PR reorders Handlers::from_generated in src/runtime/socket/Handlers.rs so that all callback validation (the per-field is_callable() checks and the "at least data or drain" check) runs against local variables before the Handlers struct is constructed. Previously the struct was built first and validated afterward; an error return then dropped a Handlers whose callbacks were never protect()ed, but whose Drop unconditionally calls unprotect() — tripping a debug assert and issuing unbalanced gcUnprotect calls in release. A subprocess regression test in test/js/bun/net/socket.test.ts exercises both error paths through both Bun.connect and Bun.listen.
Security risks
None introduced. This is a crash/UB fix on an input-validation error path; no new surface area, no auth/crypto/permissions involved. If anything it closes a theoretical release-build hazard (stolen GC protection on a shared callback).
Level of scrutiny
Moderate — it touches GC protect/unprotect bookkeeping in the socket runtime, which is lifetime-sensitive. But the change itself is purely a code-motion: the validation logic, field order, and error messages are byte-for-byte preserved; only the point at which the struct is constructed moves to after the last fallible check. After construction, with_async_context_if_needed (infallible FFI) and protect() run immediately, so every constructed Handlers is protected before it can drop. The reasoning is clearly documented in an inline comment.
Other factors
- The new test is well-scoped, runs in a subprocess so a panic surfaces as a test failure, and asserts exact error messages and exit code.
- The single CI failure (
test/cli/install/bunx.test.ts) is unrelated to socket code. - A bot flagged #31817 as a possible duplicate; that's a merge-coordination question for maintainers, not a correctness concern with this change.
- The bug-hunting system found no issues.
…dler path (#31861) ### Problem Fuzzilli hit a nested panic while Bun was already processing a crash in a debug build: ``` panic: assertion failed: self.protection_count > 0 ... panicked at src/base64/lib.rs:318:15: attempt to negate with overflow thread panicked while processing panic. aborting. ``` The crash handler encodes backtrace addresses into the bun.report trace string by splitting each u64 into two u32 halves and bitcasting them to `i32` (`write_u64_as_two_vlqs` in `src/crash_handler/lib.rs`). `vlq::encode_slow_path` computes the magnitude of negative values with `-value`, which overflows for `i32::MIN`. So an address half of exactly `0x80000000` panics the panic hook: debug builds abort before the crash report or backtrace is written, and release builds silently wrap. The Zig reference (`src/sourcemap/VLQ.zig`) has the same checked negation, so this was inherited by the port; sourcemap callers never pass `i32::MIN`, only the crash handler's bitcast address halves do. ### Fix Compute the magnitude with `value.unsigned_abs()` in `encode_slow_path` (`src/base64/lib.rs`). The sign-magnitude VLQ format cannot represent `i32::MIN` in 32 bits regardless (its magnitude is 2^31), so that one value keeps degrading to "-0" exactly as release builds already emit, and the wire format bun.report decodes is unchanged. Everything in the representable domain `-(2^31 - 1)..=2^31 - 1` encodes identically to before. The encoder just can't panic anymore, which matters because it runs inside the panic hook. ### Test JS-level regression test in `test/js/bun/sourcemap/internal-sourcemap-roundtrip.test.ts`: `InternalSourceMap` sync-entry state is raw i32 and `appendVLQTo` computes deltas with `saturating_sub`, so a hand-crafted blob whose first window starts at generated column `i32::MIN` drives `VLQ::encode(i32::MIN)` through the existing `bun:internal-for-testing` `toVLQ` surface. On the unfixed encoder the spawned process aborts with "attempt to negate with overflow" (debug builds); with the fix it emits the wrapped "-0" encoding (`BAAA`) and exits 0. Also unit tests in the `vlq` module of `bun_base64`: a roundtrip over the representable domain including both extremes (pinning the documented `"+/////D"` / `"//////D"` encodings), and `encode(i32::MIN)` decoding to 0 without panicking. The latter fails on the unfixed code with the same overflow panic (`cargo test -p bun_base64`). `bun_base64` is in `MIRI_CRATES`, so CI runs these via `cargo miri test` on changes under `src/base64/`; verified locally that miri passes. The crash handler trigger itself (a backtrace address half equal to `0x80000000`) is not controllable from a test; the sourcemap path exercises the identical encoder edge. The primary crash in the fuzzer report (the `protection_count` assert) is fixed separately in #31859.
|
CI triage for the remaining failures:
The new regression test passed on all platforms in both builds. |
Handlers::from_generated built the Handlers struct first and validated the
callback options afterwards, but protect() only runs after validation
succeeds. The validation error returns therefore dropped a Handlers whose
callbacks were never protected: Drop unconditionally calls unprotect(),
which trips the protection_count assert in debug builds and issues
unbalanced gcUnprotect calls in release builds (stealing protection from
another socket sharing the same callback function).
Validate the callbacks from the generated config first and construct the
Handlers only after every fallible check has passed, so every constructed
Handlers is protected before it can be dropped.
Repro: Bun.connect({ socket: {} })
8b498d7 to
4fb1950
Compare
There was a problem hiding this comment.
LGTM — clean reordering so validation completes before the Handlers (and its Drop) ever exists; behavior and error messages on the success/error paths are unchanged.
Extended reasoning...
Overview
This PR fixes a fuzzer-found crash in Handlers::from_generated (src/runtime/socket/Handlers.rs). Previously the Handlers struct was constructed with zeroed callback fields, then each callback was validated and assigned in turn; an early return Err(...) on a non-callable handler or the missing-data/drain check dropped a Handlers whose protect() had never run, and Drop unconditionally calls unprotect(). The fix hoists all callback validation (and the data/drain presence check) into locals via a validated_callback! macro, then constructs the struct only once everything has passed — so every constructed Handlers reaches protect() before it can be dropped. A subprocess regression test in test/js/bun/net/socket.test.ts exercises both error paths through both Bun.connect and Bun.listen.
Security risks
None introduced. The change is a pure reordering of existing validation; no new inputs, no new FFI, no changed error semantics. If anything it closes a (narrow) GC-unprotect imbalance that could in theory let a shared callback be collected while another socket still uses it.
Level of scrutiny
Moderate-low. While this lives in GC-protection / Drop territory, the diff is mechanical: the macro body is byte-for-byte the same check as before but yields a value instead of assigning a field, validation order and error strings are unchanged, and the only code between construction and protect() (with_async_context_if_needed) is infallible. The single construction site means listen, connect, and reload are all covered.
Other factors
- Regression test added and reported green on all platforms; remaining CI failures were triaged as pre-existing on main.
- No CODEOWNERS entry for this path.
- A duplicate-PR bot flagged #31817 with the same fix; that's a merge-coordination question for maintainers, not a correctness concern with this diff.
|
CI triage for build #63130 on the rebased commit (4fb1950): The only failing test is All other test jobs passed, including the new regression test on every platform. The earlier |
Problem
Fuzzilli hit
panic: assertion failed: self.protection_count > 0in debug builds viaBun.connect(ArrayBuffer)(the options object resolved asocketvalue through prior prototype pollution in the fuzzer's reused process, which is why it was flaky). Deterministic repro:Handlers::from_generatedconstructed theHandlersstruct first and validated the callback options afterwards, butprotect()only runs after validation succeeds. The validation error returns ("Expected "X" callback to be a function", "Expected at least "data" or "drain" callback") dropped aHandlerswhose callbacks were never protected.Dropunconditionally callsunprotect(), so:protection_count > 0assert and abortgcUnprotectcalls on any callback assigned before the failing field (order: open, close, data, ...). If another live socket protects the same function object, its protection is stolen and the callback can be collected while still in use.The Zig reference (
Handlers.zig) has no drop on these error returns, so nothing unprotected there; the Rust port'sDropchanged that.Fix
Validate the callbacks from the generated config first and construct the
Handlersonly after every fallible check has passed (src/runtime/socket/Handlers.rs,from_generated). Nothing between construction andprotect()can fail now, so every constructedHandlersis protected before it can be dropped. Validation order and error messages are unchanged. This is the only construction site for this type, soBun.listen,Bun.connect, andreload()are all covered.Test
test/js/bun/net/socket.test.ts: "socket handler validation errors throw instead of crashing" spawns a subprocess exercising both error paths through bothBun.connectandBun.listen, asserting the exact error messages and exit code 0. On an unfixed debug build the subprocess panics with the fuzzer's fingerprint and the test fails; it passes with this fix. Release builds do not assert, so the panic is only observable under debug builds (bun bd test).The 9 other failures in
socket.test.tsin my environment are connection timeouts that reproduce identically on an unmodified build of main.Rebase note
Rebased onto main after #31155 added four TLS callback fields (
on_session,on_keylog,on_server_name,on_alpn_callback) toHandlers. Those fields now follow the same validate-before-construct path as the original nine, so a non-callable value for any of them also throws cleanly instead of dropping an unprotectedHandlers. VerifiedBun.connect({ socket: { data(){}, session: 123 } })throwsExpected "onSession" callback to be a functionon the rebased build; validation order and error messages match main.