webcore: free drained buffer in FileReader.onPull memcpy path#29924
Conversation
FileReader.drain() moves ownership of this.buffered (or the reader buffer) into the returned ByteList and leaves the source empty. When onPull copies that data into the JS-provided array it then called this.buffered.clearAndFree(), which is a no-op on the now-empty list, orphaning the moved allocation on every such pull. Free the drained ByteList instead, capturing its length first since deinit() invalidates the struct.
|
Warning Rate limit exceeded
To keep reviews running without waiting, you can enable usage-based add-on for your organization. This allows additional reviews beyond the hourly cap. Account admins can enable it under billing. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: ASSERTIVE Plan: Pro Run ID: 📒 Files selected for processing (1)
WalkthroughRefactors FileReader.onPull memory handling to transfer ownership of drained allocations to local scope for explicit deallocation. Introduces tests to verify memory behavior during repeated stdout reading from spawned processes. Changes
🚥 Pre-merge checks | ✅ 4✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Review rate limit: 0/5 reviews remaining, refill in 1 minute and 28 seconds. Comment |
|
Updated 4:24 AM PT - Apr 29th, 2026
❌ @robobun, your commit 4ea2153 has 4 failures in
🧪 To try this PR locally: bunx bun-pr 29924That installs a local version of the PR into your bun-29924 --bun |
|
Found 2 issues this PR may fix:
🤖 Generated with Claude Code |
There was a problem hiding this comment.
LGTM — surgical leak fix: drain() moves ownership out of this.buffered, so freeing drained (with .len captured pre-deinit) is the right call.
Extended reasoning...
Overview
This PR fixes a memory leak in FileReader.onPull (src/bun.js/webcore/FileReader.zig). drain() uses ByteList.moveFromList to transfer ownership of either this.buffered or this.reader.buffer() into the returned ByteList, leaving the source empty. The old code then called this.buffered.clearAndFree(...) after the memcpy — a no-op on the already-emptied list — so the moved allocation was orphaned each time this branch was hit. The fix changes drained to var, captures drained.len (since BabyList.deinit sets this.* = undefined), and calls drained.deinit(bun.default_allocator). A new regression test (spawn-stdout-iterate-leak.test.ts) drives 2000 iterations through the exact memcpy path via a cat subprocess and asserts RSS growth stays under 50 MB (vs. ~80–130 MB pre-fix).
Security risks
None. This is a pure deallocation fix with no change to control flow, parsing, auth, or external inputs. Both possible sources of the drained buffer — this.buffered (unmanaged, allocated everywhere in this file with bun.default_allocator) and PosixBufferedReader._buffer (std.array_list.Managed(u8).init(bun.default_allocator)) — use bun.default_allocator, so the allocator passed to deinit matches and there is no double-free or allocator-mismatch risk. The data is memcpy'd into the JS buffer before the free, and the captured drained_len avoids any read-after-invalidate.
Level of scrutiny
Low-to-moderate. The Zig change is 6 effective lines, mechanical, and the PR description precisely explains the ownership-move semantics that made the old free a no-op. I verified moveFromList and BabyList.deinit in src/collections/baby_list.zig and the reader buffer's allocator in src/io/PipeReader.zig to confirm correctness. No CODEOWNERS entries cover this path.
Other factors
The bug-hunting system found no issues. The regression test is somewhat timing-dependent (relies on setImmediate ticks and poking _readableState.highWaterMark) and is correctly gated with todoIf(isWindows) since the leak is in the POSIX poll-reader path; the 50 MB threshold gives ~2× headroom over the observed post-fix noise (16–28 MB), so flakiness risk seems acceptable for a leak regression test.
Release builds recycle the orphaned 32 KiB blocks into later same-size allocations, so RSS growth is sub-linear and flaky to threshold (failed on Alpine 3.23). Under ASAN each leaked block is quarantined and cannot be reused, so the leak shows as clean linear growth (~150 MB over 4x1000 iterations vs <12 MB fixed). Gate the test to debug/ASAN builds and sample RSS across five equal runs in an isolated subprocess.
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@test/js/bun/spawn/spawn-stdout-iterate-leak.fixture.ts`:
- Around line 56-59: The stdout stream is resumed before its "data" listener is
attached which can drop buffered bytes and make total flaky; move the
proc.stdout!.on("data", (d: Buffer) => void (total += d.length)) call to occur
before proc.stdout!.resume() (keep proc.stdin!.end() and await once(proc,
"close") as-is) so the data handler is registered prior to resuming the stream.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 018257b0-c924-4fad-af98-78d5932aacfa
📒 Files selected for processing (3)
src/bun.js/webcore/FileReader.zigtest/js/bun/spawn/spawn-stdout-iterate-leak.fixture.tstest/js/bun/spawn/spawn-stdout-iterate-leak.test.ts
There was a problem hiding this comment.
The fix looks correct — drain() moves ownership out via moveFromList, so freeing the local drained (with .len captured first) is right. Deferring since this is a hot-path allocator change in core streaming I/O and the last reported CI run (a2c1de8) shows build-zig failures that haven't been re-reported for 4ea2153 yet.
Extended reasoning...
Overview
This PR fixes a memory leak in src/bun.js/webcore/FileReader.zig's onPull memcpy branch. drain() returns a ByteList whose backing allocation was moved out of this.buffered (or the reader's internal buffer) via moveFromList, leaving the source empty. The old code then called this.buffered.clearAndFree(...) — a no-op on the already-emptied list — and dropped drained on the floor. The fix changes drained to var, captures drained.len into a local before calling drained.deinit(bun.default_allocator) (since deinit sets *this = undefined), and uses the captured length in the return value. Two new test files exercise the path under ASAN/debug builds via a cat subprocess with a tuned highWaterMark.
Security risks
None. This is purely an internal allocator-ownership fix; no user-controlled input affects the freed pointer, and both source buffers (this.buffered and the posix reader _buffer) are allocated with bun.default_allocator, so the deinit allocator matches.
Level of scrutiny
Medium-high. The diff is ~10 lines of Zig plus tests, and the reasoning is sound — I verified moveFromList resets the source list and that BabyList.deinit requires a mutable receiver and invalidates the struct (hence the drained_len capture). However, this sits in a hot streaming-I/O path where an ownership mistake would convert a leak into a UAF/double-free, so a maintainer familiar with the FileReader/ByteList ownership model should sign off.
Other factors
- The one CodeRabbit comment (listener-before-resume ordering in the fixture) was addressed in 4ea2153 and resolved.
- robobun's last CI status update is for a2c1de8 and shows build-zig failures across all platforms; no updated status is posted yet for 4ea2153. Worth confirming CI is green before merge.
- The regression test is gated to
isDebug || isASAN(skipped on Windows and release), which the PR description justifies — mimalloc recycles same-size-class blocks on release so RSS doesn't grow linearly. The test assertsexpect(stderr).toBe("")on a JS-executing subprocess underbunEnv; per repo conventions this is acceptable whenBUN_DEBUG_QUIET_LOGS=1suppresses ASAN noise.
|
After: Before: |
…h#29924) ## What `FileReader.onPull` leaked the drained buffer when copying it into the JS-provided pull array. ## Why `drain()` moves ownership of `this.buffered` (or the reader's internal buffer) into the returned `ByteList` via `moveFromList`, which resets the source to empty. After the memcpy into the JS array, the old code called `this.buffered.clearAndFree(...)` — a no-op on the already-emptied list — so the `ByteList` returned by `drain()` was never freed. ```zig var drained = this.drain(); // moves out of this.buffered ... @memcpy(buffer[0..drained.len], drained.slice()); this.buffered.clearAndFree(bun.default_allocator); // no-op: already empty // `drained` is dropped on the floor ``` The fix frees `drained` itself (capturing `.len` first since `deinit()` invalidates the struct). ## Reachability Most consumers route through a JS-side `handle.drain()` guard that intercepts buffered data before `onPull` sees it. The branch is reached when `onPull` is invoked directly while `this.buffered` is non-empty — e.g. a `child_process` stdout `NativeReadable` whose `_read()` goes straight to `ptr.pull()` after the first read, with the Node buffer already full so `_read` wasn't auto-rescheduled between the pending-fulfilling chunk and the next one. ## Test `test/js/bun/spawn/spawn-stdout-iterate-leak.test.ts` constructs that sequence with `cat` + 32 KiB writes + `highWaterMark = 32 KiB`, so every cycle hits `onPull(32768) = 32768` (the memcpy branch). The fixture subprocess samples RSS across five equal 1000-cycle runs and reports the first-to-last delta. The test is gated to debug/ASAN builds: under ASAN each orphaned 32 KiB block is quarantined and cannot be reused, so the leak shows as clean linear RSS growth. On release builds mimalloc recycles the orphaned blocks into later same-size allocations, making RSS growth sub-linear and too close to allocator noise to threshold reliably across all platforms. | | RSS delta (first→last sample) | |-----------|-----------| | before (debug+ASAN) | 148–162 MB | | after (debug+ASAN) | −8 to 12 MB | Threshold is 32 MB. --------- Co-authored-by: robobun <robobun@users.noreply.github.com>
What
FileReader.onPullleaked the drained buffer when copying it into the JS-provided pull array.Why
drain()moves ownership ofthis.buffered(or the reader's internal buffer) into the returnedByteListviamoveFromList, which resets the source to empty. After the memcpy into the JS array, the old code calledthis.buffered.clearAndFree(...)— a no-op on the already-emptied list — so theByteListreturned bydrain()was never freed.The fix frees
draineditself (capturing.lenfirst sincedeinit()invalidates the struct).Reachability
Most consumers route through a JS-side
handle.drain()guard that intercepts buffered data beforeonPullsees it. The branch is reached whenonPullis invoked directly whilethis.bufferedis non-empty — e.g. achild_processstdoutNativeReadablewhose_read()goes straight toptr.pull()after the first read, with the Node buffer already full so_readwasn't auto-rescheduled between the pending-fulfilling chunk and the next one.Test
test/js/bun/spawn/spawn-stdout-iterate-leak.test.tsconstructs that sequence withcat+ 32 KiB writes +highWaterMark = 32 KiB, so every cycle hitsonPull(32768) = 32768(the memcpy branch). The fixture subprocess samples RSS across five equal 1000-cycle runs and reports the first-to-last delta.The test is gated to debug/ASAN builds: under ASAN each orphaned 32 KiB block is quarantined and cannot be reused, so the leak shows as clean linear RSS growth. On release builds mimalloc recycles the orphaned blocks into later same-size allocations, making RSS growth sub-linear and too close to allocator noise to threshold reliably across all platforms.
Threshold is 32 MB.