node:fs: fix GC root leak for Buffer path arguments in async ops#32199
node:fs: fix GC root leak for Buffer path arguments in async ops#32199robobun wants to merge 6 commits into
Conversation
Async node:fs operations with a Buffer/TypedArray/ArrayBuffer path argument took two gcProtects (protect_eat at parse, to_thread_safe at schedule) but released only one at completion, leaking one permanent GC root per call. Remove the parse-time root: the call frame keeps the argument alive during the host call, and to_thread_safe's protect (released by ThreadSafe's drop) covers the async task. Delete the now dead ArgumentsSlice protect machinery and the ManuallyDrop handling that existed to suppress it, and release early-error args through into_thread_safe so those paths stay balanced too. Fixes #32191
|
Updated 9:16 PM PT - Jun 12th, 2026
❌ @robobun, your commit a6cffe8 has some failures in 🧪 To try this PR locally: bunx bun-pr 32199That installs a local version of the PR into your bun-32199 --bun |
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: ASSERTIVE Plan: Pro Run ID: 📒 Files selected for processing (1)
WalkthroughThis PR removes per-argument protection from ArgumentsSlice, replaces protect_* parsing calls with eat()/next_eat(), moves async protection to to_thread_safe/into_thread_safe with drop-based cleanup, updates NodeFS async error paths to release resources via into_thread_safe(), and adds a subprocess regression test for Buffer/ArrayBuffer path leaks. ChangesAsync fs protection lifecycle refactor
Suggested reviewers:
🚥 Pre-merge checks | ✅ 4✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/jsc/CallFrame.rs (1)
320-325: 🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick winRename or remove
protect_eat_next.This helper no longer protects anything, so keeping the old name on a lifetime-sensitive API makes the new contract easy to misuse.
next_eatalready covers the current behavior; either delete this shim or rename it to reflect plain consumption.As per coding guidelines, "Name things truthfully: a function must not match more inputs than its name claims."
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/jsc/CallFrame.rs` around lines 320 - 325, The method protect_eat_next on CallFrame no longer provides protection and should be removed or renamed to reflect plain consumption: either delete protect_eat_next and update all call sites to call next_eat() directly, or rename it to a truthful name like consume_next (or next_if_any) and keep its implementation, ensuring signatures (returning Option<JSValue>) and uses remain correct; search for protect_eat_next references and replace them with next_eat (or the new name) and run tests to confirm no lifetime contract changes are introduced.Source: Coding guidelines
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/runtime/node/types.rs`:
- Around line 1254-1258: The async buffer parsing paths are double-protecting GC
roots: StringOrBuffer::from_js_maybe_async_into and
StringOrBuffer::from_js_with_encoding_maybe_async_into call protect() at
parse-time and then to_thread_safe() calls protect() again while unprotect()
only drops one root, leaking; remove the parse-time protect() in those async
branches (or instead make unprotect() reliably release both roots) so the async
Buffer-backed path only acquires a single root via to_thread_safe(); update both
from_js_maybe_async_into and from_js_with_encoding_maybe_async_into (and the
similar site around the other mention) and run a grep for other StringOrBuffer
protect() usages to fix any sibling sites.
---
Outside diff comments:
In `@src/jsc/CallFrame.rs`:
- Around line 320-325: The method protect_eat_next on CallFrame no longer
provides protection and should be removed or renamed to reflect plain
consumption: either delete protect_eat_next and update all call sites to call
next_eat() directly, or rename it to a truthful name like consume_next (or
next_if_any) and keep its implementation, ensuring signatures (returning
Option<JSValue>) and uses remain correct; search for protect_eat_next references
and replace them with next_eat (or the new name) and run tests to confirm no
lifetime contract changes are introduced.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 92c03d87-50cb-416c-9d09-a52f692b8b5b
📒 Files selected for processing (5)
src/jsc/CallFrame.rssrc/jsc/node_path.rssrc/runtime/node/node_fs_binding.rssrc/runtime/node/types.rstest/js/node/fs/fs-leak.test.js
protect_eat_next never protected anything (same as the Zig original it was ported from) and is byte-equivalent to next_eat. With protect_eat gone, the name is the last thing in ArgumentsSlice implying a GC root is taken during argument parsing.
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/runtime/node/node_fs.rs (1)
2837-2845:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winAdd direct async
readv/writevcoverage for thisnext_eat()path.This stops taking the parse-time root for
ArrayBufferView[], but the new regression coverage only exercises path-like arguments.FdVectorIogoes through a separateVectorArrayBufferpin/release path, so this sibling site still needs a focusedfs.promises.readv/writevleak/lifetime test in the existing fs leak suite.As per coding guidelines, "Fix the whole class in the same PR" and "Every behavioral change ships an automated test in the same PR."
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/runtime/node/node_fs.rs` around lines 2837 - 2845, The VectorArrayBuffer::from_js call via arguments.next_eat() (the next_eat() path that uses will_be_async) lacks direct async readv/writev coverage; add focused fs leak-suite tests that call fs.promises.readv and fs.promises.writev using an ArrayBufferView[] argument so the code path that roots/pins via VectorArrayBuffer::from_js is exercised (not the FdVectorIo sibling path). Ensure tests exercise the async branch (will_be_async=true), assert buffers are released after completion, and integrate them into the existing fs leak detection harness so any lifetime/leak regressions on next_eat()/VectorArrayBuffer::from_js are caught.Source: Coding guidelines
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Outside diff comments:
In `@src/runtime/node/node_fs.rs`:
- Around line 2837-2845: The VectorArrayBuffer::from_js call via
arguments.next_eat() (the next_eat() path that uses will_be_async) lacks direct
async readv/writev coverage; add focused fs leak-suite tests that call
fs.promises.readv and fs.promises.writev using an ArrayBufferView[] argument so
the code path that roots/pins via VectorArrayBuffer::from_js is exercised (not
the FdVectorIo sibling path). Ensure tests exercise the async branch
(will_be_async=true), assert buffers are released after completion, and
integrate them into the existing fs leak detection harness so any lifetime/leak
regressions on next_eat()/VectorArrayBuffer::from_js are caught.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: c2b202ca-3e7d-4d3b-afc9-7057d51edc46
📒 Files selected for processing (3)
src/jsc/CallFrame.rssrc/runtime/api/BunObject.rssrc/runtime/node/node_fs.rs
💤 Files with no reviewable changes (1)
- src/jsc/CallFrame.rs
|
Re the outside-diff note on readv/writev: the protect_eat_next change was behavior-identical, it never protected (rooting for ArrayBufferView[] happens inside VectorArrayBuffer::from_js and is balanced by release() at completion). The new fixture segments confirm this: they pass on the unfixed build too. Added writev/readv segments to the leak test in 761e36b anyway, since the run_async early-path change does touch FdVectorIo's release lifecycle. |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@test/js/node/fs/fs-leak.test.js`:
- Around line 182-190: The test currently measures only per-element Uint8Array
roots for writev/readv (via measure("Uint8Array", ...) and stores results in
deltas.writevBuffers and deltas.readvBuffers) but does not observe the scheduled
array wrapper root; add a second measurement that observes the array wrapper
(e.g., call measure with the appropriate array/object root type such as "Array"
or the runtime-specific wrapper type) using the same writev/readv invocations
and store them (e.g., deltas.writevArrayRoots and deltas.readvArrayRoots), then
add assertions that both the element-buffer metrics
(deltas.writevBuffers/deltas.readvBuffers) AND the new array-root metrics are
"ok"; repeat the same addition for the other occurrence referenced (lines
~216-225) so both element and array-root lifecycles are validated for writev and
readv.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: baa8ae96-2908-48ec-9aa4-07466c54656f
📒 Files selected for processing (1)
test/js/node/fs/fs-leak.test.js
Measure multiple heap types per segment in one pass and report the worst delta, so the scheduled JSArray root is observed directly rather than only transitively through its pinned elements. Shrink N to keep the debug runtime well under the default test timeout.
There was a problem hiding this comment.
No issues found and my earlier nit was addressed — but this rewrites the GC-root lifecycle contract for async fs args (removing the parse-time protect, the ArgumentsSlice Drop, and changing the early-error release path), which is memory-safety-critical enough that a human should sign off on the accounting.
Extended reasoning...
Overview
The PR fixes a per-call GC root leak for Buffer/TypedArray/ArrayBuffer path arguments in async node:fs operations. The core fix is two protect_eat() → eat() changes in PathLike::from_js_with_allocator. The bulk of the diff is consequential cleanup: deleting the now-unused ArgumentsSlice protection machinery (protected bitset, Drop impl, protect_eat/unprotect/init_async/protect_eat_next), removing the ManuallyDrop wrappers in run_async/cp/readdir, and changing the early-error release path from args.unprotect() to drop(args.into_thread_safe()). A subprocess regression test in fs-leak.test.js measures heap-type deltas across six segments.
Security risks
No injection/auth/data-exposure surface. The risk class here is memory safety: GC-root accounting for JS-backed buffers that cross to a worker thread. An under-protect would let a backing store be collected while a syscall reads it (UAF); an over-unprotect could release a root legitimately held elsewhere. The PR description audits every PathLike-parsing call site and the early-error path change is argued to be the exact inverse of parse-time acquisitions, but this is precisely the kind of invariant that benefits from a maintainer who knows the ThreadSafe<A>/Unprotect contract confirming the audit.
Level of scrutiny
High. This is core runtime code on a hot path, and the change is a contract rewrite (single-root via to_thread_safe only) rather than a localized patch. The early-error path change in run_async — into_thread_safe() taking the schedule-time protects and immediately dropping them to release both those and parse-time protects like WriteFile.data's — is subtle and depends on every FsArgument's to_thread_safe/unprotect pair being balanced for that idiom.
Other factors
All prior review threads (CodeRabbit's StringOrBuffer double-protect concern, my protect_eat_next vestige nit, the writev/readv array-root coverage ask) are resolved with follow-up commits and the author's responses demonstrate solid command of the accounting. The new test is well-constructed (warmup, exact-error gating, multi-type heap observation). The CI failures on musl targets are unrelated LTO link-layout errors. Still, per the approval guidelines this is squarely "complex change touching critical code paths," so deferring to a human reviewer.
Its only readers were the protect_eat/unprotect machinery deleted earlier in this branch.
There was a problem hiding this comment.
Both cleanup nits are addressed and I have nothing further on a6cffe8 — deferring to a human for the GC-root accounting itself, since the fix's correctness rests on the call-site audit (nothing retains a borrowed PathLike past the host call without to_thread_safe) and on the reworked run_async early-error release path, where an off-by-one in either direction is leak vs. UAF.
Extended reasoning...
Overview
The PR fixes #32191: every async node:fs op that received a Buffer/TypedArray/ArrayBuffer path leaked one permanent gcProtect root. The two-line core fix swaps arguments.protect_eat() → arguments.eat() in PathLike::from_js_with_allocator (src/runtime/node/types.rs). Everything else is the cleanup that enables: deleting protect_eat/protect_eat_next/unprotect/the protected bitset/Drop for ArgumentsSlice/init_async/the all field from ArgumentsSlice (src/jsc/CallFrame.rs), removing the now-unneeded ManuallyDrop wrapping in run_async/cp/readdir (src/runtime/node/node_fs_binding.rs), updating three next_eat() call sites (BunObject.rs, node_fs.rs), a doc-comment tweak in node_path.rs, and a new subprocess heap-delta regression test in fs-leak.test.js. Seven files, ~110 lines net removed from src/.
Security risks
No injection/auth/data-exposure surface. The risk profile is memory safety: gcProtect/gcUnprotect is refcounted, so under-protecting (or over-unprotecting) a JS value that backs a raw &[u8] borrowed across a thread hop is a use-after-free hazard. The PR explicitly changes the early-error branches (has_exception, pre-aborted signal) from args.unprotect(); drop(args) — which the description argues was already over-unprotecting by one for schedule-time-protected fields — to drop(args.into_thread_safe()), the same release path a completed task uses. That reasoning looks sound and is now well-commented, but it is exactly the kind of invariant a maintainer should confirm.
Level of scrutiny
High. This is hot-path runtime code (every async fs call) and the correctness argument is non-local: it depends on (a) the audit in the PR description that no PathLike consumer retains a borrowed buffer past the host call without going through to_thread_safe(), and (b) the per-args-type to_thread_safe/Unprotect pairings (e.g. WriteFile.data's parse-time protect being adopted rather than re-protected) staying balanced. The dead-code deletions are mechanically safe (compiler-checked); the lifecycle change is not.
Other factors
- All four review threads (two from me on
protect_eat_nextand the write-onlyallfield; two from CodeRabbit onStringOrBufferdouble-protect — refuted — and writev/readv array-root coverage) are resolved, with follow-up commits 6f1100f / 761e36b / a846f59 / a6cffe8. - The bug-hunting system found nothing on the latest revision.
- New test covers Buffer path, ArrayBuffer path, writeFile Buffer path, pre-aborted writeFile (the changed early-error branch), and writev/readv element+array roots; the description states the first three fail on the unfixed build with delta == N.
- Description notes a textual overlap with #32189 and a pre-existing
NewAsyncCpTask/AsyncReaddirRecursiveTaskleak left out of scope. - No CODEOWNERS entry for these paths.
|
CI status: the diff is green on every lane that ran tests (285 jobs passed). The one red job is This is an agent/VM infrastructure timeout unrelated to the change, which is platform-agnostic Rust. Ready for review; a re-run of that lane should clear it. |
Fixes #32191
Every async
node:fsoperation that received a Buffer/TypedArray/ArrayBuffer path argument leaked one GC root for that argument, permanently pinning the path buffer (and, for small Buffers, the shared Buffer pool chunk behind it).Repro
On 1.4.0-canary this reports one leaked
Uint8Arrayper call (5000+), survivingBun.gc(true). The sync equivalent does not leak.Cause
Two
gcProtects per call, onegcUnprotect:PathLike::from_js_with_allocator(src/runtime/node/types.rs) calledarguments.protect_eat()on the argument at parse time.PathLike::to_thread_safe(src/jsc/node_path.rs) protected the same value again at task-schedule time.ThreadSafe<A>'s drop runsPathLike::unprotect(), which unprotects once.protect_eatroot was supposed to be released byDrop for ArgumentsSlice, butrun_async(src/runtime/node/node_fs_binding.rs) wraps the slice inManuallyDropand intentionally never drops it on the success path.Net +1 permanent root per call;
gcProtectis refcounted so they accumulate. The sync path was balanced (parse protect released by the slice's drop at the end of the host call), matching the report. The Zig implementation shipped in 1.3.x had the same accounting, so this is not a port regression.Fix
Remove the parse-time
protect_eatroot (the twoarguments.protect_eat()toarguments.eat()changes in types.rs are the fix; the rest is the cleanup this enables). The root was redundant in both modes:to_thread_safe(), whose protect is released exactly once, on the JS thread, byThreadSafe's drop. Audited everyPathLike-parsing call site to confirm nothing retains a borrowed buffer past the host call without it: node:fs sync/async/cp/readdir, the WindowsUVFSRequestpath,fs.watch/fs.watchFile(copy the path bytes during the call), and Bun.file/Bun.write/S3 (callto_thread_safebefore storing). This also matches how sync buffer arguments (fs.writeSyncetc.) already work: they take no protect at all.Consequential changes in the same commit:
protect_eatnow has zero callers and theArgumentsSlice.protectedbitset is never set, so that machinery (protect_eat,unprotect, the bitset field, theDropimpl, the unusedinit_async) is deleted, and theManuallyDrophandling inrun_async/cp/readdir, which existed only to suppress the bitset unprotect, becomes plain drops.run_asyncearly-error paths (pending exception after parse, pre-aborted AbortSignal) previously calledargs.unprotect()when only parse-time roots existed, over-unprotecting by one for every buffer whose protect is taken at schedule time (PathLike,Read,Write, writev/readv); an unbalancedgcUnprotectcan release a root legitimately held elsewhere on the same value. They now release throughdrop(args.into_thread_safe()), the exact inverse of parse-time acquisitions and the same release path a completed task uses, which also keepsWriteFile.data's parse-time protect balanced.Verification
New test in test/js/node/fs/fs-leak.test.js measures
heapStats().objectTypeCountsdeltas across 128 async calls each for:accesswith Buffer paths,accesswith ArrayBuffer paths,writeFilewith a Buffer path, and pre-abortedwriteFilewith Buffer path and data. On the unfixed build the first three fail withleaked 128 objects over 128 calls; with the fix all deltas are ~0.Also ran on the debug build: fs.test.ts, promises.test.js, cp.test.ts, readdirSync-recursive-error-leak.test.ts, fs.watch/fs.watchFile suites, bun-write.test.js, resolve.test.ts. Remaining failures there are pre-existing debug/ASAN timeouts (100k-iteration and x100-loop tests that exceed the 5s default), run-as-root environment issues (EACCES tests), and a fixture whose ASAN RSS cap keys on the binary being named
bun-asan; each either reproduces on the unfixed binary or provably does not execute the changed arms (string paths).Notes
protect_eatarm of the snapshot branch collapses to plaineat).NewAsyncCpTask/AsyncReaddirRecursiveTaskleak their task box (including protected args) if converting the result to JS throws at completion, unlikeAsyncFSTaskwhich destroys via scopeguard.