Skip to content

s3: use .temporary lifetime for downloaded body so it gets freed#29923

Closed
robobun wants to merge 8 commits into
mainfrom
farm/716d7830/s3-body-leak
Closed

s3: use .temporary lifetime for downloaded body so it gets freed#29923
robobun wants to merge 8 commits into
mainfrom
farm/716d7830/s3-body-leak

Conversation

@robobun

@robobun robobun commented Apr 29, 2026

Copy link
Copy Markdown
Collaborator

Fixes #29083

What

await s3file.text() / .json() / .arrayBuffer() / .bytes() / .formData() leaked the entire downloaded S3 object body on every call.

Reproduce

const s3 = new Bun.S3Client({ /* ... */ });
for (let i = 0; i < 50; i++) await s3.file("1mb-object").arrayBuffer();
// RSS grows by ~50 MiB and never comes back

Cause

On a successful 200/204/206 download, S3HttpSimpleTask moves ownership of its heap response_buffer (a MutableString on bun.default_allocator) into the S3DownloadResult and zeroes its own copy, so the task's deinit() frees nothing — the callback owns the allocation.

S3BlobDownloadTask.onS3DownloadResolved aliases response.body.list.items and forwards it to the to*WithBytes handlers via doReadFromS3's wrapper, which was passing Lifetime.clone.

.clone semantics assume the bytes are backed by the Blob's Store. An S3 Store only holds path + credentials, not the bytes. So under .clone:

  • toArrayBufferViewWithBytes copied into a new ArrayBuffer and orphaned the original
  • toStringWithBytes created an external string whose finalizer (Store.external) only derefs the S3 Store
  • toJSONWithBytes freed only on .temporary
  • toFormDataWithBytes ignored the lifetime entirely

Nothing freed the download buffer.

Fix

  • doReadFromS3: pass .temporary instead of .clone, matching the local-file path in read_file.zig. Consumers then adopt or free the allocation.
  • toFormDataWithBytes: respect .temporary (FormData.toJS copies all slices out, so the input can be freed afterwards). This also fixes the same leak for Bun.file(path).formData().
  • getFormData: pass .clone instead of .temporary. For in-memory Blobs, toFormData hands sharedView() (Store-owned bytes) to toFormDataWithBytes; .temporary there was always wrong but was masked by toFormDataWithBytes ignoring lifetime. File/S3 paths supply their own .temporary internally and ignore this value. This brings it in line with getText / getArrayBuffer / getBytes.

Tests

New test/js/bun/s3/s3-body-leak.test.ts: points S3Client at a local Bun.serve mock (no credentials needed), downloads a 1 MiB object 50× per method, and asserts RSS growth stays well under one-buffer-per-iteration. Covers arrayBuffer / bytes / text / json / formData.

  • Before: 49–56 MiB growth per method.
  • After: growth stays near zero (25/25 pass under ASAN across 5 consecutive runs).

test/js/web/html/FormData.test.ts passes (110/111; the one remaining timeout is pre-existing on main).

The existing credential-gated s3-text-leak-fixture.js was never actually exercising this (it used new Array(N).fill(readLargeFile()), which calls the function once and fills N slots with the same promise). Left as-is since the new credential-free test provides strictly better coverage across all platforms.

Related

#20487 — this PR fixes the Bun.S3Client portion of that report; the @google-cloud/storage SDK path is unrelated code and not touched here.


Overlaps with #29776 (community PR from a fork, CI blocked) which makes the same core .clone.temporary change. This PR differs in that its test uses a local Bun.serve mock S3 endpoint so it runs without real S3 credentials or Docker.

@coderabbitai

coderabbitai Bot commented Apr 29, 2026

Copy link
Copy Markdown
Contributor

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 1480fb83-e12a-4630-b8ac-db9a40fff3e7

📥 Commits

Reviewing files that changed from the base of the PR and between 297919c and 07636bf.

📒 Files selected for processing (1)
  • test/js/bun/s3/s3-body-leak.test.ts

Walkthrough

Adjusted Blob→FormData lifetime semantics to align buffer ownership for S3-downloaded vs in-memory Blobs. Added a concurrent test that repeatedly exercises S3 body accessors and measures RSS to detect retained buffers or memory growth.

Changes

Cohort / File(s) Summary
Blob lifetime handling
src/bun.js/webcore/Blob.zig
Reworked Blob→FormData conversion lifetimes: S3 read callback now returns .temporary for downloaded heap bytes; getFormData uses .clone for in-memory Blobs supplying Store-owned sharedView() bytes; toFormDataWithBytes now accepts a Lifetime parameter and only frees the provided buf when it is .temporary.
S3 body leak tests
test/js/bun/s3/s3-body-leak.test.ts
Added a concurrent test spawning a Bun subprocess and local Bun.serve mock to exercise S3 body accessors (arrayBuffer, bytes, text, json, formData) across warmup and measured iterations, forcing GC, recording baseline/post RSS, and asserting bounded RSS growth to detect leaks.
🚥 Pre-merge checks | ✅ 4
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title 's3: use .temporary lifetime for downloaded body so it gets freed' clearly summarizes the main change: switching from .clone to .temporary lifetime for S3 downloaded bodies to enable proper memory deallocation.
Description check ✅ Passed The description is comprehensive with clear sections: 'What', 'Reproduce', 'Cause', 'Fix', 'Tests', and 'Related'. It covers the root cause, the solution, how to verify it works, and includes both before/after RSS metrics.
Linked Issues check ✅ Passed The PR directly addresses issue #29083 by implementing the required fixes: preventing RSS growth from repeated S3 body reader calls (.arrayBuffer(), .text(), .json(), .bytes(), .formData()) through correct lifetime semantics and includes comprehensive testing that verifies RSS remains bounded.
Out of Scope Changes check ✅ Passed All changes are scoped to the S3 memory leak fix: modifications to Blob.zig for lifetime handling, and a new test for S3 body leak verification. No unrelated changes are present.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Review rate limit: 1/5 review remaining, refill in 44 minutes and 42 seconds.

Comment @coderabbitai help to get the list of available commands and usage tips.

@robobun

robobun commented Apr 29, 2026

Copy link
Copy Markdown
Collaborator Author
Updated 2:48 PM PT - May 4th, 2026

@robobun, your commit 7463a13 has 3 failures in Build #51238 (All Failures):


🧪   To try this PR locally:

bunx bun-pr 29923

That installs a local version of the PR into your bun-29923 executable, so you can run:

bun-29923 --bun

@github-actions

Copy link
Copy Markdown
Contributor

Found 3 issues this PR may fix:

  1. Bun.S3File.arrayBuffer() retains RSS and reaches OOM in a 1 GiB Linux container despite forced GC #29083 - Reports exact bug: repeated S3File.arrayBuffer() calls cause unbounded RSS growth and OOM due to download buffer never being freed
  2. Memory leak when downloading file (both @google-cloud/storage and Bun's S3) #20487 - S3 file downloads cause RSS to never drop, confirmed with both @google-cloud/storage and Bun's native S3Client
  3. Bun.file.formData() #10201 - Bun.file().formData() not working correctly, fixed by the toFormDataWithBytes lifetime correction in this PR

If this is helpful, copy the block below into the PR description to auto-close these issues on merge.

Fixes #29083
Fixes #20487
Fixes #10201

🤖 Generated with Claude Code

@github-actions

Copy link
Copy Markdown
Contributor

This PR may be a duplicate of:

  1. [codex] Fix S3 arrayBuffer native buffer leak #29776 - Both make the identical core fix: changing doReadFromS3's lifetime from .clone to .temporary and teaching toFormDataWithBytes to free temporary buffers, to stop S3 downloaded body buffers from leaking.

🤖 Generated with Claude Code

Comment thread test/js/bun/s3/s3-text-leak-fixture.js Outdated
Comment thread test/js/bun/s3/s3-body-leak.test.ts Outdated
Comment thread test/js/bun/s3/s3-body-leak.test.ts Outdated
Comment thread test/js/bun/s3/s3-text-leak-fixture.js

@claude claude Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1f4f4fb addresses my last concern (the text-leak fixture is now sequential with a symmetric 30/30 warmup), and all earlier nits are resolved — deferring to a human only because this changes buffer-ownership semantics in Blob.zig and overlaps with #29776, which a maintainer should reconcile.

Extended reasoning...

Overview

The PR fixes a memory leak where every s3file.text()/.json()/.arrayBuffer()/.bytes()/.formData() call orphaned the entire downloaded body. Three small hunks in src/bun.js/webcore/Blob.zig: doReadFromS3's wrapper now passes .temporary instead of .clone; toFormDataWithBytes now defer frees its input on .temporary; and getFormData flips from .temporary to .clone so in-memory Blobs (which pass Store-owned sharedView() bytes through line 3906) aren't freed by the new defer. The rest is tests: a new credential-free s3-body-leak.test.ts covering all five methods via a local mock server, a rewrite of s3-text-leak-fixture.js (sequential, symmetric warmup, settle loop), and a 30→60 s timeout bump in s3.leak.test.ts.

Security risks

None. No auth, crypto, parsing, or trust-boundary changes — purely allocation-lifetime bookkeeping and test scaffolding. The only risk class is memory-safety (use-after-free / double-free), not security.

Level of scrutiny

Moderate-to-high. The diff is small and well-reasoned, but it changes ownership semantics of a heap buffer in core Blob code that every S3 download and Bun.file().formData() flows through. The getFormData .temporary.clone flip is a compensating change that must stay paired with the new defer free in toFormDataWithBytes; I traced the in-memory path (toFormDatasharedView()toFormDataWithBytes at Blob.zig:3906) and it's correct, but this is exactly the kind of invariant a maintainer should sign off on.

Other factors

  • All three of my earlier inline comments were addressed (formData coverage, describe.concurrent, and the text-leak fixture flakiness — the latter twice, with 1f4f4fb finally making warmup and measured phases symmetric and sequential as suggested).
  • CI build #48969 for 1f4f4fb is still in flight; the previous build's s3.leak.test.ts failures should be resolved by the fixture rewrite but that's unconfirmed.
  • Overlaps with community PR #29776 which makes the same core .clone.temporary change — a maintainer should decide which to land / how to credit.
  • ASAN-clean per the PR description, and FormData.test.ts passes.

Comment thread test/js/bun/s3/s3-body-leak.test.ts
Comment thread test/js/bun/s3/s3-body-leak.test.ts Outdated

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@test/js/bun/s3/s3-body-leak.test.ts`:
- Line 135: Replace the strict expect(stderr).toBe("") assertion with a check
that first strips the known ASAN startup warning from stderr (e.g. filter out
lines matching /^WARNING: ASAN interferes.*$/) and then assert the remaining
string is empty; update the test around the expect(stderr).toBe("") usage
(referencing the stderr variable in s3-body-leak.test.ts) so the test filters
that single known ASAN line before asserting emptiness.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 49b28e85-b850-4233-9406-3b4f9be64d16

📥 Commits

Reviewing files that changed from the base of the PR and between 0900b96 and 297919c.

📒 Files selected for processing (1)
  • test/js/bun/s3/s3-body-leak.test.ts

Comment thread test/js/bun/s3/s3-body-leak.test.ts Outdated

@claude claude Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No issues found on the latest revision — the .temporary semantics check out against all four to*WithBytes consumers and the compensating getFormData flip is correct. Deferring only because this is manual memory-ownership in the core Blob/S3 path (and overlaps with #29776), so a maintainer sign-off seems appropriate.

Extended reasoning...

Overview

The net diff touches two files: three small logic changes in src/bun.js/webcore/Blob.zig (the .clone.temporary flip in doReadFromS3's wrapper, a new defer free in toFormDataWithBytes for the .temporary case, and a compensating .temporary.clone flip in getFormData), plus a new credential-free RSS-growth regression test at test/js/bun/s3/s3-body-leak.test.ts. All earlier changes to s3-text-leak-fixture.js were reverted, so the diff is now minimal and focused.

I traced each to*WithBytes consumer under .temporary: toStringWithBytes adopts via toExternalValue or frees explicitly (Blob.zig:3688-3701), toArrayBufferViewWithBytes wraps the buffer in an owning ArrayBuffer (Blob.zig:3865-3871), toJSONWithBytes already freed on .temporary, and toFormDataWithBytes now does too. The getFormData flip is necessary and correct: in-memory Blobs reach toFormDataWithBytes with sharedView() (Store-owned bytes, Blob.zig:3914-3919), which must not be freed; file/S3 paths supply their own .temporary and ignore this value. The change mirrors the existing local-file path (doReadFileNewReadFileHandler), so the pattern is established.

Security risks

None identified. This is a memory-leak fix; the only failure mode of getting lifetimes wrong here would be UAF/double-free, not a security boundary bypass. I checked specifically for the UAF risk introduced by the new defer free and confirmed the getFormData compensating change covers the in-memory-Blob case, and that FormData.toJS copies slices out before the defer fires.

Level of scrutiny

This is production-critical runtime code: manual memory ownership in the Blob/S3 download path, where a mistake means use-after-free or double-free for every s3file.text()/json()/arrayBuffer()/bytes()/formData() call. The change is small and I believe correct, but per the approval guidelines this is exactly the kind of "touches critical code paths" change where a human maintainer who owns this code should sign off rather than relying solely on bot review.

Other factors

All five of my prior inline comments across earlier revisions were addressed and resolved by the author; the bug-hunting system found nothing on the latest revision; the new test is credential-free, runs concurrently, covers all five body methods, and the author reports 25/25 ASAN passes. There is also an overlapping community PR (#29776) making the same core change — a maintainer may want to coordinate which lands and how the other is closed/credited.

Comment thread test/js/bun/s3/s3-body-leak.test.ts
robobun added 7 commits May 4, 2026 10:28
On successful S3 download, S3HttpSimpleTask moves ownership of its heap
response_buffer into the S3DownloadResult and zeroes its own copy, so the
callback owns the allocation. S3BlobDownloadTask.onS3DownloadResolved
forwards those bytes to the to*WithBytes handlers via doReadFromS3's
wrapper, which was passing Lifetime.clone.

.clone assumes the bytes are backed by the Blob's Store, but an S3 Store
only holds path/credentials. Every handler either copied the buffer and
dropped the original (arrayBuffer/bytes), created an external string whose
finalizer only deref'd the Store (text), or never freed at all (json,
formData). Net effect: every s3file.text()/.json()/.arrayBuffer()/.bytes()
leaked the entire downloaded object.

Switch doReadFromS3 to .temporary (matching the local-file path in
read_file.zig) so handlers take ownership and free the buffer.

Also make toFormDataWithBytes respect .temporary (it ignored lifetime
entirely, leaking for both S3 and local-file reads). Because getFormData
was passing .temporary even for in-memory Blobs whose bytes are
Store-owned via sharedView(), change it to .clone like getText /
getArrayBuffer / getBytes; the file/S3 paths supply their own .temporary
internally.

The existing s3-text-leak-fixture.js never caught this because
Array(n).fill(readLargeFile()) calls the function once and fills n slots
with the same promise; fix that too. Add a new credential-free leak test
that points S3Client at a local Bun.serve mock and measures RSS growth
across 50 1 MiB downloads for each body method.
- s3-body-leak.test.ts: use describe.concurrent (each test spawns an
  independent subprocess with its own port-0 server), add a formData
  case so the toFormDataWithBytes .temporary path has the same coverage
  as the other four body methods, and give each case a 60s timeout for
  slow debug/ASAN builds.
- s3-text-leak-fixture.js: now that it actually issues 100 concurrent
  reads (vs effectively 1 before), add a gc+sleep settle loop to both
  phases so the RSS samples are taken under the same conditions, and
  bump MAX_ALLOWED_MEMORY_USAGE_INCREMENT 15 -> 30 to allow for socket
  pool / allocator segment growth at the higher concurrency. The
  per-call leak this guards against would show as ~100 MiB here.
…owth noise

The previous 10-warmup / 50-measured split meant allocator segments
grew between the two RSS samples even without a leak, occasionally
pushing growth past the threshold (observed 27 MiB on text under
ASAN). Running the same iteration count for both phases puts the
allocator at steady state before the baseline sample, so growth with
the fix applied is near zero while the leak still shows as
~iterations MiB in the measured phase. 25/25 pass under ASAN.
s3.leak.test.ts was failing on 5 CI platforms after the previous
commit fixed the Array.fill(promise) bug so the fixture actually
issued 100 concurrent reads. With 10-concurrent warmup vs
100-concurrent measured, allocator segments / socket pool / TLS
buffers for the higher concurrency pushed RSS past the threshold
even with the leak fixed.

Apply the same approach proven stable in s3-body-leak.test.ts:
sequential reads with an equal-sized warmup phase so the allocator
is at steady state before the baseline sample. 30 iterations each;
the leak shows as ~30 MiB growth, threshold is 15 MiB. Bump the
test timeout to 60s for the 60 sequential R2 round-trips under
debug builds.

Validated against a local mock endpoint: passes with the fix,
fails with 'Memory usage is too high' on 1.3.13.
The fixture's .fill(promise) bug fix turned out to need R2 credentials
to validate properly; the sequential rewrite passed on 4/5 platforms
but ASAN's memory overhead still pushed past the threshold. Since the
new s3-body-leak.test.ts covers the same .text() path (plus the other
four body methods) without credentials and runs on every platform,
the R2-gated fixture is redundant for this leak. Leave it as-is and
let the credential-free test carry the regression coverage.
Match the pattern used elsewhere in test/ (memfd-disabled.test.ts,
udp_socket.test.ts, etc.) so the test doesn't flake on ASAN lanes that
emit 'WARNING: ASAN interferes with JSC signal handlers'.
@Jarred-Sumner Jarred-Sumner force-pushed the farm/716d7830/s3-body-leak branch from 07636bf to 802b91c Compare May 4, 2026 10:28

@claude claude Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The three-way lifetime change (doReadFromS3.temporary, toFormDataWithBytesdefer free, getFormData.clone) checks out against toFormData's in-memory sharedView() path, but since this is interlocked memory-ownership semantics in core Blob code where a slip means UAF/double-free, I'd like a human to give it a final look.

Extended reasoning...

Overview

Two files: src/runtime/webcore/Blob.zig (3 small hunks) and a new credential-free leak test test/js/bun/s3/s3-body-leak.test.ts. The Zig changes are tightly coupled: (1) doReadFromS3's wrapper switches the downloaded body's lifetime from .clone to .temporary so consumers free/adopt it; (2) toFormDataWithBytes now honors .temporary with a defer free(buf); (3) getFormData flips from .temporary to .clone because the in-memory path hands sharedView() (Store-owned bytes) to toFormDataWithBytes, and freeing those would be a UAF now that (2) actually frees. I cross-checked toFormData at Blob.zig:4062–4075 and the routing matches the PR description exactly — file/S3 paths supply their own lifetime via doReadFile/doReadFromS3 and ignore the getFormData value, while the in-memory branch passes it through.

Security risks

None identified. This is purely buffer-lifetime management for downloaded S3 bodies; no auth, crypto, parsing, or untrusted-input surface changes. The risk class here is memory safety (UAF/double-free), not security boundary.

Level of scrutiny

Higher than the diff size suggests. The fix is three lines of substance plus comments, but it changes ownership semantics in a hot path (Blob body readers) and the three edits must remain consistent — reverting any one without the others reintroduces either the leak or a UAF on in-memory blob.formData(). That coupling is well-documented in the new comments and the PR description, and the new test covers all five body methods including formData, but I'd still want a maintainer who owns Blob.zig to confirm the .temporary contract for toStringWithBytes/toArrayBufferViewWithBytes matches what read_file.zig already relies on.

Other factors

All seven of my earlier inline comments across previous revisions have been addressed and resolved (concurrent tests, formData coverage, fixture revert, comment wording, ASAN stderr filter, per-test timeout justified as load-bearing under ASAN). The bug-hunting pass found nothing this round. CI build #49152 shows 4 failures (s3-storage-class, fs.watch, worker_threads segfault, elysia stream) that are unrelated to the touched code paths. A community PR (#29776) independently arrived at the same core .clone.temporary change, which is a good corroborating signal.

@alii alii closed this May 18, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Bun.S3File.arrayBuffer() retains RSS and reaches OOM in a 1 GiB Linux container despite forced GC

2 participants