fix(s3): don't double-free path when S3Client static ops throw after blob creation#29081
fix(s3): don't double-free path when S3Client static ops throw after blob creation#29081robobun wants to merge 4 commits into
Conversation
|
Updated 11:34 AM PT - May 6th, 2026
❌ @robobun, your commit 5db67d9 has 3 failures in
🧪 To try this PR locally: bunx bun-pr 29081That installs a local version of the PR into your bun-29081 --bun |
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: ASSERTIVE Plan: Pro Run ID: 📒 Files selected for processing (1)
WalkthroughFor S3-backed blob creation in multiple code paths, Changes
🚥 Pre-merge checks | ✅ 2✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Comment |
|
This PR may be a duplicate of:
🤖 Generated with Claude Code |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@test/js/bun/s3/s3-static-missing-credentials.test.ts`:
- Around line 8-41: Add parameterized tests mirroring the existing
S3Client.presign test to cover the other static helpers that received the
ownership-transfer fix: unlink, write, size, exists, and stat. For each helper
name (e.g., "unlink", "write", "size", "exists", "stat") spawn Bun the same way
you do in s3-static-missing-credentials.test.ts with AWS_/S3_ env vars stripped,
run a small inlined script that calls Bun.S3Client.<helper>("some/key") inside a
try/catch, assert the caught e?.code === "ERR_S3_MISSING_CREDENTIALS", call
Bun.gc(true) and console.log("ok"), and then assert the child process
stdout/stderr/exitCode just like the presign test; keep the structure
table-driven (array of helper names and loop) so a single test block covers all
helpers.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 39ab9ef5-a1a3-4865-83a6-ef99e1d841a6
📒 Files selected for processing (2)
src/bun.js/webcore/S3File.zigtest/js/bun/s3/s3-static-missing-credentials.test.ts
| test("S3Client.presign without credentials throws instead of crashing", async () => { | ||
| const env: Record<string, string> = {}; | ||
| for (const [k, v] of Object.entries(bunEnv)) { | ||
| if (k.startsWith("AWS_") || k.startsWith("S3_")) continue; | ||
| if (typeof v === "string") env[k] = v; | ||
| } | ||
|
|
||
| const src = ` | ||
| let code; | ||
| try { | ||
| Bun.S3Client.presign("some/key"); | ||
| } catch (e) { | ||
| code = e?.code; | ||
| } | ||
| if (code !== "ERR_S3_MISSING_CREDENTIALS") { | ||
| console.error("unexpected result:", code); | ||
| process.exit(1); | ||
| } | ||
| Bun.gc(true); | ||
| console.log("ok"); | ||
| `; | ||
|
|
||
| await using proc = Bun.spawn({ | ||
| cmd: [bunExe(), "-e", src], | ||
| env, | ||
| stdout: "pipe", | ||
| stderr: "pipe", | ||
| }); | ||
|
|
||
| const [stdout, stderr, exitCode] = await Promise.all([proc.stdout.text(), proc.stderr.text(), proc.exited]); | ||
|
|
||
| expect({ stdout: stdout.trim(), stderr }).toEqual({ stdout: "ok", stderr: expect.any(String) }); | ||
| expect(exitCode).toBe(0); | ||
| }); |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
Add follow-up coverage for the other static S3 helpers touched by this fix.
This regression test validates presign, but the same ownership-transfer fix was applied to unlink, write, size, exists, and stat too. A small table-driven set covering at least one throwing path per helper would reduce future drift risk.
I can draft a compact parameterized test block for those helpers if you want.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@test/js/bun/s3/s3-static-missing-credentials.test.ts` around lines 8 - 41,
Add parameterized tests mirroring the existing S3Client.presign test to cover
the other static helpers that received the ownership-transfer fix: unlink,
write, size, exists, and stat. For each helper name (e.g., "unlink", "write",
"size", "exists", "stat") spawn Bun the same way you do in
s3-static-missing-credentials.test.ts with AWS_/S3_ env vars stripped, run a
small inlined script that calls Bun.S3Client.<helper>("some/key") inside a
try/catch, assert the caught e?.code === "ERR_S3_MISSING_CREDENTIALS", call
Bun.gc(true) and console.log("ok"), and then assert the child process
stdout/stderr/exitCode just like the presign test; keep the structure
table-driven (array of helper names and loop) so a single test block covers all
helpers.
|
Dropped the regression test: the double-deref only surfaces as a hard failure via Fix verified against the fuzz build at Re: duplicates (#28495, #28592, #28417, #28423) — same root cause, all still open. This one is rebased on current main. |
There was a problem hiding this comment.
LGTM — the double-free fix is correct and applied consistently across all affected static S3 helpers.
Extended reasoning...
Overview
The PR modifies to fix a memory-safety bug present in 6 static S3 helper functions (, , , , , ). A new regression test in covers the crash scenario.
The fix
After is called, ownership of the string is transferred to the newly created blob's store. The existing would then double-free the same string if the subsequent operation (e.g. ) threw synchronously. The fix is to immediately overwrite with whose is a no-op, making the errdefer harmless. The pattern is correct and is applied uniformly across all 6 functions.
Security risks
None. This is a memory-safety fix for a crash/use-after-free in debug builds and a silent double-deref in release builds. There are no auth, crypto, or permission changes.
Level of scrutiny
Low. The change is small, mechanical, and self-contained. The root cause and fix are clearly explained. The test validates the exact crash scenario. Both reported bugs are pre-existing and not introduced by this PR.
Other factors
The two inline bug reports (NaN return from and wrong error messages in ) are explicitly marked pre-existing. Neither is introduced or worsened by this PR. They would be good follow-up fixes but do not block approval here.
| return globalThis.throwInvalidArguments("Expected a S3 or path to get size", .{}); | ||
| } | ||
| var blob = try constructS3FileInternalStore(globalThis, path.path, options); | ||
| path_or_blob.path = .{ .fd = bun.invalid_fd }; | ||
| defer blob.deinit(); | ||
|
|
||
| return S3BlobStatTask.size(globalThis, &blob); |
There was a problem hiding this comment.
🟣 Pre-existing: S3Client.size(s3FileObj) returns NaN synchronously while S3Client.size('path/key') returns a Promise<number> from an async HEAD request — the two call signatures behave completely differently. To fix, the .blob branch should call S3BlobStatTask.size(globalThis, blob) instead of Blob.getSize(blob, globalThis).
Extended reasoning...
What the bug is and how it manifests
In S3File.zig's size() static function, the .path branch and the .blob branch take fundamentally different code paths that return different types. The .path branch constructs a temporary Blob and calls S3BlobStatTask.size(globalThis, &blob), which fires an async HTTP HEAD request to S3 and returns a Promise<number>. The .blob branch calls Blob.getSize(blob, globalThis), which for a freshly-created S3File simply reads this.size synchronously and returns NaN when the size is unknown.
The specific code path that triggers it
When a user calls S3Client.size(s3FileObj) where s3FileObj is an S3File object (e.g., created via new Bun.S3File(...) or returned from an S3 listing), the dispatch reaches the .blob branch at S3File.zig lines ~201–203:
.blob => |*blob| {
return Blob.getSize(blob, globalThis);
},Blob.getSize synchronously reads this.size, which defaults to Blob.max_size for freshly created S3 objects whose size hasn't been fetched yet.
Why existing code doesn't prevent it
In Blob.zig (around line 3231–3235), getSize explicitly handles the S3 case:
if (this.isS3()) {
return jsc.JSValue.jsNumber(std.math.nan(f64));
}So instead of querying S3, it returns NaN. There is no guard in the .blob branch of size() to detect this and fall back to the async path.
What the impact would be
Any code that calls S3Client.size(s3FileObj) on a freshly-created or non-cached S3File silently receives NaN instead of a Promise. This is silent data corruption — no exception is thrown, and the caller has no way to distinguish "size not yet fetched" from a genuine error unless they explicitly check for NaN. Meanwhile, calling with a string path works correctly and returns a real async size.
How to fix it
Replace the .blob branch in size() with a call to S3BlobStatTask.size(globalThis, blob), mirroring the .path branch and the pattern already used in the stat() and exists() functions:
.blob => |*blob| {
return S3BlobStatTask.size(globalThis, blob);
},Note that stat() already uses S3BlobStatTask.stat(globalThis, blob) in its blob branch — size() is the odd one out.
Step-by-step proof
- User creates an S3 file object:
const f = new Bun.S3File('bucket/key')— size field isBlob.max_size(unknown). - User calls
S3Client.size(f)— dispatches to.blobbranch. Blob.getSize(f, globalThis)is called — hits theif (this.isS3())check in Blob.zig:3231.- Returns
jsNumber(nan(f64))synchronously. await S3Client.size(f)resolves toNaNinstead of the real file size.- Compare:
await S3Client.size('bucket/key')goes through the.pathbranch →S3BlobStatTask.size()→ HTTP HEAD → resolves with the real integer size.
| return globalThis.throwInvalidArguments("Expected a S3 or path to get size", .{}); | ||
| } | ||
| var blob = try constructS3FileInternalStore(globalThis, path.path, options); | ||
| path_or_blob.path = .{ .fd = bun.invalid_fd }; |
There was a problem hiding this comment.
🟣 Pre-existing copy-paste bug: the stat() function in S3File.zig has two error messages that were copied from size() and never updated — both say "Expected a S3 or path to get size" instead of something referencing stat. Users who call S3Client.stat() with an invalid argument (a non-S3 blob or a file descriptor) will see a confusing error message mentioning "size". Since the PR already touches this exact function (adding the double-free fix at the same location), this would be a good opportunity to fix the misleading messages too.
Extended reasoning...
What the bug is: The stat() function in src/bun.js/webcore/S3File.zig contains two error messages that were copy-pasted from the size() function without being updated. Both guards in stat() emit the string "Expected a S3 or path to get size" — a message that only makes sense in the context of size(), not stat().
The specific code path that triggers it: The first wrong message is in the blob-type check at the top of stat(): if (path_or_blob == .blob and (path_or_blob.blob.store == null or path_or_blob.blob.store.?.data \!= .s3)) — this returns throwInvalidArguments("Expected a S3 or path to get size", ...). The second is inside the switch on path_or_blob for the .path variant, where if (path == .fd) also returns the same "get size" message. Both of these are in the function named stat, not size.
Why existing code does not prevent it: This is a straightforward copy-paste error. The size() function (around line 193) has the same two guards with "get size" messages — and those are correct there. When stat() was added or derived from size(), the error strings were not updated.
Impact: The bug is a user-facing UX issue rather than a correctness problem. Any user who accidentally passes an invalid argument to S3Client.stat() — either a non-S3 blob or a file descriptor — will receive a JavaScript error saying the expected argument was something to "get size", which is the wrong operation entirely. This could lead to confusion when debugging.
How to fix it: Change both occurrences of "Expected a S3 or path to get size" inside the stat() function to say something like "Expected a S3 or path to stat" (matching the style of nearby functions like unlink which says "to delete" and exists which says "to check if it exists").
Step-by-step proof:
- A user calls
Bun.S3Client.stat(someRegularBlob)wheresomeRegularBlobis not an S3 blob. PathOrBlob.fromJSNoCopyresolves it to.blob.- The guard
if (path_or_blob == .blob and (... .data \!= .s3))is true. globalThis.throwInvalidArguments("Expected a S3 or path to get size", ...)fires.- The user sees a JavaScript error mentioning "get size" even though they called
stat()— misleading and confusing.
| } | ||
| const options = args.nextEat(); | ||
| var blob = try constructS3FileInternalStore(globalThis, path.path, options); | ||
| path_or_blob.path = .{ .fd = bun.invalid_fd }; | ||
| defer blob.deinit(); | ||
| return try getPresignUrlFrom(&blob, globalThis, options); | ||
| }, |
There was a problem hiding this comment.
🟣 This PR's fix is incomplete: two additional double-free scenarios remain for non-ASCII (.encoded_slice) paths. Both are pre-existing issues that exist in code directly related to what this PR modifies. (1) Inside constructS3FileWithS3Credentials, if a JS getter for the type option throws after initS3() has taken ownership of the path, the inner errdefer store.deinit() and the outer errdefer path_or_blob.path.deinit() both fire on the same memory — the PR's neutralization line is only reached on success. (2) The analogous S3Client instance methods in S3Client.zig (presign, exists, size, stat, write, unlink, ~lines 138-258) follow the exact same errdefer path.deinit() / defer blob.detach() pattern but were not given the same neutralization fix applied here.
Extended reasoning...
Bug 1: Internal throw inside constructS3FileWithS3Credentials (after initS3 but before return)
The PR adds path_or_blob.path = .{ .fd = bun.invalid_fd }; immediately after constructS3FileInternalStore returns, which prevents the outer errdefer from double-freeing when a later operation (e.g., getPresignUrlFrom) throws. However, this line is only reached when constructS3FileInternalStore (and its callee constructS3FileWithS3Credentials) returns successfully.
Inside constructS3FileWithS3Credentials, the sequence is:
Blob.Store.initS3(path, ...)— stores thePathLikeinstore.data.s3.pathlikevia a value copy. For.encoded_slicepaths (non-ASCII strings whereisWTFAllocated()is false),PathLike.toThreadSafe()hits theelse => {}branch (types.zig:551) — a no-op — so the store's copy and the caller'spath_or_blob.path.pathshare the same heap pointer with no extra reference.errdefer store.deinit()is registered insideconstructS3FileWithS3Credentials.if (try opts.getTruthyComptime(globalObject, "type"))— usestry, which propagates if a JS getter throws.- If the getter throws:
errdefer store.deinit()fires ->s3.deinit(allocator)->pathlike.deinit()-> frees the shared bytes. constructS3FileInternalStorepropagates the error back to the static helper (e.g.,presign).- The PR's neutralization line is skipped (it sits after the
trycall that failed). - The outer
errdefer path_or_blob.path.deinit()fires -> double-frees the same bytes.
Proof: Bun.S3Client.presign("clave/key", { get type() { throw new Error() } }) with a non-ASCII path triggers this. The .encoded_slice variant arises when Latin-1 or non-WTF-allocated strings are passed as the path.
Bug 2: S3Client instance methods in S3Client.zig not fixed
This PR fixes the six static S3 helpers in S3File.zig. The analogous instance methods in src/bun.js/webcore/S3Client.zig (~lines 138-258: presign, exists, size, stat, write, unlink) follow an identical pattern and have the identical root cause — but were not touched by this PR.
Each instance method follows this structure:
const path: jsc.Node.PathLike = try jsc.Node.PathLike.fromJS(globalThis, &args);
errdefer path.deinit();
var blob = try S3File.constructS3FileWithS3CredentialsAndOptions(globalThis, path, ...);
defer blob.detach();
return S3File.getPresignUrlFrom(&blob, globalThis, options);For .encoded_slice PathLike (non-ASCII paths): toThreadSafe() inside initS3 is a no-op, so the store's S3.pathlike and the caller's path variable share the same raw heap pointer. When the subsequent operation throws (e.g., getPresignUrlFrom throws ERR_S3_MISSING_CREDENTIALS synchronously):
defer blob.detach()->store.deref()->store.deinit()->s3.deinit()->pathlike.deinit()-> frees byteserrdefer path.deinit()-> double-frees the same bytes
Proof: new S3Client({...}).presign("non-ascii-key") with missing credentials on a non-ASCII path triggers this. All 6 instance methods are affected the same way.
Why this is pre-existing but still relevant
Both bugs existed before this PR. However, this PR directly addresses the same class of ownership confusion in the same codebase, explicitly fixes the static method variants, and demonstrates the correct mitigation pattern. Not extending the fix to (a) the internal-throw case and (b) the instance methods leaves material double-free paths open — especially now that the fix pattern is understood and documented in this diff.
Fix for Bug 1: Neutralize the path inside constructS3FileWithS3Credentials after initS3 succeeds (e.g., use a transferred flag or restructure the errdefer to account for the transfer of ownership before calling getTruthyComptime).
Fix for Bug 2: Apply the same path = .{ .string = bun.PathString.empty }; (or equivalent) neutralization to all 6 instance methods in S3Client.zig after constructS3FileWithS3CredentialsAndOptions returns successfully, mirroring the fix applied here to the static helpers.
| } | ||
| const options = args.nextEat(); | ||
| var blob = try constructS3FileInternalStore(globalThis, path.path, options); | ||
| path_or_blob.path = .{ .fd = bun.invalid_fd }; | ||
| defer blob.deinit(); | ||
| return try getPresignUrlFrom(&blob, globalThis, options); | ||
| }, |
There was a problem hiding this comment.
🟣 Pre-existing memory leak: when any of the six static S3 helpers (presign, unlink, write, size, exists, stat) is called with a custom non-standard MIME type (e.g. { type: 'application/x-custom' }), a heap buffer is allocated for the content type but never freed when the temporary blob is cleaned up via defer blob.deinit(). To fix, Blob.deinit() should check content_type_allocated and call bun.default_allocator.free(this.content_type) before returning.
Extended reasoning...
What the bug is and how it manifests
In constructS3FileWithS3Credentials (and its sibling constructS3FileWithS3CredentialsAndOptions), when the caller passes a type option with a custom MIME string that is not found in the built-in mimeType table, the code heap-allocates a buffer:
const content_type_buf = bun.handleOom(allocator.alloc(u8, slice.len));
blob.content_type = strings.copyLowercase(slice, content_type_buf);
blob.content_type_allocated = true;
This marks the blob's content-type as caller-owned memory. The six static helpers (presign, unlink, write, size, exists, stat) each construct a temporary blob and clean it up with defer blob.deinit(). On every such call with a non-standard MIME type, the heap buffer is silently leaked.
The specific code path that triggers it
Blob.deinit() (Blob.zig ~line 3560) is:
pub fn deinit(this: *Blob) void {
this.detach(); // only derefs store
this.name.deref();
if (this.isHeapAllocated()) { bun.destroy(this); }
// content_type buffer is NEVER freed
}
It calls detach() (which only decrements the store refcount), name.deref(), and conditionally bun.destroy(this) for heap-allocated blobs. There is no check for content_type_allocated.
Why existing code does not prevent it
The only places in the codebase that free content_type when content_type_allocated == true are in specific download-task cleanup paths (around lines 2324-2325 and 2667-2668) that are never reached for these temporary static-helper blobs. Since the blobs are stack-allocated (the functions return Blob by value, not *Blob), isHeapAllocated() is false, so even the bun.destroy path is skipped. The comment in dupeWithContentType (line 3523) — 'for now, we just want to avoid a use-after-free here' — further confirms that content_type memory management is known to be incomplete.
What the impact would be
Every call to any of the six static S3 helpers with a non-standard MIME type leaks a heap buffer of len(mime_type) bytes. In long-running server processes or loops, this results in unbounded memory growth with no way for the GC to reclaim it.
How to fix it
Add content-type cleanup to Blob.deinit():
pub fn deinit(this: *Blob) void {
if (this.content_type_allocated) {
bun.default_allocator.free(this.content_type);
this.content_type = "";
this.content_type_allocated = false;
}
this.detach();
this.name.deref();
if (this.isHeapAllocated()) { bun.destroy(this); }
}
Step-by-step proof
- User calls Bun.S3Client.presign('key', { type: 'application/x-custom' })
- presign() calls constructS3FileInternalStore(..., options) -> constructS3FileWithS3Credentials
- 'application/x-custom' is not in the mimeType table -> allocator.alloc(u8, 20) allocates 20 bytes on the heap
- blob.content_type = content_type_buf; blob.content_type_allocated = true
- constructS3FileInternalStore returns successfully
- The PR neutralization line runs (path_or_blob.path = .{ .fd = bun.invalid_fd })
- getPresignUrlFrom throws ERR_S3_MISSING_CREDENTIALS
- defer blob.deinit() fires -> calls detach() and name.deref(); isHeapAllocated() is false so bun.destroy is skipped
- The 20-byte content_type_buf is permanently leaked — no pointer to it remains
There was a problem hiding this comment.
Pre-existing, out of scope for this fuzzer crash fix.
|
Local gate vs BuildKite: The local gate check can't complete in this environment — main bumped the pinned zig compiler ( BuildKite is the authoritative signal and it builds fine: on #44742 the new Manual verification on the debug-fuzz binary at the crash commit
|
…blob creation
Bun.S3Client.presign("path") with no credentials would construct the
internal blob (which takes ownership of the path string), then fail in
signRequest. At that point both `defer blob.deinit()` and the outer
`errdefer path_or_blob.path.deinit()` ran, derefing the same WTFString
twice and tripping `hasAtLeastOneRef()` in debug builds.
After the blob has been constructed, neutralize the outer errdefer's
path so it becomes a no-op on the error path. Applied to all the static
S3 helpers that share this pattern (presign/unlink/write/size/exists/stat).
Fuzzer fingerprint: 0b4166b91ae54a3f
The double-deref only trips a debug-build assertion; release builds silently over-deref without an observable failure, so the test can't satisfy fail-before on release. The existing hasAtLeastOneRef() assertion in WTFStringImpl.deref already guards this path in debug/CI.
The double-deref trips a debug-only assertion, so skip on release.
f2b9d5f to
5db67d9
Compare
|
Superseded by #30495 (same fix against current main, minimal diff, deterministic test). |
What
Bun.S3Client.presign("path")with no credentials configured would crash in debug builds withreached unreachable codeinWTFStringImplStruct.deref.Why
The static S3 helpers (
presign,unlink,write,size,exists,stat) all follow this pattern:constructS3FileInternalStoretransfers ownership of the path to the blob's store (Blob.Store.initS3stores thePathLikewithout adding a ref). IfsomeOperationthen throws — whichgetPresignUrlFromalways does synchronously when credentials are missing — bothdefer blob.deinit()and the outererrdeferfire, derefing the sameWTFStringImpltwice.In debug builds this trips
bun.assert(self.hasAtLeastOneRef()). In release it's a silent over-deref; depending on timing the freed string could be an atom belonging to another thread, which is theAtomStringImpl::removeassertion the fuzzer originally reported.Fix
After the blob is constructed, replace
path_or_blob.pathwith a.fdvariant soPathOrFileDescriptor.deinit()is a no-op and the errdefer can't double-free.Repro
Before:
panic(main thread): reached unreachable codein debug.After: throws
ERR_S3_MISSING_CREDENTIALS.Fuzzer fingerprint:
0b4166b91ae54a3f