Fix use-after-free in S3 Store.initS3 PathLike refcounting#28417
Fix use-after-free in S3 Store.initS3 PathLike refcounting#28417robobun wants to merge 17 commits into
Conversation
|
Updated 12:38 PM PT - Mar 25th, 2026
❌ @robobun, your commit c5505f1 has 1 failures in
🧪 To try this PR locally: bunx bun-pr 28417That installs a local version of the PR into your bun-28417 --bun |
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
WalkthroughUpdated ownership semantics and cleanup: Store comments now state Changes
🚥 Pre-merge checks | ✅ 2✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@test/js/bun/s3-presign-error.test.ts`:
- Around line 5-7: The test must scrub ambient AWS credential/config env vars
before spawning the child so it always hits the missing-credentials path: clone
the existing bunEnv and remove credential/config keys (eg. AWS_ACCESS_KEY_ID,
AWS_SECRET_ACCESS_KEY, AWS_SESSION_TOKEN, AWS_PROFILE, AWS_DEFAULT_PROFILE,
AWS_REGION, AWS_DEFAULT_REGION, AWS_CONFIG_FILE, AWS_SHARED_CREDENTIALS_FILE and
any other AWS_* entries) then pass that sanitized env into Bun.spawn (the
proc/Bun.spawn call using bunExe() and bunEnv). Ensure you mutate a copy (e.g.,
const env = { ...bunEnv }; delete env["..."];) and pass env instead of bunEnv so
the child cannot pick up host credentials.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 082d86b6-38d2-4e90-bcd7-2ebfa585d0e9
📒 Files selected for processing (2)
src/bun.js/webcore/blob/Store.zigtest/js/bun/s3-presign-error.test.ts
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/bun.js/webcore/S3Client.zig (1)
297-307: 🧹 Nitpick | 🔵 TrivialConsider adding ownership comment for consistency.
While
staticFilewasn't part of this fix (usesconstructInternalJSrather thanconstructS3FileWithS3CredentialsAndOptions), adding a similar ownership-transfer comment would improve code consistency and documentation.📝 Suggested documentation
const path = (try jsc.Node.PathLike.fromJS(globalThis, &args)) orelse { return globalThis.throwInvalidArguments("Expected file path string", .{}); }; - + // constructInternalJS takes ownership of path. return try S3File.constructInternalJS(globalThis, path, args.nextEat());🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/bun.js/webcore/S3Client.zig` around lines 297 - 307, Add an ownership-transfer comment for the `path` variable in `staticFile` to match other constructors: state that the `path` (created via `jsc.Node.PathLike.fromJS`) is transferred to the callee (`S3File.constructInternalJS`) and the callee assumes responsibility for freeing it; place this brief comment immediately before the `return try S3File.constructInternalJS(globalThis, path, args.nextEat());` call so readers see the ownership semantics for `path` consistent with `constructS3FileWithS3CredentialsAndOptions`.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@src/bun.js/webcore/S3Client.zig`:
- Around line 297-307: Add an ownership-transfer comment for the `path` variable
in `staticFile` to match other constructors: state that the `path` (created via
`jsc.Node.PathLike.fromJS`) is transferred to the callee
(`S3File.constructInternalJS`) and the callee assumes responsibility for freeing
it; place this brief comment immediately before the `return try
S3File.constructInternalJS(globalThis, path, args.nextEat());` call so readers
see the ownership semantics for `path` consistent with
`constructS3FileWithS3CredentialsAndOptions`.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 315d7747-1ab7-4520-95e4-0afada6425bc
📥 Commits
Reviewing files that changed from the base of the PR and between 65a7f2c7146c466877ee38bc1eda9459daecde2d and 27ec2c91ed9cc76d244b89a46e1fbc9a9f1e77f4.
📒 Files selected for processing (1)
src/bun.js/webcore/S3Client.zig
27ec2c9 to
62c4147
Compare
e3bbb03 to
265728a
Compare
There was a problem hiding this comment.
LGTM — all previously raised concerns have been addressed.
Extended reasoning...
Overview
This PR fixes a use-after-free in Store.initS3 / Store.initS3WithReferencedCredentials by transferring PathLike ownership to the construct functions, which guard cleanup via errdefer path_to_clean.deinit() and neutralize it after toThreadSafe() consumes the path. Changes touch S3Client.zig, S3File.zig, blob/Store.zig, and a new regression test.
Security risks
The fix itself addresses a memory-safety bug (UAF on WTFStringImpl). No new security surface introduced. The ownership-transfer pattern is carefully consistent across all call sites.
Level of scrutiny
This went through ~15 rounds of review with several substantive issues caught and fixed: (1) early-return leak in S3Client.write before the construct call, (2) double-free risk in that same function when the construct call itself fails, (3) env-scrubbing incompleteness in the test, (4) missing initS3 path coverage, (5) file placement. All were addressed. The final implementation is correct.
Other factors
Current diff shows the test uses stderr: "inherit" (not "pipe"), so the lingering stderr-assertion nit is moot. Both initS3WithReferencedCredentials and initS3 paths are covered. CI failures in the build log are all Windows/Darwin infrastructure flakes ("Failed to create agent", expired runners), not test failures. No bugs found by the bug hunting system.
There was a problem hiding this comment.
All outstanding concerns from prior review iterations have been addressed — LGTM.
Extended reasoning...
Overview
Fixes a use-after-free in initS3/initS3WithReferencedCredentials where toThreadSafe() consumed the underlying WTFStringImpl ref while callers still held the original PathLike. Changes touch S3Client.zig, S3File.zig, blob/Store.zig, and a new regression test.
Security risks
No security-sensitive code (auth/crypto/permissions) is involved. The fix closes a UAF that could corrupt memory or produce assertion failures, so it improves safety.
Level of scrutiny
This is non-trivial memory management in Zig with WTFStringImpl refcounting. The approach (ownership transfer with errdefer + neutralization inside the construct functions, callers no longer holding errdefs) is logically sound: if getCredentialsWithOptions throws before bun.handleOom(initS3(...)), the errdefer on path_to_clean fires safely; once initS3 returns (via handleOom, so no partial failure), toThreadSafe() has already consumed the string and path_to_clean is immediately neutralized. The special case for S3Client.write (early return before construct call) is handled with a guarded errdefer that is neutralized via owned_path before the handoff — correctly mirroring the pattern in S3File.zig.
Other factors
The PR went through 13 robobun verification iterations and 55/55 build/test steps pass for the Zig changes (only unrelated Darwin infrastructure timeouts in webview tests). All three code paths — constructS3FileWithS3CredentialsAndOptions (both initS3WithReferencedCredentials and initS3 branches) and constructS3FileWithS3Credentials (static path) — are exercised by the regression test. Env scrubbing covers AWS_, S3_, and BUN_S3_*. Using stderr: "inherit" rather than asserting stderr === "" is the correct choice since ASAN builds emit diagnostics to stderr legitimately.
There was a problem hiding this comment.
LGTM — all previously raised concerns have been addressed.
Extended reasoning...
Overview
This PR fixes a use-after-free in initS3/initS3WithReferencedCredentials where PathLike was passed by value and toThreadSafe() consumed the underlying WTFStringImpl without the caller knowing, leaving a dangling pointer for the errdefer path.deinit() to detonate on the error path. The fix transfers PathLike ownership into constructS3FileWithS3CredentialsAndOptions and constructS3FileWithS3Credentials, which add their own errdefer path_to_clean.deinit() and neutralize it after toThreadSafe() consumes the path. Four files are modified: Store.zig (comment-only), S3Client.zig (remove/restructure caller errdefs), S3File.zig (add neutralization at 6 sites), and a new regression test.
Security risks
None beyond the UAF being fixed. No auth, crypto, or permission changes.
Level of scrutiny
High — memory-safety fix in a complex ownership-transfer pattern across multiple Zig files. Required many review iterations. But the pattern is now consistent and correct throughout, and CI passed 55/55 steps.
Resolution of prior concerns
All issues I raised across previous review rounds have been closed: the write() early-return gap is correctly guarded (errdefer on the local copy, neutralize before handing owned_path to callee so the outer errdefer is a no-op if callee fails); the stderr capture-without-assertion antipattern is gone (stderr: "inherit" used instead); all three ownership-transfer code paths are covered by the regression test (initS3WithReferencedCredentials, initS3 via per-request creds, static constructS3FileWithS3Credentials); and env scrubbing covers AWS_, S3_, and BUN_S3_* keys.
There was a problem hiding this comment.
LGTM — all previously raised concerns have been resolved.
Extended reasoning...
Overview
This PR fixes a use-after-free in Store.initS3 / initS3WithReferencedCredentials where PathLike was passed by value, toThreadSafe() consumed (dereffed) the underlying WTFStringImpl, and a subsequent errdefer path.deinit() in the caller would double-deref the already-freed impl. The fix transfers ownership to the two construct functions (constructS3FileWithS3CredentialsAndOptions, constructS3FileWithS3Credentials), each of which registers its own errdefer path_to_clean.deinit() and neutralizes it after initS3/initS3WithReferencedCredentials consumes the path via toThreadSafe. In S3File.zig, path_or_blob is neutralized in-place before calling constructS3FileInternalStore. The write() case in S3Client.zig correctly guards the early-return window with an errdefer that is neutralized (via owned_path handoff) before the construct call.
Security risks
The fix addresses a memory safety bug (use-after-free / double-free) in production S3 code. The ownership transfer pattern is consistent and well-commented. No new attack surface is introduced; the change only tightens lifecycle management of internal string allocations.
Level of scrutiny
This is a correctness fix for a specific memory safety issue in low-level Zig string refcounting. The pattern (register errdefer, neutralize after ownership transfer) is applied consistently across all 8 affected call sites. The fix was verified against all 55 CI steps with 0 failures.
Other factors
All concerns raised in prior review rounds have been addressed: (1) the write() early-return gap is guarded; (2) the test covers all three ownership-transfer code paths (initS3WithReferencedCredentials, initS3, and constructS3FileWithS3Credentials via static method); (3) env scrubbing covers AWS_*, S3_*, and BUN_S3_*; (4) the stderr concern was resolved by switching to stderr: "inherit" rather than piping without asserting. The only remaining issue is a pre-existing copy-paste error in stat() error messages (unrelated to this fix).
| if (path == .fd) { | ||
| return globalThis.throwInvalidArguments("Expected a S3 or path to get size", .{}); | ||
| } | ||
| // constructS3FileInternalStore takes ownership of path. | ||
| path_or_blob = .{ .blob = .initEmpty(globalThis) }; | ||
| var blob = try constructS3FileInternalStore(globalThis, path.path, options); | ||
| defer blob.deinit(); | ||
|
|
There was a problem hiding this comment.
🟣 The stat() function in S3File.zig contains two error messages copied from size() that say "Expected a S3 or path to get size" instead of referencing stat -- callers who pass a non-S3 blob or a file descriptor to S3.stat() will see a misleading error message. This is a pre-existing copy-paste error, but the PR directly modifies this function body (adding the ownership-transfer lines at lines 601-602), making it an adjacent opportunity to fix the messages.
Extended reasoning...
What the bug is
The stat() function in src/bun.js/webcore/S3File.zig contains two error messages copied verbatim from the size() function and never updated:
- Blob validation check: throwInvalidArguments("Expected a S3 or path to get size")
- .fd path check: throwInvalidArguments("Expected a S3 or path to get size")
Both should say something like "Expected a S3 or path to stat" to correctly describe what stat() does.
The specific code path that triggers it
Any call to S3.stat() or Bun.s3.stat() that fails one of these guards produces the misleading message:
- Passing a non-S3 blob (e.g., a regular file blob) hits the blob validation check
- Passing a file descriptor path hits the .fd guard
In both cases the user sees an error about "get size" even though they called stat, not size.
Why existing code does not prevent it
This is purely a copy-paste oversight. The size() function at lines 180-210 uses identical guard messages. When stat() was created from size() as a template, these strings were never updated. There is no automated check that validates error message text against the function name.
Impact
The impact is confusion at the JavaScript API surface: a user calling Bun.s3.stat(blob) with a non-S3 blob or a file descriptor receives an error message that tells them they cannot "get size" of that path, which is irrelevant to what they were trying to do. While not a correctness bug, this is a DX issue that makes debugging harder.
How to fix it
Change both error strings in stat() from "Expected a S3 or path to get size" to "Expected a S3 or path to stat" (or similar stat-specific wording). The size() function's identical messages are correct for that function and should remain unchanged.
Step-by-step proof
- User calls Bun.s3.stat(regularFileBlob) where regularFileBlob.store.data != .s3
- Execution reaches the blob validation check in stat(): throwInvalidArguments("Expected a S3 or path to get size")
- User receives error: "Expected a S3 or path to get size"
- User is confused -- they called stat, not size; the error message does not match the operation they attempted
- The size() function contains the identical string for its own guard, which is correct for that function
- The PR adds 2 lines directly inside stat() (the path_or_blob neutralization at lines 601-602), placing the change immediately adjacent to both incorrect error messages
initS3 and initS3WithReferencedCredentials take a PathLike by value, creating a struct copy that shares the underlying WTFStringImpl without incrementing the refcount. The subsequent toThreadSafe() call may create a new thread-safe copy of the WTFStringImpl and deref the original, leaving the caller's PathLike with a dangling pointer. When the caller's errdefer later calls path.deinit(), it derefs the already-freed impl. Fix by refing the underlying string before toThreadSafe() so the caller's copy remains valid.
…ions Instead of adding ref() calls in initS3/initS3WithReferencedCredentials, fix ownership at the caller level. The construct functions (constructS3FileWithS3CredentialsAndOptions, constructS3FileWithS3Credentials) now take ownership of the PathLike and handle cleanup via errdefer if getCredentialsWithOptions throws before the path is consumed by toThreadSafe. Callers no longer errdefer path.deinit() since the construct functions own it. For static S3File functions using PathOrBlob, neutralize the errdefer by clearing the path after the construct call consumes it.
6c377a4 to
c5505f1
Compare
There was a problem hiding this comment.
LGTM — all previous feedback has been addressed.
Extended reasoning...
Overview
This PR fixes a use-after-free in Blob.Store.initS3 / initS3WithReferencedCredentials by transferring PathLike ownership into constructS3FileWithS3CredentialsAndOptions and constructS3FileWithS3Credentials. Both construct functions now register a path_to_clean errdefer and neutralize it after toThreadSafe() consumes the path. S3Client.write correctly guards the early-return gap with a local errdefer, then neutralizes via the owned_path handoff before the construct call. Six other callers in S3Client.zig drop their now-redundant errdefs. Six neutralization sites in S3File.zig prevent double-free when the path is passed to constructS3FileInternalStore.
Security risks
The bug itself was a memory-safety issue (UAF / refcount corruption). The fix is additive cleanup logic — it does not touch authentication, authorization, or credential handling in any way that could introduce new vulnerabilities. No new attack surface is introduced.
Level of scrutiny
Memory-safety fixes in Zig S3 plumbing warrant careful review, and this PR has received it: 19 iterations of robobun verification, multiple rounds of inline feedback, and 55/55 CI steps green on build #41188. All prior concerns I raised across multiple review rounds have been addressed in the final code.
Other factors
The regression test scrubs AWS_*, S3_*, and BUN_S3_* env vars, uses stderr: "inherit" (no spurious capture), and exercises all three affected code paths: initS3WithReferencedCredentials (no credential overrides), initS3 (per-request credentials), and constructS3FileWithS3Credentials (static Bun.S3Client.presign). The only remaining nit — copy-pasted "get size" error messages in stat() — is pre-existing, outside the diff scope, and non-blocking.
|
Superseded by #30495 (same fix against current main, minimal diff, deterministic test). |
initS3andinitS3WithReferencedCredentialstake aPathLikeby value, creating a struct copy that shares the underlyingWTFStringImplwithout incrementing the refcount. The subsequenttoThreadSafe()call may create a new thread-safe copy and deref the original, leaving the caller'sPathLikewith a dangling pointer.When the caller's
errdeferlater callspath.deinit(), it derefs the already-freed impl — triggering a use-after-free that manifests as either:panic: reached unreachable codein debug builds (fromassertNoExceptiondetecting corrupted JS state)ASSERTION FAILED: The string being removed is an atom in the string table of an other thread!(from freeing a WTFStringImpl that was already freed)The fix transfers PathLike ownership to the construct functions (
constructS3FileWithS3CredentialsAndOptions,constructS3FileWithS3Credentials), which handle cleanup via errdefer if they fail beforetoThreadSafe()consumes the path, and neutralize the errdefer afterward. Callers no longer errdefer path.deinit().Reproducer:
try { Bun.s3.presign("mykey"); } catch(e) {}Verification (robobun, iteration 11): CI on commit e5f81f0 (all Zig source changes) — all 55 build/test steps green across Linux x64/aarch64, musl, Windows x64/aarch64, ASAN; overall buildkite/bun marked "canceled" only because superseded by test-only commit da32e4f. Lint JavaScript and Format passed. Latest commit da32e4f only moves the test file to test/js/bun/s3/ — build #41155 pending. No TODO/FIXME/HACK/XXX in diff. Ownership transfer pattern verified correct: S3Client.zig removes errdefer path.deinit() from 6 callers (file, presign, exists, size, stat, unlink); write() retains errdefer for the early-return gap then neutralizes before handoff. S3File.zig neutralizes path_or_blob at 6 sites (presign, unlink, write, size, exists, stat) before constructS3FileInternalStore. constructS3FileWithS3CredentialsAndOptions and constructS3FileWithS3Credentials both add path_to_clean errdefer + neutralize after initS3/initS3WithReferencedCredentials consumes via toThreadSafe. Test spawns subprocess with AWS_/S3_/BUN_S3_* env scrubbed, tests both code paths, asserts ERR_S3_MISSING_CREDENTIALS + ERR_S3_INVALID_PATH + exit code 0 — would crash/panic on main. CodeRabbit nit (staticFile ownership comment) is outside diff and not blocking.
Verification (robobun, iteration 13): Commit e5f81f0 (all Zig changes) — 29 buildkite steps passed, 1 overall build marked 'canceled' (superseded); Format and Lint JavaScript green. Build #41155 (da32e4f, test-only move) — all failures are Darwin infrastructure (Expired/Canceled runners), not test failures. Latest commit 1046d82 is empty CI retry. No TODO/FIXME/HACK/XXX in diff. Code verified: S3Client.zig removes errdefer path.deinit() from 6 callers; write() correctly guards early-return gap then neutralizes via owned_path handoff. S3File.zig neutralizes path_or_blob at 6 sites before constructS3FileInternalStore. Both constructS3FileWithS3CredentialsAndOptions and constructS3FileWithS3Credentials add path_to_clean errdefer + neutralize after initS3 consumes via toThreadSafe. Test scrubs AWS_/S3_/BUN_S3_* env, tests both code paths, asserts ERR_S3_MISSING_CREDENTIALS + ERR_S3_INVALID_PATH + exit code 0. CodeRabbit nit about staticFile comment is outside diff, non-blocking.
Verification (robobun, iteration 13 final): Build #41188 (commit 1046d82): 55 steps passed, 0 failed, 4 canceled (superseded). Format + Lint JavaScript green on e5f81f0. Build #41208 (d0d15e2, test-only addition) in progress. No TODO/FIXME/HACK/XXX in diff. Code verified: S3Client.zig removes errdefer path.deinit() from 6 callers; write() guards early-return gap then neutralizes via owned_path handoff before construct call. S3File.zig neutralizes path_or_blob at 6 sites before constructS3FileInternalStore. Both constructS3FileWithS3CredentialsAndOptions and constructS3FileWithS3Credentials add path_to_clean errdefer + neutralize after initS3/initS3WithReferencedCredentials consumes via toThreadSafe. Test scrubs AWS_/S3_/BUN_S3_* env, exercises all 3 code paths (instance initS3WithReferencedCredentials, instance initS3 with changed creds, static constructS3FileWithS3Credentials), asserts ERR_S3_MISSING_CREDENTIALS + ERR_S3_INVALID_PATH + exit code 0. CodeRabbit nit about staticFile comment is outside diff, non-blocking.
Verified by robobun: Format and Lint JS pass. Buildkite build 41188 (commit 1046d82) completed 55/55 steps with 0 test failures before auto-cancel. Diff is clean (no TODO/FIXME/HACK). CodeRabbit reviews resolved — only remaining suggestion is a nit about adding an ownership comment to staticFile (not blocking). Test exercises all three ownership-transfer code paths (initS3WithReferencedCredentials, initS3 via changed credentials, and static constructS3FileWithS3Credentials) and asserts clean error instead of crash.
Verification (robobun, iteration 17): Zig sources unchanged since e5f81f0; build #41188 (commit 1046d82) passed 55/55 steps, 0 failures. Latest commit 90d1365 is test-only (+7 -3: adds static S3Client.presign path to test); Lint JavaScript green. No TODO/FIXME/HACK/XXX in diff. CodeRabbit nit (staticFile ownership comment) is outside diff, non-blocking. Test exercises all 3 ownership-transfer code paths and asserts clean error codes + exit 0.
Verification (robobun, iteration 19): Final review of commit 6c377a4 (CI retry only). Zig sources unchanged since e5f81f0. Build #41188 (commit 1046d82) passed 55/55 steps, 0 failures. Build #41251 failure (test/regression/issue/26225.test.ts on ASAN) is unrelated — that test covers multipart uploads with form-data+node-fetch, not S3 PathLike ownership. Lint JavaScript green. No TODO/FIXME/HACK/XXX in diff. Ownership transfer verified correct: S3Client.zig removes errdefer path.deinit() from 6 callers; write() guards early-return gap then neutralizes via owned_path before handoff. S3File.zig neutralizes path_or_blob at 6 call sites. Both constructS3FileWithS3CredentialsAndOptions and constructS3FileWithS3Credentials add path_to_clean errdefer + neutralize after initS3/initS3WithReferencedCredentials consumes via toThreadSafe. Test scrubs AWS_/S3_/BUN_S3_ env, exercises all 3 ownership-transfer code paths, asserts ERR_S3_MISSING_CREDENTIALS + ERR_S3_INVALID_PATH + exit 0.