Skip to content

[codex] Fix S3 arrayBuffer native buffer leak#29776

Closed
sorenbs wants to merge 2 commits into
oven-sh:mainfrom
sorenbs:claude/fix-s3-arraybuffer-leak
Closed

[codex] Fix S3 arrayBuffer native buffer leak#29776
sorenbs wants to merge 2 commits into
oven-sh:mainfrom
sorenbs:claude/fix-s3-arraybuffer-leak

Conversation

@sorenbs

@sorenbs sorenbs commented Apr 27, 2026

Copy link
Copy Markdown

Created by Codex.

Fixes #29083.

Summary

This fixes the native RSS leak when reading S3-backed blobs with arrayBuffer() and other body materialization paths.

  • Treat S3 download response bytes as a temporary owned buffer instead of cloning from a stable blob store.
  • Free temporary bytes in the JSON and FormData conversion helpers.
  • Keep the in-memory Blob FormData path using .clone so existing blob storage is not freed accidentally.
  • Add a local MinIO-backed regression test for Bun.s3.file().arrayBuffer() RSS retention.

Root Cause

doReadFromS3() passed downloaded response bytes to the body conversion functions with lifetime .clone.

For arrayBuffer(), that produced the expected JavaScript result, but the source HTTP response body buffer remained allocated natively. Since that source buffer was not attached to the resulting JS ArrayBuffer, JS heap, external memory, and arrayBuffers stayed bounded while anonymous RSS grew with every S3 download.

The fixed path marks S3 download bytes as .temporary, which lets the existing conversion helpers consume/free those response bytes after producing the JS value.

Reproduction

I used MinIO in Docker and a temporary Linux container harness with an 8 MiB S3 object and a 1 GiB memory limit.

Original Bun (oven/bun:1.3.11) was OOM-killed with exit 137:

Run Iteration VmRSS RssAnon
original, last sample before OOM 108 1,061,912 KiB / 1037.02 MiB 1,025,036 KiB / 1001.01 MiB

The patched Linux release build continued well past that point:

Run Iteration VmRSS RssAnon
patched, before ending 512 162,056 KiB / 158.26 MiB 117,528 KiB / 114.77 MiB
patched, after 30s settle 512 112,332 KiB / 109.70 MiB 67,804 KiB / 66.21 MiB

The original run still had only one visible 8 MiB JS ArrayBuffer before the OOM kill, which matches this being a native response-buffer lifetime issue rather than JS heap growth.

Validation

  • USE_SYSTEM_BUN=1 bun test test/js/bun/s3/s3.leak.test.ts fails the new regression on installed Bun 1.3.13 as expected.
  • bun bd --asan=off test test/js/bun/s3/s3.leak.test.ts -t "arrayBuffer"
  • bun bd --asan=off test test/js/bun/s3
  • bun bd --asan=off test test/js/web/fetch/body.test.ts test/js/web/fetch/blob.test.ts test/js/web/fetch/body-stream.test.ts test/js/web/fetch/body-mixin-errors.test.ts
  • git diff --check

I also attempted bun bd --asan=off test; that run stopped in test/v8/v8.test.ts while node-gyp was downloading/building Node 24.3.0 headers and the hook timed out. That failure was unrelated to the S3 change.

@sorenbs sorenbs marked this pull request as ready for review April 28, 2026 12:19

@claude claude Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Claude Code Review

This pull request is from a fork — automated review is disabled. A repository maintainer can comment @claude review to run a one-time review.

@coderabbitai

coderabbitai Bot commented Apr 28, 2026

Copy link
Copy Markdown
Contributor

Walkthrough

This PR refactors Blob lifetime handling to prevent retained buffers during JSON and FormData conversions, particularly fixing RSS growth when downloading from S3. It also introduces a Docker-based leak detection test that validates RSS bounds during repeated S3 ArrayBuffer operations.

Changes

Cohort / File(s) Summary
Blob Lifetime Management
src/bun.js/webcore/Blob.zig
Refactored lifetime handling for JSON and FormData conversions: doReadFromS3 marks downloaded buffers as .temporary, toJSONWithBytes uses unconditional early defer for cleanup, toFormDataWithBytes now captures and respects the lifetime parameter and frees temporary buffers, and toFormData decouples from caller's lifetime by always forwarding .clone to toFormDataWithBytes.
S3 Leak Testing Infrastructure
test/js/bun/s3/s3-arraybuffer-leak-fixture.js, test/js/bun/s3/s3.leak.test.ts
Added fixture script for S3 RSS monitoring that allocates configurable payloads, uploads to S3, downloads repeatedly as ArrayBuffer, and validates RSS does not exceed baseline plus threshold. Integrated MinIO-based leak test suite with Docker setup, bucket provisioning, and spawned Bun process validation.
🚥 Pre-merge checks | ✅ 4
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title '[codex] Fix S3 arrayBuffer native buffer leak' clearly and concisely describes the main change: fixing a native buffer memory leak in S3 arrayBuffer operations.
Description check ✅ Passed The PR description comprehensively covers both template sections with a detailed summary of changes, root cause analysis, reproduction steps, and validation testing.
Linked Issues check ✅ Passed All code changes directly address issue #29083 requirements: treating S3 bytes as temporary, freeing buffers in conversion helpers, preserving Blob FormData path, and adding MinIO regression test.
Out of Scope Changes check ✅ Passed All changes are tightly scoped to fixing the S3 arrayBuffer leak: Blob.zig lifetime management updates, new S3 fixture script, and MinIO test integration—no unrelated modifications detected.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@test/js/bun/s3/s3.leak.test.ts`:
- Around line 53-75: Remove the unconditional empty-stderr assertion for the
Bun.spawnSync call: instead of asserting stderr.toString() === "" after running
the subprocess (the spawn in this block that returns exitCode and stderr), rely
on checking exitCode to detect regressions; update the test to drop or guard the
expect(stderr.toString()).toBe("") line and only assert expect(exitCode).toBe(0)
(or, if you need optional warnings, conditionally allow known ASAN startup
warnings from stderr before failing).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 36a6940b-709f-4955-a34e-9ddf1cf19c09

📥 Commits

Reviewing files that changed from the base of the PR and between 4d615e8 and 14d3b6b.

📒 Files selected for processing (3)
  • src/bun.js/webcore/Blob.zig
  • test/js/bun/s3/s3-arraybuffer-leak-fixture.js
  • test/js/bun/s3/s3.leak.test.ts

Comment on lines +53 to +75
const { exitCode, stderr } = Bun.spawnSync(
[bunExe(), "--smol", path.join(dir, "s3-arraybuffer-leak-fixture.js")],
{
env: {
...bunEnv,
BUN_JSC_gcMaxHeapSize: "503316",
AWS_ACCESS_KEY_ID: minioOptions!.accessKeyId as string,
AWS_SECRET_ACCESS_KEY: minioOptions!.secretAccessKey as string,
AWS_ENDPOINT: minioOptions!.endpoint as string,
AWS_BUCKET: "buntest",
PAYLOAD_MIB: "1",
WARMUP_ITERATIONS: "8",
ITERATIONS: "80",
MAX_ALLOWED_RSS_INCREMENT_MB: "64",
},
stderr: "pipe",
stdout: "inherit",
stdin: "ignore",
},
);

expect(stderr.toString()).toBe("");
expect(exitCode).toBe(0);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Drop the unconditional empty-stderr assertion for this subprocess.

This child runs JS via bunExe(), so ASAN/debug builds can emit the known startup warning on stderr even with bunEnv. That makes Line 74 a spurious failure while the actual regression signal here is exitCode.

Suggested change
-      const { exitCode, stderr } = Bun.spawnSync(
+      const { exitCode } = Bun.spawnSync(
         [bunExe(), "--smol", path.join(dir, "s3-arraybuffer-leak-fixture.js")],
         {
           env: {
             ...bunEnv,
             BUN_JSC_gcMaxHeapSize: "503316",
@@
-      expect(stderr.toString()).toBe("");
       expect(exitCode).toBe(0);
Based on learnings: JS-executing `bunExe()` subprocesses can emit the `WARNING: ASAN interferes...` line on stderr, and for crash-detection-style tests in `test/js/bun/**`, `exitCode` is the reliable regression guard while empty-stderr assertions add brittle maintenance surface.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/js/bun/s3/s3.leak.test.ts` around lines 53 - 75, Remove the
unconditional empty-stderr assertion for the Bun.spawnSync call: instead of
asserting stderr.toString() === "" after running the subprocess (the spawn in
this block that returns exitCode and stderr), rely on checking exitCode to
detect regressions; update the test to drop or guard the
expect(stderr.toString()).toBe("") line and only assert expect(exitCode).toBe(0)
(or, if you need optional warnings, conditionally allow known ASAN startup
warnings from stderr before failing).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Bun.S3File.arrayBuffer() retains RSS and reaches OOM in a 1 GiB Linux container despite forced GC

2 participants