
fix(zlib): prevent use-after-free in WorkPool compression operations #28250

Closed
robobun wants to merge 5 commits into main from claude/fix-zlib-uaf-22567


@robobun robobun commented Mar 18, 2026

Collaborator

Summary

  • Fix use-after-free crash when piping HTTP responses through zlib.createGunzip() (and other compression streams)
  • Hold strong GC references to the input/output buffer JSValues during async WorkPool compression and decompression, so the buffers cannot be garbage-collected while native threads are still using the underlying memory
  • Apply the fix consistently across NativeZlib, NativeBrotli, and NativeZstd

Root Cause

In CompressionStream.write(), raw pointers to JS ArrayBuffer backing stores are extracted and passed to a WorkPool thread for async compression/decompression. However, no strong references held the original JS buffer objects alive, so the GC could collect them between scheduling the WorkPool task and its completion — causing a segfault when the native thread accesses freed memory.

The fix adds in_buf_value and out_buf_value Strong.Optional fields that are set before scheduling work and cleared when the work completes.
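
The lifecycle described above can be modelled in a few lines of TypeScript. This is a conceptual sketch only, not Bun's Zig implementation: `StrongSlot`, `ModelStream`, and `finishWork` are hypothetical names standing in for `Strong.Optional`, `CompressionStream`, and the completion callback, and the GC root is simulated with an ordinary field.

```typescript
// Conceptual model only: a plain field stands in for a GC root
// (Strong.Optional in the PR). All names here are hypothetical.
class StrongSlot<T> {
  private value: T | undefined;
  set(v: T): void { this.value = v; }
  clear(): void { this.value = undefined; }
  has(): boolean { return this.value !== undefined; }
}

class ModelStream {
  readonly inBufValue = new StrongSlot<Uint8Array>();
  readonly outBufValue = new StrongSlot<Uint8Array>();
  writeInProgress = false;

  // Mirrors write(): root both buffers *before* handing work to another thread.
  write(input: Uint8Array, output: Uint8Array): void {
    if (this.writeInProgress) throw new Error("write already in progress");
    this.writeInProgress = true;
    this.inBufValue.set(input);
    this.outBufValue.set(output);
    // ... the real code would schedule the WorkPool task here ...
  }

  // Mirrors the completion callback: release the roots once the work is done.
  finishWork(): void {
    this.writeInProgress = false;
    this.inBufValue.clear();
    this.outBufValue.clear();
  }
}

const stream = new ModelStream();
stream.write(new Uint8Array(16), new Uint8Array(64));
console.log(stream.inBufValue.has(), stream.outBufValue.has()); // true true
stream.finishWork();
console.log(stream.inBufValue.has(), stream.outBufValue.has()); // false false
```

The point of the model is the ordering: the references are taken before the work is scheduled and dropped only after it completes, so there is no window in which the worker holds a raw pointer to an unrooted buffer.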

Closes #22567

Test plan

  • Added regression test test/regression/issue/22567.test.ts that pipes HTTP responses through createGunzip() with forced GC
  • Existing zlib tests pass (deflate-streaming, zlib-handle-bounds-check, bytesWritten)
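
The workload the regression test targets can be sketched roughly as follows. This is not the contents of test/regression/issue/22567.test.ts: it swaps the HTTP server for an in-memory gzip payload, and `gunzipThroughStream` is a hypothetical helper. It only shows the shape of the scenario, with chunks flowing through `zlib.createGunzip()` while a collection is requested where the runtime exposes one.

```typescript
import * as zlib from "node:zlib";
import { Readable } from "node:stream";

// Hypothetical helper: stream `compressed` through createGunzip() and collect the output.
function gunzipThroughStream(compressed: Buffer): Promise<string> {
  return new Promise((resolve, reject) => {
    const gunzip = zlib.createGunzip();
    const chunks: Buffer[] = [];
    gunzip.on("data", (chunk: Buffer) => {
      chunks.push(chunk);
      // Request a collection mid-stream if one is exposed
      // (Bun.gc in Bun, global.gc under `node --expose-gc`).
      const g = globalThis as any;
      if (typeof g.Bun?.gc === "function") g.Bun.gc(true);
      else if (typeof g.gc === "function") g.gc();
    });
    gunzip.on("end", () => resolve(Buffer.concat(chunks).toString()));
    gunzip.on("error", reject);
    Readable.from([compressed]).pipe(gunzip);
  });
}

// Build a payload similar in shape to the one quoted in the review below.
const lines: string[] = [];
for (let i = 0; i < 50; i++) {
  lines.push("Line " + i + ": " + Buffer.alloc(80, "x").toString() + "\n");
}
const payload = lines.join("");

gunzipThroughStream(zlib.gzipSync(payload)).then((text) => {
  console.log(text === payload ? "OK" : "MISMATCH");
});
```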

🤖 Generated with Claude Code

Hold strong GC references to input/output buffer JSValues during async
WorkPool decompression to prevent the garbage collector from collecting
them while the native thread is still reading/writing the underlying
memory.

Closes #22567

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

robobun commented Mar 18, 2026

Collaborator Author
Updated 10:17 AM PT - Mar 18th, 2026

❌ Your commit 5334fb6f has 1 failure in Build #40012 (All Failures):


🧪   To try this PR locally:

bunx bun-pr 28250

That installs a local version of the PR into your bun-28250 executable, so you can run:

bun-28250 --bun


coderabbitai Bot commented Mar 18, 2026

Contributor

Warning

Rate limit exceeded

@robobun has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 1 minute and 45 seconds before requesting another review.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: a41e31df-dbd8-45d5-9b1f-2a5d0a94a259

📥 Commits

Reviewing files that changed from the base of the PR and between 7b3a1c5 and 469a0d4.

📒 Files selected for processing (1)
  • test/regression/issue/22567.test.ts

Walkthrough

Holds strong references to input and output buffers during async zlib/Brotli/Zstd WorkPool operations, releases them after work completes or on close, and adds a regression test that pipes HTTP responses through zlib.createGunzip() to verify no crash.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Native bindings: Brotli/Zlib/Zstd (src/bun.js/node/zlib/NativeBrotli.zig, src/bun.js/node/zlib/NativeZlib.zig, src/bun.js/node/zlib/NativeZstd.zig) | Added in_buf_value and out_buf_value (jsc.Strong.Optional) fields to hold strong refs to input/output buffers; updated deinit to deinitialize these fields. |
| Runtime zlib binding implementation (src/bun.js/node/node_zlib_binding.zig) | CompressionStream.write now stores strong references to buffers before scheduling async work; runFromJSThread releases those references after work completes; closeInternal deinitializes them on close. |
| Regression test (test/regression/issue/22567.test.ts) | Added test that starts an HTTP server sending gzip data and performs parallel requests piping responses through zlib.createGunzip(), triggers GC, and asserts process prints "OK" without crashing. |
🚥 Pre-merge checks | ✅ 4
✅ Passed checks (4 passed)
| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title accurately describes the main change: preventing use-after-free bugs in WorkPool compression operations by holding strong GC references to buffers. |
| Description check | ✅ Passed | The PR description covers the summary, root cause, fix approach, and test plan; it follows the repository template with 'What does this PR do?' and 'How did you verify your code works?' sections. |
| Linked Issues check | ✅ Passed | All coding requirements from issue #22567 are met: strong buffer references (in_buf_value/out_buf_value) are added across NativeZlib, NativeBrotli, NativeZstd; regression test verifies the fix prevents segfaults. |
| Out of Scope Changes check | ✅ Passed | All changes are scoped to the use-after-free fix: new fields in compression stream structs, buffer lifecycle management, and a targeted regression test; no unrelated modifications detected. |


@coderabbitai coderabbitai Bot left a comment

Contributor

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@test/regression/issue/22567.test.ts`:
- Line 53: Remove the timer-based watchdog: delete the setTimeout(() => {
console.error("timeout"); process.exit(1); }, 10000) from the spawned test
script and instead rely on the test harness awaiting the completion condition
(e.g., await the child process exit/promise or the specific condition you
already assert). Ensure any spawn logic (the code that calls setTimeout) returns
or awaits the promise that resolves/rejects on the real completion condition so
the test framework's timeout controls failures rather than an explicit
process.exit call.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: f43cc32f-a540-4b05-be48-72e2066e4f36

📥 Commits

Reviewing files that changed from the base of the PR and between a7e9abe and 2dbecdd.

📒 Files selected for processing (5)
  • src/bun.js/node/node_zlib_binding.zig
  • src/bun.js/node/zlib/NativeBrotli.zig
  • src/bun.js/node/zlib/NativeZlib.zig
  • src/bun.js/node/zlib/NativeZstd.zig
  • test/regression/issue/22567.test.ts

Comment thread test/regression/issue/22567.test.ts Outdated
@github-actions

Contributor

Found 5 issues this PR may fix:

  1. Safari WebSocket crashes with perMessageDeflate when mixing server.publish() and ws.send() #25673 - Safari WebSocket crashes with perMessageDeflate compression, exactly the type of use-after-free issue in compression operations that this PR addresses
  2. deflateSync produces binary that inflateSync fails to decompress #8886 - deflateSync produces binary that inflateSync fails to decompress, likely caused by memory corruption during async compression operations
  3. Decompression error: ZlibError - empty chunked gzip response breaks fetch() #23149 - Decompression error with ZlibError on empty chunked gzip responses, could be caused by GC collecting buffers during decompression
  4. Bun.zstdDecompress lacks decompressed size limit #28162 - Bun.zstdDecompress crashes, could benefit from the buffer management improvements in zstd operations
  5. GZIP negative windowBits argument not working #6280 - GZIP negative windowBits argument not working, may be related to memory corruption during gzip operations

If this is helpful, consider adding Fixes #<number> to the PR description to auto-close the issue on merge.

🤖 Generated with Claude Code

Claude Bot and others added 2 commits March 18, 2026 17:00
Remove the manual setTimeout watchdog from the spawned test subprocess
and rely on the test harness timeout instead. The subprocess exits via
process.exit(0) in the server.close() callback.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

@coderabbitai coderabbitai Bot left a comment

Contributor

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@test/regression/issue/22567.test.ts`:
- Line 21: Replace the repetitive string construction in the test's gzip.write
call: instead of using "x".repeat(80) produce the 80-byte repeated payload via
Buffer.alloc(80, 'x').toString() so the test follows the repository convention;
update the gzip.write invocation that currently builds "Line " + i + ": " +
"x".repeat(80) + "\\n" to use Buffer.alloc(80, 'x').toString() for the repeated
content.
- Around line 32-50: The http.get call can fail before its response callback,
leaving completed unchanged and the test hung; attach an error handler to the
Request object returned by http.get (the same call that currently sets up the
response callback) and on error log the error and call process.exit(1) so
failures surface immediately; ensure you reference the existing completed/total
variables and still call server.close if needed (or exit immediately) to avoid
hanging processes.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: c8e54036-e9c6-424f-8b36-7b7c8f5a1081

📥 Commits

Reviewing files that changed from the base of the PR and between 2dbecdd and 7b3a1c5.

📒 Files selected for processing (1)
  • test/regression/issue/22567.test.ts

const gzip = zlib.createGzip();
gzip.pipe(res);
for (let i = 0; i < 50; i++) {
gzip.write("Line " + i + ": " + "x".repeat(80) + "\\n");

Contributor

⚠️ Potential issue | 🟡 Minor

Use Buffer.alloc(...).toString() for repetitive test payload content.

Line 21 currently uses "x".repeat(80), which conflicts with repository testing conventions.

♻️ Suggested fix
-    gzip.write("Line " + i + ": " + "x".repeat(80) + "\\n");
+    gzip.write("Line " + i + ": " + Buffer.alloc(80, "x").toString() + "\\n");

As per coding guidelines: "Use Buffer.alloc(count, fill).toString() instead of 'A'.repeat(count) to create repetitive strings in tests."
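
For reference, the two spellings build the same string when the fill is a single-byte character, so the suggested change should not alter the payload. A quick check, written here for illustration rather than taken from the PR:

```typescript
// Both expressions yield 80 "x" characters; Buffer.alloc fills the
// buffer by repeating the given string.
const viaRepeat = "x".repeat(80);
const viaBuffer = Buffer.alloc(80, "x").toString();

console.log(viaBuffer === viaRepeat); // true
console.log(viaBuffer.length);        // 80
```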


Comment thread test/regression/issue/22567.test.ts
Claude Bot and others added 2 commits March 18, 2026 17:16
Handle connection-level errors on http.get to prevent the test from
hanging if the request itself fails before receiving a response.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
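
One way to attach the request-level error handler described in the review is sketched below. `fetchGunzipped` is a hypothetical helper, not code from the PR's test; it rejects when the connection itself fails, so the failure is reported to the caller rather than going unhandled.

```typescript
import * as http from "node:http";
import * as zlib from "node:zlib";

// Hypothetical helper: GET `url`, gunzip the body, and surface request-level failures.
function fetchGunzipped(url: string): Promise<string> {
  return new Promise((resolve, reject) => {
    const req = http.get(url, (res) => {
      const gunzip = zlib.createGunzip();
      const chunks: Buffer[] = [];
      gunzip.on("data", (chunk: Buffer) => chunks.push(chunk));
      gunzip.on("end", () => resolve(Buffer.concat(chunks).toString()));
      gunzip.on("error", reject);
      res.on("error", reject);
      res.pipe(gunzip);
    });
    // A failed connection never runs the response callback above,
    // so the error has to be handled on the request object itself.
    req.on("error", reject);
  });
}
```

In a spawned test script, the caller could log the rejection and call `process.exit(1)`, which matches what the review asks for.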

@claude claude Bot left a comment

Contributor

LGTM — the fix correctly mirrors the existing this_value strong-reference pattern to also root the input/output buffers during async WorkPool operations.

Extended reasoning...

Overview

This PR fixes a use-after-free crash in zlib/Brotli/Zstd compression streams by holding strong GC references (jsc.Strong.Optional) to input and output JS buffer objects while their backing memory is being used by a WorkPool thread. The fix adds in_buf_value and out_buf_value fields to all three native compression types (NativeZlib, NativeBrotli, NativeZstd) and manages their lifecycle in CompressionStream.write(), runFromJSThread(), closeInternal(), and deinit(). A regression test is also included. As a bonus, the PR fixes a pre-existing bug in NativeZstd where this.this_value.deinit() was missing from deinit().

Security risks

This PR improves memory safety by preventing a use-after-free. There are no new security risks introduced — the change only adds GC rooting for buffers that were previously unrooted during async operations.

Level of scrutiny

The fix is mechanical and follows an established pattern already used for this_value in the same code. The strong references are set in the exact same location where this_value.set() is called, and cleared in the exact same locations where this_value is cleared/deinited. The write_in_progress guard prevents concurrent writes, so there is no risk of overwriting a previous strong reference. The pattern is applied identically across all three compression types.

Other factors

  • The coderabbitai comments about setTimeout and http.get error handling have been addressed in follow-up commits.
  • The remaining unaddressed nit about "x".repeat(80) vs Buffer.alloc is cosmetic and does not affect correctness.
  • The bug report about the explicit test timeout (15_000ms) is a valid convention violation but is a minor nit that does not affect the correctness of the fix itself.

}
expect(stdout.trim()).toBe("OK");
expect(exitCode).toBe(0);
}, 15_000);

Contributor

🟡 The test sets an explicit 15_000 ms timeout at line 68, violating the project convention in test/CLAUDE.md: "CRITICAL: Do not set a timeout on tests. Bun already has timeouts." Remove the second argument from the test() call.

Extended reasoning...

What the issue is

The regression test at test/regression/issue/22567.test.ts line 68 passes 15_000 as the second argument to test(), which sets an explicit 15-second timeout on the test case. The project's test/CLAUDE.md (line 120) explicitly states: "CRITICAL: Do not set a timeout on tests. Bun already has timeouts."

The specific code

Line 68 reads:

}, 15_000);

This is the closing of the test("pipe HTTP response through createGunzip without crash", async () => { ... }, 15_000) call. The 15_000 is the timeout parameter.

Why this matters

While this is a convention/style violation rather than a functional bug, the CLAUDE.md rule exists for good reason: Bun's test runner already provides default timeouts, and explicit timeouts create maintenance burden and inconsistency across the test suite. The rule is marked as "CRITICAL" in the project guidelines.

Impact

The test will still pass correctly with or without the explicit timeout. This is purely a convention violation. It's worth noting that ~20 other regression test files also have explicit timeouts, so this is a widespread pattern, but new code should follow the documented convention.

How to fix

Simply remove the , 15_000 argument from line 68, changing:

}, 15_000);

to:

});

Bun's default test timeout will handle the case where the test hangs.



Development

Successfully merging this pull request may close these issues.

cant NOT reproduced, close it if not help

2 participants