fix(formdata): preserve binary data with null bytes in multipart parsing by robobun · Pull Request #27483 · oven-sh/bun

robobun · 2026-02-26T20:59:06Z

Summary

- Request.formData() truncated small binary files (≤8 bytes) at the first 0x00 (null) byte
- Root cause: FormData.Field.value used bun.Semver.String, whose inline storage mode scans for null bytes to determine string length (C-string semantics)
- Fix: Replace Field.value with a raw []const u8 slice into the input buffer, bypassing Semver.String entirely
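
The root cause above can be illustrated with a small model. This is a hedged sketch, not Bun's actual `Semver.String` code: `storeInline`, `readInline`, and `readSlice` are hypothetical names, and the 8-byte zero-padded slot is an assumption based on the description in this PR.

```typescript
// Hypothetical model of an 8-byte inline slot (assumption, not Bun's real layout).
const INLINE_CAPACITY = 8;

// Copy up to 8 bytes into a zero-padded slot; no separate length is stored.
function storeInline(data: Uint8Array): Uint8Array {
  if (data.length > INLINE_CAPACITY) {
    throw new RangeError("too long for inline storage");
  }
  const slot = new Uint8Array(INLINE_CAPACITY);
  slot.set(data);
  return slot;
}

// Recover the contents by scanning for the first 0x00 (C-string semantics).
function readInline(slot: Uint8Array): Uint8Array {
  const end = slot.indexOf(0);
  return slot.subarray(0, end === -1 ? slot.length : end);
}

// A raw slice carries an explicit start and length, so 0x00 is ordinary data.
function readSlice(input: Uint8Array, start: number, length: number): Uint8Array {
  return input.subarray(start, start + length);
}

const gzipHeader = new Uint8Array([0x1f, 0x8b, 0x08, 0x00]);
const viaInline = readInline(storeInline(gzipHeader));
const viaSlice = readSlice(gzipHeader, 0, gzipHeader.length);

console.log(viaInline.length, viaSlice.length); // prints "3 4"
```

In this model the trailing null byte of the gzip header is indistinguishable from padding, which is why only payloads small enough for inline storage were affected.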

Test plan

Added regression test test/regression/issue/27478.test.ts with 4 cases:
- Gzip header bytes ([0x1f, 0x8b, 0x08, 0x00]) - original issue repro
- All-null-byte file ([0x00, 0x00, 0x00, 0x00])
- Single null byte file ([0x00])
- 8-byte file with interleaved nulls ([0x01, 0x00, 0x02, 0x00, ...])
Tests pass with bun bd test and fail with USE_SYSTEM_BUN=1 bun test
Existing FormData tests pass (342 pass in body.test.ts, 110 pass in FormData.test.ts, 2 pass in form-data-set-append.test.js)
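
As a rough illustration of how such cases can be constructed, the sketch below builds a multipart body from raw bytes so the payload never passes through a string conversion. It is not the content of `27478.test.ts`; `ascii`, `buildMultipartBody`, `bytesEqual`, the boundary, and the field and file names are hypothetical. The 8-byte interleaved case is left out because its exact bytes are elided in the description.

```typescript
// Encode an ASCII-only string to bytes (headers and boundaries here are plain ASCII).
function ascii(text: string): Uint8Array {
  const out = new Uint8Array(text.length);
  for (let i = 0; i < text.length; i++) out[i] = text.charCodeAt(i) & 0x7f;
  return out;
}

interface MultipartBody {
  body: Uint8Array; // complete request body
  payloadOffset: number; // index at which the file bytes start
}

// Assemble a single-file multipart/form-data body around a binary payload.
function buildMultipartBody(boundary: string, payload: Uint8Array): MultipartBody {
  const head = ascii(
    "--" + boundary + "\r\n" +
      'Content-Disposition: form-data; name="file"; filename="data.bin"\r\n' +
      "Content-Type: application/octet-stream\r\n\r\n"
  );
  const tail = ascii("\r\n--" + boundary + "--\r\n");
  const body = new Uint8Array(head.length + payload.length + tail.length);
  body.set(head, 0);
  body.set(payload, head.length);
  body.set(tail, head.length + payload.length);
  return { body, payloadOffset: head.length };
}

// Byte-wise comparison with an explicit length check.
function bytesEqual(a: Uint8Array, b: Uint8Array): boolean {
  if (a.length !== b.length) return false;
  for (let i = 0; i < a.length; i++) if (a[i] !== b[i]) return false;
  return true;
}

// The three fully specified payloads from the test plan.
const fixtures: Array<[string, number[]]> = [
  ["gzip header", [0x1f, 0x8b, 0x08, 0x00]],
  ["all null bytes", [0x00, 0x00, 0x00, 0x00]],
  ["single null byte", [0x00]],
];

const results: boolean[] = [];
for (const [label, bytes] of fixtures) {
  const payload = new Uint8Array(bytes);
  const built = buildMultipartBody("sketch-boundary", payload);
  const embedded = built.body.subarray(built.payloadOffset, built.payloadOffset + payload.length);
  results.push(bytesEqual(embedded, payload));
  console.log(label, built.body.length);
}
```

In a runtime that provides the Fetch API, `body` could be passed to `new Request(...)` with a `Content-Type: multipart/form-data; boundary=sketch-boundary` header and read back through `formData()`, comparing the resulting file's length and bytes against the payload.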

🤖 Generated with Claude Code

`Field.value` used `bun.Semver.String` which treats inline data (<=8 bytes) as null-terminated strings, truncating binary file content at the first 0x00 byte. Replace with a raw `[]const u8` slice into the input buffer.

Closes #27478

Co-Authored-By: Claude <noreply@anthropic.com>

robobun · 2026-02-26T20:59:20Z

Updated 1:47 PM PT - Feb 26th, 2026

❌ @autofix-ci[bot], your commit 7d19568 has 4 failures in Build #38179 (All Failures):

test/regression/issue/11297/11297.test.ts - 1 failing on 🪟 2019 x64-baseline
test/regression/issue/11297/11297.test.ts - 1 failing on 🪟 11 aarch64
test/integration/vite-build/vite-build.test.ts - 1 failing on 🪟 11 aarch64
test/bundler/bundler_compile_autoload.test.ts - 1 failing on 🐧 13 x64-asan
test/integration/next-pages/test/dev-server.test.ts - 1 failing on 🍎 13 aarch64
test/integration/next-pages/test/dev-server.test.ts - 1 failing on 🍎 14 aarch64

🧪 To try this PR locally:

bunx bun-pr 27483

That installs a local version of the PR into your bun-27483 executable, so you can run:

bun-27483 --bun

coderabbitai · 2026-02-26T21:01:03Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Disabled knowledge base sources:

Linear integration is disabled

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 9e3330a and 7d19568.

📒 Files selected for processing (3)

docs/bundler/executables.mdx
src/url.zig
test/regression/issue/27478.test.ts

Walkthrough

Changes to enable proper null-byte preservation in multipart form-data handling by modifying the FormData field value type from a string type to a raw binary slice, along with documentation formatting updates and regression test coverage for binary file parsing.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Documentation `docs/bundler/executables.mdx` | Formatting adjustments to the Supported targets table (spacing and alignment) and expanded explanatory content for AVX2 baseline warning with details on cross-compilation and detection behavior. |
| Core FormData Binary Support `src/url.zig` | Modified Field.value declaration from `Semver.String` to `[]const u8` to support binary data with null bytes; updated field value handling in multipart body parsing and entry processing to work directly with binary slices. |
| Test Coverage `test/regression/issue/27478.test.ts` | Added regression tests verifying that multipart/form-data formData() preserves null bytes in binary files across various edge cases, including null-byte distributions and exact byte-length validation. |

🚥 Pre-merge checks | ✅ 4

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title clearly summarizes the main change: fixing FormData to preserve binary data with null bytes during multipart parsing, which aligns with the core bug fix and code changes.
Description check	✅ Passed	The description includes both required sections (What does this PR do and How did you verify your code works), with detailed explanations of the bug, root cause, fix, and comprehensive test coverage.
Linked Issues check	✅ Passed	The PR fully addresses the linked issue `#27478` by replacing Semver.String with raw []const u8 slices to preserve null bytes, adding regression tests covering the reported bug case and edge cases, and verifying backward compatibility.
Out of Scope Changes check	✅ Passed	All changes are directly related to fixing issue `#27478`: src/url.zig modifies FormData.Field to preserve binary data, docs/bundler/executables.mdx contains only formatting updates unrelated to the fix, and test file adds regression tests.


claude

LGTM — targeted fix replacing Semver.String (which truncates at null bytes) with a raw slice for binary field values, well-contained and well-tested.

Extended reasoning...

Overview

This PR fixes a bug where Request.formData() truncated small binary files at the first 0x00 byte. The change touches two files: src/url.zig (the fix) and a new regression test. The fix replaces FormData.Field.value from bun.Semver.String to []const u8, and updates the two sites where it is assigned and read.

Security risks

No security concerns. This fix actually improves correctness by preserving binary data integrity — the previous behavior silently truncated data, which could be considered a data integrity issue.

Level of scrutiny

Production-critical code (multipart form data parsing), but the change is minimal and well-scoped. I verified that FormData.Field is only used within url.zig itself — forEachMultipartEntry is called only from toJSFromMultipartData in the same file. The body slice points into the input buffer passed to the parser, which has sufficient lifetime throughout the callback execution.

Other factors

The filename and content_type fields correctly remain as bun.Semver.String since they contain text, not arbitrary binary data.
Four regression test cases cover the original issue, all-null bytes, single null byte, and 8-byte files with interleaved nulls (the Semver.String inline storage boundary).
Existing FormData test suites reportedly pass (342 + 110 + 2 tests).

claude

LGTM — targeted fix replacing Semver.String (C-string semantics) with a raw byte slice for FormData field values, with good regression tests.

Extended reasoning...

Overview

This PR fixes a bug where Request.formData() truncated small binary files at the first null byte. The core change is in src/url.zig: FormData.Field.value is changed from bun.Semver.String (which uses C-string null-termination for inline storage) to []const u8 (a raw byte slice). Two usage sites are updated accordingly. The PR also includes a benign markdown table formatting fix in docs/bundler/executables.mdx and a new regression test file.

Security risks

No security risks. The change actually fixes a data integrity issue — binary data was being silently truncated. The new code uses a direct slice into the input buffer, which has correct lifetime semantics (the input buffer outlives the callback).

Level of scrutiny

This is production-critical code (HTTP body parsing), so high scrutiny is warranted. However, the change is minimal (3 lines of Zig), the Field type is only used internally within url.zig (verified via grep — only two usage sites, both updated), and the fix is straightforward: replacing a string type that has null-termination semantics with a raw byte slice for binary data.

Other factors

No external callers access Field.value directly — the type is internal to the multipart parsing logic.
Regression tests cover the original issue (gzip header bytes), all-null files, single null byte, and 8-byte files with interleaved nulls.
The PR description confirms existing FormData tests continue to pass (342 + 110 + 2 tests).
CodeRabbit found no actionable issues.


The multipart parser stored part name, filename, and content-type as bun.Semver.String, which packs offset and length into 32-bit fields (with bit 31 of length stolen as a tag). For any part whose header sat past 4 GiB in the request body, the offset wrapped and the parser read garbage: field names came back as bytes from the middle of the preceding file body, and the part was unreachable by name.

This is the remaining half of the bug behind #21490. The file body itself (Field.value) was already switched to a raw slice in #27483, which fixed the 2 GiB length truncation, but name/filename/content_type were still u32-indexed.

Switch all Field slices to raw []const u8 and drop the subslicer. Also remove a leftover @as(u32, @intCast(bytes.len)) in ArrayBuffer.fromBytes whose target fields are usize; it panicked on >4 GiB buffers in debug builds on the client-side FormData serialization path.

Fixes #21490
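
The arithmetic behind the 4 GiB and 2 GiB limits described above can be sketched as follows. This is a minimal model under stated assumptions: `toU32` and `fitsInLength` are hypothetical helpers, and simple modulo wrapping stands in for whatever the real 32-bit field assignment did.

```typescript
const GIB = 1024 * 1024 * 1024;
const U32_RANGE = 4 * GIB; // number of distinct values in a 32-bit field

// Assumed behaviour: storing into a 32-bit field keeps only the low 32 bits.
function toU32(value: number): number {
  return value % U32_RANGE;
}

// With bit 31 used as a tag, only 31 bits remain for the length.
const MAX_TAGGED_LENGTH = 2 * GIB - 1;

function fitsInLength(byteLength: number): boolean {
  return byteLength <= MAX_TAGGED_LENGTH;
}

// A part header located 100 bytes past the 4 GiB mark.
const trueOffset = 4 * GIB + 100;
const storedOffset = toU32(trueOffset);

console.log(storedOffset); // prints 100
console.log(MAX_TAGGED_LENGTH); // prints 2147483647
console.log(fitsInLength(2 * GIB)); // prints false
```

Under this model the stored offset points near the start of the request body rather than at the real header, which matches the description of names being read from unrelated bytes.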


github-actions Bot added the claude label Feb 26, 2026

[autofix.ci] apply automated fixes

7d19568

claude Bot reviewed Feb 26, 2026

View reviewed changes

Jarred-Sumner merged commit a870e7b into main Feb 28, 2026
60 of 64 checks passed

Jarred-Sumner deleted the claude/fix-formdata-null-byte-truncation-27478 branch February 28, 2026 10:42

robobun mentioned this pull request Mar 11, 2026

FormData multipart parsing truncates binary file content at null bytes (≤8 bytes files) #26740

Closed

github-actions Bot mentioned this pull request Mar 24, 2026

Update oven/bun Docker tag to v1.3.11 claytono/infra#1770

Merged

1 task

This was referenced May 13, 2026

fix(formdata): multipart parser corrupts parts located past 4 GiB #30609

Open

Bun Server Uploads Stop at 1,58 GB Despite Configured Higher Limit #21490

Open

Jarred-Sumner merged 2 commits into main from claude/fix-formdata-null-byte-truncation-27478
