Skip to content

Make bun.webcore.Blob smaller and ref-counted#23015

Merged
taylordotfish merged 1 commit into
mainfrom
taylor.fish/blob
Sep 27, 2025
Merged

Make bun.webcore.Blob smaller and ref-counted#23015
taylordotfish merged 1 commit into
mainfrom
taylor.fish/blob

Conversation

@taylordotfish

Copy link
Copy Markdown
Contributor

Reduce the size of bun.webcore.Blob from 120 bytes to 96. Also make it ref-counted: in-progress work on improving the bindings generator depends on this, as it means C++ can pass a pointer to the Blob to Zig without risking it being destroyed if the GC collects the associated JSBlob.

Note that this PR depends on #23013.

(For internal tracking: fixes STAB-1289, STAB-1290)

@linear

linear Bot commented Sep 26, 2025

Copy link
Copy Markdown

@robobun

robobun commented Sep 26, 2025

Copy link
Copy Markdown
Collaborator
Updated 4:09 PM PT - Sep 26th, 2025

@taylordotfish, your commit c66bca5 has 3 failures in Build #27190 (All Failures):


🧪   To try this PR locally:

bunx bun-pr 23015

That installs a local version of the PR into your bun-23015 executable, so you can run:

bun-23015 --bun

@coderabbitai

coderabbitai Bot commented Sep 26, 2025

Copy link
Copy Markdown
Contributor

Walkthrough

Refactors Blob to be heap-managed and reference-counted with new ref/deref APIs, removes explicit allocator assignments across call sites, updates lifecycle/ownership checks, adjusts server routes to assert non-heap blobs, modifies macro runner blob handling, and changes C++ Blob binding to use an opaque BlobImpl with updated APIs.

Changes

Cohort / File(s) Summary of changes
Blob core: refcounting and opaque impl
src/bun.js/webcore/Blob.zig, src/bun.js/bindings/blob.h, src/bun.js/bindings/blob.cpp
Introduces heap-managed, ref-counted Blob with Blob__ref/Blob__deref; adds ownership/heap checks; replaces allocator field usage; adds Charset; switches C++ binding to opaque BlobImpl with Ref/RefPtr traits; updates API (Blob__setAsFile now void) and call sites; memoryCost uses impl().
Allocator removals and ownership alignment
src/StandaloneModuleGraph.zig, src/bun.js/api/BunObject.zig, src/bun.js/webcore/ObjectURLRegistry.zig, src/bun.js/webcore/S3Client.zig, src/bun.js/webcore/S3File.zig
Removes explicit Blob allocator assignments; simplifies S3File construction to return Blob.new directly; maintains existing control flow while relying on new Blob ownership semantics.
Body handling updates
src/bun.js/webcore/Body.zig
Replaces allocator-based logic with isHeapAllocated checks and bun.default_allocator for slices/MIME; updates Blob creation paths to Blob.new without setting allocator.
Server routes: enforce non-heap blobs
src/bun.js/api/server/FileRoute.zig, src/bun.js/api/server/StaticRoute.zig
Replaces blob.allocator=null with assertions ensuring blobs are not heap-allocated before duplication/usage.
Macro runner blob refactor
src/ast/Macro.zig
Changes blob type to pointer-to-const Blob; assigns resp directly; returns Expr.fromBlob immediately; removes explicit deinit and allocator manipulation.
Read file ownership transfer
src/bun.js/webcore/blob/read_file.zig
Replaces clearing allocator with context.takeOwnership() to transfer Blob ownership after promise swap.

Possibly related PRs

Suggested reviewers

  • zackradisic
  • markovejnovic

Pre-merge checks and finishing touches

❌ Failed checks (2 warnings)
Check name Status Explanation Resolution
Description Check ⚠️ Warning The description does not follow the repository’s required template headings and omits the “How did you verify your code works?” section, making it unclear how the changes were tested despite explaining the purpose. Please restructure the description to include the exact template headings “### What does this PR do?” and “### How did you verify your code works?” and provide concrete testing steps or results under the verification section.
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (3 passed)
Check name Status Explanation
Title Check ✅ Passed The title succinctly captures the primary goals of the changeset by specifying both the reduction in size and the addition of reference counting for bun.webcore.Blob, matching the PR’s main objectives without extra noise.
Linked Issues Check ✅ Passed The implementation adds explicit reference counting, removes the allocator field to shrink the struct, and updates the public API in Blob.zig as required by STAB-1289 and STAB-1290, fully satisfying both linked issues’ objectives.
Out of Scope Changes Check ✅ Passed All code modifications across Zig and C++ files directly address the new reference-counted design or accommodate the removal of the allocator field, and no unrelated functionality or modules were altered.
✨ Finishing touches
  • 📝 Generate Docstrings
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch taylor.fish/blob

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 8

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
src/bun.js/bindings/blob.cpp (1)

1-3: Include root.h per bindings guidance.

Bindings typically include root.h first to satisfy lints and config. Please add it.

+#include "root.h"
 #include "blob.h"
 #include "ZigGeneratedClasses.h"

As per coding guidelines

src/bun.js/webcore/Blob.zig (1)

4777-4815: Fix inverted heap-allocation check

new() initializes #ref_count to 1 for heap-backed blobs, while setNotHeapAllocated() resets it to 0 for stack/value blobs. With the current isHeapAllocated() returning true when the raw value is 0, every heap allocation now fails the finalize() assert, and non-heap copies take the bun.destroy(this) branch in deinit(), leading to double-frees/UAF. Flip the predicate so only > 0 is considered heap-allocated.

 pub fn isHeapAllocated(self: *const Blob) bool {
-    return self.#ref_count.raw_value == 0;
+    return self.#ref_count.raw_value != 0;
 }
🧹 Nitpick comments (1)
src/bun.js/bindings/blob.h (1)

14-16: Confirm extern ref/deref return types.

The externs return void*, but callers ignore the result. If not required by the Zig ABI, consider making them return void to avoid confusion. If mandated, ignore.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Disabled knowledge base sources:

  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between a329da9 and d64d0689f0432c9dbe044a0bd33895e5f64c277e.

📒 Files selected for processing (17)
  • src/StandaloneModuleGraph.zig (0 hunks)
  • src/ast/Macro.zig (2 hunks)
  • src/bun.js/api/BunObject.zig (0 hunks)
  • src/bun.js/api/server/FileRoute.zig (2 hunks)
  • src/bun.js/api/server/StaticRoute.zig (1 hunks)
  • src/bun.js/bindings/blob.cpp (2 hunks)
  • src/bun.js/bindings/blob.h (2 hunks)
  • src/bun.js/webcore/Blob.zig (24 hunks)
  • src/bun.js/webcore/Body.zig (3 hunks)
  • src/bun.js/webcore/ObjectURLRegistry.zig (0 hunks)
  • src/bun.js/webcore/S3Client.zig (0 hunks)
  • src/bun.js/webcore/S3File.zig (1 hunks)
  • src/bun.js/webcore/blob/read_file.zig (1 hunks)
  • src/ptr.zig (2 hunks)
  • src/ptr/external_shared.zig (1 hunks)
  • src/ptr/raw_ref_count.zig (1 hunks)
  • src/string/WTFStringImpl.zig (1 hunks)
💤 Files with no reviewable changes (4)
  • src/bun.js/webcore/ObjectURLRegistry.zig
  • src/bun.js/api/BunObject.zig
  • src/StandaloneModuleGraph.zig
  • src/bun.js/webcore/S3Client.zig
👮 Files not reviewed due to content moderation or server errors (2)
  • src/bun.js/webcore/S3File.zig
  • src/bun.js/webcore/blob/read_file.zig
🧰 Additional context used
📓 Path-based instructions (8)
src/**/*.zig

📄 CodeRabbit inference engine (.cursor/rules/building-bun.mdc)

Implement debug logs in Zig using const log = bun.Output.scoped(.${SCOPE}, false); and invoking log("...", .{})

Files:

  • src/bun.js/api/server/StaticRoute.zig
  • src/ast/Macro.zig
  • src/bun.js/api/server/FileRoute.zig
  • src/bun.js/webcore/S3File.zig
  • src/string/WTFStringImpl.zig
  • src/ptr/external_shared.zig
  • src/ptr/raw_ref_count.zig
  • src/bun.js/webcore/blob/read_file.zig
  • src/bun.js/webcore/Body.zig
  • src/bun.js/webcore/Blob.zig
  • src/ptr.zig
**/*.zig

📄 CodeRabbit inference engine (.cursor/rules/javascriptcore-class.mdc)

**/*.zig: Declare the extern C symbol in Zig and export a Zig-friendly alias for use
Wrap the Bun____toJS extern in a Zig method that takes a JSGlobalObject and returns JSC.JSValue

**/*.zig: Format Zig files with zig-format (bun run zig-format)
In Zig, manage memory carefully with allocators and use defer for cleanup

Files:

  • src/bun.js/api/server/StaticRoute.zig
  • src/ast/Macro.zig
  • src/bun.js/api/server/FileRoute.zig
  • src/bun.js/webcore/S3File.zig
  • src/string/WTFStringImpl.zig
  • src/ptr/external_shared.zig
  • src/ptr/raw_ref_count.zig
  • src/bun.js/webcore/blob/read_file.zig
  • src/bun.js/webcore/Body.zig
  • src/bun.js/webcore/Blob.zig
  • src/ptr.zig
src/{**/js_*.zig,bun.js/api/**/*.zig}

📄 CodeRabbit inference engine (.cursor/rules/registering-bun-modules.mdc)

Use bun.JSError!JSValue for proper error propagation in JS-exposed Zig functions

Files:

  • src/bun.js/api/server/StaticRoute.zig
  • src/bun.js/api/server/FileRoute.zig
src/bun.js/**/*.zig

📄 CodeRabbit inference engine (.cursor/rules/zig-javascriptcore-classes.mdc)

src/bun.js/**/*.zig: In Zig binding structs, expose generated bindings via pub const js = JSC.Codegen.JS and re-export toJS/fromJS/fromJSDirect
Constructors and prototype methods should return bun.JSError!JSC.JSValue to integrate Zig error handling with JS exceptions
Use parameter name globalObject (not ctx) and accept (*JSC.JSGlobalObject, *JSC.CallFrame) in binding methods/constructors
Implement getters as get(this, globalObject) returning JSC.JSValue and matching the .classes.ts interface
Provide deinit() for resource cleanup and finalize() that calls deinit(); use bun.destroy(this) or appropriate destroy pattern
Access JS call data via CallFrame (argument(i), argumentCount(), thisValue()) and throw errors with globalObject.throw(...)
For properties marked cache: true, use the generated Zig accessors (NameSetCached/GetCached) to work with GC-owned values
In finalize() for objects holding JS references, release them using .deref() before destroy

Files:

  • src/bun.js/api/server/StaticRoute.zig
  • src/bun.js/api/server/FileRoute.zig
  • src/bun.js/webcore/S3File.zig
  • src/bun.js/webcore/blob/read_file.zig
  • src/bun.js/webcore/Body.zig
  • src/bun.js/webcore/Blob.zig
**/*.{cpp,h}

📄 CodeRabbit inference engine (.cursor/rules/javascriptcore-class.mdc)

**/*.{cpp,h}: When exposing a JS class with public Constructor and Prototype, define three C++ types: class Foo : public JSC::DestructibleObject (if it has C++ fields), class FooPrototype : public JSC::JSNonFinalObject, and class FooConstructor : public JSC::InternalFunction
If the class has C++ data members, inherit from JSC::DestructibleObject and provide proper destruction; if it has no C++ fields (only JS properties), avoid a class and use JSC::constructEmptyObject(vm, structure) with putDirectOffset
Prefer placing the subspaceFor implementation in the .cpp file rather than the header when possible

Files:

  • src/bun.js/bindings/blob.h
  • src/bun.js/bindings/blob.cpp
**/*.{cpp,cc,cxx,h,hpp,hxx}

📄 CodeRabbit inference engine (CLAUDE.md)

Format C/C++ sources/headers with clang-format (bun run clang-format)

Files:

  • src/bun.js/bindings/blob.h
  • src/bun.js/bindings/blob.cpp
**/*.cpp

📄 CodeRabbit inference engine (.cursor/rules/javascriptcore-class.mdc)

**/*.cpp: Include "root.h" at the top of C++ binding files to satisfy lints
Define prototype properties using a const HashTableValue array and declare accessors/functions with JSC_DECLARE_* macros
Prototype classes should subclass JSC::JSNonFinalObject, provide create/createStructure, DECLARE_INFO, finishCreation that reifies static properties, and set mayBePrototype on the Structure
Custom getters should use JSC_DEFINE_CUSTOM_GETTER, jsDynamicCast to validate this, and throwThisTypeError on mismatch
Custom setters should use JSC_DEFINE_CUSTOM_SETTER, validate this via jsDynamicCast, and store via WriteBarrier/set semantics
Prototype functions should use JSC_DEFINE_HOST_FUNCTION, validate this with jsDynamicCast, and return encoded JSValue
Constructors should subclass JSC::InternalFunction, return internalFunctionSpace in subspaceFor, set the prototype property as non-configurable/non-writable, and provide create/createStructure
Provide a setup function that builds the Prototype, Constructor, and Structure, and assigns them to the LazyClassStructure initializer
Use the cached Structure via globalObject->m_.get(globalObject) when constructing instances
Expose constructors to Zig via an extern "C" function that returns the constructor from the LazyClassStructure
Provide an extern "C" Bun____toJS function that creates an instance using the cached Structure and returns an EncodedJSValue

Files:

  • src/bun.js/bindings/blob.cpp
src/bun.js/bindings/**/*.cpp

📄 CodeRabbit inference engine (CLAUDE.md)

src/bun.js/bindings/**/*.cpp: For JS classes with a public constructor, implement Foo, FooPrototype, and FooConstructor (JSC patterns)
Define class properties using HashTableValue arrays in JSC bindings
Add iso subspaces for classes with C++ fields in JSC bindings
Cache structures in ZigGlobalObject for JSC-bound classes

Files:

  • src/bun.js/bindings/blob.cpp
🧠 Learnings (18)
📓 Common learnings
Learnt from: CR
PR: oven-sh/bun#0
File: .cursor/rules/registering-bun-modules.mdc:0-0
Timestamp: 2025-08-30T00:11:57.076Z
Learning: Applies to src/**/js_*.zig : Handle reference counting correctly with ref()/deref() in JS-facing Zig code
📚 Learning: 2025-08-30T00:13:36.815Z
Learnt from: CR
PR: oven-sh/bun#0
File: .cursor/rules/zig-javascriptcore-classes.mdc:0-0
Timestamp: 2025-08-30T00:13:36.815Z
Learning: Applies to src/bun.js/**/*.zig : Constructors and prototype methods should return bun.JSError!JSC.JSValue to integrate Zig error handling with JS exceptions

Applied to files:

  • src/bun.js/webcore/S3File.zig
📚 Learning: 2025-08-30T00:13:36.815Z
Learnt from: CR
PR: oven-sh/bun#0
File: .cursor/rules/zig-javascriptcore-classes.mdc:0-0
Timestamp: 2025-08-30T00:13:36.815Z
Learning: Applies to src/bun.js/bindings/generated_classes_list.zig : Update src/bun.js/bindings/generated_classes_list.zig to include new classes

Applied to files:

  • src/bun.js/webcore/S3File.zig
  • src/bun.js/webcore/Body.zig
  • src/bun.js/webcore/Blob.zig
📚 Learning: 2025-08-30T00:11:00.890Z
Learnt from: CR
PR: oven-sh/bun#0
File: .cursor/rules/javascriptcore-class.mdc:0-0
Timestamp: 2025-08-30T00:11:00.890Z
Learning: Applies to **/*.zig : Declare the extern C symbol in Zig and export a Zig-friendly alias for use

Applied to files:

  • src/string/WTFStringImpl.zig
  • src/ptr.zig
📚 Learning: 2025-09-08T04:44:59.101Z
Learnt from: CR
PR: oven-sh/bun#0
File: CLAUDE.md:0-0
Timestamp: 2025-09-08T04:44:59.101Z
Learning: Applies to src/bun.js/bindings/**/*.cpp : Add iso subspaces for classes with C++ fields in JSC bindings

Applied to files:

  • src/bun.js/bindings/blob.h
📚 Learning: 2025-09-08T00:41:12.052Z
Learnt from: CR
PR: oven-sh/bun#0
File: src/bun.js/bindings/v8/CLAUDE.md:0-0
Timestamp: 2025-09-08T00:41:12.052Z
Learning: Applies to src/bun.js/bindings/v8/**/*.{h,cpp} : Use JSC::WriteBarrier for heap-allocated references and implement visitChildren() for custom heap objects

Applied to files:

  • src/bun.js/bindings/blob.h
📚 Learning: 2025-08-30T00:11:00.890Z
Learnt from: CR
PR: oven-sh/bun#0
File: .cursor/rules/javascriptcore-class.mdc:0-0
Timestamp: 2025-08-30T00:11:00.890Z
Learning: Applies to **/*.cpp : Provide an extern "C" Bun__<Type>__toJS function that creates an instance using the cached Structure and returns an EncodedJSValue

Applied to files:

  • src/bun.js/bindings/blob.h
  • src/bun.js/bindings/blob.cpp
📚 Learning: 2025-08-30T00:11:57.076Z
Learnt from: CR
PR: oven-sh/bun#0
File: .cursor/rules/registering-bun-modules.mdc:0-0
Timestamp: 2025-08-30T00:11:57.076Z
Learning: Applies to src/**/js_*.zig : Handle reference counting correctly with ref()/deref() in JS-facing Zig code

Applied to files:

  • src/ptr/raw_ref_count.zig
  • src/ptr.zig
📚 Learning: 2025-09-02T17:14:01.470Z
Learnt from: taylordotfish
PR: oven-sh/bun#22227
File: src/ptr/shared.zig:470-470
Timestamp: 2025-09-02T17:14:01.470Z
Learning: In Zig, regular arithmetic operations (like +=) have built-in overflow detection in debug/safe builds and will panic automatically. Atomic operations (like fetchAdd) do not have this automatic safety checking, so explicit overflow checks must be added manually. This is why NonAtomicCount doesn't need explicit overflow checks while AtomicCount does.

Applied to files:

  • src/ptr/raw_ref_count.zig
📚 Learning: 2025-08-30T00:11:00.890Z
Learnt from: CR
PR: oven-sh/bun#0
File: .cursor/rules/javascriptcore-class.mdc:0-0
Timestamp: 2025-08-30T00:11:00.890Z
Learning: Applies to **/*.zig : Wrap the Bun__<Type>__toJS extern in a Zig method that takes a JSGlobalObject and returns JSC.JSValue

Applied to files:

  • src/bun.js/bindings/blob.cpp
  • src/bun.js/webcore/Blob.zig
📚 Learning: 2025-09-08T04:44:59.101Z
Learnt from: CR
PR: oven-sh/bun#0
File: CLAUDE.md:0-0
Timestamp: 2025-09-08T04:44:59.101Z
Learning: Applies to src/bun.js/bindings/**/*.cpp : Cache structures in ZigGlobalObject for JSC-bound classes

Applied to files:

  • src/bun.js/bindings/blob.cpp
📚 Learning: 2025-09-03T05:09:24.272Z
Learnt from: Jarred-Sumner
PR: oven-sh/bun#22335
File: src/http.zig:1376-1393
Timestamp: 2025-09-03T05:09:24.272Z
Learning: In bun's HTTP module, `bun.http.default_allocator` is the correct allocator to use for internal HTTP buffers like MutableString initialization for response_message_buffer and compressed_body. This allocator is initialized in HTTPThread.zig as `bun.http.default_arena.allocator()` and is used consistently across the HTTP module, separate from any local `default_allocator` variables.

Applied to files:

  • src/bun.js/webcore/Body.zig
📚 Learning: 2025-09-02T18:27:23.279Z
Learnt from: taylordotfish
PR: oven-sh/bun#22227
File: src/collections/multi_array_list.zig:24-24
Timestamp: 2025-09-02T18:27:23.279Z
Learning: The `#allocator` syntax in bun's custom Zig implementation is valid and does not require quoting with @"#allocator". Private fields using the `#` prefix work correctly throughout the codebase without special quoting syntax.

Applied to files:

  • src/bun.js/webcore/Body.zig
📚 Learning: 2025-09-02T17:09:17.391Z
Learnt from: taylordotfish
PR: oven-sh/bun#22227
File: src/allocators.zig:778-782
Timestamp: 2025-09-02T17:09:17.391Z
Learning: In Zig's bun codebase, the `isDefault` function in `src/allocators.zig` only needs to compare `allocator.vtable == c_allocator.vtable` rather than also checking the `ptr` field, because: (1) the codebase never creates multiple allocators that use `c_allocator.vtable` but have different `ptr`s, and (2) the default allocator vtable ignores the `ptr` field anyway, so any allocators sharing the same vtable would function identically.

Applied to files:

  • src/bun.js/webcore/Body.zig
📚 Learning: 2025-08-30T00:13:36.815Z
Learnt from: CR
PR: oven-sh/bun#0
File: .cursor/rules/zig-javascriptcore-classes.mdc:0-0
Timestamp: 2025-08-30T00:13:36.815Z
Learning: Applies to src/bun.js/**/*.zig : In Zig binding structs, expose generated bindings via pub const js = JSC.Codegen.JS<ClassName> and re-export toJS/fromJS/fromJSDirect

Applied to files:

  • src/bun.js/webcore/Blob.zig
📚 Learning: 2025-08-30T00:11:57.076Z
Learnt from: CR
PR: oven-sh/bun#0
File: .cursor/rules/registering-bun-modules.mdc:0-0
Timestamp: 2025-08-30T00:11:57.076Z
Learning: Applies to src/{**/js_*.zig,bun.js/api/**/*.zig} : Use bun.JSError!JSValue for proper error propagation in JS-exposed Zig functions

Applied to files:

  • src/bun.js/webcore/Blob.zig
📚 Learning: 2025-09-19T19:55:22.427Z
Learnt from: markovejnovic
PR: oven-sh/bun#22568
File: src/valkey/js_valkey_functions.zig:853-854
Timestamp: 2025-09-19T19:55:22.427Z
Learning: In Bun's JSPromise API, `resolvedPromiseValue(globalThis: *JSGlobalObject, value: JSValue) JSValue` returns a JSValue directly (not a JSPromise that needs `.toJS()` called on it). This is different from `resolvedPromise()` which returns a *JSPromise that would need `.toJS()`.

Applied to files:

  • src/bun.js/webcore/Blob.zig
📚 Learning: 2025-08-30T00:13:36.815Z
Learnt from: CR
PR: oven-sh/bun#0
File: .cursor/rules/zig-javascriptcore-classes.mdc:0-0
Timestamp: 2025-08-30T00:13:36.815Z
Learning: Applies to src/bun.js/**/*.zig : Use parameter name globalObject (not ctx) and accept (*JSC.JSGlobalObject, *JSC.CallFrame) in binding methods/constructors

Applied to files:

  • src/bun.js/webcore/Blob.zig
🧬 Code graph analysis (1)
src/bun.js/bindings/blob.cpp (2)
src/bun.js/bindings/node/http/JSHTTPParser.h (1)
  • impl (56-56)
src/bun.js/bindings/blob.h (4)
  • Blob (52-55)
  • Blob (57-60)
  • Blob (62-65)
  • Blob (87-92)
🔇 Additional comments (13)
src/ptr.zig (2)

16-16: LGTM: clean re-exports for ExternalShared.

Exporting ExternalShared from ptr is straightforward and aligns with the new shared-ownership utilities.


26-28: LGTM: RawRefCount exposure.

Re-exporting raw_ref_count and RawRefCount keeps API surface cohesive.

src/ast/Macro.zig (2)

345-354: Early-return refactor looks good (and avoids manual deinit).

Returning Expr.fromBlob directly simplifies control flow and aligns with ref-counted ownership.

Please confirm Expr.fromBlob takes responsibility for any necessary ref/deref on Blob inputs.


328-338: No action needed: value.as(jsc.WebCore.Blob) returns ?*jsc.WebCore.Blob and Expr.fromBlob expects *const jsc.WebCore.Blob.

src/string/WTFStringImpl.zig (2)

221-225: Correct ExternalShared descriptor exposure.

Providing external_shared_descriptor with ref/deref matches the ExternalShared contract.


227-229: WTFString alias via ExternalShared: LGTM.

This neatly exposes a shared, externally-managed handle for WTF::StringImpl.

src/bun.js/webcore/Body.zig (1)

1294-1299: LGTM: Content-Type parsing uses default allocator consistently.

The switch to bun.default_allocator for header slice and MimeType init aligns with the allocator simplification.

src/bun.js/bindings/blob.cpp (2)

5-5: Signature change to void: OK.

Updating Blob__setAsFile to void and adjusting call sites is consistent and keeps behavior unchanged.


29-32: Using impl() in memoryCost is correct post-refactor.

This matches the new BlobImpl-backed design.

src/bun.js/bindings/blob.h (4)

20-43: Ref/deref traits look correct and idiomatic.

Traits forward to C externs and integrate with Ref/RefPtr nicely.


44-66: Factory methods align with adopted BlobImpl.

create/createAdopted paths consistently dupe/adopt the opaque impl; good.


80-86: Constructor adopts BlobImpl via RefPtr: LGTM.

Initializer uses adoptRef with custom traits—matches intended lifetime semantics.


94-96: RefPtr member: LGTM.

m_impl as BlobRefPtr correctly encapsulates the impl lifetime.

Comment thread src/bun.js/api/server/FileRoute.zig
Comment thread src/bun.js/api/server/FileRoute.zig
Comment thread src/bun.js/api/server/StaticRoute.zig
Comment thread src/bun.js/webcore/Body.zig
Comment thread src/bun.js/webcore/Body.zig
Comment thread src/ptr/external_shared.zig
Comment thread src/ptr/raw_ref_count.zig
Comment thread src/ptr/raw_ref_count.zig

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
src/ptr/external_shared.zig (1)

10-11: Strengthen compile-time checks for descriptor signatures.

Ensure both functions match fn(*T) void to catch mismatch early.

-    _ = T.external_shared_descriptor.ref; // must define a `ref` function
-    _ = T.external_shared_descriptor.deref; // must define a `deref` function
+    comptime {
+        const FnExpected = fn(*T) void;
+        if (@TypeOf(T.external_shared_descriptor.ref) != FnExpected)
+            @compileError("external_shared_descriptor.ref must be: fn(*T) void");
+        if (@TypeOf(T.external_shared_descriptor.deref) != FnExpected)
+            @compileError("external_shared_descriptor.deref must be: fn(*T) void");
+    }
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Disabled knowledge base sources:

  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between d64d0689f0432c9dbe044a0bd33895e5f64c277e and 32ec8a421a937c404539e8c5567551bc246bfdfd.

📒 Files selected for processing (1)
  • src/ptr/external_shared.zig (1 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
src/**/*.zig

📄 CodeRabbit inference engine (.cursor/rules/building-bun.mdc)

Implement debug logs in Zig using const log = bun.Output.scoped(.${SCOPE}, false); and invoking log("...", .{})

Files:

  • src/ptr/external_shared.zig
**/*.zig

📄 CodeRabbit inference engine (.cursor/rules/javascriptcore-class.mdc)

**/*.zig: Declare the extern C symbol in Zig and export a Zig-friendly alias for use
Wrap the Bun____toJS extern in a Zig method that takes a JSGlobalObject and returns JSC.JSValue

**/*.zig: Format Zig files with zig-format (bun run zig-format)
In Zig, manage memory carefully with allocators and use defer for cleanup

Files:

  • src/ptr/external_shared.zig
🔇 Additional comments (2)
src/ptr/external_shared.zig (2)

53-103: Optional’s methods must use its own Self type via @this(); current signatures won’t resolve and mis-type receivers/returns.

Inside the nested struct, refer to itself with a local alias (e.g., SelfOptional = @this()). This also updates field name per the prior fix.

-        pub const Optional = struct {
-            #impl: ?*T = null,
+        pub const Optional = struct {
+            const SelfOptional = @This();
+            raw: ?*T = null,

-            pub fn initNull() Optional {
+            pub fn initNull() SelfOptional {
                 return .{};
             }

             /// `incremented_raw`, if non-null, should have already had its ref count incremented
             /// by 1.
-            pub fn adopt(incremented_raw: ?*T) Optional {
-                return .{ .#impl = incremented_raw };
+            pub fn adopt(incremented_raw: ?*T) SelfOptional {
+                return .{ .raw = incremented_raw };
             }

-            pub fn deinit(self: *Optional) void {
-                if (self.#impl) |impl| {
+            pub fn deinit(self: *SelfOptional) void {
+                if (self.raw) |impl| {
                     T.external_shared_descriptor.deref(impl);
                 }
                 self.* = undefined;
             }

-            pub fn get(self: Optional) ?*T {
-                return self.#impl;
+            pub fn get(self: SelfOptional) ?*T {
+                return self.raw;
             }

             /// Sets `self` to null.
-            pub fn take(self: *Optional) ?NonOptional {
-                const result: NonOptional = .{ .#impl = self.#impl orelse return null };
-                self.#impl = null;
+            pub fn take(self: *SelfOptional) ?NonOptional {
+                const result: NonOptional = .{ .raw = self.raw orelse return null };
+                self.raw = null;
                 return result;
             }

-            pub fn clone(self: Optional) Optional {
-                if (self.#impl) |impl| {
+            pub fn clone(self: SelfOptional) SelfOptional {
+                if (self.raw) |impl| {
                     T.external_shared_descriptor.ref(impl);
                 }
                 return self;
             }

-            pub fn cloneFromRaw(raw: ?*T) Optional {
+            pub fn cloneFromRaw(raw: ?*T) SelfOptional {
                 if (raw) |some_raw| {
                     T.external_shared_descriptor.ref(some_raw);
                 }
-                return .{ .#impl = raw };
+                return .{ .raw = raw };
             }

             /// Returns the raw pointer without decrementing the ref count. Invalidates `self`.
-            pub fn leak(self: *Optional) ?*T {
+            pub fn leak(self: *SelfOptional) ?*T {
                 defer self.* = undefined;
-                return self.#impl;
+                return self.raw;
             }
         };

15-49: Fix invalid Zig identifier '#impl' (won’t compile) — rename to a valid field like 'raw' and update all uses.

Zig identifiers can’t contain '#'. This breaks parsing everywhere the field is referenced.

Apply this diff:

-        #impl: *T,
+        raw: *T,

         /// `incremented_raw` should have already had its ref count incremented by 1.
         pub fn adopt(incremented_raw: *T) Self {
-            return .{ .#impl = incremented_raw };
+            return .{ .raw = incremented_raw };
         }

         /// Deinitializes the shared pointer, decrementing the ref count.
         pub fn deinit(self: *Self) void {
-            T.external_shared_descriptor.deref(self.#impl);
+            T.external_shared_descriptor.deref(self.raw);
             self.* = undefined;
         }

         /// Gets the underlying pointer. This pointer may not be valid after `self` is
         /// deinitialized.
         pub fn get(self: Self) *T {
-            return self.#impl;
+            return self.raw;
         }

         /// Clones the shared pointer, incrementing the ref count.
         pub fn clone(self: Self) Self {
-            T.external_shared_descriptor.ref(self.#impl);
+            T.external_shared_descriptor.ref(self.raw);
             return self;
         }

         pub fn cloneFromRaw(raw: *T) Self {
             T.external_shared_descriptor.ref(raw);
-            return .{ .#impl = raw };
+            return .{ .raw = raw };
         }

         /// Returns the raw pointer without decrementing the ref count. Invalidates `self`.
         pub fn leak(self: *Self) *T {
             defer self.* = undefined;
-            return self.#impl;
+            return self.raw;
         }

         /// Invalidates `self`.
         pub fn intoOptional(self: *Self) Optional {
             defer self.* = undefined;
-            return .{ .#impl = self.#impl };
+            return .{ .raw = self.raw };
         }

Also applies to: 105-109

⛔ Skipped due to learnings
Learnt from: taylordotfish
PR: oven-sh/bun#22227
File: src/allocators/allocation_scope.zig:284-314
Timestamp: 2025-09-02T18:25:27.976Z
Learning: In bun's custom Zig implementation, the `#` prefix for private fields is valid syntax and should not be flagged as invalid. The syntax `#fieldname` creates private fields that cannot be accessed from outside the defining struct, and usage like `self.#fieldname` is correct within the same struct. This applies to fields like `#parent`, `#state`, `#allocator`, `#trace`, etc. throughout the codebase.
Learnt from: CR
PR: oven-sh/bun#0
File: .cursor/rules/javascriptcore-class.mdc:0-0
Timestamp: 2025-08-30T00:11:00.890Z
Learning: Applies to **/*.zig : Declare the extern C symbol in Zig and export a Zig-friendly alias for use
Learnt from: taylordotfish
PR: oven-sh/bun#22227
File: src/safety/alloc.zig:93-95
Timestamp: 2025-09-02T17:14:46.924Z
Learning: In bun's Zig codebase, they use a custom extension of Zig that supports private field syntax with the `#` prefix (e.g., `#allocator`, `#trace`). This is not standard Zig syntax but is valid in their custom implementation. Fields prefixed with `#` are private fields that cannot be accessed from outside the defining struct.
Learnt from: CR
PR: oven-sh/bun#0
File: .cursor/rules/registering-bun-modules.mdc:0-0
Timestamp: 2025-08-30T00:11:57.076Z
Learning: Applies to src/**/js_*.zig : Handle reference counting correctly with ref()/deref() in JS-facing Zig code
Learnt from: taylordotfish
PR: oven-sh/bun#22227
File: src/bun.zig:668-675
Timestamp: 2025-09-02T18:32:32.309Z
Learning: The `std.meta.trait` module has been removed from the Zig standard library, which will cause compilation failures in functions that use it like `std.meta.trait.isSingleItemPtr()`. This affects functions like `isHeapMemory()` in `src/bun.zig`.
Learnt from: cirospaciari
PR: oven-sh/bun#22842
File: src/bun.js/webcore/ResumableSink.zig:274-276
Timestamp: 2025-09-25T18:14:27.722Z
Learning: In Zig code, private fields are declared and accessed using the `#` prefix. When a field is declared as `#field_name`, it must be accessed as `this.#field_name`, not `this.field_name`. The `#` prefix is part of the private field access syntax and should not be removed.

Comment thread src/ptr/external_shared.zig Outdated

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
src/bun.js/webcore/Blob.zig (2)

3464-3472: Fix tri-state ASCII detection: current code won’t compile.

this.isAllASCII() orelse this.store.?.is_all_ascii yields a plain bool, but it’s later treated as ?bool (== null and .?). Make it explicitly ?bool.

Apply this diff:

-    const could_be_all_ascii = this.isAllASCII() orelse this.store.?.is_all_ascii;
-    if (could_be_all_ascii == null or !could_be_all_ascii.?) {
+    var could_be_all_ascii: ?bool = this.isAllASCII();
+    if (could_be_all_ascii == null and this.store != null) {
+        could_be_all_ascii = this.store.?.is_all_ascii;
+    }
+    if (could_be_all_ascii == null or !could_be_all_ascii.?) {

3572-3581: Repeat the same tri-state fix in JSON path.

Same ?bool vs bool issue here; mirror the fix to avoid compile errors.

Apply this diff:

-    const could_be_all_ascii = this.isAllASCII() orelse this.store.?.is_all_ascii;
+    var could_be_all_ascii: ?bool = this.isAllASCII();
+    if (could_be_all_ascii == null and this.store != null) {
+        could_be_all_ascii = this.store.?.is_all_ascii;
+    }
🧹 Nitpick comments (2)
src/ptr/external_shared.zig (1)

53-103: Minor: prefer @this() alias inside Optional for consistency.

Defining a local alias (e.g., const SelfOptional = @This();) and using it for receivers/returns improves clarity and prevents future rename hazards.

src/bun.js/webcore/Blob.zig (1)

3232-3240: Finalizer refcount drop is fine; consider a more direct expression.

var shared = Blob.Ref.adopt(this); shared.deinit(); effectively does a single deref(). For readability, consider calling Blob__deref(this) directly.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Disabled knowledge base sources:

  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 32ec8a421a937c404539e8c5567551bc246bfdfd and a0070f0410ea33943054ee5ddb906aee48181b5a.

📒 Files selected for processing (2)
  • src/bun.js/webcore/Blob.zig (24 hunks)
  • src/ptr/external_shared.zig (1 hunks)
🧰 Additional context used
📓 Path-based instructions (3)
src/**/*.zig

📄 CodeRabbit inference engine (.cursor/rules/building-bun.mdc)

Implement debug logs in Zig using const log = bun.Output.scoped(.${SCOPE}, false); and invoking log("...", .{})

Files:

  • src/ptr/external_shared.zig
  • src/bun.js/webcore/Blob.zig
**/*.zig

📄 CodeRabbit inference engine (.cursor/rules/javascriptcore-class.mdc)

**/*.zig: Declare the extern C symbol in Zig and export a Zig-friendly alias for use
Wrap the Bun____toJS extern in a Zig method that takes a JSGlobalObject and returns JSC.JSValue

**/*.zig: Format Zig files with zig-format (bun run zig-format)
In Zig, manage memory carefully with allocators and use defer for cleanup

Files:

  • src/ptr/external_shared.zig
  • src/bun.js/webcore/Blob.zig
src/bun.js/**/*.zig

📄 CodeRabbit inference engine (.cursor/rules/zig-javascriptcore-classes.mdc)

src/bun.js/**/*.zig: In Zig binding structs, expose generated bindings via pub const js = JSC.Codegen.JS and re-export toJS/fromJS/fromJSDirect
Constructors and prototype methods should return bun.JSError!JSC.JSValue to integrate Zig error handling with JS exceptions
Use parameter name globalObject (not ctx) and accept (*JSC.JSGlobalObject, *JSC.CallFrame) in binding methods/constructors
Implement getters as get(this, globalObject) returning JSC.JSValue and matching the .classes.ts interface
Provide deinit() for resource cleanup and finalize() that calls deinit(); use bun.destroy(this) or appropriate destroy pattern
Access JS call data via CallFrame (argument(i), argumentCount(), thisValue()) and throw errors with globalObject.throw(...)
For properties marked cache: true, use the generated Zig accessors (NameSetCached/GetCached) to work with GC-owned values
In finalize() for objects holding JS references, release them using .deref() before destroy

Files:

  • src/bun.js/webcore/Blob.zig
🧠 Learnings (9)
📚 Learning: 2025-08-30T00:13:36.815Z
Learnt from: CR
PR: oven-sh/bun#0
File: .cursor/rules/zig-javascriptcore-classes.mdc:0-0
Timestamp: 2025-08-30T00:13:36.815Z
Learning: Applies to src/bun.js/bindings/generated_classes_list.zig : Update src/bun.js/bindings/generated_classes_list.zig to include new classes

Applied to files:

  • src/bun.js/webcore/Blob.zig
📚 Learning: 2025-08-30T00:13:36.815Z
Learnt from: CR
PR: oven-sh/bun#0
File: .cursor/rules/zig-javascriptcore-classes.mdc:0-0
Timestamp: 2025-08-30T00:13:36.815Z
Learning: Applies to src/bun.js/**/*.zig : For properties marked cache: true, use the generated Zig accessors (<Prop>NameSetCached/GetCached) to work with GC-owned values

Applied to files:

  • src/bun.js/webcore/Blob.zig
📚 Learning: 2025-08-30T00:13:36.815Z
Learnt from: CR
PR: oven-sh/bun#0
File: .cursor/rules/zig-javascriptcore-classes.mdc:0-0
Timestamp: 2025-08-30T00:13:36.815Z
Learning: Applies to src/bun.js/**/*.zig : In Zig binding structs, expose generated bindings via pub const js = JSC.Codegen.JS<ClassName> and re-export toJS/fromJS/fromJSDirect

Applied to files:

  • src/bun.js/webcore/Blob.zig
📚 Learning: 2025-08-30T00:11:57.076Z
Learnt from: CR
PR: oven-sh/bun#0
File: .cursor/rules/registering-bun-modules.mdc:0-0
Timestamp: 2025-08-30T00:11:57.076Z
Learning: Applies to src/**/js_*.zig : Handle reference counting correctly with ref()/deref() in JS-facing Zig code

Applied to files:

  • src/bun.js/webcore/Blob.zig
📚 Learning: 2025-08-30T00:13:36.815Z
Learnt from: CR
PR: oven-sh/bun#0
File: .cursor/rules/zig-javascriptcore-classes.mdc:0-0
Timestamp: 2025-08-30T00:13:36.815Z
Learning: Applies to src/bun.js/**/*.zig : Constructors and prototype methods should return bun.JSError!JSC.JSValue to integrate Zig error handling with JS exceptions

Applied to files:

  • src/bun.js/webcore/Blob.zig
📚 Learning: 2025-08-30T00:11:00.890Z
Learnt from: CR
PR: oven-sh/bun#0
File: .cursor/rules/javascriptcore-class.mdc:0-0
Timestamp: 2025-08-30T00:11:00.890Z
Learning: Applies to **/*.zig : Wrap the Bun__<Type>__toJS extern in a Zig method that takes a JSGlobalObject and returns JSC.JSValue

Applied to files:

  • src/bun.js/webcore/Blob.zig
📚 Learning: 2025-08-30T00:11:57.076Z
Learnt from: CR
PR: oven-sh/bun#0
File: .cursor/rules/registering-bun-modules.mdc:0-0
Timestamp: 2025-08-30T00:11:57.076Z
Learning: Applies to src/{**/js_*.zig,bun.js/api/**/*.zig} : Use bun.JSError!JSValue for proper error propagation in JS-exposed Zig functions

Applied to files:

  • src/bun.js/webcore/Blob.zig
📚 Learning: 2025-09-19T19:55:22.427Z
Learnt from: markovejnovic
PR: oven-sh/bun#22568
File: src/valkey/js_valkey_functions.zig:853-854
Timestamp: 2025-09-19T19:55:22.427Z
Learning: In Bun's JSPromise API, `resolvedPromiseValue(globalThis: *JSGlobalObject, value: JSValue) JSValue` returns a JSValue directly (not a JSPromise that needs `.toJS()` called on it). This is different from `resolvedPromise()` which returns a *JSPromise that would need `.toJS()`.

Applied to files:

  • src/bun.js/webcore/Blob.zig
📚 Learning: 2025-08-30T00:13:36.815Z
Learnt from: CR
PR: oven-sh/bun#0
File: .cursor/rules/zig-javascriptcore-classes.mdc:0-0
Timestamp: 2025-08-30T00:13:36.815Z
Learning: Applies to src/bun.js/**/*.zig : Use parameter name globalObject (not ctx) and accept (*JSC.JSGlobalObject, *JSC.CallFrame) in binding methods/constructors

Applied to files:

  • src/bun.js/webcore/Blob.zig
🔇 Additional comments (3)
src/ptr/external_shared.zig (1)

53-71: Optional self-type fix looks correct (resolves prior blocker).

Methods in Optional now use self: *Optional / self: Optional and return Optional, avoiding the earlier type mismatch against the outer Self. This should compile cleanly.

src/bun.js/webcore/Blob.zig (2)

16-20: Heap allocation with refcount=1 is appropriate.

Initializing #ref_count to 1 on new() aligns with the GC finalizer decrement semantics.


4747-4768: Charset enum addition is a solid tri-state replacement for prior flag.

API is minimal and clear; fromIsAllASCII and isAllASCII() read well.

Reduce the size of `bun.webcore.Blob` from 120 bytes to 96. Also make it
ref-counted: in-progress work on improving the bindings generator
depends on this, as it means C++ can pass a pointer to the `Blob` to Zig
without risking it being destroyed if the GC collects the associated
`JSBlob`.

Note that this PR depends on #23013.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
src/bun.js/webcore/Blob.zig (2)

353-362: Compilation risk: try used in a void function.

onStructuredCloneSerialize returns void, but uses try. Either propagate errors (change signature) or explicitly handle them. Given StructuredCloneWriter.WriteError = error{} (empty), prefer catching as unreachable to keep signature stable.

Apply this diff:

-    try _onStructuredCloneSerialize(this, Writer, writer);
+    _onStructuredCloneSerialize(this, Writer, writer) catch unreachable;

3370-3393: Fix use-after-free in Blob.dupeWithContentType

Change both duped.isHeapAllocated() checks to this.isHeapAllocated() so the logic inspects the original Blob’s ref count, not the newly zeroed copy:

--- a/src/bun.js/webcore/Blob.zig
+++ b/src/bun.js/webcore/Blob.zig
@@ 3370,7 +3370,7 @@ pub fn dupeWithContentType(this: *const Blob, include_content_type: bool) Blob {
     duped.setNotHeapAllocated();
-    if (duped.content_type_allocated and duped.isHeapAllocated() and !include_content_type) {
+    if (duped.content_type_allocated and this.isHeapAllocated() and !include_content_type) {
         // avoid use-after-free
@@ 3388,7 +3388,7 @@ pub fn dupeWithContentType(this: *const Blob, include_content_type: bool) Blob {
         duped.content_type_was_set = duped.content_type.len > 0;
     } else if (duped.content_type_allocated and duped.isHeapAllocated() and include_content_type) {
-        duped.content_type = bun.handleOom(bun.default_allocator.dupe(u8, this.content_type));
+    } else if (duped.content_type_allocated and this.isHeapAllocated() and include_content_type) {
         duped.content_type = bun.handleOom(bun.default_allocator.dupe(u8, this.content_type));
     }
🧹 Nitpick comments (2)
src/bun.js/webcore/Blob.zig (2)

1078-1079: Avoid extra struct copy by duping directly into new().

Current code is fine; minor micro-optimization:

-        const cloned = Blob.new(source_blob.dupe());
-        return JSPromise.resolvedPromiseValue(ctx, cloned.toJS(ctx));
+        return JSPromise.resolvedPromiseValue(ctx, Blob.new(source_blob.dupe()).toJS(ctx));

124-170: Added debug logs to file reads (LGTM).

Helpful for diagnosing I/O paths; keep logs guarded to avoid noise in production if needed.

If you want consistency with internal guideline naming, consider const log = bun.Output.scoped(.Blob, false); and use log(...). As per coding guidelines.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Disabled knowledge base sources:

  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between a0070f0410ea33943054ee5ddb906aee48181b5a and c66bca5.

📒 Files selected for processing (13)
  • src/StandaloneModuleGraph.zig (0 hunks)
  • src/ast/Macro.zig (2 hunks)
  • src/bun.js/api/BunObject.zig (0 hunks)
  • src/bun.js/api/server/FileRoute.zig (2 hunks)
  • src/bun.js/api/server/StaticRoute.zig (1 hunks)
  • src/bun.js/bindings/blob.cpp (2 hunks)
  • src/bun.js/bindings/blob.h (2 hunks)
  • src/bun.js/webcore/Blob.zig (24 hunks)
  • src/bun.js/webcore/Body.zig (3 hunks)
  • src/bun.js/webcore/ObjectURLRegistry.zig (0 hunks)
  • src/bun.js/webcore/S3Client.zig (0 hunks)
  • src/bun.js/webcore/S3File.zig (1 hunks)
  • src/bun.js/webcore/blob/read_file.zig (1 hunks)
💤 Files with no reviewable changes (4)
  • src/bun.js/webcore/ObjectURLRegistry.zig
  • src/bun.js/webcore/S3Client.zig
  • src/StandaloneModuleGraph.zig
  • src/bun.js/api/BunObject.zig
🚧 Files skipped from review as they are similar to previous changes (4)
  • src/bun.js/api/server/FileRoute.zig
  • src/bun.js/api/server/StaticRoute.zig
  • src/bun.js/bindings/blob.cpp
  • src/bun.js/webcore/S3File.zig
🧰 Additional context used
📓 Path-based instructions (5)
src/**/*.zig

📄 CodeRabbit inference engine (.cursor/rules/building-bun.mdc)

Implement debug logs in Zig using const log = bun.Output.scoped(.${SCOPE}, false); and invoking log("...", .{})

Files:

  • src/ast/Macro.zig
  • src/bun.js/webcore/blob/read_file.zig
  • src/bun.js/webcore/Blob.zig
  • src/bun.js/webcore/Body.zig
**/*.zig

📄 CodeRabbit inference engine (.cursor/rules/javascriptcore-class.mdc)

**/*.zig: Declare the extern C symbol in Zig and export a Zig-friendly alias for use
Wrap the Bun____toJS extern in a Zig method that takes a JSGlobalObject and returns JSC.JSValue

**/*.zig: Format Zig files with zig-format (bun run zig-format)
In Zig, manage memory carefully with allocators and use defer for cleanup

Files:

  • src/ast/Macro.zig
  • src/bun.js/webcore/blob/read_file.zig
  • src/bun.js/webcore/Blob.zig
  • src/bun.js/webcore/Body.zig
src/bun.js/**/*.zig

📄 CodeRabbit inference engine (.cursor/rules/zig-javascriptcore-classes.mdc)

src/bun.js/**/*.zig: In Zig binding structs, expose generated bindings via pub const js = JSC.Codegen.JS and re-export toJS/fromJS/fromJSDirect
Constructors and prototype methods should return bun.JSError!JSC.JSValue to integrate Zig error handling with JS exceptions
Use parameter name globalObject (not ctx) and accept (*JSC.JSGlobalObject, *JSC.CallFrame) in binding methods/constructors
Implement getters as get(this, globalObject) returning JSC.JSValue and matching the .classes.ts interface
Provide deinit() for resource cleanup and finalize() that calls deinit(); use bun.destroy(this) or appropriate destroy pattern
Access JS call data via CallFrame (argument(i), argumentCount(), thisValue()) and throw errors with globalObject.throw(...)
For properties marked cache: true, use the generated Zig accessors (NameSetCached/GetCached) to work with GC-owned values
In finalize() for objects holding JS references, release them using .deref() before destroy

Files:

  • src/bun.js/webcore/blob/read_file.zig
  • src/bun.js/webcore/Blob.zig
  • src/bun.js/webcore/Body.zig
**/*.{cpp,h}

📄 CodeRabbit inference engine (.cursor/rules/javascriptcore-class.mdc)

**/*.{cpp,h}: When exposing a JS class with public Constructor and Prototype, define three C++ types: class Foo : public JSC::DestructibleObject (if it has C++ fields), class FooPrototype : public JSC::JSNonFinalObject, and class FooConstructor : public JSC::InternalFunction
If the class has C++ data members, inherit from JSC::DestructibleObject and provide proper destruction; if it has no C++ fields (only JS properties), avoid a class and use JSC::constructEmptyObject(vm, structure) with putDirectOffset
Prefer placing the subspaceFor implementation in the .cpp file rather than the header when possible

Files:

  • src/bun.js/bindings/blob.h
**/*.{cpp,cc,cxx,h,hpp,hxx}

📄 CodeRabbit inference engine (CLAUDE.md)

Format C/C++ sources/headers with clang-format (bun run clang-format)

Files:

  • src/bun.js/bindings/blob.h
🧠 Learnings (12)
📚 Learning: 2025-08-30T00:11:00.890Z
Learnt from: CR
PR: oven-sh/bun#0
File: .cursor/rules/javascriptcore-class.mdc:0-0
Timestamp: 2025-08-30T00:11:00.890Z
Learning: Applies to **/*.cpp : Provide an extern "C" Bun__<Type>__toJS function that creates an instance using the cached Structure and returns an EncodedJSValue

Applied to files:

  • src/bun.js/bindings/blob.h
📚 Learning: 2025-08-30T00:13:36.815Z
Learnt from: CR
PR: oven-sh/bun#0
File: .cursor/rules/zig-javascriptcore-classes.mdc:0-0
Timestamp: 2025-08-30T00:13:36.815Z
Learning: Applies to src/bun.js/bindings/generated_classes_list.zig : Update src/bun.js/bindings/generated_classes_list.zig to include new classes

Applied to files:

  • src/bun.js/webcore/Blob.zig
  • src/bun.js/webcore/Body.zig
📚 Learning: 2025-08-30T00:13:36.815Z
Learnt from: CR
PR: oven-sh/bun#0
File: .cursor/rules/zig-javascriptcore-classes.mdc:0-0
Timestamp: 2025-08-30T00:13:36.815Z
Learning: Applies to src/bun.js/**/*.zig : For properties marked cache: true, use the generated Zig accessors (<Prop>NameSetCached/GetCached) to work with GC-owned values

Applied to files:

  • src/bun.js/webcore/Blob.zig
📚 Learning: 2025-08-30T00:11:57.076Z
Learnt from: CR
PR: oven-sh/bun#0
File: .cursor/rules/registering-bun-modules.mdc:0-0
Timestamp: 2025-08-30T00:11:57.076Z
Learning: Applies to src/**/js_*.zig : Handle reference counting correctly with ref()/deref() in JS-facing Zig code

Applied to files:

  • src/bun.js/webcore/Blob.zig
📚 Learning: 2025-08-30T00:13:36.815Z
Learnt from: CR
PR: oven-sh/bun#0
File: .cursor/rules/zig-javascriptcore-classes.mdc:0-0
Timestamp: 2025-08-30T00:13:36.815Z
Learning: Applies to src/bun.js/**/*.zig : In Zig binding structs, expose generated bindings via pub const js = JSC.Codegen.JS<ClassName> and re-export toJS/fromJS/fromJSDirect

Applied to files:

  • src/bun.js/webcore/Blob.zig
📚 Learning: 2025-08-30T00:13:36.815Z
Learnt from: CR
PR: oven-sh/bun#0
File: .cursor/rules/zig-javascriptcore-classes.mdc:0-0
Timestamp: 2025-08-30T00:13:36.815Z
Learning: Applies to src/bun.js/**/*.zig : Constructors and prototype methods should return bun.JSError!JSC.JSValue to integrate Zig error handling with JS exceptions

Applied to files:

  • src/bun.js/webcore/Blob.zig
📚 Learning: 2025-08-30T00:11:00.890Z
Learnt from: CR
PR: oven-sh/bun#0
File: .cursor/rules/javascriptcore-class.mdc:0-0
Timestamp: 2025-08-30T00:11:00.890Z
Learning: Applies to **/*.zig : Wrap the Bun__<Type>__toJS extern in a Zig method that takes a JSGlobalObject and returns JSC.JSValue

Applied to files:

  • src/bun.js/webcore/Blob.zig
📚 Learning: 2025-08-30T00:11:57.076Z
Learnt from: CR
PR: oven-sh/bun#0
File: .cursor/rules/registering-bun-modules.mdc:0-0
Timestamp: 2025-08-30T00:11:57.076Z
Learning: Applies to src/{**/js_*.zig,bun.js/api/**/*.zig} : Use bun.JSError!JSValue for proper error propagation in JS-exposed Zig functions

Applied to files:

  • src/bun.js/webcore/Blob.zig
📚 Learning: 2025-09-19T19:55:22.427Z
Learnt from: markovejnovic
PR: oven-sh/bun#22568
File: src/valkey/js_valkey_functions.zig:853-854
Timestamp: 2025-09-19T19:55:22.427Z
Learning: In Bun's JSPromise API, `resolvedPromiseValue(globalThis: *JSGlobalObject, value: JSValue) JSValue` returns a JSValue directly (not a JSPromise that needs `.toJS()` called on it). This is different from `resolvedPromise()` which returns a *JSPromise that would need `.toJS()`.

Applied to files:

  • src/bun.js/webcore/Blob.zig
📚 Learning: 2025-08-30T00:13:36.815Z
Learnt from: CR
PR: oven-sh/bun#0
File: .cursor/rules/zig-javascriptcore-classes.mdc:0-0
Timestamp: 2025-08-30T00:13:36.815Z
Learning: Applies to src/bun.js/**/*.zig : Use parameter name globalObject (not ctx) and accept (*JSC.JSGlobalObject, *JSC.CallFrame) in binding methods/constructors

Applied to files:

  • src/bun.js/webcore/Blob.zig
📚 Learning: 2025-09-03T05:09:24.272Z
Learnt from: Jarred-Sumner
PR: oven-sh/bun#22335
File: src/http.zig:1376-1393
Timestamp: 2025-09-03T05:09:24.272Z
Learning: In bun's HTTP module, `bun.http.default_allocator` is the correct allocator to use for internal HTTP buffers like MutableString initialization for response_message_buffer and compressed_body. This allocator is initialized in HTTPThread.zig as `bun.http.default_arena.allocator()` and is used consistently across the HTTP module, separate from any local `default_allocator` variables.

Applied to files:

  • src/bun.js/webcore/Body.zig
📚 Learning: 2025-09-02T18:27:23.279Z
Learnt from: taylordotfish
PR: oven-sh/bun#22227
File: src/collections/multi_array_list.zig:24-24
Timestamp: 2025-09-02T18:27:23.279Z
Learning: The `#allocator` syntax in bun's custom Zig implementation is valid and does not require quoting with @"#allocator". Private fields using the `#` prefix work correctly throughout the codebase without special quoting syntax.

Applied to files:

  • src/bun.js/webcore/Body.zig
🧬 Code graph analysis (1)
src/bun.js/bindings/blob.h (2)
src/bun.js/bindings/webcore/MessageEvent.cpp (10)
  • create (68-71)
  • create (68-68)
  • create (73-76)
  • create (73-73)
  • create (83-86)
  • create (83-83)
  • create (90-93)
  • create (90-90)
  • create (95-117)
  • create (95-95)
src/bun.js/bindings/webcore/WebSocket.cpp (6)
  • create (223-226)
  • create (223-223)
  • create (228-231)
  • create (228-228)
  • create (233-248)
  • create (233-233)
🔇 Additional comments (10)
src/bun.js/webcore/Body.zig (2)

763-764: Use debug-only assert to avoid release aborts.

Line 763 still calls assert(!new_blob.isHeapAllocated()), which will terminate release builds if a heap-backed blob sneaks through. We only want this check in debug builds—swap to bun.debugAssert so production keeps running while we retain the safeguard in debug.

-                assert(!new_blob.isHeapAllocated()); // owned by Body
+                bun.debugAssert(!new_blob.isHeapAllocated()); // owned by Body

1082-1083: Same: guard with bun.debugAssert.

assert(!body.value.Blob.isHeapAllocated()) will also abort production; use the debug-only variant instead.

-        assert(!body.value.Blob.isHeapAllocated()); // owned by Body
+        bun.debugAssert(!body.value.Blob.isHeapAllocated()); // owned by Body
src/bun.js/webcore/Blob.zig (8)

16-20: Constructor correctly sets heap refcount (LGTM).

Heap-alloc via bun.new and initializing the refcount to 1 is correct for ExternalShared semantics.


3233-3240: Finalize adopts into ExternalShared before deref (LGTM).

This preserves external refs and avoids premature frees when JS finalizer runs.


4747-4768: Compact charset representation (LGTM).

Replacing ad-hoc flags with an enum is clearer and reduces size. Helpers are sensible.


4769-4776: takeOwnership invalidates source in-place—verify callers.

Writing self.* = undefined is intentional poison; ensure no caller touches self after calling this. If any external uses keep *Blob around, convert them to consume by value instead.


4790-4800: Ref/deref guards (LGTM).

Asserting heap-allocation on ref/deref and destroying on last deref matches the new ownership model.


2766-2783: Slice copies charset field (LGTM).

Carrying charset into sub-slices is correct for ≤-range slices.


638-640: Blob__dupe returns a heap blob (LGTM).

Wrapping dupeWithContentType(true) with new() ensures safe handoff to JS.


509-517: Heap-alloc assertion after clone-deserialize (LGTM).

This protects against stack/inline blobs leaking into JS values.

@taylordotfish taylordotfish merged commit eb04e4e into main Sep 27, 2025
56 of 61 checks passed
@taylordotfish taylordotfish deleted the taylor.fish/blob branch September 27, 2025 00:18
@coderabbitai coderabbitai Bot mentioned this pull request Jan 18, 2026
4 tasks
Jarred-Sumner added a commit that referenced this pull request Apr 29, 2026
…9910)

## What

`Blob.dupeWithContentType` guarded its content_type handling on
`duped.isHeapAllocated()` immediately *after* calling
`duped.setNotHeapAllocated()`, so both branches were dead. When the
source Blob's `content_type` is heap-allocated, the bitwise-copied dupe
aliased the same allocation while both sides had `content_type_allocated
== true`.

This is a regression from #23015: the pre-refactor code checked
`duped.allocator != null` *before* clearing it at the end of the
function; the refactor moved the clear to the top but left the (renamed)
guard in place.

## Repro

```js
const file = Bun.file(path, { type: "application/x-custom-type-not-in-registry-abcdefghijklm" });
const response = new Response(file);               // body holds a dupe that aliases file.content_type
await file.write("hello", { type: "application/x-..." }); // frees file.content_type
response.headers.get("content-type");              // reads freed memory
```

On ASAN builds:
```
==716==ERROR: AddressSanitizer: use-after-poison on address 0x71df454301c0
    #1 in Zig::toStringCopy(ZigString) helpers.h:217
    #2 in WebCore__FetchHeaders__put bindings.cpp:2082
    #5 in bun.js.webcore.Response.getOrCreateHeaders Response.zig:358
```

On release builds the freed slot gets reused and the read produces
garbage:
```
TypeError: Header '25' has invalid value: 'ion/x-custom-type-not-in-registry-abcdefghijklm'
```

## Fix

Drop the `isHeapAllocated()` guard and always deep-copy an allocated
`content_type` in `dupeWithContentType`. The old `!include_content_type`
branch's "resolve to static mime or fall back to empty" is gone — it
would have dropped FormData's `multipart/form-data; boundary=...` (and
any non-registry type) on `Response.clone()`, and the branch itself was
marked `// TODO: fix this / this is a bug`. The `include_content_type`
parameter is now a no-op.

Since every dupe now owns its `content_type` copy, `Blob.deinit()` frees
it. That in turn required closing a few places that held a
bitwise-copied Blob alongside the live owner:

- `fromJSWithoutDeferGC` `move=true`: deep-copy `name`/`content_type`
into the moved-out value so the source JS Blob keeps sole ownership; the
BuildArtifact arm now `dupe()`s (its "move" only nulled the store on a
local copy).
- `getSliceFrom()`: free the dupe's copy before overwriting it with the
slice's own type.
- `doWrite`/`getWriter`: clear `content_type_allocated` after the
in-place free so a registry-resolved static string isn't later freed by
`deinit()`.
- `BlobOrStringOrBuffer.deinitAndUnprotect`: only deref the store
(matching its `deinit()`) since `.blob` is a raw view of a live JS Blob.

## Verified

- `bun bd test test/js/web/fetch/blob.test.ts` — 16/16 pass
- New UAF test fails on both debug/ASAN (use-after-poison) and system
bun (garbage header) without the fix, passes with it
- New clone test guards against dropping FormData's boundary on
`Response.clone()`
- ASAN stress: 1k×
`Response.clone`/`blob.slice`/`createObjectURL`+revoke/`new
Response([blob])`/`write({type})` — no double-free
- RSS is flat across 50k × 1KB-type `Response.clone()` and
`blob.slice()`

---------

Co-authored-by: robobun <robobun@users.noreply.github.com>
Co-authored-by: Jarred Sumner <jarred@jarredsumner.com>
xhjkl pushed a commit to xhjkl/bun that referenced this pull request May 14, 2026
…en-sh#29910)

## What

`Blob.dupeWithContentType` guarded its content_type handling on
`duped.isHeapAllocated()` immediately *after* calling
`duped.setNotHeapAllocated()`, so both branches were dead. When the
source Blob's `content_type` is heap-allocated, the bitwise-copied dupe
aliased the same allocation while both sides had `content_type_allocated
== true`.

This is a regression from oven-sh#23015: the pre-refactor code checked
`duped.allocator != null` *before* clearing it at the end of the
function; the refactor moved the clear to the top but left the (renamed)
guard in place.

## Repro

```js
const file = Bun.file(path, { type: "application/x-custom-type-not-in-registry-abcdefghijklm" });
const response = new Response(file);               // body holds a dupe that aliases file.content_type
await file.write("hello", { type: "application/x-..." }); // frees file.content_type
response.headers.get("content-type");              // reads freed memory
```

On ASAN builds:
```
==716==ERROR: AddressSanitizer: use-after-poison on address 0x71df454301c0
    oven-sh#1 in Zig::toStringCopy(ZigString) helpers.h:217
    oven-sh#2 in WebCore__FetchHeaders__put bindings.cpp:2082
    oven-sh#5 in bun.js.webcore.Response.getOrCreateHeaders Response.zig:358
```

On release builds the freed slot gets reused and the read produces
garbage:
```
TypeError: Header '25' has invalid value: 'ion/x-custom-type-not-in-registry-abcdefghijklm'
```

## Fix

Drop the `isHeapAllocated()` guard and always deep-copy an allocated
`content_type` in `dupeWithContentType`. The old `!include_content_type`
branch's "resolve to static mime or fall back to empty" is gone — it
would have dropped FormData's `multipart/form-data; boundary=...` (and
any non-registry type) on `Response.clone()`, and the branch itself was
marked `// TODO: fix this / this is a bug`. The `include_content_type`
parameter is now a no-op.

Since every dupe now owns its `content_type` copy, `Blob.deinit()` frees
it. That in turn required closing a few places that held a
bitwise-copied Blob alongside the live owner:

- `fromJSWithoutDeferGC` `move=true`: deep-copy `name`/`content_type`
into the moved-out value so the source JS Blob keeps sole ownership; the
BuildArtifact arm now `dupe()`s (its "move" only nulled the store on a
local copy).
- `getSliceFrom()`: free the dupe's copy before overwriting it with the
slice's own type.
- `doWrite`/`getWriter`: clear `content_type_allocated` after the
in-place free so a registry-resolved static string isn't later freed by
`deinit()`.
- `BlobOrStringOrBuffer.deinitAndUnprotect`: only deref the store
(matching its `deinit()`) since `.blob` is a raw view of a live JS Blob.

## Verified

- `bun bd test test/js/web/fetch/blob.test.ts` — 16/16 pass
- New UAF test fails on both debug/ASAN (use-after-poison) and system
bun (garbage header) without the fix, passes with it
- New clone test guards against dropping FormData's boundary on
`Response.clone()`
- ASAN stress: 1k×
`Response.clone`/`blob.slice`/`createObjectURL`+revoke/`new
Response([blob])`/`write({type})` — no double-free
- RSS is flat across 50k × 1KB-type `Response.clone()` and
`blob.slice()`

---------

Co-authored-by: robobun <robobun@users.noreply.github.com>
Co-authored-by: Jarred Sumner <jarred@jarredsumner.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants