Skip to content

esm bytecode#26402

Merged
Jarred-Sumner merged 72 commits into
mainfrom
ali/esm-bytecode
Jan 30, 2026
Merged

esm bytecode#26402
Jarred-Sumner merged 72 commits into
mainfrom
ali/esm-bytecode

Conversation

@alii

@alii alii commented Jan 23, 2026

Copy link
Copy Markdown
Member

What does this PR do?

How did you verify your code works?

@robobun

robobun commented Jan 23, 2026

Copy link
Copy Markdown
Collaborator
Updated 1:42 AM PT - Jan 30th, 2026

@alii, your commit 1c0d6eb has 2 failures in Build #36201 (All Failures):


🧪   To try this PR locally:

bunx bun-pr 26402

That installs a local version of the PR into your bun-26402 executable, so you can run:

bun-26402 --bun

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/bun.js/RuntimeTranspilerCache.zig (1)

203-229: Set esm_record_hash when building metadata.
Metadata.esm_record_hash is encoded but never assigned, so it stays zero even when data exists. If the field is meant for integrity checks (like output_hash), it should be populated.

🛠️ Suggested fix
@@
                     metadata.output_hash = hash(output_bytes);
                     metadata.sourcemap_hash = hash(sourcemap);
+                    metadata.esm_record_hash = if (esm_record.len > 0) hash(esm_record) else 0;
🤖 Fix all issues with AI agents
In `@src/bun.js/ModuleLoader.zig`:
- Around line 1200-1204: The current .module_info assignment swallows errors
from analyze_transpiled_module.ModuleInfoDeserialized.create with "catch null";
change it to catch the error, log a debug/error message including the error and
context (e.g., the file identifier like file.path or file.name and that this
occurred in ModuleLoader.zig during ModuleInfoDeserialized.create), and then
return null so behavior is unchanged; ensure the log uses the project's existing
logging facility (or std.debug.print if none) and still assigns .module_info =
null on failure.

In `@src/bun.js/RuntimeTranspilerStore.zig`:
- Around line 574-581: When mi.finalize() fails inside the module_info block, we
must set this.parse_error before destroying module_info and returning so the job
reports a clear failure; update the error handling in RuntimeTranspilerStore.zig
around the module_info/mi.finalize() call to assign this.parse_error =
error.ModuleInfoFinalizationFailed (or add ModuleInfoFinalizationFailed to the
error set or use an appropriate existing error) with contextual info, then
mi.destroy() and return, ensuring the parse_error is observed by the dispatcher
rather than silently returning.

Comment thread src/bun.js/ModuleLoader.zig
Comment thread src/bun.js/RuntimeTranspilerStore.zig Outdated

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@src/bun.js/RuntimeTranspilerStore.zig`:
- Around line 475-481: The current pattern misuses bun.handleOom around
analyze_transpiled_module.ModuleInfoDeserialized.create by combining it with
`catch null`, which either swallows non-OOM errors or duplicates panic behavior;
pick one consistent approach: if deserialization can fail recoverably, remove
bun.handleOom and keep `module_info = ...create(...) catch null` to handle null
returns gracefully; otherwise, remove the `catch null` and call
`bun.handleOom(analyze_transpiled_module.ModuleInfoDeserialized.create(...))` so
OOMs panic and other errors propagate as intended. Adjust the code around
module_info, bun.handleOom, and ModuleInfoDeserialized.create accordingly.
♻️ Duplicate comments (2)
src/bun.js/ModuleLoader.zig (1)

1200-1204: Silent failure on deserialization errors.

The catch null silently swallows deserialization errors from ModuleInfoDeserialized.create. This could mask corrupt bytecode or version mismatches in standalone module loading.

src/bun.js/RuntimeTranspilerStore.zig (1)

574-581: Silent failure on finalization error.

If mi.finalize() fails, the code destroys module_info and returns early without setting this.parse_error, causing the job to dispatch to the main thread with no error indication and no valid resolved_source.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (4)
src/js_printer.zig (1)

5825-5838: Runtime cache drops module_info payload.

cache.put(..., "", "") always stores an empty esm_record, so cache hits won’t be able to reconstruct module_info even when opts.module_info exists. This is inconsistent with the new module_info propagation and defeats the cache path.

Consider either (a) finalizing/serializing module_info before cache.put, or (b) moving cache updates to the caller after finalization so the serialized payload can be passed here.

src/bun.js/RuntimeTranspilerStore.zig (2)

475-496: Ensure cache entry transfers esm_record ownership.

ModuleInfoDeserialized.create(entry.esm_record, ...) will free the buffer later. If the cache entry still owns esm_record, this risks double-free or UAF.

Please either (a) detach ownership from the cache entry (e.g., clear the buffer after handoff), or (b) pass a copied buffer / mark owns_memory=false for borrowed data.


554-568: ModuleInfo allocation is leaked / lifetime is unclear.

You allocate ModuleInfo, finalize it, and then store only mi.asDeserialized() in ResolvedSource. The ModuleInfo pointer itself is discarded and never destroyed, which leaks the allocation unless another owner keeps it alive and frees it later.

Consider a transfer API (e.g., takeDeserialized) and then destroy ModuleInfo, or store ModuleInfo itself in ResolvedSource so ownership is explicit.

Also applies to: 613-618

src/transpiler.zig (1)

776-887: Forward module_info for .cjs/.esm as well (or document Bun-only intent).

Right now module_info is only passed in the .esm_ascii path (Line 876). If module_info is expected for .cjs or .esm, those formats silently drop it.

🛠️ Suggested update (if module_info should apply to all formats)
@@
             .cjs => try js_printer.printCommonJS(
@@
                 .{
                     .bundling = false,
                     .runtime_imports = ast.runtime_imports,
                     .require_ref = ast.require_ref,
                     .css_import_behavior = transpiler.options.cssImportBehavior(),
                     .source_map_handler = source_map_context,
                     .minify_whitespace = transpiler.options.minify_whitespace,
                     .minify_syntax = transpiler.options.minify_syntax,
                     .minify_identifiers = transpiler.options.minify_identifiers,
                     .transform_only = transpiler.options.transform_only,
                     .runtime_transpiler_cache = runtime_transpiler_cache,
+                    .module_info = module_info,
                     .print_dce_annotations = transpiler.options.emit_dce_annotations,
                     .hmr_ref = ast.wrapper_ref,
                     .mangled_props = null,
                 },
                 enable_source_map,
             ),
@@
             .esm => try js_printer.printAst(
@@
                 .{
                     .bundling = false,
                     .runtime_imports = ast.runtime_imports,
                     .require_ref = ast.require_ref,
                     .source_map_handler = source_map_context,
                     .css_import_behavior = transpiler.options.cssImportBehavior(),
                     .minify_whitespace = transpiler.options.minify_whitespace,
                     .minify_syntax = transpiler.options.minify_syntax,
                     .minify_identifiers = transpiler.options.minify_identifiers,
                     .transform_only = transpiler.options.transform_only,
                     .import_meta_ref = ast.import_meta_ref,
                     .runtime_transpiler_cache = runtime_transpiler_cache,
+                    .module_info = module_info,
                     .print_dce_annotations = transpiler.options.emit_dce_annotations,
                     .hmr_ref = ast.wrapper_ref,
                     .mangled_props = null,
                 },
                 enable_source_map,
             ),
🤖 Fix all issues with AI agents
In `@src/analyze_transpiled_module.zig`:
- Around line 7-12: The ModuleInfo struct exposes internal fields that should be
private; update non-public fields (e.g., allocator, is_typescript, finalized,
deserialized and the other internal fields in ModuleInfo) to use Zig's private
field syntax (#) if they are not part of the public API; locate the ModuleInfo
declaration and prefix those internal field names with #, then update any
internal module code that references them to use the new private names while
leaving true public API fields unchanged.
- Around line 53-66: The create() function on ModuleInfoDeserialized currently
sets owns_memory = true but assigns the borrowed serialized slice directly,
risking double-free/UAF in destroy(); update create() so it makes an owned copy
of serialized using the provided allocator (allocate and copy bytes, set data to
the owned slice and keep owns_memory = true), and leave destroy() unchanged to
free only when owns_memory is true; alternatively (if you prefer not to copy)
change create() to set owns_memory = false and document that callers must pass
an owned buffer when true ownership is required — reference the
ModuleInfoDeserialized.create and ModuleInfoDeserialized.destroy symbols to
implement the fix.
- Around line 22-43: finalize() currently creates ModuleInfoDeserialized with
.owns_memory = false (module owns the allocation) but exposes a raw pointer via
asDeserialized(), causing callers who keep only that pointer to leak when
ModuleInfo is destroyed; add a new method takeDeserialized() that returns
?*ModuleInfoDeserialized, sets the stored self.deserialized to null, and flips
the returned deserialized.owns_memory to true so ownership transfers to the
caller; update destroy(self: *ModuleInfo) to only call d.destroy() when
d.owns_memory == false (module still owns it) so transferred pointers aren’t
double-freed.
- Around line 1-2: Move the two `@import` statements (currently at top) to the
bottom of the file per project convention, and fix
ModuleInfoDeserialized.create() ownership: do not store the incoming serialized:
[]const u8 pointer as-is while marking owns_memory = true; instead allocate a
copy of the slice via the allocator inside ModuleInfoDeserialized.create(),
store the copied buffer and set owns_memory = true, so destroy() may safely free
it; alternatively (if you prefer caller-managed memory) set owns_memory = false
and document that callers must keep the buffer alive, but do not leave the
current incorrect combination of owning flag = true with a non-owned pointer.
Ensure destroy() still frees only when owns_memory is true.

In `@src/bun.zig`:
- Line 1104: The new import declaration "pub const analyze_transpiled_module =
`@import`(\"./analyze_transpiled_module.zig\");" is placed outside the bottom
import block; move that `@import` line into the file's bottom import block so all
`@import` statements are grouped together per repo Zig import placement rules
(locate the existing bottom import block and add/merge this declaration there,
removing the current one).
♻️ Duplicate comments (2)
src/bun.js/ModuleLoader.zig (1)

1200-1203: Consider logging deserialization failures.

Swallowing errors can make bytecode corruption/version mismatches hard to diagnose; a debug log would help without changing behavior.

src/bun.js/RuntimeTranspilerStore.zig (1)

574-579: Finalize failure still drops error reporting.

The error path still returns without recording a parse error, which can surface as a silent failure downstream.

Comment thread src/analyze_transpiled_module.zig Outdated
Comment thread src/analyze_transpiled_module.zig
Comment thread src/analyze_transpiled_module.zig Outdated
Comment thread src/analyze_transpiled_module.zig Outdated
Comment thread src/bun.zig Outdated
@alii alii force-pushed the ali/esm-bytecode branch from 5d666d7 to 6c7ef4d Compare January 23, 2026 22:34

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 7

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
src/bundler/linker_context/postProcessJSChunk.zig (2)

28-52: Add defer cleanup for ModuleInfo to prevent allocator leaks in long-running builds.

ModuleInfo.create() allocates with bun.default_allocator but the pointer is never freed after serialization. In watch/dev server scenarios with repeated bundling, this causes cumulative memory leaks. Add defer module_info.destroy(); after line 33 (or after final use if you prefer explicit lifetime management).


97-216: Verify if wrapper declarations added in step 3 duplicate those already collected in step 1.

ModuleInfo.addVar does not deduplicate—it unconditionally appends via _addRecord. Since DeclCollector scans out_stmts (which includes wrapper decls in outside_wrapper_prefix), and step 3 re-adds the same wrapper_refs using the same renamer, duplicate entries will be created. Either gate wrapper insertion to skip already-collected wrappers, or implement deduplication in addVar/_addRecord.

src/js_printer.zig (1)

1110-1121: Propagate top-level var tracking into nested statements

TopLevel.subVar() exists but isn't used at block/loop boundaries. printBody() and printForLoopInit() currently hardcode TopLevel.init(.no) and .{} respectively, dropping var declarations inside top-level blocks/loops from module_info. This results in incomplete declaredVariables when module_info is consulted instead of parsing.

Update printBody(p: *Printer, stmt: Stmt) to accept tlmtlo: TopLevel and pass tlmtlo.subVar() to nested printBlock() and printStmt() calls. Similarly, update printForLoopInit(p: *Printer, initSt: Stmt) to accept tlmtlo: TopLevel and propagate it to printDecls() calls to track var/let/const/using declarations at loop initializers.

🔧 Suggested direction (apply similarly to other nested-body call sites)
-        pub fn printBody(p: *Printer, stmt: Stmt) void {
+        pub fn printBody(p: *Printer, stmt: Stmt, tlmtlo: TopLevel) void {
             switch (stmt.data) {
                 .s_block => |block| {
                     p.printSpace();
-                    p.printBlock(stmt.loc, block.stmts, block.close_brace_loc, TopLevel.init(.no));
+                    p.printBlock(stmt.loc, block.stmts, block.close_brace_loc, tlmtlo.subVar());
                     p.printNewline();
                 },
                 else => {
                     p.printNewline();
                     p.indent();
-                    p.printStmt(stmt, TopLevel.init(.no)) catch unreachable;
+                    p.printStmt(stmt, tlmtlo.subVar()) catch unreachable;
                     p.unindent();
                 },
             }
         }
-        pub fn printForLoopInit(p: *Printer, initSt: Stmt) void {
+        pub fn printForLoopInit(p: *Printer, initSt: Stmt, tlmtlo: TopLevel) void {
             switch (initSt.data) {
                 .s_local => |s| {
                     switch (s.kind) {
                         .k_var => {
-                            p.printDecls("var", s.decls.slice(), ExprFlag.Set.init(.{ .forbid_in = true }), .{});
+                            const tlm: TopLevelAndIsExport = if (may_have_module_info and tlmtlo.isTopLevel())
+                                .{ .is_export = false, .is_top_level = .declared }
+                            else
+                                .{};
+                            p.printDecls("var", s.decls.slice(), ExprFlag.Set.init(.{ .forbid_in = true }), tlm);
                         },

Also applies to: 1133-1142, 4953-4966

🤖 Fix all issues with AI agents
In `@src/analyze_transpiled_module.zig`:
- Around line 82-88: In create(), before installing the errdefer that calls
res.deinit(), explicitly initialize ModuleInfoDeserialized.owner to a safe,
known discriminant (e.g., the "none"/empty variant or a null/empty pointer
variant) so deinit() never reads an uninitialized union; in practice, assign a
safe owner value on the freshly allocated res immediately after try
gpa.create(ModuleInfoDeserialized) and before errdefer res.deinit() so any
early-return/err paths call deinit() on a fully initialized owner field.
- Around line 361-362: The function zig__renderDiff currently performs an inline
`@import` of DiffFormatter which violates Zig guidelines; remove the inline
`@import` from inside zig__renderDiff and instead add a module-level constant
declaration like const DiffFormatter =
`@import`("./bun.js/test/diff_format.zig").DiffFormatter placed at the bottom of
the file so the auto-formatter will position it correctly; keep the
zig__renderDiff signature and usages referencing DiffFormatter unchanged.
- Around line 441-444: The destructor zig__ModuleInfo__destroy currently calls
info.deinit() and then bun.default_allocator.destroy(info), which violates
allocator pairing because ModuleInfo.create() stores its allocator in info.gpa;
change zig__ModuleInfo__destroy to call info.destroy() so destruction uses the
allocator saved on ModuleInfo (honoring info.gpa) and performs both deinits and
allocator-specific destroy logic instead of always using bun.default_allocator.

In `@src/bundler/Chunk.zig`:
- Around line 506-509: The ModuleInfo allocated with
ModuleInfo.create(bun.default_allocator, ...) in postProcessJSChunk.zig is
stored in chunk.module_info but never freed after being serialized to
module_info_bytes in generateChunksInParallel.zig; either free it immediately
after serialization by calling
analyze_transpiled_module.ModuleInfo.destroy(module_info) and nulling
chunk.module_info (and releasing allocator-owned memory if needed), or if the
ModuleInfo must live for the chunk lifetime, document that and add a cleanup
path on chunk teardown (e.g., call analyze_transpiled_module.ModuleInfo.destroy
on chunk.module_info in the chunk destructor/cleanup); update the code paths
around module_info, ModuleInfo.create, module_info_bytes, and
generateChunksInParallel.zig/postProcessJSChunk.zig accordingly so ownership is
explicit and no leak remains.

In `@src/bundler/linker_context/generateChunksInParallel.zig`:
- Around line 307-355: The code currently swallows allocation failures in
generateChunksInParallel (e.g. in unique_key_to_path.put, std.fmt.allocPrint,
replacements.append, and mi.str) using `catch continue`/`catch {}` which can
produce corrupted module_info; change those `catch` sites to propagate the error
(return an error union or bubble up) or explicitly abort on OOM instead of
continuing so allocation failures fail the build deterministically; update the
unique_key_to_path.put(...) catch, allocPrint(...) catch,
replacements.append(...) catch, and mi.str(...) catch to return/propagate the
allocation error from generateChunksInParallel (or call a consistent OOM
handler) and ensure callers handle that error.

In `@src/bundler/linker_context/postProcessJSChunk.zig`:
- Around line 943-965: The loop in postProcessJSChunk currently calls
mi.addVar(str_id, .declared) for every local declaration, which is incorrect for
non-`var` bindings; change it to derive the VarKind from the statement kind
(s.kind) before calling mi.addVar so `let`/`const` locals are recorded
correctly. Inside the .s_local arm (the loop over s.decls and binding
.b_identifier), compute a VarKind (e.g., map s.kind to the corresponding VarKind
enum variant) and pass that kind to mi.addVar(str_id, kind) instead of the
hardcoded .declared; ensure you reference the same s.kind used in the outer
scope and update all mi.addVar calls in this block.

In `@src/js_printer.zig`:
- Around line 6293-6302: The ArrayList `buf` in serializeModuleInfo is not
deinitialized on error paths and can leak if deserialized.serialize or
buf.toOwnedSlice fail; after creating `var buf: std.ArrayList(u8) = .empty;` add
an errdefer to call `buf.deinit(bun.default_allocator);` so any subsequent
`catch return ""` or other error returns run cleanup, and remove or adjust the
errdefer before the successful return (after buf.toOwnedSlice succeeds) to avoid
double-free.
♻️ Duplicate comments (2)
src/analyze_transpiled_module.zig (2)

1-2: Move top-level @import statements to the bottom of the file.
This violates the Zig style rules used in this repo. As per coding guidelines, please move these imports to the bottom of the file.


158-170: Consider making internal ModuleInfo fields private with #.
If these fields aren’t intended as public API, mark them private and update internal accesses. As per coding guidelines, use # for private fields.

Comment thread src/analyze_transpiled_module.zig
Comment thread src/analyze_transpiled_module.zig Outdated
Comment thread src/analyze_transpiled_module.zig
Comment thread src/bundler/Chunk.zig
Comment thread src/bundler/linker_context/generateChunksInParallel.zig
Comment thread src/bundler/linker_context/postProcessJSChunk.zig
Comment thread src/js_printer.zig Outdated
alii and others added 9 commits January 23, 2026 14:50
Co-Authored-By: Claude <noreply@anthropic.com>
The ownership bug claim is a false positive: create() already calls
gpa.dupe(u8, source) on line 83 to copy the data before storing it.

Co-Authored-By: Claude <noreply@anthropic.com>
Co-Authored-By: Claude <noreply@anthropic.com>
Co-Authored-By: Claude <noreply@anthropic.com>
Use defer instead of errdefer since the function returns []const u8
(not an error union), so errdefer would never trigger. On the success
path, toOwnedSlice sets buf to .empty making deinit a no-op. On
failure paths (serialize or toOwnedSlice catch), deinit properly
frees the buffer.

Co-Authored-By: Claude <noreply@anthropic.com>
In ModuleInfoDeserialized.create(), the struct returned by gpa.create()
is uninitialized. The errdefer was calling res.deinit() which switches
on the uninitialized owner tagged union, causing undefined behavior if
any parsing operation fails. The duped slice already has its own errdefer
for cleanup, so we only need to free the struct allocation itself.

Co-Authored-By: Claude <noreply@anthropic.com>
Per project conventions, @import statements should never be inline
inside functions. Move it to the module-level import block at the
bottom of the file.

Co-Authored-By: Claude <noreply@anthropic.com>
Instead of manually calling info.deinit() + bun.default_allocator.destroy(),
use info.destroy() which correctly uses self.gpa (the allocator stored at
creation time). This is more robust and respects allocator pairing.

Co-Authored-By: Claude <noreply@anthropic.com>
The ModuleInfo struct allocated via bun.default_allocator (not the
arena) was never freed after being serialized to module_info_bytes.
Call destroy() and null out the pointer after serialization since
the struct is no longer needed.

Co-Authored-By: Claude <noreply@anthropic.com>

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🤖 Fix all issues with AI agents
In `@src/bundler/linker_context/postProcessJSChunk.zig`:
- Around line 118-165: Add a guard to skip imports flagged as unused/type-only
when iterating .s_import in postProcessJSChunk.zig: inside the .s_import case
(the block that references record =
&source_import_records[s.import_record_index]), check record.flags.is_unused and
continue if true (same placement as the is_unused check in linker.zig). This
ensures you ignore elided/type-only imports before attempting to register names
with mi/requestModule, mi.addVar, mi.addImportInfoSingle, and
mi.addImportInfoNamespace.

In `@src/js_printer.zig`:
- Around line 3562-3576: The module_info writes in printBinding use
bun.handleOom which aborts on allocation failure; change them to best-effort
(skip on OOM) by checking p.moduleInfo() and each subsequent call result instead
of calling bun.handleOom: in printBinding, after obtaining mi and computing
local_name/name_id (renamer.nameForSymbol, mi.str), call the mi methods
(mi.addVar, mi.addExportInfoLocal) in a way that handles failure (e.g., if the
API returns an optional/null or error, just ignore it) so OOM does not crash the
process; apply the same pattern for the other similar block(s) referenced (lines
~3675-3706) where bun.handleOom is used for module_info writes.
- Around line 4953-4966: The for-loop initializer handler (printForLoopInit)
currently calls printDecls with an empty TopLevelAndIsExport, so top-level
var/let/const declared in for headers aren’t recorded in module_info; modify
printForLoopInit to accept and propagate a TopLevel (or TopLevelAndIsExport)
parameter (rather than using the empty value), derive the proper TopLevel via
tlmtlo.subVar()/subLet()/subConst() as appropriate, and pass that TopLevel into
the printDecls call so the declarations are recorded in module_info instead of
being dropped.
- Around line 6293-6308: serializeModuleInfo returns an owned slice allocated
with bun.default_allocator but RuntimeTranspilerCache.put (which reads/writes
esm_record without taking ownership) does not free it, causing a leak; after
each call site that passes the returned esm_record to RuntimeTranspilerCache.put
(the two places that call serializeModuleInfo and pass esm_record), free the
buffer with the same allocator (bun.default_allocator) once put() completes to
release the memory and avoid leaking the slice.
♻️ Duplicate comments (3)
src/analyze_transpiled_module.zig (3)

358-366: Move inline @import to the bottom of the file.

Line 359 has an inline @import inside the function, which violates the coding guideline: "Never use @import() inline inside functions in Zig; always place imports at the bottom of the file."

♻️ Suggested refactor
 export fn zig__renderDiff(expected_ptr: [*:0]const u8, expected_len: usize, received_ptr: [*:0]const u8, received_len: usize, globalThis: *bun.jsc.JSGlobalObject) void {
-    const DiffFormatter = `@import`("./bun.js/test/diff_format.zig").DiffFormatter;
     const formatter = DiffFormatter{
         .received_string = received_ptr[0..received_len],
         .expected_string = expected_ptr[0..expected_len],
         .globalThis = globalThis,
     };
     bun.Output.errorWriter().print("DIFF:\n{any}\n", .{formatter}) catch {};
 }

 // At bottom of file with other imports:
 const std = `@import`("std");
 const bun = `@import`("bun");
+const DiffFormatter = `@import`("./bun.js/test/diff_format.zig").DiffFormatter;

438-441: Use info.destroy() to respect the stored allocator.

zig__ModuleInfo__destroy calls info.deinit() then bun.default_allocator.destroy(info), but ModuleInfo.create() stores the allocator in info.gpa. If the ModuleInfo was created with a different allocator, this causes memory corruption. The destroy() method at lines 262-266 already handles this correctly.

Proposed fix
 export fn zig__ModuleInfo__destroy(info: *ModuleInfo) void {
-    info.deinit();
-    bun.default_allocator.destroy(info);
+    info.destroy();
 }

79-84: errdefer res.deinit() executes on uninitialized struct if early parsing fails.

Between lines 83-84, res is allocated but not initialized. If any eat/eatC call fails (lines 86-102) before res.* is assigned (line 104), the errdefer triggers res.deinit(), which switches on the uninitialized owner union - reading garbage and causing undefined behavior.

Proposed fix: defer errdefer until after assignment
 pub fn create(source: []const u8, gpa: std.mem.Allocator) !*ModuleInfoDeserialized {
     const duped = try gpa.dupe(u8, source);
     errdefer gpa.free(duped);
     var rem: []const u8 = duped;
-    var res = try gpa.create(ModuleInfoDeserialized);
-    errdefer res.deinit();

     const record_kinds_len = std.mem.readInt(u32, try eatC(&rem, 4), .little);
     const record_kinds = std.mem.bytesAsSlice(RecordKind, try eat(&rem, record_kinds_len * `@sizeOf`(RecordKind)));
     _ = try eat(&rem, (4 - (record_kinds_len % 4)) % 4); // alignment padding

     const buffer_len = std.mem.readInt(u32, try eatC(&rem, 4), .little);
     const buffer = std.mem.bytesAsSlice(StringID, try eat(&rem, buffer_len * `@sizeOf`(StringID)));

     const requested_modules_len = std.mem.readInt(u32, try eatC(&rem, 4), .little);
     const requested_modules_keys = std.mem.bytesAsSlice(StringID, try eat(&rem, requested_modules_len * `@sizeOf`(StringID)));
     const requested_modules_values = std.mem.bytesAsSlice(ModuleInfo.FetchParameters, try eat(&rem, requested_modules_len * `@sizeOf`(ModuleInfo.FetchParameters)));

     const is_typescript = (try eatC(&rem, 1))[0] != 0;
     _ = try eat(&rem, 3); // alignment padding

     const strings_len = std.mem.readInt(u32, try eatC(&rem, 4), .little);
     const strings_lens = std.mem.bytesAsSlice(u32, try eat(&rem, strings_len * `@sizeOf`(u32)));
     const strings_buf = rem;

+    var res = try gpa.create(ModuleInfoDeserialized);
+    errdefer res.deinit();
     res.* = .{
         .strings_buf = strings_buf,
         // ... rest unchanged
     };
     return res;
 }

Comment thread src/bundler/linker_context/postProcessJSChunk.zig Outdated
Comment thread src/js_printer.zig
Comment thread src/js_printer.zig
Comment thread src/js_printer.zig Outdated

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 8

🤖 Fix all issues with AI agents
In `@src/analyze_transpiled_module.zig`:
- Around line 82-88: The current errdefer res.deinit() calls deinit on an
uninitialized ModuleInfoDeserialized (res) which can free garbage fields (like
owner); change the cleanup to release the raw allocation instead: replace
errdefer res.deinit() with errdefer gpa.free(res) so you only free the buffer if
initialization fails, and once you have fully initialized res.* (inside
create()), then set/replace the cleanup to call res.deinit() (or call
res.deinit() on error after initialization) so proper destructor runs only on a
fully initialized ModuleInfoDeserialized.

In `@src/bun.js/bindings/BunAnalyzeTranspiledModule.cpp`:
- Around line 112-116: The function
JSC_JSModuleRecord__addRequestedModuleHostDefined currently indexes
identifierArray directly with hostDefinedImportType which can bypass the
special-case handling for sentinel values and risk out-of-bounds access; change
the creation of ScriptFetchParameters to call
getFromIdentifierArray(moduleRecord->vm(), identifierArray,
hostDefinedImportType) (or otherwise use the same helper used elsewhere) instead
of identifierArray[hostDefinedImportType].string(), then pass that result into
ScriptFetchParameters::create and keep
moduleRecord->appendRequestedModule(getFromIdentifierArray(moduleRecord->vm(),
identifierArray, moduleName), std::move(attributes)) unchanged so all identifier
accesses uniformly use getFromIdentifierArray.
- Around line 1-26: The file contains duplicate `#include` lines for the same
headers; remove the extra includes so each header is included only once
(specifically remove the two redundant includes of
"JavaScriptCore/JSModuleRecord.h" and the duplicate "JavaScriptCore/Nodes.h"),
keeping one instance of each required include and preserving any include
ordering needed for symbols referenced by ModuleAnalyzer, Parser,
ExceptionScope, ZigSourceProvider, and BunAnalyzeTranspiledModule.
- Around line 278-280: The comparator passed to std::sort allocates temporary
std::string objects by calling a.utf8().toStdString() and b.utf8().toStdString()
on every comparison (in the lambda used for sorting sortedEntries), which is
inefficient; replace that lambda with a direct, allocation-free comparison such
as using WTF::codePointCompareLessThan(a, b) (or another available String
compare method that avoids toStdString()), e.g., change the comparator for
sortedEntries to call the non-allocating string comparison function instead of
utf8().toStdString().
- Line 196: The local variable currently named ModuleAnalyzer shadows the
class/type ModuleAnalyzer; rename the variable to a lowercase identifier (e.g.,
moduleAnalyzer or analyzer) where it is constructed (the expression
"ModuleAnalyzer ModuleAnalyzer(globalObject, moduleKey, sourceCode,
moduleProgramNode->varDeclarations(), moduleProgramNode->lexicalVariables(),
moduleProgramNode->features());") and update all subsequent uses to the new name
to avoid shadowing and improve readability while keeping the class name
unchanged.

In `@src/bundler/linker_context/generateChunksInParallel.zig`:
- Around line 307-321: The unique_key_to_path map is built using
c.options.public_path but browser chunks emitted from server builds need the
browser transpilier's public_path; update the loop that builds
unique_key_to_path (referencing unique_key_to_path, ch.unique_key,
ch.final_rel_path, and bun.bundle_v2.cheapPrefixNormalizer) to pick the
per-chunk public_path: if ch.flags.is_browser_chunk_from_server_build use
bundler.transpilerForTarget(.browser).options.public_path, otherwise use
c.options.public_path, then pass that selected public_path into
cheapPrefixNormalizer and use its result to create the resolved path stored
under unique_key_to_path for that chunk. Ensure you still guard on
ch.unique_key.len and ch.final_rel_path.len and keep allocator/error handling
as-is.

In `@src/js_printer.zig`:
- Around line 641-667: The code currently drops top-level "var" information by
calling TopLevel.init(.no) when entering nested bodies; fix by threading the
TopLevel value through printBody/printBlock/printStmt call chain instead of
passing .no, replace spots that create nested scopes with tlmtlo.subVar() to
produce the correct nested TopLevel (use the subVar() method on
TopLevel/TopLevelAndIsExport), and update printDeclStmt to skip recording
lexical declarations when the passed TopLevel.is_top_level == .var_only; ensure
all call sites that previously used TopLevel.init(.no) now accept and forward
the TopLevel parameter so module_info sees var-only top-level declarations
correctly.
- Around line 6293-6302: serializeModuleInfo returns an owned slice via
buf.toOwnedSlice(bun.default_allocator) that must be freed by the caller; after
calling cache.put(...) with the returned esm_record you need to free that owned
slice to avoid leaking memory — call the allocator's free for the slice (use
bun.default_allocator.free(esm_record) or the appropriate allocator free API)
immediately after cache.put(...) and apply the same change at both call sites
that pass the returned esm_record to cache.put (the two locations around the
existing cache.put calls).
♻️ Duplicate comments (1)
src/analyze_transpiled_module.zig (1)

1-2: Hoist inline @import to file scope

Line 362 uses inline @import inside zig__renderDiff. Move that import to file scope (and keep @imports grouped at the bottom) to align with the project Zig conventions. As per coding guidelines, keep imports at file scope and at the bottom.

♻️ Suggested change
-export fn zig__renderDiff(expected_ptr: [*:0]const u8, expected_len: usize, received_ptr: [*:0]const u8, received_len: usize, globalThis: *bun.jsc.JSGlobalObject) void {
-    const DiffFormatter = `@import`("./bun.js/test/diff_format.zig").DiffFormatter;
+export fn zig__renderDiff(expected_ptr: [*:0]const u8, expected_len: usize, received_ptr: [*:0]const u8, received_len: usize, globalThis: *bun.jsc.JSGlobalObject) void {
+    const DiffFormatter = DiffFormatterType;

Add at file scope (near other imports at the bottom):

const DiffFormatterType = `@import`("./bun.js/test/diff_format.zig").DiffFormatter;
#!/bin/bash
# Verify import placement and inline `@import` usage.
rg -n "@import" src/analyze_transpiled_module.zig

Also applies to: 361-363

Comment thread src/analyze_transpiled_module.zig
Comment thread src/bun.js/bindings/BunAnalyzeTranspiledModule.cpp
Comment thread src/bun.js/bindings/BunAnalyzeTranspiledModule.cpp
Comment thread src/bun.js/bindings/BunAnalyzeTranspiledModule.cpp Outdated
Comment thread src/bun.js/bindings/BunAnalyzeTranspiledModule.cpp
Comment thread src/bundler/linker_context/generateChunksInParallel.zig Outdated
Comment thread src/js_printer.zig
Comment thread src/js_printer.zig Outdated

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@src/analyze_transpiled_module.zig`:
- Around line 79-103: The buffer returned by gpa.dupe(u8, source) in create is
only 1-byte aligned, so passing slices of it to std.mem.bytesAsSlice for types
like StringID, ModuleInfo.FetchParameters and u32 (e.g., record_kinds, buffer,
requested_modules_* and strings_lens) can violate alignment; fix by allocating
an aligned backing buffer (4-byte alignment) before parsing: allocate a new
buffer via gpa.alloc or gpa.create with element type that has 4-byte alignment,
copy the duped bytes into it (or use an allocator API that supports alignment),
then use a rem pointer into that aligned buffer for all eat/eatC calls and
bytesAsSlice uses; update errdefer/free to free the aligned buffer instead of
the original duped pointer so bytesAsSlice is always given properly aligned
memory.
♻️ Duplicate comments (1)
src/analyze_transpiled_module.zig (1)

155-170: Consider marking internal ModuleInfo fields private (#).

If these fields aren’t part of the public API, using # keeps the surface minimal and follows project conventions.
As per coding guidelines, please prefer # for private fields.

Comment thread src/analyze_transpiled_module.zig

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/bun.js/RuntimeTranspilerStore.zig (1)

475-481: Inconsistent error handling pattern with bun.handleOom + catch null.

The current pattern bun.handleOom(...) catch null is problematic: handleOom will panic on OOM before catch null can run. For non-OOM errors (deserialization failures), they get swallowed by catch null after handleOom returns successfully.

Based on learnings, module_info is supplementary metadata and should use graceful degradation. Remove handleOom and use only catch null:

🐛 Proposed fix
                 if (entry.esm_record.len > 0) {
                     if (entry.metadata.module_type == .cjs) {
                         `@panic`("TranspilerCache contained cjs module with module info");
                     }
-                    module_info = bun.handleOom(analyze_transpiled_module.ModuleInfoDeserialized.create(entry.esm_record, bun.default_allocator)) catch null;
+                    module_info = analyze_transpiled_module.ModuleInfoDeserialized.create(entry.esm_record, bun.default_allocator) catch null;
                 }
♻️ Duplicate comments (6)
src/js_printer.zig (4)

1110-1142: Thread TopLevel into nested bodies to retain top‑level var tracking.

Line 1114/1120 passes TopLevel.init(.no) into nested bodies, which drops top‑level var declarations inside blocks/loops from module_info. Thread the incoming TopLevel (or tlmtlo.subVar()) through printBody/printBlock/printStmt so var‑only top‑level bindings are recorded.


3562-3576: Avoid hard OOM failures for module_info writes.

Line 3569 and related blocks still use bun.handleOom for module_info writes, which aborts on OOM. module_info is best‑effort metadata and should degrade gracefully instead of crashing. Please convert these to optional/error‑tolerant calls (e.g., if (mi.str(...)) |id| { _ = mi.addVar(...) catch {}; }). Based on learnings, module_info should remain best‑effort and avoid crashing on OOM.


4953-4966: Top‑level for‑loop init bindings aren’t recorded in module_info.

Line 4953+ passes an empty TopLevelAndIsExport into printDecls, so top‑level for header declarations are skipped in module_info. Thread the appropriate TopLevel context into printForLoopInit and use it when calling printDecls.


6019-6030: Free serialized module_info buffer after cache.put().

serializeModuleInfo returns an owned slice; RuntimeTranspilerCache.put() does not take ownership. The slice is never freed after cache.put(), leaking memory at both call sites. Free the slice immediately after the call.

♻️ Proposed fix
 if (opts.runtime_transpiler_cache) |cache| {
     const esm_record = serializeModuleInfo(opts.module_info);
+    defer bun.default_allocator.free(esm_record);
     cache.put(printer.writer.ctx.getWritten(), source_maps_chunk.buffer.list.items, esm_record);
 }
src/bundler/linker_context/generateChunksInParallel.zig (1)

315-323: Per-chunk public_path not applied when building unique_key→path map.

The loop builds unique_key_to_path using c.options.public_path for all chunks, but browser chunks emitted from server builds use bundler.transpilerForTarget(.browser).options.public_path (see lines 399-402). This mismatch causes module_info to reference paths with incorrect prefixes for browser chunks.

🔧 Suggested fix
         for (chunks) |*ch| {
             if (ch.unique_key.len > 0 and ch.final_rel_path.len > 0) {
+                const chunk_public_path = if (ch.flags.is_browser_chunk_from_server_build)
+                    bundler.transpilerForTarget(.browser).options.public_path
+                else
+                    c.options.public_path;
                 // Use cheapPrefixNormalizer to match the path normalization done by
                 // IntermediateOutput.code() (e.g. stripping "./" from relative paths)
-                const normalizer = bun.bundle_v2.cheapPrefixNormalizer(c.options.public_path, ch.final_rel_path);
+                const normalizer = bun.bundle_v2.cheapPrefixNormalizer(chunk_public_path, ch.final_rel_path);
                 const resolved = std.fmt.allocPrint(c.allocator(), "{s}{s}", .{ normalizer[0], normalizer[1] }) catch continue;
                 unique_key_to_path.put(ch.unique_key, resolved) catch continue;
             }
         }
src/bundler/linker_context/postProcessJSChunk.zig (1)

132-139: Missing guard for unused/type-only imports.

The code checks record.path.is_disabled and record.tag == .bun, but does not check record.flags.is_unused. Type-only imports and other elided imports have is_unused set and should be skipped to avoid registering phantom bindings in module_info.

🔧 Suggested fix
                         .s_import => |s| {
                             const record = &source_import_records[s.import_record_index];
                             if (record.path.is_disabled) continue;
                             if (record.tag == .bun) continue;
+                            if (record.flags.is_unused) continue;

                             const import_path = record.path.text;

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@src/analyze_transpiled_module.zig`:
- Around line 147-152: StringContext.eql is using u32 values directly in slice
ops which breaks on 64-bit targets; in the eql implementation (pub fn eql(...))
cast the slice start and length to usize before slicing—convert the result of
`@intFromEnum`(item_key) to usize for the start index and cast
strings_lens[item_i] (u32) to usize for the length, then use those usize values
in the two slices passed to bun.strings.eqlLong.
- Around line 155-164: The ModuleInfo struct declares buffer and record_kinds as
managed std.ArrayList but the code uses unmanaged allocator APIs
(append(self.gpa,...), appendSlice(self.gpa,...), deinit(self.gpa)), so change
the types of buffer and record_kinds to std.ArrayListUnmanaged(StringID) and
std.ArrayListUnmanaged(RecordKind) respectively so they match the unmanaged
usage with the gpa allocator; update any initialization/deinit sites that
reference buffer or record_kinds to use the Unmanaged API symbols already used
in the code (e.g., append, appendSlice, deinit with self.gpa).
♻️ Duplicate comments (6)
src/bundler/linker_context/generateChunksInParallel.zig (1)

315-321: Use per‑chunk public_path when resolving module_info paths.

Browser chunks from server builds need the browser transpiler’s public_path; using c.options.public_path for all chunks produces incorrect module_info paths.

🛠️ Suggested fix
-                const normalizer = bun.bundle_v2.cheapPrefixNormalizer(c.options.public_path, ch.final_rel_path);
+                const public_path = if (ch.flags.is_browser_chunk_from_server_build)
+                    bundler.transpilerForTarget(.browser).options.public_path
+                else
+                    c.options.public_path;
+                const normalizer = bun.bundle_v2.cheapPrefixNormalizer(public_path, ch.final_rel_path);
src/bundler/linker_context/postProcessJSChunk.zig (1)

132-165: Skip unused/type‑only imports when populating module_info.

You still need to ignore record.flags.is_unused to avoid recording elided/type‑only imports.

🛠️ Suggested fix
-                            if (record.path.is_disabled) continue;
+                            if (record.path.is_disabled) continue;
+                            if (record.flags.is_unused) continue;
                             if (record.tag == .bun) continue;
src/js_printer.zig (4)

1110-1162: Top‑level var in nested blocks still isn’t captured

These call sites pass TopLevel.init(.no), so top‑level var declared inside nested blocks/loops won’t be recorded in module_info.

Also applies to: 4241-4249, 4336-4357, 4422-4436, 4988-5047


4953-4966: Top‑level bindings in for headers still aren’t recorded

printForLoopInit passes an empty TopLevelAndIsExport, so top‑level var/let/const in loop headers won’t be captured in module_info.


3562-3576: Avoid bun.handleOom in module_info writes (best‑effort metadata)

These paths still abort on OOM; module_info should degrade gracefully instead of crashing. Based on learnings, module_info should remain best‑effort and avoid hard OOM failures.

Also applies to: 3675-3706, 3793-3798, 3831-3836, 3875-3880, 3908-3913, 3931-3936, 3973-3982, 4136-4140, 4197-4205, 4624-4671


6019-6030: Free esm_record after cache.put()

serializeModuleInfo() returns an owned slice; it’s still not freed after cache.put(), so the buffer leaks.

🐛 Proposed fix
 if (opts.runtime_transpiler_cache) |cache| {
     const esm_record = serializeModuleInfo(opts.module_info);
+    defer bun.default_allocator.free(esm_record);
     cache.put(printer.writer.ctx.getWritten(), source_maps_chunk.buffer.list.items, esm_record);
 }
@@
 if (opts.runtime_transpiler_cache) |cache| {
     const esm_record = serializeModuleInfo(opts.module_info);
+    defer bun.default_allocator.free(esm_record);
     cache.put(printer.writer.ctx.getWritten(), "", esm_record);
 }

Comment thread src/analyze_transpiled_module.zig
Comment thread src/analyze_transpiled_module.zig Outdated

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Fix all issues with AI agents
In `@src/analyze_transpiled_module.zig`:
- Line 52: The struct field "dead" is declared but never set, so the defensive
check in zig__ModuleInfoDeserialized__toJSModuleRecord is ineffective; either
remove the unused "dead" field or set it to true at the start of the deinit()
implementation (e.g., in the deinit method for the same struct) so that
zig__ModuleInfoDeserialized__toJSModuleRecord can detect freed instances; update
the struct definition and deinit() accordingly and ensure any tests or callers
of deinit() reflect the chosen behavior.

In `@src/js_printer.zig`:
- Around line 641-668: Rename the internal struct fields to use the private `#`
prefix and update all usages: change `is_export` → `#is_export` in
TopLevelAndIsExport and `is_top_level` → `#is_top_level` in TopLevel (both
branches controlled by `may_have_module_info`), and update any struct literals
and field access inside TopLevel methods (`init`, `subVar`, `isTopLevel`) and
call sites to use `.#is_top_level`/`.#is_export` as appropriate (preserve
behavior and enum `IsTopLevel` usage).
- Around line 3699-3703: The code calls str.slice(p.options.allocator) to
produce str8 and passes it to mi.str(str8) but does not free the UTF‑8 buffer
when str is UTF‑16; update the block around may_have_module_info /
p.moduleInfo() to free the allocated buffer (use p.options.allocator.free on
str8) after calling mi.str, or avoid allocation by using slice8() when the input
is guaranteed UTF‑8 or by using stringCloned()/ModuleInfo APIs that take
ownership; ensure the final change still calls bun.handleOom(mi.str(...)) and
preserves the tlm.is_top_level branch that calls mi.addVar(name_id, vk).
♻️ Duplicate comments (6)
src/bundler/linker_context/generateChunksInParallel.zig (1)

307-321: Use per‑chunk public_path when resolving module_info placeholders.
Line 319 uses c.options.public_path, but browser chunks from server builds are emitted with the browser transpiler’s public_path. This can make cross‑chunk specifiers in module_info point at the wrong paths.

🛠️ Proposed fix
-        for (chunks) |*ch| {
+        const bundler = `@as`(*bun.bundle_v2.BundleV2, `@fieldParentPtr`("linker", c));
+        for (chunks) |*ch| {
             if (ch.unique_key.len > 0 and ch.final_rel_path.len > 0) {
                 // Use cheapPrefixNormalizer to match the path normalization done by
                 // IntermediateOutput.code() (e.g. stripping "./" from relative paths)
-                const normalizer = bun.bundle_v2.cheapPrefixNormalizer(c.options.public_path, ch.final_rel_path);
+                const public_path = if (ch.flags.is_browser_chunk_from_server_build)
+                    bundler.transpilerForTarget(.browser).options.public_path
+                else
+                    c.options.public_path;
+                const normalizer = bun.bundle_v2.cheapPrefixNormalizer(public_path, ch.final_rel_path);
                 const resolved = std.fmt.allocPrint(c.allocator(), "{s}{s}", .{ normalizer[0], normalizer[1] }) catch continue;
                 unique_key_to_path.put(ch.unique_key, resolved) catch continue;
             }
         }
src/bundler/linker_context/postProcessJSChunk.zig (1)

132-165: Skip type‑only/unused imports when populating module_info.
Without an is_unused guard, type‑only/elided imports can be recorded as runtime dependencies.

🛠️ Proposed fix
-                            const record = &source_import_records[s.import_record_index];
-                            if (record.path.is_disabled) continue;
+                            const record = &source_import_records[s.import_record_index];
+                            if (record.flags.is_unused) continue;
+                            if (record.path.is_disabled) continue;
                             if (record.tag == .bun) continue;
src/js_printer.zig (4)

1110-1142: Top‑level var inside nested blocks still isn’t recorded in module_info.

printBody/printBlock enter nested scopes with TopLevel.init(.no), so top‑level var declared inside blocks/loops won’t be captured. Thread the TopLevel through nested bodies and use tlmtlo.subVar() when descending; also ensure lexical declarations are ignored when is_top_level == .var_only (per the earlier note).


3562-3576: Avoid bun.handleOom for module_info writes (best‑effort metadata).

These module_info calls currently abort on allocation failure. Module_info is supplementary and should degrade gracefully; use best‑effort patterns (if (mi.str(...)) |id| { _ = mi.addVar(...) catch {}; } else |_| {}) rather than bun.handleOom.
Based on learnings, module_info writes should not crash the build on OOM.

Also applies to: 3793-3999, 4624-4673, 5998-6002


4953-4966: Top‑level for initializer bindings aren’t recorded in module_info.

printForLoopInit always passes an empty TopLevelAndIsExport, so top‑level var/let/const in for (...) headers are dropped. Thread TopLevel into printForLoopInit and pass the right TopLevelAndIsExport.


6018-6030: Free esm_record after cache.put() (owned slice).

serializeModuleInfo() returns an owned slice from bun.default_allocator, but the call sites never free it after cache.put(). This leaks per transpiled module. Consider guarding against the empty-string sentinel before freeing.

🐛 Proposed fix
-            if (opts.runtime_transpiler_cache) |cache| {
-                const esm_record = serializeModuleInfo(opts.module_info);
-                cache.put(printer.writer.ctx.getWritten(), source_maps_chunk.buffer.list.items, esm_record);
-            }
+            if (opts.runtime_transpiler_cache) |cache| {
+                const esm_record = serializeModuleInfo(opts.module_info);
+                defer if (esm_record.len > 0) bun.default_allocator.free(esm_record);
+                cache.put(printer.writer.ctx.getWritten(), source_maps_chunk.buffer.list.items, esm_record);
+            }

Comment thread src/analyze_transpiled_module.zig Outdated
Comment thread src/js_printer.zig
Comment thread src/js_printer.zig

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@src/analyze_transpiled_module.zig`:
- Around line 162-163: The fields buffer and record_kinds are declared as
std.ArrayList but the code uses the unmanaged API (append(self.gpa, ...),
.deinit(self.gpa), .items); change their types to
std.ArrayListUnmanaged(StringID) and std.ArrayListUnmanaged(RecordKind)
respectively so the calls using self.gpa work correctly, and ensure any
initialization/deinitialization and .items access use the unmanaged list
semantics (i.e., keep append(self.gpa, ...), .deinit(self.gpa), and .items usage
after switching the types).
- Around line 379-387: The loop mixes usize and u32 when computing offsets and
slices; cast the per-iteration length to usize before using it in arithmetic and
slicing to satisfy Zig's type rules. In the for loop over res.strings_lens,
convert len to a usize (e.g., let len_usize = `@intToType`(usize, len) or similar)
and then use len_usize in the bounds check (res.strings_buf.len < offset +
len_usize), in the slice expression (res.strings_buf[offset..][0..len_usize]),
and when advancing offset (offset += len_usize); keep the rest of the logic
(IdentifierArray.create, identifiers.setFromUtf8, identifiers.destroy)
unchanged.
♻️ Duplicate comments (2)
src/analyze_transpiled_module.zig (2)

79-88: Fix alignment before bytesAsSlice to avoid UB.

Line 80 uses gpa.dupe(u8, source), which only guarantees 1‑byte alignment. You later reinterpret this buffer as u32, StringID, and FetchParameters slices, which require 4‑byte alignment. On strict architectures this can trap or silently misread. Allocate an aligned buffer and copy before parsing.

🛠️ Proposed fix
-        const duped = try gpa.dupe(u8, source);
-        errdefer gpa.free(duped);
-        var rem: []const u8 = duped;
+        const duped = try gpa.alignedAlloc(u8, `@alignOf`(u32), source.len);
+        errdefer gpa.free(duped);
+        `@memcpy`(duped, source);
+        var rem: []const u8 = duped;

155-161: Make internal ModuleInfo fields private with # (if not API).

These fields appear internal to the struct. If they aren’t part of the public surface, mark them private using the # prefix and update internal references accordingly. As per coding guidelines.

Also applies to: 164-167

Comment thread src/analyze_transpiled_module.zig Outdated
Comment thread src/analyze_transpiled_module.zig

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@src/js_printer.zig`:
- Around line 6332-6348: serializeModuleInfo currently returns a mix of
ownerships (a static "" on failure and an allocator‑owned slice on success);
change the API to make ownership explicit by either (A) changing the signature
to ?[]u8 and returning null on failure while keeping the success path that
returns the owned slice from buf.toOwnedSlice(bun.default_allocator), or (B)
always returning an allocator‑owned slice (allocate and return an empty owned
slice instead of the static "" on error). Update serializeModuleInfo, its
callers, and document the chosen behavior; reference serializeModuleInfo,
mi.finalize(), deserialized.serialize(...),
buf.toOwnedSlice(bun.default_allocator), and bun.default_allocator when making
the changes.

Comment thread src/js_printer.zig Outdated

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@src/bun.js/ModuleLoader.zig`:
- Around line 426-434: Replace the hard panic in ModuleLoader.zig where a CJS
entry unexpectedly contains transpiler module_info: inside the blk that sets
module_info (refer to
analyze_transpiled_module.ModuleInfoDeserialized.createFromCachedRecord,
entry.esm_record, and entry.metadata.module_type), change the `@panic` call to log
a warning about the corrupted/transpiler-cache mismatch (use the existing
logging facility in scope, e.g., the module's logger or std.debug.warn) and
continue as a cache miss by returning null for module_info so the loader will
reparse/retranspile instead of crashing.

Comment thread src/bun.js/ModuleLoader.zig Outdated
Claude Bot and others added 2 commits January 29, 2026 13:03
Remove the [Bun] and [Bytecode Runtime] debug logs that were added
under verboseDiskCache. These logs were causing test failures because
the tests expected only [Disk Cache] lines in stderr output.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Comment thread test/js/bun/typescript/type-export.test.ts Outdated
Comment thread src/bun.js/RuntimeTranspilerCache.zig
Comment thread src/bun.js/bindings/ZigSourceProvider.cpp
Comment thread src/OutputFile.zig

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/bun.js/RuntimeTranspilerCache.zig (1)

363-389: ⚠️ Potential issue | 🟠 Major

Free sourcemap on errors after esm_record load.

Once the sourcemap is assigned, a later failure while reading or hashing esm_record will return early without freeing the sourcemap. Add an error-path cleanup after the sourcemap block to avoid leaks.

🛠️ Proposed fix
             if (this.metadata.sourcemap_byte_length > 0) {
                 const sourcemap = try sourcemap_allocator.alloc(u8, this.metadata.sourcemap_byte_length);
                 errdefer sourcemap_allocator.free(sourcemap);
                 const read_bytes = try file.preadAll(sourcemap, this.metadata.sourcemap_byte_offset);
                 if (read_bytes != this.metadata.sourcemap_byte_length) {
                     return error.MissingData;
                 }

                 this.sourcemap = sourcemap;
             }
+            errdefer if (this.sourcemap.len > 0) {
+                sourcemap_allocator.free(this.sourcemap);
+            }

             if (this.metadata.esm_record_byte_length > 0) {
🤖 Fix all issues with AI agents
In `@src/bun.js/bindings/ZigSourceProvider.cpp`:
- Around line 102-113: Add a defensive validation before converting
bytecode_origin_path to a file URL: in the getSourceOrigin lambda, after
obtaining resolvedSource.bytecode_origin_path.toWTFString(...), ensure the
string is a filesystem path (not already a URL scheme like "file://" or
"builtin:") and assert or return a fallback if it is not; use the same guard
pattern as ImportMetaObject.cpp (check for a URL scheme or
isAbsoluteFilesystemPath) and only call
WTF::URL::fileURLWithFileSystemPath(bytecodeOriginPath) when the check passes,
otherwise fall back to toSourceOrigin(sourceURLString, isBuiltin) or trigger a
defensive assert so the bytecode cache origin cannot be corrupted.

Comment thread src/bun.js/bindings/ZigSourceProvider.cpp
metadata.output_hash = hash(output_bytes);
metadata.sourcemap_hash = hash(sourcemap);
if (esm_record.len > 0) {
metadata.esm_record_hash = hash(esm_record);

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be suspicious that this is conditionally written?

metadata.esm_record_hash defaults to 0 otherwise

// bytecode_origin_path is used for bytecode cache validation where the origin must match
// exactly what was used at build time.
const auto getSourceOrigin = [&]() -> SourceOrigin {
auto bytecodeOriginPath = resolvedSource.bytecode_origin_path.toWTFString(BunString::ZeroCopy);

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This probably should be transferToWTFString() but its okay

.specifier = specifier,
.source_url = specifier.dupeRef(),
// bytecode_origin_path is the path used when generating bytecode; must match for cache hits
.bytecode_origin_path = if (file.bytecode_origin_path.len > 0) bun.String.fromBytes(file.bytecode_origin_path) else bun.String.empty,

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this field on the struct is needed. The module info should have everything it needs.

@Jarred-Sumner Jarred-Sumner merged commit 71ce550 into main Jan 30, 2026
53 of 55 checks passed
@Jarred-Sumner Jarred-Sumner deleted the ali/esm-bytecode branch January 30, 2026 09:38
xhjkl pushed a commit to xhjkl/bun that referenced this pull request May 14, 2026
### What does this PR do?

### How did you verify your code works?

---------

Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: Claude Bot <claude-bot@bun.sh>
Co-authored-by: claude[bot] <209825114+claude[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants