Skip to content
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions src/bun.js/bindings/ErrorCode.ts
Original file line number Diff line number Diff line change
Expand Up @@ -216,6 +216,7 @@ const errors: ErrorCodeMapping = [
["ERR_MYSQL_LIFETIME_TIMEOUT", Error, "MySQLError"],
["ERR_UNHANDLED_REJECTION", Error, "UnhandledPromiseRejection"],
["ERR_REQUIRE_ASYNC_MODULE", Error],
["ERR_REQUIRE_CYCLE_MODULE", Error],
["ERR_S3_INVALID_ENDPOINT", Error],
["ERR_S3_INVALID_METHOD", Error],
["ERR_S3_INVALID_PATH", Error],
Expand Down
36 changes: 34 additions & 2 deletions src/bun.js/bindings/ZigGlobalObject.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -740,8 +740,40 @@
bool entryExistedBefore = false;
if (auto* entry = loader->registryEntry(key)) {
entryExistedBefore = true;
if (isModuleEvaluated(entry->record())) {
auto* ns = entry->record()->getModuleNamespace(globalObject, false);
auto* record = entry->record();
if (isModuleEvaluated(record)) {
auto* ns = record->getModuleNamespace(globalObject, false);
RETURN_IF_EXCEPTION(scope, {});
return JSValue::encode(ns);
}
// The record already exists at status Evaluating, i.e. it is on an
// outer InnerModuleEvaluation stack that we are re-entering. Calling
// loadModuleSync here would reach CyclicModuleRecord::evaluate() and
// violate its precondition (status must be LINKED, EVALUATING-ASYNC,
// or EVALUATED), so handle this case directly instead.
if (auto* cyclic = dynamicDowncast<JSC::CyclicModuleRecord>(record); cyclic && cyclic->status() == JSC::CyclicModuleRecord::Status::Evaluating) {
// ExecuteModule sets the generator State field from Init -> Executing
// before entering the body. State == Init therefore means the body
// has not started: the module is still in the requested-modules
// loop of InnerModuleEvaluation, and let/const/class exports are
// TDZ. Returning the namespace here is the regression that surfaces
// as a confusing "x is not a function" far from the cycle. Match
// Node's ERR_REQUIRE_CYCLE_MODULE instead.
JSValue state = record->internalField(JSC::AbstractModuleRecord::Field::State).get();
if (state.isInt32() && state.asInt32() == static_cast<int32_t>(JSC::AbstractModuleRecord::State::Init))
return Bun::throwError(globalObject, scope, Bun::ErrorCode::ERR_REQUIRE_CYCLE_MODULE, makeString("Cannot require() ES Module "_s, keyString, " in a cycle."_s));

Check warning on line 764 in src/bun.js/bindings/ZigGlobalObject.cpp

View check run for this annotation

Claude / Claude Code Review

ERR_REQUIRE_CYCLE_MODULE message omits '(from <parent>)' suffix

nit: Node's `ERR_REQUIRE_CYCLE_MODULE` message includes a `(from <parent>)` suffix naming the requesting module, which helps identify which side of the cycle issued the `require()`. Bun's message only includes the target module path. Plumbing the parent through `$loadEsmIntoCjs` → `$esmLoadSync` (CommonJS.ts has `this.filename`) would close the gap, but `.code` and the target path are already correct so this is non-blocking.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 nit: Node's ERR_REQUIRE_CYCLE_MODULE message includes a (from <parent>) suffix naming the requesting module, which helps identify which side of the cycle issued the require(). Bun's message only includes the target module path. Plumbing the parent through $loadEsmIntoCjs$esmLoadSync (CommonJS.ts has this.filename) would close the gap, but .code and the target path are already correct so this is non-blocking.

Extended reasoning...

What the divergence is

Node's ESM loader (lib/internal/modules/esm/loader.js) constructs this error as:

let message = `Cannot require() ES Module ${filename} in a cycle.`;
if (parentFilename) message += ` (from ${parentFilename})`;
throw new ERR_REQUIRE_CYCLE_MODULE(message);

So Node's message names both the target module and the parent module that issued the require(). Bun's new message at ZigGlobalObject.cpp:764 is:

makeString("Cannot require() ES Module "_s, keyString, " in a cycle."_s)

— only the target. The (from <parent>) suffix is missing.

Why it can't be added trivially

functionEsmLoadSync only receives one argument: callFrame->argument(0), the resolved key of the target module. The parent/requesting module's path isn't available at this layer. The call chain is overridableRequire (which has this.filename) → $requireESM(id)$loadEsmIntoCjs(resolved)$esmLoadSync(resolved)functionEsmLoadSync. None of those intermediate calls forward the parent filename, so adding the suffix would require threading a second argument through $loadEsmIntoCjs and $esmLoadSync.

Step-by-step example

Using test 3's fixture: a.mjs imports b.mjs, and b.mjs does createRequire(import.meta.url)('./a.mjs').

  1. a.mjs enters InnerModuleEvaluationstatus = Evaluating, generator State = Init (still in the requested-modules loop).
  2. b.mjs body runs, calls require('./a.mjs')overridableRequire (where this.filename is /…/b.mjs).
  3. $requireESM$loadEsmIntoCjs('/…/a.mjs')$esmLoadSync('/…/a.mjs') — only the target path is passed.
  4. functionEsmLoadSync finds the registry entry for a.mjs at Status::Evaluating with State == Init and throws with message Cannot require() ES Module /…/a.mjs in a cycle.
  5. Node for the same scenario produces Cannot require() ES Module /…/a.mjs in a cycle. (from /…/b.mjs).

The user sees which module can't be required, but not which module did the requiring — in a two-file cycle that's obvious, but in a larger graph (e.g. a → b → c → d → require(a)) the parent path is the more useful debugging hint.

Addressing the refutation

One verifier argued this isn't worth a comment because Bun routinely diverges from Node's exact error wording and .code is what matters for programmatic compat. That's correct on the merits — .code === 'ERR_REQUIRE_CYCLE_MODULE' is set, the error class is Error, and the target path is in the message (test 3 verifies all of this), so any code that branches on e.code or greps the message for the target filename works identically. This is purely a human-readable-message difference, and the fix requires non-trivial signature changes across three layers. Filing as a nit that can be deferred to a follow-up; it should not block this PR.

Suggested fix (follow-up)

Pass the parent filename as a second argument from overridableRequire/requireESMFromHijackedExtension (this.filename) through $requireESM$loadEsmIntoCjs$esmLoadSync, then in functionEsmLoadSync read callFrame->argument(1) and append (from <parent>) when it's a non-empty string.

// State != Init: ExecuteModule has been entered. For a non-TLA
// record that means the body ran to completion synchronously
// (status only flips to Evaluated once the SCC root pops the
// Tarjan stack). The namespace is fully populated; return it
// without re-driving the loader. For a TLA record reachable here
// the body is suspended at its first await (status flips to
// EvaluatingAsync only when the SCC root pops), so post-await
// bindings are TDZ — treat that the same as the EvaluatingAsync
// path below and surface the async-module error.
if (cyclic->hasTLA())
return throwVMTypeError(globalObject, scope, makeString("require() async module \""_s, keyString, "\" is unsupported. use \"await import()\" instead."_s));
Comment on lines +774 to +775

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 nit (pre-existing pattern): this throws a plain TypeError with no .code, whereas Node sets error.code = 'ERR_REQUIRE_ASYNC_MODULE' for this case — and ErrorCode.ts already defines ERR_REQUIRE_ASYNC_MODULE (currently unused). Since this line intentionally mirrors the existing fallback at ~line 819 (and ModuleLoader.cpp:684/739), consider switching all of them to Bun::throwError(globalObject, scope, Bun::ErrorCode::ERR_REQUIRE_ASYNC_MODULE, ...) in a follow-up rather than just this one.

Extended reasoning...

What the issue is

The new hasTLA() branch at ZigGlobalObject.cpp:773-774 uses throwVMTypeError(globalObject, scope, makeString(...)), which constructs a plain JSC TypeError with only a message and no .code property. Node.js, by contrast, throws an error with error.code === 'ERR_REQUIRE_ASYNC_MODULE' (and Error rather than TypeError) when require() targets a module that contains top-level await. Bun's own ErrorCode.ts:217 already defines ERR_REQUIRE_ASYNC_MODULE, but a grep confirms it is not referenced anywhere in C++ — the code is generated but never used.

Why this is a copy of a pre-existing gap, not a new one

The comment at lines 771-772 says "treat that the same as the EvaluatingAsync path below and surface the async-module error" — and the EvaluatingAsync path at ~line 819 uses the identical throwVMTypeError(...) pattern with the same message string. ModuleLoader.cpp:684 and ModuleLoader.cpp:739 do the same. So the author deliberately matched existing behavior for consistency; the Node-compat gap predates this PR. Notably, the same PR does use the structured form one branch up at line 763 (Bun::throwError(..., Bun::ErrorCode::ERR_REQUIRE_CYCLE_MODULE, ...)), so the mechanism is clearly available and known here.

How it manifests / step-by-step

Take an ESM cycle where a.mjs (which contains top-level await) imports b.mjs, and b.mjs does createRequire(import.meta.url)('./a.mjs') while a.mjs is on the InnerModuleEvaluation stack:

  1. a.mjs enters InnerModuleEvaluationstatus = Evaluating. ExecuteModule sets generator State = Executing and runs the body up to the first await.
  2. Before the SCC root pops, control reaches b.mjs, which calls require('./a.mjs')functionEsmLoadSync.
  3. The registry entry exists, isModuleEvaluated is false, status == Evaluating, State != Init, and cyclic->hasTLA() is true → line 774 executes.
  4. The thrown value is a TypeError whose only own property is message. User code that does catch (e) { if (e.code === 'ERR_REQUIRE_ASYNC_MODULE') ... } (the documented Node way to detect this and fall back to dynamic import()) sees e.code === undefined and takes the wrong branch.

Impact

Minor Node-compat divergence in a narrow edge case (TLA module at status == Evaluating being require()d in a cycle). The much more common path at line ~819 already had the same divergence on main, so no behavior is worse than before — the gap is just propagated to one more code path. Anything that pattern-matches on the error message string will still work.

Suggested fix

Replace the throwVMTypeError here (and at the three pre-existing sites: ZigGlobalObject.cpp:~819, ModuleLoader.cpp:684, ModuleLoader.cpp:739) with:

return Bun::throwError(globalObject, scope, Bun::ErrorCode::ERR_REQUIRE_ASYNC_MODULE,
    makeString("require() async module \""_s, keyString, "\" is unsupported. use \"await import()\" instead."_s));

Fixing only line 774 would make it inconsistent with the path it explicitly mirrors, so this is best done as a small follow-up across all four sites; it shouldn't block this PR.

auto* ns = record->getModuleNamespace(globalObject, false);
Comment on lines +754 to +776

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Inspect local JSC/WebKit state definitions and nearby commentary.
rg -n -C3 'AbstractModuleRecord::State|Field::State|ExecuteModule|CyclicModuleRecord::Status::Evaluating' \
  --iglob '*.h' --iglob '*.cpp'

# Check whether there is already coverage for re-entry from inside an executing ESM body.
rg -n -C2 'createRequire\(import\.meta\.url\)|ERR_REQUIRE_CYCLE_MODULE|require\("\./.*\.mjs"\)' test src

Repository: oven-sh/bun

Length of output: 21266


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Find JSC AbstractModuleRecord::State enum definition
fd -e h -e cpp | xargs rg -l 'AbstractModuleRecord.*State' | head -5

# Extract the State enum definition
rg -A 20 'enum.*State.*{' src/bun.js/bindings/ZigGlobalObject.cpp || true
rg -n 'AbstractModuleRecord::State' --iglob '*.h' -A 1 -B 1 | head -40

Repository: oven-sh/bun

Length of output: 205


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Check the full context around the State check - look at ExecuteModule behavior
rg -n 'ExecuteModule' --iglob '*.h' --iglob '*.cpp' | grep -i state | head -20

# Look for any guard or check after State != Init that might validate completion
rg -n -A 10 'state.asInt32.*Init' src/bun.js/bindings/ZigGlobalObject.cpp

Repository: oven-sh/bun

Length of output: 1302


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Check if the test file covers self-require from within body
cat test/js/bun/resolve/require-esm-cycle.test.ts | grep -A 30 'self.*require\|createRequire' | head -50

# Search for any test that has a module importing itself
rg -l 'require.*\[".*"\].*require.*\[".*"\]' test

Repository: oven-sh/bun

Length of output: 1847


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Look for the State enum in JavaScriptCore headers or JSC binding files
find . -name "*.h" -o -name "*.cpp" | xargs rg -n 'enum class State|enum State' | grep -i module | head -20

# More specific: look for AbstractModuleRecord definition
rg -n 'class AbstractModuleRecord' --iglob '*.h' -A 30 | head -60

Repository: oven-sh/bun

Length of output: 37


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Check if there are any other State values besides Init
rg -n 'AbstractModuleRecord::State::' src/bun.js/bindings/ --iglob '*.cpp' --iglob '*.h' | sort -u

# Look at the test file in detail for all test cases
cat test/js/bun/resolve/require-esm-cycle.test.ts | wc -l

# Show the entire test file
cat test/js/bun/resolve/require-esm-cycle.test.ts

Repository: oven-sh/bun

Length of output: 3947


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Check if there's a self-require test case (module A requires itself from within its own body)
rg -n 'import.*\./.*mjs.*require.*\./.*mjs' test/js/bun/resolve/ -A 5 -B 5 | head -80

# Also search for any TODO or FIXME comments in ZigGlobalObject.cpp related to this
rg -n 'TODO|FIXME|XXX' src/bun.js/bindings/ZigGlobalObject.cpp | grep -i 'state\|cycle\|require'

Repository: oven-sh/bun

Length of output: 37


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Look for what happens during module execution - is there a guard preventing self-require?
# Search for how InnerModuleEvaluation or ExecuteModule handles re-entry
rg -n 'InnerModuleEvaluation|ExecuteModule' src/bun.js/bindings/ --iglob '*.cpp' --iglob '*.h' | head -30

# Check if there's any comment or code explaining why State != Init is sufficient
rg -n -B5 -A5 'State.*Executing' src/bun.js/bindings/ZigGlobalObject.cpp

Repository: oven-sh/bun

Length of output: 1519


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Create a test case for self-require within body to see what actually happens
cat > /tmp/test_self_require.mjs << 'EOF'
// Test: module requiring itself from within its own body
import { createRequire } from "node:module";
const require = createRequire(import.meta.url);

console.log("Starting body");
let x;
try {
  // Try to require ourselves from within the body
  const self = require("./test_self_require.mjs");
  console.log("Self-require succeeded, x =", typeof self.x);
  x = "defined-after-require";
} catch (e) {
  console.log("Self-require threw:", e.code || e.message);
  x = "caught-error";
}

export const x = x;
console.log("Body complete, x =", x);
EOF

# Check what paths would hit this code
rg -n 'registryEntry.*key' src/bun.js/bindings/ZigGlobalObject.cpp -B 3 -A 3 | head -40

Repository: oven-sh/bun

Length of output: 1719


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Check if the concern about State is explicitly addressed in comments or follow-up fixes
rg -n 'State.*Executing\|Executing.*State\|state.*complete\|body.*complete' src/bun.js/bindings/ZigGlobalObject.cpp

# Look for any recent commits or changes related to this
git log --oneline -20 -- src/bun.js/bindings/ZigGlobalObject.cpp 2>/dev/null || echo "Git log not available"

Repository: oven-sh/bun

Length of output: 37


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Look for all possible AbstractModuleRecord State values
rg -n 'State::' src/bun.js/bindings/ --iglob '*.cpp' --iglob '*.h' | grep -i 'abstract\|module' | head -20

# Search in WebKit/JSC if available - check for the enum definition
find . -path '*/JavaScriptCore/*' -name '*.h' 2>/dev/null | xargs rg 'enum.*State' 2>/dev/null | head -10 || true

# Look for State field/enum in bindings
rg -n 'Field::State' src/bun.js/bindings/ --iglob '*.cpp' --iglob '*.h' -B 2 -A 2

Repository: oven-sh/bun

Length of output: 2366


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Check if there's any mechanism preventing re-entry during body execution
# Look for guards or checks before functionEsmLoadSync is called
rg -n 'functionEsmLoadSync' src/bun.js/bindings/ZigGlobalObject.cpp -B 10 | head -50

# Check what calls functionEsmLoadSync and if there are any conditions
rg -n 'functionEsmLoadSync' src/bun.js/bindings/ --iglob '*.cpp' -B 5 -A 5 | head -80

Repository: oven-sh/bun

Length of output: 7061


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Examine the full control flow of functionEsmLoadSync to understand all paths
sed -n '680,830p' src/bun.js/bindings/ZigGlobalObject.cpp

Repository: oven-sh/bun

Length of output: 8355


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Check if there's any mechanism that would prevent recursive re-entry during body execution
# Look for any stack depth checks or re-entry guards
rg -n 'recursion\|reentry\|depth\|stack' src/bun.js/bindings/ZigGlobalObject.cpp | head -20

# Check what happens when a module self-requires - does the entry creation prevent it?
rg -n 'registryEntry.*key\|moduleMap' src/bun.js/bindings/ZigGlobalObject.cpp -B 2 -A 2 | head -40

Repository: oven-sh/bun

Length of output: 37


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Look for what State values exist - check JSC Generator or AbstractModuleRecord
# Search for where State is actually modified/set during execution
rg -n 'internalField.*State.*set\|State.*=' src/bun.js/bindings/ --iglob '*.cpp' --iglob '*.h' | grep -v '//' | head -20

Repository: oven-sh/bun

Length of output: 37


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Check if the concern about self-require is actually covered in tests
# Search for any test that requires the same module from within its own body
find test -name '*.mjs' -o -name '*.ts' | xargs grep -l 'createRequire.*import.meta.url' | while read f; do
  if grep -q 'require.*self\|require.*\./' "$f" 2>/dev/null; then
    echo "=== $f ==="
    cat "$f" | head -50
  fi
done | head -100

Repository: oven-sh/bun

Length of output: 3512


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Look at the test coverage more carefully - check all test files for module self-require
rg 'require.*\(.*\..*mjs' test/js/bun/resolve/require-esm-cycle.test.ts -A 5 -B 5 | head -80

Repository: oven-sh/bun

Length of output: 985


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Understand the ExecuteModule flow: does State change during body execution?
# Look for where State field is read/written/modified
rg -n 'internalField.*State\|State.*value' src/bun.js/bindings/ --iglob '*.cpp' -B 3 -A 3 | head -60

# Check if there's documentation about State lifecycle
rg -n 'ExecuteModule\|State.*Init\|State.*Executing' src/bun.js/bindings/ZigGlobalObject.cpp -B 2 -A 2

Repository: oven-sh/bun

Length of output: 37


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# The key question: can a module require() itself from within its body?
# Create a minimal test to understand the behavior
cat > /tmp/test_self_require.mjs << 'EOF'
// Test module that tries to require itself
import { createRequire } from "node:module";
const require = createRequire(import.meta.url);

console.log("Before require");
try {
  const self = require("./test_self_require.mjs");
  console.log("Self-require returned, typeof self.exported:", typeof self.exported);
} catch (e) {
  console.log("Self-require threw:", e.code || e.name, e.message);
}
console.log("After require");

export const exported = "value from export";
EOF

# Also check: is there a test in the suite that would catch self-require issues?
rg -l 'self.*require\|require.*self' test/js/bun/resolve/ 

# Search for "Cannot access" error which would indicate TDZ exposure
rg -n 'Cannot access.*before initialization' test/ | head -5

Repository: oven-sh/bun

Length of output: 37


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Look for the State enum values more directly
rg -n 'AbstractModuleRecord' --iglob '*.h' -A 40 | grep -E 'enum|State' | head -30

# Check if there's any documentation on when State changes
rg -n 'State.*Executing\|Executing.*State\|Init.*Executing' src/bun.js/bindings/ --iglob '*.cpp' --iglob '*.h' -B 1 -A 1 | head -40

Repository: oven-sh/bun

Length of output: 197


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Examine what happens when loadModuleSync is called from within a module body
# Check the Tarjan stack logic mentioned in comments
rg -n 'Tarjan\|SCC.*root\|root.*pops' src/bun.js/bindings/ZigGlobalObject.cpp -B 2 -A 2

# Check if there's any guard that prevents re-entry from the same module
rg -n 'registryEntry\|moduleMap' src/bun.js/bindings/ZigGlobalObject.cpp -B 1 -A 1 | head -50

Repository: oven-sh/bun

Length of output: 37


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Check the comment about State field more carefully - understand ExecuteModule
# Look for when State is actually set and modified
rg -n 'State.*=.*Executing\|State.*=.*Init' src/bun.js/bindings/ --iglob '*.cpp' --iglob '*.h' | head -20

# Check if the State field is maintained throughout the body execution or if it changes
rg -n 'generator.*State\|JSGenerator.*State' src/bun.js/bindings/ --iglob '*.cpp' --iglob '*.h' | head -20

# Look at what hasTLA() does and if there are similar checks for non-TLA
rg -n 'hasTLA\|evaluationError' src/bun.js/bindings/ZigGlobalObject.cpp -B 2 -A 2 | head -40

Repository: oven-sh/bun

Length of output: 37


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Check if there's a control flow that would prevent self-require from being an issue
# Look at how InnerModuleEvaluation works and if it has any re-entry guards
rg -n 'InnerModuleEvaluation\|statusTryToRun' src/bun.js/bindings/ --iglob '*.cpp' --iglob '*.h' -B 2 -A 2 | head -60

# Check the actual test to see what the behavior should be
# Can a module call require() on itself from within its body?
cat test/js/bun/resolve/require-esm-cycle.test.ts | grep -A 50 "test.*SCC sibling"

Repository: oven-sh/bun

Length of output: 37


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# The key question: what are all the possible State values and when do they change?
# Let's look at JSC generator state
rg -n 'class.*Generator\|enum.*State' src/bun.js/bindings/ --iglob '*.h' -A 10 | head -80

# Look for documentation or implementation of how State changes during execution
rg -n 'generator.*State.*Init\|generator.*State.*Executing' src/bun.js/bindings/ --iglob '*.cpp' -A 3 -B 3 | head -50

Repository: oven-sh/bun

Length of output: 37


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Check whether there's any mechanism preventing a module from re-entering require() on itself
# Look at loadModuleSync and its guards
rg -n 'loadModuleSync\|CyclicModuleRecord::evaluate' src/bun.js/bindings/ZigGlobalObject.cpp -B 3 -A 3 | head -60

# Look for any evaluation stack or guard mechanism
rg -n 'evaluation.*stack\|InnerModuleEvaluation' src/bun.js/bindings/ZigGlobalObject.cpp -B 2 -A 2 | head -40

Repository: oven-sh/bun

Length of output: 37


🏁 Script executed:

#!/bin/bash
set -euoiphisde pipefail

# Most direct approach: what does the comment say about the assumption?
# Lines 764-770 claim State != Init means body is complete, let's verify
sed -n '750,800p' src/bun.js/bindings/ZigGlobalObject.cpp

Repository: oven-sh/bun

Length of output: 173


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Most direct approach: what does the comment say about the assumption?
# Lines 764-770 claim State != Init means body is complete, let's verify
sed -n '750,800p' src/bun.js/bindings/ZigGlobalObject.cpp

Repository: oven-sh/bun

Length of output: 3329


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Check the full test to understand what's covered
cat test/js/bun/resolve/require-esm-cycle.test.ts

Repository: oven-sh/bun

Length of output: 3783


Test coverage is incomplete—no test for module self-require from within its body.

Lines 753-775 assume State != Init proves the module body finished executing. However, the generator State field may remain in its "Executing" state throughout the entire body, and there is no test for a module calling require() on itself from within its own body (via createRequire(import.meta.url)). If such a self-require occurs while the body is still running, the namespace would be returned with later exports still in TDZ, reintroducing the partial-namespace bug.

Add a test case that:

  • Has module A create a require function and call require("./a.mjs") from within its own body
  • Verify it either throws ERR_REQUIRE_CYCLE_MODULE or correctly defers the access until the body completes

Or document/verify that the JSC generator State field has a stronger completion signal than "not Init" that is checked implicitly.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/bun.js/bindings/ZigGlobalObject.cpp` around lines 753 - 775, The PR is
missing a test that reproduces a module self-require from inside its own body to
ensure we don't return a partially-populated namespace when the generator State
stays "Executing"; add a unit/integration test that loads an ES module A which
calls createRequire(import.meta.url) and does require("./a.mjs") from within its
module body, asserting that the operation either throws ERR_REQUIRE_CYCLE_MODULE
or blocks/defers access until the body completes (i.e., never returns a
namespace with TDZ exports); place the test alongside other module/require tests
and exercise both a plain module and a TLA case so the code paths around
ZigGlobalObject.cpp symbols cyclic / AbstractModuleRecord::Field::State / hasTLA
/ getModuleNamespace are covered.

RETURN_IF_EXCEPTION(scope, {});
return JSValue::encode(ns);
Comment on lines +762 to 778

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 The State == Init check only catches cycles where the target is still in the requested-modules loop of InnerModuleEvaluation; it misses the case where the target's body is currently executing and re-enters itself via require() (directly, or via a CJS dep that requires back into it). In that case status == Evaluating but State == Executing, so the check falls through and returns a namespace whose let/const/class exports are still TDZ — Node throws ERR_REQUIRE_CYCLE_MODULE here. The comment "For a non-TLA record that means the body ran to completion synchronously" is only true when observed from an SCC sibling, not from inside the module's own call frame.

Extended reasoning...

What's wrong

The new Evaluating branch in functionEsmLoadSync uses the generator State field to decide between two cases:

  • State == Init → body not started → throw ERR_REQUIRE_CYCLE_MODULE
  • State != Init (and !hasTLA) → body ran to completion → return namespace

The implicit assumption (stated in the comment at lines 764-767) is: "State != Init: ExecuteModule has been entered. For a non-TLA record that means the body ran to completion synchronously." That inference holds when you're observing the record from outside its own call frame — i.e. the SCC-sibling case in test 2, where c.mjs's body has returned and d.mjs then requires it. It does not hold when you're observing the record from inside its own body via re-entrant require(). There, ExecuteModule has been entered but has not returned, and any let/const/class exports declared after the require() call are still TDZ.

The PR's own comment at lines 754-755 says "ExecuteModule sets the generator State field from Init -> Executing before entering the body", which directly implies that while the body is on the call stack, State == Executing. And JSC's AbstractModuleRecord::State (an alias for JSGenerator::State) has no separate "completed" value that a non-TLA module transitions to after its body returns — so the field fundamentally cannot distinguish "body currently on the call stack" from "body finished, waiting for SCC root to pop". Both are non-Init.

Concrete trigger

// a.mjs
import { createRequire } from 'node:module';
const req = createRequire(import.meta.url);
const self = req('./a.mjs');
console.log(self.value);          // ReferenceError: Cannot access 'value' before initialization
export const value = 'ok';

A more realistic variant: a.mjs's body calls into a CJS helper which requires ./a.mjs back:

// a.mjs
import init from './b.cjs';
init();                           // runs during a.mjs body
export const value = 'ok';
// b.cjs
module.exports = () => require('./a.mjs').value;  // TDZ instead of ERR_REQUIRE_CYCLE_MODULE

Step-by-step proof

  1. a.mjs starts evaluating: InnerModuleEvaluation sets status = Evaluating, then ExecuteModule sets State = Executing and enters the body.
  2. The body calls req('./a.mjs'). a.mjs was loaded as ESM, so it is not in $requireMap; overridableRequire falls through to this.$require, which returns -1 for an .mjs file.
  3. $requireESM('./a.mjs') calls $esmNamespaceForCjs, which returns undefined because it requires status == Evaluated (isModuleEvaluated).
  4. $loadEsmIntoCjs$esmLoadSyncfunctionEsmLoadSync. The registry entry exists, isModuleEvaluated(record) is false, status == Evaluating matches the new branch.
  5. state == Init? No — it's Executing. hasTLA()? No. → getModuleNamespace is returned.
  6. Back in overridableRequire, namespace !== undefined, so it sets __esModule and returns namespace to the caller. Reading self.value throws Cannot access 'value' before initialization — exactly the symptom this PR is meant to eliminate, just via a different cycle topology.

Node.js throws ERR_REQUIRE_CYCLE_MODULE whenever the target's V8 status is kEvaluating, with no body-started/not-started distinction, so Node throws here.

Why existing code doesn't prevent it

The PR's three tests cover (1) target in the requested-modules loop (State == Init) and (2) target whose body has already returned (SCC sibling). Neither exercises a target whose ExecuteModule frame is still on the native stack. Test 2 passing actually proves that a non-TLA module's State is non-Init after its body returns — which is the same value it has during body execution, confirming the two cases are indistinguishable via this field.

Pre-PR, this case also returned a TDZ namespace (it fell through to loadModuleSync), so this is an incomplete fix rather than a new regression. But the PR now explicitly handles status == Evaluating and codifies the wrong behavior with a comment asserting the namespace is "fully populated" when it isn't.

Suggested fix

Since the State field can't disambiguate, you'd need a different signal — e.g. a per-record flag set after ExecuteModule returns, or walking the SCC stack to see if the record's ExecuteModule frame is above us. Failing that, throwing ERR_REQUIRE_CYCLE_MODULE for any status == Evaluating (matching Node exactly) is safer than returning a possibly-TDZ namespace, at the cost of breaking the SCC-sibling case in test 2 — though Node also throws in that case, so test 2 is arguably asserting non-Node behavior anyway.

}
Expand Down
16 changes: 10 additions & 6 deletions src/js/builtins/CommonJS.ts
Original file line number Diff line number Diff line change
Expand Up @@ -103,17 +103,19 @@

// -1 means we need to lookup the module from the ESM registry.
if (out === -1) {
let namespace;
try {
out = $requireESM(id);
namespace = $requireESM(id);
} catch (exception) {
// Since the ESM code is mostly JS, we need to handle exceptions here.
$requireMap.$delete(id);
throw exception;
}

// If we can pull out a ModuleNamespaceObject, let's do it.
const namespace = $esmNamespaceForCjs(id);
// $requireESM returns the namespace object directly (it throws on
// undefined). Use that — re-reading via $esmNamespaceForCjs would miss the
// Evaluating-but-body-ran case that $esmLoadSync handles.
if (namespace !== undefined) {

Check warning on line 118 in src/js/builtins/CommonJS.ts

View check run for this annotation

Claude / Claude Code Review

Dead namespace !== undefined guard contradicts new comment

Nit: the new comment says "$requireESM ... throws on undefined", which is correct — `requireESM()` ends with `if (exports === undefined) throw new TypeError(...)` and the catch block here re-throws — so `namespace` can never be undefined when `if (namespace !== undefined)` is reached (same in `requireESMFromHijackedExtension`). The guard is now always-true dead code that contradicts the comment immediately above it; consider dropping the check (and the now-unreachable fallthrough) or removing th
Comment on lines +115 to 118

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Nit: the new comment says "$requireESM ... throws on undefined", which is correct — requireESM() ends with if (exports === undefined) throw new TypeError(...) and the catch block here re-throws — so namespace can never be undefined when if (namespace !== undefined) is reached (same in requireESMFromHijackedExtension). The guard is now always-true dead code that contradicts the comment immediately above it; consider dropping the check (and the now-unreachable fallthrough) or removing the "(it throws on undefined)" parenthetical.

Extended reasoning...

What

The PR adds a comment at src/js/builtins/CommonJS.ts:115-117 (and the mirror in requireESMFromHijackedExtension at ~338-340):

$requireESM returns the namespace object directly (it throws on undefined).

…and then immediately follows it with the pre-existing guard if (namespace !== undefined) { ... }. The comment and the guard contradict each other: if $requireESM throws on undefined, the guard is dead.

Why the guard is provably dead

requireESM() in the same file (~316-323) is:

var exports = $esmNamespaceForCjs(resolved);
if (exports === undefined) {
  exports = $loadEsmIntoCjs(resolved);
}
if (exports === undefined) {
  throw new TypeError(`require() failed to evaluate module ...`);
}
return exports;

So $requireESM(id) either returns a non-undefined value or throws. Back in overridableRequire, the call is wrapped in a try/catch whose catch block does $requireMap.$delete(id); throw exception; — i.e. it re-throws. Therefore when control reaches line 118, namespace !== undefined is guaranteed and the if is always entered. The fallthrough to $evaluateCommonJSModule(mod, this) for the out === -1 branch is consequently unreachable (which is in fact the intended new behavior — the PR wants to return namespace in the SCC-sibling case rather than re-evaluate — but the dead guard obscures that intent).

Step-by-step

  1. out === -1, so we enter the ESM branch and call namespace = $requireESM(id).
  2. Case A — $requireESM returns: by its definition the return value is non-undefined. namespace !== undefined is true → enter the block, set __esModule, return mod.exports = namespace[...].
  3. Case B — $requireESM throws (including the "failed to evaluate module" TypeError for the would-be-undefined case): the catch block deletes from $requireMap and re-throws. Line 118 is never reached.
  4. There is no Case C where line 118 is reached with namespace === undefined.

The same reasoning applies verbatim to requireESMFromHijackedExtension, where the dead else-path is return; (leaving this.exports untouched), which is likewise unreachable.

Addressing the refutation

One verifier argued this is a harmless always-true defensive check that "future-proofs against changes to $requireESM" and isn't worth a comment. That's fair on the behavioral axis — there is zero functional impact, which is why this is filed as a nit, not a bug. But the issue isn't merely "redundant defensive code": the PR adds a comment that explicitly asserts the opposite of what the retained guard checks, on the line directly above it. A reader sees "it throws on undefined" and then if (namespace !== undefined) and has to go verify which one is wrong. That's a self-introduced inconsistency in this PR rather than a subjective style preference about pre-existing code, so it's worth a one-line nit.

Suggested fix

Either:

  • Drop the if (namespace !== undefined) wrapper in both functions and let the body run unconditionally (also deletes the unreachable fallthrough in overridableRequire), or
  • Keep the defensive guard and remove "(it throws on undefined)" from the comment so the two no longer contradict.

Either is a couple-of-line change with no behavior difference.

// In Bun, when __esModule is not defined, it's a CustomAccessor on the prototype.
// Various libraries expect __esModule to be set when using ESM from require().
// We don't want to always inject the __esModule export into every module,
Expand Down Expand Up @@ -324,16 +326,18 @@

export function requireESMFromHijackedExtension(this: JSCommonJSModule, id: string) {
$assert(this);
let namespace;
try {
$requireESM(id);
namespace = $requireESM(id);
} catch (exception) {
// Since the ESM code is mostly JS, we need to handle exceptions here.
$requireMap.$delete(id);
throw exception;
}

// If we can pull out a ModuleNamespaceObject, let's do it.
const namespace = $esmNamespaceForCjs(id);
// $requireESM returns the namespace object directly (it throws on
// undefined). Use that — re-reading via $esmNamespaceForCjs would miss the
// Evaluating-but-body-ran case that $esmLoadSync handles.
if (namespace !== undefined) {
// In Bun, when __esModule is not defined, it's a CustomAccessor on the prototype.
// Various libraries expect __esModule to be set when using ESM from require().
Expand Down
108 changes: 108 additions & 0 deletions test/js/bun/resolve/require-esm-cycle.test.ts
Original file line number Diff line number Diff line change
@@ -0,0 +1,108 @@
// require(esm) on a module that is already on the InnerModuleEvaluation stack
// (status == Evaluating) used to fall through to loadModuleSync and return the
// namespace object before the body ran, exposing TDZ let/const/class exports.
// Now it must throw ERR_REQUIRE_CYCLE_MODULE — unless ExecuteModule has already
// run for that record (an SCC sibling whose body completed but whose status
// only flips to Evaluated once the SCC root pops), in which case the namespace
// is fully populated and is still returned.
import { expect, test } from "bun:test";
import { bunEnv, bunExe, tempDir } from "harness";

test("require(esm) on a module mid-evaluation throws ERR_REQUIRE_CYCLE_MODULE", async () => {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 nit: per test/CLAUDE.md ("Prefer concurrent tests over sequential tests… make them concurrent with test.concurrent or describe.concurrent"), these three tests each spawn an independent Bun subprocess in their own tempDir with no shared state, so they could be test.concurrent to run in parallel.

Extended reasoning...

What

test/CLAUDE.md:22 states:

Prefer concurrent tests over sequential tests: When multiple tests in the same file spawn processes or write files, make them concurrent with test.concurrent or describe.concurrent unless it's very difficult to make them concurrent.

This file defines three tests (lines 11, 45, 81), each of which:

  1. creates its own isolated tempDir(...) with a unique prefix (require-esm-cycle-throw, require-esm-cycle-sibling, require-esm-cycle-message),
  2. spawns an independent Bun.spawn({ cmd: [bunExe(), ...] }) subprocess pointed at that dir,
  3. awaits stdout/stderr/exit and asserts on the result.

There is no shared state between them — no shared temp dir, no shared port/server, no module-level mutable variables — so they are trivially parallelizable.

Why it matters

Each test pays full bun process-startup cost. Running them sequentially serializes three process spawns; declaring them concurrent lets the runner overlap that I/O wait, which is exactly the case the CLAUDE.md guideline targets. Other files in test/js/bun/resolve/ (e.g. resolve-error.test.ts, bun-main-entry-point.test.ts) already use test.concurrent for the same shape of test.

That said, the immediate sibling files this PR cites in its test plan (require-esm-transitive-tla.test.ts, require-esm-microtask-order.test.ts, require-esm-gc-roots.test.ts) also use plain test(), so this PR is consistent with its closest neighbors. CLAUDE.md is the authoritative guideline though, so flagging it here.

Step-by-step

  1. Test 1 ("throws ERR_REQUIRE_CYCLE_MODULE") creates tempDir("require-esm-cycle-throw", {...}) and spawns bun entry.mjs in it.
  2. Test 2 ("SCC sibling… returns the populated namespace") creates a separate tempDir("require-esm-cycle-sibling", {...}) and spawns bun entry.mjs in that dir.
  3. Test 3 ("error names the requested module") creates a third tempDir("require-esm-cycle-message", {...}) and spawns bun a.mjs.
  4. None of the three reads or writes anything the others can observe; the using dir declarations dispose each temp dir at end-of-scope independently. Therefore running them concurrently cannot change the outcome.

Fix

Either change each test(...) to test.concurrent(...), or wrap all three in a describe.concurrent("require(esm) cycles", () => { ... }) block. No assertion changes needed.

Severity

Pure style/convention — the tests are functionally correct as written. Filing as a nit.

using dir = tempDir("require-esm-cycle-throw", {
"a.mjs": `
import "./b.mjs";
export const value = "ok";
`,
"b.mjs": `
import { createRequire } from "node:module";
const require = createRequire(import.meta.url);
let code = "no-throw";
try {
const a = require("./a.mjs");
// pre-fix: this access threw "Cannot access 'value' before initialization"
void a.value;
} catch (e) {
code = e.code ?? e.constructor.name;
}
console.log(code);
`,
"entry.mjs": `import "./a.mjs";`,
});

await using proc = Bun.spawn({
cmd: [bunExe(), "entry.mjs"],
env: bunEnv,
cwd: String(dir),
stderr: "pipe",
});
const [stdout, stderr, exitCode] = await Promise.all([proc.stdout.text(), proc.stderr.text(), proc.exited]);
expect(stderr).toBe("");
expect(stdout.trim()).toBe("ERR_REQUIRE_CYCLE_MODULE");
expect(exitCode).toBe(0);
});

test("require(esm) on an SCC sibling whose body already ran returns the populated namespace", async () => {
// entry imports c then d. c imports entry back, so c and entry share an SCC;
// after c.execute() returns c stays at status Evaluating until entry (the
// SCC root) finishes. d's body then require()s c — c is at Evaluating with
// generator State != Init (body ran), so the const export is readable
// without re-driving the loader.
using dir = tempDir("require-esm-cycle-sibling", {
"entry.mjs": `
import "./c.mjs";
import "./d.mjs";
export const FROM_ENTRY = 1;
`,
"c.mjs": `
import "./entry.mjs";
export const FROM_C = "populated";
`,
"d.mjs": `
import { createRequire } from "node:module";
const require = createRequire(import.meta.url);
const c = require("./c.mjs");
console.log(c.FROM_C);
`,
});

await using proc = Bun.spawn({
cmd: [bunExe(), "entry.mjs"],
env: bunEnv,
cwd: String(dir),
stderr: "pipe",
});
const [stdout, stderr, exitCode] = await Promise.all([proc.stdout.text(), proc.stderr.text(), proc.exited]);
expect(stderr).toBe("");
expect(stdout.trim()).toBe("populated");
expect(exitCode).toBe(0);
});

test("require(esm) cycle error names the requested module", async () => {
using dir = tempDir("require-esm-cycle-message", {
"a.mjs": `
import "./b.mjs";
export const value = 1;
`,
"b.mjs": `
import { createRequire } from "node:module";
const require = createRequire(import.meta.url);
try {
require("./a.mjs");
} catch (e) {
console.log(e.code, e.message.includes("a.mjs"));
}
`,
});

await using proc = Bun.spawn({
cmd: [bunExe(), "a.mjs"],
env: bunEnv,
cwd: String(dir),
stderr: "pipe",
});
const [stdout, stderr, exitCode] = await Promise.all([proc.stdout.text(), proc.stderr.text(), proc.exited]);
expect(stderr).toBe("");
expect(stdout.trim()).toBe("ERR_REQUIRE_CYCLE_MODULE true");
expect(exitCode).toBe(0);
});
Loading