require(esm): throw ERR_REQUIRE_CYCLE_MODULE instead of returning TDZ namespace#29884
require(esm): throw ERR_REQUIRE_CYCLE_MODULE instead of returning TDZ namespace#29884dylan-conway wants to merge 3 commits into
Conversation
…duleEvaluation functionEsmLoadSync treated any record at Status::Evaluating as "body already ran, namespace populated" and returned it. That is only true after loadModuleSync drives the record through ExecuteModule. If the record was *already* Evaluating before we touched it, it is on an outer InnerModuleEvaluation stack and its body may not have started — let/const/class exports are TDZ, so the returned namespace blew up far from the cycle as "Cannot access 'x' before initialization". Distinguish via the generator State internal field: Init means ExecuteModule has not been entered (throw ERR_REQUIRE_CYCLE_MODULE, matching Node); otherwise the body has run for a non-TLA record (status only flips to Evaluated once the SCC root pops the Tarjan stack), so return the namespace directly without re-driving the loader. The early return also avoids violating CyclicModuleRecord::evaluate()'s precondition assertion. CommonJS.ts: use $requireESM's return value directly instead of re-reading via $esmNamespaceForCjs, whose stricter status==Evaluated check dropped the namespace in the SCC-sibling case.
|
Updated 4:41 PM PT - Apr 30th, 2026
❌ @Jarred-Sumner, your commit e026e8a has some failures in 🧪 To try this PR locally: bunx bun-pr 29884That installs a local version of the PR into your bun-29884 --bun |
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: ASSERTIVE Plan: Pro Run ID: 📒 Files selected for processing (2)
WalkthroughAdds ERR_REQUIRE_CYCLE_MODULE to error mappings, changes sync ESM load path to detect and handle cyclic re-entry (throwing cycle or TLA errors or returning namespace as appropriate), captures ESM namespace returns in CommonJS require flows, and adds tests validating require(esm) cycle behavior. Changes
🚥 Pre-merge checks | ✅ 4✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Review rate limit: 4/5 reviews remaining, refill in 12 minutes. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
test/js/bun/resolve/require-esm-cycle.test.ts (1)
1-109: 🛠️ Refactor suggestion | 🟠 MajorMove these cases into the existing
require(esm)test suite.The new coverage is useful, but the repo’s test guidelines ask new cases for an existing codepath to live in the existing test file instead of a new standalone suite. The PR summary already notes there is existing
require(esm)coverage, so these three regressions should be folded into that file rather than addingrequire-esm-cycle.test.ts.As per coding guidelines, "Add new tests to existing test files for the code being changed, not in new separate files."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/js/bun/resolve/require-esm-cycle.test.ts` around lines 1 - 109, These three new tests (the ones named "require(esm) on a module mid-evaluation throws ERR_REQUIRE_CYCLE_MODULE", "require(esm) on an SCC sibling whose body already ran returns the populated namespace", and "require(esm) cycle error names the requested module") should be moved into the existing require(esm) test suite file rather than in a new standalone file: copy their test functions and imports (expect, test, bunEnv, bunExe, tempDir, Bun.spawn usage) into the existing require(esm) test file, resolve any duplicate imports/helpers, run the suite to ensure no name collisions, and then remove this separate require-esm-cycle.test.ts file so the cases live with the related tests.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/bun.js/bindings/ZigGlobalObject.cpp`:
- Around line 753-775: The PR is missing a test that reproduces a module
self-require from inside its own body to ensure we don't return a
partially-populated namespace when the generator State stays "Executing"; add a
unit/integration test that loads an ES module A which calls
createRequire(import.meta.url) and does require("./a.mjs") from within its
module body, asserting that the operation either throws ERR_REQUIRE_CYCLE_MODULE
or blocks/defers access until the body completes (i.e., never returns a
namespace with TDZ exports); place the test alongside other module/require tests
and exercise both a plain module and a TLA case so the code paths around
ZigGlobalObject.cpp symbols cyclic / AbstractModuleRecord::Field::State / hasTLA
/ getModuleNamespace are covered.
---
Outside diff comments:
In `@test/js/bun/resolve/require-esm-cycle.test.ts`:
- Around line 1-109: These three new tests (the ones named "require(esm) on a
module mid-evaluation throws ERR_REQUIRE_CYCLE_MODULE", "require(esm) on an SCC
sibling whose body already ran returns the populated namespace", and
"require(esm) cycle error names the requested module") should be moved into the
existing require(esm) test suite file rather than in a new standalone file: copy
their test functions and imports (expect, test, bunEnv, bunExe, tempDir,
Bun.spawn usage) into the existing require(esm) test file, resolve any duplicate
imports/helpers, run the suite to ensure no name collisions, and then remove
this separate require-esm-cycle.test.ts file so the cases live with the related
tests.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 35cf5e21-03d9-4b2f-a987-4b4e36427480
📒 Files selected for processing (4)
src/bun.js/bindings/ErrorCode.tssrc/bun.js/bindings/ZigGlobalObject.cppsrc/js/builtins/CommonJS.tstest/js/bun/resolve/require-esm-cycle.test.ts
| if (auto* cyclic = dynamicDowncast<JSC::CyclicModuleRecord>(record); cyclic && cyclic->status() == JSC::CyclicModuleRecord::Status::Evaluating) { | ||
| // ExecuteModule sets the generator State field from Init -> Executing | ||
| // before entering the body. State == Init therefore means the body | ||
| // has not started: the module is still in the requested-modules | ||
| // loop of InnerModuleEvaluation, and let/const/class exports are | ||
| // TDZ. Returning the namespace here is the regression that surfaces | ||
| // as a confusing "x is not a function" far from the cycle. Match | ||
| // Node's ERR_REQUIRE_CYCLE_MODULE instead. | ||
| JSValue state = record->internalField(JSC::AbstractModuleRecord::Field::State).get(); | ||
| if (state.isInt32() && state.asInt32() == static_cast<int32_t>(JSC::AbstractModuleRecord::State::Init)) | ||
| return Bun::throwError(globalObject, scope, Bun::ErrorCode::ERR_REQUIRE_CYCLE_MODULE, makeString("Cannot require() ES Module "_s, keyString, " in a cycle."_s)); | ||
| // State != Init: ExecuteModule has been entered. For a non-TLA | ||
| // record that means the body ran to completion synchronously | ||
| // (status only flips to Evaluated once the SCC root pops the | ||
| // Tarjan stack). The namespace is fully populated; return it | ||
| // without re-driving the loader. For a TLA record reachable here | ||
| // the body is suspended at its first await (status flips to | ||
| // EvaluatingAsync only when the SCC root pops), so post-await | ||
| // bindings are TDZ — treat that the same as the EvaluatingAsync | ||
| // path below and surface the async-module error. | ||
| if (cyclic->hasTLA()) | ||
| return throwVMTypeError(globalObject, scope, makeString("require() async module \""_s, keyString, "\" is unsupported. use \"await import()\" instead."_s)); | ||
| auto* ns = record->getModuleNamespace(globalObject, false); |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Inspect local JSC/WebKit state definitions and nearby commentary.
rg -n -C3 'AbstractModuleRecord::State|Field::State|ExecuteModule|CyclicModuleRecord::Status::Evaluating' \
--iglob '*.h' --iglob '*.cpp'
# Check whether there is already coverage for re-entry from inside an executing ESM body.
rg -n -C2 'createRequire\(import\.meta\.url\)|ERR_REQUIRE_CYCLE_MODULE|require\("\./.*\.mjs"\)' test srcRepository: oven-sh/bun
Length of output: 21266
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Find JSC AbstractModuleRecord::State enum definition
fd -e h -e cpp | xargs rg -l 'AbstractModuleRecord.*State' | head -5
# Extract the State enum definition
rg -A 20 'enum.*State.*{' src/bun.js/bindings/ZigGlobalObject.cpp || true
rg -n 'AbstractModuleRecord::State' --iglob '*.h' -A 1 -B 1 | head -40Repository: oven-sh/bun
Length of output: 205
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Check the full context around the State check - look at ExecuteModule behavior
rg -n 'ExecuteModule' --iglob '*.h' --iglob '*.cpp' | grep -i state | head -20
# Look for any guard or check after State != Init that might validate completion
rg -n -A 10 'state.asInt32.*Init' src/bun.js/bindings/ZigGlobalObject.cppRepository: oven-sh/bun
Length of output: 1302
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Check if the test file covers self-require from within body
cat test/js/bun/resolve/require-esm-cycle.test.ts | grep -A 30 'self.*require\|createRequire' | head -50
# Search for any test that has a module importing itself
rg -l 'require.*\[".*"\].*require.*\[".*"\]' testRepository: oven-sh/bun
Length of output: 1847
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Look for the State enum in JavaScriptCore headers or JSC binding files
find . -name "*.h" -o -name "*.cpp" | xargs rg -n 'enum class State|enum State' | grep -i module | head -20
# More specific: look for AbstractModuleRecord definition
rg -n 'class AbstractModuleRecord' --iglob '*.h' -A 30 | head -60Repository: oven-sh/bun
Length of output: 37
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Check if there are any other State values besides Init
rg -n 'AbstractModuleRecord::State::' src/bun.js/bindings/ --iglob '*.cpp' --iglob '*.h' | sort -u
# Look at the test file in detail for all test cases
cat test/js/bun/resolve/require-esm-cycle.test.ts | wc -l
# Show the entire test file
cat test/js/bun/resolve/require-esm-cycle.test.tsRepository: oven-sh/bun
Length of output: 3947
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Check if there's a self-require test case (module A requires itself from within its own body)
rg -n 'import.*\./.*mjs.*require.*\./.*mjs' test/js/bun/resolve/ -A 5 -B 5 | head -80
# Also search for any TODO or FIXME comments in ZigGlobalObject.cpp related to this
rg -n 'TODO|FIXME|XXX' src/bun.js/bindings/ZigGlobalObject.cpp | grep -i 'state\|cycle\|require'Repository: oven-sh/bun
Length of output: 37
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Look for what happens during module execution - is there a guard preventing self-require?
# Search for how InnerModuleEvaluation or ExecuteModule handles re-entry
rg -n 'InnerModuleEvaluation|ExecuteModule' src/bun.js/bindings/ --iglob '*.cpp' --iglob '*.h' | head -30
# Check if there's any comment or code explaining why State != Init is sufficient
rg -n -B5 -A5 'State.*Executing' src/bun.js/bindings/ZigGlobalObject.cppRepository: oven-sh/bun
Length of output: 1519
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Create a test case for self-require within body to see what actually happens
cat > /tmp/test_self_require.mjs << 'EOF'
// Test: module requiring itself from within its own body
import { createRequire } from "node:module";
const require = createRequire(import.meta.url);
console.log("Starting body");
let x;
try {
// Try to require ourselves from within the body
const self = require("./test_self_require.mjs");
console.log("Self-require succeeded, x =", typeof self.x);
x = "defined-after-require";
} catch (e) {
console.log("Self-require threw:", e.code || e.message);
x = "caught-error";
}
export const x = x;
console.log("Body complete, x =", x);
EOF
# Check what paths would hit this code
rg -n 'registryEntry.*key' src/bun.js/bindings/ZigGlobalObject.cpp -B 3 -A 3 | head -40Repository: oven-sh/bun
Length of output: 1719
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Check if the concern about State is explicitly addressed in comments or follow-up fixes
rg -n 'State.*Executing\|Executing.*State\|state.*complete\|body.*complete' src/bun.js/bindings/ZigGlobalObject.cpp
# Look for any recent commits or changes related to this
git log --oneline -20 -- src/bun.js/bindings/ZigGlobalObject.cpp 2>/dev/null || echo "Git log not available"Repository: oven-sh/bun
Length of output: 37
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Look for all possible AbstractModuleRecord State values
rg -n 'State::' src/bun.js/bindings/ --iglob '*.cpp' --iglob '*.h' | grep -i 'abstract\|module' | head -20
# Search in WebKit/JSC if available - check for the enum definition
find . -path '*/JavaScriptCore/*' -name '*.h' 2>/dev/null | xargs rg 'enum.*State' 2>/dev/null | head -10 || true
# Look for State field/enum in bindings
rg -n 'Field::State' src/bun.js/bindings/ --iglob '*.cpp' --iglob '*.h' -B 2 -A 2Repository: oven-sh/bun
Length of output: 2366
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Check if there's any mechanism preventing re-entry during body execution
# Look for guards or checks before functionEsmLoadSync is called
rg -n 'functionEsmLoadSync' src/bun.js/bindings/ZigGlobalObject.cpp -B 10 | head -50
# Check what calls functionEsmLoadSync and if there are any conditions
rg -n 'functionEsmLoadSync' src/bun.js/bindings/ --iglob '*.cpp' -B 5 -A 5 | head -80Repository: oven-sh/bun
Length of output: 7061
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Examine the full control flow of functionEsmLoadSync to understand all paths
sed -n '680,830p' src/bun.js/bindings/ZigGlobalObject.cppRepository: oven-sh/bun
Length of output: 8355
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Check if there's any mechanism that would prevent recursive re-entry during body execution
# Look for any stack depth checks or re-entry guards
rg -n 'recursion\|reentry\|depth\|stack' src/bun.js/bindings/ZigGlobalObject.cpp | head -20
# Check what happens when a module self-requires - does the entry creation prevent it?
rg -n 'registryEntry.*key\|moduleMap' src/bun.js/bindings/ZigGlobalObject.cpp -B 2 -A 2 | head -40Repository: oven-sh/bun
Length of output: 37
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Look for what State values exist - check JSC Generator or AbstractModuleRecord
# Search for where State is actually modified/set during execution
rg -n 'internalField.*State.*set\|State.*=' src/bun.js/bindings/ --iglob '*.cpp' --iglob '*.h' | grep -v '//' | head -20Repository: oven-sh/bun
Length of output: 37
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Check if the concern about self-require is actually covered in tests
# Search for any test that requires the same module from within its own body
find test -name '*.mjs' -o -name '*.ts' | xargs grep -l 'createRequire.*import.meta.url' | while read f; do
if grep -q 'require.*self\|require.*\./' "$f" 2>/dev/null; then
echo "=== $f ==="
cat "$f" | head -50
fi
done | head -100Repository: oven-sh/bun
Length of output: 3512
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Look at the test coverage more carefully - check all test files for module self-require
rg 'require.*\(.*\..*mjs' test/js/bun/resolve/require-esm-cycle.test.ts -A 5 -B 5 | head -80Repository: oven-sh/bun
Length of output: 985
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Understand the ExecuteModule flow: does State change during body execution?
# Look for where State field is read/written/modified
rg -n 'internalField.*State\|State.*value' src/bun.js/bindings/ --iglob '*.cpp' -B 3 -A 3 | head -60
# Check if there's documentation about State lifecycle
rg -n 'ExecuteModule\|State.*Init\|State.*Executing' src/bun.js/bindings/ZigGlobalObject.cpp -B 2 -A 2Repository: oven-sh/bun
Length of output: 37
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# The key question: can a module require() itself from within its body?
# Create a minimal test to understand the behavior
cat > /tmp/test_self_require.mjs << 'EOF'
// Test module that tries to require itself
import { createRequire } from "node:module";
const require = createRequire(import.meta.url);
console.log("Before require");
try {
const self = require("./test_self_require.mjs");
console.log("Self-require returned, typeof self.exported:", typeof self.exported);
} catch (e) {
console.log("Self-require threw:", e.code || e.name, e.message);
}
console.log("After require");
export const exported = "value from export";
EOF
# Also check: is there a test in the suite that would catch self-require issues?
rg -l 'self.*require\|require.*self' test/js/bun/resolve/
# Search for "Cannot access" error which would indicate TDZ exposure
rg -n 'Cannot access.*before initialization' test/ | head -5Repository: oven-sh/bun
Length of output: 37
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Look for the State enum values more directly
rg -n 'AbstractModuleRecord' --iglob '*.h' -A 40 | grep -E 'enum|State' | head -30
# Check if there's any documentation on when State changes
rg -n 'State.*Executing\|Executing.*State\|Init.*Executing' src/bun.js/bindings/ --iglob '*.cpp' --iglob '*.h' -B 1 -A 1 | head -40Repository: oven-sh/bun
Length of output: 197
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Examine what happens when loadModuleSync is called from within a module body
# Check the Tarjan stack logic mentioned in comments
rg -n 'Tarjan\|SCC.*root\|root.*pops' src/bun.js/bindings/ZigGlobalObject.cpp -B 2 -A 2
# Check if there's any guard that prevents re-entry from the same module
rg -n 'registryEntry\|moduleMap' src/bun.js/bindings/ZigGlobalObject.cpp -B 1 -A 1 | head -50Repository: oven-sh/bun
Length of output: 37
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Check the comment about State field more carefully - understand ExecuteModule
# Look for when State is actually set and modified
rg -n 'State.*=.*Executing\|State.*=.*Init' src/bun.js/bindings/ --iglob '*.cpp' --iglob '*.h' | head -20
# Check if the State field is maintained throughout the body execution or if it changes
rg -n 'generator.*State\|JSGenerator.*State' src/bun.js/bindings/ --iglob '*.cpp' --iglob '*.h' | head -20
# Look at what hasTLA() does and if there are similar checks for non-TLA
rg -n 'hasTLA\|evaluationError' src/bun.js/bindings/ZigGlobalObject.cpp -B 2 -A 2 | head -40Repository: oven-sh/bun
Length of output: 37
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Check if there's a control flow that would prevent self-require from being an issue
# Look at how InnerModuleEvaluation works and if it has any re-entry guards
rg -n 'InnerModuleEvaluation\|statusTryToRun' src/bun.js/bindings/ --iglob '*.cpp' --iglob '*.h' -B 2 -A 2 | head -60
# Check the actual test to see what the behavior should be
# Can a module call require() on itself from within its body?
cat test/js/bun/resolve/require-esm-cycle.test.ts | grep -A 50 "test.*SCC sibling"Repository: oven-sh/bun
Length of output: 37
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# The key question: what are all the possible State values and when do they change?
# Let's look at JSC generator state
rg -n 'class.*Generator\|enum.*State' src/bun.js/bindings/ --iglob '*.h' -A 10 | head -80
# Look for documentation or implementation of how State changes during execution
rg -n 'generator.*State.*Init\|generator.*State.*Executing' src/bun.js/bindings/ --iglob '*.cpp' -A 3 -B 3 | head -50Repository: oven-sh/bun
Length of output: 37
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Check whether there's any mechanism preventing a module from re-entering require() on itself
# Look at loadModuleSync and its guards
rg -n 'loadModuleSync\|CyclicModuleRecord::evaluate' src/bun.js/bindings/ZigGlobalObject.cpp -B 3 -A 3 | head -60
# Look for any evaluation stack or guard mechanism
rg -n 'evaluation.*stack\|InnerModuleEvaluation' src/bun.js/bindings/ZigGlobalObject.cpp -B 2 -A 2 | head -40Repository: oven-sh/bun
Length of output: 37
🏁 Script executed:
#!/bin/bash
set -euoiphisde pipefail
# Most direct approach: what does the comment say about the assumption?
# Lines 764-770 claim State != Init means body is complete, let's verify
sed -n '750,800p' src/bun.js/bindings/ZigGlobalObject.cppRepository: oven-sh/bun
Length of output: 173
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Most direct approach: what does the comment say about the assumption?
# Lines 764-770 claim State != Init means body is complete, let's verify
sed -n '750,800p' src/bun.js/bindings/ZigGlobalObject.cppRepository: oven-sh/bun
Length of output: 3329
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Check the full test to understand what's covered
cat test/js/bun/resolve/require-esm-cycle.test.tsRepository: oven-sh/bun
Length of output: 3783
Test coverage is incomplete—no test for module self-require from within its body.
Lines 753-775 assume State != Init proves the module body finished executing. However, the generator State field may remain in its "Executing" state throughout the entire body, and there is no test for a module calling require() on itself from within its own body (via createRequire(import.meta.url)). If such a self-require occurs while the body is still running, the namespace would be returned with later exports still in TDZ, reintroducing the partial-namespace bug.
Add a test case that:
- Has module A create a require function and call
require("./a.mjs")from within its own body - Verify it either throws
ERR_REQUIRE_CYCLE_MODULEor correctly defers the access until the body completes
Or document/verify that the JSC generator State field has a stronger completion signal than "not Init" that is checked implicitly.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/bun.js/bindings/ZigGlobalObject.cpp` around lines 753 - 775, The PR is
missing a test that reproduces a module self-require from inside its own body to
ensure we don't return a partially-populated namespace when the generator State
stays "Executing"; add a unit/integration test that loads an ES module A which
calls createRequire(import.meta.url) and does require("./a.mjs") from within its
module body, asserting that the operation either throws ERR_REQUIRE_CYCLE_MODULE
or blocks/defers access until the body completes (i.e., never returns a
namespace with TDZ exports); place the test alongside other module/require tests
and exercise both a plain module and a TLA case so the code paths around
ZigGlobalObject.cpp symbols cyclic / AbstractModuleRecord::Field::State / hasTLA
/ getModuleNamespace are covered.
| JSValue state = record->internalField(JSC::AbstractModuleRecord::Field::State).get(); | ||
| if (state.isInt32() && state.asInt32() == static_cast<int32_t>(JSC::AbstractModuleRecord::State::Init)) | ||
| return Bun::throwError(globalObject, scope, Bun::ErrorCode::ERR_REQUIRE_CYCLE_MODULE, makeString("Cannot require() ES Module "_s, keyString, " in a cycle."_s)); | ||
| // State != Init: ExecuteModule has been entered. For a non-TLA | ||
| // record that means the body ran to completion synchronously | ||
| // (status only flips to Evaluated once the SCC root pops the | ||
| // Tarjan stack). The namespace is fully populated; return it | ||
| // without re-driving the loader. For a TLA record reachable here | ||
| // the body is suspended at its first await (status flips to | ||
| // EvaluatingAsync only when the SCC root pops), so post-await | ||
| // bindings are TDZ — treat that the same as the EvaluatingAsync | ||
| // path below and surface the async-module error. | ||
| if (cyclic->hasTLA()) | ||
| return throwVMTypeError(globalObject, scope, makeString("require() async module \""_s, keyString, "\" is unsupported. use \"await import()\" instead."_s)); | ||
| auto* ns = record->getModuleNamespace(globalObject, false); | ||
| RETURN_IF_EXCEPTION(scope, {}); | ||
| return JSValue::encode(ns); |
There was a problem hiding this comment.
🔴 The State == Init check only catches cycles where the target is still in the requested-modules loop of InnerModuleEvaluation; it misses the case where the target's body is currently executing and re-enters itself via require() (directly, or via a CJS dep that requires back into it). In that case status == Evaluating but State == Executing, so the check falls through and returns a namespace whose let/const/class exports are still TDZ — Node throws ERR_REQUIRE_CYCLE_MODULE here. The comment "For a non-TLA record that means the body ran to completion synchronously" is only true when observed from an SCC sibling, not from inside the module's own call frame.
Extended reasoning...
What's wrong
The new Evaluating branch in functionEsmLoadSync uses the generator State field to decide between two cases:
State == Init→ body not started → throwERR_REQUIRE_CYCLE_MODULEState != Init(and!hasTLA) → body ran to completion → return namespace
The implicit assumption (stated in the comment at lines 764-767) is: "State != Init: ExecuteModule has been entered. For a non-TLA record that means the body ran to completion synchronously." That inference holds when you're observing the record from outside its own call frame — i.e. the SCC-sibling case in test 2, where c.mjs's body has returned and d.mjs then requires it. It does not hold when you're observing the record from inside its own body via re-entrant require(). There, ExecuteModule has been entered but has not returned, and any let/const/class exports declared after the require() call are still TDZ.
The PR's own comment at lines 754-755 says "ExecuteModule sets the generator State field from Init -> Executing before entering the body", which directly implies that while the body is on the call stack, State == Executing. And JSC's AbstractModuleRecord::State (an alias for JSGenerator::State) has no separate "completed" value that a non-TLA module transitions to after its body returns — so the field fundamentally cannot distinguish "body currently on the call stack" from "body finished, waiting for SCC root to pop". Both are non-Init.
Concrete trigger
// a.mjs
import { createRequire } from 'node:module';
const req = createRequire(import.meta.url);
const self = req('./a.mjs');
console.log(self.value); // ReferenceError: Cannot access 'value' before initialization
export const value = 'ok';A more realistic variant: a.mjs's body calls into a CJS helper which requires ./a.mjs back:
// a.mjs
import init from './b.cjs';
init(); // runs during a.mjs body
export const value = 'ok';
// b.cjs
module.exports = () => require('./a.mjs').value; // TDZ instead of ERR_REQUIRE_CYCLE_MODULEStep-by-step proof
a.mjsstarts evaluating:InnerModuleEvaluationsetsstatus = Evaluating, thenExecuteModulesetsState = Executingand enters the body.- The body calls
req('./a.mjs').a.mjswas loaded as ESM, so it is not in$requireMap;overridableRequirefalls through tothis.$require, which returns-1for an.mjsfile. $requireESM('./a.mjs')calls$esmNamespaceForCjs, which returnsundefinedbecause it requiresstatus == Evaluated(isModuleEvaluated).$loadEsmIntoCjs→$esmLoadSync→functionEsmLoadSync. The registry entry exists,isModuleEvaluated(record)is false,status == Evaluatingmatches the new branch.state == Init? No — it'sExecuting.hasTLA()? No. →getModuleNamespaceis returned.- Back in
overridableRequire,namespace !== undefined, so it sets__esModuleand returnsnamespaceto the caller. Readingself.valuethrowsCannot access 'value' before initialization— exactly the symptom this PR is meant to eliminate, just via a different cycle topology.
Node.js throws ERR_REQUIRE_CYCLE_MODULE whenever the target's V8 status is kEvaluating, with no body-started/not-started distinction, so Node throws here.
Why existing code doesn't prevent it
The PR's three tests cover (1) target in the requested-modules loop (State == Init) and (2) target whose body has already returned (SCC sibling). Neither exercises a target whose ExecuteModule frame is still on the native stack. Test 2 passing actually proves that a non-TLA module's State is non-Init after its body returns — which is the same value it has during body execution, confirming the two cases are indistinguishable via this field.
Pre-PR, this case also returned a TDZ namespace (it fell through to loadModuleSync), so this is an incomplete fix rather than a new regression. But the PR now explicitly handles status == Evaluating and codifies the wrong behavior with a comment asserting the namespace is "fully populated" when it isn't.
Suggested fix
Since the State field can't disambiguate, you'd need a different signal — e.g. a per-record flag set after ExecuteModule returns, or walking the SCC stack to see if the record's ExecuteModule frame is above us. Failing that, throwing ERR_REQUIRE_CYCLE_MODULE for any status == Evaluating (matching Node exactly) is safer than returning a possibly-TDZ namespace, at the cost of breaking the SCC-sibling case in test 2 — though Node also throws in that case, so test 2 is arguably asserting non-Node behavior anyway.
| if (cyclic->hasTLA()) | ||
| return throwVMTypeError(globalObject, scope, makeString("require() async module \""_s, keyString, "\" is unsupported. use \"await import()\" instead."_s)); |
There was a problem hiding this comment.
🟡 nit (pre-existing pattern): this throws a plain TypeError with no .code, whereas Node sets error.code = 'ERR_REQUIRE_ASYNC_MODULE' for this case — and ErrorCode.ts already defines ERR_REQUIRE_ASYNC_MODULE (currently unused). Since this line intentionally mirrors the existing fallback at ~line 819 (and ModuleLoader.cpp:684/739), consider switching all of them to Bun::throwError(globalObject, scope, Bun::ErrorCode::ERR_REQUIRE_ASYNC_MODULE, ...) in a follow-up rather than just this one.
Extended reasoning...
What the issue is
The new hasTLA() branch at ZigGlobalObject.cpp:773-774 uses throwVMTypeError(globalObject, scope, makeString(...)), which constructs a plain JSC TypeError with only a message and no .code property. Node.js, by contrast, throws an error with error.code === 'ERR_REQUIRE_ASYNC_MODULE' (and Error rather than TypeError) when require() targets a module that contains top-level await. Bun's own ErrorCode.ts:217 already defines ERR_REQUIRE_ASYNC_MODULE, but a grep confirms it is not referenced anywhere in C++ — the code is generated but never used.
Why this is a copy of a pre-existing gap, not a new one
The comment at lines 771-772 says "treat that the same as the EvaluatingAsync path below and surface the async-module error" — and the EvaluatingAsync path at ~line 819 uses the identical throwVMTypeError(...) pattern with the same message string. ModuleLoader.cpp:684 and ModuleLoader.cpp:739 do the same. So the author deliberately matched existing behavior for consistency; the Node-compat gap predates this PR. Notably, the same PR does use the structured form one branch up at line 763 (Bun::throwError(..., Bun::ErrorCode::ERR_REQUIRE_CYCLE_MODULE, ...)), so the mechanism is clearly available and known here.
How it manifests / step-by-step
Take an ESM cycle where a.mjs (which contains top-level await) imports b.mjs, and b.mjs does createRequire(import.meta.url)('./a.mjs') while a.mjs is on the InnerModuleEvaluation stack:
a.mjsentersInnerModuleEvaluation→status = Evaluating.ExecuteModulesets generatorState = Executingand runs the body up to the firstawait.- Before the SCC root pops, control reaches
b.mjs, which callsrequire('./a.mjs')→functionEsmLoadSync. - The registry entry exists,
isModuleEvaluatedis false,status == Evaluating,State != Init, andcyclic->hasTLA()is true → line 774 executes. - The thrown value is a
TypeErrorwhose only own property ismessage. User code that doescatch (e) { if (e.code === 'ERR_REQUIRE_ASYNC_MODULE') ... }(the documented Node way to detect this and fall back to dynamicimport()) seese.code === undefinedand takes the wrong branch.
Impact
Minor Node-compat divergence in a narrow edge case (TLA module at status == Evaluating being require()d in a cycle). The much more common path at line ~819 already had the same divergence on main, so no behavior is worse than before — the gap is just propagated to one more code path. Anything that pattern-matches on the error message string will still work.
Suggested fix
Replace the throwVMTypeError here (and at the three pre-existing sites: ZigGlobalObject.cpp:~819, ModuleLoader.cpp:684, ModuleLoader.cpp:739) with:
return Bun::throwError(globalObject, scope, Bun::ErrorCode::ERR_REQUIRE_ASYNC_MODULE,
makeString("require() async module \""_s, keyString, "\" is unsupported. use \"await import()\" instead."_s));Fixing only line 774 would make it inconsistent with the path it explicitly mirrors, so this is best done as a small follow-up across all four sites; it shouldn't block this PR.
| import { expect, test } from "bun:test"; | ||
| import { bunEnv, bunExe, tempDir } from "harness"; | ||
|
|
||
| test("require(esm) on a module mid-evaluation throws ERR_REQUIRE_CYCLE_MODULE", async () => { |
There was a problem hiding this comment.
🟡 nit: per test/CLAUDE.md ("Prefer concurrent tests over sequential tests… make them concurrent with test.concurrent or describe.concurrent"), these three tests each spawn an independent Bun subprocess in their own tempDir with no shared state, so they could be test.concurrent to run in parallel.
Extended reasoning...
What
test/CLAUDE.md:22 states:
Prefer concurrent tests over sequential tests: When multiple tests in the same file spawn processes or write files, make them concurrent with
test.concurrentordescribe.concurrentunless it's very difficult to make them concurrent.
This file defines three tests (lines 11, 45, 81), each of which:
- creates its own isolated
tempDir(...)with a unique prefix (require-esm-cycle-throw,require-esm-cycle-sibling,require-esm-cycle-message), - spawns an independent
Bun.spawn({ cmd: [bunExe(), ...] })subprocess pointed at that dir, - awaits stdout/stderr/exit and asserts on the result.
There is no shared state between them — no shared temp dir, no shared port/server, no module-level mutable variables — so they are trivially parallelizable.
Why it matters
Each test pays full bun process-startup cost. Running them sequentially serializes three process spawns; declaring them concurrent lets the runner overlap that I/O wait, which is exactly the case the CLAUDE.md guideline targets. Other files in test/js/bun/resolve/ (e.g. resolve-error.test.ts, bun-main-entry-point.test.ts) already use test.concurrent for the same shape of test.
That said, the immediate sibling files this PR cites in its test plan (require-esm-transitive-tla.test.ts, require-esm-microtask-order.test.ts, require-esm-gc-roots.test.ts) also use plain test(), so this PR is consistent with its closest neighbors. CLAUDE.md is the authoritative guideline though, so flagging it here.
Step-by-step
- Test 1 ("throws ERR_REQUIRE_CYCLE_MODULE") creates
tempDir("require-esm-cycle-throw", {...})and spawnsbun entry.mjsin it. - Test 2 ("SCC sibling… returns the populated namespace") creates a separate
tempDir("require-esm-cycle-sibling", {...})and spawnsbun entry.mjsin that dir. - Test 3 ("error names the requested module") creates a third
tempDir("require-esm-cycle-message", {...})and spawnsbun a.mjs. - None of the three reads or writes anything the others can observe; the
using dirdeclarations dispose each temp dir at end-of-scope independently. Therefore running them concurrently cannot change the outcome.
Fix
Either change each test(...) to test.concurrent(...), or wrap all three in a describe.concurrent("require(esm) cycles", () => { ... }) block. No assertion changes needed.
Severity
Pure style/convention — the tests are functionally correct as written. Filing as a nit.
| // Node's ERR_REQUIRE_CYCLE_MODULE instead. | ||
| JSValue state = record->internalField(JSC::AbstractModuleRecord::Field::State).get(); | ||
| if (state.isInt32() && state.asInt32() == static_cast<int32_t>(JSC::AbstractModuleRecord::State::Init)) | ||
| return Bun::throwError(globalObject, scope, Bun::ErrorCode::ERR_REQUIRE_CYCLE_MODULE, makeString("Cannot require() ES Module "_s, keyString, " in a cycle."_s)); |
There was a problem hiding this comment.
🟡 nit: Node's ERR_REQUIRE_CYCLE_MODULE message includes a (from <parent>) suffix naming the requesting module, which helps identify which side of the cycle issued the require(). Bun's message only includes the target module path. Plumbing the parent through $loadEsmIntoCjs → $esmLoadSync (CommonJS.ts has this.filename) would close the gap, but .code and the target path are already correct so this is non-blocking.
Extended reasoning...
What the divergence is
Node's ESM loader (lib/internal/modules/esm/loader.js) constructs this error as:
let message = `Cannot require() ES Module ${filename} in a cycle.`;
if (parentFilename) message += ` (from ${parentFilename})`;
throw new ERR_REQUIRE_CYCLE_MODULE(message);So Node's message names both the target module and the parent module that issued the require(). Bun's new message at ZigGlobalObject.cpp:764 is:
makeString("Cannot require() ES Module "_s, keyString, " in a cycle."_s)— only the target. The (from <parent>) suffix is missing.
Why it can't be added trivially
functionEsmLoadSync only receives one argument: callFrame->argument(0), the resolved key of the target module. The parent/requesting module's path isn't available at this layer. The call chain is overridableRequire (which has this.filename) → $requireESM(id) → $loadEsmIntoCjs(resolved) → $esmLoadSync(resolved) → functionEsmLoadSync. None of those intermediate calls forward the parent filename, so adding the suffix would require threading a second argument through $loadEsmIntoCjs and $esmLoadSync.
Step-by-step example
Using test 3's fixture: a.mjs imports b.mjs, and b.mjs does createRequire(import.meta.url)('./a.mjs').
a.mjsentersInnerModuleEvaluation→status = Evaluating, generatorState = Init(still in the requested-modules loop).b.mjsbody runs, callsrequire('./a.mjs')→overridableRequire(wherethis.filenameis/…/b.mjs).$requireESM→$loadEsmIntoCjs('/…/a.mjs')→$esmLoadSync('/…/a.mjs')— only the target path is passed.functionEsmLoadSyncfinds the registry entry fora.mjsatStatus::EvaluatingwithState == Initand throws with messageCannot require() ES Module /…/a.mjs in a cycle.- Node for the same scenario produces
Cannot require() ES Module /…/a.mjs in a cycle. (from /…/b.mjs).
The user sees which module can't be required, but not which module did the requiring — in a two-file cycle that's obvious, but in a larger graph (e.g. a → b → c → d → require(a)) the parent path is the more useful debugging hint.
Addressing the refutation
One verifier argued this isn't worth a comment because Bun routinely diverges from Node's exact error wording and .code is what matters for programmatic compat. That's correct on the merits — .code === 'ERR_REQUIRE_CYCLE_MODULE' is set, the error class is Error, and the target path is in the message (test 3 verifies all of this), so any code that branches on e.code or greps the message for the target filename works identically. This is purely a human-readable-message difference, and the fix requires non-trivial signature changes across three layers. Filing as a nit that can be deferred to a follow-up; it should not block this PR.
Suggested fix (follow-up)
Pass the parent filename as a second argument from overridableRequire/requireESMFromHijackedExtension (this.filename) through $requireESM → $loadEsmIntoCjs → $esmLoadSync, then in functionEsmLoadSync read callFrame->argument(1) and append (from <parent>) when it's a non-empty string.
| // $requireESM returns the namespace object directly (it throws on | ||
| // undefined). Use that — re-reading via $esmNamespaceForCjs would miss the | ||
| // Evaluating-but-body-ran case that $esmLoadSync handles. | ||
| if (namespace !== undefined) { |
There was a problem hiding this comment.
🟡 Nit: the new comment says "$requireESM ... throws on undefined", which is correct — requireESM() ends with if (exports === undefined) throw new TypeError(...) and the catch block here re-throws — so namespace can never be undefined when if (namespace !== undefined) is reached (same in requireESMFromHijackedExtension). The guard is now always-true dead code that contradicts the comment immediately above it; consider dropping the check (and the now-unreachable fallthrough) or removing the "(it throws on undefined)" parenthetical.
Extended reasoning...
What
The PR adds a comment at src/js/builtins/CommonJS.ts:115-117 (and the mirror in requireESMFromHijackedExtension at ~338-340):
$requireESMreturns the namespace object directly (it throws on undefined).
…and then immediately follows it with the pre-existing guard if (namespace !== undefined) { ... }. The comment and the guard contradict each other: if $requireESM throws on undefined, the guard is dead.
Why the guard is provably dead
requireESM() in the same file (~316-323) is:
var exports = $esmNamespaceForCjs(resolved);
if (exports === undefined) {
exports = $loadEsmIntoCjs(resolved);
}
if (exports === undefined) {
throw new TypeError(`require() failed to evaluate module ...`);
}
return exports;So $requireESM(id) either returns a non-undefined value or throws. Back in overridableRequire, the call is wrapped in a try/catch whose catch block does $requireMap.$delete(id); throw exception; — i.e. it re-throws. Therefore when control reaches line 118, namespace !== undefined is guaranteed and the if is always entered. The fallthrough to $evaluateCommonJSModule(mod, this) for the out === -1 branch is consequently unreachable (which is in fact the intended new behavior — the PR wants to return namespace in the SCC-sibling case rather than re-evaluate — but the dead guard obscures that intent).
Step-by-step
out === -1, so we enter the ESM branch and callnamespace = $requireESM(id).- Case A —
$requireESMreturns: by its definition the return value is non-undefined.namespace !== undefinedis true → enter the block, set__esModule, returnmod.exports = namespace[...]. - Case B —
$requireESMthrows (including the "failed to evaluate module"TypeErrorfor the would-be-undefined case): the catch block deletes from$requireMapand re-throws. Line 118 is never reached. - There is no Case C where line 118 is reached with
namespace === undefined.
The same reasoning applies verbatim to requireESMFromHijackedExtension, where the dead else-path is return; (leaving this.exports untouched), which is likewise unreachable.
Addressing the refutation
One verifier argued this is a harmless always-true defensive check that "future-proofs against changes to $requireESM" and isn't worth a comment. That's fair on the behavioral axis — there is zero functional impact, which is why this is filed as a nit, not a bug. But the issue isn't merely "redundant defensive code": the PR adds a comment that explicitly asserts the opposite of what the retained guard checks, on the line directly above it. A reader sees "it throws on undefined" and then if (namespace !== undefined) and has to go verify which one is wrong. That's a self-introduced inconsistency in this PR rather than a subjective style preference about pre-existing code, so it's worth a one-line nit.
Suggested fix
Either:
- Drop the
if (namespace !== undefined)wrapper in both functions and let the body run unconditionally (also deletes the unreachable fallthrough inoverridableRequire), or - Keep the defensive guard and remove "(it throws on undefined)" from the comment so the two no longer contradict.
Either is a couple-of-line change with no behavior difference.
What
$esmLoadSyncreturned the namespace object for a module atStatus::Evaluatingwhose body had not yet run (it was still in the requested-modules loop ofInnerModuleEvaluation).let/const/classexports were TDZ, so the caller hitCannot access 'x' before initializationfar from the actual cycle. Regression from the module-loader rewrite (#29393).Now, before calling
loadModuleSync, if the target record is alreadyEvaluating:State == Init→ExecuteModulehasn't been entered → throwERR_REQUIRE_CYCLE_MODULE(matches Node)CyclicModuleRecord::evaluate()'s status precondition)hasTLA()→ body suspended at firstawait→ surface the existing async-module errorCommonJS.tsnow uses$requireESM's return value directly instead of re-reading via$esmNamespaceForCjs, whose stricterstatus == Evaluatedcheck dropped the namespace in the SCC-sibling case.Why
Test plan
bun bd test test/js/bun/resolve/require-esm-cycle.test.ts— 3 new tests (cycle throws, SCC sibling still works, error names module)bun bd test test/js/bun/resolve/require-esm-{transitive-tla,microtask-order,gc-roots}.test.ts— existing require(esm) coverage still passesbun bd test test/js/bun/resolve/esModule.test.tsUSE_SYSTEM_BUN=1 bun test test/js/bun/resolve/require-esm-cycle.test.ts— tests 1 and 3 fail on 1.3.5Out of scope
When a CJS file is statically imported by ESM, its body runs during
makeModule(the fetch phase) before any record reachesEvaluating; arequire()back into the graph from there sees all records atStatus::New, double-evaluates, and trips afetchCompletedebug assertion. That reproduces onmainwithout this change and is tracked separately.