Skip to content

perf(JSC): O(E+N) GetModuleNamespace; iterative InnerModuleLoading#200

Merged
Jarred-Sumner merged 3 commits into
mainfrom
claude/module-loader-perf
Apr 26, 2026
Merged

perf(JSC): O(E+N) GetModuleNamespace; iterative InnerModuleLoading#200
Jarred-Sumner merged 3 commits into
mainfrom
claude/module-loader-perf

Conversation

@Jarred-Sumner

Copy link
Copy Markdown
Collaborator

perf(JSC): O(E+N) GetModuleNamespace; iterative InnerModuleLoading

  • AbstractModuleRecord::getModuleNamespace: replace per-name star-graph
    walk (O(names × edges)) with a single BFS that records each name's
    unique Local/Namespace binding. A name with exactly one binding is
    Resolved by spec regardless of resolveSet, so only names with >1
    binding or an Indirect entry fall back to resolveExportImpl.
  • AbstractModuleRecord::resolveExportImpl: read m_resolutionCache in
    Type::Query unconditionally. Only Resolved values are ever written
    (root or local), so a non-empty resolveSet can only turn paths to the
    cached binding into null, and merge(Resolved, null) = Resolved.
  • JSModuleLoader::innerModuleLoading: drain a worklist on the state
    instead of recursing through the FinishLoadingImportedModule ->
    ContinueModuleLoading re-entry. continueModuleLoading short-circuits
    already-visited modules without a function call.
  • ModuleGraphLoadingState: inline the trivial accessors; widen
    m_visitedSet to AbstractModuleRecord* so the per-edge contains check
    runs before dynamicDowncast.
  • JSModuleLoader::loadRequestedModules: memoize the GraphLoadingState
    capability per module (skip rejected so retry still re-walks).

bench/module-loader (500 modules, 30k star edges):
GetModuleNamespace + ResolveExport drop from ~48% of cycles to <1%.
Roughly 2x faster end-to-end vs previous.

bench/module-loader (500 modules, 30k star edges)

before after
getModuleNamespace/resolveExportImpl self-cycles ~48% <1%
innerModuleLoading self-cycles ~15% <1%
end-to-end (local release, no LTO) ~694ms ~320ms
bun-profile (prebuilt baseline) ~530ms
node 24 ~525ms
deno 2.7 ~157ms

Bun test/js/bun/resolve + test/js/node/module suites unchanged vs baseline.

- AbstractModuleRecord::getModuleNamespace: replace per-name star-graph
  walk (O(names × edges)) with a single BFS that records each name's
  unique Local/Namespace binding. A name with exactly one binding is
  Resolved by spec regardless of resolveSet, so only names with >1
  binding or an Indirect entry fall back to resolveExportImpl.
- AbstractModuleRecord::resolveExportImpl: read m_resolutionCache in
  Type::Query unconditionally. Only Resolved values are ever written
  (root or local), so a non-empty resolveSet can only turn paths to the
  cached binding into null, and merge(Resolved, null) = Resolved.
- JSModuleLoader::innerModuleLoading: drain a worklist on the state
  instead of recursing through the FinishLoadingImportedModule ->
  ContinueModuleLoading re-entry. continueModuleLoading short-circuits
  already-visited modules without a function call.
- ModuleGraphLoadingState: inline the trivial accessors; widen
  m_visitedSet to AbstractModuleRecord* so the per-edge contains check
  runs before dynamicDowncast.
- JSModuleLoader::loadRequestedModules: memoize the GraphLoadingState
  capability per module (skip rejected so retry still re-walks).

bench/module-loader (500 modules, 30k star edges):
  GetModuleNamespace + ResolveExport drop from ~48% of cycles to <1%.
  Roughly 2x faster end-to-end vs previous.
@coderabbitai

coderabbitai Bot commented Apr 26, 2026

Copy link
Copy Markdown

Walkthrough

The changes optimize ECMAScript module graph loading by refactoring from recursive call-based traversal to an iterative state-managed worklist approach, while consolidating module namespace export resolution into a single graph traversal with unified conflict detection.

Changes

Cohort / File(s) Summary
Export Resolution Optimization
Source/JavaScriptCore/runtime/AbstractModuleRecord.cpp
Refactors getModuleNamespace to perform star-export graph traversal once while inspecting exportEntries(), replacing the prior two-phase strategy of collecting all names then resolving each separately. Now records Resolution only for uniquely-bound names; conflicting, non-local, and indirect cases are deferred to slowPathNames for resolution via the existing path.
Module Loading State Infrastructure
Source/JavaScriptCore/runtime/ModuleGraphLoadingState.h, Source/JavaScriptCore/runtime/ModuleGraphLoadingState.cpp
Converts accessor/mutator methods (promise(), pendingModulesCount(), isLoading(), scriptFetcher(), and setters) from .cpp implementations to inline definitions in .h. Generalizes containsVisited() to accept const AbstractModuleRecord* with inline retyped keying. Introduces m_innerLoadQueue and m_drainingInnerLoad state with new methods for enqueue/take operations and drain-status queries.
Module Loading Refactoring
Source/JavaScriptCore/runtime/JSModuleLoader.cpp
Replaces recursive InnerModuleLoading re-entry with iterative worklist-based drain loop. innerModuleLoading now enqueues modules and uses drainingInnerLoad flag to ensure single outermost processing loop. continueModuleLoading adds fast-path completion for already-visited modules while avoiding full inner routine invocation during active drains.
🚥 Pre-merge checks | ✅ 3 | ❌ 1

❌ Failed checks (1 inconclusive)

Check name Status Explanation Resolution
Description check ❓ Inconclusive The description is comprehensive and detailed, explaining the changes with technical depth, benchmarks, and test results. However, it lacks the required Bugzilla bug link and formal structure specified in the WebKit template. Add the required Bugzilla bug link (https://bugs.webkit.org/show_bug.cgi?id=#####) and follow the formal template structure with 'Reviewed by' section.
✅ Passed checks (3 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately summarizes the main performance-focused changes: O(E+N) optimization to GetModuleNamespace and converting recursive InnerModuleLoading to iterative.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.

JSTests/modules/self-star-link.js: an Indirect cycle that falls back to
star (JSC's IndirectFallback) makes the cached Resolved value
path-dependent — C resolved as a root caches 'D', but C reached from A
must yield 'E'. The original !foundStarLinks gate is correct here. The
getModuleNamespace BFS doesn't depend on this read, so no perf change.
@github-actions

github-actions Bot commented Apr 26, 2026

Copy link
Copy Markdown

Preview Builds

Commit Release Date
28f2acb7 autobuild-preview-pr-200-28f2acb7 2026-04-26 04:14:55 UTC
143fb225 autobuild-preview-pr-200-143fb225 2026-04-26 02:08:22 UTC

Jarred-Sumner added a commit to oven-sh/bun that referenced this pull request Apr 26, 2026
oven-sh/WebKit#200:
- O(E+N) GetModuleNamespace via single-BFS unique-binding fast path
- iterative InnerModuleLoading worklist
- inline ModuleGraphLoadingState accessors
- memoize LoadRequestedModules per module

bench/module-loader (500 modules / 30k star edges): ~2x faster end-to-end.
The per-module WriteBarrier<JSPromise> kept one extra Promise alive for
the lifetime of every module record (visible in fetch-leak's heap-stats
threshold). The memoization only helps the rare case of two concurrent
import()s of the same module before either fetches; the original
per-state walk handles that correctly, just with a redundant traversal.
Not worth a permanent +1 Promise per module.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@Source/JavaScriptCore/runtime/JSModuleLoader.cpp`:
- Around line 841-861: The current code enqueues an already-loaded child via
state->enqueueInnerLoad(loaded) and then continues to the next sibling, which
breaks the required depth-first descent; change the logic in the block that
checks module->loadedModules() (ModuleMapKey lookup) to process cached children
immediately using an explicit stack or frame instead of enqueueing them: when
you find iter->value.m_module (loaded) and state->containsVisited(loaded) is
false, push or process that loaded AbstractModuleRecord* (the same work that
enqueueInnerLoad would trigger) before advancing to the next request so
descendant HostLoadImportedModule calls (hostLoadImportedModule) and synchronous
failures occur before siblings; preserve the state->isLoading() checks and the
existing payload creation (ModuleLoaderPayload::create) and only call
hostLoadImportedModule for requests after the cached-child stack/frame has been
drained.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 599ee7ac-30ee-40fe-a40f-a3dae38a12dc

📥 Commits

Reviewing files that changed from the base of the PR and between 143fb22 and 28f2acb.

📒 Files selected for processing (2)
  • Source/JavaScriptCore/runtime/AbstractModuleRecord.cpp
  • Source/JavaScriptCore/runtime/JSModuleLoader.cpp

Comment on lines +841 to +861
if (!loadedModulesEmpty) {
if (auto iter = module->loadedModules().find(ModuleMapKey { request.m_specifier.impl(), request.type() }); iter != module->loadedModules().end()) {
// 2.d.ii.1. Perform InnerModuleLoading(state, record.[[Module]]).
AbstractModuleRecord* loaded = iter->value.m_module.get();
if (state->containsVisited(loaded))
state->setPendingModulesCount(state->pendingModulesCount() - 1);
else
state->enqueueInnerLoad(loaded);
if (!state->isLoading())
break;
continue;
}
}
// 2.d.iii. Else,
// 2.d.iii.1. Perform HostLoadImportedModule(module, request, state.[[HostDefined]], state).
// The payload only carries `state`; one instance is sufficient for every
// request originating from this drain. Async loads stash it in their
// ModuleLoadingContext, but they all want the same state back.
if (!payload)
payload = ModuleLoaderPayload::create(vm, state);
JSPromise* promise = hostLoadImportedModule(globalObject, cyclic, request, payload, state->scriptFetcher(), true);

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Keep [[LoadedModules]] descent depth-first.

This path now enqueues an already-loaded child and keeps iterating later sibling requests. Step 2.d.ii.1 used to descend into that child immediately, so this reorders descendant HostLoadImportedModule calls and synchronous failures relative to later siblings. A graph like [A_loaded, B] can now start B before a synchronous failure in A_loaded's subtree is observed, which changes host side effects and potentially first-error precedence. Use an explicit stack/frame so cached children are processed before advancing to the next sibling.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@Source/JavaScriptCore/runtime/JSModuleLoader.cpp` around lines 841 - 861, The
current code enqueues an already-loaded child via
state->enqueueInnerLoad(loaded) and then continues to the next sibling, which
breaks the required depth-first descent; change the logic in the block that
checks module->loadedModules() (ModuleMapKey lookup) to process cached children
immediately using an explicit stack or frame instead of enqueueing them: when
you find iter->value.m_module (loaded) and state->containsVisited(loaded) is
false, push or process that loaded AbstractModuleRecord* (the same work that
enqueueInnerLoad would trigger) before advancing to the next request so
descendant HostLoadImportedModule calls (hostLoadImportedModule) and synchronous
failures occur before siblings; preserve the state->isLoading() checks and the
existing payload creation (ModuleLoaderPayload::create) and only call
hostLoadImportedModule for requests after the cached-child stack/frame has been
drained.

@Jarred-Sumner Jarred-Sumner merged commit bdf6aab into main Apr 26, 2026
43 checks passed
Jarred-Sumner added a commit to oven-sh/bun that referenced this pull request Apr 26, 2026
Jarred-Sumner added a commit to oven-sh/bun that referenced this pull request Apr 26, 2026
Companion to oven-sh/WebKit#200.

Adds correctness coverage for the new JSC `GetModuleNamespace`
single-BFS fast path:
- deep `export * from` chain — every transitive binding is present
- two siblings with the same local — name is excluded (ambiguous), but a
local in the importing module shadows correctly
- indirect re-export through a star chain — still resolves via the slow
path

These pass on both the current and patched JSC (the optimization
preserves observable behavior).

Bench impact from the WebKit change (500 modules, 30k star edges, local
release without LTO):
- before: ~694ms
- after: ~320ms
- node 24: ~525ms, deno 2.7: ~157ms

Separately found while auditing: a fresh `import()` that reaches a
module whose previous load failed resolves on Bun (all versions back to
1.3.11) but rejects on Node/Deno — pre-existing, not touched by this PR.

---------

Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
xhjkl pushed a commit to xhjkl/bun that referenced this pull request May 14, 2026
…#29729)

Companion to oven-sh/WebKit#200.

Adds correctness coverage for the new JSC `GetModuleNamespace`
single-BFS fast path:
- deep `export * from` chain — every transitive binding is present
- two siblings with the same local — name is excluded (ambiguous), but a
local in the importing module shadows correctly
- indirect re-export through a star chain — still resolves via the slow
path

These pass on both the current and patched JSC (the optimization
preserves observable behavior).

Bench impact from the WebKit change (500 modules, 30k star edges, local
release without LTO):
- before: ~694ms
- after: ~320ms
- node 24: ~525ms, deno 2.7: ~157ms

Separately found while auditing: a fresh `import()` that reaches a
module whose previous load failed resolves on Bun (all versions back to
1.3.11) but rejects on Node/Deno — pre-existing, not touched by this PR.

---------

Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant