napi: don't mutate m_pendingNapiModules while range-for iterating it#29981
Conversation
process.dlopen iterates m_pendingNapiModules after the shared library's static constructors have run, calling each module's nm_register_func. If that callback itself calls napi_module_register() (or triggers a nested dlopen that does), the append can grow the backing WTF::Vector and leave the outer range-for's iterator pointing into freed memory. Any registrations made during init were also silently dropped by the trailing clear(). Move the pending vectors into locals with std::exchange before iterating, and drain the global vector in a loop so re-entrantly registered modules are executed instead of discarded. Same treatment for the cached-registration replay path.
|
Updated 4:35 AM PT - Apr 30th, 2026
❌ @robobun, your commit b1432a9 has 4 failures in
🧪 To try this PR locally: bunx bun-pr 29981That installs a local version of the PR into your bun-29981 --bun |
|
Warning Rate limit exceeded
To keep reviews running without waiting, you can enable usage-based add-on for your organization. This allows additional reviews beyond the hourly cap. Account admins can enable it under billing. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: ASSERTIVE Plan: Pro Run ID: 📒 Files selected for processing (4)
Review rate limit: 0/5 reviews remaining, refill in 5 minutes and 12 seconds. Comment |
|
Found 2 issues this PR may fix:
🤖 Generated with Claude Code |
There was a problem hiding this comment.
The fix looks correct and the Windows CRLF issue is resolved — but since this changes iteration/drain semantics in the core process.dlopen NAPI module-loading path (re-entrant registrations now execute rather than being dropped, and aren't added to the DLHandleMap cache), it's worth a maintainer's eyes before merge.
Extended reasoning...
Overview
This PR fixes an iterator-invalidation bug in Process_functionDlopen (src/bun.js/bindings/BunProcess.cpp): the execute loop range-for iterated globalObject->m_pendingNapiModules while nm_register_func callbacks could call napi_module_register() and append to that same vector, reallocating its backing buffer and leaving a dangling iterator (heap-use-after-free under ASAN, garbage function pointer otherwise). The fix std::exchanges the pending vectors into locals before iterating, then drains in a loop so re-entrantly registered modules are executed rather than silently dropped by the trailing .clear(). Applied to both the fresh-registration and cached-replay paths. Ships with a new reentrant_register_addon native test that registers 2 modules via static constructor, the first of which registers 64 more from its init callback to force reallocation. Also bumps the beforeAll node-gyp build timeout to 120s.
Security risks
None identified. No new attack surface — this is a defensive correctness fix in an existing code path. Inputs are still native module init callbacks, which are already fully trusted (loading a .node is arbitrary native code execution by design).
Level of scrutiny
High. process.dlopen is the entry point for every native addon load in Bun, and this change alters both control flow and semantics:
- Re-entrant
napi_module_register()calls from inside annm_register_funcare now executed (previously they were appended then immediately.clear()ed). This is the correct behavior, but it's a deliberate semantic change beyond just fixing the UAF. - Re-entrant registrations are not saved to
DLHandleMap(only the initial static-constructor batch is). This is consistent — on cached replay the drain loop in the second path will pick them up again when the originalnm_register_funcre-registers them — but it's a design decision a NAPI-subsystem maintainer should confirm. - Re-entrant V8 (
node_module_register) registrations are still cleared without execution, matching prior behavior.
Other factors
- The
std::exchange+ drain pattern mirrors recently-merged #29949 for the same class of bug. - My earlier inline comment about Windows CRLF in the test assertion was addressed in b1432a9 (now splits on
/\r?\n/). - The bug-hunting system found no issues on this revision.
- Linked to crash reports #20454 and #19802.
- No CODEOWNERS entry covers this file.
The fix itself reads as correct and idiomatic; I'm deferring purely because the dlopen/NAPI-init path is critical enough that a human familiar with the DLHandleMap caching semantics should sign off.
…ven-sh#29981) ## What `process.dlopen` iterates `globalObject->m_pendingNapiModules` after the shared library's static constructors have run, invoking each module's `nm_register_func`. If that callback itself calls `napi_module_register()` — or triggers a nested `dlopen` whose static constructors do — the append can grow the backing `WTF::Vector` and leave the outer range-for's iterator pointing into freed storage. Re-entrantly registered modules were also silently dropped by the trailing `.clear()`. ## How `std::exchange` the pending NAPI/V8 vectors into locals before iterating, then drain `m_pendingNapiModules` in a loop so modules registered during an init callback are executed rather than discarded. Applied to both execute paths in `Process_functionDlopen` (fresh static-constructor registrations and the cached-registration replay). ## Test New `reentrant_register_addon` in `test/napi/napi-app`: two modules are registered from a static constructor; the first one's `nm_register_func` registers 64 more. Before this change the 64 re-entrant registrations are never executed (and the second static-constructor registration is reached via a stale iterator into the reallocated buffer); after, all 66 init callbacks run in order. Also gives the `beforeAll` in `napi.test.ts` a 120 s timeout — the `bunExe() install` it runs does a full node-gyp rebuild which can exceed the default 5 s hook timeout under a debug/ASAN binary, at which point the test runner SIGTERMs the install and the whole file aborts with `build failed, bailing out!`. --------- Co-authored-by: robobun <robobun@users.noreply.github.com>
What
process.dlopeniteratesglobalObject->m_pendingNapiModulesafter the shared library's static constructors have run, invoking each module'snm_register_func. If that callback itself callsnapi_module_register()— or triggers a nesteddlopenwhose static constructors do — the append can grow the backingWTF::Vectorand leave the outer range-for's iterator pointing into freed storage. Re-entrantly registered modules were also silently dropped by the trailing.clear().How
std::exchangethe pending NAPI/V8 vectors into locals before iterating, then drainm_pendingNapiModulesin a loop so modules registered during an init callback are executed rather than discarded. Applied to both execute paths inProcess_functionDlopen(fresh static-constructor registrations and the cached-registration replay).Test
New
reentrant_register_addonintest/napi/napi-app: two modules are registered from a static constructor; the first one'snm_register_funcregisters 64 more. Before this change the 64 re-entrant registrations are never executed (and the second static-constructor registration is reached via a stale iterator into the reallocated buffer); after, all 66 init callbacks run in order.Also gives the
beforeAllinnapi.test.tsa 120 s timeout — thebunExe() installit runs does a full node-gyp rebuild which can exceed the default 5 s hook timeout under a debug/ASAN binary, at which point the test runner SIGTERMs the install and the whole file aborts withbuild failed, bailing out!.