fix(embeddings): resolve onnxruntime-common under pnpm-strict / pnpm dlx (#307)#2139
Conversation
|
@magyargergo is attempting to deploy a commit to the NexusCore Team on Vercel. A member of the Team first needs to authorize it. |
✨ PR AutofixFound fixable formatting / unused-import issues across 16 changed lines. Comment |
CI Report✅ All checks passed Pipeline Status
Test Results
✅ All 10977 tests passed 16 test(s) skipped — expand for details
Code CoverageTests
📋 View full run · Generated by CI |
…dlx (abhigyanpatwari#307) `@huggingface/transformers` does a bare `import 'onnxruntime-common'` from its shipped `dist/transformers.node.mjs`, but never declares onnxruntime-common in its own `dependencies`. npm's flat node_modules (and pnpm with hoisting) place it on transformers' resolution path by accident; pnpm's isolated store only links a package's declared deps into its scope, so under pnpm-strict / `pnpm dlx` / `pnpx` the import dies with ERR_MODULE_NOT_FOUND before `analyze --embeddings` can run. Declaring onnxruntime-common in gitnexus' own deps (abhigyanpatwari#2074) does not fix this under pnpm: Node resolves the bare specifier from transformers' module scope, not ours, and overrides/resolutions can only re-version an existing edge, never add the missing one (verified against a real `hoist=false` install — the declaration only changes which version wins the hoist, never whether the import resolves). Fix: install a synchronous, in-thread ESM resolution hook (`module.registerHooks`) right before the lazy transformers import that redirects `onnxruntime-common` to the copy gitnexus depends on — but only when the default resolver fails. On npm / hoisted layouts the default resolver succeeds first and the hook never fires, so working setups are unchanged. The hook only intercepts the exact `onnxruntime-common` specifier on failure, so it can never mask an unrelated resolution error; onnxruntime-node's native binding still loads normally from transformers' own scope. `registerHooks` (sync, in-thread, single inline closure) is preferred over the older `module.register` (async, off-thread, now deprecated — DEP0205, removed in Node 26): the redirect is a one-line conditional that needs no worker thread, no separate hook module, and no `data` marshalling. It is available on Node >= 22.15; on older runtimes the helper is a graceful no-op (the gitnexus engines floor is >= 22.0.0, and the import still resolves on hoisted layouts there). Chosen over bundling transformers (the build is tsc-only, and transformers carries native onnxruntime-node + WASM onnxruntime-web assets that bundle poorly). Installation is idempotent, best-effort, and lazy — only on the local-embedding path, so it never affects analysis, the parse workers, or HTTP embedding mode. Validated end-to-end: the compiled resolver fixes a real pnpm `hoist=false` transformers install (ERR_MODULE_NOT_FOUND -> resolved). The separate `@ladybugdb/core` native-binary path under pure `pnpm dlx` is unchanged (abhigyanpatwari#1967 handles that gracefully). Refs abhigyanpatwari#307, abhigyanpatwari#2069 Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
3365436 to
425809c
Compare
magyargergo
left a comment
There was a problem hiding this comment.
Tri-review of #2139 — onnxruntime-common pnpm-strict fix
3 methods: GitNexus swarm + Compound-Engineering personas + Codex. Engine breakdown: 6 Claude lanes + Codex (the one independent engine; live). Independence is asymmetric — Claude-lane agreement is "consistent across personas," not independent confirmation; the strong signal is Codex agreeing with a Claude lane.
The fix is sound — no P0/P1
The correctness and adversarial lanes empirically executed the redirect on Node 22.16 against the real installed packages and refuted the scariest hypotheses:
- No new dual-instance
Tensoridentity break — the redirect makes transformers use the same CJSonnxruntime-commonbuild thatonnxruntime-nodealready loads, and the feed path intoInferenceSession.runis duck-typed, notinstanceof-gated. (Codex raised this as a P2; two Claude lanes ran it and refuted it.) - The CJS-entry redirect still yields the named
Tensorexport via cjs-module-lexer; and (correctness lane) omittingformatis safe because onnxruntime-common ships a nesteddist/cjs/package.json{"type":"commonjs"}. - The helper is fully wrapped in
try/catch, so a failure degrades to a logged no-op rather than throwing into the embedder (that swallow path is itself untested — see Test gaps).
Inline findings (2)
See the two inline comments (:70, :79).
Body / lower-priority (not inlined)
- [P2] Test gap (test/CI + testing lanes): the resolver's outer
try/catchswallow path is untested. The reason given for dropping the test (vitest surfaces a throwing mock) applies to a throwing factory, not a throwing spy — verifiedloadResolver(vi.fn(() => { throw … }))+not.toThrow()passes. Worth adding. - [P3] Weak test assertion (
onnxruntime-common-resolver.test.ts):/^file:\/\/.*onnxruntime-common/matches loosely; anchor to/node_modules/onnxruntime-common/. - [P3]
__resetOnnxRuntimeCommonResolverForTestsis effectively a no-op aftervi.resetModules()+ a fresh import (maintainability + testing lanes) — consider removing the seam. - [P3] Comment overstates isolation: "never affects other tools' resolution" — the hook is process-global and fires (cheaply, inert) for every resolution in the long-lived MCP server (adversarial; Codex refuted any real overhead concern).
- [design] Node 22.0–22.14:
registerHooksneeds ≥22.15 while theenginesfloor is ≥22.0.0, so the fix is a silent no-op there (documented in the file;registerHooksis also@experimental). - Refuted/contested: the adversarial lane raised "
attempted=truebefore thetrypoisons the fallback on a transientrequire.resolvefailure"; Codex refuted it (a failedrequire.resolvehas no valid redirect target to retry anyway).
CI
Real test/build checks pending as of a880f31b; the Vercel fail is deploy-auth, not code. The PR head is a "Merge main" commit (branch updated with main) — verified the net PR diff is exactly the 4 source files (merge-base 7eaeb0a0; 425809cf..HEAD shows no change to those 4 files).
Automated multi-tool digest (GitNexus swarm + CE personas + Codex) — verify before acting.
| const require = createRequire(import.meta.url); | ||
| // Resolve from gitnexus' own scope — onnxruntime-common is a direct | ||
| // dependency, so this succeeds under npm and pnpm-strict alike. | ||
| const redirectUrl = pathToFileURL(require.resolve('onnxruntime-common')).href; |
There was a problem hiding this comment.
[P2 · code-read · Codex + adversarial converge] Redirect version can skew from the native binding under pnpm dlx, and the comment at :25 is inaccurate.
The comment claims the redirected copy is "version-aligned with the onnxruntime-node transformers loads." But gitnexus' npm-style overrides block (which pins onnxruntime-node) is honored only from the root manifest — so under pnpm dlx/pnpx, where gitnexus is a transitive dep, it does not apply. transformers then loads its own pinned onnxruntime-node@1.24.3 (→ wants onnxruntime-common@1.24.3) while require.resolve('onnxruntime-common') here supplies gitnexus' ^1.26.0.
Not a current bug — it works because the Tensor feed surface is stable/duck-typed across 1.24–1.26 (verified). But the comment's rationale is wrong, and the skew is latent on every onnxruntime-common bump.
- Confirmed defect: correct the
:25comment. - Recommended hardening: resolve the redirect target from
onnxruntime-node's own scope (the version the native binding actually expects), falling back to gitnexus' copy.
| // (npm / hoisted pnpm); only substitute ours when resolution fails. | ||
| try { | ||
| return nextResolve(specifier, context); | ||
| } catch { |
There was a problem hiding this comment.
[P3 · code-read · Codex + correctness + adversarial] catch {} is too broad.
This swallows every error from nextResolve for the onnxruntime-common specifier — not just ERR_MODULE_NOT_FOUND — and redirects. A genuinely present-but-broken install (e.g. ERR_PACKAGE_PATH_NOT_EXPORTED, a corrupt exports map) would be silently papered over with gitnexus' copy instead of surfaced.
Narrow it:
} catch (err) {
if (err?.code === 'ERR_MODULE_NOT_FOUND' || err?.code === 'ERR_PACKAGE_PATH_NOT_EXPORTED') {
return { url: redirectUrl, shortCircuit: true };
}
throw err;
}…abhigyanpatwari#307) Prefer the onnxruntime-common that onnxruntime-node (the native binding transformers actually loads) depends on, so the redirected copy is version- matched to that binding even under `pnpm dlx` — where gitnexus' npm-style `overrides` block does not apply, because it is honoured only from a root manifest and gitnexus is a transitive dependency there. The walk resolves transformers' main entry (not its `exports`-blocked package.json) -> onnxruntime-node -> its onnxruntime-common, and falls back to gitnexus' own direct dependency when the chain can't be walked. Also corrects the doc comment that claimed the gitnexus copy was already "version-aligned". Addresses a PR abhigyanpatwari#2139 tri-review finding (P2). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…sence errors (abhigyanpatwari#307) The resolve closure's `catch` swallowed every error from `nextResolve` and redirected, which would silently paper over a genuinely present-but-broken onnxruntime-common install. Only substitute gitnexus' copy when the specifier is actually absent (ERR_MODULE_NOT_FOUND, or ERR_PACKAGE_PATH_NOT_EXPORTED for an exports-broken copy); rethrow anything else. Adds a test that an unrelated error code rethrows. Addresses a PR abhigyanpatwari#2139 tri-review finding (P3). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…wallow path (abhigyanpatwari#307) The outer try/catch in ensureOnnxRuntimeCommonResolvable() was untested. A throwing registerHooks spy drives it; the call must not throw (initEmbedder does not guard the return, so a throw would break `analyze --embeddings`). The vitest quirk that surfaced an earlier attempt applies to throwing mock factories, not a throwing spy implementation, so this is testable cleanly. Addresses a PR abhigyanpatwari#2139 tri-review finding (P2). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…on (abhigyanpatwari#307) `/^file:\/\/.*onnxruntime-common/` matched the substring anywhere, so a lookalike path (e.g. `/x/onnxruntime-common-fake/`) would pass. Require an actual `/node_modules/onnxruntime-common/...js` segment so the assertion proves the redirect resolves to the real package, not just a string match. Addresses a PR abhigyanpatwari#2139 tri-review finding (P3). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ForTests seam (abhigyanpatwari#307) The test helper reloads the resolver via vi.resetModules() + a fresh import(), which already re-initialises the module-level one-shot `attempted` flag to false. The __reset export it then called was therefore a no-op. Remove the test-only export and its call; isolation now rests solely on vi.resetModules(). Addresses a PR abhigyanpatwari#2139 tri-review finding (P3). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…omment (abhigyanpatwari#307) The doc comment claimed the hook "never affects other tools' resolution". Once installed, `module.registerHooks` is process-global and its resolve closure runs for every subsequent resolution — it passes them all through untouched and only substitutes the exact `onnxruntime-common` specifier on genuine absence, at a cost of one string comparison. Also note `registerHooks` is @experimental and requires Node >= 22.15 (graceful no-op below that). Comment-only. Addresses a PR abhigyanpatwari#2139 tri-review finding (P3). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The bug (#307 / #2069)
pnpm dlx gitnexus@latest analyze --embeddings(andpnpm add -gstrict installs) crash with:@huggingface/transformers@4.2.0does a bareimport 'onnxruntime-common'from its shippedtransformers.node.mjs, but never declares onnxruntime-common in its owndependencies(it listsonnxruntime-node/onnxruntime-web/sharp). It's a phantom dependency: npm's flatnode_modules— and pnpm with hoisting — place it on transformers' resolution path by accident, so the import resolves. Under pnpm's isolated store a package only sees its declared deps, so the import dies.Why #2074 didn't fix it for pnpm
#2074 declared
onnxruntime-commonin gitnexus' own deps. Node resolves the bare specifier from transformers' module scope, not ours, so a sibling-scope declaration is invisible to it under pnpm-strict. Verified empirically against real pnpm installs (10.x and 11.x): under default hoisting the import already resolves (so most users are fine, and #2074 only changes which onnxruntime-common version wins the hoist); under a non-hoisting / strict layout it fails even with #2074.The fix
Install a synchronous, in-thread ESM resolution hook (
module.registerHooks) immediately before the lazyimport('@huggingface/transformers'). The hook:onnxruntime-commonwhen the default resolver fails — so on npm / hoisted layouts it never fires and behaviour is unchanged;onnxruntime-commonis a stable pure-JS package whoseTensorAPI is unchanged across 1.24–1.26, version-aligned with theonnxruntime-nodetransformers loads), always resolvable from gitnexus' own scope regardless of how the consumer's package manager linked things;onnxruntime-commonspecifier on failure, so it can't mask an unrelated resolution error.onnxruntime-node's native binding still loads normally from transformers' own scope.Installation is idempotent, best-effort, and lazy — only on the local-embedding code path, so it never affects analysis, the parse workers, or HTTP embedding mode. If anything goes wrong it's swallowed and embeddings proceed exactly as before.
Why
registerHooks(and notregister, or bundling)module.register()(async, off-thread) is deprecated —DEP0205, slated for removal in Node 26 — and would force a separate hook module + cross-threaddataplumbing.module.registerHooks()is the supported replacement: one inline synchronous closure, no separate file, no worker thread. It's available on Node ≥ 22.15; gitnexus' engine floor is>=22.0.0, so on older 22.x the helper is a graceful no-op (the import still resolves on hoisted layouts there). I kept the floor at>=22.0.0rather than bumping the whole package'senginesfor one best-effort fallback.tsc-only, and transformers carries a nativeonnxruntime-nodebinding + WASMonnxruntime-webassets that bundle poorly.Validation
dist/resolver fixes a genuine pnpmhoist=falsetransformers install —ERR_MODULE_NOT_FOUND→ resolved.tsc --noEmitclean). New unit test covers the resolve closure's redirect / passthrough / rethrow logic, one-shot installation, and graceful degradation whenregisterHooksis unavailable.Scope / what this does NOT cover
The separate
@ladybugdb/corenative-binary failure under purepnpm dlx/pnpx(itslbugjs.nodeis staged by a postinstall that ephemeral runners skip) is unchanged — owned upstream in@ladybugdb/core, and #1967 already makes it fail gracefully. This PR restores--embeddingsunder pnpm-strict / global installs.Suggested follow-ups (not in this PR)
huggingface/transformers.js#1087(declareonnxruntime-common).🤖 Generated with Claude Code