fix: actionable error + docs for pnpm dlx / pnpx native-load crash (#307)#1967
Conversation
`pnpm dlx gitnexus serve` (and `pnpx gitnexus`) crash with a raw `ERR_DLOPEN_FAILED` stack trace because @ladybugdb/core's native addon (lbugjs.node) is placed by a postinstall script, and dlx/pnpx run ephemerally without executing lifecycle scripts. The existing checkLbugNative() guard already catches the missing binary for serve/mcp/analyze, but its guidance only mentioned bun and --ignore-scripts. Extend the message to call out the common pnpm dlx / pnpx case and the fix (`pnpm add -g gitnexus && pnpm approve-builds -g`, or use npx/npm). Add a matching README troubleshooting section. This does not make `pnpm dlx` itself work — that requires a runtime fallback in @ladybugdb/core. It turns the crash into actionable guidance. Refs abhigyanpatwari#307 Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
@tupe12334 is attempting to deploy a commit to the NexusCore Team on Vercel. A member of the Team first needs to authorize it. |
|
@tupe12334 I have a solution for this but I need somebody to test it #1945 |
|
btw, for pnpm --allow-build=@ladybugdb/core --allow-build=gitnexus --allow-build=tree-sitter dlx gitnexus@latest analyze |
Incorporates collaborator feedback (magyargergo): pnpm's security model allows `dlx` to run build scripts when you pass `--allow-build` for each native dep. Add this as the first/preferred pnpm-dlx path in the error message, README troubleshooting section, and test assertion. Drop the now-incorrect claim that `pnpm dlx` "cannot be made to work directly". Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
magyargergo
left a comment
There was a problem hiding this comment.
PR tri-review (#1967, framed vs #1945)
Methods: GitNexus risk-architect + test-ci-verifier, Compound-Engineering correctness + adversarial (4 Claude persona lanes), Codex (codex-rescue, live). Engines: 4× Claude + 1× Codex.
Verdict: production-ready with minor follow-ups — the change correctly turns ERR_DLOPEN_FAILED into actionable stderr for the #307 repro path; fix the P1 global-install line before merge on pnpm 11 hosts.
Branch hygiene: clean feature/fix PR — 2 commits, 3 files, no unrelated churn.
Merge state: checks failing or incomplete — only Vercel (org auth gate) visible via gh pr checks; no Actions rollup in this environment. Branch is behind main. Local vitest run test/unit/lbug-native-check.test.ts: 3/3 pass.
What is solid [code-read]
createLbugLazyAction(gitnexus/src/cli/lazy-action.ts:32-36) callscheckLbugNative()beforeserve/mcp/analyze— rawdlopenshould not reach users on the missing-binary path.--allow-buildflags are beforedlx(native-check.ts:49-50) — matches PR #1945’sresolve-analyze-cmd.cjscomment (issue-1939 branch, lines 12-15).- Allow-build package triple matches PR #1945
PNPM_ALLOW_BUILD_BASE(@ladybugdb/core,gitnexus,tree-sitter).
Cross-PR context (#1945 “in the sense of”)
PR #1945 steers hooks/runner toward pnpm --allow-build=… dlx (not bare dlx) to dodge npm 11 npx (#1939). #1967 is the safety net when users still run pnpm dlx gitnexus@latest serve (#307) and hit missing lbugjs.node. The two PRs are complementary, not contradictory — but README will need a merge-time dedupe with #1945’s rewritten npm-11 section, and #1945’s skill-file steering commit (3f2434f9 on issue-1939) still emits bare pnpm dlx gitnexus@latest analyze without flags (predictable first-failure loop until stderr from this PR).
Headline findings (inline comments on verified lines)
| Sev | Theme | Blocks merge? |
|---|---|---|
| P1 | pnpm approve-builds -g invalid on pnpm 11+ |
yes (broken repair path on target audience) |
| P2 | Unqualified npx escape vs npm 11 arborist crash |
no |
| P3 | Load-failure branch still bun-only; examples use analyze not serve |
no |
P1 — global install workaround
- Risk: Users on pnpm 11 copy
pnpm add -g gitnexus && pnpm approve-builds -gand hit an unsupported flag. - Evidence:
native-check.ts:52,README.md:317-318; pnpm approve-builds docs: “--global, -g— Removed in v11.0.0”. - Recommended fix:
pnpm add -g --allow-build=@ladybugdb/core --allow-build=gitnexus --allow-build=tree-sitter gitnexus(flags before subcommand, same triple as #1945).
P2 — npx/npm fallback
- Risk: npm 11
npxcrashes in arborist before gitnexus runs (#1939) — worse than this message. - Evidence:
native-check.ts:53,README.md:313-314; README L274+ already documentsnode.target is null. - Recommended fix: Qualify (“npm < 11 or after
npm i -g gitnexus”) or prefer global npm / flagged pnpm dlx.
P3 — nits
- Load-failure branch (
native-check.ts:79-81) unchanged — partial pnpm installs that leave a badlbugjs.nodeget no pnpm/--allow-buildhints. - Examples (
native-check.ts:50,README.md:310-311) sayanalyze; issue repro isserve— addserve/mcpor a<command>placeholder. - Tests (
lbug-native-check.test.ts:27-28) pin substrings only — consider asserting--allow-buildindex <dlxindex.
Refuted / lower priority (body only)
- Refuted: This PR cannot make bare
pnpm dlxwork without--allow-build— second commit documents the correct flagged command (pnpm 10.2+). - Refuted as blocking: Contradicts #1945 runner — #1967 does not touch hooks/resolver.
- Advisory: Windows users copying POSIX
\continuation from stderr (native-check.ts:49); pnpm < 10.2 lacks--allow-build( #1945 gates; this message does not).
Automated multi-tool digest — verify before acting. Coordinator tags: inline findings are [code-read] unless noted.
Replace removed pnpm approve-builds -g with add -g --allow-build flags, qualify npm 11 npx caveats, use serve in examples, extend load-failure hints, and assert --allow-build precedes dlx in tests. Co-authored-by: Cursor <cursoragent@cursor.com>
CI Report✅ All checks passed Pipeline Status
Test Results
✅ All 10761 tests passed 10 test(s) skipped — expand for details
Code CoverageTests
📋 View full run · Generated by CI |
…dlx (abhigyanpatwari#307) `@huggingface/transformers` does a bare `import 'onnxruntime-common'` from its shipped `dist/transformers.node.mjs`, but never declares onnxruntime-common in its own `dependencies`. npm's flat node_modules (and pnpm with hoisting) place it on transformers' resolution path by accident; pnpm's isolated store only links a package's declared deps into its scope, so under pnpm-strict / `pnpm dlx` / `pnpx` the import dies with ERR_MODULE_NOT_FOUND before `analyze --embeddings` can run. Declaring onnxruntime-common in gitnexus' own deps (abhigyanpatwari#2074) does not fix this under pnpm: Node resolves the bare specifier from transformers' module scope, not ours, and overrides/resolutions can only re-version an existing edge, never add the missing one (verified against a real `hoist=false` install — the declaration only changes which version wins the hoist, never whether the import resolves). Fix: install a synchronous, in-thread ESM resolution hook (`module.registerHooks`) right before the lazy transformers import that redirects `onnxruntime-common` to the copy gitnexus depends on — but only when the default resolver fails. On npm / hoisted layouts the default resolver succeeds first and the hook never fires, so working setups are unchanged. The hook only intercepts the exact `onnxruntime-common` specifier on failure, so it can never mask an unrelated resolution error; onnxruntime-node's native binding still loads normally from transformers' own scope. `registerHooks` (sync, in-thread, single inline closure) is preferred over the older `module.register` (async, off-thread, now deprecated — DEP0205, removed in Node 26): the redirect is a one-line conditional that needs no worker thread, no separate hook module, and no `data` marshalling. It is available on Node >= 22.15; on older runtimes the helper is a graceful no-op (the gitnexus engines floor is >= 22.0.0, and the import still resolves on hoisted layouts there). Chosen over bundling transformers (the build is tsc-only, and transformers carries native onnxruntime-node + WASM onnxruntime-web assets that bundle poorly). Installation is idempotent, best-effort, and lazy — only on the local-embedding path, so it never affects analysis, the parse workers, or HTTP embedding mode. Validated end-to-end: the compiled resolver fixes a real pnpm `hoist=false` transformers install (ERR_MODULE_NOT_FOUND -> resolved). The separate `@ladybugdb/core` native-binary path under pure `pnpm dlx` is unchanged (abhigyanpatwari#1967 handles that gracefully). Refs abhigyanpatwari#307, abhigyanpatwari#2069 Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…dlx (#307) (#2139) * fix(embeddings): resolve onnxruntime-common under pnpm-strict / pnpm dlx (#307) `@huggingface/transformers` does a bare `import 'onnxruntime-common'` from its shipped `dist/transformers.node.mjs`, but never declares onnxruntime-common in its own `dependencies`. npm's flat node_modules (and pnpm with hoisting) place it on transformers' resolution path by accident; pnpm's isolated store only links a package's declared deps into its scope, so under pnpm-strict / `pnpm dlx` / `pnpx` the import dies with ERR_MODULE_NOT_FOUND before `analyze --embeddings` can run. Declaring onnxruntime-common in gitnexus' own deps (#2074) does not fix this under pnpm: Node resolves the bare specifier from transformers' module scope, not ours, and overrides/resolutions can only re-version an existing edge, never add the missing one (verified against a real `hoist=false` install — the declaration only changes which version wins the hoist, never whether the import resolves). Fix: install a synchronous, in-thread ESM resolution hook (`module.registerHooks`) right before the lazy transformers import that redirects `onnxruntime-common` to the copy gitnexus depends on — but only when the default resolver fails. On npm / hoisted layouts the default resolver succeeds first and the hook never fires, so working setups are unchanged. The hook only intercepts the exact `onnxruntime-common` specifier on failure, so it can never mask an unrelated resolution error; onnxruntime-node's native binding still loads normally from transformers' own scope. `registerHooks` (sync, in-thread, single inline closure) is preferred over the older `module.register` (async, off-thread, now deprecated — DEP0205, removed in Node 26): the redirect is a one-line conditional that needs no worker thread, no separate hook module, and no `data` marshalling. It is available on Node >= 22.15; on older runtimes the helper is a graceful no-op (the gitnexus engines floor is >= 22.0.0, and the import still resolves on hoisted layouts there). Chosen over bundling transformers (the build is tsc-only, and transformers carries native onnxruntime-node + WASM onnxruntime-web assets that bundle poorly). Installation is idempotent, best-effort, and lazy — only on the local-embedding path, so it never affects analysis, the parse workers, or HTTP embedding mode. Validated end-to-end: the compiled resolver fixes a real pnpm `hoist=false` transformers install (ERR_MODULE_NOT_FOUND -> resolved). The separate `@ladybugdb/core` native-binary path under pure `pnpm dlx` is unchanged (#1967 handles that gracefully). Refs #307, #2069 Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * fix(embeddings): version-match the onnxruntime-common redirect target (#307) Prefer the onnxruntime-common that onnxruntime-node (the native binding transformers actually loads) depends on, so the redirected copy is version- matched to that binding even under `pnpm dlx` — where gitnexus' npm-style `overrides` block does not apply, because it is honoured only from a root manifest and gitnexus is a transitive dependency there. The walk resolves transformers' main entry (not its `exports`-blocked package.json) -> onnxruntime-node -> its onnxruntime-common, and falls back to gitnexus' own direct dependency when the chain can't be walked. Also corrects the doc comment that claimed the gitnexus copy was already "version-aligned". Addresses a PR #2139 tri-review finding (P2). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * fix(embeddings): narrow the onnxruntime-common resolve fallback to absence errors (#307) The resolve closure's `catch` swallowed every error from `nextResolve` and redirected, which would silently paper over a genuinely present-but-broken onnxruntime-common install. Only substitute gitnexus' copy when the specifier is actually absent (ERR_MODULE_NOT_FOUND, or ERR_PACKAGE_PATH_NOT_EXPORTED for an exports-broken copy); rethrow anything else. Adds a test that an unrelated error code rethrows. Addresses a PR #2139 tri-review finding (P3). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * test(embeddings): cover the onnxruntime-common resolver best-effort swallow path (#307) The outer try/catch in ensureOnnxRuntimeCommonResolvable() was untested. A throwing registerHooks spy drives it; the call must not throw (initEmbedder does not guard the return, so a throw would break `analyze --embeddings`). The vitest quirk that surfaced an earlier attempt applies to throwing mock factories, not a throwing spy implementation, so this is testable cleanly. Addresses a PR #2139 tri-review finding (P2). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * test(embeddings): tighten the onnxruntime-common redirect-URL assertion (#307) `/^file:\/\/.*onnxruntime-common/` matched the substring anywhere, so a lookalike path (e.g. `/x/onnxruntime-common-fake/`) would pass. Require an actual `/node_modules/onnxruntime-common/...js` segment so the assertion proves the redirect resolves to the real package, not just a string match. Addresses a PR #2139 tri-review finding (P3). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * refactor(embeddings): drop the no-op __resetOnnxRuntimeCommonResolverForTests seam (#307) The test helper reloads the resolver via vi.resetModules() + a fresh import(), which already re-initialises the module-level one-shot `attempted` flag to false. The __reset export it then called was therefore a no-op. Remove the test-only export and its call; isolation now rests solely on vi.resetModules(). Addresses a PR #2139 tri-review finding (P3). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * docs(embeddings): correct the onnxruntime-common resolver isolation comment (#307) The doc comment claimed the hook "never affects other tools' resolution". Once installed, `module.registerHooks` is process-global and its resolve closure runs for every subsequent resolution — it passes them all through untouched and only substitutes the exact `onnxruntime-common` specifier on genuine absence, at a cost of one string comparison. Also note `registerHooks` is @experimental and requires Node >= 22.15 (graceful no-op below that). Comment-only. Addresses a PR #2139 tri-review finding (P3). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The bug
Running
pnpm dlx gitnexus@latest serve(orpnpx gitnexus) crashes immediately with a raw native stack trace:Reproduced on Node v25.9.0, macOS arm64, gitnexus 1.6.5,
@ladybugdb/core0.16.1/0.17.0.Root cause
gitnexusdepends on@ladybugdb/core, whose native addon (lbugjs.node) is copied into place by a postinstall script (install.js) out of the per-platform sub-package (@ladybugdb/core-darwin-arm64, etc.).pnpm dlx/pnpx(and any install with--ignore-scripts) run ephemerally and do not execute lifecycle scripts, so the binary is never placed and the runtimedlopenfails. The prebuilt binary genuinely exists inside thecore-<platform>sub-package — it just never gets linked into place.Repro
What this PR changes
serve/mcp/analyzealready route throughcheckLbugNative()(viacreateLbugLazyAction), which detects the missing/unloadable binary and prints guidance instead of lettingdlopenthrow. However, that guidance only mentionedbunand--ignore-scripts— not the most common real-world trigger,pnpm dlx/pnpx.src/core/lbug/native-check.ts: extend the "binary missing" guidance to call out thepnpm dlx/pnpxcase and the fix:pnpm add -g gitnexus && pnpm approve-builds -gnpx/npm, which execute install scripts.README.md: add anERR_DLOPEN_FAILED/lbugjs.nodetroubleshooting section coveringpnpm dlx/pnpxandpnpm approve-builds.test/unit/lbug-native-check.test.ts: assert the newpnpm approve-buildsguidance is present.Verified end-to-end: with the binary removed,
tsx src/cli/index.ts servenow prints the actionable message and exits non-zero (code 1) instead of dumpingERR_DLOPEN_FAILED.What this PR does NOT do
It does not make
pnpm dlx gitnexuswork directly —dlxruns ephemerally and never executes the dependency's build script, so there is nothing gitnexus can do at that layer. The deeper fix (a runtime fallback that loads the binary directly from the@ladybugdb/core-<platform>sub-package even when postinstall was skipped) belongs in@ladybugdb/coreand is being filed separately there. This PR turns the crash into clear, actionable guidance.Refs #307 (doesn't work with pnpx). Related: #1939 (
npx gitnexuscrash on npm 11.x).🤖 Generated with Claude Code