fix(embeddings): guard local ONNX runtime on macOS Intel before transformers.js import#1987
Conversation
…formers.js import macOS Intel (darwin/x64) crashed on `gitnexus analyze --embeddings` with a raw `Cannot find module .../bin/napi-v6/darwin/x64/onnxruntime_binding.node`: both embedders imported @huggingface/transformers at module scope, which loads onnxruntime-node and resolves the (unshipped) native binding before any backend could be selected. ONNX_WEB_BACKEND=wasm could not help (#1516). - Add a native-free runtime-support guard (getLocalEmbeddingRuntimeBlocker) that returns a clear, actionable message on darwin/x64 and null elsewhere. - Convert both the core and MCP embedders to type-only transformers imports plus a guarded lazy `await import()`; throw the blocker in initEmbedder before any transformers.js / onnxruntime-node resolution. HTTP mode is unaffected. - Surface the blocker cleanly in the analyze CLI instead of the misleading "installation may be corrupt" module-not-found hint. - Add unit tests: guard DI, lazy-import timing, core+MCP darwin/x64 rejection, and HTTP mode not blocked. Refs #1515, #1516 Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
`gitnexus doctor` now reports whether the local embedding runtime can load on the current platform. macOS Intel (darwin/x64) users see up front that local embeddings are unavailable — plus the recommended alternatives — instead of only discovering it when `analyze --embeddings` fails (#1515). The Embeddings section gains a "Support" line; on a blocked platform the full guidance (reused from getLocalEmbeddingRuntimeBlocker, single source of truth) is written to stderr. doctor stays import-safe — it never loads transformers.js or onnxruntime-node, so it runs cleanly on macOS Intel. Refs #1515 Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
magyargergo
left a comment
There was a problem hiding this comment.
Tri-review — PR #1987 (macOS Intel embedding runtime guard)
Verdict: no blocking issues — the fix is correct and merge-safe. Codex (an independent engine) and all six Claude lanes converged on this. That's the strong form of the signal: the one independent engine agrees, not only correlated personas.
Methods (3). GitNexus swarm (risk-architect, test-ci-verifier) + Compound-Engineering personas (correctness, adversarial, maintainability, testing) — all Claude (6 lanes, correlated) — plus Codex (1 lane, independent engine, ran live). Read it accordingly: cross-engine = Codex and a Claude lane; Claude-only = "consistent across personas," not independent confirmation.
What the review validated (credit)
- Lazy import is complete. A repo-wide sweep found no remaining module-scope import of
@huggingface/transformers/onnxruntime-nodeon any runtime path (analyze→pipeline→embedder, MCP search, doctor); the only two references areimport type(erased at compile time). (Codex + adversarial — cross-engine) - The guard precedes every native touch. It throws before the singleton state, before
isCudaAvailable()/hasOrtCudaProvider()'srequire.resolve(which is path-only — nodlopen), and before the lazyawait import(). On darwin/x64 nothing reaches the native load. (Codex + correctness + adversarial + risk) - No supported-platform regression. Singleton/
initPromiseerror-reset, device fallback, andORT_LOG_LEVELordering are unchanged; HTTP mode (embedText/embedBatch/embedQuery) bypasses the guard. (risk + correctness + Codex) - Tests don't false-pass. The lazy-import spy has a positive control;
vi.resetModules()+mockClear()prevent mock leakage; platform stubs restore infinally; a re-added top-level import would fail the timing tests. (Codex + testing + adversarial)
Inline findings — both are test-coverage gaps on otherwise-correct code (not bugs)
- [P2]
analyze.ts:1220— the new error branch has no dedicated test. - [P2]
embedding-runtime-support.test.ts— MCPembedQuerydarwin/x64 path untested.
Lower priority (P3)
- Message wasm-gap (
runtime-support.ts:55): the blocker rebutsONNX_WEB_BACKEND=wasm, but a user who setGITNEXUS_EMBEDDING_DEVICE=wasm(a valid device perconfig.ts) isn't directly addressed — consider naming that knob too. (adversarial) - Tautological test (
embedding-runtime-support.test.ts:88): the "defaults platform/arch" assertion isnull === nullon the CI host and doesn't exercise theprocess.platformfallback — stub darwin/x64 to make it falsifiable. (testing) - Error-branch ordering (
analyze.ts): the blocker check sits afterisHfDownloadFailure; harmless (no substring overlap) but the explicit platform message could logically precede the network heuristics. (Codex) doctor.tsredundancy:localEmbeddingDoctorStatusre-resolves platform/arch the guard already resolved — cosmetic. (maintainability)
Refuted / out of scope
- Refuted: concurrency half-set singleton (guard throws before any state mutation); substring false-positive misrouting (lead string is unique — no overlap with HF/WAL patterns);
require.resolvetriggering a native load (path-only, and unreachable on darwin/x64). - Pre-existing, not this PR: MCP semantic search degrades silently to BM25 on darwin/x64 when an index already has embeddings (the
catch{}predates this PR — worth a follow-up to surface the blocker once); the darwin/x64 block is a deliberate single-pair allowlist (won't catch a hypothetical future binding-drop on another platform); theanalyze.tserror-dispatch chain keeps growing (extract aclassifyAnalysisErrorlater).
CI: MERGEABLE; required checks pending (branch-protection BLOCKED at review time). Coverage: full diff read (8 files, +420/-8) — no partial-coverage caveat.
Automated multi-tool digest (GitNexus swarm + Compound-Engineering personas + Codex). Verify before acting.
CI Report✅ All checks passed Pipeline Status
Test Results
✅ All 10969 tests passed 13 test(s) skipped — expand for details
Code CoverageTests
📋 View full run · Generated by CI |
…olish Resolves the maintainer tri-review feedback on PR #1987: - Add the analyze error-branch test (new analyze-local-embedding-error.test.ts): a darwin/x64 blocker routes to the clean local-embedding-unsupported message (exit 1), not the module-not-found "installation may be corrupt" branch, and wins over isHfDownloadFailure even when both match (guards the reorder below). - Cover the MCP embedQuery darwin/x64 paths — HTTP bypass via httpEmbedQuery without importing transformers, and local-mode rejection before the import. - Make the "defaults platform/arch" guard test falsifiable by stubbing the platform, instead of asserting null === null on the CI host. - analyze.ts: evaluate the blocker-message branch before the network-heuristic isHfDownloadFailure branch so the explicit platform message takes priority. - runtime-support.ts: the blocker message now also notes GITNEXUS_EMBEDDING_DEVICE =wasm/cpu cannot help, not only ONNX_WEB_BACKEND=wasm. - doctor.ts: resolve platform/arch once instead of re-resolving after the guard. Refs #1515, #1516 Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Problem
gitnexus analyze --embeddingscrashes on macOS Intel (darwin/x64) with:Both embedders imported
@huggingface/transformersat module scope, which loadsonnxruntime-node, which immediately resolves the platform-native.nodebinding.onnxruntime-node@1.24.xships nodarwin/x64prebuild, so the process dies at import time — before any device/backend selection. That's whyONNX_WEB_BACKEND=wasm/--embedding-device wasmcan't help (#1516), and why users previously saw the misleading "installation may be corrupt, reinstall" hint from the CLI'sCannot find modulebranch.Closes #1515. Refs #1516.
Fix
Make GitNexus fail gracefully with a clear, GitNexus-authored message before importing transformers.js / onnxruntime-node on unsupported local runtimes.
core/embeddings/runtime-support.ts—getLocalEmbeddingRuntimeBlocker({platform?, arch?}): string | null. Native-/transformers-free so it can be consulted before the import that would crash. Returns a clear blocker ondarwin/x64,nullelsewhere. The message names macOS Intel, the missingdarwin/x64ONNX native binding, explicitly statesONNX_WEB_BACKEND=wasmdoes not help, and lists alternatives (run without--embeddings,GITNEXUS_EMBEDDING_URLHTTP provider, Linux/Docker, Apple Silicon, future build). PlusisLocalEmbeddingRuntimeBlockerMessage()for clean CLI surfacing.core/embeddings/embedder.ts+mcp/core/embedder.ts— top-levelimport {pipeline, env, …}→import type {…}(erased at compile time); runtime values now via a guarded lazyawait import('@huggingface/transformers')inside the init promise. The guard is thrown ininitEmbedderright after the existing HTTP-mode check, before any transformers.js /onnxruntime-noderesolution (including therequire.resolveinhasOrtCudaProvider, which is path-only and safe).cli/analyze.ts— present the blocker viacliError(clean, actionable, no stack trace) instead of the misleading module-not-found "reinstall" hint. NewRecoveryHinttaglocal-embedding-unsupported.cli/doctor.ts— the Embeddings section now shows aSupport:line. On a blocked platform it prints the full guidance (reused from the guard) to stderr, so macOS Intel users see the limitation proactively viagitnexus doctorinstead of only whenanalyze --embeddingsfails.doctorstays import-safe (never loads transformers.js/onnxruntime).Scope note
Both the analyze/CLI embedder (
core/embeddings/embedder.ts) and the MCP search embedder (mcp/core/embedder.ts) had the identical top-level import; both are fixed (same npm package). Leaving the MCP one would still crashgitnexus mcpsearch on macOS Intel.What this does / doesn't do
GITNEXUS_EMBEDDING_URL) remains fully usable on macOS Intel.darwin/x64binding; users are routed to working alternatives.Dependency pinning
Deliberately avoided. The fix is lazy import + explicit guard.
onnxruntime-node@^1.24.0and the transformers→ORT override are unchanged; pinning an older ORT would risk the override strategy and CUDA-on-Linux behavior. Nopackage.json/ lockfile changes.Validation
npx tsc --noEmit— clean.test/unit/embedding-runtime-support.test.ts— 13/13: guard DI (darwin/x64 blocks; arm64/linux/win32 → null; message content), lazy-import timing (spy proves guard/core/MCP modules don't load transformers, with a positive control), core+MCPinitEmbedderdarwin/x64 rejection before the lazy import (no rawCannot find module), and HTTP-mode-not-blocked (no network).doctorsupport-status helper unit-tested (darwin/x64 blocked; arm64/linux/win32 + HTTP supported); verified end-to-end by running the realdoctorcommand on linux and a simulated darwin/x64.npm run test:unit— 7180 pass / 2 skipped; 2 failures are the pre-existinganalyze-heap-respawnfull-suite-load flake (9/9 in isolation, untouched code).npm run test:cross-platform— 1245 pass / 0 failed.CHANGELOG.mduntouched.🤖 Generated with Claude Code