fix: probe CUDA availability in MCP embedder before requesting it by davidbeesley · Pull Request #534 · abhigyanpatwari/GitNexus

davidbeesley · 2026-03-27T02:44:05Z

Summary

Add CUDA availability probe to the MCP embedder so semantic search works on systems without CUDA.

Motivation / context

The MCP embedder blindly attempted CUDA initialization on Linux. When CUDA libraries were absent, the failed attempt poisoned ONNX Runtime's native state, causing the CPU fallback to fail with the same CUDA error. This silently disabled all semantic vector search — the catch in semanticSearch() swallowed the error and returned [], so queries fell back to BM25-only with no visible indication.

The core embedder (src/core/embeddings/embedder.ts) already had isCudaAvailable() / hasOrtCudaProvider() guards that probe for CUDA libraries before requesting them. The MCP embedder was missing these guards.

Areas touched

gitnexus/ (CLI / core / MCP server)
gitnexus-web/ (Vite / React UI)
.github/ (workflows, actions)
eval/ or other tooling
Docs / agent config only (AGENTS.md, CLAUDE.md, .cursor/, llms.txt, etc.)

Scope & constraints

In scope

Port hasOrtCudaProvider() and isCudaAvailable() from the core embedder to the MCP embedder
Skip CUDA attempt entirely when system lacks required libraries
Affects both MCP server and CLI query command (both use the same MCP embedder)

Explicitly out of scope / not done here

Refactoring to share the CUDA probe between core and MCP embedders

Implementation notes

hasOrtCudaProvider() resolves onnxruntime-node from transformers.js's module scope and checks for libonnxruntime_providers_cuda.so
isCudaAvailable() additionally probes ldconfig -p and CUDA_PATH/LD_LIBRARY_PATH for libcublasLt.so.12
Device list construction now skips CUDA when probe returns false, going straight to CPU

Testing & verification

cd gitnexus && npx tsc --noEmit
cd gitnexus && npm test — 128 files, 4457 tests passed
Manual: npx gitnexus query "retry with exponential backoff" -r GitNexus now prints "Embedding model loaded (cpu)" and returns semantic results (previously returned BM25-only)

Risk & rollout

No behavioral change on systems with CUDA — the probe returns true and CUDA is attempted as before
On systems without CUDA, semantic search now works instead of silently failing
No migration or config changes needed

Checklist

PR body meets repo minimum length
If AGENTS.md / overlays changed: N/A
No secrets, tokens, or machine-specific paths committed

vercel · 2026-03-27T02:44:08Z

@davidbeesley is attempting to deploy a commit to the NexusCore Team on Vercel.

A member of the Team first needs to authorize it.

magyargergo · 2026-03-27T05:53:36Z

@claude could this be supported on Windows and MacOS as well?

github-actions · 2026-03-27T05:53:46Z

Claude finished @magyargergo's task in 56s —— View job

Cross-platform support analysis

Short answer: Windows is already handled; macOS works correctly (CPU-only, by design).

Here's the breakdown:

Windows ✅ Already supported

The MCP embedder already has explicit Windows handling at line 93–94:

const isWindows = process.platform === 'win32';
const gpuDevice = isWindows ? 'dml' : (isCudaAvailable() ? 'cuda' : 'cpu');

On Windows, DirectML (dml) is used for GPU acceleration (via DirectX 12) — isCudaAvailable() is never called. If DML fails, it falls back to CPU. This mirrors exactly how the core embedder (src/core/embeddings/embedder.ts) handles Windows.

macOS ✅ Already works correctly (CPU-only)

The probe functions are Linux-specific by design:

hasOrtCudaProvider() checks for libonnxruntime_providers_cuda.so — a Linux shared object. On macOS this path doesn't exist, so it returns false.
isCudaAvailable() calls ldconfig -p (a Linux tool that doesn't exist on macOS), which throws and is caught. It then checks LD_LIBRARY_PATH (also unused on macOS where DYLD_LIBRARY_PATH is the equivalent).

The result: on macOS, isCudaAvailable() safely returns false → device falls back to CPU. This is correct — onnxruntime-node's CUDA provider only ships binaries for Linux/x64.

What's missing: macOS GPU acceleration (Metal/MPS)

Neither this PR nor the existing core embedder attempts Metal Performance Shaders on macOS. The onnxruntime-node package doesn't currently expose an MPS backend in a way that's safely probed the same way CUDA is. This is a pre-existing limitation, and explicitly out of scope for this PR (which is specifically fixing the CUDA-poisoning-ORT-state bug on Linux).

Summary:

Platform	GPU Path	Probe Needed?	Status
Linux	CUDA	Yes — ORT state can be poisoned	✅ Fixed by this PR
Windows	DirectML	Not critical (DML is stable)	✅ Already handled
macOS	None (CPU only)	N/A	✅ Works correctly

No changes are needed in this PR for Windows or macOS support. A future enhancement could add a safe hasDmlProvider() probe for Windows (to mirror the CUDA pattern), and MPS support on macOS would require upstream onnxruntime-node changes.

magyargergo · 2026-03-27T05:58:18Z

@claude check you previous comment and I meant the ORT because all the checks are performed on linux kernel.

github-actions · 2026-03-27T05:58:28Z

Claude finished @magyargergo's task in 1m 4s —— View job

Corrected cross-platform ORT analysis

You're right to call that out. Here's the more precise picture:

`hasOrtCudaProvider()` is hardcoded to Linux

The key line (MCP embedder L29, core embedder L45):

return existsSync(join(ortPath, 'bin', 'napi-v6', 'linux', arch, 'libonnxruntime_providers_cuda.so'));

The linux segment is hardcoded. On Windows or macOS this path never exists, so hasOrtCudaProvider() always returns false — which makes isCudaAvailable() return false immediately as well. Neither probe does any real work on non-Linux.

What actually happens per platform

Platform	CUDA probe called?	DML probe?	Net behavior
Linux	Yes — `ldconfig -p` + `LD_LIBRARY_PATH` scan	N/A	✅ Fixed by this PR
Windows	No — `isWindows` check skips it entirely	❌ None	DML is attempted blindly; falls back to CPU on failure
macOS	`hasOrtCudaProvider()` returns `false` immediately (hardcoded `linux` path)	N/A	Falls straight to CPU — correct, but via a Linux-only guard

The real gap: Windows has no `hasDmlProvider()` probe

On Windows, dml is attempted unconditionally. If ORT's DML initialization fails, the code catches the error and retries with CPU (L95–127 in the MCP embedder). The question is whether a failed DML attempt poisons ORT's native state the same way a failed CUDA attempt does on Linux.

The original CUDA bug was specifically that onnxruntime-node makes an uncatchable native crash when CUDA libs are missing (per the core embedder comments, L144). DML may fail more gracefully (a catchable JS exception) since DirectX 12 is a standard Windows component — but this hasn't been explicitly verified.

Summary

My previous comment was correct in outcome (Windows and macOS produce working embeddings) but imprecise in reasoning. The probe functions are Linux-only by construction. The Windows DML path relies on ORT's own error handling rather than a pre-flight check. If Windows DML failures turn out to exhibit the same native state-poisoning as Linux CUDA failures, a hasDmlProvider() guard would be the right follow-up — but that's out of scope for this PR, which is narrowly fixing the Linux CUDA bug.

github-actions · 2026-03-27T06:13:06Z

CI Report

✅ All checks passed

Pipeline Status

Stage	Status	Details
✅ Typecheck	`success`	tsc --noEmit
✅ Tests	`success`	unit tests, 3 platforms
✅ E2E	`success`	gitnexus-web changes only

Test Results

Tests	Passed	Failed	Skipped	Duration
4669	4668	0	1	184s

✅ All 4668 tests passed

1 test(s) skipped — expand for details

buildTypeEnv > known limitations (documented skip tests) > Ruby block parameter: users.each { |user| } — closure param inference, different feature

Code Coverage

Tests

Metric	Coverage	Covered	Base	Delta	Status
Statements	71.99%	12770/17738	72.24%	📉 -0.2	🔴 ██████████████░░░░░░
Branches	61.58%	8472/13756	62.16%	📉 -0.6	🔴 ████████████░░░░░░░░
Functions	76.09%	1114/1464	75.86%	📈 +0.2	🟢 ███████████████░░░░░
Lines	74.44%	11439/15365	74.72%	📉 -0.3	🔴 ██████████████░░░░░░

_{📋 View full run · Generated by CI}

magyargergo · 2026-03-28T14:43:26Z

⚠️ Upcoming Prettier formatting — rebase instructions

PR #563 adds Prettier as the code formatter for the repo. When it merges, the bulk format commit will touch ~350 files (style-only: whitespace, quotes, trailing commas). Your branch will likely conflict.

After #563 merges, rebase your branch:

git fetch origin
git checkout <your-branch>
git rebase origin/main

# Conflicts will be formatting-only — accept your version:
git checkout --theirs .
git add .
git rebase --continue

# Then re-format your branch to match the new style:
npx prettier --write .
git add -A
git commit -m "style: apply prettier formatting"
git push --force-with-lease

New setup step: Run npm install at the repo root (not just in gitnexus/) to get prettier + activate the pre-commit hook. The hook auto-formats staged files on every commit going forward.

magyargergo · 2026-05-11T07:04:40Z

Please submit a new PR if this is still relevant

davidbeesley added 2 commits March 26, 2026 20:35

fix: probe CUDA availability in MCP embedder before requesting it

f515dcf

fix: remove noisy comment

82f16be

magyargergo closed this May 11, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix: probe CUDA availability in MCP embedder before requesting it#534

fix: probe CUDA availability in MCP embedder before requesting it#534
davidbeesley wants to merge 2 commits into
abhigyanpatwari:mainfrom
davidbeesley:fix/mcp-cuda-probe

davidbeesley commented Mar 27, 2026

Uh oh!

vercel Bot commented Mar 27, 2026

Uh oh!

magyargergo commented Mar 27, 2026

Uh oh!

github-actions Bot commented Mar 27, 2026 •

edited

Loading

Uh oh!

magyargergo commented Mar 27, 2026

Uh oh!

github-actions Bot commented Mar 27, 2026 •

edited

Loading

Uh oh!

github-actions Bot commented Mar 27, 2026

Uh oh!

magyargergo commented Mar 28, 2026

Uh oh!

magyargergo commented May 11, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

davidbeesley commented Mar 27, 2026

Summary

Motivation / context

Areas touched

Scope & constraints

Implementation notes

Testing & verification

Risk & rollout

Checklist

Uh oh!

vercel Bot commented Mar 27, 2026

Uh oh!

magyargergo commented Mar 27, 2026

Uh oh!

github-actions Bot commented Mar 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Cross-platform support analysis

Windows ✅ Already supported

macOS ✅ Already works correctly (CPU-only)

What's missing: macOS GPU acceleration (Metal/MPS)

Uh oh!

magyargergo commented Mar 27, 2026

Uh oh!

github-actions Bot commented Mar 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Corrected cross-platform ORT analysis

hasOrtCudaProvider() is hardcoded to Linux

What actually happens per platform

The real gap: Windows has no hasDmlProvider() probe

Summary

Uh oh!

github-actions Bot commented Mar 27, 2026

CI Report

Pipeline Status

Test Results

Code Coverage

Tests

Uh oh!

magyargergo commented Mar 28, 2026

⚠️ Upcoming Prettier formatting — rebase instructions

Uh oh!

magyargergo commented May 11, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

github-actions Bot commented Mar 27, 2026 •

edited

Loading

github-actions Bot commented Mar 27, 2026 •

edited

Loading

`hasOrtCudaProvider()` is hardcoded to Linux

The real gap: Windows has no `hasDmlProvider()` probe