fix(local-backend): #1178 skip vector index query on unsupported pla…#1181
Conversation
|
@Morieity is attempting to deploy a commit to the NexusCore Team on Vercel. A member of the Team first needs to authorize it. |
There was a problem hiding this comment.
Pull request overview
This PR ensures the local MCP backend’s semantic search respects the existing platform capability guard for LadybugDB VECTOR, avoiding the vector index query on unsupported platforms (notably Windows) and relying on the existing exact-scan fallback.
Changes:
- Gate
CALL QUERY_VECTOR_INDEX(...)behindisVectorExtensionSupportedByPlatform()inLocalBackend.semanticSearch(). - Add a regression unit test asserting the vector index query is skipped when VECTOR is unsupported.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
gitnexus/src/mcp/local/local-backend.ts |
Adds a platform capability guard to skip the VECTOR index query path and fall back to exact-scan behavior on unsupported platforms. |
gitnexus/test/unit/calltool-dispatch.test.ts |
Introduces a unit test and mocks platform capabilities to validate the guard behavior (needs a stronger assertion for the fallback path). |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| expect(queries.some((cypher: string) => cypher.includes('MATCH (e:CodeEmbedding)'))).toBe( | ||
| true, | ||
| ); |
There was a problem hiding this comment.
This test’s assertion that the exact-scan fallback ran is too weak: queries.some(cypher.includes('MATCH (e:CodeEmbedding)')) will already be true because the initial table existence check uses the same substring (MATCH (e:CodeEmbedding) RETURN COUNT(*)...). As written, the test could pass even if the exact-scan query never executes. Make the assertion target the exact-scan query more specifically (e.g., check for RETURN e.nodeId AS nodeId / e.embedding AS embedding), or assert call ordering/count (COUNT(*) query first, then the full embedding scan query).
| expect(queries.some((cypher: string) => cypher.includes('MATCH (e:CodeEmbedding)'))).toBe( | |
| true, | |
| ); | |
| expect( | |
| queries.some( | |
| (cypher: string) => | |
| cypher.includes('RETURN e.nodeId AS nodeId') && | |
| cypher.includes('e.embedding AS embedding'), | |
| ), | |
| ).toBe(true); |
| try { | ||
| bestChunks = await collectBestChunks(limit, async (fetchLimit) => { | ||
| const vectorQuery = ` | ||
| if (isVectorExtensionSupportedByPlatform()) { |
There was a problem hiding this comment.
We could output a message in verbose mode when the vector extension doesn't supported.
There was a problem hiding this comment.
You're absolutely right! I hadn't fully considered the observability aspect. I've implemented the necessary changes—could you please take another look?
CI Report✅ All checks passed Pipeline Status
Test Results
✅ All 7667 tests passed 1 test(s) skipped — expand for details
Code CoverageTests
📋 View full run · Generated by CI |
|
Claude finished @magyargergo's task in 2m 37s —— View job PR #1181 Review — fix(local-backend): skip vector index query on unsupported platformsTasks
Review bar for this PR
Understanding of the changeIssue #1178 identified that The fix wraps the entire Out of scope: fixing the underlying LadybugDB native crash, changing extension install policy, or unifying pool-adapter and lbug-adapter connection models. Findings[HIGH] CI TypeCheck failure — hard DoD block
[HIGH] Weak exact-scan fallback assertion — Copilot flag unresolved
[LOW] No observability on skipped VECTOR path
Definition of Done check
Validation assessmentTests added/updated: One new test Validation claimed by PR:
Validation verified from CI:
Missing / not verifiable:
Does missing validation block production readiness? Yes. The CI typecheck failure is a hard DoD block. The weak assertion is a material test correctness gap. Final verdictnot production-ready The code change itself is correct and minimal — the guard is placed in the right location, imports from the canonical capability module, and does not alter extension lifecycle or connection models. However, CI TypeCheck is failing, which is a hard DoD block regardless of test results. Additionally, the new regression test's fallback assertion is too weak to catch a broken exact-scan path (it is satisfied by the preliminary count query alone), which is the same issue Copilot raised and which was not addressed before this review. Both must be fixed: resolve the TypeScript error CI is catching, and tighten the assertion to check for |
|
@Morieity can you please look into these findings? 🙏 |
|
@magyargergo Regarding the observability of the fallback strategy, I’ve implemented changes based on Claude’s review and the project’s logging policy, and I’ve updated the unit tests for my revised assertion strategy. This is my first time contributing to such a large-scale project, and I realize I missed some details. Thank you for your friendly guidance! |
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| // Rare diagnostic: surface why we fell back to the exact scan path so | ||
| // operators can see at a glance that the VECTOR extension is missing on | ||
| // this runtime (e.g. Windows builds without the optional native | ||
| // dependency). Kept always-on to match the `logQueryTiming` convention | ||
| // in this file — the message fires at most once per semantic search. | ||
| console.error( | ||
| 'GitNexus [query:vector]: VECTOR index unavailable for this runtime; using exact scan fallback', | ||
| ); |
There was a problem hiding this comment.
The unsupported-platform branch logs a console.error on every semantic search. On platforms where VECTOR is intentionally disabled (e.g. win32), this is expected to happen on every query call and can flood stderr / telemetry and make MCP stderr noisy. Consider making this a one-shot warning (e.g. cache a boolean on LocalBackend, similar to warnedSiblingDrift), or gate it behind an opt-in env flag while still skipping the vector fast path.
There was a problem hiding this comment.
Since the code is placed within an else block, the logic is only triggered when an error occurs (assuming a platform-controlled fallback). Therefore, I don't believe it will result in excessive logging. If the fallback is triggered manually, I think it’s necessary to provide a log entry indicating what actually happened.
Even if we used a single boolean to toggle the display, it would still produce output in the terminal. In my view, there isn't much of a practical difference between the two approaches.
|
Claude finished @magyargergo's task in 3m 30s —— View job PR #1181 Review — fix(local-backend): skip vector index query on unsupported platformsTasks
Review bar for this PR
Understanding of the changeIssue #1178: How this differs from the core path: Behavior change on unsupported platforms: Supported platforms: No change. Same guard wraps the same try/catch block; the Out of scope: Fixing the underlying LadybugDB native crash, changing extension install policy, or unifying pool-adapter and lbug-adapter connection models. Findings[MEDIUM] Always-on
|
| DoD item | Status |
|---|---|
CALL QUERY_VECTOR_INDEX(...) not issued on unsupported platforms |
satisfied — guard is correctly placed and wraps the entire fast path |
Unsupported platforms proceed to real exact-scan CodeEmbedding query |
satisfied — code path verified: bestChunks stays empty → count check → full scan query executes |
| Supported platform vector-index behavior unchanged | satisfied — beforeEach resets mock to true; existing happy-path test still exercises the vector path |
Guard uses canonical isVectorExtensionSupportedByPlatform() from capabilities.ts, no new policy |
satisfied — imported alongside the already-used getExactScanLimit from the same module |
No new loadVectorExtension() or extension lifecycle change |
satisfied — local backend does not call extension load |
| Diagnostics safe for MCP output (stderr, not stdout) | satisfied — console.error goes to stderr per established convention (lines 173–179) |
| Logging proportionate per DoD §2.8 | partially satisfied — diagnostic fires once per call (not per process), which is noisy on Windows; one-shot pattern is available in the same class |
| Tests prove vector-query skipped AND exact-scan query ran | satisfied — assertion now checks RETURN e.nodeId AS nodeId AND e.embedding AS embedding, which is distinct from the count query |
tsc --noEmit and npm test green in CI |
satisfied — latest CI run shows ✅ Typecheck and ✅ Tests (7667 passed, 0 failed) |
Validation assessment
Tests added/updated: One new test — 'skips vector index query when VECTOR is unsupported by the platform' in calltool-dispatch.test.ts.
Mock wiring: isVectorExtensionSupportedByPlatform is mocked via vi.hoisted and the capabilities.js module mock, which correctly intercepts the import in local-backend.ts. beforeEach resets to true, preventing cross-test contamination. console.error is spied on and restored in finally. All clean.
Assertion quality: The old weak assertion (cypher.includes('MATCH (e:CodeEmbedding)')) has been replaced with a compound check for RETURN e.nodeId AS nodeId AND e.embedding AS embedding. The count query (RETURN COUNT(*) AS cnt LIMIT 1) does not contain either of those projection fields. The full scan query contains both. This is a meaningful regression guard. The Copilot concern is resolved.
Validation claimed by PR:
platform-capabilities.test.ts: 2 passed (local)calltool-dispatch.test.ts: 67 passed (local)tsc --noEmit: passed (local)
Validation verified from CI:
- ✅ Typecheck (
tsc --noEmit) — passes (the earlier TypeCheck failure from the previous review iteration is resolved) - ✅ Unit tests, 3 platforms — 7667 passed, 0 failed
- ✅ E2E — passes
Missing / not-verifiable:
- No test explicitly asserts that a supported-platform call does issue
QUERY_VECTOR_INDEX. The existing happy-path'dispatches query tool'test covers this implicitly through thebeforeEachmock reset. Acceptable for a small guard change. - No Windows smoke test with real LadybugDB VECTOR (unit mock only). This is acceptable — the native crash cannot be reproduced in CI without Windows + real native binary, and the mock correctly validates the guard logic.
- The always-on diagnostic has test coverage (
consoleErrorassertion), but the test only calls once so it does not expose the repetition behavior.
Does missing validation block production readiness? No. CI is green. The critical test correctness gap from the prior review (weak fallback assertion) is resolved. The remaining concern (logging proportionality) is medium severity.
Final verdict
production-ready with minor follow-ups
The core change is correct and minimal: the guard is in the right place, imports from the canonical capability module, does not alter extension lifecycle, and the previously-weak test assertion has been tightened to check for the actual scan projection (RETURN e.nodeId AS nodeId + e.embedding AS embedding). CI is fully green. The only material outstanding concern is that console.error fires on every semantic search on Windows rather than once per instance — the warnedSiblingDrift pattern exists in the same class and is the straightforward fix. This does not block merge given the logQueryTiming precedent for per-query stderr output, but the author's stated rationale ("triggered only when an error occurs") is factually incorrect and should be corrected before this becomes a documentation debt. The one-shot deduplication is recommended as a follow-up.
|
@Morieity can you please address this console.log reporting issue in this PR? 🙏 |
|
Yes,i can |
…unsupported platforms Co-authored-by: Copilot <copilot@github.com>
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 2 out of 2 changed files in this pull request and generated no new comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
@magyargergo I have been solve this console.log problem |
|
Note: this pr doesn't solve to root cause of the issue, as this was pointed out by one of our community members, we need to upgrade the @ladybug/core version to 0.16.0. We will need to do some preliminary works in order to migrate to the new version. |
Summary
This PR makes the local MCP semantic search path respect the existing VECTOR platform capability guard.
GitNexus already treats LadybugDB VECTOR as unsupported on
win32viaisVectorExtensionSupportedByPlatform(). The core embedding pipeline uses that guard before attempting to load or use VECTOR. However,src/mcp/local/local-backend.tsstill had one directCALL QUERY_VECTOR_INDEX(...)path that did not check the same capability first.This PR fixes that remaining inconsistency by skipping the VECTOR index fast path on unsupported platforms and letting the existing exact-scan fallback run.
Background
On Windows,
@ladybugdb/corecan terminate the process with a native access violation when executing:Observed exit code:
This is a native crash, so JavaScript
try/catchcannot reliably recover from it once the unsafe native path is entered.The current main branch already has the primary guard:
src/core/platform/capabilities.tsisVectorExtensionSupportedByPlatform('win32') === falsesrc/core/lbug/lbug-adapter.tsloadVectorExtension()checks platform support before calling into LadybugDB VECTOR loadingThe local MCP backend should follow the same capability policy.
Problem
LocalBackend.semanticSearch()insrc/mcp/local/local-backend.tsdirectly attempted:without checking whether VECTOR is supported on the current platform.
On Windows this usually falls back after query failure, because the pool path does not load the VECTOR extension by default. However, it is still inconsistent with the rest of the codebase:
core embedding semantic search checks VECTOR availability first
local MCP semantic search attempted the vector index query directly
unsupported platforms should go straight to exact-scan fallback
Change
This PR updates
LocalBackend.semanticSearch()to reuse:Behavior after this change:
Supported platforms:
unchanged
still try
CALL QUERY_VECTOR_INDEX(...)fall back to exact scan if the vector query fails or returns no chunks
Unsupported platforms:
skip
CALL QUERY_VECTOR_INDEX(...)go directly to the existing exact-scan fallback
This does not disable semantic search. It only skips the LadybugDB VECTOR index fast path on platforms where VECTOR is already marked unsupported.
Why this approach
This follows the existing project architecture instead of introducing a new policy.
I intentionally did not call
loadVectorExtension()fromlocal-backend.ts, because local backend uses the repo-scopedpool-adapterconnection model, whileloadVectorExtension()belongs to the singletonlbug-adapterpath. Mixing those connection models would make the change larger and riskier.I also did not add a new environment variable or change extension install policy. This PR only makes the local backend respect the existing platform capability helper.
Tests
Added a regression test in
test/unit/calltool-dispatch.test.tsverifying that when VECTOR is unsupported by the platform:LocalBackenddoes not send aQUERY_VECTOR_INDEXquerythe code proceeds to the exact-scan
CodeEmbeddingquery pathValidation run locally:
Results:
Scope
This PR does not attempt to fix the underlying LadybugDB native crash itself. That should be addressed upstream in
@ladybugdb/core.This PR only ensures that GitNexus' local MCP semantic search path does not bypass the existing platform capability guard.