feat: add AWS Neptune as alternative graph database backend #360
naicud wants to merge 16 commits into
|
@naicud is attempting to deploy a commit to the NexusCore Team on Vercel. A member of the Team first needs to authorize it. |
CI Report: ✅ All checks passed
Test Results: ✅ All 2789 tests passed, 20 test(s) skipped
Generated by CI |
|
@claude ultra-think while reviewing this PR as a senior backend engineer and check whether these changes are backward compatible and a good architectural fit. Also, we prefer simplicity over overengineering! |
Technical Review
CRITICAL
1. Three
const [processRows, moduleRows, directModuleRows] = await Promise.all([
  executeQuery(repo.id, `...`).catch(() => []), // line 1446
  executeQuery(repo.id, `...`).catch(() => []), // line 1453
  d1Ids ? executeQuery(repo.id, `...`).catch(() => []) : ... // line 1460
]);
These hit LadybugDB even for Neptune-backed repos. Since LadybugDB is never initialized for Neptune repos, the
2. Cypher injection in
WHERE toLower(n.name) CONTAINS toLower('${query.replace(/'/g, "\'")}')
String interpolation with wrong escaping (
WHERE toLower(n.name) CONTAINS toLower($query)
3.
await sendCypher(client, `CREATE INDEX ON :\`${lbl}\`(id)`);
Neptune does not support
HIGH
4. Full-scan Neptune BM25 fallback in both
WHERE toLower(n.name) CONTAINS toLower($q) OR toLower(n.content) CONTAINS toLower($q)
This applies
5.
WHERE n._gen <> '${generation}' OR NOT exists(n._gen)
6. Unrelated change: this removes Swift language parsing support and should be in a separate PR.
MEDIUM
7. The function exists in
8. Stale
db: (entry as any).db,
9. No Neptune embedding cache invalidation after re-indexing
LOW
10. Cypher Console doesn't indicate which backend is active, so users won't know whether to write KuzuDB or Neptune syntax.
11.
12. Neptune adapter
13. Edge batch size of 25 is very small: 5000+ relationships = 200+ HTTP round-trips. Neptune can handle 100-250 per batch for simple MERGE patterns. |
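To make the injection finding concrete, here is a minimal sketch of how the search text could travel as a `$q` parameter while the limit is clamped to an integer and inlined as a literal. The helper names (`clampLimit`, `buildNameSearchQuery`) and the returned shape are assumptions for illustration, not code from this PR.

```typescript
// Hypothetical helpers: names and return shape are illustrative, not from the PR.
interface ParameterizedQuery {
  cypher: string;
  params: Record<string, unknown>;
}

// Clamp a user-supplied limit to a safe integer so it can be inlined as a literal.
function clampLimit(raw: unknown, fallback = 10, max = 100): number {
  const n = Number(raw);
  if (!Number.isFinite(n)) return fallback;
  return Math.max(1, Math.min(max, Math.trunc(n)));
}

function buildNameSearchQuery(query: string, rawLimit: unknown): ParameterizedQuery {
  const limit = clampLimit(rawLimit);
  return {
    // The search text is passed as $q, so quotes inside it cannot change the query.
    cypher:
      "MATCH (n) WHERE toLower(n.name) CONTAINS toLower($q) " +
      `RETURN n.id AS id, n.name AS name LIMIT ${limit}`,
    params: { q: query },
  };
}
```

The only value interpolated into the query string is a number that has already been truncated and bounded, so user input never reaches the query text.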
|
@naicud Check the above technical issues. The technical issues can be solved, but here is a deeper issue to give you context on what could go wrong. While implementing the MCP tools to query Kuzu, I had to build abstractions that shield the LLM from the dialect entirely, or at least guide it, because KuzuDB isn't as popular as Neo4j and LLMs kept defaulting to Neo4j-like syntax (probably because their training data is richer for Neo4j). The difficult part was finding the sweet spot: avoiding over-prompting that eats away at LLM flexibility, while still preventing wrong query generation, or at least returning an error message with guidance so the LLM knows what went wrong and how to fix it in the next tool call. So you would need to do some trial-and-error testing to see whether the queries execute without issues and deliver the same level of quality. cc: @magyargergo |
|
This is an interesting area. I think we need to reevaluate this, since we migrated to ladybug and they might have improved on this. |
- NeptuneAdapter for read queries (openCypher via @aws-sdk/client-neptunedata)
- Neptune ingestion pipeline (batch UNWIND, 500-node batches)
- Neptune vector search support
- CLI flags: --db neptune, --neptune-endpoint, --neptune-region
- Server API: /api/db/test, PATCH /api/repo/db
- Settings panel: Database Backend configuration section
- Cypher console component for direct query execution
- IAM+SigV4 auth via AWS SDK credential chain
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Three executeQuery() calls in the impact enrichment Promise.all block bypassed Neptune dispatch, causing empty results for affected processes and modules on Neptune-backed repos. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
… query Neptune search endpoint used string interpolation with wrong escaping (backslash-quote instead of Cypher double-quote), creating a Cypher injection vulnerability. Now uses executeParameterized with $q param. LIMIT remains inline as Neptune openCypher does not support parameterized LIMIT values (must be integer literal). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Neptune does not support CREATE INDEX via openCypher (Neo4j DDL syntax). The try/catch silently swallowed the error, creating the false impression that indexes were being created. Neptune manages indexes automatically via its DFE query engine. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Full-scan toLower(n.content) CONTAINS over all source code nodes exceeds Neptune's 120s query timeout on non-trivial codebases. Neptune search now only matches on n.name. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
NOT exists(n.property) behavior varies by Neptune engine version. Replaced with n._gen IS NULL which is standard openCypher. Also parameterized the generation value instead of string interpolation. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The embedding pipeline writes directly to LadybugDB and doesn't support Neptune yet. Instead of silently generating embeddings that never get stored, explicitly skip with an informative message. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
RegistryEntry already has db?: DbConfig in its interface. The as any cast was hiding type errors. Aligned RepoHandle.db type to use DbConfig. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The Neptune embedding cache was never cleared when repos were re-indexed, causing stale semantic search results if the MCP server was running during re-indexing. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add a small badge to the Cypher Console header displaying "LadybugDB" or "Neptune" to help users write appropriate syntax for the active backend. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
NeptuneAdapter.config was private but neptune-vector-search needed it for cache keys, using ugly (adapter as unknown as ...) casts. Added a public id getter returning endpoint:port. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
25 edges per batch caused 200+ HTTP round-trips for typical codebases. Neptune handles 100 edges per MERGE batch comfortably. Adaptive sizing will still halve automatically on timeout. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
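The batch-size commit mentions adaptive sizing that halves on timeout. A rough sketch of that idea follows; `sendBatch` and the timeout check are stand-ins assumed for illustration, not the PR's ingestion code. For scale: 5000 edges take 200 round-trips at 25 per batch and 50 at 100 per batch.

```typescript
// Hypothetical sketch: sendBatch and isTimeout are stand-ins, not the PR's code.
type SendBatch<T> = (batch: T[]) => Promise<void>;

function isTimeout(err: unknown): boolean {
  return err instanceof Error && /timeout|timed out/i.test(err.message);
}

// Sends items in batches, halving the batch size whenever a batch times out.
// Returns the number of attempts made (including failed ones).
async function sendInAdaptiveBatches<T>(
  items: T[],
  sendBatch: SendBatch<T>,
  initialSize = 100,
): Promise<number> {
  let size = initialSize;
  let attempts = 0;
  let i = 0;
  while (i < items.length) {
    const batch = items.slice(i, i + size);
    try {
      attempts++;
      await sendBatch(batch);
      i += batch.length; // advance only after a successful send
    } catch (err) {
      // Give up on non-timeout errors, or when the batch cannot shrink further.
      if (!isTimeout(err) || size === 1) throw err;
      size = Math.max(1, Math.floor(size / 2));
    }
  }
  return attempts;
}
```

A failed batch is retried from the same offset with the smaller size, so no item is skipped or sent twice by this loop.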
28ff150 to
f2176b0
|
@abhigyanpatwari I think some of the fixes have now been ported into the PR |
|
@claude ultra-think while evaluating the architectural fit of this PR. We want to support multiple data sources, but it has to be done right, otherwise the architecture will drift. |
|
Claude finished @magyargergo's task in 2m 26s
Architectural Review: AWS Neptune Backend
|
| Gap | Location | Impact |
|---|---|---|
| loadEmbeddingsToNeptune defined but never called | analyze.ts (no call site) | Semantic search always empty for Neptune repos |
| Embedding cache invalidation not called after re-index | neptune-vector-search.ts:221 (export unused) | Stale vectors on re-index |
| tree-sitter-swift removed | package.json | Unrelated, should be separate PR |
| /api/graph LadybugDB-only | api.ts:197 | Graph viz broken for Neptune repos |
Recommended Path Forward
The fix that would prevent all future drift is small:
1. Create an LbugAdapter class implementing IDbQueryAdapter: wrap the existing lbug module functions.
2. Add getAdapter(repo): IDbQueryAdapter to LocalBackend: resolves once, removes all if (neptune) branches.
3. Delete getDbConfigFromEntry from api.ts and use the getDbConfig already in repo-manager.ts.
4. Route /api/query through LocalBackend rather than constructing adapters directly; this also fixes the per-request construction issue.
5. Wire loadEmbeddingsToNeptune into analyze.ts when --embeddings + --db neptune are set.
Steps 1–4 are a modest refactor that would collapse ~150 lines of duplicated dispatch down to a single resolution site. The PR is mergeable without them, but without them the next backend will double the drift.
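As a rough illustration of steps 1 and 2, the sketch below shows one way a single resolution site could look. Only the names IDbQueryAdapter, LbugAdapter and getAdapter come from the review; the method signatures, the `DbConfig` shape for LadybugDB, and the `AdapterRegistry` class are assumptions made for this example.

```typescript
// Illustrative only: signatures and the lbug DbConfig shape are assumptions.
type Row = Record<string, unknown>;

interface IDbQueryAdapter {
  executeQuery(cypher: string): Promise<Row[]>;
  executeParameterized(cypher: string, params: Record<string, unknown>): Promise<Row[]>;
  close(): Promise<void>;
}

type DbConfig =
  | { type: "lbug"; path: string }
  | { type: "neptune"; endpoint: string; region: string; port: number };

type RunFn = (cypher: string, params?: Record<string, unknown>) => Promise<Row[]>;

// Wraps an injected query function, so this layer never imports the lbug module directly.
class LbugAdapter implements IDbQueryAdapter {
  private readonly run: RunFn;
  constructor(run: RunFn) {
    this.run = run;
  }
  executeQuery(cypher: string): Promise<Row[]> {
    return this.run(cypher);
  }
  executeParameterized(cypher: string, params: Record<string, unknown>): Promise<Row[]> {
    return this.run(cypher, params);
  }
  async close(): Promise<void> {}
}

// Resolves one adapter per repo and caches it; callers never branch on backend type.
class AdapterRegistry {
  private readonly adapters = new Map<string, IDbQueryAdapter>();
  private readonly create: (config: DbConfig) => IDbQueryAdapter;
  constructor(create: (config: DbConfig) => IDbQueryAdapter) {
    this.create = create;
  }
  getAdapter(repoId: string, config: DbConfig): IDbQueryAdapter {
    let adapter = this.adapters.get(repoId);
    if (!adapter) {
      adapter = this.create(config);
      this.adapters.set(repoId, adapter);
    }
    return adapter;
  }
}
```

With this shape, adding another backend means adding one more case to the factory passed to the registry, while query call sites stay unchanged.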
|
I'm Checking guys ;) |
# Conflicts: # gitnexus/src/mcp/local/local-backend.ts
Enables polymorphic dispatch for LadybugDB alongside NeptuneAdapter. Uses dependency injection to avoid circular imports between layers. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- getAdapter(repoId) factory returns IDbQueryAdapter (polymorphic)
- LbugQueryAdapter for LadybugDB, NeptuneAdapter for Neptune
- Eliminates module-level neptuneAdapters Map
- runQuery/runParameterized delegate through adapter
- isNeptune() helper replaces scattered neptuneAdapters.has() checks
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Delete duplicate getDbConfigFromEntry (canonical getDbConfig in repo-manager)
- Delete buildGraph function (moved to LocalBackend.getGraphForApi)
- /api/query routes through backend.executeCypher()
- /api/search routes through new backend.searchForApi()
- /api/graph routes through new backend.getGraphForApi()
- Returns 501 for graph viz on Neptune repos (proper error)
- Eliminates per-request NeptuneAdapter construction
- Remove double closeLbug() in shutdown handler
- Update Neptune injection safety test to scan new code location
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
CI Report: ✅ All checks passed
Tests: ✅ All 3627 tests passed across 1048 files, 20 test(s) skipped
Coverage from Ubuntu · Generated by CI |
Pull request overview
Adds AWS Neptune as an optional graph database backend alongside the existing LadybugDB (local) path, wiring it through CLI/server/MCP/web settings so repos can be configured to query Neptune via openCypher.
Changes:
- Introduces a DB-adapter abstraction (IDbQueryAdapter) with implementations for LadybugDB and Neptune, plus Neptune ingestion and app-side vector search caching.
- Extends CLI + server API to select/test/update DB backend config per repo, and updates MCP local backend query dispatch accordingly.
- Adds web UI settings for Neptune configuration plus a Cypher console, and adds unit tests around the Neptune components.
Reviewed changes
Copilot reviewed 23 out of 24 changed files in this pull request and generated 11 comments.
| File | Description |
|---|---|
| gitnexus/src/core/db/interfaces.ts | Defines DbConfig and IDbQueryAdapter contract. |
| gitnexus/src/core/db/lbug-query-adapter.ts | Adapter wrapper for LadybugDB query functions. |
| gitnexus/src/core/db/neptune/neptune-adapter.ts | Neptune openCypher adapter using @aws-sdk/client-neptunedata. |
| gitnexus/src/core/db/neptune/neptune-ingest.ts | Neptune ingestion (UNWIND+MERGE batches) + stats/embedding loaders. |
| gitnexus/src/core/db/neptune/neptune-vector-search.ts | App-side vector search cache + cosine similarity top‑K. |
| gitnexus/src/mcp/local/local-backend.ts | Adapter-based query dispatch; Neptune fallbacks and API helpers. |
| gitnexus/src/server/api.ts | Adds /api/db/test + /api/repo/db; routes graph/query/search through LocalBackend. |
| gitnexus/src/storage/repo-manager.ts | Persists per-repo db config in registry and adds updateRepoDb. |
| gitnexus/src/cli/index.ts | Adds --db and Neptune CLI flags. |
| gitnexus/src/cli/analyze.ts | Adds Neptune ingestion path and stores db config in registry. |
| gitnexus/test/unit/neptune-adapter.test.ts | Unit tests for NeptuneAdapter behavior (mocked AWS SDK). |
| gitnexus/test/unit/neptune-ingest.test.ts | Unit tests for Neptune ingestion + stats (mocked AWS SDK). |
| gitnexus/test/unit/neptune-vector-search.test.ts | Tests intended to cover vector search logic. |
| gitnexus/test/unit/neptune-impact-dispatch.test.ts | Guards dispatch usage in impact enrichment block. |
| gitnexus/test/unit/neptune-content-scan.test.ts | Regression test to prevent Neptune n.content scans. |
| gitnexus/test/unit/neptune-api-injection.test.ts | Regression test to prevent string interpolation in Neptune search query params. |
| gitnexus/test/unit/lbug-query-adapter.test.ts | Unit tests for the LadybugDB query adapter wrapper. |
| gitnexus-web/src/services/backend.ts | Adds API calls for DB config update and Neptune connection testing. |
| gitnexus-web/src/core/llm/types.ts | Persists DB backend settings in web settings model. |
| gitnexus-web/src/components/SettingsPanel.tsx | Adds Neptune backend UI controls + test connection + Cypher console entrypoint. |
| gitnexus-web/src/components/CypherConsole.tsx | New UI modal to run openCypher queries. |
| docs/neptune-setup.md | Setup guide and operational notes for using Neptune. |
| gitnexus/package.json | Adds @aws-sdk/client-neptunedata dependency. |
| gitnexus/package-lock.json | Locks AWS SDK dependency graph. |
Files not reviewed (1)
- gitnexus/package-lock.json: Language not supported
Comments suppressed due to low confidence (2)
gitnexus/src/server/api.ts:117
- The /api/graph handler returns 500 on repo resolution errors (e.g. no repos indexed, repo not found, multiple repos) even though statusFromError() exists to map these to 404/400. Consider using res.status(statusFromError(err)) here for consistent client behavior.
// Get full graph
app.get('/api/graph', async (req, res) => {
try {
const graph = await backend.getGraphForApi(requestedRepo(req));
if (!graph) {
res.status(501).json({ error: 'Graph visualization is not supported for Neptune-backed repos. Use Cypher queries instead.' });
return;
}
res.json(graph);
} catch (err: any) {
res.status(500).json({ error: err.message || 'Failed to build graph' });
}
gitnexus/src/server/api.ts:132
- The /api/query handler similarly hard-codes a 500 response for repo resolution errors. For consistency with other endpoints, use statusFromError(err) so "not found" / "multiple repositories" produce 404/400 rather than 500.
// Execute Cypher query
app.post('/api/query', async (req, res) => {
try {
const cypher = req.body.cypher as string;
if (!cypher) {
res.status(400).json({ error: 'Missing "cypher" in request body' });
return;
}
const result = await backend.executeCypher(requestedRepo(req), cypher);
res.json({ result });
} catch (err: any) {
res.status(500).json({ error: err.message || 'Query failed' });
}
| "prepack": "npm run build && chmod +x dist/cli/index.js" | ||
| }, | ||
| "dependencies": { | ||
| "@aws-sdk/client-neptunedata": "^3.0.0", |
The newly added @aws-sdk/client-neptunedata dependency currently resolves (per package-lock) to a version that declares engines.node >=20, but this package.json declares engines.node >=18. This will likely break installs/runtime on Node 18. Consider pinning a neptunedata client version that supports Node 18, or bumping the project's Node engine requirement to >=20 to match the SDK.
| "@aws-sdk/client-neptunedata": "^3.0.0", | |
| "@aws-sdk/client-neptunedata": "3.624.0", |
| const { neptuneEndpoint, neptuneRegion, neptunePort } = req.body; | ||
| if (!neptuneEndpoint || !neptuneRegion) { | ||
| res.status(400).json({ ok: false, error: 'neptuneEndpoint and neptuneRegion are required' }); | ||
| return; | ||
| } | ||
| const lbugPath = path.join(entry.storagePath, 'lbug'); | ||
| const parsedLimit = Number(req.body.limit ?? 10); | ||
| const limit = Number.isFinite(parsedLimit) | ||
| ? Math.max(1, Math.min(100, Math.trunc(parsedLimit))) | ||
| : 10; | ||
| const config: NeptuneDbConfig = { | ||
| type: 'neptune', | ||
| endpoint: neptuneEndpoint, | ||
| region: neptuneRegion, | ||
| port: neptunePort ?? 8182, | ||
| }; | ||
| const adapter = new NeptuneAdapter(config); | ||
| const t0 = Date.now(); | ||
| try { | ||
| await adapter.executeQuery('MATCH (n) RETURN count(n) AS cnt LIMIT 1'); | ||
| res.json({ ok: true, latencyMs: Date.now() - t0 }); | ||
| } catch (err: any) { | ||
| res.json({ ok: false, error: err.message || 'Connection failed' }); | ||
| } finally { | ||
| adapter.close(); | ||
| } |
In /api/db/test, neptunePort is taken directly from req.body and assigned to NeptuneDbConfig.port without parsing/validation, so a string (e.g. "8182") or invalid value could slip through. Also, adapter.close() is async but is not awaited in the finally block, which can leave cleanup unhandled. Parse/validate the port (coerce to a number with bounds/default) and await adapter.close() in finally.
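One possible shape for that fix is sketched below: coerce the port to a bounded integer with a default, and wrap the adapter call so `close()` is always awaited. `parsePort` and `withAdapter` are hypothetical helper names; the default of 8182 is taken from the handler quoted above.

```typescript
// Hypothetical helpers; not part of the PR.
function parsePort(raw: unknown, fallback = 8182): number | null {
  if (raw === undefined || raw === null || raw === "") return fallback;
  let n: number;
  if (typeof raw === "number") {
    n = raw;
  } else if (typeof raw === "string" && /^\d+$/.test(raw.trim())) {
    n = Number(raw.trim());
  } else {
    return null; // booleans, objects, or non-numeric strings are rejected
  }
  return Number.isInteger(n) && n >= 1 && n <= 65535 ? n : null;
}

interface Closable {
  close(): Promise<void>;
}

// Runs fn and always awaits close(), so cleanup failures are not left unhandled.
async function withAdapter<A extends Closable, R>(
  adapter: A,
  fn: (a: A) => Promise<R>,
): Promise<R> {
  try {
    return await fn(adapter);
  } finally {
    await adapter.close();
  }
}
```

A handler could return 400 when `parsePort` yields `null`, in the same way it already rejects a missing endpoint or region.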
| const results = await backend.searchForApi(requestedRepo(req), query, limit); | ||
| res.json({ results }); | ||
| } catch (err: any) { | ||
| res.status(500).json({ error: err.message || 'Search failed' }); |
The /api/search handler hard-codes a 500 response for repo resolution errors; this makes "repo not found" look like a server failure. Use statusFromError(err) (already defined) to return 404/400 where appropriate.
| res.status(500).json({ error: err.message || 'Search failed' }); | |
| const status = statusFromError(err); | |
| res.status(status).json({ error: err.message || 'Search failed' }); |
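For readers without the code base at hand, a mapping of this kind might look roughly like the sketch below. This is a hypothetical re-creation under the assumption that resolution errors are distinguished by message text; the real statusFromError in api.ts may work differently.

```typescript
// Hypothetical sketch; the actual statusFromError in api.ts is not shown in this thread.
function statusFromErrorSketch(err: unknown): number {
  const msg = err instanceof Error ? err.message.toLowerCase() : "";
  if (msg.includes("not found")) return 404; // unknown repo
  if (msg.includes("multiple repositories")) return 400; // caller must pick a repo
  return 500; // anything else is treated as a server failure
}
```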
| @@ -261,13 +333,17 @@ export const analyzeCommand = async ( | |||
| } | |||
Phase 3.5 restores cached embeddings by writing CodeEmbedding nodes via executeWithReusedStatement, but this block runs even when --db neptune is selected (when LadybugDB may not be initialized/available). This can cause Neptune indexing to fail or silently do the wrong thing. Guard this block behind !isNeptune, or (if Neptune should support embeddings) write the cached embeddings to Neptune via the Neptune embedding loader instead.
| if (options?.embeddings) { | ||
| if (stats.nodes > EMBEDDING_NODE_LIMIT) { | ||
| if (isNeptune) { | ||
| embeddingSkipReason = 'skipped (Neptune embedding support not yet available — use LadybugDB for embeddings)'; | ||
| } else if (stats.nodes > EMBEDDING_NODE_LIMIT) { |
The Neptune path forces embeddings to be skipped even when --embeddings is set, but the PR description advertises Neptune vector/semantic search support and the runtime Neptune semanticSearch path expects embeddings on nodes. Also, loadEmbeddingsToNeptune is defined but never called anywhere. If Neptune semantic search is intended to work, the CLI should generate embeddings and persist them to Neptune; otherwise, the PR description/docs should be updated to reflect that semantic search is not supported yet.
| // 5. Clean up stale nodes from previous runs. | ||
| // Any node without the current generation marker is orphaned. | ||
| // Batched to avoid timeouts on large graphs. Non-fatal if it fails. | ||
| onProgress?.('Cleaning up stale nodes...'); | ||
| const CLEANUP_BATCH = 10_000; | ||
| let cleanupTotal = 0; | ||
| try { | ||
| for (;;) { | ||
| const res = await client.send(new ExecuteOpenCypherQueryCommand({ | ||
| openCypherQuery: `MATCH (n) WHERE n._gen <> $gen OR n._gen IS NULL WITH n LIMIT ${CLEANUP_BATCH} DETACH DELETE n RETURN count(*) AS deleted`, | ||
| parameters: JSON.stringify({ gen: generation }), | ||
| })); | ||
| const rows = (res.results as Record<string, unknown>[]) ?? []; | ||
| const deleted = Number(rows[0]?.['deleted'] ?? 0); | ||
| if (deleted === 0) break; |
The stale-node cleanup query deletes all nodes in the Neptune database that don't match the current generation marker. If a Neptune cluster is ever shared across multiple repos (or used for anything else), this will delete unrelated data. Consider scoping all GitNexus nodes by a per-repo marker (e.g. repoId property, separate label namespace, or separate database/graph per repo) and include that scope in both upsert and cleanup queries.
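A minimal sketch of the scoped variant, assuming every GitNexus node carries a `repoId` property (which is the reviewer's suggestion, not something the PR currently writes). `buildScopedCleanup` is a hypothetical helper that returns the two fields passed to ExecuteOpenCypherQueryCommand in the hunk above.

```typescript
// Hypothetical helper: assumes a repoId property is written on every node at ingest time.
function buildScopedCleanup(
  repoId: string,
  generation: string,
  batch = 10_000,
): { openCypherQuery: string; parameters: string } {
  if (!Number.isInteger(batch) || batch < 1) {
    throw new Error("batch must be a positive integer");
  }
  return {
    // Parentheses matter: without them, AND would bind tighter than OR and
    // nodes from other repos with a NULL _gen would still be deleted.
    openCypherQuery:
      "MATCH (n) WHERE n.repoId = $repoId AND (n._gen <> $gen OR n._gen IS NULL) " +
      `WITH n LIMIT ${batch} DETACH DELETE n RETURN count(*) AS deleted`,
    parameters: JSON.stringify({ repoId, gen: generation }),
  };
}
```

The same `repoId` filter would also need to appear in the upsert queries so that nodes created by this repo can be matched later.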
| for (const embRow of rawResults) { | ||
| const nodeId = embRow.nodeId; | ||
| const distance = embRow.distance; | ||
| const labelEndIdx = nodeId.indexOf(':'); | ||
| const label = labelEndIdx > 0 ? nodeId.substring(0, labelEndIdx) : 'Unknown'; | ||
| if (!VALID_NODE_LABELS.has(label)) continue; | ||
|
|
||
| try { | ||
| const nodeQuery = label === 'File' | ||
| ? `MATCH (n:File {id: $nodeId}) RETURN n.name AS name, n.filePath AS filePath` | ||
| : `MATCH (n:\`${label}\` {id: $nodeId}) RETURN n.name AS name, n.filePath AS filePath, n.startLine AS startLine, n.endLine AS endLine`; | ||
| const nodeRows = await this.runParameterized(repo.id, nodeQuery, { nodeId }); | ||
| if (nodeRows.length > 0) { | ||
| const nodeRow = nodeRows[0]; | ||
| results.push({ | ||
| nodeId, | ||
| name: nodeRow.name ?? nodeRow[0] ?? '', | ||
| type: label, | ||
| filePath: nodeRow.filePath ?? nodeRow[1] ?? '', | ||
| distance, | ||
| startLine: label !== 'File' ? (nodeRow.startLine ?? nodeRow[2]) : undefined, | ||
| endLine: label !== 'File' ? (nodeRow.endLine ?? nodeRow[3]) : undefined, | ||
| }); | ||
| } | ||
| } catch {} | ||
| } |
Neptune semanticSearch currently does an N+1 query pattern: after getting top-K embedding hits, it runs an additional Cypher query per result to fetch node metadata. Over Neptune HTTP this can add significant latency and load. Consider fetching metadata for all returned nodeIds in a single parameterized query (e.g. UNWIND $ids AS id MATCH (n {id:id}) RETURN ...), or extending the embedding fetch/cache to include the needed node fields.
| for (const embRow of rawResults) { | |
| const nodeId = embRow.nodeId; | |
| const distance = embRow.distance; | |
| const labelEndIdx = nodeId.indexOf(':'); | |
| const label = labelEndIdx > 0 ? nodeId.substring(0, labelEndIdx) : 'Unknown'; | |
| if (!VALID_NODE_LABELS.has(label)) continue; | |
| try { | |
| const nodeQuery = label === 'File' | |
| ? `MATCH (n:File {id: $nodeId}) RETURN n.name AS name, n.filePath AS filePath` | |
| : `MATCH (n:\`${label}\` {id: $nodeId}) RETURN n.name AS name, n.filePath AS filePath, n.startLine AS startLine, n.endLine AS endLine`; | |
| const nodeRows = await this.runParameterized(repo.id, nodeQuery, { nodeId }); | |
| if (nodeRows.length > 0) { | |
| const nodeRow = nodeRows[0]; | |
| results.push({ | |
| nodeId, | |
| name: nodeRow.name ?? nodeRow[0] ?? '', | |
| type: label, | |
| filePath: nodeRow.filePath ?? nodeRow[1] ?? '', | |
| distance, | |
| startLine: label !== 'File' ? (nodeRow.startLine ?? nodeRow[2]) : undefined, | |
| endLine: label !== 'File' ? (nodeRow.endLine ?? nodeRow[3]) : undefined, | |
| }); | |
| } | |
| } catch {} | |
| } | |
| const nodeIds: string[] = []; | |
| const distanceById = new Map<string, number>(); | |
| const labelById = new Map<string, string>(); | |
| for (const embRow of rawResults) { | |
| const nodeId = embRow.nodeId as string; | |
| const distance = embRow.distance as number; | |
| const labelEndIdx = nodeId.indexOf(':'); | |
| const label = labelEndIdx > 0 ? nodeId.substring(0, labelEndIdx) : 'Unknown'; | |
| if (!VALID_NODE_LABELS.has(label)) continue; | |
| nodeIds.push(nodeId); | |
| distanceById.set(nodeId, distance); | |
| labelById.set(nodeId, label); | |
| } | |
| if (nodeIds.length === 0) { | |
| return results; | |
| } | |
| try { | |
| const nodeQuery = ` | |
| UNWIND $ids AS nodeId | |
| MATCH (n {id: nodeId}) | |
| RETURN nodeId AS nodeId, | |
| labels(n)[0] AS dbLabel, | |
| n.name AS name, | |
| n.filePath AS filePath, | |
| n.startLine AS startLine, | |
| n.endLine AS endLine | |
| `; | |
| const nodeRows = await this.runParameterized(repo.id, nodeQuery, { ids: nodeIds }); | |
| const rowById = new Map<string, any>(); | |
| for (const row of nodeRows) { | |
| if (row && row.nodeId) { | |
| rowById.set(row.nodeId as string, row); | |
| } | |
| } | |
| for (const nodeId of nodeIds) { | |
| const nodeRow = rowById.get(nodeId); | |
| if (!nodeRow) continue; | |
| const label = labelById.get(nodeId) ?? (nodeRow.dbLabel as string | undefined) ?? 'Unknown'; | |
| const distance = distanceById.get(nodeId) ?? 0; | |
| results.push({ | |
| nodeId, | |
| name: nodeRow.name ?? nodeRow[0] ?? '', | |
| type: label, | |
| filePath: nodeRow.filePath ?? nodeRow[1] ?? '', | |
| distance, | |
| startLine: label !== 'File' ? (nodeRow.startLine ?? nodeRow[2]) : undefined, | |
| endLine: label !== 'File' ? (nodeRow.endLine ?? nodeRow[3]) : undefined, | |
| }); | |
| } | |
| } catch {} |
| fetchOpenRouterModels, | ||
| } from '../core/llm/settings-service'; | ||
| import type { LLMSettings, LLMProvider } from '../core/llm/types'; | ||
| import { testDbConnection, updateRepoDbConfig, getBackendUrl } from '../services/backend'; |
getBackendUrl is imported here but not used anywhere in this component. If the web build uses noUnusedLocals/noUnusedParameters (or ESLint), this will fail CI or add noise. Remove the unused import or use it.
| import { testDbConnection, updateRepoDbConfig, getBackendUrl } from '../services/backend'; | |
| import { testDbConnection, updateRepoDbConfig } from '../services/backend'; |
| {dbType === 'neptune' ? 'Neptune' : 'LadybugDB'} | ||
| </span> | ||
| </div> | ||
| <button onClick={onClose} className="text-white/40 hover:text-white transition-colors">✕</button> |
The CypherConsole close button is rendered as a bare "✕" without an accessible name. Add an aria-label (e.g. "Close") and/or visually hidden text so screen readers can announce its purpose.
| <button onClick={onClose} className="text-white/40 hover:text-white transition-colors">✕</button> | |
| <button | |
| onClick={onClose} | |
| className="text-white/40 hover:text-white transition-colors" | |
| aria-label="Close Cypher Console" | |
| > | |
| ✕ | |
| </button> |
| /** | ||
| * Tests for the cosine similarity and top-K selection logic. | ||
| * We test the exported function by mocking the NeptuneAdapter. | ||
| */ | ||
|
|
||
| // Helper: cosine similarity between two vectors (for verification) | ||
| function cosineSim(a: number[], b: number[]): number { | ||
| let dot = 0, normA = 0, normB = 0; | ||
| for (let i = 0; i < a.length; i++) { | ||
| dot += a[i] * b[i]; | ||
| normA += a[i] * a[i]; | ||
| normB += b[i] * b[i]; | ||
| } | ||
| const denom = Math.sqrt(normA) * Math.sqrt(normB); | ||
| return denom === 0 ? 0 : dot / denom; | ||
| } |
This test file only verifies a locally defined cosineSim helper and array sorting/filtering, but it does not exercise the actual exported Neptune vector search implementation (neptuneSemanticSearch) as the header comment claims. Consider importing and testing the real module (mocking NeptuneAdapter.executeQuery) so changes to production logic are covered.
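To show what exercising the production path through a mocked adapter could look like, here is a self-contained sketch. The signature of neptuneSemanticSearch is not shown in this thread, so `topKByCosine`, the `EmbeddingSource` interface, the `n.embedding` property name and the query text are all assumptions standing in for the real module.

```typescript
// Hypothetical stand-in for the real module; shapes and names are assumptions.
interface EmbeddingRow {
  nodeId: string;
  embedding: number[];
}

interface EmbeddingSource {
  executeQuery(cypher: string): Promise<EmbeddingRow[]>;
}

function cosineSim(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Fetches embeddings through the adapter, then ranks by cosine distance (1 - similarity).
async function topKByCosine(
  source: EmbeddingSource,
  queryVec: number[],
  k: number,
): Promise<{ nodeId: string; distance: number }[]> {
  const rows = await source.executeQuery(
    "MATCH (n) WHERE n.embedding IS NOT NULL RETURN n.id AS nodeId, n.embedding AS embedding",
  );
  return rows
    .map((r) => ({ nodeId: r.nodeId, distance: 1 - cosineSim(queryVec, r.embedding) }))
    .sort((x, y) => x.distance - y.distance)
    .slice(0, k);
}
```

A test would pass a stub `EmbeddingSource` that returns fixed rows and assert on the ordering of the result, so the ranking logic under test is the exported function rather than a local copy.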
|
Can you please ensure we are improving the coverage? We went down because there are new uncovered functionalities. |
|
@claude review copilot reviews and verify their validity against the code base. |
|
Claude finished @magyargergo's task in 2m 33s
Copilot Review Validity Assessment
Verdict by Comment✅ VALID —
|
| Priority | Issue | Location |
|---|---|---|
| 🔴 High | loadEmbeddingsToNeptune never called; Neptune semantic search broken | analyze.ts:346 |
| 🔴 High | N+1 query per embedding result; will be slow/timeout | local-backend.ts:769 |
| 🟡 Med | Warning regex never matches Neptune format; silent failure reporting | analyze.ts:463 |
| 🟡 Med | Port from req.body not coerced to number | api.ts:164 |
| 🟡 Med | Hard-coded 500 in search/query/graph catch blocks | api.ts:131,148,116 |
| 🟢 Low | adapter.close() not awaited | api.ts:174 |
| 🟢 Low | Unused getBackendUrl import | SettingsPanel.tsx:10 |
| 🟢 Low | Missing aria-label on close button | CypherConsole.tsx:50 |
| 🟢 Low | Test doesn't cover real module | neptune-vector-search.test.ts |
|
@naicud please review these findings |
|
|
Please submit a new PR if this is still relevant |

Summary
Adds AWS Neptune as an alternative graph database backend alongside the default KuzuDB (LadybugDB). This enables cloud-native graph storage for teams that need shared, scalable infrastructure.
Key features
- @aws-sdk/client-neptunedata
- --db neptune --neptune-endpoint <host> --neptune-region <region> flags on gitnexus analyze
- /api/db/test endpoint for connection testing, PATCH /api/repo/db for config updates
- CypherConsole.tsx component for direct query execution
Architecture
KuzuDB code is completely untouched; Neptune is additive. The local-backend.ts uses a runQuery/runParameterized abstraction that dispatches to either KuzuDB or Neptune based on repo config.
Changes (19 files)
- interfaces.ts, neptune-adapter.ts, neptune-ingest.ts, neptune-vector-search.ts
- analyze.ts, index.ts
- api.ts
- local-backend.ts
- repo-manager.ts
- SettingsPanel.tsx, backend.ts, CypherConsole.tsx
Test plan
- cd gitnexus && npm run build && cd ../gitnexus-web && npm run build
- gitnexus analyze --db neptune --neptune-endpoint <host> indexes to Neptune
🤖 Generated with Claude Code