fix(web): align agent system prompt with registered tools by magyargergo · Pull Request #1984 · abhigyanpatwari/GitNexus

magyargergo · 2026-06-03T04:44:37Z

Summary

Addresses all blocking and high-priority findings from the PR #14 tri-review for the current gitnexus-web agent:

Rewrites BASE_SYSTEM_PROMPT with the iterative investigation loop while using exact registered tool names (search, cypher, grep, read, explore, overview, impact)
Restores explicit citation rules ([[path:START-END]], [[Type:Name]]) matching the UI parser
Documents typed node labels + CodeRelation {type: '...'} schema (no CodeNode / INHERITS drift)
Clarifies that graph highlighting is citation-driven — there is no highlight_in_graph tool in this codebase
Restores BE DIRECT, MERMAID RULES, and ERROR RECOVERY sections removed in PR #14 (prompt changes)
Exports GRAPH_RAG_TOOL_NAMES and adds unit tests to prevent prompt ↔ tool registry regressions
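The prompt ↔ tool-name parity that the summary describes can be illustrated with a minimal sketch. Everything below is a hypothetical stand-in: `TOOL_NAMES_SKETCH` copies the seven names listed above, `PROMPT_SKETCH` is a made-up one-line prompt rather than the real BASE_SYSTEM_PROMPT, and `promptToolDrift` is an illustrative helper, not the code in agent-prompt.test.ts.

```typescript
// Hypothetical stand-ins: the real constants live in gitnexus-web and are not reproduced here.
const TOOL_NAMES_SKETCH: string[] = [
  "search", "cypher", "grep", "read", "explore", "overview", "impact",
];

// Made-up prompt text for illustration only.
const PROMPT_SKETCH =
  "Tools: `search`, `cypher`, `grep`, `read`, `explore`, `overview`, `impact`.";

// Collect every back-ticked lowercase identifier mentioned in the prompt.
function backtickedNames(prompt: string): string[] {
  const names: string[] = [];
  const re = /`([a-z_]+)`/g;
  let m: RegExpExecArray | null;
  while ((m = re.exec(prompt)) !== null) {
    if (names.indexOf(m[1]) === -1) names.push(m[1]);
  }
  return names;
}

// Report drift in both directions between the prompt and the tool list.
function promptToolDrift(
  prompt: string,
  tools: string[],
): { unmentioned: string[]; unknown: string[] } {
  const mentioned = backtickedNames(prompt);
  return {
    // Tools the prompt never names.
    unmentioned: tools.filter((t) => mentioned.indexOf(t) === -1),
    // Back-ticked names in the prompt that are not tools.
    unknown: mentioned.filter((n) => tools.indexOf(n) === -1),
  };
}
```

A test built on this idea would assert that both arrays are empty. Because the sketch treats every back-ticked lowercase word as a tool name, a real prompt that back-ticks other identifiers would need a narrower extraction rule.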

Test plan

cd gitnexus-web && npm test -- test/unit/agent-prompt.test.ts
Manual: run agent in web UI and confirm discovery tools (search, cypher, read) dispatch successfully

Made with Cursor

Rewrites BASE_SYSTEM_PROMPT to fix tool-name mismatches, citation format, and schema guidance from PR #14 tri-review, and adds unit tests that guard prompt ↔ tool registry parity. Co-authored-by: Cursor <cursoragent@cursor.com>

vercel · 2026-06-03T04:44:43Z

The latest updates on your projects.

Project	Deployment	Actions	Updated (UTC)
gitnexus	Ready	Preview, Comment	Jun 3, 2026 6:25am

magyargergo

🔭 Tri-review — `fix(web): align agent system prompt with registered tools`

Methods & engine breakdown. Reviewed three ways: the GitNexus reviewer swarm (risk, test/CI) + Compound-Engineering personas (correctness, adversarial, maintainability, testing) — six Claude lanes — plus Codex, the one genuinely independent engine. Codex was sandbox-limited this run: it couldn't use its local shell or the GitNexus index, but recovered file access via its GitHub tools and read the full diff, the tool definitions, and the gitnexus-shared schema constants. Its visible analysis corroborated two points (the validation-section fold-in is benign; the schema constants it located back the schema-match check), but it did not return a retrievable final findings list. So treat this as a Claude-consensus review with partial Codex corroboration — not three independent confirmations (the six Claude lanes share priors, so their agreement is "consistent across personas," not independent).

This holds up well. I re-read the code to confirm the central claim end-to-end: the prompt's seven tool names, the graph-schema node labels & relation types (all members of NODE_TABLES/REL_TYPES in gitnexus-shared), the cypher {{QUERY_VECTOR}} + query routing, and the [[path:START-END]] / [[Type:Name]] citation format all match the real createGraphRAGTools registration (tools.ts:1497) and the UI grounding parser. The "There is NO highlight_in_graph tool" line is accurate. All five new test cases (8 expect() calls) genuinely pass. This is a clean fix for a prompt that had drifted from the actual toolset.

Headline (inline) — P2, non-blocking

The new test guards prompt ↔ GRAPH_RAG_TOOL_NAMES but not GRAPH_RAG_TOOL_NAMES ↔ the actual registered tools. GRAPH_RAG_TOOL_NAMES is a third hand-maintained copy; the test never imports createGraphRAGTools, so a future tool rename/add/remove can keep all five test cases green while the prompt mislabels a real tool — the LLM then emits a tool-call name LangChain can't route, and that tool silently fails at runtime. One assertion closes the loop (see inline comment). Flagged by five of the six lanes (risk, test-CI, adversarial, maintainability, testing; correctness noted it as a testing gap); trigger confirmed by code-read. Non-blocking — the three lists are in sync today.
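A rough sketch of the missing const ↔ registration check, under stated assumptions: `createToolsSketch`, `BackendSketch`, and `DECLARED_NAMES_SKETCH` are hypothetical stand-ins for createGraphRAGTools, its backend argument, and GRAPH_RAG_TOOL_NAMES. The real signatures are not shown in this thread.

```typescript
// Hypothetical shapes; the real tool and backend types are not shown in this thread.
interface ToolSketch {
  name: string;
}
interface BackendSketch {
  runQuery(query: string): unknown[];
}

// Stand-in for the hand-maintained constant.
const DECLARED_NAMES_SKETCH: string[] = [
  "search", "cypher", "grep", "read", "explore", "overview", "impact",
];

// Stand-in for the real factory. It keeps its own list of names on purpose,
// which is exactly why the two lists can drift apart.
function createToolsSketch(_backend: BackendSketch): ToolSketch[] {
  return [
    { name: "search" }, { name: "cypher" }, { name: "grep" }, { name: "read" },
    { name: "explore" }, { name: "overview" }, { name: "impact" },
  ];
}

// A no-op backend is enough because only the registered names are inspected.
const noopBackendSketch: BackendSketch = { runQuery: () => [] };

// Compare what the factory registers against what the constant declares.
function registrationDrift(
  registered: ToolSketch[],
  declared: string[],
): { missing: string[]; extra: string[] } {
  const registeredNames = registered.map((t) => t.name);
  return {
    // Declared in the constant but never registered.
    missing: declared.filter((n) => registeredNames.indexOf(n) === -1),
    // Registered by the factory but absent from the constant.
    extra: registeredNames.filter((n) => declared.indexOf(n) === -1),
  };
}
```

Asserting that `missing` and `extra` are both empty would make a rename in the factory fail the test even when the prompt and the constant still agree with each other.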

Minor (optional)

Test-only const on the public barrel — GRAPH_RAG_TOOL_NAMES is consumed only by the test, which imports it directly from ./tools; the index.ts:24 re-export adds public surface for no runtime consumer. (maintainability)
Const comment understates the coupling — tools.ts:19 says "keep in sync with BASE_SYSTEM_PROMPT"; the real invariant is registration ↔ const ↔ prompt. (maintainability)
A few brittle assertions — FORBIDDEN_TOOL_NAMES only blocks back-ticked names (a bare-prose legacy mention would evade it); the highlight_in_graph regex leans on single-line token co-occurrence; the citation regex matches the example not the instruction; the new [[Type:Name]] symbol-ref form is untested. All P3 hardening, not defects. (adversarial, testing)
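To illustrate why a back-tick-only guard can miss a bare-prose mention, here is a small sketch comparing it with a word-boundary match. The helper names are hypothetical, and `hybrid_search` is used only because it appears elsewhere in this review; the real FORBIDDEN_TOOL_NAMES list is not shown in this thread.

```typescript
// Escape regex metacharacters so a tool name can be embedded in a pattern safely.
function escapeRegExp(s: string): string {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Guard A: only notices the name when it is wrapped in back-ticks.
function mentionsBackticked(prompt: string, name: string): boolean {
  return prompt.indexOf("`" + name + "`") !== -1;
}

// Guard B: notices the name anywhere it appears as a whole word.
function mentionsAsWord(prompt: string, name: string): boolean {
  return new RegExp("\\b" + escapeRegExp(name) + "\\b").test(prompt);
}

// Made-up prompt fragment with a bare-prose legacy mention.
const proseMentionSketch = "Then run hybrid_search to rank the results.";
```

With this input, `mentionsBackticked(proseMentionSketch, "hybrid_search")` is false while `mentionsAsWord(proseMentionSketch, "hybrid_search")` is true. Because `_` counts as a word character, `mentionsAsWord(proseMentionSketch, "search")` stays false, so the shorter name does not collide with the longer one.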

Refuted (validation is a feature)

Dropping the "MANDATORY: VALIDATION" heading is not a regression — it was folded into CORE PROTOCOL step 6, the "Cite or retract" rule, and a new ERROR RECOVERY section. (risk, correctness, adversarial; Codex concurred)
Tool-name substring-collision (search vs hybrid_search), barrel-export name collision, and "citation placeholders are unparseable" were each probed and refuted — the concrete prompt examples are parser-valid.

Pre-existing (not introduced here)

The UI's NODE_REF_REGEX allowlist (grounding-patterns.ts:8) omits Community/Process, yet the prompt lists them as node labels, so a [[Process:…]] citation would be silently dropped. Pre-existing — the prompt's examples only use Function/Class; worth a separate ticket, not this PR.
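The effect of a label allowlist can be sketched as follows. The pattern is an assumption made for illustration: it allows only Function and Class (the two labels the prompt examples are said to use) and is not the real NODE_REF_REGEX from grounding-patterns.ts, which is not shown in this thread.

```typescript
// Hypothetical allowlist pattern; the real NODE_REF_REGEX is not reproduced here.
const NODE_REF_SOURCE_SKETCH = "\\[\\[(Function|Class):([^\\]]+)\\]\\]";

// Extract [[Type:Name]] citations whose Type is on the allowlist.
function parseSymbolRefsSketch(text: string): Array<{ label: string; name: string }> {
  const refs: Array<{ label: string; name: string }> = [];
  // Build a fresh global regex per call so lastIndex state is never shared.
  const re = new RegExp(NODE_REF_SOURCE_SKETCH, "g");
  let m: RegExpExecArray | null;
  while ((m = re.exec(text)) !== null) {
    refs.push({ label: m[1], name: m[2] });
  }
  return refs;
}

// Made-up answer text: one allowlisted label and one that is not.
const answerSketch = "See [[Function:parseArgs]] and [[Process:Checkout]].";
```

Under this assumed pattern, `parseSymbolRefsSketch(answerSketch)` yields only the Function reference. The Process citation produces no match and no error, which is what "silently dropped" means here.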

CI & merge

Branch hygiene: merge-from-main commit present but harmless and merge-safe (Merge branch 'main' brought the branch up to date; the web change is one focused commit).
Merge state: checks pending (BLOCKED on required checks; no conflicts). The web gates are green — typecheck-web, lint, format, Build & Push gitnexus-web, e2e / e2e (chromium), e2e / Check web module changes, CodeQL. The one web-relevant check still pending is tests / ubuntu / coverage (it runs the new agent-prompt.test.ts); the rest of the pending set (Build & Push gitnexus, scope-parity, tree-sitter ABI windows, windows platform-sensitive) is ingestion/CLI and unrelated to this web-only change.

Final verdict — production-ready with minor follow-ups

No correctness defects: the prompt-vs-tools alignment, graph schema, and citation format were verified end-to-end and the new tests pass. The single P2 (a const↔registration parity gap in the test) is a non-blocking hardening that a one-line assertion resolves; the other items are P3 polish or pre-existing. Before merge, just let the remaining tests / ubuntu / coverage check (which runs the new test) go green.

Automated multi-tool digest (GitNexus swarm + Compound-Engineering + Codex). Verify findings before acting. No blocking issues; the one inline item is a non-blocking test-coverage enhancement.

github-actions · 2026-06-03T05:20:58Z

CI Report

✅ All checks passed

Pipeline Status

Stage	Status	Details
✅ Typecheck	`success`	tsc --noEmit
✅ Tests	`success`	unit tests, 3 platforms
✅ E2E	`success`	gitnexus-web changes only

Test Results

Tests	Passed	Failed	Skipped	Duration
10960	10947	0	13	662s

✅ All 10947 tests passed

13 test(s) skipped:

COBOL pipeline benchmark > scales with file count
C# pipeline benchmark > scales with file count — namespaces spread across the solution
C# pipeline benchmark > scales with file count — all types in one (global) namespace bucket
C# pipeline benchmark > scales with file count — all types in one (named) namespace bucket
Go pipeline benchmark > scales with file count (workers enabled)
Go pipeline benchmark — worker pool (issue #1848) > does not quarantine the large generated Go file on sub-batch idle timeout
Go structural interface detection benchmark > scales linearly with interface × struct count
Go structural interface detection split-phase benchmark > separates index-build and detection time
PHP pipeline benchmark > scales with file count (workers enabled)
Ruby pipeline benchmark > scales with file count (workers enabled)
Rust pipeline benchmark > scales with file count (workers enabled)
run.cjs direct-exec entrypoint (#1945) > resolves a .cmd shim via the Windows shell branch, passing args and exit code
buildTypeEnv > known limitations (documented skip tests) > Ruby block parameter: users.each { |user| } — closure param inference, different feature

Code Coverage

Tests

Metric	Coverage	Covered	Base	Delta	Status
Statements	80.3%	38245/47625	79.84%	📈 +0.5	🟢 ████████████████░░░░
Branches	68.85%	24321/35320	68.5%	📈 +0.3	🟢 █████████████░░░░░░░
Functions	85.45%	3978/4655	84.94%	📈 +0.5	🟢 █████████████████░░░
Lines	83.91%	34403/40998	83.36%	📈 +0.5	🟢 ████████████████░░░░


U1: assert GRAPH_RAG_TOOL_NAMES equals the names createGraphRAGTools actually registers (via a no-op stub backend), closing the const<->registration drift gap the prompt-parity test previously missed. U2: make the forbidden-name guard word-boundary (catches bare-prose mentions, not just backticked); make the highlight_in_graph guarantee registry-level (reword-proof) plus a presence check; add a parser-recognized [[Type:Name]] symbol-citation assertion. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

U3: GRAPH_RAG_TOOL_NAMES has no runtime consumer -- the parity test imports it directly from ./tools -- so remove it from the public index.ts barrel re-export. Update the constant's doc comment to name the registration<->const<->prompt coupling now enforced by agent-prompt.test.ts. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Source the symbol-citation assertion from the UI parser's own NODE_REF_REGEX instead of a hardcoded 4-label subset, so the test tracks the parser's allowlist rather than forking it. Also drop a redundant array spread and an unnecessary readonly-tuple cast surfaced by the simplify pass. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Code review noted the registry-absence + bare-presence pair would pass if a future prompt edit affirmatively instructed calling highlight_in_graph (string present, still not registered). Add an assertion that the prompt never says use/call/invoke highlight_in_graph -- restoring the protective intent of the replaced negation check without its brittleness. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
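One way such a guard could look, as a sketch only: the regex below is an assumption, not the assertion actually added to agent-prompt.test.ts, and both prompt strings are made up for illustration.

```typescript
// Hypothetical pattern: flags "use/call/invoke [the] highlight_in_graph",
// with or without a back-tick before the name.
const AFFIRMATIVE_CALL_SKETCH = /\b(?:use|call|invoke)\s+(?:the\s+)?`?highlight_in_graph\b/i;

function instructsHighlightCall(prompt: string): boolean {
  return AFFIRMATIVE_CALL_SKETCH.test(prompt);
}

// Made-up prompt lines: the first only states the tool is absent,
// the second tells the model to call it.
const absenceLineSketch =
  "Graph highlighting is citation-driven. There is NO highlight_in_graph tool.";
const affirmativeLineSketch = "Call highlight_in_graph to focus the cited nodes.";
```

Here `instructsHighlightCall(absenceLineSketch)` is false and `instructsHighlightCall(affirmativeLineSketch)` is true. A negated phrasing such as "never call highlight_in_graph" would also trip this pattern, so the sketch fits only a prompt that states the tool's absence without using those verbs next to its name.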

…system-prompt Fold the prompt<->tools parity test hardening into PR #1984: const<->registration parity gate, word-boundary forbidden-name guard, NODE_REF_REGEX-derived symbol-ref assertion, and a guard against affirmative highlight_in_graph call instructions. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

vercel Bot deployed to Preview June 3, 2026 04:45

Merge branch 'main' into fix/web-agent-system-prompt

b1bab8c

vercel Bot deployed to Preview June 3, 2026 04:53

magyargergo commented Jun 3, 2026

Comment thread gitnexus-web/test/unit/agent-prompt.test.ts Outdated

magyargergo and others added 5 commits June 3, 2026 05:45

vercel Bot deployed to Preview June 3, 2026 06:08

Merge branch 'main' into fix/web-agent-system-prompt

06aaf39

vercel Bot deployed to Preview June 3, 2026 06:25

magyargergo merged commit 78ad6bc into main Jun 3, 2026
31 checks passed

magyargergo deleted the fix/web-agent-system-prompt branch June 3, 2026 07:00

magyargergo mentioned this pull request Jun 3, 2026

fix(ingestion): fully-qualified nested-type identity for C++/Ruby — structure (#1978) + resolution (#1982) #1981

Merged

magyargergo merged 8 commits into main from fix/web-agent-system-prompt

1 participant
