feat(mcp): rank context/impact disambiguation candidates and expose kind/file_path hints #888
…ind/file_path hints
The `context` MCP tool already returned `{ status: 'ambiguous', candidates }`
when a name hit multiple symbols, but the candidates were returned in
arbitrary DB order and the only hint it accepted was file_path. The
`impact` tool was worse: when its name resolver found multiple viable
matches it silently picked the first one from a priority UNION, with no
signal back to the caller that a different symbol might have been
intended.
Both failure modes were flagged in issue abhigyanpatwari#470 and reconfirmed in the
comments by a second user who described impact as returning "incorrect
parsing results and meaningless tool calls" in the multi-match case.
Changes:
* Add `resolveSymbolCandidates(repo, query, hints)` private helper on
  LocalBackend. Single place that:
  - Short-circuits on direct uid (zero-ambiguity)
  - Runs the same name-or-qualified-id match as before, with LIMIT 20
    (was 10) so the ranker has headroom instead of arbitrary truncation
  - Preserves the abhigyanpatwari#480 Class/Constructor preference -- when the only
    ambiguity is a Class and its own Constructor, the Class wins
    silently
  - Scores each candidate (pure TS, no extra DB round-trip): base 0.50,
    +0.40 for file_path match, +0.20 for kind match, plus a small
    kind-priority tiebreaker (Class > Interface > Function > Method >
    Constructor) when no explicit kind hint is given
  - Sorts desc by score with stable tiebreakers (shorter filePath,
    then lex uid)
  - Promotes to a single confident resolve when the top score is
    >= 0.95 AND beats the runner-up by >= 0.10 -- lets a strong hint
    cut through without forcing the caller through a disambiguation
    round-trip
* Rewire `context()` to use the shared helper. Response shape is a
  strict superset of today's: candidates gain a `score` field, the
  existing `{ uid, name, kind, filePath, line }` keys are preserved so
  every downstream consumer (rename, eval-server formatter, etc.) keeps
  working. New `kind` input hint accepted.
* Rewire `impact()` to use the shared helper. Now emits the same
  `{ status: 'ambiguous', candidates, impactedCount: 0, risk: 'UNKNOWN' }`
  shape instead of silent first-pick. New inputs accepted:
  `target_uid`, `file_path`, `kind`.
* Update tool schemas in mcp/tools.ts to advertise the new inputs and
  describe ranked disambiguation.
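The scoring and sort order listed above can be sketched roughly as follows. This is a minimal illustration, not the code in LocalBackend: the weights are the ones quoted in this PR, `scoreCandidate` is the name the PR description uses, and the `Candidate`/`Hints` types, the `rankCandidates` name, and the cap at 1.0 are assumptions made for the example.

```typescript
// Hypothetical shapes for illustration; the real LocalBackend types may differ.
interface Candidate {
  uid: string;
  name: string;
  kind: string;
  filePath: string;
  line: number;
}
interface Hints {
  filePath?: string;
  kind?: string;
}
interface ScoredCandidate extends Candidate {
  score: number;
}

// Kind-priority bonus, applied only when the caller gives no kind hint.
const KIND_PRIORITY: Record<string, number> = {
  Class: 0.1,
  Interface: 0.08,
  Function: 0.06,
  Method: 0.04,
  Constructor: 0.02,
};

function scoreCandidate(c: Candidate, hints: Hints): number {
  let score = 0.5; // base
  // file_path hint: case-insensitive substring match
  if (hints.filePath && c.filePath.toLowerCase().includes(hints.filePath.toLowerCase())) {
    score += 0.4;
  }
  if (hints.kind) {
    // kind hint: exact match only
    if (c.kind === hints.kind) score += 0.2;
  } else {
    score += KIND_PRIORITY[c.kind] ?? 0;
  }
  return Math.min(1, score); // assumed cap so scores stay in [0, 1]
}

function rankCandidates(candidates: Candidate[], hints: Hints): ScoredCandidate[] {
  return candidates
    .map((c) => ({ ...c, score: scoreCandidate(c, hints) }))
    .sort(
      (a, b) =>
        b.score - a.score || // higher score first
        a.filePath.length - b.filePath.length || // then shorter filePath
        (a.uid < b.uid ? -1 : a.uid > b.uid ? 1 : 0), // then lexicographic uid
    );
}
```

Keeping the ranker a pure function over already-fetched rows is what lets it run without an extra DB round-trip, and it makes the exact scores easy to assert in unit tests.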
Backward compatibility:
The abhigyanpatwari#480 Class/Constructor collapse is preserved and covered by the
existing java-class-impact integration test (still green). The
ambiguous response shape is a strict superset -- `eval-formatters`
unit test that parses the old shape is unchanged and still passes.
`impact` going from silent-first-pick to structured ambiguous is a
semantic improvement that is the entire point of the issue; callers
relying on silent first-pick now get an actionable response.
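For a caller, the practical consequence is a branch on `status` before reading the impact payload. The sketch below shows one way that could look; `CallImpact`, the `target` argument name, the resolved result shape, and the `choose` callback are all hypothetical stand-ins, and only `status`, `candidates`, `impactedCount`, `risk`, and the `target_uid`/`file_path`/`kind` inputs come from this PR.

```typescript
interface ImpactCandidate {
  uid: string;
  name: string;
  kind: string;
  filePath: string;
  line: number;
  score: number;
}
interface AmbiguousImpact {
  status: 'ambiguous';
  candidates: ImpactCandidate[];
  impactedCount: number; // 0 in the ambiguous variant
  risk: string; // 'UNKNOWN' in the ambiguous variant
}
// Hypothetical resolved shape; the real response carries more fields.
interface ResolvedImpact {
  status?: undefined;
  impactedCount: number;
  risk: string;
}
type ImpactResult = AmbiguousImpact | ResolvedImpact;

// `target` is an assumed name for the existing symbol-name input.
interface ImpactArgs {
  target: string;
  target_uid?: string;
  file_path?: string;
  kind?: string;
}
type CallImpact = (args: ImpactArgs) => Promise<ImpactResult>;

async function impactWithDisambiguation(
  callImpact: CallImpact,
  args: ImpactArgs,
  choose: (candidates: ImpactCandidate[]) => ImpactCandidate | undefined,
): Promise<ImpactResult> {
  const first = await callImpact(args);
  if (first.status !== 'ambiguous') return first; // resolved on the first try
  const picked = choose(first.candidates); // candidates arrive ranked by score
  if (!picked) return first; // nothing chosen: surface the ambiguity as-is
  // Retry with the uid so the resolver short-circuits.
  return callImpact({ ...args, target_uid: picked.uid });
}
```

Passing `target_uid` on the retry avoids a second ambiguous answer, since a direct uid lookup bypasses name matching entirely.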
Scope declined for v1:
module/community hint -- the issue lists it as one of several hints,
but kind + file_path cover the vast majority of disambiguation needs
in practice, and a community-label filter requires an extra graph
query per candidate. Natural v2 follow-up.
Tests: calltool-dispatch.test.ts gains 5 new cases covering file_path
boost, kind hint boost, impact ambiguous shape, impact target_uid
short-circuit, and score field presence on the existing ambiguous
test. Plus the extended assertions on the existing
`context tool returns disambiguation for multiple matches`.
Verification:
  npx vitest run test/unit/calltool-dispatch.test.ts -> 64 pass
  npx vitest run test/integration/java-class-impact.test.ts -> pass
  npm run test:unit -> 3642 pass
    (4 pre-existing env failures unchanged: skip-git-cli needs built
    dist/, git-utils tmpdir on Windows worktree -- same on main)
  npx tsc --noEmit -> clean
Closes abhigyanpatwari#470
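The #480 Class/Constructor preference mentioned in this commit message could be expressed along these lines. This is only a sketch: the commit does not say how a Constructor is tied to "its own" Class, so the same-name, same-file test here is an assumption, as are the `Candidate` type and the `collapseClassConstructor` name.

```typescript
interface Candidate {
  uid: string;
  name: string;
  kind: string;
  filePath: string;
}

// Returns the Class when the only ambiguity is a Class plus its own
// Constructor; returns null when the normal ranking path should run.
function collapseClassConstructor(candidates: Candidate[]): Candidate | null {
  if (candidates.length !== 2) return null;
  const cls = candidates.find((c) => c.kind === 'Class');
  const ctor = candidates.find((c) => c.kind === 'Constructor');
  if (!cls || !ctor) return null;
  // Assumption: same name in the same file means the Constructor belongs to the Class.
  return cls.name === ctor.name && cls.filePath === ctor.filePath ? cls : null;
}
```

Running this check before scoring keeps the pre-existing behaviour silent: the caller never sees an ambiguous response for the Class/Constructor pair.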
|
@claude Act as a senior reviewer for GitNexus. Your job is to determine whether this PR is production-ready for this repo, not to give a generic code review. You are reviewing a PR in the GitNexus monorepo:
Your task has 2 phases, in this exact order:
PHASE 1: DEFINE THE BAR
PHASE 2: REVIEW THE PR AGAINST THAT BAR
CONTEXT TO LOAD FIRST
Additional context:
PRIMARY OBJECTIVE
Do not optimize for completeness at the expense of signal.
REVIEW RULES
For each finding, assign one severity:
REPO-SPECIFIC REVIEW CHECKLIST
1. Correctness & functional completeness
Check:
2. Code clarity & clean code
Check:
3. Test coverage & change safety
Evaluate against TESTING.md:
4. Performance
Inspect for:
5. Operational risk
Check:
6. Maintainability
Check:
NON-NEGOTIABLES TO VERIFY
OUTPUT FORMAT
Production-ready bar for this PR
Give a concise repo-specific definition in 5–10 bullets.
Findings
For each finding, use this structure exactly:
[BLOCKING|NON-BLOCKING|NIT] Short title
Only include findings that are real and supported.
What I could not fully verify
List any behavior you could not prove from the diff, tests, or available context.
Final verdict
Choose exactly one:
If verdict is not ready, list the blocking items in priority order.
IMPORTANT REVIEW BEHAVIOR
|
|
Claude finished @magyargergo's task in 5m 43s
Production-ready bar for this PR
Findings
1. Correctness & functional completeness
No blocking issues. Behavior matches claims:
One subtle behavioral note (see §3 tests for the testing angle): when
2. Code clarity & clean code
No issues found in this area based on the diff and reviewed context.
NIT: redundant guard at
3. Test coverage & change safety
NON-BLOCKING: Test mechanism diverges from production code path for
|
CI Report
✅ All checks passed
Test Results: ✅ All 6341 tests passed, 97 test(s) skipped
Generated by CI
|
…s review findings

CI on PR abhigyanpatwari#888 caught 13 integration-test failures I did not cover locally: my resolver refactor collected candidates via `labels(n)[0] AS type`, but LadybugDB returns an empty string for that projection on certain node types (most importantly Class). With an empty `type`, impact's downstream `_runImpactBFS` no longer recognised `symType === 'Class' | 'Interface'` and stopped seeding Constructor + File nodes into the frontier, so the "impact(upstream) surfaces the file importer" assertion broke across 11 language fixtures plus 2 OVERRIDES filter tests.

The original impact resolver worked around this by running a prioritised UNION across Class/Interface/Function/Method/Constructor and picking the first hit. My refactor dropped that.

Fix: keep the simple candidate MATCH but enrich types afterward via a single scoped UNION query, so every candidate carries an accurate label for both scoring and downstream BFS seeding. The UID direct-lookup path is patched the same way.

Also addresses the findings from the senior reviewer on PR abhigyanpatwari#888:
* MIGRATION.md: document the `impact` behavioural change (silent first-pick → structured `{ status: 'ambiguous', candidates }`) so downstream callers know to branch on `result.status` before reading byDepth/summary. `context` is unchanged shape-wise (strict superset).
* New test: `context tool promotes top candidate via scoring when multiple rows survive DB pre-filter`. The review flagged that the existing file_path test works only because the mock ignores WHERE parameters -- the scored-promotion path (top ≥ 0.95 AND gap > 0.09) wasn't directly exercised. The new test uses two candidates both in App.tsx-containing paths plus a kind hint so promotion is decided by scoring, not DB pre-filtering. Also tightened the comment on the earlier file_path test to describe the mock vs production divergence honestly.
* NIT: added a paragraph explaining why `scored.length >= 2` is kept as a defensive guard even though the `normalized.length === 1` early return already covers the single-candidate path.
* Integration: two tests in `local-backend-calltool.test.ts` targeted `'authenticate'`, which now correctly resolves as ambiguous (two Method nodes: AuthService.authenticate and BaseService.authenticate). Updated both to pass `file_path: 'src/auth.ts'` so they exercise the new disambiguation API and still assert the METHOD_OVERRIDES filtering they were originally about.

Edge case fix in the promotion gap check: with a top score of 0.50 + 0.40 + 0.20 capped at 1.00 and a runner-up at 0.90, IEEE754 evaluates the gap 1.00 - 0.90 as 0.09999999999999998 instead of exactly 0.10, which would otherwise break the "winner clearly dominates" intent for legitimate 1.00 vs 0.90 cases. Changed `>= 0.10` to `> 0.09`; same user-facing intent, no floating-point sensitivity.

Verification (all from gitnexus/):
  npx vitest run test/integration/class-impact-all-languages.test.ts -> 52 pass (was 11 FAIL on CI before this fix)
  npx vitest run test/integration/local-backend-calltool.test.ts -> 18 pass (was 2 FAIL on CI before this fix)
  npx vitest run test/integration/java-class-impact.test.ts -> 10 pass (regression guard for abhigyanpatwari#480 preserved)
  npx vitest run test/unit/calltool-dispatch.test.ts -> 65 pass (1 new test + 4 from original abhigyanpatwari#470 PR)
  npm run test:unit -> 3626 pass, 4 pre-existing env failures unchanged
  npx tsc --noEmit -> clean
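The floating-point edge case in the promotion gap check is easy to reproduce in isolation. The snippet below is a sketch of the rule as this commit describes it (top score at least 0.95, gap over the runner-up greater than 0.09); `shouldPromote` and the constant names are hypothetical, not the identifiers used in LocalBackend.

```typescript
const PROMOTE_MIN_SCORE = 0.95;
const PROMOTE_MIN_GAP = 0.09; // was a `>= 0.10` comparison before this fix

// True when the top candidate is strong enough to resolve without
// asking the caller to disambiguate.
function shouldPromote(topScore: number, runnerUpScore: number): boolean {
  return topScore >= PROMOTE_MIN_SCORE && topScore - runnerUpScore > PROMOTE_MIN_GAP;
}

// 1.00 (file_path + kind match, capped) against 0.90 (file_path match only):
console.log(1.0 - 0.9); // 0.09999999999999998
console.log(1.0 - 0.9 >= 0.1); // false: the old comparison rejects this case
console.log(shouldPromote(1.0, 0.9)); // true: the new comparison accepts it
```

Moving the threshold to `> 0.09` leaves a margin well above the rounding error, at the cost of also admitting true gaps between 0.09 and 0.10; whether any score combination can produce such a gap depends on the weights in use.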
|
Thanks for the review, and thanks for catching the CI failure; that one was on me. Addressed everything in

Root cause of the 13 CI failures
The refactor collected candidates via

Fix
Kept the simple candidate MATCH but enrich types afterwards via a single scoped UNION query, so every candidate carries an accurate label for both the kind-priority scoring bonus and downstream BFS seeding. One extra DB round-trip only when needed; no change when the labels come back populated. UID direct-lookup path patched the same way.

    await this.enrichCandidateLabels(repo, normalized);
    // ...
    private async enrichCandidateLabels(repo, candidates) {
      // Only candidates whose label came back empty need enrichment
      const ids = candidates.filter(c => c.type === '' && c.id).map(c => c.id);
      if (ids.length === 0) return;
      // UNION across Class/Interface/Function/Method/Constructor, patch in-place
    }

Also updated the 2

Review findings
1. NON-BLOCKING: MIGRATION.md entry missing → addressed
New entry at the top of MIGRATION.md describing the
2. NON-BLOCKING: test mechanism diverges from production for
|
| Test | Before (CI) | After (local) |
|---|---|---|
| `test/integration/class-impact-all-languages.test.ts` | 11 × FAIL | 52 pass |
| `test/integration/local-backend-calltool.test.ts` | 2 × FAIL | 18 pass |
| `test/integration/java-class-impact.test.ts` (#480 regression guard) | - | 10 pass |
| `test/unit/calltool-dispatch.test.ts` | - | 65 pass (1 new + 4 original #470) |
| `npm run test:unit` | - | 3626 pass, 4 pre-existing env failures unchanged |
| `npx tsc --noEmit` | - | clean |
| Pre-commit hook (lint-staged + prettier + tsc) | - | clean |
|
Quick ping — CI's been green on all 3 platforms since
Happy to re-request |
|
|
Claude finished @magyargergo's task in 3m 48s
Re-review of PR #888: Final verdict
Production-ready bar for this PR
Findings
1. Correctness & functional completeness
Verified in commit
2. Code clarity & clean code
One subtle observation (not a finding): when
3. Test coverage & change safety
Previous review's NON-BLOCKING test-mechanism finding fully addressed:
4. Performance
5. Operational risk
Previous review's NON-BLOCKING fully addressed: MIGRATION.md entry is complete, accurate, and correctly scoped; it includes the full ambiguous response shape, migration guidance (
Other checks:
6. Maintainability
Non-negotiables
What I could not fully verify
Final verdict
✅ Ready to merge: no blocking issues. All findings from the first review are addressed in
The root cause of the 13 CI failures (LadybugDB returning empty
|
|
Thank you for your contribution! |
Closes #470.
Problem
`context` already returned `{ status: 'ambiguous', candidates }` when a name hit multiple symbols, but the candidates were in arbitrary DB order and the only hint accepted was `file_path`.
`impact` was worse: when its name resolver found multiple viable matches it silently picked the first row from a priority UNION, giving no signal back that a different symbol might have been intended.
A second user confirmed the impact failure mode in the issue comments: "the returned match is uncertain, which often leads to incorrect parsing results and meaningless tool calls."
What changed
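New shared resolver
`resolveSymbolCandidates(repo, query, hints)` on `LocalBackend`. Both `context` and `impact` now go through it, so resolution behaviour can't drift between the two tools.
Behaviour, in order:
1. Direct uid lookup short-circuits (zero ambiguity).
2. Name-or-qualified-id match, with a `WHERE n.filePath CONTAINS $filePath` filter. `LIMIT 20` (was 10) so the ranker has headroom.
3. #480 Class/Constructor preference: when the only ambiguity is a Class and its own Constructor, the Class wins silently. Covered by `java-class-impact.test.ts` (still green).
4. Score each candidate:
   - base `0.50`
   - `+0.40` when `file_path` hint matches (substring, case-insensitive)
   - `+0.20` when `kind` hint exactly matches the candidate's kind
   - no `kind` hint: small priority bonus `Class > Interface > Function > Method > Constructor` (+0.10, +0.08, +0.06, +0.04, +0.02)
   - capped at `1.0`
5. Sort descending by score; ties go to the shorter `filePath` first, then lex `uid`.
6. Confident promotion: if the top score is `≥ 0.95` and beats runner-up by `≥ 0.10`, resolve as `{ kind: 'ok' }` instead of emitting ambiguous. A strong `file_path` + `kind` hint can cut through without a disambiguation round-trip.
7. Otherwise, return `{ kind: 'ambiguous', candidates: [{..., score}] }`.
API surface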
`context` input schema gains `kind`. Response shape is a strict superset of today's: every candidate now carries a `score` field; all existing keys (`uid, name, kind, filePath, line`) preserved.
`impact` input schema gains `target_uid`, `file_path`, `kind`. New response variant when the target is ambiguous:

    {
      "status": "ambiguous",
      "message": "Found N symbols matching 'X'. Use target_uid, file_path, or kind to disambiguate.",
      "target": { "name": "X" },
      "direction": "upstream",
      "impactedCount": 0,
      "risk": "UNKNOWN",
      "candidates": [
        { "uid": "...", "name": "X", "kind": "Function", "filePath": "src/...", "line": 42, "score": 0.76 }
      ]
    }

Backward compatibility
* `context` ambiguous shape: strict superset. `eval-formatters.test.ts` (which parses the old shape) is unchanged and still passes.
* `rename` tool delegates to context and forwards `status === 'ambiguous'` unchanged.
* `impact` going from silent-first-pick to structured ambiguous is the point of the issue. Callers that were relying on silent-first-pick now get an actionable response and can retry with `target_uid` / `file_path` / `kind`.
* `#480` Class/Constructor preference preserved inside the shared resolver; covered by `java-class-impact.test.ts` (green).
Scope declined for v1
`module/community` hint. The issue lists it as one of several hints, but `kind` + `file_path` cover the vast majority of disambiguation needs in practice, and a community-label filter requires an extra graph query per candidate. Natural v2 follow-up: the ranker and the tool schemas are designed so that bolting on community signal later only touches `scoreCandidate` + one schema field.
Tests
Added / extended in `gitnexus/test/unit/calltool-dispatch.test.ts` (6 changes, all use the existing `executeParameterized` mock pattern):
* `context tool returns disambiguation for multiple matches`: asserts every candidate now carries a numeric `score ∈ [0,1]` and the list is sorted descending.
* `context tool ranks file_path match higher than non-match`: confident single-result promotion kicks in when `file_path: 'App.tsx'` narrows two Functions to one.
* `context tool returns ranked candidates when file_path only partially narrows`: asserts the exact computed score (`0.56`) for tied base-case candidates.
* `context tool boosts the candidate whose kind matches the hint`: Function + Method sharing a name; `kind: 'Function'` bubbles the Function to top.
* `impact tool returns ambiguous shape with ranked candidates when target has multiple matches`: the core new behaviour.
* `impact tool resolves via target_uid without running the name-based resolver`: inspects `executeParameterized.mock.calls` to assert no `WHERE n.name = $symName` query fires.
Test plan
* `npx vitest run test/unit/calltool-dispatch.test.ts`: 64 passed
* `npx vitest run test/integration/java-class-impact.test.ts`: pass (#480 regression guard green)
* `npx vitest run test/unit/eval-formatters.test.ts`: pass (backward-compat shape)
* `npm run test:unit`: 3642 passed, 4 pre-existing env failures unchanged (`skip-git-cli.test.ts` needs built `dist/`; `git-utils.test.ts` Windows-tmpdir issue; both fail identically on `main`)
* `npx tsc --noEmit`: clean