Skip to content

feat(mcp): rank context/impact disambiguation candidates and expose kind/file_path hints#888

Merged
magyargergo merged 2 commits into
abhigyanpatwari:mainfrom
azizur100389:feat/rank-context-impact-disambiguation
Apr 18, 2026
Merged

feat(mcp): rank context/impact disambiguation candidates and expose kind/file_path hints#888
magyargergo merged 2 commits into
abhigyanpatwari:mainfrom
azizur100389:feat/rank-context-impact-disambiguation

Conversation

@azizur100389

Copy link
Copy Markdown
Contributor

Closes #470.

Problem

context already returned { status: 'ambiguous', candidates } when a name hit multiple symbols, but the candidates were in arbitrary DB order and the only hint accepted was file_path. impact was worse — when its name resolver found multiple viable matches it silently picked the first row from a priority UNION, giving no signal back that a different symbol might have been intended. A second user confirmed the impact failure mode in the issue comments: "the returned match is uncertain, which often leads to incorrect parsing results and meaningless tool calls."

What changed

New shared resolver resolveSymbolCandidates(repo, query, hints) on LocalBackend. Both context and impact now go through it, so resolution behaviour can't drift between the two tools.

Behaviour, in order:

  1. Direct uid → zero-ambiguity short-circuit (both tools).
  2. Name / qualified-id query with optional WHERE n.filePath CONTAINS $filePath filter. LIMIT 20 (was 10) so the ranker has headroom.
  3. impact/context return empty results for Java classes — edges target Constructor/File nodes, not Class nodes #480 Class/Constructor collapse preserved — when the only ambiguity is a Class and its own Constructor sharing name+filePath, the Class wins silently. Covered by existing java-class-impact.test.ts (still green).
  4. Ranking (pure TS, no extra DB round-trip):
    • base 0.50
    • +0.40 when file_path hint matches (substring, case-insensitive)
    • +0.20 when kind hint exactly matches the candidate's kind
    • When no kind hint: small priority bonus Class > Interface > Function > Method > Constructor (+0.10, +0.08, +0.06, +0.04, +0.02)
    • Capped at 1.0
    • Stable tiebreak: shorter filePath first, then lex uid
  5. Confident single-result promotion: if top score ≥ 0.95 and beats runner-up by ≥ 0.10, resolve as { kind: 'ok' } instead of emitting ambiguous. A strong file_path + kind hint can cut through without a disambiguation round-trip.
  6. Otherwise { kind: 'ambiguous', candidates: [{..., score}] }.

API surface

context input schema gains kind. Response shape is a strict superset of today's — every candidate now carries a score field; all existing keys (uid, name, kind, filePath, line) preserved.

impact input schema gains target_uid, file_path, kind. New response variant when the target is ambiguous:

{
  \"status\": \"ambiguous\",
  \"message\": \"Found N symbols matching 'X'. Use target_uid, file_path, or kind to disambiguate.\",
  \"target\": { \"name\": \"X\" },
  \"direction\": \"upstream\",
  \"impactedCount\": 0,
  \"risk\": \"UNKNOWN\",
  \"candidates\": [
    { \"uid\": \"...\", \"name\": \"X\", \"kind\": \"Function\", \"filePath\": \"src/...\", \"line\": 42, \"score\": 0.76 }
  ]
}

Backward compatibility

  • context ambiguous shape — strict superset. eval-formatters.test.ts (which parses the old shape) is unchanged and still passes. rename tool delegates to context and forwards status === 'ambiguous' unchanged.
  • impact going from silent-first-pick to structured ambiguous is the point of the issue. Callers that were relying on silent-first-pick now get an actionable response and can retry with target_uid / file_path / kind.
  • #480 Class/Constructor preference preserved inside the shared resolver; covered by java-class-impact.test.ts (green).

Scope declined for v1

module / community hint. The issue lists it as one of several hints, but kind + file_path cover the vast majority of disambiguation needs in practice, and a community-label filter requires an extra graph query per candidate. Natural v2 follow-up — the ranker and the tool schemas are designed so that bolting on community signal later only touches scoreCandidate + one schema field.

Tests

Added / extended in gitnexus/test/unit/calltool-dispatch.test.ts (6 changes, all use the existing executeParameterized mock pattern):

  1. extend context tool returns disambiguation for multiple matches — asserts every candidate now carries a numeric score ∈ [0,1] and the list is sorted descending.
  2. context tool ranks file_path match higher than non-match — confident single-result promotion kicks in when file_path: 'App.tsx' narrows two Functions to one.
  3. context tool returns ranked candidates when file_path only partially narrows — asserts the exact computed score (0.56) for tied base-case candidates.
  4. context tool boosts the candidate whose kind matches the hint — Function + Method sharing a name; kind: 'Function' bubbles the Function to top.
  5. impact tool returns ambiguous shape with ranked candidates when target has multiple matches — the core new behaviour.
  6. impact tool resolves via target_uid without running the name-based resolver — inspects executeParameterized.mock.calls to assert no WHERE n.name = \$symName query fires.

Test plan

  • npx vitest run test/unit/calltool-dispatch.test.ts64 passed
  • npx vitest run test/integration/java-class-impact.test.tspass (impact/context return empty results for Java classes — edges target Constructor/File nodes, not Class nodes #480 regression guard green)
  • npx vitest run test/unit/eval-formatters.test.tspass (backward-compat shape)
  • npm run test:unit3642 passed, 4 pre-existing env failures unchanged (skip-git-cli.test.ts needs built dist/; git-utils.test.ts Windows-tmpdir issue — both fail identically on main)
  • npx tsc --noEmit — clean
  • Pre-commit hook (lint-staged + eslint + prettier + tsc) — passed

…ind/file_path hints

The `context` MCP tool already returned `{ status: 'ambiguous', candidates }`
when a name hit multiple symbols, but the candidates were returned in
arbitrary DB order and the only hint it accepted was file_path. The
`impact` tool was worse: when its name resolver found multiple viable
matches it silently picked the first one from a priority UNION, with no
signal back to the caller that a different symbol might have been
intended.

Both failure modes were flagged in issue abhigyanpatwari#470 and reconfirmed in the
comments by a second user who described impact as returning "incorrect
parsing results and meaningless tool calls" in the multi-match case.

Changes:

* Add `resolveSymbolCandidates(repo, query, hints)` private helper on
  LocalBackend. Single place that:
   - Short-circuits on direct uid (zero-ambiguity)
   - Runs the same name-or-qualified-id match as before, with LIMIT 20
     (was 10) so the ranker has headroom instead of arbitrary truncation
   - Preserves the abhigyanpatwari#480 Class/Constructor preference -- when the only
     ambiguity is a Class and its own Constructor, the Class wins
     silently
   - Scores each candidate (pure TS, no extra DB round-trip): base 0.50,
     +0.40 for file_path match, +0.20 for kind match, plus a small
     kind-priority tiebreaker (Class > Interface > Function > Method >
     Constructor) when no explicit kind hint is given
   - Sorts desc by score with stable tiebreakers (shorter filePath,
     then lex uid)
   - Promotes to a single confident resolve when the top score is
     >= 0.95 AND beats the runner-up by >= 0.10 -- lets a strong hint
     cut through without forcing the caller through a disambiguation
     round-trip

* Rewire `context()` to use the shared helper. Response shape is a
  strict superset of today's: candidates gain a `score` field, the
  existing `{ uid, name, kind, filePath, line }` keys are preserved so
  every downstream consumer (rename, eval-server formatter, etc.) keeps
  working. New `kind` input hint accepted.

* Rewire `impact()` to use the shared helper. Now emits the same
  `{ status: 'ambiguous', candidates, impactedCount: 0, risk: 'UNKNOWN' }`
  shape instead of silent first-pick. New inputs accepted:
  `target_uid`, `file_path`, `kind`.

* Update tool schemas in mcp/tools.ts to advertise the new inputs and
  describe ranked disambiguation.

Backward compatibility:

The abhigyanpatwari#480 Class/Constructor collapse is preserved and covered by the
existing java-class-impact integration test (still green). The
ambiguous response shape is a strict superset -- `eval-formatters`
unit test that parses the old shape is unchanged and still passes.
`impact` going from silent-first-pick to structured ambiguous is a
semantic improvement that is the entire point of the issue; callers
relying on silent first-pick now get an actionable response.

Scope declined for v1:

module/community hint -- the issue lists it as one of several hints,
but kind + file_path cover the vast majority of disambiguation needs
in practice, and a community-label filter requires an extra graph
query per candidate. Natural v2 follow-up.

Tests: calltool-dispatch.test.ts gains 5 new cases covering file_path
boost, kind hint boost, impact ambiguous shape, impact target_uid
short-circuit, and score field presence on the existing ambiguous
test. Plus the extended assertions on the existing
`context tool returns disambiguation for multiple matches`.

Verification:
  npx vitest run test/unit/calltool-dispatch.test.ts       -> 64 pass
  npx vitest run test/integration/java-class-impact.test.ts -> pass
  npm run test:unit                                         -> 3642 pass
    (4 pre-existing env failures unchanged: skip-git-cli needs built
    dist/, git-utils tmpdir on Windows worktree -- same on main)
  npx tsc --noEmit                                          -> clean

Closes abhigyanpatwari#470
@vercel

vercel Bot commented Apr 16, 2026

Copy link
Copy Markdown

@azizur100389 is attempting to deploy a commit to the NexusCore Team on Vercel.

A member of the Team first needs to authorize it.

@magyargergo

Copy link
Copy Markdown
Collaborator

@claude Act as a senior reviewer for GitNexus. Your job is to determine whether this PR is production-ready for this repo, not to give a generic code review.

You are reviewing a PR in the GitNexus monorepo:

  • gitnexus/ → CLI + MCP
  • gitnexus-web/
  • gitnexus-shared/

Your task has 2 phases, in this exact order:

PHASE 1 — DEFINE THE BAR
Before reviewing the diff, establish a concise repo-specific definition of “production-ready” for GitNexus, based only on the repo docs and the affected area.
Keep this definition practical and reviewable. Do not invent standards that are not grounded in the repo.

PHASE 2 — REVIEW THE PR AGAINST THAT BAR
Review the actual diff only after defining the bar.
Stay tightly scoped to the changed code and its direct consequences.


CONTEXT TO LOAD FIRST
Read these before reviewing:

  • AGENTS.md
  • GUARDRAILS.md
  • CONTRIBUTING.md
  • TESTING.md
  • ARCHITECTURE.md

Additional context:


PRIMARY OBJECTIVE
Decide whether this PR is safe, correct, maintainable, and operationally acceptable to merge into production for GitNexus.

Do not optimize for completeness at the expense of signal.
Do not pad the review.
Do not propose unrelated refactors.
Do not restate the PR description unless needed for verification.


REVIEW RULES

  • Every finding must be grounded in specific evidence from the diff or directly relevant surrounding code.
  • Every finding must include path:line.
  • If you make a behavioral claim, cite the code that proves it.
  • If you make a performance claim, explain the mechanism.
  • If something cannot be verified from the diff alone, explicitly say so.
  • Distinguish clearly between:
    • verified issue
    • plausible risk
    • unverified concern
  • Avoid vague wording like “might be better” or “could be improved” unless you explain exactly why.
  • Keep the review focused on this PR’s scope only.

For each finding, assign one severity:

  • BLOCKING → must be fixed before merge
  • NON-BLOCKING → valid issue, but merge may still be acceptable
  • NIT → stylistic/minor, not merge-relevant

REPO-SPECIFIC REVIEW CHECKLIST
Use these exact headings.

1. Correctness & functional completeness

Check:

  • Does the implementation actually satisfy the PR claim?
    • ManifestExtractor is truly invoked
    • config.links produces non-zero cross-links where expected
  • Resolver contracts are preserved:
    • resolveSymbol remains exact-match
    • label-scoped Cypher remains correct per contract type
    • flag any regression toward fuzzy or unscoped matching
  • Graph schema integrity is preserved:
    • no silent changes to node labels
    • no silent changes to edge types
    • no silent changes to ID generation (generateId)
  • Call out any missing wiring, partial integration, dead branch, or mismatch between tests and runtime behavior

2. Code clarity & clean code

Check:

  • naming quality
  • local cohesion
  • dead code
  • unnecessary abstraction
  • hidden control flow
  • confusing indirection
  • adherence to repo conventions:
    • direct imports from gitnexus-shared
    • no barrel re-export regressions
    • no // removed comments
    • no unused re-exports
  • no drive-by refactors outside stated scope per CONTRIBUTING.md and GUARDRAILS.md § Scope

3. Test coverage & change safety

Evaluate against TESTING.md:

  • Are there unit tests under gitnexus/test/unit/ covering the newly wired path?
  • Is there a regression guard for 0-link → N-link behavior?
  • Are assertions meaningful rather than tautological?
  • Are fixtures realistic for manifest inputs?
  • If memoization/cache was introduced, is there a test proving hit/miss behavior and correctness?
  • Is there evidence the expected validation path would pass for staged gitnexus/ files?
    • tsc --noEmit
    • vitest run --project default
      If not verifiable, say exactly what is missing.

4. Performance

Inspect for:

  • hot-path overhead in ingestion/group sync
  • excess allocations per manifest entry
  • redundant Cypher round-trips
  • missed batching or missed parallelism (Promise.all) where it materially matters
  • O(n²) or repeated lookup patterns on large repos
  • memoization tradeoffs:
    • correctness
    • invalidation
    • bounded vs unbounded memory growth
      Do not speculate casually; explain the mechanism and likely impact.

5. Operational risk

Check:

  • Windows/cross-platform safety:
    • stream lifecycle
    • FD/file handle lifecycle
    • path separator assumptions
    • anything resembling prior ENOTEMPTY-style lifecycle regressions
  • LadybugDB single-writer invariant is preserved
  • Embeddings preservation:
    • no silent breakage of --embeddings
    • .gitnexus/meta.json.stats.embeddings not silently zeroed by changed paths
  • MCP contracts remain compatible:
    • group_*
    • query
    • context
    • impact
    • detect_changes
    • rename
    • cypher
      Flag any schema or contract break without migration note
  • staleness behavior still triggers correctly (gitnexus/src/mcp/staleness.ts)
  • rollback safety:
    • can this PR be reverted safely without re-analyze?
    • if not, explain why

6. Maintainability

Check:

  • Does the change respect Pipeline Phase DAG boundaries?
  • Is cross-phase coupling being introduced?
  • Are manifest-extractor entry points discoverable from group/sync.ts?
  • Would the next contributor know where to extend this?
  • Is shared logic placed appropriately:
    • in gitnexus-shared/
    • or gitnexus/src/core/**/utils.ts
    • and not duplicated between CLI and web bridge (server/api.ts)?

NON-NEGOTIABLES TO VERIFY
From GUARDRAILS.md / CONTRIBUTING.md:

  • no secrets, tokens, or machine-specific paths committed
  • no blind find-and-replace renames
  • if a symbol rename occurred, rename MCP tool or equivalent careful review was used
  • upstream impact was considered for modified shared symbols
  • no HIGH/CRITICAL impact left unaddressed
  • PR title follows conventional commit style + [area] Short description
  • PolyForm Noncommercial license headers/metadata untouched

OUTPUT FORMAT

Production-ready bar for this PR

Give a concise repo-specific definition in 5–10 bullets.

Findings

For each finding, use this structure exactly:

[BLOCKING|NON-BLOCKING|NIT] Short title

  • Evidence: path:line, path:line
  • Why it matters: ...
  • Recommendation: ...
  • Confidence: High / Medium / Low

Only include findings that are real and supported.
If there are no findings in a section, say:

  • No issues found in this area based on the diff and reviewed context.

What I could not fully verify

List any behavior you could not prove from the diff, tests, or available context.

Final verdict

Choose exactly one:

  • ✅ Ready to merge — no blocking issues.
  • ⚠️ Merge after addressing blocking items.
  • ❌ Not ready — fundamental concerns.

If verdict is not ready, list the blocking items in priority order.


IMPORTANT REVIEW BEHAVIOR

  • Be strict on correctness and contracts.
  • Be conservative about operational risk.
  • Be explicit about uncertainty.
  • Do not suggest unrelated refactors.
  • Do not turn nits into blockers.
  • Do not reward intent; review actual implementation.

@github-actions

github-actions Bot commented Apr 16, 2026

Copy link
Copy Markdown
Contributor

Claude finished @magyargergo's task in 5m 43s —— View job


Production-ready bar for this PR

  1. MCP contract backward-compatibility: context response is a strict superset (new score field only); impact behavioral change (silent-pick → structured ambiguous) is intentional and documented.
  2. Resolver contracts preserved: no fuzzy fallback introduced; resolveSymbol remains exact-match via WHERE n.name = $symName; #480 Class/Constructor collapse preserved.
  3. Graph schema integrity: no node label, edge type, or generateId changes.
  4. Unit test coverage: new behaviors in resolveSymbolCandidates covered by tests under gitnexus/test/unit/; java-class-impact integration regression guard remains green.
  5. TypeScript clean: tsc --noEmit passes (confirmed in PR description; pre-commit hook verified by author).
  6. No secrets or machine-specific paths committed.
  7. PR title follows conventional-commit format (feat(mcp): ...).
  8. MIGRATION.md entry required for any behavioral contract break on MCP tools (per repo pattern in MIGRATION.md).
  9. Drive-by refactors excluded: change scoped only to resolver, context, impact, schema, and associated tests.

Findings

1. Correctness & functional completeness

No blocking issues. Behavior matches claims:

  • Direct UID short-circuit is correctly wired at local-backend.ts:1166–1187; the test at line 383 proves no WHERE n.name = $symName fires when target_uid is set.
  • #480 Class/Constructor collapse preserved at local-backend.ts:1230–1253; only triggers when !hints.kind, so an explicit kind: 'Constructor' still surfaces the Constructor via the scoring path.
  • rename at line 1845 forwards status === 'ambiguous' unchanged — contract preserved.
  • Score cap and stable tiebreak (shorter filePath → lex uid) verified at local-backend.ts:1268–1274.

One subtle behavioral note (see §3 tests for the testing angle): when file_path hint is given, the Cypher WHERE clause pre-filters candidates (AND n.filePath CONTAINS $filePath) before scoreCandidate runs. In production, all candidates returned from the DB already match the file_path hint and therefore all receive the +0.40 boost uniformly — the file_path component of scoring provides no relative differentiation in that path. Ranking among pre-filtered candidates is then driven entirely by kind hint (±0.20) and the no-kind priority bonus. This is not incorrect, but it is a subtlety the PR description's score breakdown doesn't call out explicitly.


2. Code clarity & clean code

No issues found in this area based on the diff and reviewed context.

scoreCandidate is deliberately minimal and documented as intentionally simple (local-backend.ts:1084–1087). No dead code, no barrel re-exports, no // removed comments, no drive-by refactors. Naming (resolveSymbolCandidates, scoreCandidate, hints) is clean.

NIT — redundant guard at local-backend.ts:1279

  • Evidence: local-backend.ts:1255 returns early when normalized.length === 1; therefore at line 1279 scored.length is always ≥ 2.
  • Why it matters: the scored.length >= 2 check is dead — the condition can never be false by the time execution reaches it.
  • Recommendation: Remove scored.length >= 2 && or add a comment explaining why it's a defensive guard.
  • Confidence: High

3. Test coverage & change safety

NON-BLOCKING — Test mechanism diverges from production code path for file_path hint

  • Evidence: calltool-dispatch.test.ts:262–290; local-backend.ts:1194–1201, 1255, 1279
  • Why it matters: The test "context tool ranks file_path match higher than non-match" works because executeParameterized mock returns both candidates regardless of the WHERE clause parameters. In production, the query WHERE n.name = $symName AND n.filePath CONTAINS $filePath would return only one row (src/App.tsx), which resolves via normalized.length === 1 at line 1255 — not via the confident-promotion scoring path (lines 1279–1281). The test comment at line 285–288 describes the scoring path (0.96 ≥ 0.95 threshold) as the mechanism, but in production this scenario never reaches the scorer. This means the confident-promotion path for file_path is exercised in tests only (via mock behavior), not in real queries. If a LadybugDB CONTAINS edge-case returns non-matching rows (e.g., case-mismatch on a case-sensitive engine), the scoring would be the safety net — and that path is untested end-to-end.
  • Recommendation: Add a separate unit test explicitly for the confident-promotion path where two candidates both satisfy the file_path hint (e.g., src/App.tsx and src/OtherApp.tsx for hint App.tsx) and one also matches kind — so promotion is determined by scoring, not DB pre-filtering. Also fix the comment at line 285–288.
  • Confidence: High

The rest of the test changes are well-structured:

  • Score arithmetic for tied candidates (0.56) is verified at line 316 — correct: 0.50 + 3×0.02 = 0.56.
  • Kind-match boost test (lines 320–345) correctly shows promotion threshold not reached (0.70 < 0.95) → ambiguous with correct ordering.
  • target_uid bypass is verified by inspecting Cypher call signatures (line 411–416) — that's a high-quality assertion.
  • eval-formatters.test.ts and java-class-impact.test.ts cited as passing guards full backward-compat and impact/context return empty results for Java classes — edges target Constructor/File nodes, not Class nodes #480 regression.

4. Performance

No issues found in this area based on the diff and reviewed context.

LIMIT raised from 10 to 20 (local-backend.ts:1207) on a MATCH (n) WHERE n.name = $symName scan — this is the same unscoped scan as before, just with more headroom for the ranker. The scoreCandidate function is O(1) pure TypeScript with no DB round-trip, so 20 candidates remain well within acceptable cost. The #480 collapse path adds at most 2 serial Cypher queries (Class → Interface) only when needed; no regression vs. prior behavior.


5. Operational risk

NON-BLOCKING — impact behavioral contract change not documented in MIGRATION.md

  • Evidence: local-backend.ts:2131–2148; MIGRATION.md (has entries for other contract breaks, e.g., OVERRIDES → METHOD_OVERRIDES)
  • Why it matters: impact going from "silently picks first match" to "returns { status: 'ambiguous', candidates, impactedCount: 0, risk: 'UNKNOWN' } with no byDepth / summary / affected_processes" is a behavioral contract break. Existing agents or callers that access result.byDepth or result.summary without first checking result.status will get undefined. CONTRIBUTING.md requires: "Documentation updated if behavior or public CLI/MCP contract changes." MIGRATION.md is the established pattern for this.
  • Recommendation: Add a short entry to MIGRATION.md (or a dedicated CHANGELOG note) describing the change: that impact may now return { status: 'ambiguous' } and callers should check status before accessing byDepth/summary. The tools.ts description is already updated (line 273) but that's only the MCP schema — tool callers via CLI or HTTP bridge also need a migration pointer.
  • Confidence: High

Other operational risk areas:

  • LadybugDB single-writer invariant: unaffected (read-only queries only).
  • Embeddings preservation: resolveSymbolCandidates is query-only; no path touches meta.json or the embeddings table.
  • Rollback safety: revert is safe — the resolver is self-contained in LocalBackend. No schema migration, no stored data changes.
  • Windows/cross-platform: no new stream, FD, or path-separator logic introduced.
  • Staleness: staleness.ts untouched; no impact.

6. Maintainability

No issues found in this area based on the diff and reviewed context.

resolveSymbolCandidates is cleanly placed on LocalBackend (the correct boundary — MCP query layer, not ingestion). The ranker is deliberately extensible: the PR comment at line 1084–1087 explicitly names BM25/embedding as v2 additions to scoreCandidate. Pipeline Phase DAG boundaries are unaffected. No shared-symbol duplication between CLI and web bridge.


Non-negotiables

  • ✅ No secrets or machine-specific paths.
  • ✅ No blind find-and-replace rename.
  • ✅ PR title follows conventional-commit format.
  • ✅ No HIGH/CRITICAL impact unaddressed (this PR is itself the fix for an impact-analysis correctness issue).
  • ✅ PolyForm Noncommercial license headers untouched.
  • ⚠️ MIGRATION.md entry missing for impact behavioral change (NON-BLOCKING per above).

What I could not fully verify

  1. LadybugDB CONTAINS case-sensitivity in production: The Cypher pre-filter uses n.filePath CONTAINS $filePath (unknown case sensitivity for this DB); scoreCandidate uses .toLowerCase().includes() (case-insensitive). If the DB's CONTAINS is case-sensitive, a hint like app.tsx would miss App.tsx at the DB level even though the scorer would have matched it. Cannot verify from diff alone.
  2. eval-formatters.test.ts and java-class-impact.test.ts test results: PR author reports all green; cannot run CI locally in this context.
  3. npm run test:unit — 3642 passed cited by author with 4 pre-existing env failures. Cannot independently verify the count is stable after this change.

Final verdict

⚠️ Merge after addressing blocking items.

There are no BLOCKING correctness issues. The two NON-BLOCKING items are worth addressing before merge:

  1. (Higher priority) Add a MIGRATION.md entry documenting the impact behavioral contract change (ambiguous response shape, missing byDepth). This is a low-effort fix with high documentation value for downstream callers.

  2. (Lower priority) Add a test that exercises the confident-promotion path when two candidates both match the file_path hint (so resolution is determined by scoring, not DB pre-filtering). The current test for this scenario works via mock divergence from production behavior.

The NIT (redundant scored.length >= 2 guard) can be addressed at the author's discretion.

@github-actions

github-actions Bot commented Apr 16, 2026

Copy link
Copy Markdown
Contributor

CI Report

All checks passed

Pipeline Status

Stage Status Details
✅ Typecheck success tsc --noEmit
✅ Tests success unit tests, 3 platforms
✅ E2E success gitnexus-web changes only

Test Results

Tests Passed Failed Skipped Duration
6438 6341 0 97 249s

✅ All 6341 tests passed

97 test(s) skipped — expand for details
  • Swift MethodExtractor > isTypeDeclaration > recognizes class_declaration
  • Swift MethodExtractor > isTypeDeclaration > recognizes protocol_declaration
  • Swift MethodExtractor > isTypeDeclaration > rejects import_declaration
  • Swift MethodExtractor > visibility > extracts public method
  • Swift MethodExtractor > visibility > extracts private method
  • Swift MethodExtractor > visibility > defaults to internal when no modifier
  • Swift MethodExtractor > protocol methods > marks protocol method as abstract
  • Swift MethodExtractor > static and class methods > detects static func as isStatic
  • Swift MethodExtractor > static and class methods > detects class func as isStatic
  • Swift MethodExtractor > parameters > extracts parameters with types and default values
  • Swift MethodExtractor > return type > extracts return type from -> annotation
  • Swift MethodExtractor > annotations > extracts @objc attribute
  • Swift MethodExtractor > isFinal > detects final func
  • Swift MethodExtractor > isFinal > is false when not final
  • Swift MethodExtractor > isAsync > detects async func
  • Swift MethodExtractor > isOverride > detects override method
  • buildTypeEnv > constructor inference (Tier 1 fallback) > lookupClassByName regression coverage > Swift lookupClassByName regression coverage > Swift cross-file constructor inference uses lookupClassByName
  • buildTypeEnv > constructor inference (Tier 1 fallback) > lookupClassByName regression coverage > Swift lookupClassByName regression coverage > Swift explicit init inference uses lookupClassByName
  • buildTypeEnv > constructor inference (Tier 1 fallback) > lookupClassByName regression coverage > Swift lookupClassByName regression coverage > Swift cross-file constructor inference does not bind plain functions
  • buildTypeEnv > known limitations (documented skip tests) > Ruby block parameter: users.each { |user| } — closure param inference, different feature
  • Swift constructor-inferred type resolution > detects User and Repo classes, both with save methods
  • Swift constructor-inferred type resolution > resolves user.save() to Models/User.swift via constructor-inferred type
  • Swift constructor-inferred type resolution > resolves repo.save() to Models/Repo.swift via constructor-inferred type
  • Swift constructor-inferred type resolution > emits exactly 2 save() CALLS edges (one per receiver type)
  • Swift self resolution > detects User and Repo classes, each with a save function
  • Swift self resolution > resolves self.save() inside User.process to User.save, not Repo.save
  • Swift parent resolution > detects BaseModel and User classes plus Serializable protocol
  • Swift parent resolution > emits EXTENDS edge: User → BaseModel
  • Swift parent resolution > emits IMPLEMENTS edge: User → Serializable (protocol conformance)
  • Swift cross-file User.init() inference > resolves user.save() via User.init(name:) inference
  • Swift cross-file User.init() inference > resolves user.greet() via User.init(name:) inference
  • Swift return type inference > detects User class and getUser function
  • Swift return type inference > detects save function on User (Swift class methods are Function nodes)
  • Swift return type inference > resolves user.save() to User#save via return type of getUser() -> User
  • Swift return-type inference via function return type > resolves user.save() to User#save via return type of getUser()
  • Swift return-type inference via function return type > user.save() does NOT resolve to Repo#save
  • Swift return-type inference via function return type > resolves repo.save() to Repo#save via return type of getRepo()
  • Swift implicit imports (cross-file visibility) > detects UserService class in Models.swift
  • Swift implicit imports (cross-file visibility) > resolves UserService() constructor call across files (no explicit import)
  • Swift implicit imports (cross-file visibility) > resolves service.fetchUser() member call across files
  • Swift implicit imports (cross-file visibility) > creates IMPORTS edges between files in the same module
  • Swift extension deduplication > detects Product class
  • Swift extension deduplication > resolves Product() constructor despite extension creating duplicate class node
  • Swift extension deduplication > resolves product.save() to Product.swift (primary definition)
  • Swift constructor call fallback (no new keyword) > resolves OCRService() as constructor call across files
  • Swift constructor call fallback (no new keyword) > resolves ocr.recognize() member call via constructor-inferred type
  • Swift export visibility (internal vs private) > resolves PublicService() constructor across files
  • Swift export visibility (internal vs private) > resolves internalHelper() across files (internal = module-scoped)
  • Swift if let / guard let binding resolution > detects User and Repo classes
  • Swift if let / guard let binding resolution > resolves user.save() inside if-let to User#save
  • Swift if let / guard let binding resolution > resolves repo.save() inside guard-let to Repo#save
  • Swift if let / guard let binding resolution > user.save() in if-let does NOT resolve to Repo#save
  • Swift await / try expression unwrapping > resolves user.save() via await fetchUser() return type
  • Swift await / try expression unwrapping > resolves repo.save() via try parseRepo() return type
  • Swift await / try expression unwrapping > detects fetchUser and parseRepo as functions
  • Swift for-in loop element type inference > detects User and Repo classes
  • Swift for-in loop element type inference > creates implicit import edges between files
  • Swift field-type resolution > detects classes and their properties
  • Swift field-type resolution > emits HAS_PROPERTY edges from class to field
  • Swift field-type resolution > resolves field-chain call user.address.save() → Address#save
  • Swift field-type resolution > emits ACCESSES edges for field reads in chains
  • Swift field-type resolution > populates field metadata (visibility, declaredType) on Property nodes
  • Swift call-result binding > resolves call-result-bound method call user.save() → User#save
  • Swift call-result binding > getUser() is present as a defined function
  • Swift call-result binding > emits processUser -> getUser CALLS edge for let-assigned free function call
  • Swift method enrichment > detects Animal protocol and Dog class
  • Swift method enrichment > emits IMPLEMENTS edge Dog -> Animal
  • Swift method enrichment > emits HAS_METHOD edges for Dog methods
  • Swift method enrichment > marks protocol Animal.speak as isAbstract
  • Swift method enrichment > marks Dog.speak as NOT isAbstract
  • Swift method enrichment > marks breathe as isFinal
  • Swift method enrichment > marks classify as isStatic
  • Swift method enrichment > captures @objc annotation on breathe
  • Swift method enrichment > populates parameterTypes for classify(_ name: String)
  • Swift method enrichment > records parameterCount for classify
  • Swift method enrichment > records returnType for speak
  • Swift method enrichment > resolves dog.speak() CALLS edge
  • Swift method enrichment > resolves Dog.classify("dog") CALLS edge
  • Swift abstract dispatch > detects Repository protocol and SqlRepository class
  • Swift abstract dispatch > emits IMPLEMENTS edge SqlRepository -> Repository
  • Swift abstract dispatch > emits HAS_METHOD edges for Repository.find and Repository.save
  • Swift abstract dispatch > emits HAS_METHOD edges for SqlRepository.find and SqlRepository.save
  • Swift abstract dispatch > marks base Repository.find as isAbstract
  • Swift abstract dispatch > marks base Repository.save as isAbstract
  • Swift abstract dispatch > marks concrete SqlRepository.find as NOT isAbstract
  • Swift abstract dispatch > resolves repo.find(id: 42) CALLS edge
  • Swift abstract dispatch > resolves repo.save(entity: user) CALLS edge
  • Swift abstract dispatch > populates parameterTypes for Repository.find
  • Swift abstract dispatch > populates parameterTypes for Repository.save
  • Swift abstract dispatch > records returnType for SqlRepository.find
  • Swift abstract dispatch > emits METHOD_IMPLEMENTS edges from SqlRepository methods → Repository protocol methods
  • Swift overloaded method disambiguation > detects 2 distinct find Method nodes on SqlRepository
  • Swift overloaded method disambiguation > emits METHOD_IMPLEMENTS edges for both find overloads
  • Swift overloaded method disambiguation > emits METHOD_IMPLEMENTS edge for save
  • Swift overloaded method disambiguation > emits exactly 3 METHOD_IMPLEMENTS edges total
  • Swift Child extends Parent — inherited method resolution (SM-9) > detects Parent and Child classes
  • Swift Child extends Parent — inherited method resolution (SM-9) > resolves c.parentMethod() to Parent.parentMethod via first-wins MRO walk

Code Coverage

Tests

Metric Coverage Covered Base Delta Status
Statements 72.57% 17233/23746 73% 📉 -0.4 🔴 ██████████████░░░░░░
Branches 61.46% 10961/17834 61.97% 📉 -0.5 🔴 ████████████░░░░░░░░
Functions 77.54% 1644/2120 77.65% 📉 -0.1 🔴 ███████████████░░░░░
Lines 75.06% 15633/20826 75.58% 📉 -0.5 🔴 ███████████████░░░░░

📋 View full run · Generated by CI

…s review findings

CI on PR abhigyanpatwari#888 caught 13 integration-test failures I did not cover locally:
my resolver refactor collected candidates via `labels(n)[0] AS type`, but
LadybugDB returns an empty string for that projection on certain node
types (most importantly Class). With an empty `type`, impact's downstream
`_runImpactBFS` no longer recognised `symType === 'Class' | 'Interface'`
and stopped seeding Constructor + File nodes into the frontier, so the
"impact(upstream) surfaces the file importer" assertion broke across 11
language fixtures plus 2 OVERRIDES filter tests.

The original impact resolver worked around this by running a prioritised
UNION across Class/Interface/Function/Method/Constructor and picking the
first hit. My refactor dropped that. Fix: keep the simple candidate MATCH
but enrich types afterward via a single scoped UNION query, so every
candidate carries an accurate label for both scoring and downstream
BFS seeding. The UID direct-lookup path is patched the same way.

Also addresses the findings from the senior reviewer on PR abhigyanpatwari#888:

* MIGRATION.md: document the `impact` behavioural change (silent first-
  pick → structured `{ status: 'ambiguous', candidates }`) so downstream
  callers know to branch on `result.status` before reading byDepth/
  summary. `context` is unchanged shape-wise (strict superset).

* New test: `context tool promotes top candidate via scoring when
  multiple rows survive DB pre-filter`. The review flagged that the
  existing file_path test works only because the mock ignores WHERE
  parameters -- the scored-promotion path (top ≥ 0.95 AND gap > 0.09)
  wasn't directly exercised. The new test uses two candidates both in
  App.tsx-containing paths plus a kind hint so promotion is decided by
  scoring, not DB pre-filtering. Also tightened the comment on the
  earlier file_path test to describe the mock vs production divergence
  honestly.

* NIT: added a paragraph explaining why `scored.length >= 2` is kept as
  a defensive guard even though the `normalized.length === 1` early
  return already covers the single-candidate path.

* Integration: two tests in `local-backend-calltool.test.ts` targeted
  `'authenticate'`, which now correctly resolves as ambiguous (two
  Method nodes: AuthService.authenticate and BaseService.authenticate).
  Updated both to pass `file_path: 'src/auth.ts'` so they exercise the
  new disambiguation API and still assert the METHOD_OVERRIDES filtering
  they were originally about.

Edge case fix in the promotion gap check: IEEE754 makes 0.50 + 0.40 +
0.20 - 0.90 = 0.09999999999999998 instead of exactly 0.10, which would
otherwise break the "winner clearly dominates" intent for legitimate
1.00 vs 0.90 cases. Changed `>= 0.10` to `> 0.09`; same user-facing
intent, no floating-point sensitivity.

Verification (all from gitnexus/):
  npx vitest run test/integration/class-impact-all-languages.test.ts
    -> 52 pass (was 11 FAIL on CI before this fix)
  npx vitest run test/integration/local-backend-calltool.test.ts
    -> 18 pass (was 2 FAIL on CI before this fix)
  npx vitest run test/integration/java-class-impact.test.ts
    -> 10 pass (regression guard for abhigyanpatwari#480 preserved)
  npx vitest run test/unit/calltool-dispatch.test.ts
    -> 65 pass (1 new test + 4 from original abhigyanpatwari#470 PR)
  npm run test:unit
    -> 3626 pass, 4 pre-existing env failures unchanged
  npx tsc --noEmit
    -> clean
@azizur100389

Copy link
Copy Markdown
Contributor Author

Thanks for the review, and thanks for catching the CI failure — that one was on me. Addressed everything in a12a7d2.

Root cause of the 13 CI failures

The refactor collected candidates via labels(n)[0] AS type, but LadybugDB returns an empty string for that projection on certain node types (most importantly Class). With type === '', _runImpactBFS could no longer recognise symType === 'Class' | 'Interface' and stopped seeding Constructor + File nodes into the frontier — which is precisely why all 11 class-impact-all-languages assertions (and the 2 OVERRIDES filter assertions) failed with the same shape: expected [ 'loginUser' ] to include 'app.ts' etc. The original impact resolver worked around this with a prioritised UNION across Class/Interface/Function/Method/Constructor and picked the first hit; my refactor dropped that.

Fix

Kept the simple candidate MATCH but enrich types afterwards via a single scoped UNION query, so every candidate carries an accurate label for both the kind-priority scoring bonus and downstream BFS seeding. One extra DB round-trip only when needed — no change when the labels come back populated. UID direct-lookup path patched the same way.

await this.enrichCandidateLabels(repo, normalized);
// ...
private async enrichCandidateLabels(repo, candidates) {
  const ids = candidates.filter(c => c.type === '' && c.id).map(c => c.id);
  if (ids.length === 0) return;
  // UNION across Class/Interface/Function/Method/Constructor, patch in-place
}

Also updated the 2 local-backend-calltool integration tests that targeted 'authenticate' — that name now correctly resolves as ambiguous (fixture has both AuthService.authenticate and BaseService.authenticate). Both tests gain file_path: 'src/auth.ts' to exercise the new disambiguation API while still asserting the METHOD_OVERRIDES filtering they were originally testing.

Review findings

1. NON-BLOCKING — MIGRATION.md entry missing → addressed

New entry at the top of MIGRATION.md describing the impact behavioural change (silent first-pick → { status: 'ambiguous', candidates, ... }), the response shape with example JSON, and guidance for callers to branch on result.status before reading byDepth / summary. context's ambiguous response is called out as a strict superset (no migration needed).

2. NON-BLOCKING — test mechanism diverges from production for file_pathaddressed

New test context tool promotes top candidate via scoring when multiple rows survive DB pre-filter in calltool-dispatch.test.ts — uses two candidates both in App.tsx-containing paths plus kind: 'Function' so promotion is decided purely by scoring (1.00 vs 0.90), not by DB pre-filtering. Also tightened the comment on the earlier file_path test to describe the mock-vs-production divergence honestly.

3. NIT — scored.length >= 2 is dead by construction → addressed with a comment

Added an inline comment explaining that the guard is intentional defensive programming: the normalized.length === 1 early return above already handles single-candidate paths, but keeping scored.length >= 2 means future refactors to the upstream early-return logic can't accidentally index scored[1] out of bounds.

One subtle fix the review didn't catch but the new test surfaced

The promotion-gap threshold was >= 0.10, which fails on IEEE754 arithmetic: 0.50 + 0.40 + 0.20 - 0.90 = 0.09999999999999998, not exactly 0.10. Changed to > 0.09 — same user-facing intent (a clearly-dominant winner), no floating-point sensitivity.

Verification (from gitnexus/)

Test Before (CI) After (local)
test/integration/class-impact-all-languages.test.ts 11 × FAIL 52 pass
test/integration/local-backend-calltool.test.ts 2 × FAIL 18 pass
test/integration/java-class-impact.test.ts (#480 regression guard) 10 pass
test/unit/calltool-dispatch.test.ts 65 pass (1 new + 4 original #470)
npm run test:unit 3626 pass, 4 pre-existing env failures unchanged
npx tsc --noEmit clean
Pre-commit hook (lint-staged + prettier + tsc) clean

@azizur100389

Copy link
Copy Markdown
Contributor Author

Quick ping — CI's been green on all 3 platforms since a12a7d2 landed on Apr 16, and every finding from the first @claude review is addressed in the same commit:

  • MIGRATION.md entry documenting the impact ambiguous-response shape
  • New test for the scoring-driven confident-promotion path (not just the mock-divergent DB pre-filter case)
  • NIT — inline comment explaining the defensive scored.length >= 2 guard

Happy to re-request @claude for a fresh verdict if that helps unblock — just let me know. Otherwise no rush, I'll keep an eye on it.

@magyargergo

Copy link
Copy Markdown
Collaborator

@claude Act as a senior reviewer for GitNexus. Your job is to determine whether this PR is production-ready for this repo, not to give a generic code review.

You are reviewing a PR in the GitNexus monorepo:

  • gitnexus/ → CLI + MCP
  • gitnexus-web/
  • gitnexus-shared/

Your task has 2 phases, in this exact order:

PHASE 1 — DEFINE THE BAR
Before reviewing the diff, establish a concise repo-specific definition of “production-ready” for GitNexus, based only on the repo docs and the affected area.
Keep this definition practical and reviewable. Do not invent standards that are not grounded in the repo.

PHASE 2 — REVIEW THE PR AGAINST THAT BAR
Review the actual diff only after defining the bar.
Stay tightly scoped to the changed code and its direct consequences.


CONTEXT TO LOAD FIRST
Read these before reviewing:

  • AGENTS.md
  • GUARDRAILS.md
  • CONTRIBUTING.md
  • TESTING.md
  • ARCHITECTURE.md

Additional context:


PRIMARY OBJECTIVE
Decide whether this PR is safe, correct, maintainable, and operationally acceptable to merge into production for GitNexus.

Do not optimize for completeness at the expense of signal.
Do not pad the review.
Do not propose unrelated refactors.
Do not restate the PR description unless needed for verification.


REVIEW RULES

  • Every finding must be grounded in specific evidence from the diff or directly relevant surrounding code.
  • Every finding must include path:line.
  • If you make a behavioral claim, cite the code that proves it.
  • If you make a performance claim, explain the mechanism.
  • If something cannot be verified from the diff alone, explicitly say so.
  • Distinguish clearly between:
    • verified issue
    • plausible risk
    • unverified concern
  • Avoid vague wording like “might be better” or “could be improved” unless you explain exactly why.
  • Keep the review focused on this PR’s scope only.

For each finding, assign one severity:

  • BLOCKING → must be fixed before merge
  • NON-BLOCKING → valid issue, but merge may still be acceptable
  • NIT → stylistic/minor, not merge-relevant

REPO-SPECIFIC REVIEW CHECKLIST
Use these exact headings.

1. Correctness & functional completeness

Check:

  • Does the implementation actually satisfy the PR claim?
    • ManifestExtractor is truly invoked
    • config.links produces non-zero cross-links where expected
  • Resolver contracts are preserved:
    • resolveSymbol remains exact-match
    • label-scoped Cypher remains correct per contract type
    • flag any regression toward fuzzy or unscoped matching
  • Graph schema integrity is preserved:
    • no silent changes to node labels
    • no silent changes to edge types
    • no silent changes to ID generation (generateId)
  • Call out any missing wiring, partial integration, dead branch, or mismatch between tests and runtime behavior

2. Code clarity & clean code

Check:

  • naming quality
  • local cohesion
  • dead code
  • unnecessary abstraction
  • hidden control flow
  • confusing indirection
  • adherence to repo conventions:
    • direct imports from gitnexus-shared
    • no barrel re-export regressions
    • no // removed comments
    • no unused re-exports
  • no drive-by refactors outside stated scope per CONTRIBUTING.md and GUARDRAILS.md § Scope

3. Test coverage & change safety

Evaluate against TESTING.md:

  • Are there unit tests under gitnexus/test/unit/ covering the newly wired path?
  • Is there a regression guard for 0-link → N-link behavior?
  • Are assertions meaningful rather than tautological?
  • Are fixtures realistic for manifest inputs?
  • If memoization/cache was introduced, is there a test proving hit/miss behavior and correctness?
  • Is there evidence the expected validation path would pass for staged gitnexus/ files?
    • tsc --noEmit
    • vitest run --project default
      If not verifiable, say exactly what is missing.

4. Performance

Inspect for:

  • hot-path overhead in ingestion/group sync
  • excess allocations per manifest entry
  • redundant Cypher round-trips
  • missed batching or missed parallelism (Promise.all) where it materially matters
  • O(n²) or repeated lookup patterns on large repos
  • memoization tradeoffs:
    • correctness
    • invalidation
    • bounded vs unbounded memory growth
      Do not speculate casually; explain the mechanism and likely impact.

5. Operational risk

Check:

  • Windows/cross-platform safety:
    • stream lifecycle
    • FD/file handle lifecycle
    • path separator assumptions
    • anything resembling prior ENOTEMPTY-style lifecycle regressions
  • LadybugDB single-writer invariant is preserved
  • Embeddings preservation:
    • no silent breakage of --embeddings
    • .gitnexus/meta.json.stats.embeddings not silently zeroed by changed paths
  • MCP contracts remain compatible:
    • group_*
    • query
    • context
    • impact
    • detect_changes
    • rename
    • cypher
      Flag any schema or contract break without migration note
  • staleness behavior still triggers correctly (gitnexus/src/mcp/staleness.ts)
  • rollback safety:
    • can this PR be reverted safely without re-analyze?
    • if not, explain why

6. Maintainability

Check:

  • Does the change respect Pipeline Phase DAG boundaries?
  • Is cross-phase coupling being introduced?
  • Are manifest-extractor entry points discoverable from group/sync.ts?
  • Would the next contributor know where to extend this?
  • Is shared logic placed appropriately:
    • in gitnexus-shared/
    • or gitnexus/src/core/**/utils.ts
    • and not duplicated between CLI and web bridge (server/api.ts)?

NON-NEGOTIABLES TO VERIFY
From GUARDRAILS.md / CONTRIBUTING.md:

  • no secrets, tokens, or machine-specific paths committed
  • no blind find-and-replace renames
  • if a symbol rename occurred, rename MCP tool or equivalent careful review was used
  • upstream impact was considered for modified shared symbols
  • no HIGH/CRITICAL impact left unaddressed
  • PR title follows conventional commit style + [area] Short description
  • PolyForm Noncommercial license headers/metadata untouched

OUTPUT FORMAT

Production-ready bar for this PR

Give a concise repo-specific definition in 5–10 bullets.

Findings

For each finding, use this structure exactly:

[BLOCKING|NON-BLOCKING|NIT] Short title

  • Evidence: path:line, path:line
  • Why it matters: ...
  • Recommendation: ...
  • Confidence: High / Medium / Low

Only include findings that are real and supported.
If there are no findings in a section, say:

  • No issues found in this area based on the diff and reviewed context.

What I could not fully verify

List any behavior you could not prove from the diff, tests, or available context.

Final verdict

Choose exactly one:

  • ✅ Ready to merge — no blocking issues.
  • ⚠️ Merge after addressing blocking items.
  • ❌ Not ready — fundamental concerns.

If verdict is not ready, list the blocking items in priority order.


IMPORTANT REVIEW BEHAVIOR

  • Be strict on correctness and contracts.
  • Be conservative about operational risk.
  • Be explicit about uncertainty.
  • Do not suggest unrelated refactors.
  • Do not turn nits into blockers.
  • Do not reward intent; review actual implementation.

@github-actions

github-actions Bot commented Apr 18, 2026

Copy link
Copy Markdown
Contributor

Claude finished @magyargergo's task in 3m 48s —— View job


Re-review of PR #888 — Final verdict

  • Load repo docs (AGENTS.md, GUARDRAILS.md, CONTRIBUTING.md, TESTING.md, ARCHITECTURE.md)
  • Read changed files (local-backend.ts, tools.ts, calltool-dispatch.test.ts, local-backend-calltool.test.ts, MIGRATION.md)
  • Define production-ready bar
  • Review diff against bar (both commits: 604c1f7 + a12a7d2)
  • Post final verdict

Production-ready bar for this PR

  1. MCP contract backward-compatibility: context ambiguous response is a strict superset (additive score field only); impact behavioral change (silent-pick → structured ambiguous) is intentional, documented, and callers can adapt by branching on result.status.
  2. Resolver contracts preserved: resolveSymbolCandidates uses exact n.name = $symName — no fuzzy fallback introduced. #480 Class/Constructor collapse preserved with !hints.kind guard.
  3. Graph schema integrity: no node label, edge type, or generateId changes. The change is MCP-layer only.
  4. Unit and integration test coverage: new behaviors covered under gitnexus/test/unit/; java-class-impact and class-impact-all-languages integration regression guards remain green.
  5. TypeScript clean: tsc --noEmit passes; pre-commit hook (lint-staged + prettier + tsc) verified.
  6. MIGRATION.md entry present for the impact behavioral contract change.
  7. CI green on all 3 platforms: 6341 passed, 0 failed.
  8. No secrets, no machine-specific paths, no blind renames, PR title follows conventional-commit format.
  9. PolyForm Noncommercial license headers untouched.

Findings

1. Correctness & functional completeness

  • No issues found in this area based on the diff and reviewed context.

Verified in commit a12a7d2:

  • Root cause of 13 CI failures fixed: enrichCandidateLabels runs a single UNION ALL across the five priority labels and patches type in-place for any candidate where labels(n)[0] returned empty. This is the exact mechanism that broke _runImpactBFS (which gates Class/Interface BFS seeding on symType). The fix is correct.
  • UID path enriched too (local-backend.ts UID branch): a direct UID lookup pointing at a Class node now also calls enrichCandidateLabels so resolvedLabel is populated for BFS seeding. Previously this path could return an empty type from the same LadybugDB projection issue.
  • #480 collapse still fires correctly: the !hints.kind guard on the Class/Constructor collapse block means kind: 'Constructor' still surfaces the Constructor; kind: 'Class' does the same. Verified at the diff site.
  • Integration test updates (local-backend-calltool.test.ts:151,174): file_path: 'src/auth.ts' added to the two authenticate tests plus expect(result.status).not.toBe('ambiguous') assertions confirm the disambiguation path is exercised end-to-end, not just mocked.
  • IEEE754 threshold fixed: > 0.09 (was >= 0.10) at the confident-promotion check. 0.50 + 0.40 + 0.20 − 0.90 = 0.09999... would have failed the old comparison; the fix is correct and documented inline.

2. Code clarity & clean code

  • No issues found in this area based on the diff and reviewed context.

enrichCandidateLabels, scoreCandidate, resolveSymbolCandidates are all well-named, cohesive, and documented with JSDoc. No dead code, no barrel re-export changes, no // removed comments, no drive-by refactors. The UNION ALL in enrichCandidateLabels correctly uses !labelById.has(id) to keep only the first (highest-priority) label per id.

One subtle observation (not a finding): when hints.file_path is provided and the query name is qualified (contains / or :), the resolver branches to WHERE n.name = $symName AND n.filePath CONTAINS $filePath, skipping the n.id = $symName OR n.name = $symName branch. In practice, qualified names are typically used as UIDs rather than name inputs, so this edge case has no realistic impact.

3. Test coverage & change safety

  • No issues found in this area based on the diff and reviewed context.

Previous review's NON-BLOCKING test-mechanism finding fully addressed:

  • New test context tool promotes top candidate via scoring when multiple rows survive DB pre-filter (calltool-dispatch.test.ts:302–328): both candidates have filePath containing App.tsx, so DB pre-filtering would return both in production. Resolution is determined by file_path + kind scoring (1.00 vs 0.90 → confident promotion). This is the exact path the previous review said was untested.
  • The original file_path test now has an honest comment describing the mock-vs-production divergence (calltool-dispatch.test.ts:282–290).
  • Scored arithmetic for tied candidates (0.56 = 0.50 + 3 × 0.02) verified at line 384.
  • target_uid bypass test inspects raw Cypher call args to prove WHERE n.name = $symName never fires — high-quality assertion.
  • CI confirms: 65 unit tests in calltool-dispatch, 18 in local-backend-calltool, 10 in java-class-impact, 52 in class-impact-all-languages — all passing.

4. Performance

  • No issues found in this area based on the diff and reviewed context.

enrichCandidateLabels is a single DB call regardless of how many candidates need patching (not per-candidate). It fires only when at least one type === '' candidate exists — no-op otherwise. scoreCandidate is O(1) pure TS. LIMIT raised from 10 → 20 on the same unindexed MATCH (n) scan, but 20 candidates at constant-time scoring is well within acceptable cost. The Class/Constructor collapse's two serial label queries are unchanged from the prior implementation.

5. Operational risk

  • No issues found in this area based on the diff and reviewed context.

Previous review's NON-BLOCKING fully addressed: MIGRATION.md entry is complete, accurate, and correctly scoped — includes the full ambiguous response shape, migration guidance (branch on result.status), and confirms no re-index is required.

Other checks:

  • LadybugDB single-writer: unaffected (all new paths are read-only queries).
  • Embeddings: resolveSymbolCandidates is query-only; meta.json untouched.
  • Rollback safety: revert is safe — no schema or stored data changes, no migration required.
  • Windows/cross-platform: no new streams, FD handling, or path-separator logic.
  • Staleness: staleness.ts untouched.
  • MCP contracts: context is a strict superset; impact behavioral change is documented. rename continues to forward status === 'ambiguous' unchanged. group_*, query, detect_changes, cypher untouched.

6. Maintainability

  • No issues found in this area based on the diff and reviewed context.

resolveSymbolCandidates is correctly placed on LocalBackend (MCP query layer, not ingestion). Pipeline Phase DAG boundaries unaffected. The scoreCandidate JSDoc explicitly names BM25/embedding as v2 additions, so the extension point is discoverable. No logic duplicated between CLI and web bridge.


Non-negotiables

  • ✅ No secrets or machine-specific paths.
  • ✅ No blind find-and-replace rename.
  • ✅ PR title follows conventional-commit format (feat(mcp): ...).
  • ✅ PolyForm Noncommercial license headers untouched.
  • ✅ MIGRATION.md entry present for impact behavioral contract change.
  • ✅ No HIGH/CRITICAL unaddressed impact (this PR is the fix for an impact-analysis correctness issue).

What I could not fully verify

  1. LadybugDB CONTAINS case-sensitivity in production (carried over from round 1): the Cypher pre-filter uses n.filePath CONTAINS $filePath (case sensitivity engine-dependent); scoreCandidate uses .toLowerCase().includes() (case-insensitive). If the DB's CONTAINS is case-sensitive, a hint like app.tsx could miss App.tsx at the DB layer even though the scorer would have matched. Cannot verify from diff alone.
  2. CI test counts on main: author reports 4 pre-existing env failures on main (skip-git-cli.test.ts, git-utils.test.ts); cannot independently confirm these are pre-existing and not regressed by this PR.

Final verdict

Ready to merge — no blocking issues.

All findings from the first review are addressed in a12a7d2:

  • MIGRATION.md entry: complete and accurate.
  • Scoring-promotion test: correctly exercises the production code path (two candidates both surviving DB pre-filter, resolved by scoring).
  • IEEE754 threshold: > 0.09 is correct.
  • NIT defensive guard: documented inline.

The root cause of the 13 CI failures (LadybugDB returning empty labels(n)[0] for Class nodes) is fixed with a clean single-UNION enrichment call. CI is green on all 3 platforms (6341 passed, 0 failed). The implementation is correct, contracts are documented, and the test coverage is meaningful.

@magyargergo magyargergo merged commit 131d411 into abhigyanpatwari:main Apr 18, 2026
12 of 13 checks passed
@magyargergo

Copy link
Copy Markdown
Collaborator

Thank you for your contribution!

github714801013 pushed a commit to github714801013/GitNexus that referenced this pull request Apr 28, 2026
…ind/file_path hints (abhigyanpatwari#888)

* feat(mcp): rank context/impact disambiguation candidates and expose kind/file_path hints

The `context` MCP tool already returned `{ status: 'ambiguous', candidates }`
when a name hit multiple symbols, but the candidates were returned in
arbitrary DB order and the only hint it accepted was file_path. The
`impact` tool was worse: when its name resolver found multiple viable
matches it silently picked the first one from a priority UNION, with no
signal back to the caller that a different symbol might have been
intended.

Both failure modes were flagged in issue abhigyanpatwari#470 and reconfirmed in the
comments by a second user who described impact as returning "incorrect
parsing results and meaningless tool calls" in the multi-match case.

Changes:

* Add `resolveSymbolCandidates(repo, query, hints)` private helper on
  LocalBackend. Single place that:
   - Short-circuits on direct uid (zero-ambiguity)
   - Runs the same name-or-qualified-id match as before, with LIMIT 20
     (was 10) so the ranker has headroom instead of arbitrary truncation
   - Preserves the abhigyanpatwari#480 Class/Constructor preference -- when the only
     ambiguity is a Class and its own Constructor, the Class wins
     silently
   - Scores each candidate (pure TS, no extra DB round-trip): base 0.50,
     +0.40 for file_path match, +0.20 for kind match, plus a small
     kind-priority tiebreaker (Class > Interface > Function > Method >
     Constructor) when no explicit kind hint is given
   - Sorts desc by score with stable tiebreakers (shorter filePath,
     then lex uid)
   - Promotes to a single confident resolve when the top score is
     >= 0.95 AND beats the runner-up by >= 0.10 -- lets a strong hint
     cut through without forcing the caller through a disambiguation
     round-trip

* Rewire `context()` to use the shared helper. Response shape is a
  strict superset of today's: candidates gain a `score` field, the
  existing `{ uid, name, kind, filePath, line }` keys are preserved so
  every downstream consumer (rename, eval-server formatter, etc.) keeps
  working. New `kind` input hint accepted.

* Rewire `impact()` to use the shared helper. Now emits the same
  `{ status: 'ambiguous', candidates, impactedCount: 0, risk: 'UNKNOWN' }`
  shape instead of silent first-pick. New inputs accepted:
  `target_uid`, `file_path`, `kind`.

* Update tool schemas in mcp/tools.ts to advertise the new inputs and
  describe ranked disambiguation.

Backward compatibility:

The abhigyanpatwari#480 Class/Constructor collapse is preserved and covered by the
existing java-class-impact integration test (still green). The
ambiguous response shape is a strict superset -- `eval-formatters`
unit test that parses the old shape is unchanged and still passes.
`impact` going from silent-first-pick to structured ambiguous is a
semantic improvement that is the entire point of the issue; callers
relying on silent first-pick now get an actionable response.

Scope declined for v1:

module/community hint -- the issue lists it as one of several hints,
but kind + file_path cover the vast majority of disambiguation needs
in practice, and a community-label filter requires an extra graph
query per candidate. Natural v2 follow-up.

Tests: calltool-dispatch.test.ts gains 5 new cases covering file_path
boost, kind hint boost, impact ambiguous shape, impact target_uid
short-circuit, and score field presence on the existing ambiguous
test. Plus the extended assertions on the existing
`context tool returns disambiguation for multiple matches`.

Verification:
  npx vitest run test/unit/calltool-dispatch.test.ts       -> 64 pass
  npx vitest run test/integration/java-class-impact.test.ts -> pass
  npm run test:unit                                         -> 3642 pass
    (4 pre-existing env failures unchanged: skip-git-cli needs built
    dist/, git-utils tmpdir on Windows worktree -- same on main)
  npx tsc --noEmit                                          -> clean

Closes abhigyanpatwari#470

* fix(mcp): enrich labels from UNION when labels(n)[0] is empty; address review findings

CI on PR abhigyanpatwari#888 caught 13 integration-test failures I did not cover locally:
my resolver refactor collected candidates via `labels(n)[0] AS type`, but
LadybugDB returns an empty string for that projection on certain node
types (most importantly Class). With an empty `type`, impact's downstream
`_runImpactBFS` no longer recognised `symType === 'Class' | 'Interface'`
and stopped seeding Constructor + File nodes into the frontier, so the
"impact(upstream) surfaces the file importer" assertion broke across 11
language fixtures plus 2 OVERRIDES filter tests.

The original impact resolver worked around this by running a prioritised
UNION across Class/Interface/Function/Method/Constructor and picking the
first hit. My refactor dropped that. Fix: keep the simple candidate MATCH
but enrich types afterward via a single scoped UNION query, so every
candidate carries an accurate label for both scoring and downstream
BFS seeding. The UID direct-lookup path is patched the same way.

Also addresses the findings from the senior reviewer on PR abhigyanpatwari#888:

* MIGRATION.md: document the `impact` behavioural change (silent first-
  pick → structured `{ status: 'ambiguous', candidates }`) so downstream
  callers know to branch on `result.status` before reading byDepth/
  summary. `context` is unchanged shape-wise (strict superset).

* New test: `context tool promotes top candidate via scoring when
  multiple rows survive DB pre-filter`. The review flagged that the
  existing file_path test works only because the mock ignores WHERE
  parameters -- the scored-promotion path (top ≥ 0.95 AND gap > 0.09)
  wasn't directly exercised. The new test uses two candidates both in
  App.tsx-containing paths plus a kind hint so promotion is decided by
  scoring, not DB pre-filtering. Also tightened the comment on the
  earlier file_path test to describe the mock vs production divergence
  honestly.

* NIT: added a paragraph explaining why `scored.length >= 2` is kept as
  a defensive guard even though the `normalized.length === 1` early
  return already covers the single-candidate path.

* Integration: two tests in `local-backend-calltool.test.ts` targeted
  `'authenticate'`, which now correctly resolves as ambiguous (two
  Method nodes: AuthService.authenticate and BaseService.authenticate).
  Updated both to pass `file_path: 'src/auth.ts'` so they exercise the
  new disambiguation API and still assert the METHOD_OVERRIDES filtering
  they were originally about.

Edge case fix in the promotion gap check: IEEE754 makes 0.50 + 0.40 +
0.20 - 0.90 = 0.09999999999999998 instead of exactly 0.10, which would
otherwise break the "winner clearly dominates" intent for legitimate
1.00 vs 0.90 cases. Changed `>= 0.10` to `> 0.09`; same user-facing
intent, no floating-point sensitivity.

Verification (all from gitnexus/):
  npx vitest run test/integration/class-impact-all-languages.test.ts
    -> 52 pass (was 11 FAIL on CI before this fix)
  npx vitest run test/integration/local-backend-calltool.test.ts
    -> 18 pass (was 2 FAIL on CI before this fix)
  npx vitest run test/integration/java-class-impact.test.ts
    -> 10 pass (regression guard for abhigyanpatwari#480 preserved)
  npx vitest run test/unit/calltool-dispatch.test.ts
    -> 65 pass (1 new test + 4 from original abhigyanpatwari#470 PR)
  npm run test:unit
    -> 3626 pass, 4 pre-existing env failures unchanged
  npx tsc --noEmit
    -> clean
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Improve context/impact symbol resolution with ranked disambiguation

2 participants