Skip to content

fix(ingestion): fully-qualified nested-type identity for C++/Ruby — structure (#1978) + resolution (#1982)#1981

Merged
magyargergo merged 21 commits into
mainfrom
fix/cpp-rust-qualified-node-identity
Jun 3, 2026
Merged

fix(ingestion): fully-qualified nested-type identity for C++/Ruby — structure (#1978) + resolution (#1982)#1981
magyargergo merged 21 commits into
mainfrom
fix/cpp-rust-qualified-node-identity

Conversation

@magyargergo

@magyargergo magyargergo commented Jun 3, 2026

Copy link
Copy Markdown
Collaborator

Scope: this branch now carries both halves of the #1978 fully-qualified nested-type work — the structure phase (#1978, original) and the resolution phase (#1982, the follow-up the earlier "Deferred" section promised) — plus a merge of main. Both halves are verified together; sections below.

Problem

Nested types that share a tail name within one file silently collapsed onto a single graph node keyed by the simple tail (e.g. both Outer::Inner and Other::Inner → one Struct:file:Inner node). Members cross-wired onto that one owner, and on the resolution side same-tail heritage (EXTENDS/IMPLEMENTS) and mixin / attr_accessor owners bound to the wrong sibling — silent wrong edges (0 dangling, so undetected by findDanglingEdges).

struct Outer { struct Inner { void from_outer(); int outer_field; }; };
struct Other { struct Inner { void from_other(); }; };
struct DerivedB : Other::Inner {};   // EXTENDS mis-resolved to Outer.Inner (wrong sibling)

Part 1 — Structure phase (#1978)

Key class-like type nodes (Class / Struct / Interface / Enum / Record) by their normalized fully-qualified path (Struct:file:Outer.Inner) instead of the simple tail, so same-tail nested types stay distinct. Gated per-language by a qualifiedNodeId config flag — default false, so every other language is byte-identical (verified empirically). Enabled for C++ and Ruby.

  • class-types.ts / class-extractors/generic.tsqualifiedNodeId on ClassExtractor + ClassExtractionConfig (?? false).
  • utils/ast-helpers.tsfindEnclosingClassInfo gains an optional getQualifiedOwnerName hook + EnclosingClassInfo.qualifiedClassId. Member-owner edges resolve to the qualified class node id, derived from the same extractQualifiedName the node id uses → owner id == node id by construction.
  • parsing-processor.ts (sequential) + workers/parse-worker.ts (worker) — flag-gated qualified node id + owner edges on both parse paths, incl. the worker routed-property block.
  • call-processor.ts — the same qualifier in the sequential routed-property pre-pass, kept in lockstep with the worker kind === 'properties' block.
  • configs/c-cpp.ts, configs/ruby.tsqualifiedNodeId: true.

Method/Property node ids stay simple-qualified (className.member); only type nodes get the fully-qualified id.


Part 2 — Resolution phase (#1982)

Registry-primary resolution still keyed same-tail nested types by their simple tail, so struct DerivedB : Other::Inner mis-resolved its EXTENDS to Outer.Inner (first-wins on the bare tail in resolveInheritanceBaseInScope), and Ruby's emitRubyMixinEdges cross-wired same-tail mixin / attr_accessor owners (last-wins) — exactly the items Part 1 deferred. Characterization confirmed these were wrong edges, not refusals.

  • Shared normalizernormalizeQualifiedName / splitQualifiedName extracted to utils/qualified-name.ts so the structure phase, the resolution inheritance path, and per-language captures all key against one ::/\. normalizer (NOT the heritage simplifyRawName, which collapses to the tail and would guarantee a lookup miss).
  • C++ heritage — the inheritance capture preserves the qualifier (@reference.qualified-name, additive, template-stripped: ns::Base<T>ns::Base; registered as a sub-tag so it can't shadow the @reference.inherits anchor). resolveInheritanceBaseInScope resolves it against the full-path QualifiedNameIndex qualified-first — with progressive-prefix lookup for relative bases (Outer::Inner written inside namespace NSNS.Outer.Inner) and refuse-on-tie — falling through to the existing simple-tail walk on miss. The index already carries the distinct Outer.Inner / Other.Inner keys from Part 1, so no type-registry change is needed.
  • Ruby emit mapemitRubyMixinEdges keys owners by the full def.qualifiedName, and the __heritage__ / __property__ markers carry the full enclosing-scope owner (buildEnclosingQualifiedName, handling the compact class Outer::Inner form). This closes the deferred same-tail routed-property / mixin owner item — on both the sequential and worker paths (markers survive worker serialization).

Additive + gated: the qualifier sidecar is emitted only by C++; the qualified-first branch is gated on a new optional ReferenceSite.rawQualifiedName (set only by C++), so non-C++ inheritance resolution is byte-identical. New resolution-side assertions are registry-primary-only (legacy leg skips them via helpers.ts).


Still deferred (separate follow-up issues)


Testing

  • New same-tail collision fixtures (C++ / Ruby / Rust) + #1978 resolver tests asserting positive owner identity (R7) (resolved owner / edge endpoint by node id, not merely dangle-free).
  • #1982 resolver tests: C++ same-tail EXTENDS (incl. qualified-cross-scope + cpp-diamond single-candidate regression guard); Ruby same-tail attr_accessor + mixin owner identity — each with a worker-path parity block so the qualified logic is exercised on the production pool, not just sequential.
  • Capture goldens regenerated localized (only the one edited Ruby fixture drifts; all others byte-identical); bench/scope-capture fingerprints rebaselined for cpp + ruby (additive) with provenance notes — 14/14 PASS.

Verified green: C++ + Ruby resolver suites on both parity legs (registry-primary + legacy) + the worker pool; a cross-language regression triangulation (go / java / csharp, 542 tests — the qualified-first branch is gated on rawQualifiedName, set only by C++); all capture goldens; tsc, eslint (0 errors), prettier, npm run build.

Both halves were shaped by an adversarial multi-agent verification pass — which (Part 1) surfaced the call-processor lockstep gap, missing worker coverage, and a non-discriminating #1975 assertion, and (Part 2) twice re-targeted the resolution-side plan onto the actual registry-primary path (preEmitInheritanceEdgesresolveInheritanceBaseInScope, not the legacy buildHeritageMap) before any code landed.


Merge of main

This branch also merges origin/main (brings #1934 rust F71 union / F72 macro, rust-coverage #1974, Go interface-impl / type-binding work). Conflicts resolved: rust.test.ts (kept both the deferred-Rust describe.skip block and main's F71/F72 blocks) and the baselines.json rust fingerprint (recomputed for the merged corpus — main's rust fixtures ∪ #1981's rust-nested-tail-collision, 126→127 fixtures). Post-merge re-verified: tsc, bench 14/14, rust 169-pass, cpp 278 + ruby 142 + 7 goldens all green.

Risk

Contained to the ingestion structure + scope-resolution phases, gated per-language. findEnclosingClassInfo and resolveInheritanceBaseInScope gained optional parameters (backward-compatible; all existing call sites unchanged). Node-id and resolution changes apply only to C++/Ruby nested types — validated on both legs and the worker path.

🤖 Generated with Claude Code

Nested types sharing a tail name in one file — C++ `Outer::Inner` vs
`Other::Inner`, Ruby `Outer::Inner` vs `Other::Inner` modules — silently merged
into a single graph node keyed by the simple tail (`Struct:file:Inner`),
cross-wiring their methods/properties onto one owner.

Key class-like type nodes (Class/Struct/Interface/Enum/Record) by their
normalized fully-qualified path (`Struct:file:Outer.Inner`) instead of the
simple name. Gated per-language by a new `qualifiedNodeId` config flag
(default false → byte-identical for every other language); enabled here for
C++ and Ruby.

- class-types.ts / generic.ts: `qualifiedNodeId` flag on ClassExtractor + config
- ast-helpers.ts: findEnclosingClassInfo gains an optional getQualifiedOwnerName
  hook + EnclosingClassInfo.qualifiedClassId, so member-owner edges resolve to
  the qualified class node id (owner id == node id by construction)
- parsing-processor.ts + parse-worker.ts: flag-gated qualified node-id + owner
  edges on both the sequential and worker parse paths (incl. routed properties)
- call-processor.ts: same qualifier in the routed-property pre-pass (lockstep
  with the worker `kind === 'properties'` block)
- configs/c-cpp.ts, configs/ruby.ts: qualifiedNodeId: true

Method/Property node ids stay simple-qualified; only type nodes get the
qualified id.

Deferred to a resolution-side follow-up: Ruby SAME-TAIL routed-property/mixin
owner identity under registry-primary (`emitRubyMixinEdges` keys owners by the
simple tail name, last-wins); and Rust inherent-impl methods (impl_item is not
a typeDeclaration — its #1978 test is describe.skip).

Tests: same-tail collision fixtures + #1978 resolver tests for C++/Ruby
(positive owner identity, R7), a worker-path parity block, and an unambiguous
nested attr_accessor case; the C++ #1975 out-of-line test updated to assert
qualified-id distinctness (forward-decl + out-of-line now unify). Verified
green on both parity legs, the worker path, and tsc.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@vercel

vercel Bot commented Jun 3, 2026

Copy link
Copy Markdown

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
gitnexus Ready Ready Preview, Comment Jun 3, 2026 1:37pm

Request Review

@magyargergo

Copy link
Copy Markdown
Collaborator Author

Deferred resolution-side items (Ruby same-tail routed-properties/mixins via emitRubyMixinEdges, and Rust inherent-impl ownership) are tracked in #1982.

@github-actions

github-actions Bot commented Jun 3, 2026

Copy link
Copy Markdown
Contributor

✨ PR Autofix

Found fixable formatting / unused-import issues across 42 changed lines. Comment /autofix on this PR to apply them, or run npm run lint:fix && npm run format locally.

{"schema":"gitnexus.pr-autofix/v2","state":"fixes-available","pr_number":1981,"changed_lines":42,"head_sha":"ddc31ba72ef4b287023a0c2cd59b63763324e950","run_id":"26860586192","apply_command":"/autofix"}

…fix lint

- helpers.ts: exclude the new #1978 C++/Ruby resolver tests from the legacy
  parity leg (LEGACY_RESOLVER_PARITY_EXPECTED_FAILURES). They PASS on legacy
  too — the fix lives in the SHARED structure phase, not the legacy resolution
  path — so this is a deliberate registry-primary-only scoping (not a legacy
  gap), keeping the legacy path untouched and uncoupled from the new
  node-identity behavior.
- rust.test.ts: drop the `eslint-disable vitest/no-disabled-tests` directive.
  That rule isn't configured in this repo, so eslint errored "Definition for
  rule 'vitest/no-disabled-tests' was not found" and failed `quality / lint`.
  The describe.skip needs no disable directive.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@github-actions

github-actions Bot commented Jun 3, 2026

Copy link
Copy Markdown
Contributor

CI Report

All checks passed

Pipeline Status

Stage Status Details
✅ Typecheck success tsc --noEmit
✅ Tests success unit tests, 3 platforms
✅ E2E success gitnexus-web changes only

Test Results

Tests Passed Failed Skipped Duration
11014 10999 0 15 686s

✅ All 10999 tests passed

15 test(s) skipped — expand for details
  • COBOL pipeline benchmark > scales with file count
  • C++ ADL emit benchmark > emit phase scales sub-quadratically with co-scaled files and sites
  • C++ pipeline benchmark > scales with file count
  • C# pipeline benchmark > scales with file count — namespaces spread across the solution
  • C# pipeline benchmark > scales with file count — all types in one (global) namespace bucket
  • C# pipeline benchmark > scales with file count — all types in one (named) namespace bucket
  • Go pipeline benchmark > scales with file count (workers enabled)
  • Go pipeline benchmark — worker pool (issue Worker idle timeout kills long Go scope extraction and surfaces as Napi::Error during analyze #1848) > does not quarantine the large generated Go file on sub-batch idle timeout
  • Go structural interface detection benchmark > scales linearly with interface × struct count
  • Go structural interface detection split-phase benchmark > separates index-build and detection time
  • PHP pipeline benchmark > scales with file count (workers enabled)
  • Ruby pipeline benchmark > scales with file count (workers enabled)
  • Rust pipeline benchmark > scales with file count (workers enabled)
  • run.cjs direct-exec entrypoint (fix(cli): steer docs, skills, and hooks through a CLI-neutral project-local runner (#1939) #1945) > resolves a .cmd shim via the Windows shell branch, passing args and exit code
  • buildTypeEnv > known limitations (documented skip tests) > Ruby block parameter: users.each { |user| } — closure param inference, different feature

Code Coverage

Tests

Metric Coverage Covered Base Delta Status
Statements 80.37% 38443/47830 79.84% 📈 +0.5 🟢 ████████████████░░░░
Branches 68.94% 24470/35492 68.5% 📈 +0.4 🟢 █████████████░░░░░░░
Functions 85.53% 4003/4680 84.94% 📈 +0.6 🟢 █████████████████░░░
Lines 83.96% 34578/41181 83.36% 📈 +0.6 🟢 ████████████████░░░░

📋 View full run · Generated by CI

…ingerprint)

Adding the {cpp,ruby,rust}-nested-tail-collision fixtures changed the
lang-resolution corpus, which the scope-capture golden snapshots and the
fingerprint baselines gate on. These are pure fixture-corpus additions —
#1978 does not touch the scope-capture phase (captures.ts / emit*ScopeCaptures
are unchanged). Verified: the regenerated ruby/rust golden diffs are
additive-only (no existing fixture's capture digest changed), so the cpp/ruby/
rust fingerprint drift is solely the new fixtures.

- prettier --write test/integration/resolvers/{ruby,rust}.test.ts
- regenerate ruby/rust captures-golden snapshots (UPDATE_GOLDEN=1; +1 fixture each)
- rebaseline cpp/ruby/rust scope-capture fingerprints (bench/scope-capture/baselines.json)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

@magyargergo magyargergo left a comment

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tri-review — qualify nested-type node identity for C++/Ruby (#1978)

Methods & engine breakdown (read first). 7 Claude reviewer lanes ran (correctness, adversarial, maintainability, performance, testing + 2 GitNexus lanes that ended mid-investigation — their domains were covered by the others). Codex (gpt-5.5, xhigh) ran but stalled before emitting retrievable structured findings — only truncated message previews were captured; I extracted one lead from it (heritage resolution of nested bases) and reproduced it myself. So weight this as a Claude-method review (one engine, several personas) with a single coordinator-verified Codex lead — not three independent confirmations. Persona agreement is "consistent," not independent.

  • Final verdict: production-ready with minor follow-ups.
  • Branch hygiene: clean feature/fix PR (behind main but mergeable; no unrelated churn).
  • Merge state: checks passing (core) — the first push failed quality/format + the scope-capture golden + the fingerprint gate (all from the new fixtures); fixed in e2641628 (prettier; regenerated ruby/rust goldens; rebaselined cpp/ruby/rust fingerprints — golden regen proven additive-only) and now green: format, lint, coverage, benchmarks all pass. scope-resolution parity still running (long job), verified locally on both legs (#1978 tests are registry-primary-only on the legacy leg via helpers.ts).

What's solid (credit)

The central owner-id == node-id invariant holds across plain / namespaced / templated / explicitly-specialized C++ and nested / compact-:: / reopened-module / singleton Ruby — correctness and adversarial probed it hard and could not break it. The constraintsTag asymmetry between parsing-processor and parse-worker is a non-issue for class-like ids (it only applies to Method/Function). The hoisted qualifiedTypeName is pure; the qualified !== simple guard is sound; normalizeQualifiedName only unifies genuinely-same types (Ruby reopening, C++ out-of-line + forward-decl). Flag-OFF is byte-identical; the legacy-leg exclusion strings are exact.

Headline — [P1 · reproduced] heritage mis-resolution for same-tail nested bases (registry-primary)

Inlined on c-cpp.ts. A class inheriting from a same-tail nested type mis-resolves its EXTENDS/IMPLEMENTS edge to the wrong sibling (the heritage builder resolves the base by simple tail name via the node-lookup simpleKey, first-wins; #1978 split the two Inner nodes). 0 dangling edges, so findDanglingEdges cannot catch it. This is the same resolution-side simpleKey class already deferred to #1982 — recommend folding it there (one qualified-owner/simpleKey fix covers heritage + properties + members uniformly) and documenting same-tail-nested-base heritage as a known limitation. Narrow trigger, but it silently corrupts impact-analysis on the orphaned type, and no test catches it.

Secondary

  • [P2 · reproduced] union- and anonymous-namespace-nested same-tail C++ types still merge (inlined on c-cpp.ts): ancestorScopeNodeTypes omits 'union_specifier'; anon namespaces add no scope segment. Same bug class, not in #1982.
  • [P2 · code-read] buildQualifiedName hot-path cost: the new owner hook calls it per member (was once per class), and it walks to the tree root + scans top-level children every call with no per-file fileScopeSegments memo. Gated to C++/Ruby, est. <5%, but this repo has an O(n²) scope-capture history — memoize fileScopeSegments per file and hoist the per-iteration closure out of the symbol loop. (generic.ts ~107-118; the closure in parsing-processor.ts/parse-worker.ts/call-processor.ts.)
  • [P2 · latent] classIdCache/classInfoCache key on the node ref only, not on getQualifiedOwnerName presence — the call-scope path calls without the hook, the definition path with it. Latent today (distinct node refs) but fragile; harden by keying on hook presence or always threading it. (Converged across the performance, risk, and maintainability lanes.)
  • [P2 · code-read] Ruby worker-path untested: the Ruby fix touches parse-worker.ts (main owner + the kind==='properties' block), but only the sequential path is tested; C++ has a worker-parity test. Add a Ruby worker-parity test (workerThresholdsForTest:{minFiles:1} + usedWorkerPool).
  • [P2/P3 · maintainability] the qualifier lambda + qualifiedClassId ?? classId fallback are duplicated across 4 sites kept "in lockstep" only by comment — extract a ClassExtractor helper; strengthen the "order is load-bearing" comment to name the failure mode; reconsider the qualifiedNodeId boolean name (vs the isTypeDeclaration convention).
  • [P3] the C++ worker-path test omits the sourceId-not-:: discriminator that the sequential #1975 test uses to prove the fix is engaged.

Validated (a feature, not a finding)

The owner-id==node-id invariant survived adversarial templated/specialized/deep-nested probing; the hoist purity and the guard logic were probed and found sound; normalizeQualifiedName collapses only genuinely-same types.

Test gaps

No fixtures for heritage-to-nested-base, templated-nested ownership, nested enums/interfaces, 3-level nesting, or the Ruby worker-path; the same-named-method-under-same-tail residual (#1982) isn't pinned.

Coverage

Full 15-file diff read + targeted coordinator pipeline reproductions (heritage, union, anon-namespace). Worker-path assertions can't execute in a no-build worktree (validated by code-read + the C++ worker-parity test that runs in CI).

Automated multi-tool digest — mostly one engine (Claude) across personas, plus one coordinator-verified Codex lead. Verify findings before acting.

Comment thread gitnexus/src/core/ingestion/class-extractors/configs/c-cpp.ts
Comment thread gitnexus/src/core/ingestion/class-extractors/configs/c-cpp.ts
magyargergo and others added 6 commits June 3, 2026 06:49
Move normalizeQualifiedName/splitQualifiedName out of class-extractors/
generic.ts into utils/qualified-name.ts so the structure-phase
buildQualifiedName, the scope-resolution inheritance resolver, and the
per-language capture emitters can all key against ONE normalizer. A raw
'::' qualifier must normalize to the exact '.'-joined key the
QualifiedNameIndex already holds, or the qualified lookup silently misses
(the #1982 resolution-side foundation). Pure relocation — byte-identical
function bodies; tsc clean; existing C++ nested-collision tests green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…rect qualified node (#1982)

Registry-primary C++ inheritance (preEmitInheritanceEdges -> resolveInheritanceBaseInScope)
resolved a same-tail nested base by its SIMPLE TAIL with first-wins, so
`struct DerivedB : Other::Inner` mis-resolved EXTENDS to Outer.Inner (the wrong
sibling; 0 dangling, so undetected). The namespace qualifier was discarded at the
C++ inheritance capture.

Fix (additive, qualified-first):
- ReferenceSite gains an optional `rawQualifiedName`; the C++ inheritance capture
  emits `@reference.qualified-name` (qualifier-preserving, template-stripped:
  Other::Inner, ns::Base<T> -> ns::Base) only when the base is qualified, registered
  as a sub-tag so it can't shadow the `@reference.inherits` anchor.
- resolveInheritanceBaseInScope resolves the qualifier against the full-path
  QualifiedNameIndex FIRST (which already carries Outer.Inner / Other.Inner keys from
  the structure phase), with progressive-prefix lookup for relative bases and
  refuse-on-tie, falling through to the existing simple-tail walk on miss — so
  unqualified bases and the single-candidate cross-file case are unchanged.

Registry-primary cpp.test.ts 278/278 (incl. worker-path: rawQualifiedName survives
worker serialization). Legacy leg unaffected (207 pass / 71 skip) — the new
resolution-side assertions are registry-primary-only via helpers.ts. tsc clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…the correct qualified node (#1982)

emitRubyMixinEdges keyed its owner map by the SIMPLE tail (def.qualifiedName
split-popped) with last-wins, and the __heritage__/__property__ markers carried
only the immediate owner name — so `module Outer; class Inner` and
`module Other; class Inner` collapsed onto one `Inner` key and cross-wired their
include/attr_accessor edges onto whichever Inner was processed last.

Fix (lockstep, full-qualified):
- ruby/captures.ts: build the marker owner from the FULL enclosing class/module
  chain (buildEnclosingQualifiedName walks all ancestors, normalizing the compact
  `class Outer::Inner` scope_resolution form via the shared splitQualifiedName) so
  the marker owner byte-matches the resolution def's qualifiedName.
- ruby/scope-resolver.ts: key graphIdByName by the full def.qualifiedName instead
  of the simple tail. Top-level owners/mixins are unchanged (full == simple).

Registry-primary ruby.test.ts 142/142 incl. a new worker-path block (the deferred
note's duplicate-edge concern: markers survive worker serialization, exactly one
HAS_PROPERTY per attr). Legacy leg unaffected (136 pass / 6 skip) — new assertions
registry-primary-only via helpers.ts. tsc clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Cross-cutting verification artifacts for the #1982 same-tail resolution fix:
- ruby capture golden regenerated: ONLY the ruby-nested-tail-collision fixture
  drifts (+10 capture groups from its new include/attr_accessor + the now
  full-qualified __heritage__/__property__ marker owner). All other ruby fixtures
  byte-identical (proves the owner-qualification is localized to nested owners).
- bench/scope-capture/baselines.json: rebaseline cpp + ruby fingerprints (the only
  two that drift; 12 other languages byte-identical). cpp = additive
  @reference.qualified-name capture; ruby = the localized owner change. Provenance
  notes record both. scaling linear (~1.0), 14/14 PASS.
- generic.ts: drop the now-unused normalizeQualifiedName import (lint error).
- walkers.ts / ruby.test.ts: prettier formatting.

Verified: cpp 278/278 + ruby 142/142 (registry-primary), both legacy legs clean
(skips registry-primary-only assertions), go/java/csharp 542 (cross-language
regression — the qualified-first branch is gated on rawQualifiedName, set only by
C++, so non-C++ inheritance resolution is unchanged). tsc + eslint(0 errors) + prettier clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Brings main (78ad6bc) — #1934 rust F71 union / F72 macro, rust-coverage (#1974),
Go interface-impl/type-binding work, hook-resolver changes — into the #1981+#1982
branch (cpp/ruby qualified nested-type identity + resolution-side same-tail fix).

Conflicts resolved (2):
- test/integration/resolvers/rust.test.ts: kept BOTH the #1981 deferred Rust
  same-tail `describe.skip` block AND main's F71 (union) / F72 (macro) describe
  blocks — independent additions at the same location.
- bench/scope-capture/baselines.json: rust fingerprint recomputed for the MERGED
  corpus (main's rust-coverage/F71/F72 fixtures UNION #1981's rust-nested-tail-collision,
  126→127 fixtures) → 56ffc1c0…; cpp/ruby auto-merged (the #1982 fingerprints intact).
  All other entries (reference-site.ts, scope-extractor.ts, helpers.ts, rust golden)
  auto-merged cleanly.

Verified post-merge: tsc clean; bench 14/14 PASS; rust.test.ts 169 pass / 1 skip;
cpp 278 + ruby 142 + 7 capture goldens (479) all green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…tity' into fix/cpp-rust-qualified-node-identity
@magyargergo magyargergo changed the title fix(ingestion): qualify nested-type node identity for C++/Ruby (#1978) fix(ingestion): fully-qualified nested-type identity for C++/Ruby — structure (#1978) + resolution (#1982) Jun 3, 2026

@magyargergo magyargergo left a comment

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tri-review (delta) — #1982 resolution-side: same-tail nested-type identity for C++/Ruby

Scope. Delta review of the new issue #1982 resolution-side work added since the prior pr-tri-review (which covered the #1978 structure phase and is left intact). Reviews only the resolution commits: qualified-first inheritance base resolution (resolveQualifiedInheritanceBase), the @reference.qualified-name / rawQualifiedName plumbing, the shared normalizeQualifiedName, and the Ruby full-qualified __heritage__/__property__ owner keying — the part the prior review deferred to #1982. Reviewed at head b9286bb9 (now CLEAN, caught up with main).

Verdict: needs changes before merge. One reproduced P1 regression (Ruby) introduced by this PR, plus one reproduced P2 incomplete-fix (C++) where the headline deliverable is a no-op for the common case. Neither is caught by CI. The structure-phase work and the owner-side same-tail fix (the parts with tests) are solid.

Methods & engine breakdown (read first). 5 Claude reviewer personas returned full findings (correctness, adversarial, maintainability, testing, performance); the 2 GitNexus lanes (risk, test/CI) ended mid-investigation, their domains covered by the others plus coordinator verification. Codex (gpt-5.5, xhigh) was dispatched but produced no retrievable structured output — weight this as a Claude-method review (several personas, one engine), not three independent confirmations. The two headline findings were each reproduced end-to-end twice (the adversarial persona + an independent verification agent, each running the full integration pipeline on BASE vs HEAD) — that reproduction, not persona agreement, is the strong signal here.

Headline — [P1 · reproduced · blocks merge] Ruby nested mixin included by short name silently drops its IMPLEMENTS edge

Inlined on ruby/scope-resolver.ts:32. emitRubyMixinEdges now keys graphIdByName by the full def.qualifiedName (line 32), but the mixin target is still looked up by its as-written short name (graphIdByName.get(mixinName), line 52, mixinName = arg.text). For

module App
  module Loggable; def log; end; end
  class Service; include Loggable; end
end

the edge resolves on BASE (App.Service -IMPLEMENTS-> Loggable) but is dropped on HEAD (reproduced on both trees). The owner side was upgraded to full-qualified on both the marker and the map key; the mixin side was left asymmetric (map = full qn, lookup = bare name). Blast radius: every include/extend/prepend of a nested/namespaced module by short name — common in Rails/gem code — loses its IMPLEMENTS edge and MRO entry, with 0 dangling edges so findDanglingEdges can't catch it. The new fixture uses only top-level mixin modules, so CI stays green and the regression is untested. Fix: resolve the mixin reference to the same qualified key the def is registered under (progressive-prefix against the enclosing scope), or keep a simple-tail fallback; add a nested-mixin-by-short-name fixture.

Headline — [P2 · reproduced · merge decision] C++ same-tail nested heritage is a no-op once wrapped in a namespace

Inlined on walkers.ts:343. The new resolveQualifiedInheritanceBase correctly picks the right def, but the result is defeated downstream: for a type inside a namespace the scope-model def.qualifiedName omits the namespace (B.Inner), while the structure-phase graph node carries it (NS.B.Inner). resolveDefGraphId (scope-resolution/graph-bridge/ids.ts, unchanged) then misses the qualifiedKey and falls back to simpleKey('Inner'), collapsing both same-tail bases (and both Derived callers) onto the first-inserted node — so the second derivation's EXTENDS is suppressed as a duplicate (reproduced: NS.DB.Derived gets no EXTENDS edge). Not a regression (BASE behaves identically) — but the C++ deliverable only works for top-level types, and the shipped shapes.cpp fixture is top-level-only, giving false confidence. The merge decision: blocks if #1982 is meant to fix namespaced C++ heritage; acceptable if this PR's C++ scope is intentionally top-level-only (then document the limitation and add the namespaced fixture as a known gap). Fix: thread the namespace into the scope-model qualifiedName so qualifiedKey hits, or add a namespace-suffix fallback in resolveDefGraphId. (Ruby's same-tail attr/mixin owner-identity tests pass — this is C++-specific.)

Secondary (body only — not blocking)

  • [P2 · testing] rust.test.ts's Rust inline mod-nested same-tail collision block is describe.skip (Rust impl-item ownership deferred), yet rust-nested-tail-collision/lib.rs is registered in the captures golden — zero active assertions behind a registered fingerprint reads as coverage that doesn't exist. Add a minimal active assertion (two distinct Struct nodes, no dangling HAS_METHOD) or drop the fingerprint-only entry.
  • [P2 · testing] Ruby worker-path parity asserts only attr_accessor (HAS_PROPERTY); the worker mixin (IMPLEMENTS) path — where rawQualifiedName / the marker owner must survive worker serialization — is untested. The C++ worker heritage test asserts only DerivedB and lacks a toHaveLength(1) duplicate guard.
  • [P2 · maintainability] Three sites must now agree byte-for-byte (the Ruby marker owner from buildEnclosingQualifiedName, the graphIdByName key from def.qualifiedName, and the structure-phase qualifiedName) — comment-enforced only, no shared encode/decode pair. cpp/captures.ts also keeps a parallel normalizeCppNamespaceQName instead of importing the new shared normalizeQualifiedName (the file's stated single-normalizer goal); PROPERTY_PREFIX is function-scoped while HERITAGE_PREFIX is module-scoped.
  • [P3 · perf] resolveInheritanceBaseInScope + preEmitInheritanceEdges walk findEnclosingClassDef twice per qualified site; buildEnclosingQualifiedName walks to the AST root per heritage/attr call with no per-class memo or program early-exit. Gated to C++/Ruby and bounded by nesting depth — minor, but memoize if it shows on the bench given this repo's O(n²) scope history.
  • [P3 · pre-existing, not a #1982 regression] A qualified mixin arg include Outer::Mixin corrupts the :-delimited __heritage__ marker (arg.text contains ::) → split mis-parses → edge dropped. Present on BASE too, but adjacent and worth fixing with the rest.
  • [P3 · niche] normalizeQualifiedName strips a leading ::, so an explicitly-global C++ base : ::Net::X could bind to an enclosing-relative same-path type if one exists. Narrow; no fixture.

Validated (a feature, not a finding)

Probed and refuted: scopes.qualifiedNames.get() returns a frozen empty array on miss → the new .length access is crash-safe; refuse-on-tie returns undefined and falls through to the unchanged simple-tail walk, so no base that previously resolved becomes unresolved; unqualified bases and the flag-OFF path are byte-identical to BASE; ordering is sound (populateOwners runs before preEmitInheritanceEdges); normalizeQualifiedName collapses only genuinely-equivalent forms. The owner-side same-tail fix (the part with tests) works on both legs.

CI & branch state

All checks green on the reviewed head — including scope-parity / scope-resolution parity (the registry-primary leg that runs the #1982 assertions), tests / ubuntu / coverage, all platform tests, benchmarks, and tree-sitter ABI. Branch is a clean feature/fix branch, now mergeable / CLEAN (caught up with main; merge-from-main churn from #1984/#1976/#1980/#1974/#1987 excluded via merge-base and not reviewed). CI passing does not cover the two reproduced gaps — both the C++ heritage and Ruby mixin fixtures are top-level-only.

Coverage

Full read of the 9 #1982 source files + the new tests/fixtures; both headline findings reproduced end-to-end on BASE vs HEAD via the integration pipeline.

Automated multi-tool digest — one engine (Claude) across several personas, Codex unavailable, two findings reproduced end-to-end. Verify before acting.

Comment thread gitnexus/src/core/ingestion/languages/ruby/scope-resolver.ts Outdated
Comment thread gitnexus/src/core/ingestion/scope-resolution/scope/walkers.ts
# Conflicts:
#	gitnexus/bench/scope-capture/baselines.json
@magyargergo magyargergo merged commit c60ad9f into main Jun 3, 2026
49 of 51 checks passed
@magyargergo magyargergo deleted the fix/cpp-rust-qualified-node-identity branch June 3, 2026 15:23
This was referenced Jun 3, 2026
magyargergo added a commit that referenced this pull request Jun 3, 2026
…ers (Ruby + Dart) (#1994)

The Ruby and Dart heritage/property pipelines encoded side-effect facts as ':'-delimited synthetic-import marker strings, hand-constructed and hand-parsed at ~8 sites with the field layout kept in agreement only by a comment — the fragility behind the #1981 edge-drop. Route every site through a single shared codec (utils/heritage-marker.ts: encodeMarker / decodeMarker / isHeritageMarker).

encodeMarker throws on a colon-bearing field so the silent-drop class becomes a loud failure; the ':' wire format is preserved byte-for-byte (ruby-captures-golden unchanged). Language-neutral — keyed only on the literal shared prefixes. Dart already single-sources its prefix and is heritage-only, so its import-target guard is left untouched (no invented __property__ path). Pure refactor: no new edges or behavior.

Verified: new codec unit test; ruby resolver + golden 155/155 (zero golden diff) and dart resolver 63/63 on registry-primary, both green on legacy; tsc + prettier clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
magyargergo added a commit that referenced this pull request Jun 4, 2026
…held (#1993)

PR #1981's bridge fixed within-namespace same-tail heritage (NS::A::Inner vs NS::B::Inner). The residual: a cross-namespace same-tail base (NS1::A::Inner vs NS2::A::Inner) both key the namespace-omitted `A.Inner` in the qualifiedNames index, so resolveQualifiedInheritanceBase couldn't pick a winner and the deriving classes cross-wired (DB's EXTENDS bound to NS1's A::Inner).

Fixed bridge-held via the existing `namespacePrefix` sidecar — no qualifiedName invariant flip, no resolution-index re-keying: (1) tagNamespacePrefixes also tags defs declared directly in a namespace (the deriving NS1::DA), composed identically to the class-nested path; (2) resolveQualifiedInheritanceBase breaks a same-tail tie by preferring the candidate whose namespacePrefix matches the deriving class's. Two-phase lookup, UDC, brace-init, file-local linkage untouched (def.qualifiedName + index keys unchanged).

New cpp-cross-namespace-same-tail fixture + registry-primary test (in the cpp parity expected-failures). Verified: cpp suite 287/287 primary, 209 + 78 skips legacy — no regression; tsc + prettier clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
magyargergo added a commit that referenced this pull request Jun 4, 2026
…held (#1993) (#2005)

* fix(cpp): resolve cross-namespace same-tail inheritance bases bridge-held (#1993)

PR #1981's bridge fixed within-namespace same-tail heritage (NS::A::Inner vs NS::B::Inner). The residual: a cross-namespace same-tail base (NS1::A::Inner vs NS2::A::Inner) both key the namespace-omitted `A.Inner` in the qualifiedNames index, so resolveQualifiedInheritanceBase couldn't pick a winner and the deriving classes cross-wired (DB's EXTENDS bound to NS1's A::Inner).

Fixed bridge-held via the existing `namespacePrefix` sidecar — no qualifiedName invariant flip, no resolution-index re-keying: (1) tagNamespacePrefixes also tags defs declared directly in a namespace (the deriving NS1::DA), composed identically to the class-nested path; (2) resolveQualifiedInheritanceBase breaks a same-tail tie by preferring the candidate whose namespacePrefix matches the deriving class's. Two-phase lookup, UDC, brace-init, file-local linkage untouched (def.qualifiedName + index keys unchanged).

New cpp-cross-namespace-same-tail fixture + registry-primary test (in the cpp parity expected-failures). Verified: cpp suite 287/287 primary, 209 + 78 skips legacy — no regression; tsc + prettier clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* test(cpp): worker-path parity for #1993 cross-namespace tie-break + correct narrative

Add the missing parse-worker.ts parity describe for the #1993 cross-namespace
same-tail heritage tie-break, mirroring the #1982/#1995 worker siblings
(workerThresholdsForTest minFiles:1/minBytes:1, workerPoolSize:2, usedWorkerPool
guard, and the same NS1.DA→NS1.A.Inner / NS2.DB→NS2.A.Inner base assertions), and
register both worker test names in LEGACY_RESOLVER_PARITY_EXPECTED_FAILURES['cpp']
(registry-primary-only, like the sequential entry). Closes the DoD sequential≡worker
gap flagged in the tri-review of PR #2005.

Also correct the fixture/test narrative: the pre-fix failure is a CROSS-WIRE (DB's
EXTENDS binds NS1::A::Inner via the refuse-on-tie scope-walk fallback), not a silent
miss — the empirical pre-fix run shows the edge exists but points at the wrong target.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* refactor(scope-resolution): type the namespacePrefix sidecar; regen cpp bench baseline (#1993)

F4 follow-up to #1993: declare `namespacePrefix?: string` on SymbolDefinition
(gitnexus-shared) and drop the six `as { namespacePrefix?: string }` casts in
walkers.ts / graph-bridge/ids.ts that #1993 introduced. Pure type-level — the `as`
assertions erase at compile time, runtime is byte-identical, and the field stays a
sidecar (no graph-node identity; the qualifiedName-keyed index is untouched).

Also regenerate the cpp scope-capture bench baseline: rebased onto main (now
carrying #1995's cpp fixtures), #1993 adds cpp-cross-namespace-same-tail, growing
the cpp-* corpus 272->273 and drifting the fingerprint d63ded6->6d6207ae. Pure
fixture-corpus drift — no scope-extractor change.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
magyargergo added a commit that referenced this pull request Jun 4, 2026
…ers (Ruby + Dart) (#1994)

The Ruby and Dart heritage/property pipelines encoded side-effect facts as ':'-delimited synthetic-import marker strings, hand-constructed and hand-parsed at ~8 sites with the field layout kept in agreement only by a comment — the fragility behind the #1981 edge-drop. Route every site through a single shared codec (utils/heritage-marker.ts: encodeMarker / decodeMarker / isHeritageMarker).

encodeMarker throws on a colon-bearing field so the silent-drop class becomes a loud failure; the ':' wire format is preserved byte-for-byte (ruby-captures-golden unchanged). Language-neutral — keyed only on the literal shared prefixes. Dart already single-sources its prefix and is heritage-only, so its import-target guard is left untouched (no invented __property__ path). Pure refactor: no new edges or behavior.

Verified: new codec unit test; ruby resolver + golden 155/155 (zero golden diff) and dart resolver 63/63 on registry-primary, both green on legacy; tsc + prettier clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
magyargergo added a commit that referenced this pull request Jun 4, 2026
…ers (Ruby + Dart) (#1994) (#2007)

* refactor(ingestion): share a codec for __heritage__/__property__ markers (Ruby + Dart) (#1994)

The Ruby and Dart heritage/property pipelines encoded side-effect facts as ':'-delimited synthetic-import marker strings, hand-constructed and hand-parsed at ~8 sites with the field layout kept in agreement only by a comment — the fragility behind the #1981 edge-drop. Route every site through a single shared codec (utils/heritage-marker.ts: encodeMarker / decodeMarker / isHeritageMarker).

encodeMarker throws on a colon-bearing field so the silent-drop class becomes a loud failure; the ':' wire format is preserved byte-for-byte (ruby-captures-golden unchanged). Language-neutral — keyed only on the literal shared prefixes. Dart already single-sources its prefix and is heritage-only, so its import-target guard is left untouched (no invented __property__ path). Pure refactor: no new edges or behavior.

Verified: new codec unit test; ruby resolver + golden 155/155 (zero golden diff) and dart resolver 63/63 on registry-primary, both green on legacy; tsc + prettier clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* refactor(dart): single-source DART_HERITAGE_PREFIX from the shared codec (#1994)

Alias DART_HERITAGE_PREFIX to HERITAGE_MARKER_PREFIX (utils/heritage-marker.ts)
instead of re-declaring the '__heritage__:' literal, so the Dart import-target
heritage guard cannot desync from the codec's encode/decode. Value-identical;
gives the codec prefix a direct production consumer. Addresses the tri-review
nit on PR #2007.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

1 participant