fix(ingestion): materialize graph nodes for scoped class/module/impl declarations (#1975)#1977
Conversation
…oped-declaration nodes (U1, #1975) Adds findDanglingEdges() and pipeline-level tests asserting that Ruby namespaced class/module declarations materialize a Class/Trait node with a resolving HAS_METHOD edge. Red by design on the pre-fix base (5 failing) — the fix lands in U2 (shared core) + U3 (Ruby enablement). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ule declarations (U2/U3, #1975) Widen the Ruby legacy structure query so `class Foo::Bar` / `module Baz::Qux` (name field is a scope_resolution node) match @definition.class/.module as separate top-level patterns. The node is keyed by its full scoped name, which matches the HAS_METHOD owner id that findEnclosingClassInfo derives from the same name field — so the previously-dangling ownership edges now resolve, and distinct namespaces (Foo::Bar vs Baz::Bar) stay distinct nodes (no collision). No change to findEnclosingClassInfo (zero call-resolution blast radius) and no scope-extractor/golden/bench impact — the fix is purely the legacy structure query gate. Finalizes the U1 target assertions to the qualified-name identity. Validated: 134/134 Ruby resolver tests pass on BOTH legs; tsc --noEmit clean; dangling HAS_METHOD edges on the ruby-namespaced fixture drop from 3 to 0. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…rship (U4, #1975) For an out-of-line `struct Outer::Inner { ... }`, the container name is a qualified_identifier, so findEnclosingClassInfo derived the owner id from the full `Outer::Inner` text — but the type is keyed by its in-class declaration (the nested `Inner` node), leaving the method's HAS_METHOD edge dangling. Reduce a qualified_identifier container name to its tail segment for the owner id/name, matching how inline nested definitions are already keyed. Node-type scoped, so Ruby's scope_resolution names stay full (distinct-by-namespace) and no language is named in shared code. Only out-of-line-def methods (already dangling) change behavior — zero impact on bare classes or call resolution. Validated: C++ 268/268 default leg, 205+63-skip legacy leg, no regression; 2 new target tests pass both legs; Ruby namespaced tests still pass; tsc clean; scope-capture bench rebaselined (cpp +cpp-out-of-line-class fixture) — --check PASS (13 langs). Dangling HAS_METHOD on the new fixture: 1 -> 0. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…1975) `impl path::Type` and `impl Trait for path::Type` name the target with a scoped_type_identifier. Two coordinated fixes: - findEnclosingClassInfo: reduce a scoped_type_identifier impl target to its trailing type name (both the trait-impl `for` branch and the inherent branch), matching the type's own tail-keyed declaration. - tree-sitter-queries: add a @definition.impl arm for scoped inherent impls so the Impl node is materialized (keyed by the same tail) instead of missing. Together the trait-impl method owns through the real Struct node and the inherent-impl method owns through a real Impl node — no dangling edges. Rust's scoped_type_identifier has a name: field, so the tail extraction is exact. Validated: Rust 163/163 on BOTH legs, no regression; new target test passes; C++/Ruby suites unaffected; tsc clean; scope-capture bench rebaselined (rust +rust-scoped-impl fixture) — --check PASS (13 langs). Dangling 1 -> 0. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…t captures goldens (U6, #1975) - Add ruby-tail-collision fixture + test: Foo::Bar and Baz::Bar share the tail 'Bar' but must stay two distinct Class nodes (locks the KTD-2 anti-collision guarantee from full-scoped-name keying). No dangling, no cross-wiring. - Regenerate the ruby + rust captures goldens for the fixtures added in U3-U6 (ruby-tail-collision, rust-scoped-impl). Both diffs are additive-only — a single new entry each, existing entries byte-identical (no capture-logic drift; the fixes are in the legacy structure query + findEnclosingClassInfo, not the scope-extractor). - Re-baseline the ruby scope-capture fingerprint (81->82 fixtures). N/A-language verification: C#/Java/PHP have no class-declaration scoped-name gap and show no regression (606 passed; the 2 C# worker-pool failures are the known worktree 'parse-worker.js not built' limitation, unrelated to this change). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
✨ PR AutofixFound fixable formatting / unused-import issues across 32 changed lines. Comment |
CI Report✅ All checks passed Pipeline Status
Test Results
✅ All 10918 tests passed 12 test(s) skipped — expand for details
Code CoverageTests
📋 View full run · Generated by CI |
magyargergo
left a comment
There was a problem hiding this comment.
PR #1977 Tri-Review — scoped class/module/impl graph nodes (#1975)
Methods & engines. GitNexus risk / test-CI lanes + Compound-Engineering personas (correctness, adversarial, maintainability, testing) — all Claude — plus Codex (gpt-5.5, xhigh) as the one independent engine (live; mostly static GitHub inspection this run), plus claude-mem context (the Rust bareTypeIdentifier tail convention; the separate inheritance-edge path). Full disclosure: this PR was authored by the same coordinator running the review — it's a self-review, and it caught a P1 in the author's own implementation. The P1/P2 below were reproduced by the coordinator and corroborated by Codex + ≥2 Claude lanes.
🔴 P1 · reproduced · ship-blocker — C++/Rust tail-reduction collides same-tail types in one file
To kill the dangling owner edge, the PR reduces a C++ qualified_identifier / Rust scoped_type_identifier container to its bare tail. The owner node id is Struct:<file>:<tail>, so two out-of-line scoped types in the same file sharing a tail collapse:
- Reproduced:
struct Outer::Inner {…}+struct Other::Inner {…}in one file → a singleStruct:file:Innernode; bothfrom_outerandfrom_otherown through it (from_othersilently mis-attributed;Other::Innernever materialized). With same-named members the two methods merge into one node (symbol loss).DANGLING:0— so the PR's ownfindDanglingEdgescheck cannot detect it. - Adversarial reproduced the Rust analogue (
a::Inner+b::Inner→ oneImpl:Inner) and confirmed BASE kept them distinct (dangling, not merged) — so this trades a visible dangle for silent corruption for that case. - This is exactly the collision the PR deliberately avoided in Ruby by keeping the full scoped key. Fix: key the C++/Rust owner by the full qualified path (resolve to the type's
qualifiedName, e.g.Outer.Inner), not the bare tail — mirror the Ruby decision. (Cross-file same-tail types don't collide — the id is file-scoped — so the blast radius is single-file translation units with sibling nested types, common in C++.)
🟠 P2 · reproduced — C++ reduction is single-hop; 3+ level defs still dangle
struct A::B::C {…} → owner Struct:file:B::C [MISSING] (one childForFieldName('name') hop yields the middle B::C, not tail C; real node C / qn A.B.C). Same root cause as P1 — the bare-tail strategy is the wrong lever. (Rust's scoped_type_identifier is flat on name:, so it isn't hit.)
🟡 P2 · code-read — inaccurate comment / contract drift
The new ast-helpers.ts comment says the reduction "Mirrors the qualified_identifier handling above" — it's below; and it (plus the pre-existing NOTE) points at rust.ts:extractOwnerName, which actually lives at method-extractors/configs/rust.ts:extractOwnerName. That function was not updated and still returns the full scoped text — harmless today (it only feeds method metadata, not the owner edge) but the NOTE asserts a synchrony that no longer holds. Fix the references and note the deliberate divergence.
✅ What's solid (validated)
- Ruby is correct. Adversarial confirmed
Foo::Bar/Baz::Barstay distinct (full-key works) and a cross-fileFoo::Bar.new; obj.xresolves (R4 holds). The Ruby half is sound. - No inheritance regression — Codex + correctness confirmed the
IMPLEMENTSedge forimpl Trait for outer::Inneris emitted via a separate (@heritage/preEmitInheritanceEdges) path the owner-reduction doesn't touch. - Golden/bench re-baselines are additive-only;
bench --checkPASS (14 langs);tscclean.
🧪 Test gaps (why the P1 shipped green)
- No same-tail collision fixture for C++/Rust — the exact breaking case (Ruby has
ruby-tail-collision; add the parallel).findDanglingEdges == []is necessary but not sufficient — it passes on the mis-attribution; add positive owner-identity assertions. - IMPLEMENTS edge for the scoped trait-impl unasserted; the Rust target test uses plain
it(no parity wrapper → no legacy-leg differentiation); cross-file R4,class Foo::Bar < Baseheritage, andqux_methodownership untested.
CI / merge
Local validation passed (both legs, unit 1177, bench, tsc) but missed the collision (single-type fixtures only). Recommendation: do not merge as-is — rework the C++/Rust owner to a full-qualified key (the Ruby approach). The Ruby portion is sound and could ship independently.
Automated multi-tool digest (GitNexus + Compound-Engineering + Codex + claude-mem) — a self-review of the author's own PR. The P1/P2 were reproduced by the coordinator; verify before acting.
| // `class Foo::Bar` has no in-class declaration, so its node IS keyed by the | ||
| // full scoped name (and must stay distinct from a separate `Baz::Bar`). | ||
| const ownerNameNode = | ||
| nameNode.type === 'qualified_identifier' |
There was a problem hiding this comment.
🔴 P1 (reproduced) + 🟠 P2 (reproduced) — the bare-tail owner key is the wrong lever.
P1: two out-of-line scoped types sharing a tail in the same file collide. Reproduced — struct Outer::Inner{…} + struct Other::Inner{…} → one Struct:file:Inner node; from_outer AND from_other both own through it (from_other mis-attributed, Other::Inner never created); same-named members merge into one method node. DANGLING:0, so findDanglingEdges can't catch it. BASE kept them distinct → a regression to silent corruption for this case.
P2: childForFieldName('name') is a single hop — struct A::B::C{…} reduces to the middle B::C, not tail C, so it still dangles (Struct:file:B::C MISSING).
Fix: key the owner by the full qualified path (resolve to the type's qualifiedName, e.g. Outer.Inner / A.B.C), not the bare tail — the same full-key decision made for Ruby scope_resolution. [reproduced]
| // method owns through the real node instead of a `path::Type` id that was | ||
| // never materialized (#1975). Mirrors the qualified_identifier handling | ||
| // above and rust.ts:extractOwnerName. | ||
| const implTargetName = (n: SyntaxNode): string => |
There was a problem hiding this comment.
🔴 P1 (reproduced, Rust analogue) + 🟡 comment nit.
Collision: same bare-tail issue in Rust — impl a::Inner{…} + impl b::Inner{…} in one file → one Impl:file:Inner node, the second impl's methods mis-attributed (adversarial reproduced; BASE emitted no false owner). Key by the full module-qualified path instead. [reproduced]
Nit: this comment says it "Mirrors the qualified_identifier handling above" — that handling is below (the generic nameNode block); and rust.ts:extractOwnerName is actually method-extractors/configs/rust.ts:extractOwnerName, which was not updated (still returns full text — harmless today, metadata-only, but the NOTE implies a synchrony that no longer holds). [code-read]
…ly (#1975) The self-tri-review of PR #1977 (review 4411683756) found — and reproduced — that the C++/Rust tail-reduction in findEnclosingClassInfo collides same-tail types declared in the same file (struct Outer::Inner + struct Other::Inner -> one Struct:Inner node, methods silently mis-attributed; same-named members merge). Root cause is pre-existing: GitNexus keys nested-type nodes by their tail name within a file, so even plain inline same-tail nested types already merge. A correct fix needs fully-qualified nested-type node identity — a broad change deferred to #1978. This reverts the C++ (qualified_identifier) and Rust (scoped_type_identifier impl) owner reductions in ast-helpers.ts, the Rust @definition.impl scoped arm, and the cpp/rust fixtures+tests+golden+bench entries. The Ruby fix is unaffected (it keys the node by the full scoped text — no collision) and stays: namespaced class/module node materialization + the cross-namespace collision test. Validated Ruby-only: 136/136 both legs; ruby+rust captures goldens 19/19; bench --check PASS (14 langs); tsc clean. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ship (#1975) Re-introduces the C++/Rust fix the tri-review reverted, using a collision-safe approach instead of owner tail-reduction (which merged same-tail types in one file). Key the scoped DECLARATION's node by its full qualified text so it matches the owner id and stays distinct from a same-tail type elsewhere: - C++: widen the legacy structure query to materialize a node for out-of-line defs (class/struct Outer::Inner — name is qualified_identifier), keyed by the full text. No findEnclosingClassInfo change needed — BASE already derives the full-text owner, which now matches. Outer::Inner and Other::Inner stay distinct; 3-level A::B::C resolves. (A redundant forward-decl node remains.) - Rust: @definition.impl arm for scoped inherent impls (keyed full) + findEnclosingClassInfo inherent-impl branch accepts scoped_type_identifier with full text. impl a::Inner and impl b::Inner stay distinct. Collision-aware fixtures + positive owner-identity assertions (per the tri-review) replace the single-type fixtures. Deferred to #1978: Rust trait impls on a scoped struct path (impl T for a::Inner) and the pre-existing inline same-tail node collision — both need qualified struct-node identity. Validated: Ruby 136/136, C++/Rust 434/434 both legs (371+63-skip legacy); ruby+rust captures goldens 19/19 (additive); bench --check PASS (14 langs); tsc clean. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Resolves #1975 (the PR #1972 P2 follow-up). Scoped class/module/impl declarations now materialize a real graph node and own their methods through it, instead of leaving dangling
HAS_METHODedges.What's fixed (all collision-safe)
class Foo::Bar/module Baz::Qux— node keyed by the full scoped name.Foo::Bar≠Baz::Bar(distinct).struct/class Outer::Inner { … }— the legacy structure query now materializes a node keyed by the fullqualified_identifiertext, which matches theHAS_METHODowner id.Outer::InnerandOther::Innerstay distinct (no merge/mis-attribution); 3-levelA::B::Cresolves.impl a::Inner { … }— a@definition.implarm materializes the Impl node keyed by the full scoped text;findEnclosingClassInfoowns the methods through it.impl a::Innerandimpl b::Innerstay distinct.The unifying principle (same as the Ruby decision): key the scoped declaration's node by its full qualified text so node id == owner id and same-tail types in different scopes never collide. This deliberately avoids owner tail-reduction, which an earlier revision used and the self tri-review reproduced as a silent same-tail collision (
Outer::Inner+Other::Inner→ one merged node).Tests
A
findDanglingEdgesgraph-integrity helper + per-language pipeline tests with positive owner-identity assertions and same-tail collision fixtures (Outer::Inner+Other::Inner,impl a::Inner+impl b::Inner,Foo::Bar+Baz::Bar) — not just dangle-free checks, so a regression to mis-attribution would fail.Validation
Ruby 136/136, C++/Rust 434/434 on both resolver legs (371+63-skip legacy); ruby+rust captures goldens 19/19 (additive);
bench --checkPASS (14 langs);tsc --noEmitclean.Deferred to #1978
impl T for a::Inner) — references a simply-keyed struct, needs qualified struct-node identity.struct Outer{struct Inner}+struct Other{struct Inner}already merge onmain, independent of this PR).🤖 Generated with Claude Code