Skip to content

feat(cpp): resolve inheritance-lattice member lookup#2077

Merged
magyargergo merged 5 commits into
abhigyanpatwari:mainfrom
azizur100389:codex/cpp-inheritance-lattice
Jun 9, 2026
Merged

feat(cpp): resolve inheritance-lattice member lookup#2077
magyargergo merged 5 commits into
abhigyanpatwari:mainfrom
azizur100389:codex/cpp-inheritance-lattice

Conversation

@azizur100389

Copy link
Copy Markdown
Contributor

Summary

  • preserve C++ direct-base, virtual-base, and class-scope member using facts through worker side channels
  • resolve receiver members with inheritance-lattice dominance, conservative ambiguity suppression, virtual-diamond deduplication, and existing overload ranking
  • stop class-scope using Base::member declarations from being treated as namespace imports
  • add exact fixtures for unrelated-base ambiguity, derived hiding, virtual/non-virtual diamonds, and member-using overloads

Validation

  • npx tsc --noEmit
  • npx prettier --check on all changed TypeScript files
  • npx vitest run test/unit/scope-resolution/cpp/cpp-imports.test.ts (16/16)
  • npx vitest run test/integration/resolvers/cpp.test.ts (305/305)
  • REGISTRY_PRIMARY_CPP=0 npx vitest run test/integration/resolvers/cpp.test.ts (305/305)
  • full suite: 10,365 passed, 2 failed, 61 skipped; both failures reproduce unchanged on a detached clean upstream/main worktree (literal-collectors C# receiver-node assertion and Swift scope-capture timing tripwire)
  • GitNexus impact and pre-commit change-scope audits completed (medium risk, limited to expected C++/receiver-bound resolution flows)

Fixes #1891

@azizur100389 azizur100389 requested a review from magyargergo as a code owner June 8, 2026 10:02
@vercel

vercel Bot commented Jun 8, 2026

Copy link
Copy Markdown

@azizur100389 is attempting to deploy a commit to the NexusCore Team on Vercel.

A member of the Team first needs to authorize it.

Comment thread gitnexus/src/core/ingestion/languages/cpp/member-lookup.ts Fixed
@github-actions

github-actions Bot commented Jun 8, 2026

Copy link
Copy Markdown
Contributor

CI Report

All checks passed

Pipeline Status

Stage Status Details
✅ Typecheck success tsc --noEmit
✅ Tests success unit tests, 3 platforms
✅ E2E success gitnexus-web changes only

Test Results

Tests Passed Failed Skipped Duration
10768 10752 0 16 546s

✅ All 10752 tests passed

16 test(s) skipped — expand for details
  • COBOL pipeline benchmark > scales with file count
  • C++ ADL emit benchmark > emit phase scales sub-quadratically with co-scaled files and sites
  • C++ pipeline benchmark > scales with file count
  • C# pipeline benchmark > scales with file count — namespaces spread across the solution
  • C# pipeline benchmark > scales with file count — all types in one (global) namespace bucket
  • C# pipeline benchmark > scales with file count — all types in one (named) namespace bucket
  • Go pipeline benchmark > scales with file count (workers enabled)
  • Go pipeline benchmark — worker pool (issue Worker idle timeout kills long Go scope extraction and surfaces as Napi::Error during analyze #1848) > does not quarantine the large generated Go file on sub-batch idle timeout
  • Go structural interface detection benchmark > scales linearly with interface × struct count
  • Go structural interface detection split-phase benchmark > separates index-build and detection time
  • PHP pipeline benchmark > scales with file count (workers enabled)
  • Ruby pipeline benchmark > scales with file count (workers enabled)
  • Rust pipeline benchmark > scales with file count (workers enabled)
  • Vue pipeline benchmark > scales with component count
  • run.cjs direct-exec entrypoint (fix(cli): steer docs, skills, and hooks through a CLI-neutral project-local runner (#1939) #1945) > resolves a .cmd shim via the Windows shell branch, passing args and exit code
  • buildTypeEnv > known limitations (documented skip tests) > Ruby block parameter: users.each { |user| } — closure param inference, different feature

Code Coverage

Tests

Metric Coverage Covered Base Delta Status
Statements 74.93% 35163/46922 N/A% 🟢 ██████████████░░░░░░
Branches 62.67% 21753/34705 N/A% 🟢 ████████████░░░░░░░░
Functions 80.69% 3805/4715 N/A% 🟢 ████████████████░░░░
Lines 78.75% 31800/40379 N/A% 🟢 ███████████████░░░░░

📋 View full run · Generated by CI

@magyargergo magyargergo left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR Tri-Review — feat(cpp): resolve inheritance-lattice member lookup

Methods (5 engines, asymmetric independence). GitNexus risk/test-CI lanes + Compound-Engineering personas (ISO-C++ semantics, performance, correctness, adversarial, maintainability) — all Claude — plus two non-Claude engines: Codex (independent review) and CodeQL (SAST gate). Weight Codex/CodeQL + a Claude lane agreeing as strong; agreement among Claude personas alone is "consistent," not independent confirmation.

Credit where due. The hard semantics are largely right and well-tested: virtual-base dominance and linear-chain name-hiding, the virtual-vs-non-virtual diamond grouping for the primary shapes (incl. a mixed virtual/non-virtual edge to the same base), primary name-hiding, and base-clause virtual detection were all verified faithful by the ISO-C++ lane. Common-case cost is O(1) (own-method / no-inheritance early returns) and capture-phase scaling stayed linear (1.162 < budget 1.5). Several suspected bugs were probed and refuted (below) — that validation is a feature.

Merge blockers + one P1 correctness bug

1. member-lookup.ts:390 greedy /<.*>/g — MERGE BLOCKER (CodeQL) + functional bug. The most-corroborated finding here: CodeQL + Codex + ISO-C++ + correctness (two non-Claude engines). CodeQL high-severity alert #683 (js/incomplete-multi-character-sanitization) fires on this line — the HTML-injection sink is a false positive (output feeds C++ base-name matching, not HTML), but the gate keeps the PR BLOCKED until the regex is fixed or the alert is dismissed-with-reason. It is also a real functional defect: A<x>::B<y>"A" not "B" (the correctness lane executed it), so virtual/using facts on templated nested bases silently fail to match. One fix (split on :: first, then strip <…> per segment — or parse the node structurally) clears both.

2. GITNEXUS_BENCH red — MERGE BLOCKER (missing re-baseline). The new fixture cpp-member-lattice/main.cpp joins the lang-resolution/cpp-* corpus the bench fingerprints (fixture_count 273→274), so the cpp entry in bench/scope-capture/baselines.json must be re-baselined — it wasn't. Log: FAIL: cpp: capture fingerprint drift (got 0abf525f…, expected fd3d3768…). scaling_ratio 1.162 < budget 1.5not a perf regression. Regenerate the cpp fingerprint and add a _note (per the #1975/#1990/#1993 precedent).

3. member-lookup.ts:152-159 non-virtual diamond → FALSE edge (P1). The dominance filter (with isAncestor, 398-409) is virtuality-blind. For a non-virtual diamond where an intermediate class overrides the member, it emits a wrong CALLS edge where ISO C++ requires ambiguity — the one outcome this design tries hardest to avoid. Trigger: struct V{void f();}; struct A:V{void f();}; struct B:V{}; struct D:A,B{}; D d; d.f(); resolves to A::f (the V occurrence is dropped because isAncestor(V,A) is true), but C++ deems it ambiguous (A::f hides only A's V-subobject's f, not B's). The plainDiamondCall fixture omits the intermediate override and so masks this. (adversarial lane + coordinator re-trace; the ISO-C++ lane separately confirmed the virtual dominance path is correct, so this is specifically the non-virtual-subobject case.)

Inline findings (P2)

  • :316 (+ capture :357) — same-simple-name bases in one file collide: parents are matched by simple name only (qualifier dropped at capture), so a using/virtual fact can bind to the wrong base. (Codex + adversarial + ISO-C++)
  • :203 — the early return in collectInheritedOccurrences precedes the member-using merge, so an inner class with both own methods and using Base::f drops the using'd overloads → wrong target. (Codex + maintainability)
  • :190 (performance — emphasized in this review)resolveCppReceiverMember runs per call site with no (owner,member) memo; O(occurrences²) dominance (an isAncestor BFS per pair); and the walk enumerates root-to-base paths (exponential on deep non-virtual diamonds). Bounded by realistic inputs today, but latent given this indexer's documented O(n²) history.

Body-level (scope / process / maintainability)

  • this->member() bypasses the new hook (Codex-only, independent engine). receiver-bound-calls.ts Case 0.5 (376-462) never calls resolveReceiverMember, so an ambiguous this->f() diamond still emits a leftmost-MRO edge — inconsistent with typed-receiver value.f(). Appears pre-existing; the new ambiguity detection just isn't wired here.
  • No resolution-phase benchmark. The bench measures only the capture phase, so a green-after-rebaseline says nothing about the resolution-path cost above — worth a scaling fixture given the O(n²) history.
  • Maintainability nits: buildMro is overridden purely to run populateResolvedHierarchy as a side effect (a clearer lifecycle seam exists); CppReceiverMemberResolution duplicates the contract's ReceiverMemberResolution; simpleName is re-implemented a third time in cpp/; the new resolveReceiverMember contract param is typed Callsite though the caller passes ReferenceSite (compiles under strict:false; clarity nit); _scopes is unused.

Refuted — excluded (validation is a feature)

simpleName splitting on . is correct (C++ qualifiedName is dot-separated here; adl.ts:188); the Callsite→ReferenceSite change is runtime-safe (bivariant under strict:false; sole implementer typed Callsite); chooseOverload on an emptied candidate set falls through correctly; \0-joined keys can't collide (nodeId/filePath grammar); the worker collect/apply round-trip is complete (clearCppMemberLookupState runs before the apply loop); malformed bases (decltype/CRTP/template-template/anonymous) don't crash; cyclic inheritance terminates.

Test gaps

No coverage for: namespaced bases, cross-file bases, a non-virtual diamond with an override (finding #3), a worker-mode collect/apply round-trip unit test, or same-signature using-hiding.

CI

MERGEABLE but BLOCKED. Pass: typecheck/lint/format, tree-sitter ABI (all OS), packaged-install smoke, Analyze (js-ts) + Analyze (python) (the full CodeQL analysis itself passed), build/push. Fail (code): CodeQL gate (#683) + GITNEXUS_BENCH. Fail (infra): Vercel = deploy-auth. Pending: macOS/windows platform-sensitive, ubuntu coverage.

Coverage note

The GitNexus risk and test-CI persona lanes ended mid-investigation; those domains were synthesized from the cross-stream findings and the coordinator's own CI-log verification.


Automated multi-tool digest (Codex + CodeQL + 5 Claude persona lanes, vetted by an independent synthesis critic). Independence is asymmetric — verify before acting.

}

function trailingIdentifier(value: string): string {
const withoutTemplates = value.replace(/<.*>/g, '');

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[MERGE BLOCKER · CodeQL #683 + functional bug · reproduced] Greedy /<.*>/g is an incomplete strip: on a multi-segment templated name like A<x>::B<y> it consumes from the first < to the last >, yielding "A" instead of "B" (the correctness lane executed the regex). Effect: virtual/using facts on templated nested bases silently fail the simpleName(def) === baseName match → over-suppression. Separately, CodeQL fires high-severity js/incomplete-multi-character-sanitization here; the HTML-injection sink is a false positive (this output feeds base-name matching, not HTML) but the gate keeps the PR BLOCKED. Fix once: split('::') first, then strip <…> per segment (or read the qualified_identifier node structurally). Corroborated by CodeQL + Codex + ISO-C++ + correctness (two non-Claude engines).

@@ -0,0 +1,59 @@
struct Left {

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[MERGE BLOCKER · reproduced from CI] Adding this fixture grows the lang-resolution/cpp-* corpus that bench/scope-capture/measure.mjs fingerprints (fixture_count 273→274), so the cpp entry in bench/scope-capture/baselines.json must be re-baselined — it wasn't, so GITNEXUS_BENCH is red:

FAIL: cpp: capture fingerprint drift (got 0abf525f…, expected fd3d3768…)

scaling_ratio 1.162 < budget 1.5, so this is pure fixture-corpus drift, not a perf regression. Regenerate the cpp fingerprint and add a _note documenting the cpp-member-lattice addition (per the #1975/#1990/#1993 precedent). (baselines.json isn't in this diff, so the comment lands on the fixture that triggers the drift.)

);
if (occurrences.length === 0) return undefined;

const undominated = occurrences.filter(

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[P1 · false edge · code-read] This dominance filter (via isAncestor, 398-409) is virtuality-blind, so it over-applies the cross-lattice dominance rule ISO C++ reserves for virtual bases. For a non-virtual diamond with an intermediate override it emits a wrong CALLS edge where C++ requires ambiguity. Trigger:

struct V { void f(); };
struct A : V { void f(); };   // non-virtual V
struct B : V {};
struct D : A, B {};
void c(){ D d; d.f(); }

The code resolves A::f (the V occurrence is dropped because isAncestor(V,A) is true), but d.f() is ambiguous in C++ — A::f hides only A's V-subobject's f, not the one reached through B. The plainDiamondCall fixture omits the override and masks this. Fix: only let dominance cross a subobject boundary through a virtual edge; keep distinct non-virtual paths as separate groups so the site is suppressed as ambiguous. (The virtual-diamond and linear-chain dominance paths were separately verified correct.)

Comment thread gitnexus/src/core/ingestion/languages/cpp/member-lookup.ts Outdated
nextActive.add(ownerDefId);

const definitions = model.methods.lookupAllByOwner(ownerDefId, memberName);
if (definitions.length > 0) {

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[P2 · code-read] This early return fires before the member-using merge below (207-224), so an inner inherited class that has both its own method(s) and using Base::f; drops the using'd overloads. Trigger:

struct A { void f(int); };
struct B : A { void f(double); using A::f; };
struct D : B {};
void c(){ D d; d.f(1); }

Resolves to B::f(double) instead of A::f(int) — a wrong edge (C++ picks the exact int match from the using-introduced overload). Merge own methods + member-using entries into the occurrence's overload set before returning. (Codex + the maintainability lane flagged the same root-vs-inner asymmetry.)

readonly virtualAnchor?: string;
}

function collectInheritedOccurrences(

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[P2 · performance · code-read] This walk runs per receiver-bound call site with no (ownerDefId, memberName) memo (the lattice is immutable during the pass, so the result is invariant across sites), enumerates root-to-base paths rather than nodes (the per-path new Set(active) at 199 re-walks a node reachable via K paths K times → exponential on deep non-virtual diamonds), and feeds an O(occurrences²) dominance filter that runs a fresh isAncestor BFS per pair. Bounded by realistic C++ today, but given this indexer's documented O(n²)/exponential history on kernel-scale and generated code: add a per-pass memo keyed ${ownerId}\0${memberName} (cache the callsite-independent occurrence/dominance result; run only chooseOverload per site) and precompute an ancestor closure instead of per-pair BFS. No resolution-phase benchmark currently guards this. (performance lane.)

@abhigyanpatwari abhigyanpatwari left a comment

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hermes Agent Review

Verdict: Approve with suggestions — no critical issues. The algorithm is sound, the integration is clean, and the tests cover the right cases. Three suggestions below.


✅ What Looks Good

The inheritance-lattice algorithm is correct. The core loop in resolveCppReceiverMember (lines 117-181) correctly handles:

  1. Member-using declarations (using Base::select) — resolves them first via memberUsingsByDefId, picks the right overload via chooseOverload. Return-early at line 136 avoids descending into the lattice when using-declarations are in play — this is correct: using brings the base member into the derived class scope, so it is found directly.
  2. Direct declarations hide base declarationsif (ownMethods.length > 0) return undefined at line 140 correctly lets the shared MRO walk handle the simple case.
  3. DFS-based inherited occurrence collectioncollectInheritedOccurrences walks directParentsByDefId to find members in the inheritance lattice. Cycle detection via active set prevents infinite loops.
  4. Non-dominated filter — lines 152-159 correctly prune dominated occurrences (those where the member exists in an ancestor that is in the MRO of another occurrence owner). This handles the Dominant : Left, Right { void collide(); } case — the derived class declaration dominates both base declarations.
  5. Virtual-diamond deduplication via grouping keys — lines 160-169 group occurrences by (virtualAnchor, ownerDefId) or (path, ownerDefId). When a shared member comes through two virtual-base paths, both occurrences share the same virtualAnchor, so they merge into one group → groups.size === 1 → not ambiguous.
  6. Non-virtual diamond ambiguity — when the same declaration is reached through two non-virtual base sub-objects, each occurrence has a different path key → groups.size !== 1 → reported as ambiguous. This matches the C++ standard: you must qualify which base sub-object you mean.

The side-channel pattern is consistent. captureCppMemberLookupFacts extracts baseEdges and memberUsings at capture time; applyCppMemberLookupSideChannel restores on the worker path; populateResolvedHierarchy builds the resolved directParentsByDefId, virtualEdges, and memberUsingsByDefId maps from the captured facts + the graph EXTENDS edges. The integration into collectCppCaptureSideChannel / applyCppCaptureSideChannel follows the established pattern (same as ADL, file-local, inline-namespace, two-phase channels).

The buildMro contract extension is clean. Adding buildMro to ScopeResolver as a per-language override lets C++ build the MRO from its own member-lookup hierarchy. The buildCppMemberLookupMro wrapper delegates to the shared buildMro with defaultLinearize, so it only adds the populateResolvedHierarchy pre-step.

Test coverage is thorough. The fixture cpp-member-lattice/main.cpp covers:

  • Unrelated-base ambiguity (Ambiguous : Left, Right)
  • Derived hiding (Dominant : Left, Right { void collide(); })
  • Virtual diamond (VirtualLeft : virtual Root, VirtualRight : virtual Root → VirtualDiamond)
  • Non-virtual diamond (PlainLeft : Root, PlainRight : Root → PlainDiamond)
  • Member-using overloads (Derived : Base { using Base::select; void select(double); })

And cpp.test.ts has 6 assertions:

  • Ambiguous call: 0 CALLS edges
  • Dominant call: 1 CALLS edge, targets main.cpp
  • Virtual diamond: 1 CALLS edge
  • Non-virtual diamond: 0 CALLS edges
  • Using-call: resolves to select(int) overload
  • Resolution outcomes: both conservatively suppressed with reason member-lookup-ambiguous

💡 Suggestions

1. isAncestor uses BFS with shift() — O(n²) memory churn (line 398-409)

isAncestor uses queue.shift() in a loop. For deep class hierarchies, this is O(n²) because shift() on a large array re-indexes all elements. Consider using a manual index pointer instead:

function isAncestor(ancestorDefId: string, descendantDefId: string): boolean {
  const queue = [...(directParentsByDefId.get(descendantDefId) ?? [])];
  const seen = new Set<string>();
  let head = 0;
  while (head < queue.length) {
    const current = queue[head++];
    if (current === ancestorDefId) return true;
    if (seen.has(current)) continue;
    seen.add(current);
    const parents = directParentsByDefId.get(current);
    if (parents !== undefined) queue.push(...parents);
  }
  return false;
}

This is O(n) and avoids array re-indexing on every iteration. Not a blocker — this is called in the undominated filter at line 152, and C++ class hierarchies are rarely deep enough to matter. But given how clean the rest of the code is, fixing this micro-optimization is worth it.

2. populateResolvedHierarchy silently ignores graphIdToClassDef mismatches (line 327-328)

populateResolvedHierarchy builds defByGraphId and parents from graph.iterRelationshipsByType(EXTENDS). Then when building nextVirtualEdges, it resolves captured baseName by finding the EXTENDS parent whose simpleName matches (line 309-310). If the graph EXTENDS relationship and the captured base-edge describe the same inheritance but with different identifiers, this silently produces no virtual edge. Consider logging a warning when edge.isVirtual && parent === undefined so debugging is easier when resolution fails.

3. Consider caching collectInheritedOccurrences results per (ownerDefId, memberName)

collectInheritedOccurrences re-walks the entire inheritance lattice for every call site with the same (ownerDefId, memberName). In large C++ codebases with many ptr->method() calls on the same type, this is wasted work. A simple Map cache keyed by ${ownerDefId}:${memberName} could help. Not urgent — the current implementation is correct and the walk only hits classes that actually have the member (line 203 early-returns on first definition found).


Risk Assessment

  • Correctness risk: LOW. The algorithm correctly implements C++ member-name lookup semantics (direct hiding, virtual diamond deduplication, non-virtual diamond ambiguity).
  • Performance risk: LOW. The lattice walk only recurses into base classes that do NOT have the member themselves, and stops at the first occurrence. Only diamond cases cause deeper traversal.
  • Integration risk: VERY LOW. The resolveReceiverMember hook is opt-in — only C++ sets it, all other languages continue using the shared MRO walk. The side-channel follows the established pattern used by ADL, two-phase lookup, file-local linkage, and inline namespaces.

Reviewed by Hermes Agent via Claude Code analysis

@azizur100389

azizur100389 commented Jun 9, 2026

Copy link
Copy Markdown
Contributor Author

Addressed the tri-review findings in commit 9247d8d.

Changes:

  • replaced the greedy template regex with top-level qualified-segment parsing (A::B now normalizes to A.B), clearing the functional defect behind CodeQL [SM-18] Delete lookupFuzzy, lookupFuzzyCallable, globalIndex, callableIndex #683
  • fixed non-virtual diamond dominance: ancestry only suppresses a candidate reached through a shared virtual subobject
  • preserved qualified child/base identities in the worker side-channel and conservatively suppresses unresolved same-simple-name collisions
  • merged intermediate-class own methods with using Base::f overloads before terminating lookup
  • routed explicit this->member() through the same lattice hook
  • added per-pass (owner, member) lookup caching, precomputed ancestor closure, and a conservative traversal budget
  • re-baselined the C++ capture fingerprint after expanding the fixture corpus; 14-language benchmark passes, C++ scaling 1.032 < 1.5

Regressions added for non-virtual override diamonds, inherited using overloads, qualified same-name bases, nested templated bases, explicit this receivers, cross-file bases, and worker JSON round-trip.

Validation:

  • npx tsc --noEmit
  • full C++ resolver suite: 311/311 passed
  • focused side-channel test: passed
  • format check: passed
  • ESLint: 0 errors
  • scope-capture benchmark: PASS (14 languages)
  • full npm test executed: 10,365 passed / 61 skipped. Local non-PR failures were isolated: temp-directory git tests pass 114/114 with TMP/TEMP outside the enclosing C:\Users\erazi.git; the remaining literal-collector assertion is unchanged from upstream and outside this PR's files. CI will provide the clean environment confirmation.

@magyargergo magyargergo merged commit 3a4247e into abhigyanpatwari:main Jun 9, 2026
27 of 28 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

cpp lookup: inheritance-lattice member lookup (dominance, ambiguous/virtual bases, member using-declarations)

4 participants