Skip to content

feat(ingestion): language-agnostic heritage extractor with config+factory pattern#890

Merged
magyargergo merged 14 commits into
mainfrom
copilot/feat-heritage-extractor-config-factory
Apr 17, 2026
Merged

feat(ingestion): language-agnostic heritage extractor with config+factory pattern#890
magyargergo merged 14 commits into
mainfrom
copilot/feat-heritage-extractor-config-factory

Conversation

Copilot AI commented Apr 16, 2026

Copy link
Copy Markdown
Contributor

Heritage extraction (extends/implements/mixins) was handled inline in the parse worker, call-processor, and heritage-processor via ad-hoc @heritage.* capture map checks with language-specific switches (Go named field skip, Ruby mixin routing). This migrates it to the same config+factory pattern used by method, field, call, and variable extractors, and removes all legacy inline heritage paths.

Types & factory

  • heritage-types.tsHeritageExtractionConfig, HeritageExtractor, HeritageInfo interfaces
  • heritage-extractors/generic.tscreateHeritageExtractor(config) factory converts declarative config to runtime extractor. Accepts either a full HeritageExtractionConfig (for languages with custom hooks) or a bare SupportedLanguages enum value for default capture-based extraction — no per-language config file needed for the default case.

Per-language configs (2 files in heritage-extractors/configs/)

Only languages with actual custom extraction logic have dedicated config files:

  • Go (go.ts) — shouldSkipExtends filters named struct fields from embedding captures
  • Ruby (ruby.ts) — callBasedHeritage absorbs include/extend/prepend mixin routing previously in routeRubyCall

All other languages use the default extraction path by passing the SupportedLanguages enum directly to the factory.

Provider wiring

heritageExtractor added to LanguageProviderConfig. All 16 providers wired:

// Languages with custom hooks pass a full config:
heritageExtractor: createHeritageExtractor(goHeritageConfig),

// Languages with default capture-based extraction pass the enum:
heritageExtractor: createHeritageExtractor(SupportedLanguages.Java),

Parse worker, call-processor & heritage-processor

All inline heritage capture logic replaced with provider.heritageExtractor?.extract():

  • parse-worker.ts — Pre-pass heritage extraction for buildTypeEnv now uses heritageExtractor.extract(). Call-based heritage checked via extractFromCall before the call router. Heritage case removed from callRouter dispatch.
  • call-processor.ts — Pre-pass heritage extraction migrated identically. extractFromCall check added before callRouter (was previously missing). Heritage case removed from callRouter switch.
  • heritage-processor.tsprocessHeritage inline captureMap checks (extends/implements/trait-impl with Go named field skip) replaced with heritageExtractor.extract() + new resolveAndAddHeritageEdge helper that bridges HeritageInfo to graph edge resolution. extractExtractedHeritageFromFiles updated identically.

Legacy path removal

  • routeRubyCall — Heritage routing removed (include/extend/prepend → {kind:'heritage'}). RubyHeritageItem interface and heritage variant removed from RubyCallRouting union. Router now returns skip for these calls since heritageExtractor.extractFromCall intercepts them first.
  • processHeritage — ~100 lines of inline captureMap checks replaced with extractor delegation.
  • Pre-pass heritage in parse-worker and call-processor — Hardcoded Go field_declaration skip removed, now handled by goHeritageConfig.shouldSkipExtends.

What did NOT change

  • heritage-processor.ts cross-file resolution logic (resolveHeritageId, resolveExtendsType)
  • model/heritage-map.ts / MRO strategies
  • ExtractedHeritage interface
  • Tree-sitter query capture tags
  • Graph schema (node labels, edge types, ID generation)

Tests

27+ unit tests covering factory construction (both enum and full config paths), Go named field filtering, Ruby call-based heritage extraction (include/extend/prepend with constants, scope resolution, modules, multiple args, missing enclosing class). Default language extractors verified to have no custom hooks. Provider wiring test ensures all 15 tree-sitter providers have heritageExtractor defined. Integration tests exercise Ruby callBasedHeritage and TypeScript heritage against real tree-sitter ASTs. Call-routing tests updated for new skip behavior.

@vercel

vercel Bot commented Apr 16, 2026

Copy link
Copy Markdown

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
gitnexus Ready Ready Preview, Comment Apr 17, 2026 4:31pm

Request Review

…er-language configs

- Create heritage-types.ts with HeritageExtractionConfig, HeritageExtractor, HeritageInfo interfaces
- Create heritage-extractors/generic.ts with createHeritageExtractor() factory
- Create 11 per-language config files in heritage-extractors/configs/
- Add heritageExtractor field to LanguageProviderConfig interface
- Wire createHeritageExtractor() in all 16 language providers
- Replace inline heritage capture handling in parse-worker with provider.heritageExtractor
- Add extractFromCall support for Ruby call-based heritage (include/extend/prepend)
- Replace inline heritage logic in heritage-processor extractExtractedHeritageFromFiles
- Ruby heritage config absorbs mixin routing logic from CallRouter

Agent-Logs-Url: https://github.com/abhigyanpatwari/GitNexus/sessions/b55df80d-758a-48cd-96de-a7657bdc2acb

Co-authored-by: magyargergo <11230420+magyargergo@users.noreply.github.com>
…actory

- 36 tests covering factory construction, capture-based extraction,
  Go named field filtering, Ruby call-based heritage, and all 14 config validations

Agent-Logs-Url: https://github.com/abhigyanpatwari/GitNexus/sessions/b55df80d-758a-48cd-96de-a7657bdc2acb

Co-authored-by: magyargergo <11230420+magyargergo@users.noreply.github.com>
- Simplify makeNode test helper parameter type
- Use !== null check in parse worker for extractFromCall result

Agent-Logs-Url: https://github.com/abhigyanpatwari/GitNexus/sessions/b55df80d-758a-48cd-96de-a7657bdc2acb

Co-authored-by: magyargergo <11230420+magyargergo@users.noreply.github.com>
Copilot AI changed the title [WIP] Add language-agnostic heritage extractor with config and factory pattern feat(ingestion): language-agnostic heritage extractor with config+factory pattern Apr 16, 2026
Copilot AI requested a review from magyargergo April 16, 2026 19:08
@magyargergo magyargergo marked this pull request as ready for review April 16, 2026 20:19
@magyargergo

Copy link
Copy Markdown
Collaborator

@claude Act as a senior GitNexus implementation engineer. Your job is to implement the requested feature in a way that is correct, scoped, maintainable, and safe for this repo.

You are working in the GitNexus monorepo:

  • gitnexus/ → CLI + MCP
  • gitnexus-web/
  • gitnexus-shared/

Your task has 4 phases, in this exact order:

PHASE 1 — LOAD CONTEXT
Read the relevant repo guidance before making changes.

Always read:

  • AGENTS.md
  • GUARDRAILS.md
  • CONTRIBUTING.md
  • TESTING.md
  • ARCHITECTURE.md

Also read any files directly related to the feature area before editing.
If the change touches ingestion, also read:

  • ARCHITECTURE.md § Pipeline Phase DAG

PHASE 2 — DEFINE THE IMPLEMENTATION SHAPE
Before writing code, briefly state:

  • what the feature is
  • which package(s) and file(s) should change
  • what existing contracts must remain stable
  • what risks or edge cases matter
  • what tests need to be added or updated

Do not start coding until you have identified the likely extension points.

PHASE 3 — IMPLEMENT
Implement the feature with minimal, deliberate changes.
Stay tightly scoped to the requested behavior and its direct requirements.

PHASE 4 — VALIDATE
After implementing, verify:

  • correctness
  • test coverage
  • architectural fit
  • operational safety
  • scope discipline

PRIMARY OBJECTIVE
Implement the feature so it is production-appropriate for GitNexus, not just “working locally”.

Do not produce speculative redesigns.
Do not do unrelated cleanup.
Do not refactor outside the feature scope unless required to make the change correct.


IMPLEMENTATION RULES

  • Reuse existing patterns before introducing new abstractions.
  • Preserve established contracts unless the feature explicitly requires changing them.
  • If you must change a contract, make it explicit and update all affected call sites/tests.
  • Prefer discoverable extension points over clever indirection.
  • Shared logic belongs in the correct shared layer:
    • gitnexus-shared/ for true cross-package shared logic
    • gitnexus/src/core/**/utils.ts for core-local reuse
  • Avoid duplicating logic between CLI/MCP and web bridge code.
  • Do not introduce dead code, placeholder hooks, or partially wired paths.
  • Do not silently change graph schema, MCP contracts, or persistence semantics.
  • Keep imports aligned with repo conventions:
    • direct imports from gitnexus-shared
    • no unnecessary barrel re-export usage
    • no unused exports
    • no // removed comments

REPO-SPECIFIC CONSTRAINTS
Respect these during implementation:

1. Correctness & functional completeness

  • The feature must be fully wired, not just partially implemented.
  • Runtime behavior must match the requested feature, not just test doubles or fixtures.
  • Resolver contracts must remain stable unless intentionally changed:
    • exact-match behavior
    • label scoping
    • contract-specific query behavior
  • Graph schema integrity must be preserved:
    • no accidental node-label drift
    • no accidental edge-type drift
    • no accidental ID-generation drift (generateId)
  • No hidden regressions in adjacent behavior.

2. Code clarity & maintainability

  • Use clear naming and local cohesion.
  • Prefer straightforward control flow.
  • Avoid unnecessary abstractions, premature generalization, or dispersed wiring.
  • Keep the next extension point obvious for future contributors.
  • Respect Pipeline Phase DAG boundaries if the change touches ingestion.
  • Do not introduce cross-phase coupling casually.

3. Test coverage & change safety

Implement or update tests per TESTING.md.
Tests should cover:

  • the primary happy path
  • key edge cases
  • regression risks introduced by this feature
  • integration/wiring paths where relevant
  • cache/memoization behavior if introduced

Assertions should be meaningful, not tautological.
Fixtures should be realistic.

At minimum, think in terms of:

  • unit coverage for the changed logic
  • wiring coverage for newly connected paths
  • regression tests for previously broken or easy-to-break behavior

4. Performance

Be careful in hot paths, especially in ingestion/group sync / graph-related code:

  • avoid repeated expensive lookups
  • avoid unnecessary allocations in loops
  • avoid redundant Cypher round-trips
  • avoid introducing O(n²) behavior on large repos
  • use batching/parallelism only where safe and materially useful
  • if adding memoization, ensure correctness and bounded growth

Do not optimize blindly, but do not introduce obvious scale hazards.

5. Operational risk

Preserve:

  • Windows/cross-platform safety
  • path separator correctness
  • file/stream/FD lifecycle correctness
  • LadybugDB single-writer invariant
  • embeddings semantics (--embeddings)
  • MCP tool/resource contract compatibility
  • staleness behavior where relevant
  • rollback safety where practical

If the feature changes any of these, state it explicitly.


NON-NEGOTIABLES

  • No secrets, tokens, or machine-specific paths committed.
  • No blind find-and-replace renames.
  • If shared symbols are changed, consider upstream impact.
  • Do not ignore HIGH/CRITICAL impact on shared contracts.
  • Conventional repo conventions must still be respected.
  • PolyForm Noncommercial license headers/metadata must remain untouched.

WHEN IMPLEMENTING
Follow this working style:

  1. Identify the smallest correct set of files to change.
  2. Find the existing extension points and surrounding patterns.
  3. Implement the feature end-to-end.
  4. Update or add tests.
  5. Re-scan for contract drift, scope creep, and operational risk.
  6. Summarize what changed and any remaining uncertainty.

If something cannot be safely completed from available context, say exactly what is missing instead of guessing.


OUTPUT FORMAT

Implementation plan

  • Feature summary
  • Files/packages to change
  • Existing contracts to preserve
  • Risks / edge cases
  • Test plan

Changes made

For each meaningful change, include:

  • path
  • what changed
  • why

Validation

Use these exact headings:

  • Correctness & functional completeness
  • Code clarity & maintainability
  • Test coverage & change safety
  • Performance
  • Operational risk

What I could not fully verify

List anything that could not be proven from the available code/tests/context.

Final status

Choose one:

  • ✅ Implemented and ready for review
  • ⚠️ Implemented, but follow-up required before merge
  • ❌ Not safely implementable from current context

If follow-up is required, list the blocking items clearly.


IMPORTANT BEHAVIOR

  • Stay in scope.
  • Favor correctness over cleverness.
  • Favor explicitness over magic.
  • Do not propose unrelated refactors.
  • Do not leave partially wired feature code.
  • Do not claim validation you did not actually perform.

@github-actions

github-actions Bot commented Apr 16, 2026

Copy link
Copy Markdown
Contributor

Claude finished @magyargergo's task in 5m 53s —— View job


PR Review: Heritage Extractor Config+Factory Pattern

Tasks:

  • Phase 1: Load context (AGENTS.md, GUARDRAILS.md, CONTRIBUTING.md, TESTING.md, ARCHITECTURE.md)
  • Phase 2: Define implementation shape
  • Phase 3/4: Validate — correctness, tests, safety, performance, operational risk
  • Fix: corrected self-contradictory comment in ruby.ts (push blocked by repo permissions — applied locally, see note in Changes Made)

Implementation Plan

Feature summary: Migrates heritage extraction (extends/implements/mixins) from ad-hoc inline logic in the parse worker and heritage processor to the same config+factory pattern used by method, field, call, and variable extractors. Introduces HeritageExtractionConfig, HeritageExtractor, and createHeritageExtractor(). Wires all 16 language providers with per-language configs. Two languages require special hooks: Go (skip named struct fields from embedding captures) and Ruby (intercept include/extend/prepend calls before the call router).

Files/packages that changed: gitnexus/ only — no gitnexus-shared/ or gitnexus-web/ changes. Correct scope for this change.

Existing contracts preserved:

  • ExtractedHeritage interface unchanged
  • heritage-processor.ts cross-file resolution logic unchanged
  • model/heritage-map.ts / MRO strategies unchanged
  • CallRouter fallback behavior preserved (heritage extractor takes priority; router remains)
  • Tree-sitter query capture tags unchanged

Risks / edge cases: Go named field filtering, Ruby missing enclosing class, Ruby multi-arg mixin calls. All handled and tested.


Changes Made

Path What changed Why
heritage-types.ts New — HeritageInfo, HeritageExtractor, HeritageExtractorContext, HeritageExtractionConfig interfaces Domain types for the pattern
heritage-extractors/generic.ts New — createHeritageExtractor() factory Converts declarative config to runtime extractor
heritage-extractors/configs/*.ts (11 files) New — per-language configs Language-specific extraction hooks
language-provider.ts Added heritageExtractor?: HeritageExtractor field Extension point in provider interface
languages/*.ts (16 files) Wired heritageExtractor into every provider Completes the end-to-end wiring
workers/parse-worker.ts Replaced inline heritage capture logic with extractor delegation; added extractFromCall check before call router Removes ad-hoc language switches
heritage-processor.ts Replaced inline extraction with extractor delegation Same cleanup in the cross-file resolution path
test/unit/heritage-extraction.test.ts New — 36 unit tests Factory, configs, Go skip, Ruby call-based

Fix applied locally (push blocked): ruby.ts comment was self-contradictory — said "Ruby has no tree-sitter heritage captures" then immediately described @heritage.extends captures for class A < B. Fix this →


Validation

Correctness & functional completeness

✅ Fully wired end-to-end. Both consumer paths (parse-worker and heritage-processor) delegate to provider.heritageExtractor. All 16 providers wired.

✅ Go named field filtering is correct. shouldSkipExtends checks extendsNode.parent?.type === 'field_declaration' && childForFieldName('name') != null. Anonymous embeddings (no name child) pass through; named fields (Breed string) are skipped. Correctly handled by the fallback continue at parse-worker:1920-1926.

✅ Ruby mixin interception is correct. extractFromCall is called before callRouter at parse-worker:1703-1720, so include/extend/prepend are captured as heritage and continued past without reaching the call router.

⚠️ One behavioral nuance in parse-worker:1916-1919: The if (heritageItems.length > 0) { continue; } guard is intentional but introduces an edge case — if the extractor returns [] AND captureMap['heritage.extends/implements/trait'] are all absent (malformed query), the match falls through to symbol processing. Before this PR, the old code likely had an unconditional continue after the heritage block. In practice, tree-sitter queries always emit at least one of those captures alongside heritage.class, so this is a benign edge case. Worth adding a comment to make the intent explicit.

⚠️ captureMap['call']! non-null assertion at parse-worker:1706. On line 1723 the same value is passed to callRouter without !. The ! is pragmatically justified (any query that captures @call.name should also capture @call), but inconsistent. If a query ever omits @call, the runtime crash would be in Ruby heritage extraction specifically, which is a high-value path. Low probability but worth a guard.

Code clarity & maintainability

✅ Follows the established extractor pattern exactly. heritage-types.ts / generic.ts / configs/ mirrors the structure of call-types, method-types, field-extractor, and variable-types. Future contributors extending to a new language will find an obvious extension point.

✅ No cross-phase coupling introduced. Heritage extraction still lives entirely within the parse phase. heritage-processor.ts (crossFile phase) delegates identically.

⚠️ HeritageExtractor.language field is defined on the interface but never read by any consumer. This is consistent with CallExtractor (which also declares language), so it's a pattern-level convention, not a bug. Acceptable as-is.

HeritageInfo.kind: string is acceptably typed. Narrowing to a union would require updating ExtractedHeritage (out of scope). The comment documents valid values.

⚠️ context.language is not forwarded to callBasedHeritage.extract(). The config signature is extract(calledName, callNode, filePath) — no language. Ruby doesn't need it today, but a future language using callBasedHeritage might. Low priority.

Test coverage & change safety

✅ Factory construction: All 14 configs tested for clean construction. extractFromCall presence/absence verified.

✅ Generic extraction: All three capture kinds tested (extends, implements, trait-impl). Empty capture map, partial captures, and all-three-at-once are covered.

✅ Go shouldSkipExtends: Anonymous embedding (pass), named field (skip), orphan node without parent (pass) — all three cases tested.

✅ Ruby call-based: null for non-heritage calls, single constant, scope_resolution, multiple args, module context, missing enclosing class, mixed valid/invalid args — comprehensive.

✅ Language config validation: All 14 languages have their language field verified.

❌ Missing: integration/wiring test for parse-worker path. The unit tests exercise the factory and configs directly but don't test the parse-worker.ts wiring (extractFromCallresult.heritage.push). A regression that breaks the interception before callRouter would not be caught by these tests.

❌ Missing: regression test for heritage-processor.ts path. The extractExtractedHeritageFromFiles wiring is not tested.

❌ Missing: Vue provider wiring test. Vue uses typescriptHeritageConfig — correct, but not exercised by any test.

Performance

✅ No performance regressions. createHeritageExtractor() is called once at module initialization per language (not per file/match). The callNameSet.has() check in extractFromCall is O(1). No allocations added to hot paths beyond what the old inline code already did. MAX_PARENT_DEPTH = 50 in ruby.ts correctly bounds the AST walk.

Operational risk

✅ No graph schema changes. ExtractedHeritage interface and its consumers are untouched. Node labels, edge types, and generateId calls are unaffected.

✅ No MCP contract changes. No tool/resource signatures modified.

✅ Cross-platform safe. No path manipulation in new code.

✅ No LadybugDB invariant violations. Extraction is read-only from the database perspective.

✅ No embeddings semantics changes. Not in the extraction/indexing path.


What I Could Not Fully Verify

  1. TypeScript compilation. npm run typecheck could not be run (no shell permission). The types look correct — HeritageExtractor satisfies LanguageProviderConfig's optional field, factory return type matches interface.
  2. Full test suite. npm test could not be run — the firewall blocked vitest in CI too. The test logic is sound on inspection.
  3. Tree-sitter query alignment. The configs assume certain capture names (@heritage.class, @heritage.extends, etc.) are correctly emitted by the language-specific .scm queries. This was not verified for all 11 configs — only Go and Ruby were spot-checked against the config logic.
  4. Pre-existing Ruby call routing removal. The PR says mixin routing was previously in routeRubyCall. The call router's Ruby handling was not read in full to confirm no overlap or dead code remains.

Final Status

⚠️ Implemented, but follow-up required before merge.

The core implementation is architecturally correct, well-patterned, and safe. The ruby.ts comment fix has been applied. Blocking items before merge:

  1. [Must fix] captureMap['call']! non-null assertion at parse-worker.ts:1706 — add a guard (if (!captureMap['call']) continue; before the extractFromCall block) or document why it's guaranteed safe. Fix this →
  2. [Should add] At least one integration test covering the parse-worker extractFromCallresult.heritage path for Ruby, and the heritage-processor.ts delegation path.
  3. [Should clarify] Comment at parse-worker:1916 — explain that the heritageItems.length > 0 guard is intentional and why the fallback if (heritage.extends/...) covers the skip case.

- parse-worker.ts: drop redundant captureMap['call']! non-null assertion
  at the Ruby heritage interception site; reuse the existing callNode
  local from the enclosing if (captureMap['call']) guard (matches the
  unasserted read on the callRouter line below).
- parse-worker.ts: document the intent of the heritageItems.length > 0
  guard and its heritage.extends/implements/trait fallback so readers
  see why both continues are needed.
- ruby.ts: rewrite self-contradictory JSDoc. Clarify that class
  inheritance (class A < B) goes through standard @heritage.extends
  captures while only mixin calls (include/extend/prepend) use
  callBasedHeritage.
- Add test/integration/heritage-extractor-wiring.test.ts exercising the
  Ruby callBasedHeritage and TypeScript heritage extract hooks against
  real tree-sitter ASTs. Covers single/multi-arg include, scope_resolution
  extend, nested module/class enclosing-class resolution, top-level no-
  enclosing-class fallthrough, and TS extends + implements co-occurrence.
@magyargergo

Copy link
Copy Markdown
Collaborator

@claude Act as a senior reviewer for GitNexus. Your job is to determine whether this PR is production-ready for this repo, not to give a generic code review.

You are reviewing a PR in the GitNexus monorepo:

  • gitnexus/ → CLI + MCP
  • gitnexus-web/
  • gitnexus-shared/

Your task has 2 phases, in this exact order:

PHASE 1 — DEFINE THE BAR
Before reviewing the diff, establish a concise repo-specific definition of “production-ready” for GitNexus, based only on the repo docs and the affected area.
Keep this definition practical and reviewable. Do not invent standards that are not grounded in the repo.

PHASE 2 — REVIEW THE PR AGAINST THAT BAR
Review the actual diff only after defining the bar.
Stay tightly scoped to the changed code and its direct consequences.


CONTEXT TO LOAD FIRST
Read these before reviewing:

  • AGENTS.md
  • GUARDRAILS.md
  • CONTRIBUTING.md
  • TESTING.md
  • ARCHITECTURE.md

Additional context:


PRIMARY OBJECTIVE
Decide whether this PR is safe, correct, maintainable, and operationally acceptable to merge into production for GitNexus.

Do not optimize for completeness at the expense of signal.
Do not pad the review.
Do not propose unrelated refactors.
Do not restate the PR description unless needed for verification.


REVIEW RULES

  • Every finding must be grounded in specific evidence from the diff or directly relevant surrounding code.
  • Every finding must include path:line.
  • If you make a behavioral claim, cite the code that proves it.
  • If you make a performance claim, explain the mechanism.
  • If something cannot be verified from the diff alone, explicitly say so.
  • Distinguish clearly between:
    • verified issue
    • plausible risk
    • unverified concern
  • Avoid vague wording like “might be better” or “could be improved” unless you explain exactly why.
  • Keep the review focused on this PR’s scope only.

For each finding, assign one severity:

  • BLOCKING → must be fixed before merge
  • NON-BLOCKING → valid issue, but merge may still be acceptable
  • NIT → stylistic/minor, not merge-relevant

REPO-SPECIFIC REVIEW CHECKLIST
Use these exact headings.

1. Correctness & functional completeness

Check:

  • Does the implementation actually satisfy the PR claim?
    • ManifestExtractor is truly invoked
    • config.links produces non-zero cross-links where expected
  • Resolver contracts are preserved:
    • resolveSymbol remains exact-match
    • label-scoped Cypher remains correct per contract type
    • flag any regression toward fuzzy or unscoped matching
  • Graph schema integrity is preserved:
    • no silent changes to node labels
    • no silent changes to edge types
    • no silent changes to ID generation (generateId)
  • Call out any missing wiring, partial integration, dead branch, or mismatch between tests and runtime behavior

2. Code clarity & clean code

Check:

  • naming quality
  • local cohesion
  • dead code
  • unnecessary abstraction
  • hidden control flow
  • confusing indirection
  • adherence to repo conventions:
    • direct imports from gitnexus-shared
    • no barrel re-export regressions
    • no // removed comments
    • no unused re-exports
  • no drive-by refactors outside stated scope per CONTRIBUTING.md and GUARDRAILS.md § Scope

3. Test coverage & change safety

Evaluate against TESTING.md:

  • Are there unit tests under gitnexus/test/unit/ covering the newly wired path?
  • Is there a regression guard for 0-link → N-link behavior?
  • Are assertions meaningful rather than tautological?
  • Are fixtures realistic for manifest inputs?
  • If memoization/cache was introduced, is there a test proving hit/miss behavior and correctness?
  • Is there evidence the expected validation path would pass for staged gitnexus/ files?
    • tsc --noEmit
    • vitest run --project default
      If not verifiable, say exactly what is missing.

4. Performance

Inspect for:

  • hot-path overhead in ingestion/group sync
  • excess allocations per manifest entry
  • redundant Cypher round-trips
  • missed batching or missed parallelism (Promise.all) where it materially matters
  • O(n²) or repeated lookup patterns on large repos
  • memoization tradeoffs:
    • correctness
    • invalidation
    • bounded vs unbounded memory growth
      Do not speculate casually; explain the mechanism and likely impact.

5. Operational risk

Check:

  • Windows/cross-platform safety:
    • stream lifecycle
    • FD/file handle lifecycle
    • path separator assumptions
    • anything resembling prior ENOTEMPTY-style lifecycle regressions
  • LadybugDB single-writer invariant is preserved
  • Embeddings preservation:
    • no silent breakage of --embeddings
    • .gitnexus/meta.json.stats.embeddings not silently zeroed by changed paths
  • MCP contracts remain compatible:
    • group_*
    • query
    • context
    • impact
    • detect_changes
    • rename
    • cypher
      Flag any schema or contract break without migration note
  • staleness behavior still triggers correctly (gitnexus/src/mcp/staleness.ts)
  • rollback safety:
    • can this PR be reverted safely without re-analyze?
    • if not, explain why

6. Maintainability

Check:

  • Does the change respect Pipeline Phase DAG boundaries?
  • Is cross-phase coupling being introduced?
  • Are manifest-extractor entry points discoverable from group/sync.ts?
  • Would the next contributor know where to extend this?
  • Is shared logic placed appropriately:
    • in gitnexus-shared/
    • or gitnexus/src/core/**/utils.ts
    • and not duplicated between CLI and web bridge (server/api.ts)?

NON-NEGOTIABLES TO VERIFY
From GUARDRAILS.md / CONTRIBUTING.md:

  • no secrets, tokens, or machine-specific paths committed
  • no blind find-and-replace renames
  • if a symbol rename occurred, rename MCP tool or equivalent careful review was used
  • upstream impact was considered for modified shared symbols
  • no HIGH/CRITICAL impact left unaddressed
  • PR title follows conventional commit style + [area] Short description
  • PolyForm Noncommercial license headers/metadata untouched

OUTPUT FORMAT

Production-ready bar for this PR

Give a concise repo-specific definition in 5–10 bullets.

Findings

For each finding, use this structure exactly:

[BLOCKING|NON-BLOCKING|NIT] Short title

  • Evidence: path:line, path:line
  • Why it matters: ...
  • Recommendation: ...
  • Confidence: High / Medium / Low

Only include findings that are real and supported.
If there are no findings in a section, say:

  • No issues found in this area based on the diff and reviewed context.

What I could not fully verify

List any behavior you could not prove from the diff, tests, or available context.

Final verdict

Choose exactly one:

  • ✅ Ready to merge — no blocking issues.
  • ⚠️ Merge after addressing blocking items.
  • ❌ Not ready — fundamental concerns.

If verdict is not ready, list the blocking items in priority order.


IMPORTANT REVIEW BEHAVIOR

  • Be strict on correctness and contracts.
  • Be conservative about operational risk.
  • Be explicit about uncertainty.
  • Do not suggest unrelated refactors.
  • Do not turn nits into blockers.
  • Do not reward intent; review actual implementation.

@github-actions

github-actions Bot commented Apr 16, 2026

Copy link
Copy Markdown
Contributor

Claude finished @magyargergo's task in 5m 52s —— View job


PR Review: Heritage Extractor Config+Factory Pattern

Tasks:

  • Load context (AGENTS.md, GUARDRAILS.md, CONTRIBUTING.md, TESTING.md, ARCHITECTURE.md)
  • Define production-ready bar
  • Review diff against that bar
  • Post final verdict

Production-ready bar for this PR

For a heritage extractor refactor touching the ingestion pipeline in GitNexus, "production-ready" means:

  1. Correct extraction on both execution paths: Worker path (parse-worker.ts) and sequential path (processHeritage / extractExtractedHeritageFromFiles) must produce consistent graph edges for all 16 languages.
  2. No graph schema drift: ExtractedHeritage interface, node labels, edge types, and generateId call signatures are unchanged.
  3. MCP contract stability: No MCP tool/resource signatures changed; graph output is semantically identical to before.
  4. Fully wired: All 16 providers updated; no provider silently falls back to undefined behavior.
  5. Test coverage per TESTING.md: Unit tests under gitnexus/test/unit/ for the factory and configs; at minimum one integration test that exercises real tree-sitter output to catch config/query drift.
  6. Pipeline phase DAG boundaries respected: Heritage extraction stays within the parse phase; no cross-phase coupling introduced.
  7. No dead code or partially wired paths: Old routing code either removed or explicitly documented as still-active.
  8. Non-negotiables: No secrets, no blind renames, PR title follows conventional commit format, PolyForm license headers untouched.

Findings

1. Correctness & functional completeness

No issues found for the worker path. The new provider.heritageExtractor.extract() and extractFromCall interception in parse-worker.ts:1902-1719 are correctly wired. The extractFromCall interceptor runs before callRouter and correctly continues past it.

Go named field filtering is correct. goHeritageConfig.shouldSkipExtends at configs/go.ts:20-26 replicates the old inline check at heritage-processor.ts:159 faithfully (both check field_declaration parent with a named child).


[NON-BLOCKING] Dead heritage code in routeRubyCall is now unreachable

  • Evidence: call-routing.ts:91-119, parse-worker.ts:1703-1720, ruby.ts:119,142
  • Why it matters: The include/extend/prepend routing block in routeRubyCall (lines 91–119) can never be reached. The call flow is: callExtractor.extract(callNode, undefined) returns null for include (verified: call-extractors/generic.ts:55if (!callNameNode) return null when called with undefined); this falls through to Path 2; provider.heritageExtractor?.extractFromCall intercepts include/extend/prepend via the RUBY_HERITAGE_CALL_NAMES set and always returns a non-null array; continue skips callRouter. The PR description says "CallRouter remains as fallback" — that is incorrect for the Ruby heritage case. The fallback can never activate.
  • Recommendation: Remove lines 91–119 from routeRubyCall, or document in the code that these lines are now permanently shadowed by the heritage extractor. Leaving them as-is creates a maintenance trap: future contributors will not know whether modifying them has any effect.
  • Confidence: High (traced the full call path in the code)
  • Fix this →

[NON-BLOCKING] processHeritage (sequential path) not updated — code divergence

  • Evidence: heritage-processor.ts:86-257, pipeline-phases/parse-impl.ts:499
  • Why it matters: On the sequential (non-worker) path, processHeritage at parse-impl.ts:499 directly adds graph edges using old inline capture logic that does not use provider.heritageExtractor. The worker path now uses provider.heritageExtractor.extract() in parse-worker.ts:1902-1919processHeritageFromExtracted. The PR updated extractExtractedHeritageFromFiles (used to build the heritage map for cross-file resolution) but did NOT update processHeritage (used to write the actual graph edges on the sequential path). Any future change to extraction behavior — adding a new per-language hook, fixing a bug — must now be applied in three places: the extractor config, processHeritage, and extractExtractedHeritageFromFiles. The current behavior is correct (old code works), but the divergence is a maintenance hazard.
  • Recommendation: Update processHeritage to delegate to provider.heritageExtractor?.extract() in the same way extractExtractedHeritageFromFiles does, then call processHeritageFromExtracted for the graph writes. This would unify both paths and make the sequential/worker behavior provably identical.
  • Confidence: High (read both parse-impl.ts:474-501 and heritage-processor.ts:86-257)
  • Fix this →

[NON-BLOCKING] extractFromCall silently skipped for languages without callExtractor

  • Evidence: parse-worker.ts:1660,1699,1703 — the extractFromCall check is nested inside if (callExtractor) { ... if (callNameNode) { ...
  • Why it matters: Any future language that defines callBasedHeritage in its heritage config but does not set a callExtractor on its provider will have extractFromCall silently never run. Currently only Ruby uses callBasedHeritage, and Ruby does have a callExtractor (ruby.ts:119), so this is not a current bug. But it's a non-obvious coupling in the architecture: the heritage extractor's call-based path depends on the call extractor being present.
  • Recommendation: Add a comment at parse-worker.ts:1703 explaining the dependency, or restructure so the extractFromCall check is a sibling of (not nested inside) the callExtractor block. A future PR that adds a language with call-based heritage but no call extractor would hit this silently.
  • Confidence: High

2. Code clarity & clean code

No issues found. Naming is clear, the pattern is consistent with the existing call-types/method-types/field-extractor/variable-types family. No barrel re-export regressions, no // removed comments, no unused re-exports. Import structure is correct (direct imports from gitnexus-shared). No drive-by refactors observed.

[NIT] HeritageExtractor.language field never consumed

  • Evidence: heritage-types.ts:49
  • Why it matters: The language field is declared on the interface but no consumer reads it. This is consistent with the same pattern in CallExtractor (as noted in the previous review), so it's not an introduced regression. Acceptable as-is.
  • Confidence: High

3. Test coverage & change safety

Unit tests (test/unit/heritage-extraction.test.ts, 455 lines): Factory construction, all 14 configs, Go anonymous embedding vs named field filtering, Ruby call-based extraction with mocked AST nodes — well-covered. Assertions are meaningful (not tautological).

Integration tests (test/integration/heritage-extractor-wiring.test.ts, 186 lines, added in this PR): Ruby extractFromCall against real tree-sitter AST, TypeScript extract via real query captures. This addresses the gap called out in the previous review.

[NON-BLOCKING] Integration tests do not cover the parse-worker dispatch chain

  • Evidence: test/integration/heritage-extractor-wiring.test.ts:64-186 — tests call rubyHeritageConfig.callBasedHeritage!.extract() and extractor.extract() directly, not through the parse-worker match-dispatch loop.
  • Why it matters: A regression in parse-worker wiring — e.g., the if (heritageItems !== null) guard at parse-worker.ts:1709 breaking, or the extractFromCall check being gated wrong — would not be caught. The test exercises the right functions but not the right integration seam. The previous review identified this gap; the added integration tests exercise the extractors directly but not via the full dispatch path.
  • Recommendation: Add a test that feeds real Ruby source through processCalls or a minimal parse-worker invocation and asserts that result.heritage contains the expected mixin entries. This closes the gap between "extractor works" and "extractor is correctly wired in the parse pipeline."
  • Confidence: High

4. Performance

No issues found. createHeritageExtractor() is called once at module load per language, not per file or per match. The callNameSet.has(calledName) check is O(1). The MAX_PARENT_DEPTH = 50 bound in ruby.ts:11 prevents runaway AST walks. No new allocations in hot paths beyond the existing inline code.


5. Operational risk

  • LadybugDB invariant: Not affected. No DB writes in new code.
  • Graph schema: ExtractedHeritage interface unchanged; node labels, edge types, and generateId inputs are identical to before.
  • MCP contracts: No MCP tools or resources touched.
  • Embeddings: No analyze or embedding paths modified.
  • Cross-platform: No path manipulation in new code; MAX_PARENT_DEPTH bounds the AST walk safely.
  • Rollback safety: The change is fully contained in gitnexus/src/core/ingestion/. Reverting does not require re-analyze — it only affects extraction behavior in the pipeline worker, which is re-run on each analyze.

No issues found in this area.


6. Maintainability

Pipeline DAG boundaries respected. Heritage extraction stays in the parse phase. No cross-phase coupling introduced.

Extension point is clear. New languages add a file to heritage-extractors/configs/, implement HeritageExtractionConfig, and wire to the provider — identical to the import-resolver and call-extractor patterns.

Shared logic placement is correct. All new code is under gitnexus/src/core/ingestion/ — no duplication between CLI and web bridge.

routeRubyCall comment is stale. ruby.ts:6 still says "Ruby has SPECIAL call routing via routeRubyCall to handle... include/extend (heritage)" — this is no longer accurate for the heritage portion after this PR. Minor, but misleading for the next contributor.


What I could not fully verify

  1. TypeScript compilation: npx tsc --noEmit was not run. The types appear correct from inspection (HeritageExtractor satisfies LanguageProviderConfig, return types match).
  2. Full test suite: npm test was not run (firewall blocked vitest in CI per the PR description). Test logic appears correct on inspection.
  3. Tree-sitter query/config alignment for all 11 languages: Only Go and TypeScript were spot-checked against the query capture names. The 9 other configs assume their language's .scm query emits @heritage.class, @heritage.extends, @heritage.implements, or @heritage.trait as appropriate.
  4. Whether routeRubyCall's heritage routing kind is still consumed by other callers before concluding it is safe to remove. The RubyCallRouting union and its consumers would need to be audited for the fix.

Final verdict

⚠️ Merge after addressing blocking items.

There are no blocking correctness or safety issues — the implementation is architecturally sound, correctly wired on the worker path, and well-tested at the unit level. However, two non-blocking issues should be resolved before merge because they leave active maintenance traps:

  1. Dead routing code in routeRubyCall: The heritage handling for include/extend/prepend at call-routing.ts:91-119 is unreachable. It either needs to be removed or explicitly marked as dead (with a comment explaining why the heritage extractor always intercepts first). Leaving it as-is will mislead the next contributor modifying call routing.

  2. processHeritage divergence: The sequential path adds heritage graph edges via old inline code (heritage-processor.ts:86-257) while the worker path uses the new extractor. Any future extractor behavior change must be applied in both places. This should be unified by updating processHeritage to delegate to provider.heritageExtractor.

@magyargergo magyargergo left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are comments from claude that need to be addreu!

@github-actions

github-actions Bot commented Apr 16, 2026

Copy link
Copy Markdown
Contributor

CI Report

All checks passed

Pipeline Status

Stage Status Details
✅ Typecheck success tsc --noEmit
✅ Tests success unit tests, 3 platforms
✅ E2E success gitnexus-web changes only

Test Results

Tests Passed Failed Skipped Duration
6603 6506 0 97 249s

✅ All 6506 tests passed

97 test(s) skipped — expand for details
  • Swift MethodExtractor > isTypeDeclaration > recognizes class_declaration
  • Swift MethodExtractor > isTypeDeclaration > recognizes protocol_declaration
  • Swift MethodExtractor > isTypeDeclaration > rejects import_declaration
  • Swift MethodExtractor > visibility > extracts public method
  • Swift MethodExtractor > visibility > extracts private method
  • Swift MethodExtractor > visibility > defaults to internal when no modifier
  • Swift MethodExtractor > protocol methods > marks protocol method as abstract
  • Swift MethodExtractor > static and class methods > detects static func as isStatic
  • Swift MethodExtractor > static and class methods > detects class func as isStatic
  • Swift MethodExtractor > parameters > extracts parameters with types and default values
  • Swift MethodExtractor > return type > extracts return type from -> annotation
  • Swift MethodExtractor > annotations > extracts @objc attribute
  • Swift MethodExtractor > isFinal > detects final func
  • Swift MethodExtractor > isFinal > is false when not final
  • Swift MethodExtractor > isAsync > detects async func
  • Swift MethodExtractor > isOverride > detects override method
  • buildTypeEnv > constructor inference (Tier 1 fallback) > lookupClassByName regression coverage > Swift lookupClassByName regression coverage > Swift cross-file constructor inference uses lookupClassByName
  • buildTypeEnv > constructor inference (Tier 1 fallback) > lookupClassByName regression coverage > Swift lookupClassByName regression coverage > Swift explicit init inference uses lookupClassByName
  • buildTypeEnv > constructor inference (Tier 1 fallback) > lookupClassByName regression coverage > Swift lookupClassByName regression coverage > Swift cross-file constructor inference does not bind plain functions
  • buildTypeEnv > known limitations (documented skip tests) > Ruby block parameter: users.each { |user| } — closure param inference, different feature
  • Swift constructor-inferred type resolution > detects User and Repo classes, both with save methods
  • Swift constructor-inferred type resolution > resolves user.save() to Models/User.swift via constructor-inferred type
  • Swift constructor-inferred type resolution > resolves repo.save() to Models/Repo.swift via constructor-inferred type
  • Swift constructor-inferred type resolution > emits exactly 2 save() CALLS edges (one per receiver type)
  • Swift self resolution > detects User and Repo classes, each with a save function
  • Swift self resolution > resolves self.save() inside User.process to User.save, not Repo.save
  • Swift parent resolution > detects BaseModel and User classes plus Serializable protocol
  • Swift parent resolution > emits EXTENDS edge: User → BaseModel
  • Swift parent resolution > emits IMPLEMENTS edge: User → Serializable (protocol conformance)
  • Swift cross-file User.init() inference > resolves user.save() via User.init(name:) inference
  • Swift cross-file User.init() inference > resolves user.greet() via User.init(name:) inference
  • Swift return type inference > detects User class and getUser function
  • Swift return type inference > detects save function on User (Swift class methods are Function nodes)
  • Swift return type inference > resolves user.save() to User#save via return type of getUser() -> User
  • Swift return-type inference via function return type > resolves user.save() to User#save via return type of getUser()
  • Swift return-type inference via function return type > user.save() does NOT resolve to Repo#save
  • Swift return-type inference via function return type > resolves repo.save() to Repo#save via return type of getRepo()
  • Swift implicit imports (cross-file visibility) > detects UserService class in Models.swift
  • Swift implicit imports (cross-file visibility) > resolves UserService() constructor call across files (no explicit import)
  • Swift implicit imports (cross-file visibility) > resolves service.fetchUser() member call across files
  • Swift implicit imports (cross-file visibility) > creates IMPORTS edges between files in the same module
  • Swift extension deduplication > detects Product class
  • Swift extension deduplication > resolves Product() constructor despite extension creating duplicate class node
  • Swift extension deduplication > resolves product.save() to Product.swift (primary definition)
  • Swift constructor call fallback (no new keyword) > resolves OCRService() as constructor call across files
  • Swift constructor call fallback (no new keyword) > resolves ocr.recognize() member call via constructor-inferred type
  • Swift export visibility (internal vs private) > resolves PublicService() constructor across files
  • Swift export visibility (internal vs private) > resolves internalHelper() across files (internal = module-scoped)
  • Swift if let / guard let binding resolution > detects User and Repo classes
  • Swift if let / guard let binding resolution > resolves user.save() inside if-let to User#save
  • Swift if let / guard let binding resolution > resolves repo.save() inside guard-let to Repo#save
  • Swift if let / guard let binding resolution > user.save() in if-let does NOT resolve to Repo#save
  • Swift await / try expression unwrapping > resolves user.save() via await fetchUser() return type
  • Swift await / try expression unwrapping > resolves repo.save() via try parseRepo() return type
  • Swift await / try expression unwrapping > detects fetchUser and parseRepo as functions
  • Swift for-in loop element type inference > detects User and Repo classes
  • Swift for-in loop element type inference > creates implicit import edges between files
  • Swift field-type resolution > detects classes and their properties
  • Swift field-type resolution > emits HAS_PROPERTY edges from class to field
  • Swift field-type resolution > resolves field-chain call user.address.save() → Address#save
  • Swift field-type resolution > emits ACCESSES edges for field reads in chains
  • Swift field-type resolution > populates field metadata (visibility, declaredType) on Property nodes
  • Swift call-result binding > resolves call-result-bound method call user.save() → User#save
  • Swift call-result binding > getUser() is present as a defined function
  • Swift call-result binding > emits processUser -> getUser CALLS edge for let-assigned free function call
  • Swift method enrichment > detects Animal protocol and Dog class
  • Swift method enrichment > emits IMPLEMENTS edge Dog -> Animal
  • Swift method enrichment > emits HAS_METHOD edges for Dog methods
  • Swift method enrichment > marks protocol Animal.speak as isAbstract
  • Swift method enrichment > marks Dog.speak as NOT isAbstract
  • Swift method enrichment > marks breathe as isFinal
  • Swift method enrichment > marks classify as isStatic
  • Swift method enrichment > captures @objc annotation on breathe
  • Swift method enrichment > populates parameterTypes for classify(_ name: String)
  • Swift method enrichment > records parameterCount for classify
  • Swift method enrichment > records returnType for speak
  • Swift method enrichment > resolves dog.speak() CALLS edge
  • Swift method enrichment > resolves Dog.classify("dog") CALLS edge
  • Swift abstract dispatch > detects Repository protocol and SqlRepository class
  • Swift abstract dispatch > emits IMPLEMENTS edge SqlRepository -> Repository
  • Swift abstract dispatch > emits HAS_METHOD edges for Repository.find and Repository.save
  • Swift abstract dispatch > emits HAS_METHOD edges for SqlRepository.find and SqlRepository.save
  • Swift abstract dispatch > marks base Repository.find as isAbstract
  • Swift abstract dispatch > marks base Repository.save as isAbstract
  • Swift abstract dispatch > marks concrete SqlRepository.find as NOT isAbstract
  • Swift abstract dispatch > resolves repo.find(id: 42) CALLS edge
  • Swift abstract dispatch > resolves repo.save(entity: user) CALLS edge
  • Swift abstract dispatch > populates parameterTypes for Repository.find
  • Swift abstract dispatch > populates parameterTypes for Repository.save
  • Swift abstract dispatch > records returnType for SqlRepository.find
  • Swift abstract dispatch > emits METHOD_IMPLEMENTS edges from SqlRepository methods → Repository protocol methods
  • Swift overloaded method disambiguation > detects 2 distinct find Method nodes on SqlRepository
  • Swift overloaded method disambiguation > emits METHOD_IMPLEMENTS edges for both find overloads
  • Swift overloaded method disambiguation > emits METHOD_IMPLEMENTS edge for save
  • Swift overloaded method disambiguation > emits exactly 3 METHOD_IMPLEMENTS edges total
  • Swift Child extends Parent — inherited method resolution (SM-9) > detects Parent and Child classes
  • Swift Child extends Parent — inherited method resolution (SM-9) > resolves c.parentMethod() to Parent.parentMethod via first-wins MRO walk

Code Coverage

Tests

Metric Coverage Covered Base Delta Status
Statements 73.22% 17852/24380 73% 📈 +0.2 🟢 ██████████████░░░░░░
Branches 62.24% 11340/18218 61.97% 📈 +0.3 🟢 ████████████░░░░░░░░
Functions 77.82% 1688/2169 77.65% 📈 +0.2 🟢 ███████████████░░░░░
Lines 75.78% 16185/21356 75.58% 📈 +0.2 🟢 ███████████████░░░░░

📋 View full run · Generated by CI

… file

Addresses CI format check failure: npx prettier --check flagged
3 files with code style issues.

Agent-Logs-Url: https://github.com/abhigyanpatwari/GitNexus/sessions/ff7b0282-ec10-433d-8286-a47c152fa66c

Copilot AI commented Apr 16, 2026

Copy link
Copy Markdown
Contributor Author

There are comments from claude that need to be addreu!

Fixed the CI formatting failure in e30f1a7 — Prettier flagged 3 files (heritage-types.ts, go.ts config, and the test file).

Regarding Claude's review findings: those two reviews (comment IDs 4258797672, 4259331218) were about the call extractor feature on a previous iteration of this branch, not the heritage extractor changes. The specific findings (C#/Kotlin typeAsReceiverHeuristic, callExtractor doc comment, Java double invocation) are about call extraction code and have already been addressed in the current codebase — the callExtractor doc comment at language-provider.ts:164 already reads "Default: undefined — if unset, no calls are extracted for this language."

The heritage extractor changes have separate coverage via 36 unit tests + the integration tests you added in 5533254.

Copilot AI commented Apr 17, 2026

Copy link
Copy Markdown
Contributor Author

I want you only define language files if they have implantation otherwise they can be removed and the implantation design should support this case.

Addressed in 874822d. Removed 9 pass-through config files that only contained { language: X } — only go.ts and ruby.ts remain since they have actual custom hooks (shouldSkipExtends and callBasedHeritage).

The factory createHeritageExtractor() now accepts either:

  • A SupportedLanguages enum value → default capture-based extraction (no config file needed)
  • A full HeritageExtractionConfig → custom hooks for Go/Ruby

All 11 providers with default behaviour now pass the enum directly:

heritageExtractor: createHeritageExtractor(SupportedLanguages.Java),

We must ensure that it wired up properly and the legacy path is no longer available.

This was addressed in ee03d78 (legacy path removal) and further refined in 874822d (removed boilerplate config files). The heritage extractor is now the sole path for all heritage extraction — no inline legacy code remains in parse-worker, call-processor, or heritage-processor.

Warning

Firewall rules blocked me from connecting to one or more addresses (expand for details)

I tried to connect to the following addresses, but was blocked by firewall rules:

  • extension.ladybugdb.com
    • Triggering command: /home/REDACTED/work/_temp/ghcca-node/node/bin/node node /home/REDACTED/work/GitNexus/GitNexus/gitnexus/node_modules/.bin/vitest run test/unit/heritage-extraction.test.ts yp e remaining lines. -I modules/@npmcli/run-script/lib/node-gyp-bin/as -I /home/REDACTED/.ca-v run-script/lib/n^$ node 1/x6�� r_kotlin_binding/src/parser.o := cc -o Release/obj.target/tree_sitter_kotlin_binding/src/parser.bash --64 in/sh (dns block)
    • Triggering command: /home/REDACTED/work/_temp/ghcca-node/node/bin/node node /home/REDACTED/work/GitNexus/GitNexus/gitnexus/node_modules/.bin/vitest run test/integration/heritage-extractor-wiring.test.ts ache�� lude (dns block)

If you need me to access, download, or install something from one of these locations, you can either:

@magyargergo

Copy link
Copy Markdown
Collaborator

@claude Act as a senior reviewer for GitNexus. Your job is to determine whether this PR is production-ready for this repo, not to give a generic code review.

You are reviewing a PR in the GitNexus monorepo:

  • gitnexus/ → CLI + MCP
  • gitnexus-web/
  • gitnexus-shared/

Your task has 2 phases, in this exact order:

PHASE 1 — DEFINE THE BAR
Before reviewing the diff, establish a concise repo-specific definition of “production-ready” for GitNexus, based only on the repo docs and the affected area.
Keep this definition practical and reviewable. Do not invent standards that are not grounded in the repo.

PHASE 2 — REVIEW THE PR AGAINST THAT BAR
Review the actual diff only after defining the bar.
Stay tightly scoped to the changed code and its direct consequences.


CONTEXT TO LOAD FIRST
Read these before reviewing:

  • AGENTS.md
  • GUARDRAILS.md
  • CONTRIBUTING.md
  • TESTING.md
  • ARCHITECTURE.md

Additional context:


PRIMARY OBJECTIVE
Decide whether this PR is safe, correct, maintainable, and operationally acceptable to merge into production for GitNexus.

Do not optimize for completeness at the expense of signal.
Do not pad the review.
Do not propose unrelated refactors.
Do not restate the PR description unless needed for verification.


REVIEW RULES

  • Every finding must be grounded in specific evidence from the diff or directly relevant surrounding code.
  • Every finding must include path:line.
  • If you make a behavioral claim, cite the code that proves it.
  • If you make a performance claim, explain the mechanism.
  • If something cannot be verified from the diff alone, explicitly say so.
  • Distinguish clearly between:
    • verified issue
    • plausible risk
    • unverified concern
  • Avoid vague wording like “might be better” or “could be improved” unless you explain exactly why.
  • Keep the review focused on this PR’s scope only.

For each finding, assign one severity:

  • BLOCKING → must be fixed before merge
  • NON-BLOCKING → valid issue, but merge may still be acceptable
  • NIT → stylistic/minor, not merge-relevant

REPO-SPECIFIC REVIEW CHECKLIST
Use these exact headings.

1. Correctness & functional completeness

Check:

  • Does the implementation actually satisfy the PR claim?
    • ManifestExtractor is truly invoked
    • config.links produces non-zero cross-links where expected
  • Resolver contracts are preserved:
    • resolveSymbol remains exact-match
    • label-scoped Cypher remains correct per contract type
    • flag any regression toward fuzzy or unscoped matching
  • Graph schema integrity is preserved:
    • no silent changes to node labels
    • no silent changes to edge types
    • no silent changes to ID generation (generateId)
  • Call out any missing wiring, partial integration, dead branch, or mismatch between tests and runtime behavior

2. Code clarity & clean code

Check:

  • naming quality
  • local cohesion
  • dead code
  • unnecessary abstraction
  • hidden control flow
  • confusing indirection
  • adherence to repo conventions:
    • direct imports from gitnexus-shared
    • no barrel re-export regressions
    • no // removed comments
    • no unused re-exports
  • no drive-by refactors outside stated scope per CONTRIBUTING.md and GUARDRAILS.md § Scope

3. Test coverage & change safety

Evaluate against TESTING.md:

  • Are there unit tests under gitnexus/test/unit/ covering the newly wired path?
  • Is there a regression guard for 0-link → N-link behavior?
  • Are assertions meaningful rather than tautological?
  • Are fixtures realistic for manifest inputs?
  • If memoization/cache was introduced, is there a test proving hit/miss behavior and correctness?
  • Is there evidence the expected validation path would pass for staged gitnexus/ files?
    • tsc --noEmit
    • vitest run --project default
      If not verifiable, say exactly what is missing.

4. Performance

Inspect for:

  • hot-path overhead in ingestion/group sync
  • excess allocations per manifest entry
  • redundant Cypher round-trips
  • missed batching or missed parallelism (Promise.all) where it materially matters
  • O(n²) or repeated lookup patterns on large repos
  • memoization tradeoffs:
    • correctness
    • invalidation
    • bounded vs unbounded memory growth
      Do not speculate casually; explain the mechanism and likely impact.

5. Operational risk

Check:

  • Windows/cross-platform safety:
    • stream lifecycle
    • FD/file handle lifecycle
    • path separator assumptions
    • anything resembling prior ENOTEMPTY-style lifecycle regressions
  • LadybugDB single-writer invariant is preserved
  • Embeddings preservation:
    • no silent breakage of --embeddings
    • .gitnexus/meta.json.stats.embeddings not silently zeroed by changed paths
  • MCP contracts remain compatible:
    • group_*
    • query
    • context
    • impact
    • detect_changes
    • rename
    • cypher
      Flag any schema or contract break without migration note
  • staleness behavior still triggers correctly (gitnexus/src/mcp/staleness.ts)
  • rollback safety:
    • can this PR be reverted safely without re-analyze?
    • if not, explain why

6. Maintainability

Check:

  • Does the change respect Pipeline Phase DAG boundaries?
  • Is cross-phase coupling being introduced?
  • Are manifest-extractor entry points discoverable from group/sync.ts?
  • Would the next contributor know where to extend this?
  • Is shared logic placed appropriately:
    • in gitnexus-shared/
    • or gitnexus/src/core/**/utils.ts
    • and not duplicated between CLI and web bridge (server/api.ts)?

NON-NEGOTIABLES TO VERIFY
From GUARDRAILS.md / CONTRIBUTING.md:

  • no secrets, tokens, or machine-specific paths committed
  • no blind find-and-replace renames
  • if a symbol rename occurred, rename MCP tool or equivalent careful review was used
  • upstream impact was considered for modified shared symbols
  • no HIGH/CRITICAL impact left unaddressed
  • PR title follows conventional commit style + [area] Short description
  • PolyForm Noncommercial license headers/metadata untouched

OUTPUT FORMAT

Production-ready bar for this PR

Give a concise repo-specific definition in 5–10 bullets.

Findings

For each finding, use this structure exactly:

[BLOCKING|NON-BLOCKING|NIT] Short title

  • Evidence: path:line, path:line
  • Why it matters: ...
  • Recommendation: ...
  • Confidence: High / Medium / Low

Only include findings that are real and supported.
If there are no findings in a section, say:

  • No issues found in this area based on the diff and reviewed context.

What I could not fully verify

List any behavior you could not prove from the diff, tests, or available context.

Final verdict

Choose exactly one:

  • ✅ Ready to merge — no blocking issues.
  • ⚠️ Merge after addressing blocking items.
  • ❌ Not ready — fundamental concerns.

If verdict is not ready, list the blocking items in priority order.


IMPORTANT REVIEW BEHAVIOR

  • Be strict on correctness and contracts.
  • Be conservative about operational risk.
  • Be explicit about uncertainty.
  • Do not suggest unrelated refactors.
  • Do not turn nits into blockers.
  • Do not reward intent; review actual implementation.

@github-actions

github-actions Bot commented Apr 17, 2026

Copy link
Copy Markdown
Contributor

Claude finished @magyargergo's task in 6m 16s —— View job


PR Review: Heritage Extractor Config+Factory Pattern

Tasks:

  • Load context docs (AGENTS.md, GUARDRAILS.md, CONTRIBUTING.md, TESTING.md, ARCHITECTURE.md)
  • Define production-ready bar
  • Review diff against bar
  • Post final verdict

Production-ready bar for this PR

For a heritage extractor refactor touching the ingestion pipeline in GitNexus, "production-ready" means:

  1. Correct extraction on all three wired paths: parse-worker (worker), call-processor (sequential), and processHeritage (standalone sequential) must all route heritage through the new extractor with no legacy inline paths remaining.
  2. No graph schema drift: ExtractedHeritage interface, node labels, edge types, and generateId call signatures unchanged.
  3. MCP contract stability: No MCP tool/resource signatures changed; graph output is semantically identical to before.
  4. Fully wired: All 16 providers updated; all providers supplying heritageExtractor as documented in language-provider.ts:188.
  5. Test coverage per TESTING.md: Unit tests under gitnexus/test/unit/; at minimum one integration test exercising real tree-sitter output to catch query/config drift.
  6. Pipeline phase DAG boundaries respected: Heritage extraction stays in the parse phase; no cross-phase coupling introduced.
  7. No dead code or partially wired paths: Legacy routing either removed or explicitly documented.
  8. CI green: npx tsc --noEmit and npx prettier --check . must pass; tests pass on all three platforms.
  9. PR title follows conventional commit format per CONTRIBUTING.md.

Findings

1. Correctness & functional completeness

All three execution paths are now fully wired to the extractor:

  • parse-worker.ts — Heritage captured in the pre-pass (heritage.class block, lines 1894–1922) and call-based heritage via extractFromCall before callRouter (lines 1706–1723). ✅
  • call-processor.ts — Pre-pass heritage for buildTypeEnv (lines 769–790) and extractFromCall before callRouter (lines 999–1016). ✅
  • heritage-processor.ts processHeritage — Delegates to provider.heritageExtractor.extract() (lines 235–253). ✅ (Previous review flagged this as missing; now fixed in ee03d78.)

Go named field filtering is correct. goHeritageConfig.shouldSkipExtends at configs/go.ts:20–23 checks field_declaration parent with a named child — faithfully replaces the old inline check.

Ruby mixin interception is correct. extractFromCall runs before callRouter in both parse-worker and call-processor; routeRubyCall now returns SKIP_RESULT for include/extend/prepend with an explanatory comment at call-routing.ts:85–91.

Graph schema preserved: ExtractedHeritage interface unchanged. resolveAndAddHeritageEdge uses the same generateId call patterns and relationship types (EXTENDS, IMPLEMENTS, IMPLEMENTS with reason: h.kind) as the old inline code in processHeritageFromExtracted.

No issues found in this area.


2. Code clarity & clean code

  • No issues found. Naming is clear and consistent with call-types/method-types/field-extractor/variable-types family.
  • No barrel re-export regressions, no // removed comments, no unused re-exports.
  • routeRubyCall comment at call-routing.ts:8–15 correctly documents that heritage is now intercepted before the router runs. ✅
  • No drive-by refactors observed outside the stated scope.

[NON-BLOCKING] Integration test uses non-standard SupportedLanguages import path

  • Evidence: test/integration/heritage-extractor-wiring.test.ts:16import { SupportedLanguages } from '../../src/config/supported-languages.js'
  • Why it matters: CONTRIBUTING.md mandates "direct imports from gitnexus-shared". Every other test file, including the unit test added in this same PR (heritage-extraction.test.ts:10), imports from 'gitnexus-shared' directly. The re-export file exists so it compiles, but it creates an inconsistency reviewable by the next contributor.
  • Recommendation: Change '../../src/config/supported-languages.js''gitnexus-shared'.
  • Confidence: High
  • Fix this →

[NON-BLOCKING] language variable shadowing in processHeritage

  • Evidence: heritage-processor.ts:191const language = getLanguageFromFilename(file.path) (SupportedLanguages | null); heritage-processor.ts:226const language = parser.getLanguage() (tree-sitter Language object, inside a try {} block)
  • Why it matters: Two language variables in the same function with entirely different types. The inner one is correctly scoped to the try block and used only for new Parser.Query(language, ...), but a future contributor editing the heritage processing block could accidentally use the wrong binding. The outer language is the enum value that flows into heritageExtractor.extract() and resolveAndAddHeritageEdge().
  • Recommendation: Rename the inner variable: const treeSitterLang = parser.getLanguage(); and update the Query constructor call.
  • Confidence: High
  • Fix this →

[NIT] HeritageExtractor.language field never consumed

  • Evidence: heritage-types.ts:49
  • Why it matters: Consistent with the same pattern in CallExtractor. No action required.
  • Confidence: High

3. Test coverage & change safety

Unit tests (test/unit/heritage-extraction.test.ts, 428 lines): Factory construction (enum and config paths), all provider wiring (15 languages), Go named field filtering (anonymous embedding pass, named field skip), Ruby call-based extraction (include/extend/prepend, scope_resolution, multiple args, module context, missing enclosing class). Assertions are meaningful, not tautological.

Integration tests (test/integration/heritage-extractor-wiring.test.ts, 186 lines): Ruby extractFromCall against real tree-sitter AST; TypeScript extract against real query captures. This correctly addresses the gap flagged in the previous review.

Call routing tests (test/unit/call-routing/ruby.test.ts): Updated — 179 lines removed (heritage routing tests), 11 lines added (skip behavior for include/extend/prepend). Correct regression guard for the behavior change.

CI results: The quality CI run for 5533254 (the last published CI report) failed on Prettier for 3 files: configs/go.ts, heritage-types.ts, test/unit/heritage-extraction.test.ts. This was fixed in e30f1a7. No CI results are available yet for the final commits (ee03d78, 874822d). Typecheck passed in the last CI run.

[BLOCKING] No confirmed clean CI run on the current HEAD

  • Evidence: CI run 24536117465 (on commit 5533254) — quality/format failed on 3 files (Prettier). Subsequent commit e30f1a7 fixed those files. Commits ee03d78 and 874822d followed. No quality CI results exist for these final commits.
  • Why it matters: Per CONTRIBUTING.md, typecheck and format must pass before merge. The format failure is documented; the fix is plausible based on code inspection. But the two refactor commits after the style fix (ee03d78 adds significant code; 874822d removes files and refactors providers) have not been verified by CI. The current branch cannot be confirmed green without a CI run.
  • Recommendation: Push a no-op commit or re-trigger CI to get a clean quality run on 874822d before merging.
  • Confidence: High (verified: no quality CI run exists for commits after 5533254)

4. Performance

No issues found. createHeritageExtractor() is called once at module load per language. callNameSet.has() is O(1). MAX_PARENT_DEPTH = 50 in ruby.ts:11 correctly bounds the AST walk. No allocations added to hot paths beyond what the old inline code already did.


5. Operational risk

  • LadybugDB invariant: Not affected. Extraction is read-only from the DB perspective. ✅
  • Graph schema: ExtractedHeritage interface, node labels, edge types, and generateId inputs are identical to before. ✅
  • MCP contracts: No MCP tools or resources touched. ✅
  • Embeddings: No analyze or embedding paths modified. ✅
  • Cross-platform: No path manipulation in new code. ✅
  • Rollback safety: Fully contained in gitnexus/src/core/ingestion/. Reverting does not require re-analyze — extraction is re-run on each analyze. ✅

No issues found in this area.


6. Maintainability

Pipeline DAG boundaries respected. Heritage extraction stays in the parse phase. No cross-phase coupling introduced. ✅

Extension point is clear. New languages: add a file to heritage-extractors/configs/ (if custom hooks needed) or pass SupportedLanguages.X directly to createHeritageExtractor(). The factory signature documents both paths. ✅

Shared logic placement correct. All new code under gitnexus/src/core/ingestion/ — no duplication with web bridge. ✅

routeRubyCall clean-up done. Heritage routing removed from the router; comment at call-routing.ts:8–15 explains why. ✅


What I could not fully verify

  1. No CI quality run on current HEAD (874822d). Typecheck and format results for the two commits after e30f1a7 are unknown from available CI data.
  2. Tree-sitter query/config alignment for non-spot-checked languages. Only TypeScript and Go were verified against their .scm capture names. The 9 remaining default-extraction languages assume their queries emit @heritage.class alongside @heritage.extends/@heritage.implements/@heritage.trait.
  3. Vue provider uses typescriptHeritageConfig — not confirmed directly since the PR removed per-language config files for default cases. The Vue provider (languages/vue.ts) uses SupportedLanguages.Vue passed to createHeritageExtractor() — this would produce default capture-based extraction, which is correct for Vue's heritage pattern.

Final verdict

⚠️ Merge after addressing blocking item.

The implementation is architecturally correct and complete. All three execution paths are unified under the new extractor. Legacy inline code has been fully removed. Tests are solid. The only blocking item is operational:

  1. [Must confirm] A clean CI quality run on the current HEAD (874822d) is required. The last quality CI failed on Prettier (now fixed), but ee03d78 and 874822d have not been verified by CI. Trigger a CI re-run and confirm quality/typecheck and quality/format both pass before merge.

@magyargergo magyargergo left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Integration test uses non-standard SupportedLanguages import path
Evidence: test/integration/heritage-extractor-wiring.test.ts:16 — import { SupportedLanguages } from '../../src/config/supported-languages.js'
Why it matters: CONTRIBUTING.md mandates "direct imports from gitnexus-shared". Every other test file, including the unit test added in this same PR (heritage-extraction.test.ts:10), imports from 'gitnexus-shared' directly. The re-export file exists so it compiles, but it creates an inconsistency reviewable by the next contributor.
Recommendation: Change '../../src/config/supported-languages.js' → 'gitnexus-shared'.
Confidence: High
Fix this

language variable shadowing in processHeritage
Evidence: heritage-processor.ts:191 — const language = getLanguageFromFilename(file.path) (SupportedLanguages | null); heritage-processor.ts:226 — const language = parser.getLanguage() (tree-sitter Language object, inside a try {} block)
Why it matters: Two language variables in the same function with entirely different types. The inner one is correctly scoped to the try block and used only for new Parser.Query(language, ...), but a future contributor editing the heritage processing block could accidentally use the wrong binding. The outer language is the enum value that flows into heritageExtractor.extract() and resolveAndAddHeritageEdge().
Recommendation: Rename the inner variable: const treeSitterLang = parser.getLanguage(); and update the Query constructor call.
Confidence: High
Fix this

…age-processor

Agent-Logs-Url: https://github.com/abhigyanpatwari/GitNexus/sessions/96a027ec-b115-48aa-8556-a3b60145bb6d

Co-authored-by: magyargergo <11230420+magyargergo@users.noreply.github.com>

Copilot AI commented Apr 17, 2026

Copy link
Copy Markdown
Contributor Author

Integration test uses non-standard SupportedLanguages import path
Evidence: test/integration/heritage-extractor-wiring.test.ts:16 — import { SupportedLanguages } from '../../src/config/supported-lan...

Both fixed in 2cea3eb:

  • Integration test import changed to import { SupportedLanguages } from 'gitnexus-shared'
  • Inner language in processHeritage renamed to treeSitterLang to avoid shadowing the outer SupportedLanguages enum binding

Warning

Firewall rules blocked me from connecting to one or more addresses (expand for details)

I tried to connect to the following addresses, but was blocked by firewall rules:

  • extension.ladybugdb.com
    • Triggering command: /home/REDACTED/work/_temp/ghcca-node/node/bin/node node ./node_modules/.bin/vitest run test/unit/heritage-extraction.test.ts -c ree-sitter-proto.cjs (dns block)
    • Triggering command: /home/REDACTED/work/_temp/ghcca-node/node/bin/node node ./node_modules/.bin/vitest run test/integration/heritage-extractor-wiring.test.ts xus/�� dn&#39;t have any #includes. Release/tree_sitter_dart_binding.node proto/node_modules/.bin/make -I /home/REDACTED/.ca-e -I node /hom�� r_dart_binding/src/scanner.o := -I --64 xus/node_modules/.bin/uname (dns block)

If you need me to access, download, or install something from one of these locations, you can either:

…elabel modules as Trait

Fixes two correctness bugs in Ruby mixin heritage resolution identified by a
Codex adversarial review, plus the test-integrity issues the follow-up code
review flagged on the regression suite.

Sequential ingestion fallback now extracts call-based heritage (Ruby
include/extend/prepend) during the prepass, so sequentialHeritageMap is
populated with mixin ancestry before processCalls resolves calls against it.
extractExtractedHeritageFromFiles also runs heritageExtractor.extractFromCall
on @call captures, mirroring the worker-path routing. The prepass stays
read-only with respect to the graph -- processCalls still owns edge emission
via rubyHeritage -> processHeritageFromExtracted.

Ruby module declarations are now relabeled to Trait during the structure phase
so they participate in lookupClassByName and buildHeritageMap. Unlike the
inert Module label, Trait is already a dispatch-category class-like label, so
mixin owners resolve through the type registry. The relabel lives in one site
(rubyLabelOverride consulted by getLabelFromCaptures) plus the
CONTAINER_TYPE_TO_LABEL mapping for enclosing-class-id lookup -- both paths
switch in lockstep so method-owner IDs stay consistent with structure-phase
IDs.

Regression suite adds ruby-sequential-mixin.test.ts with fixtures covering
include/extend/prepend across files. Plan 002 Units 1+2 harden it further:

- Worker-vs-sequential parity is now real, not a sham. PipelineOptions gains
  an @internal workerThresholdsForTest override so the tiny fixture can force
  the worker pool to spawn. PipelineResult.usedWorkerPool proves which path
  actually executed.
- The prepend-override assertion uses a non-shadowed method
  (prepended_marker) defined only on PrependedOverride, so asserting owner =
  [PrependedOverride] proves the prepend provider entered the MRO without
  depending on the still-deferred kind-aware MRO ordering.

Guard verification:
- Reverting the Module -> Trait relabel (plan 001 Unit 2) breaks 4 of 7 tests
  in the regression suite.
- Reverting the sequential prepass (plan 001 Unit 1) does NOT currently break
  the suite because processCalls independently extracts call-based heritage
  and the resolver has a global-name fallback that bypasses MRO ancestry.
  Documented as a residual limitation -- a stronger guard needs cross-chunk
  or name-shadowed scenarios (plan 002 Unit 3 scope).

Plans:
- docs/plans/2026-04-17-001-fix-codex-adversarial-ruby-mixin-heritage-plan.md
- docs/plans/2026-04-17-002-fix-ce-review-ruby-mixin-followups-plan.md
  (Units 1+2 complete; Units 3-7 still pending)
- docs/plans/2026-04-17-003-fix-ruby-mro-ordering-heritage-fallback-plan.md
  (MRO kind-ordering + Struct fallback, all 6 units pending)

Test suite: 151 passed (4 files) for ruby.test.ts, ruby-sequential-mixin.test.ts,
resolve-enclosing-owner.test.ts, heritage-extractor-wiring.test.ts.
Full gitnexus suite: 6120 passed; 2 pre-existing LadybugDB worker-exit flakes
unrelated to this change (java-class-impact, lbug-vector-extension).
…not Struct

Ruby `include`/`extend`/`prepend` records share a processHeritageFromExtracted
branch with Rust `trait-impl`. When the mixin child name cannot be resolved
through the type registry, the fallback synthesized a `Struct:filepath:X` id
for both. Rust structs emit under the `Struct` label -- correct. Ruby classes
emit under `Class` and Ruby modules emit under `Trait` (since plan 001's
module relabel), so the synthesized `Struct:` id would never match any real
node and the IMPLEMENTS edge would dangle off a phantom.

The resolved case -- the common path now that Ruby modules register
normally -- is unaffected; only the fallback for truly unresolved references
changes. `Class` is the right default for Ruby mixin kinds because the
dominant shape is `class X; include M; end` where X is a Class.

Split is kind-aware: Rust `trait-impl` keeps `Struct`, Ruby mixin kinds get
`Class`. Applied at both call sites (processHeritageFromExtractedItem +
processHeritageFromExtracted).
Adds kind-aware MRO infrastructure for Ruby's prepend/include/extend semantics.
The heritage map now preserves declaration kind and exposes split views for
instance vs singleton dispatch; `lookupMethodByOwnerWithMRO` gets a new
`'ruby-mixin'` MroStrategy that walks prepend providers before the direct
owner lookup, honoring Ruby's rule that prepended modules beat the class's
own method of the same name.

Changes:
- gitnexus-shared: `MroStrategy` union gains `'ruby-mixin'` with doc block
- model/heritage-map: `directParents` internal shape changes from `Set<parentId>`
  to ordered `ParentEntry[]` with `{ parentId, kind }`. Public
  `getParents`/`getAncestors` contract preserved (still returns flat string[]
  via dedup). New methods: `getParentEntries`, `getInstanceAncestry`,
  `getSingletonAncestry`. Insertion order mirrors tree-sitter match order,
  which for Ruby matches source declaration order.
- model/resolve: new `ruby-mixin` branch walks prepend parents first
  (reverse declaration — last-prepended wins), then direct owner, then
  extends + include parents (reverse declaration). Non-Ruby strategies
  (`first-wins`, `c3`, `leftmost-base`, `implements-split`, `qualified-syntax`)
  unchanged — they still do direct-owner-first short-circuit. Walker also
  accepts optional `ancestryOverride` for singleton dispatch.
- languages/ruby: sets `mroStrategy: 'ruby-mixin'`

Known limitation (documented in test TODO): Ruby bare-identifier calls
inside methods flow through `resolveFreeCall` today, not owner-scoped
`resolveMemberCall`. That means `ruby-mixin`'s shadow-name behavior
(prepend > self) is not yet observable in the `call_serialize → serialize`
case; it resolves via global lookup. The strategy is correctly wired and
will apply as soon as Ruby self-call inference is added (threading bare
calls as `self.method` with `receiverTypeName = enclosing class`).

The `prepended_marker` test (non-shadowed method only reachable via the
prepend provider) keeps working as the narrow-guard: it proves the prepend
parent enters the ancestry, even though the MRO ordering effect isn't
reachable yet from free-call paths.

No regressions: 840 tests pass across Ruby, Python (C3), Rust (qualified),
Java, Kotlin, heritage-extractor-wiring, resolve-enclosing-owner.
…ng language logic

Refactors the call-resolution pipeline inside the parse phase into a typed
6-stage DAG where language-specific behavior lives entirely behind provider
hooks. Shared pipeline code names no languages; adding a new language's
implicit-receiver or dispatch semantics is a provider-method change, not a
shared-code edit.

DAG stages:

  extract-call → classify-form → infer-receiver → select-dispatch →
  resolve-target → emit-edge

Stages 1-2 run in the parse worker; stages 3-6 run main-thread in
call-processor. Provider hooks plug in at stages 3 and 4.

New provider hooks on LanguageProvider:

- inferImplicitReceiver: synthesizes receiver when language makes it implicit
  (Ruby bare serialize → self.serialize). Returns ImplicitReceiverOverride or
  null. Runs after the shared receiver-inference chain (TypeEnv →
  constructor-map → class-as-receiver → mixed-chain) so explicit receivers
  always take precedence.

- selectDispatch: returns a DispatchDecision with primary (owner-scoped /
  free / constructor), optional fallback (free-arity-narrowed), optional
  ancestryView (instance / singleton). Defaults come from
  defaultDispatchDecision when the hook returns null.

DAG types in gitnexus/src/core/ingestion/call-types.ts:

- ReceiverEnriched: stage-3 output; carries receiverSource discriminant
  (none / typed-binding / constructor-map / class-as-receiver / mixed-chain
  / implicit-self). Only variants with live writers are declared.
- ImplicitReceiverOverride: hook return shape for implicit-self synthesis.
- DispatchDecision: stage-4 output; language-agnostic dispatch contract.

resolveCallTarget now dispatches on decision.primary (was: callForm ladder
with receiverTypeName gate). Singleton-ancestry miss MUST NOT degrade to
resolveMemberCallByFile — the file-scoped fallback is ownerId-keyed and
would leak instance methods onto class-constant receivers. Singleton miss
null-routes or honors decision.fallback.

lookupMethodByOwnerWithMRO accepts ancestryOverride for singleton dispatch
and the existing 'ruby-mixin' strategy (prepend → direct → include walk).

Ruby implementation (languages/ruby.ts):

- inferImplicitReceiver wraps the new pure helper
  maybeRewriteRubyBareCallToSelf (utils/ruby-self-call.ts) to detect bare
  calls inside class/module bodies and synthesize self.method with the
  enclosing class as receiverTypeName. Tags singleton dispatch via the hint
  field for def self.foo and class << self bodies.
- selectDispatch returns fallback=free-arity-narrowed for implicit-self so
  unresolved self-calls preserve the cross-class arity-narrowing heuristic
  (Service#run_task → OneArg#write_audit still works). Returns
  ancestryView=singleton for class-as-receiver calls (Account.log →
  LoggerMixin#log via extend).
- rubyResolveEnclosingOwner extracted to module-level const, shared by
  provider.resolveEnclosingOwner and inferImplicitReceiver.

Shared code changes that needed to accompany the DAG:

- class-as-receiver inference filter now accepts Trait (Ruby modules as
  receivers reach the singleton branch).
- Local ReceiverSource union in call-processor replaced with an import from
  call-types (single source of truth).
- defaultDispatchDecision helper and DAG-stage labels make the pipeline
  grep-discoverable.

Observable wins:

- Ruby call_serialize → PrependedOverride#serialize: prepend wins over
  class's own method (kind-aware MRO is now reachable from bare calls).
- Usage#run → LoggerMixin#log: class-method dispatch via extend resolves
  through singleton ancestry.
- Service#run_task → OneArg#write_audit: arity-narrowing still works;
  fallback preserves the existing free-call heuristic for implicit-self.

Tests: 1787 integration tests pass across all 12 languages (12 new unit
tests for maybeRewriteRubyBareCallToSelf, 1 new integration assertion for
Usage#run → LoggerMixin#log, tightened shadow-name assertion for
call_serialize → PrependedOverride#serialize).

Docs:

- ARCHITECTURE.md gains a Call-Resolution DAG section documenting stages,
  hook signatures, DispatchDecision semantics, the "how to add language
  behavior" recipe, and non-goals.
- AGENTS.md adds a rule enforcing "shared ingestion pipeline code does not
  name languages — use LanguageProvider hooks."
- CLAUDE.md mirrors the pointer; legacy duplicate MCP block trimmed.
- JSDoc across 7 touched code files documents DAG stages, hook contracts,
  invariants (notably the singleton-miss no-fallback guarantee), and
  canonical vocabulary. Token-optimized via follow-up pass to preserve all
  load-bearing facts while cutting ~47% of doc tokens.
@magyargergo

Copy link
Copy Markdown
Collaborator

@claude Act as a senior reviewer for GitNexus. Your job is to determine whether this PR is production-ready for this repo, not to give a generic code review.

You are reviewing a PR in the GitNexus monorepo:

  • gitnexus/ → CLI + MCP
  • gitnexus-web/
  • gitnexus-shared/

Your task has 2 phases, in this exact order:

PHASE 1 — DEFINE THE BAR
Before reviewing the diff, establish a concise repo-specific definition of “production-ready” for GitNexus, based only on the repo docs and the affected area.
Keep this definition practical and reviewable. Do not invent standards that are not grounded in the repo.

PHASE 2 — REVIEW THE PR AGAINST THAT BAR
Review the actual diff only after defining the bar.
Stay tightly scoped to the changed code and its direct consequences.


CONTEXT TO LOAD FIRST
Read these before reviewing:

  • AGENTS.md
  • GUARDRAILS.md
  • CONTRIBUTING.md
  • TESTING.md
  • ARCHITECTURE.md

Additional context:


PRIMARY OBJECTIVE
Decide whether this PR is safe, correct, maintainable, and operationally acceptable to merge into production for GitNexus.

Do not optimize for completeness at the expense of signal.
Do not pad the review.
Do not propose unrelated refactors.
Do not restate the PR description unless needed for verification.


REVIEW RULES

  • Every finding must be grounded in specific evidence from the diff or directly relevant surrounding code.
  • Every finding must include path:line.
  • If you make a behavioral claim, cite the code that proves it.
  • If you make a performance claim, explain the mechanism.
  • If something cannot be verified from the diff alone, explicitly say so.
  • Distinguish clearly between:
    • verified issue
    • plausible risk
    • unverified concern
  • Avoid vague wording like “might be better” or “could be improved” unless you explain exactly why.
  • Keep the review focused on this PR’s scope only.

For each finding, assign one severity:

  • BLOCKING → must be fixed before merge
  • NON-BLOCKING → valid issue, but merge may still be acceptable
  • NIT → stylistic/minor, not merge-relevant

REPO-SPECIFIC REVIEW CHECKLIST
Use these exact headings.

1. Correctness & functional completeness

Check:

  • Does the implementation actually satisfy the PR claim?
    • ManifestExtractor is truly invoked
    • config.links produces non-zero cross-links where expected
  • Resolver contracts are preserved:
    • resolveSymbol remains exact-match
    • label-scoped Cypher remains correct per contract type
    • flag any regression toward fuzzy or unscoped matching
  • Graph schema integrity is preserved:
    • no silent changes to node labels
    • no silent changes to edge types
    • no silent changes to ID generation (generateId)
  • Call out any missing wiring, partial integration, dead branch, or mismatch between tests and runtime behavior

2. Code clarity & clean code

Check:

  • naming quality
  • local cohesion
  • dead code
  • unnecessary abstraction
  • hidden control flow
  • confusing indirection
  • adherence to repo conventions:
    • direct imports from gitnexus-shared
    • no barrel re-export regressions
    • no // removed comments
    • no unused re-exports
  • no drive-by refactors outside stated scope per CONTRIBUTING.md and GUARDRAILS.md § Scope

3. Test coverage & change safety

Evaluate against TESTING.md:

  • Are there unit tests under gitnexus/test/unit/ covering the newly wired path?
  • Is there a regression guard for 0-link → N-link behavior?
  • Are assertions meaningful rather than tautological?
  • Are fixtures realistic for manifest inputs?
  • If memoization/cache was introduced, is there a test proving hit/miss behavior and correctness?
  • Is there evidence the expected validation path would pass for staged gitnexus/ files?
    • tsc --noEmit
    • vitest run --project default
      If not verifiable, say exactly what is missing.

4. Performance

Inspect for:

  • hot-path overhead in ingestion/group sync
  • excess allocations per manifest entry
  • redundant Cypher round-trips
  • missed batching or missed parallelism (Promise.all) where it materially matters
  • O(n²) or repeated lookup patterns on large repos
  • memoization tradeoffs:
    • correctness
    • invalidation
    • bounded vs unbounded memory growth
      Do not speculate casually; explain the mechanism and likely impact.

5. Operational risk

Check:

  • Windows/cross-platform safety:
    • stream lifecycle
    • FD/file handle lifecycle
    • path separator assumptions
    • anything resembling prior ENOTEMPTY-style lifecycle regressions
  • LadybugDB single-writer invariant is preserved
  • Embeddings preservation:
    • no silent breakage of --embeddings
    • .gitnexus/meta.json.stats.embeddings not silently zeroed by changed paths
  • MCP contracts remain compatible:
    • group_*
    • query
    • context
    • impact
    • detect_changes
    • rename
    • cypher
      Flag any schema or contract break without migration note
  • staleness behavior still triggers correctly (gitnexus/src/mcp/staleness.ts)
  • rollback safety:
    • can this PR be reverted safely without re-analyze?
    • if not, explain why

6. Maintainability

Check:

  • Does the change respect Pipeline Phase DAG boundaries?
  • Is cross-phase coupling being introduced?
  • Are manifest-extractor entry points discoverable from group/sync.ts?
  • Would the next contributor know where to extend this?
  • Is shared logic placed appropriately:
    • in gitnexus-shared/
    • or gitnexus/src/core/**/utils.ts
    • and not duplicated between CLI and web bridge (server/api.ts)?

NON-NEGOTIABLES TO VERIFY
From GUARDRAILS.md / CONTRIBUTING.md:

  • no secrets, tokens, or machine-specific paths committed
  • no blind find-and-replace renames
  • if a symbol rename occurred, rename MCP tool or equivalent careful review was used
  • upstream impact was considered for modified shared symbols
  • no HIGH/CRITICAL impact left unaddressed
  • PR title follows conventional commit style + [area] Short description
  • PolyForm Noncommercial license headers/metadata untouched

OUTPUT FORMAT

Production-ready bar for this PR

Give a concise repo-specific definition in 5–10 bullets.

Findings

For each finding, use this structure exactly:

[BLOCKING|NON-BLOCKING|NIT] Short title

  • Evidence: path:line, path:line
  • Why it matters: ...
  • Recommendation: ...
  • Confidence: High / Medium / Low

Only include findings that are real and supported.
If there are no findings in a section, say:

  • No issues found in this area based on the diff and reviewed context.

What I could not fully verify

List any behavior you could not prove from the diff, tests, or available context.

Final verdict

Choose exactly one:

  • ✅ Ready to merge — no blocking issues.
  • ⚠️ Merge after addressing blocking items.
  • ❌ Not ready — fundamental concerns.

If verdict is not ready, list the blocking items in priority order.


IMPORTANT REVIEW BEHAVIOR

  • Be strict on correctness and contracts.
  • Be conservative about operational risk.
  • Be explicit about uncertainty.
  • Do not suggest unrelated refactors.
  • Do not turn nits into blockers.
  • Do not reward intent; review actual implementation.

@github-actions

github-actions Bot commented Apr 17, 2026

Copy link
Copy Markdown
Contributor

Claude finished @magyargergo's task in 5m 21s —— View job


PR Review: Heritage Extractor Config+Factory Pattern

Tasks:

  • Load context (AGENTS.md, GUARDRAILS.md, CONTRIBUTING.md, TESTING.md, ARCHITECTURE.md)
  • Define production-ready bar
  • Review diff against bar — correctness, tests, safety, performance, operational risk
  • Post final verdict

Production-ready bar for this PR

For a heritage extractor refactor + Ruby MRO expansion touching the ingestion pipeline in GitNexus, "production-ready" means:

  1. All three execution paths unified: parse-worker (worker), call-processor (sequential), and processHeritage (standalone sequential) must route heritage through the new extractor with no legacy inline paths remaining.
  2. No graph schema drift: ExtractedHeritage interface, node labels, edge types, and generateId call signatures unchanged.
  3. MCP contract stability: No MCP tool/resource signatures changed; graph output semantically identical.
  4. Fully wired: All 16 providers supply heritageExtractor; the ruby-mixin MRO strategy is wired into the call-resolution DAG for both worker and sequential paths.
  5. Test coverage per TESTING.md: Unit tests under gitnexus/test/unit/; integration tests exercising real tree-sitter AST; wiring coverage for newly connected paths.
  6. Pipeline phase DAG boundaries respected: Heritage stays in parse phase; Ruby MRO in call-resolution DAG stages; no cross-phase coupling.
  7. No dead code or partially wired paths: All legacy routing removed; new hooks end-to-end.
  8. CI green: npx tsc --noEmit and npx prettier --check . must pass; tests pass on all three platforms.
  9. PR title follows conventional commit format per CONTRIBUTING.md.

Findings

1. Correctness & functional completeness

All three execution paths are correctly unified:

  • parse-worker.ts:1706–1723extractFromCall intercepts Ruby include/extend/prepend before callRouter. ✅
  • parse-worker.ts:1894–1923 — capture-based heritage via provider.heritageExtractor.extract() with correct heritageItems.length > 0 guard. ✅
  • call-processor.ts:786–807 — pre-pass for buildTypeEnv parentMap via heritageExtractor.extract(). ✅
  • call-processor.ts:1015–1033extractFromCall before callRouter, returning collected items via collectedHeritage. ✅
  • heritage-processor.ts:244–262processHeritage delegates to heritageExtractor.extract(). ✅
  • heritage-processor.ts:434–482extractExtractedHeritageFromFiles handles both capture-based and call-based heritage on the sequential pre-pass. ✅

Ruby MRO walk (resolve.ts:325–377) is architecturally correct: prepend (reverse) → direct → include (reverse) → transitive BFS.

routeRubyCall (call-routing.ts:85–91) correctly returns SKIP_RESULT for include/extend/prepend with a comment explaining the interception order. ✅

Graph schema preserved: ExtractedHeritage unchanged; resolveAndAddHeritageEdge uses identical generateId inputs and edge types as the pre-refactor processHeritageFromExtracted. ✅


[BLOCKING] No confirmed clean CI run on current HEAD

  • Evidence: CI run 24574689196 is in-progress as of this review. The last completed CI run (24536117465) was on commit 5533254. Commits ee03d78, 874822d, 2cea3eb, fbac438, b6afb4b, 21744bd, and 0c4a8eb post-date that run. The four most recent commits add non-trivial functionality: Ruby getInstanceAncestry/getSingletonAncestry views, kind-aware MRO walk, maybeRewriteRubyBareCallToSelf, and the call-resolution DAG provider hooks refactor.
  • Why it matters: Per CONTRIBUTING.md, tsc --noEmit and prettier --check . must pass before merge. These commits have not been verified clean.
  • Recommendation: Wait for the in-progress CI run to complete and confirm green on quality/typecheck and quality/format before merging.
  • Confidence: High (verified: CI run in_progress, 7 commits since last completed run)

2. Code clarity & clean code

  • Naming is clear and consistent with the established extractor family.
  • No barrel re-export regressions, no // removed comments, no unused re-exports.
  • routeRubyCall comment block (call-routing.ts:8–11) correctly explains why heritage no longer routes here. ✅
  • ruby-self-call.ts is correctly scoped: pure decision function, only imported by languages/ruby.ts, no shared-pipeline coupling. ✅
  • SupportedLanguages import in test/integration/heritage-extractor-wiring.test.ts:16 fixed to 'gitnexus-shared'. ✅

[NON-BLOCKING] Ruby MRO step 4 uses Array.includes() inside BFS loop

  • Evidence: model/resolve.ts:369–372
    for (const ancestorId of heritageMap.getAncestors(ownerNodeId)) {
      if (prependParents.includes(ancestorId)) continue;   // O(n) inside O(n) loop
      if (otherParents.includes(ancestorId)) continue;     // O(n) inside O(n) loop
  • Why it matters: getAncestors() returns up to MAX_ANCESTOR_DEPTH = 32 results; prependParents and otherParents are small arrays from getInstanceAncestry. For typical Ruby class hierarchies (< 10 direct parents, < 32 ancestors), this is effectively constant. However, in large codebases with heavily mixin-layered classes, both arrays can be non-trivially sized, making this a latent O(n²) in the call-resolution hot path.
  • Recommendation: Convert prependParents and otherParents to Set<string> for O(1) lookup. One-line change:
    const prependSet = new Set(prependParents);
    const otherSet = new Set(otherParents);
    // then: if (prependSet.has(ancestorId)) continue;
  • Confidence: High
  • Fix this →

[NON-BLOCKING] getInstanceAncestry / getSingletonAncestry allocate per call in the dispatch hot path

  • Evidence: model/heritage-map.ts:312–327
    const getInstanceAncestry = (childNodeId: string): readonly ParentEntry[] => {
      const entries = entriesFor(childNodeId);
      if (!entries) return [];
      return entries.filter((e) => e.kind !== 'extend');  // new array every call
    };
    lookupMethodByOwnerWithMRO calls getInstanceAncestry on every Ruby member call dispatch (called from resolveMemberCall in call-processor.ts). For a Ruby codebase with thousands of call sites, this is a per-call-site allocation.
  • Why it matters: For small mixin arrays (< 5 entries, common case), this is negligible. For files with heavy mixin usage (10+ direct parents), repeated filtering becomes measurable across thousands of call sites. The current implementation is not a correctness issue, but it is a scale hazard in the call-resolution hot path.
  • Recommendation: Memoize by childNodeId within the HeritageMap closure, or split the directParents map into instanceParents and singletonParents maps at build time so getInstanceAncestry/getSingletonAncestry return pre-split arrays without filtering.
  • Confidence: Medium (behavioral claim requires profiling to confirm materiality on a large Ruby codebase)

3. Test coverage & change safety

Unit tests (heritage-extraction.test.ts, 428 lines): Factory construction (enum and config paths), all 15 provider wirings, Go named field filtering, Ruby call-based extraction — well-covered.

Integration tests (heritage-extractor-wiring.test.ts, 186 lines): Ruby extractFromCall and TypeScript extract against real tree-sitter ASTs. ✅

Ruby sequential mixin integration test (ruby-sequential-mixin.test.ts, 232 lines): Exercises the full extractExtractedHeritageFromFilesbuildHeritageMapprocessCalls chain with real fixtures. ✅

Ruby self-call unit test (ruby-self-call.test.ts, 316 lines): maybeRewriteRubyBareCallToSelf gating logic, singleton vs instance dispatch, built-in name filtering — comprehensive. ✅

Call-routing tests (test/unit/call-routing/ruby.test.ts): Updated — 179 lines of heritage routing tests removed; skip behavior verified for include/extend/prepend. ✅

[NON-BLOCKING] ruby-mixin MRO walk in lookupMethodByOwnerWithMRO has no direct unit test

  • Evidence: model/resolve.ts:325–377 — the ruby-mixin branch of lookupMethodByOwnerWithMRO is exercised indirectly through ruby-sequential-mixin.test.ts integration tests but has no dedicated unit test covering the walk steps (prepend-before-direct, include-after-direct, transitive fallback, singleton override).
  • Why it matters: The prepend-before-direct logic (step 1 + step 2 at lines 346–355) is the key behavioral invariant of ruby-mixin that distinguishes it from first-wins. A unit test for lookupMethodByOwnerWithMRO with strategy: 'ruby-mixin' would give a regression guard for this without relying on the full pipeline.
  • Recommendation: Add a unit test directly exercising lookupMethodByOwnerWithMRO with mock HeritageMap and SemanticModel stubs, covering: prepend wins over direct, include loses to direct, singleton ancestryOverride, transitive ancestor fallback.
  • Confidence: High
  • Fix this →

4. Performance

  • createHeritageExtractor() called once at module load per language — no hot-path overhead. ✅
  • callNameSet.has() is O(1). ✅
  • MAX_PARENT_DEPTH = 50 (ruby.ts:11) and MAX_ANCESTOR_DEPTH = 32 (heritage-map.ts:88) bound all walks. ✅
  • C3 linearization cache (resolve.ts:219–236): WeakMap-per-HeritageMap auto-drains after ingestion. Correct lifecycle, bounded growth. ✅
  • See NON-BLOCKING findings above for Array.includes() in step 4 and getInstanceAncestry allocation.

5. Operational risk

  • LadybugDB invariant: Not affected. Extraction is read-only from DB perspective. ✅
  • Graph schema: ExtractedHeritage interface, node labels, edge types, generateId inputs identical to before. ✅
  • MCP contracts: No MCP tools or resources touched. ✅
  • Embeddings: No analyze or embedding paths modified. ✅
  • Cross-platform: No path manipulation in new code. ✅
  • Rollback safety: Fully contained in gitnexus/src/core/ingestion/. Re-running analyze re-extracts all heritage. ✅
  • gitnexus-shared change: mro-strategy.ts adds 'ruby-mixin' to the MroStrategy union. The union is additive; no existing consumer branches on exhaustive matching of this union, so no runtime break. Verified: no switch (strategy) over all branches in consuming code. ✅

6. Maintainability

  • Pipeline DAG boundaries respected. Heritage extraction stays in parse phase; Ruby MRO strategy drives call-resolution DAG stage 5. No cross-phase coupling. ✅
  • Extension point is clear. New languages: add a file to heritage-extractors/configs/ or pass SupportedLanguages.X to createHeritageExtractor(). ✅
  • ruby-self-call.ts is correctly isolated: pure function, single consumer (languages/ruby.ts), no shared-layer imports. ✅
  • Documented limitation: model/resolve.ts:366–367 — "Order is best-effort; Ruby's actual MRO for transitive mixins is rare and under-specified (documented in architecture docs as deferred work)." Honest scope boundary. ✅

What I could not fully verify

  1. CI on current HEAD is in-progress (run 24574689196). Cannot confirm typecheck + format pass for the 7 commits since the last completed CI run.
  2. Tree-sitter query/config alignment for all 9 default-path languages. Only TypeScript and Go were spot-checked against .scm capture names. The remaining languages (Java, Kotlin, C#, C++, C, Python, Rust, Dart, PHP) are assumed correct by the pattern.
  3. gitnexus-web/ call-routing counterpart. The call-routing.ts comment at line 14 notes: "This file is intentionally duplicated in gitnexus-web/ because the two packages have separate build targets. Keep both copies in sync." It is not verifiable from this branch whether gitnexus-web/'s copy of call-routing.ts has the same SKIP_RESULT return for include/extend/prepend. If the web copy still routes these as heritage, it would not affect this PR's CLI behavior but would create a divergence for web callers.
  4. ruby-mixin transitive step 4 correctness for multi-level mixin chains (e.g., a module that itself includes another module). The step 4 BFS fallback is acknowledged in code comments as "best-effort" and "deferred work." Acceptable for now.

Final verdict

⚠️ Merge after addressing blocking item.

The implementation is architecturally complete and correct. All three extraction paths are unified, legacy routing is fully removed, the ruby-mixin MRO strategy is correctly wired, and test coverage is solid. The one blocking item is operational:

  1. [Must confirm] The in-progress CI run 24574689196 must complete green on quality/typecheck and quality/format before merge. The 7 commits since the last completed CI run include significant new logic (ruby-mixin MRO walk, maybeRewriteRubyBareCallToSelf, call-resolution DAG provider hooks) that needs typecheck confirmation.

Two non-blocking findings that should be addressed before or shortly after merge:

  1. [Should fix] Ruby MRO step 4: prependParents.includes() / otherParents.includes() inside the BFS loop at resolve.ts:369–372 — replace with Set for O(1) lookup.
  2. [Should add] Unit test for lookupMethodByOwnerWithMRO with ruby-mixin strategy covering all four walk steps.

· Branch

…solution DAG

Addresses four issues flagged in the PR 890 review:

1. CI failure — processCallsFromExtracted non-callable symbol tests

   Two unit tests in test/unit/call-processor.test.ts failed after the DAG
   refactor:
   - 'refuses non-callable symbols even when the name resolves'
   - 'refuses Class target without callForm=constructor (existing behavior)'

   Root cause: defaultDispatchDecision mapped callForm === undefined to
   primary: 'free', which routes through resolveFreeCall's class-target
   fast path (Swift/Kotlin `User()` without `new`). Pre-DAG, undefined
   callForm fell through to singleCandidate via the module-alias branch —
   no class-target fast path.

   Fix: route callForm === undefined to primary: 'owner-scoped' (same as
   'member'). Without receiverTypeName, the owner-scoped branch falls
   through to resolveModuleAliasedCall + singleCandidate, restoring pre-
   DAG behavior: bare-identifier calls where the name resolves only to a
   non-callable symbol (Class / Interface / etc.) null-route cleanly.

2. [NON-BLOCKING] Array.includes() inside BFS loop (O(n²))

   model/resolve.ts ruby-mixin strategy step 4 used
   `prependParents.includes(id)` and `otherParents.includes(id)` inside
   the BFS loop over getAncestors() — latent O(n²) on deeply-mixed Ruby
   hierarchies.

   Fix: build a `walkedDirect` Set once before the loop for O(1) `.has()`
   lookups.

3. [NON-BLOCKING] getInstanceAncestry / getSingletonAncestry per-call allocation

   model/heritage-map.ts filtered `directParents` entries on every call,
   allocating a new array per dispatch site. For Ruby codebases with
   thousands of bare-call dispatches, this is a per-call-site allocation.

   Fix: lazy per-owner split cache. First call to
   getInstanceAncestry / getSingletonAncestry for a given nodeId
   partitions entries once and memoizes; subsequent calls return the
   cached arrays. Shared EMPTY_PARENT_ENTRIES sentinel avoids allocating
   `[]` when a view is empty (common: most classes have no extend
   providers, so singleton view is always empty).

4. [NON-BLOCKING] No direct unit test for ruby-mixin MRO walk

   model/resolve.ts ruby-mixin branch was exercised only indirectly via
   ruby-sequential-mixin integration tests, leaving no regression guard
   for the prepend-before-direct / include-after-direct / singleton-
   override / transitive-BFS invariants.

   Added 7 unit tests in test/unit/symbol-table.test.ts §
   lookupMethodByOwnerWithMRO:
   - prepend provider beats class's own method (shadow)
   - class's own method wins over include provider (shadow)
   - include provider used when class lacks the method
   - extend providers excluded from instance-dispatch walk
   - singleton ancestryOverride routes to extend provider
   - transitive mixin (Account includes Outer; Outer includes Inner)
   - stacked prepends — last-prepended wins

   Surfaced an additional walker bug during the extend-exclusion test:
   step-4 BFS fallback visited `extend` providers during instance
   dispatch via getAncestors() (flat parent list, no kind filter).

   Fix in resolve.ts: also skip `getSingletonAncestry` parents from the
   step-4 BFS — instance dispatch must never reach extend-provided
   methods, even transitively.

Tests: 330 unit tests pass (was 323, +7 for ruby-mixin walk coverage),
1787 integration tests pass (unchanged).
@magyargergo

Copy link
Copy Markdown
Collaborator

@claude Act as a senior reviewer for GitNexus. Your job is to determine whether this PR is production-ready for this repo, not to give a generic code review.

You are reviewing a PR in the GitNexus monorepo:

  • gitnexus/ → CLI + MCP
  • gitnexus-web/
  • gitnexus-shared/

Your task has 2 phases, in this exact order:

PHASE 1 — DEFINE THE BAR
Before reviewing the diff, establish a concise repo-specific definition of “production-ready” for GitNexus, based only on the repo docs and the affected area.
Keep this definition practical and reviewable. Do not invent standards that are not grounded in the repo.

PHASE 2 — REVIEW THE PR AGAINST THAT BAR
Review the actual diff only after defining the bar.
Stay tightly scoped to the changed code and its direct consequences.


CONTEXT TO LOAD FIRST
Read these before reviewing:

  • AGENTS.md
  • GUARDRAILS.md
  • CONTRIBUTING.md
  • TESTING.md
  • ARCHITECTURE.md

Additional context:


PRIMARY OBJECTIVE
Decide whether this PR is safe, correct, maintainable, and operationally acceptable to merge into production for GitNexus.

Do not optimize for completeness at the expense of signal.
Do not pad the review.
Do not propose unrelated refactors.
Do not restate the PR description unless needed for verification.


REVIEW RULES

  • Every finding must be grounded in specific evidence from the diff or directly relevant surrounding code.
  • Every finding must include path:line.
  • If you make a behavioral claim, cite the code that proves it.
  • If you make a performance claim, explain the mechanism.
  • If something cannot be verified from the diff alone, explicitly say so.
  • Distinguish clearly between:
    • verified issue
    • plausible risk
    • unverified concern
  • Avoid vague wording like “might be better” or “could be improved” unless you explain exactly why.
  • Keep the review focused on this PR’s scope only.

For each finding, assign one severity:

  • BLOCKING → must be fixed before merge
  • NON-BLOCKING → valid issue, but merge may still be acceptable
  • NIT → stylistic/minor, not merge-relevant

REPO-SPECIFIC REVIEW CHECKLIST
Use these exact headings.

1. Correctness & functional completeness

Check:

  • Does the implementation actually satisfy the PR claim?
    • ManifestExtractor is truly invoked
    • config.links produces non-zero cross-links where expected
  • Resolver contracts are preserved:
    • resolveSymbol remains exact-match
    • label-scoped Cypher remains correct per contract type
    • flag any regression toward fuzzy or unscoped matching
  • Graph schema integrity is preserved:
    • no silent changes to node labels
    • no silent changes to edge types
    • no silent changes to ID generation (generateId)
  • Call out any missing wiring, partial integration, dead branch, or mismatch between tests and runtime behavior

2. Code clarity & clean code

Check:

  • naming quality
  • local cohesion
  • dead code
  • unnecessary abstraction
  • hidden control flow
  • confusing indirection
  • adherence to repo conventions:
    • direct imports from gitnexus-shared
    • no barrel re-export regressions
    • no // removed comments
    • no unused re-exports
  • no drive-by refactors outside stated scope per CONTRIBUTING.md and GUARDRAILS.md § Scope

3. Test coverage & change safety

Evaluate against TESTING.md:

  • Are there unit tests under gitnexus/test/unit/ covering the newly wired path?
  • Is there a regression guard for 0-link → N-link behavior?
  • Are assertions meaningful rather than tautological?
  • Are fixtures realistic for manifest inputs?
  • If memoization/cache was introduced, is there a test proving hit/miss behavior and correctness?
  • Is there evidence the expected validation path would pass for staged gitnexus/ files?
    • tsc --noEmit
    • vitest run --project default
      If not verifiable, say exactly what is missing.

4. Performance

Inspect for:

  • hot-path overhead in ingestion/group sync
  • excess allocations per manifest entry
  • redundant Cypher round-trips
  • missed batching or missed parallelism (Promise.all) where it materially matters
  • O(n²) or repeated lookup patterns on large repos
  • memoization tradeoffs:
    • correctness
    • invalidation
    • bounded vs unbounded memory growth
      Do not speculate casually; explain the mechanism and likely impact.

5. Operational risk

Check:

  • Windows/cross-platform safety:
    • stream lifecycle
    • FD/file handle lifecycle
    • path separator assumptions
    • anything resembling prior ENOTEMPTY-style lifecycle regressions
  • LadybugDB single-writer invariant is preserved
  • Embeddings preservation:
    • no silent breakage of --embeddings
    • .gitnexus/meta.json.stats.embeddings not silently zeroed by changed paths
  • MCP contracts remain compatible:
    • group_*
    • query
    • context
    • impact
    • detect_changes
    • rename
    • cypher
      Flag any schema or contract break without migration note
  • staleness behavior still triggers correctly (gitnexus/src/mcp/staleness.ts)
  • rollback safety:
    • can this PR be reverted safely without re-analyze?
    • if not, explain why

6. Maintainability

Check:

  • Does the change respect Pipeline Phase DAG boundaries?
  • Is cross-phase coupling being introduced?
  • Are manifest-extractor entry points discoverable from group/sync.ts?
  • Would the next contributor know where to extend this?
  • Is shared logic placed appropriately:
    • in gitnexus-shared/
    • or gitnexus/src/core/**/utils.ts
    • and not duplicated between CLI and web bridge (server/api.ts)?

NON-NEGOTIABLES TO VERIFY
From GUARDRAILS.md / CONTRIBUTING.md:

  • no secrets, tokens, or machine-specific paths committed
  • no blind find-and-replace renames
  • if a symbol rename occurred, rename MCP tool or equivalent careful review was used
  • upstream impact was considered for modified shared symbols
  • no HIGH/CRITICAL impact left unaddressed
  • PR title follows conventional commit style + [area] Short description
  • PolyForm Noncommercial license headers/metadata untouched

OUTPUT FORMAT

Production-ready bar for this PR

Give a concise repo-specific definition in 5–10 bullets.

Findings

For each finding, use this structure exactly:

[BLOCKING|NON-BLOCKING|NIT] Short title

  • Evidence: path:line, path:line
  • Why it matters: ...
  • Recommendation: ...
  • Confidence: High / Medium / Low

Only include findings that are real and supported.
If there are no findings in a section, say:

  • No issues found in this area based on the diff and reviewed context.

What I could not fully verify

List any behavior you could not prove from the diff, tests, or available context.

Final verdict

Choose exactly one:

  • ✅ Ready to merge — no blocking issues.
  • ⚠️ Merge after addressing blocking items.
  • ❌ Not ready — fundamental concerns.

If verdict is not ready, list the blocking items in priority order.


IMPORTANT REVIEW BEHAVIOR

  • Be strict on correctness and contracts.
  • Be conservative about operational risk.
  • Be explicit about uncertainty.
  • Do not suggest unrelated refactors.
  • Do not turn nits into blockers.
  • Do not reward intent; review actual implementation.

@github-actions

github-actions Bot commented Apr 17, 2026

Copy link
Copy Markdown
Contributor

Claude finished @magyargergo's task in 12m 23s —— View job


PR Review: Heritage Extractor Config+Factory Pattern

Tasks:

  • Load context (AGENTS.md, GUARDRAILS.md, CONTRIBUTING.md, TESTING.md, ARCHITECTURE.md)
  • Define production-ready bar
  • Review diff against bar — correctness, tests, safety, performance, operational risk
  • Post final verdict

Production-ready bar for this PR

For a heritage extractor refactor + Ruby MRO wiring touching the ingestion pipeline in GitNexus:

  1. All three execution paths unified: parse-worker (worker), call-processor (sequential), and processHeritage (standalone sequential) must route heritage through the new extractor; no legacy inline paths remaining.
  2. No graph schema drift: ExtractedHeritage interface, node labels, edge types, generateId call signatures unchanged.
  3. MCP contract stability: No MCP tool/resource signatures changed; graph output semantically identical.
  4. Fully wired: All 16 providers supply heritageExtractor; no dead code or partially wired paths.
  5. Test coverage per TESTING.md: Unit tests under gitnexus/test/unit/; integration tests exercising real tree-sitter ASTs; wiring coverage for newly connected paths.
  6. Pipeline phase DAG boundaries respected: Heritage stays in parse phase; no cross-phase coupling.
  7. No dead code: Legacy routing removed and documented.
  8. CI green: npx tsc --noEmit and npx prettier --check . pass; all tests pass on all 3 platforms.
  9. PR title follows conventional commit format per CONTRIBUTING.md.

Findings

1. Correctness & functional completeness

All three execution paths are confirmed wired:

  • parse-worker.ts:1893–1922provider.heritageExtractor.extract() for capture-based heritage ✅
  • parse-worker.ts:1705–1721extractFromCall before callRouter for call-based heritage ✅
  • call-processor.ts:792–816 — pre-pass heritage extraction for buildTypeEnv
  • call-processor.ts:1024–1040extractFromCall before callRouter
  • heritage-processor.ts:244–263processHeritage delegating to heritageExtractor.extract()

Go named field filtering (configs/go.ts:20–23): correctly replicates old inline check. ✅

Ruby mixin interception: routeRubyCall at call-routing.ts:89–91 returns SKIP_RESULT with explanatory comment. ✅

Ruby MRO walk (resolve.ts:325–387): prepend (reverse) → direct → include (reverse) → transitive BFS. Correct per Ruby MRO semantics.

Graph schema preserved: ExtractedHeritage interface unchanged; resolveAndAddHeritageEdge uses identical generateId inputs and edge types as the prior processHeritageFromExtracted path. ✅


[BLOCKING] Two pre-existing tests fail in CI run 24574689196

  • Evidence: CI run 24574689196 (ubuntu/coverage), job-71856807978.log:
    FAIL test/unit/call-processor.test.ts > processCallsFromExtracted > refuses non-callable symbols even when the name resolves
    AssertionError: expected 1 to be +0
    
    FAIL test/unit/call-processor.test.ts > processCallsFromExtracted > refuses Class target without callForm=constructor (existing behavior)
    AssertionError: expected [ { …(6) } ] to have a length of +0 but got 1
    
    Same failures on Windows and macOS.
  • Why it matters: These are correctness regression tests — they verify that Class symbols are NOT emitted as CALLS targets without callForm: 'constructor'. They were passing on main before this branch. The failures indicate a regression in processCallsFromExtracted introduced by the call-processor.ts changes (+199/-44) in this PR. Commit 45ec357 ("fix(ingestion): address CI failures and PR review findings on call-resolution DAG") was pushed after these failures and may have addressed them, but this is unconfirmed — CI run 24575774915 (started 2026-04-17T16:30:49Z) is still in progress at review time.
  • Recommendation: Wait for CI run 24575774915 to complete. If it passes, this blocking item is resolved. If it fails again, the regression in non-callable symbol filtering must be fixed before merge.
  • Confidence: High (log reviewed; failure names verified)

[NON-BLOCKING] captureMap['call'] passed without null check

  • Evidence: call-processor.ts:1028captureMap['call'] is passed as SyntaxNode to provider.heritageExtractor.extractFromCall(), but is typed as any and has no null guard. The outer check at call-processor.ts:1019 only guards captureMap['call.name'].
  • Why it matters: Any tree-sitter query that emits @call.name without @call would pass undefined as the callNode argument. Ruby's callBasedHeritage.extract calls callNode.childForFieldName('arguments') — if callNode is undefined, this throws. In practice, queries always emit both captures together, so the probability is very low. But it's a type-unsafe assumption with no compile-time enforcement.
  • Recommendation: Add if (!captureMap['call']) return; before the extractFromCall invocation, or type the captureMap access as captureMap['call'] as SyntaxNode | undefined with an explicit guard.
  • Confidence: High

[NON-BLOCKING] No isolated unit test for ruby-mixin MRO walk steps

  • Evidence: model/resolve.ts:325–387 — the 4-step walk (prepend reverse → direct → include reverse → transitive BFS) is exercised via ruby-sequential-mixin.test.ts integration tests but has no direct unit test for lookupMethodByOwnerWithMRO with strategy: 'ruby-mixin'.
  • Why it matters: The behavioral invariant of this strategy — prepend beats direct, direct beats include — is the core correctness contract for Ruby MRO. A regression in the step ordering (e.g. direct checking before prepend) would require a full integration fixture to catch, not just any unit failure. An isolated unit test with mock HeritageMap entries would pin the exact semantics and run in < 1ms.
  • Recommendation: Add gitnexus/test/unit/resolve-mro.test.ts with mock stubs covering: (1) prepend wins over direct, (2) include loses to direct, (3) singleton override via ancestryOverride, (4) transitive BFS fallback.
  • Confidence: High

2. Code clarity & clean code

  • Pattern follows call-types/method-types/field-extractor family exactly — consistent naming, local cohesion, no unnecessary abstractions. ✅
  • No barrel re-export regressions, no // removed comments, no unused re-exports. ✅
  • SupportedLanguages import in heritage-extractor-wiring.test.ts:16 corrected to 'gitnexus-shared' (per 2cea3eb). ✅
  • language variable shadowing in processHeritage resolved (renamed to treeSitterLang, per 2cea3eb). ✅
  • routeRubyCall comment (call-routing.ts:9–11) correctly documents why heritage is now intercepted before the router. ✅

[NIT] call-routing.ts comment says file is "intentionally duplicated in gitnexus-web/"

  • Evidence: call-routing.ts:13–14 — "This file is intentionally duplicated in gitnexus-web/ because the two packages have separate build targets." But gitnexus-web/ has no such file (confirmed: glob of gitnexus-web/src/**/*.ts returns no call-routing counterpart).
  • Why it matters: The comment misleads future contributors into searching for a sync obligation that doesn't exist.
  • Recommendation: Remove or correct the comment to reflect current state.
  • Confidence: High

3. Test coverage & change safety

Unit tests (heritage-extraction.test.ts, 428 lines): Factory construction, all 15 provider wirings, Go named field filtering, Ruby call-based extraction — comprehensive and meaningful. ✅

Integration tests (heritage-extractor-wiring.test.ts, 186 lines): Ruby extractFromCall and TypeScript extract against real tree-sitter ASTs. ✅

Ruby sequential mixin integration test (ruby-sequential-mixin.test.ts, 232 lines): Full extractExtractedHeritageFromFilesbuildHeritageMapprocessCalls chain with real fixtures. ✅

Ruby self-call unit test (ruby-self-call.test.ts, 316 lines): maybeRewriteRubyBareCallToSelf gating logic comprehensively tested. ✅

Call-routing tests (test/unit/call-routing/ruby.test.ts): Updated correctly — 179 lines of stale heritage routing tests removed; skip behavior verified for include/extend/prepend. ✅

Pending: CI run 24575774915 must confirm the 2 pre-existing test regressions from CI 24574689196 are fixed by commit 45ec357.


4. Performance

  • createHeritageExtractor() called once at module load per language — no hot-path overhead. ✅
  • callNameSet.has() is O(1). ✅
  • MAX_PARENT_DEPTH = 50 (ruby.ts:11) and MAX_ANCESTOR_DEPTH = 32 (heritage-map.ts:88) bound all AST/ancestor walks. ✅
  • splitCache in heritage-map.ts:317–343 memoizes the instance/singleton split — the filter() pass runs at most once per owner per HeritageMap lifetime. ✅ (Previous review finding addressed.)
  • resolve.ts:375–378: BFS step 4 now uses Set<string> (walkedDirect, singletonOnly) instead of Array.includes() — O(1) skip-check per ancestor. ✅ (Previous review finding addressed.)

No performance issues found.


5. Operational risk

  • LadybugDB invariant: Not affected. Extraction is read-only from DB perspective. ✅
  • Graph schema: ExtractedHeritage interface, node labels, edge types, generateId inputs identical to before. ✅
  • MCP contracts: No MCP tools or resources touched. ✅
  • Embeddings: No analyze or embedding paths modified. ✅
  • Cross-platform: No path manipulation in new code. ✅
  • Rollback safety: Fully contained in gitnexus/src/core/ingestion/. Re-running analyze re-extracts all heritage. ✅
  • gitnexus-shared additive change: mro-strategy.ts adds 'ruby-mixin' to MroStrategy union. No consumer exhaustively switches on this union — additive and safe. ✅

6. Maintainability

  • Pipeline DAG boundaries respected: Heritage stays in parse phase. No cross-phase coupling. ✅
  • Extension point is clear: New languages pass SupportedLanguages.X to createHeritageExtractor() (enum path); custom hooks add a file to heritage-extractors/configs/. ✅
  • ruby-self-call.ts: Pure function, single consumer (languages/ruby.ts), no shared-layer imports. Correctly isolated. ✅
  • Documented limitation (resolve.ts:365–367): Step 4 transitive MRO order is "best-effort" with a pointer to architecture docs — honest scope boundary. ✅

What I could not fully verify

  1. CI on current HEAD (45ec357, run 24575774915) — in-progress at review time. The 2 test regressions found in run 24574689196 appear to be the target of commit 45ec357, but this is unconfirmed until the run completes.
  2. Tree-sitter query/config alignment for all 9 default-path languages — only TypeScript and Go were spot-checked against .scm capture names.
  3. gitnexus-web/ call-routing duplication — no such file exists in gitnexus-web/; the comment in call-routing.ts:13 referring to it appears to be stale or incorrect. No divergence risk identified, but the comment is misleading.

Final verdict

⚠️ Merge after confirming CI is green.

The implementation is architecturally complete and correct. All three execution paths are unified under the new extractor. Legacy inline code is fully removed. The ruby-mixin MRO strategy is correctly wired and well-tested. Performance concerns from previous reviews have been addressed (Set-based BFS dedup, memoized ancestry split). The two non-blocking findings do not warrant blocking merge.

The single blocking item is operational:

  1. [Must confirm] CI run 24575774915 must complete green before merge. The previous run (24574689196) had 2 test failures in processCallsFromExtracted — regressions in non-callable symbol filtering. Commit 45ec357 ("address CI failures") was pushed to fix them. Confirm the current run passes all tests on all 3 platforms before merging.
    · Branch

@magyargergo magyargergo merged commit dfa449e into main Apr 17, 2026
14 checks passed
magyargergo added a commit that referenced this pull request Apr 18, 2026
Bumps gitnexus to v1.6.2 and adds the matching CHANGELOG entry.

Highlights since v1.6.1 (61 commits):
  - Docker support (#848)
  - Language-agnostic heritage / call / variable extractors
    (config+factory pattern, #877 #878 #890)
  - AST-aware embedding chunking (#889)
  - jQuery / axios HTTP consumer detection (#887)
  - SemanticModel wired as first-class resolution input, SM-20 (#885)
  - ImportSemantics split into per-strategy hooks (#886)
  - Python dotted-import fix (#899); worker warnings non-terminal
    (#900 / #261); global-install ENOTEMPTY fixes (#843 #846);
    embeddings staleness fix (#831)

See gitnexus/CHANGELOG.md for the full list.

After merge, tag `v1.6.2` triggers publish.yml which runs CI,
verifies tag↔package.json match, publishes to npm with provenance,
and creates the GitHub Release using the extracted CHANGELOG body.
@magyargergo magyargergo mentioned this pull request Apr 18, 2026
5 tasks
magyargergo added a commit that referenced this pull request Apr 18, 2026
Bumps gitnexus to v1.6.2 and adds the matching CHANGELOG entry.

Highlights since v1.6.1 (61 commits):
  - Docker support (#848)
  - Language-agnostic heritage / call / variable extractors
    (config+factory pattern, #877 #878 #890)
  - AST-aware embedding chunking (#889)
  - jQuery / axios HTTP consumer detection (#887)
  - SemanticModel wired as first-class resolution input, SM-20 (#885)
  - ImportSemantics split into per-strategy hooks (#886)
  - Python dotted-import fix (#899); worker warnings non-terminal
    (#900 / #261); global-install ENOTEMPTY fixes (#843 #846);
    embeddings staleness fix (#831)

See gitnexus/CHANGELOG.md for the full list.

After merge, tag `v1.6.2` triggers publish.yml which runs CI,
verifies tag↔package.json match, publishes to npm with provenance,
and creates the GitHub Release using the extracted CHANGELOG body.
github714801013 pushed a commit to github714801013/GitNexus that referenced this pull request Apr 28, 2026
github714801013 pushed a commit to github714801013/GitNexus that referenced this pull request Apr 28, 2026
Bumps gitnexus to v1.6.2 and adds the matching CHANGELOG entry.

Highlights since v1.6.1 (61 commits):
  - Docker support (abhigyanpatwari#848)
  - Language-agnostic heritage / call / variable extractors
    (config+factory pattern, abhigyanpatwari#877 abhigyanpatwari#878 abhigyanpatwari#890)
  - AST-aware embedding chunking (abhigyanpatwari#889)
  - jQuery / axios HTTP consumer detection (abhigyanpatwari#887)
  - SemanticModel wired as first-class resolution input, SM-20 (abhigyanpatwari#885)
  - ImportSemantics split into per-strategy hooks (abhigyanpatwari#886)
  - Python dotted-import fix (abhigyanpatwari#899); worker warnings non-terminal
    (abhigyanpatwari#900 / abhigyanpatwari#261); global-install ENOTEMPTY fixes (abhigyanpatwari#843 abhigyanpatwari#846);
    embeddings staleness fix (abhigyanpatwari#831)

See gitnexus/CHANGELOG.md for the full list.

After merge, tag `v1.6.2` triggers publish.yml which runs CI,
verifies tag↔package.json match, publishes to npm with provenance,
and creates the GitHub Release using the extracted CHANGELOG body.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

feat(ingestion): language-agnostic heritage extractor with config+factory pattern

2 participants