refactor(pipeline): DAG-based phase architecture + container-logic extraction to LanguageProvider #809
…ovider
- Add resolveEnclosingOwner hook to LanguageProviderConfig
- Add staticOwnerTypes to MethodExtractionConfig
- Implement Ruby resolveEnclosingOwner (singleton_class → class/module)
- Replace hardcoded STATIC_OWNER_TYPES with config.staticOwnerTypes
- Move Ruby static types to rubyMethodConfig
- Move Kotlin static types to kotlinMethodConfig
- Remove Ruby singleton_class branch from findEnclosingClassInfo
- Collapse seqFindEnclosingClassNode/seqFindRawEnclosingContainerNode into single provider-aware seqFindEnclosingOwnerNode
- Update worker path to pass provider.resolveEnclosingOwner
Agent-Logs-Url: https://github.com/abhigyanpatwari/GitNexus/sessions/bc9f9d4d-f749-4872-9ff2-17fc86e08787
Co-authored-by: magyargergo <11230420+magyargergo@users.noreply.github.com>

…olveEnclosingOwner hook
Agent-Logs-Url: https://github.com/abhigyanpatwari/GitNexus/sessions/bc9f9d4d-f749-4872-9ff2-17fc86e08787
Co-authored-by: magyargergo <11230420+magyargergo@users.noreply.github.com>

…tion
Restructure the ingestion pipeline from a ~1800-line monolithic orchestrator into a DAG (Directed Acyclic Graph) of named phases with explicit dependencies.
New files under pipeline-phases/:
- types.ts: PipelinePhase, PipelineContext, PhaseResult contracts
- runner.ts: DAG runner with topological sort validation
- scan.ts, structure.ts, markdown.ts, cobol.ts: early phases
- parse.ts + parse-impl.ts: chunked parse + resolve (the core)
- routes.ts, tools.ts, orm.ts: post-parse enrichment phases
- cross-file.ts + cross-file-impl.ts: cross-file binding propagation
- mro.ts, communities.ts, processes.ts: graph analysis phases
- index.ts: barrel export
pipeline.ts reduced from ~1960 lines to ~184 lines:
- DAG phase array declaration
- runPipelineFromRepo as thin orchestrator
- topologicalLevelSort retained for backward compat
Agent-Logs-Url: https://github.com/abhigyanpatwari/GitNexus/sessions/136bf9c3-2f4f-449b-9fff-001332c8371c
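To make the "DAG runner with topological sort validation" idea concrete, here is a minimal sketch of Kahn's algorithm over named phases. It is not the code in runner.ts: `PhaseSketch`, `sortPhases`, the error messages, and the example dependency edges are all assumptions made for illustration.

```typescript
// Hypothetical, simplified phase shape; the real PipelinePhase contract lives in types.ts.
interface PhaseSketch {
  name: string;
  deps: string[];
}

// Kahn's algorithm: repeatedly emit phases whose dependencies have all been emitted.
function sortPhases(phases: PhaseSketch[]): string[] {
  const known = new Set<string>(phases.map((p) => p.name));
  const pending = new Map<string, number>(); // unmet dependency count per phase
  const dependents = new Map<string, string[]>(); // dep name -> phases waiting on it

  for (const p of phases) {
    pending.set(p.name, p.deps.length);
    for (const d of p.deps) {
      if (!known.has(d)) {
        throw new Error(`Phase ${p.name} depends on unknown phase ${d}`);
      }
      const waiting = dependents.get(d) ?? [];
      waiting.push(p.name);
      dependents.set(d, waiting);
    }
  }

  const ready = phases.filter((p) => p.deps.length === 0).map((p) => p.name);
  const sorted: string[] = [];
  let head = 0; // index cursor, so dequeuing does not reindex the array
  while (head < ready.length) {
    const current = ready[head++];
    sorted.push(current);
    for (const next of dependents.get(current) ?? []) {
      const left = (pending.get(next) ?? 0) - 1;
      pending.set(next, left);
      if (left === 0) ready.push(next);
    }
  }

  // Any phase never emitted is part of, or blocked by, a cycle.
  if (sorted.length !== phases.length) {
    throw new Error('Cycle detected in phase graph');
  }
  return sorted;
}

// Illustrative edges only; the real deps are declared in each phase file.
const order = sortPhases([
  { name: 'parse', deps: ['structure'] },
  { name: 'scan', deps: [] },
  { name: 'structure', deps: ['scan'] },
  { name: 'crossFile', deps: ['parse'] },
]);
console.log(order.join(' -> '));
```

Declaration order in the input array does not matter; only the declared `deps` determine the schedule.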
@claude review this from a performance and architecture perspective. Are there any bugs? Does it follow clean code principles?
Claude finished @magyargergo's task in 2m 45s
PR Review: Performance, Architecture, Clean Code & Bugs
Summary
This is a well-executed refactor that reduces …
Architecture
Strengths
Issues
🔴 Circular dependency
Node.js can resolve some cycles via hoisting, but this is fragile and can yield …
🟡 Design goal contradiction in …
- Extract synthesizeWildcardImportBindings to wildcard-synthesis.ts
- Extract extractORMQueriesInline to orm-extraction.ts
- Create shared constants.ts for AST_CACHE_CAP
- Fix inline type import in orm.ts (use proper top-level import)
- Add comprehensive JSDoc to getPhaseOutput explaining type safety
- Move isDev to module level in cross-file.ts (consistency)
- Improve module-level documentation across files
- Organize barrel exports in index.ts with section comments
Agent-Logs-Url: https://github.com/abhigyanpatwari/GitNexus/sessions/2bd6d4aa-6271-4009-8dd2-332ea8ec73ab
Co-authored-by: magyargergo <11230420+magyargergo@users.noreply.github.com>
@claude review this from a performance and architecture perspective. Are there any bugs? Does it follow clean code principles?
Claude finished @magyargergo's task in 3m 40s
PR Review: Performance, Architecture, Clean Code & Bugs
Summary
All critical and major issues from the three previous review passes have been resolved. The PR is in mergeable shape. This pass focuses on verifying those fixes and surfacing any remaining observations.
✅ Previously-Flagged Issues: All Resolved
Remaining Observations
🟡 Performance
Three call sites exist:
The final call at line 504 runs even when all prior chunks already synthesized, traversing all graph nodes and relationships again for no additional effect. A boolean flag tracking whether synthesis has run would eliminate the redundant final pass for repos where every chunk needed synthesis.
File contents re-read in cross-file propagation
O(n) per dequeue due to array reindexing. With 12 phases this is imperceptible. Flagging for completeness as the phase graph grows.
🟡 Architecture

    onProgress({
      phase: 'parsing',
      percent: 82,
      message: `Cross-file type propagation (${filesWithGaps}+ files)...`,
    });

The outer …
🟢 Minor Style
Nine consecutive single-element push loops remain. These could be …
Progress percentage budget remains implicit
Ranges are hardcoded across files: parse 20–82%, crossFile 82%, mro 83%, communities 84–93%, processes 94–99%. A constants object mapping phase → …
Overall Assessment
Ready to merge. All previously identified critical and major issues are cleanly resolved. The architecture is sound, data flow is explicit, and the dep-isolation test correctly enforces the contract. The remaining items above are all 🟡/🟢 carry-overs, worth tracking as follow-ups but not blockers.
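The "progress percentage budget" observation above could be addressed along these lines. This is only a sketch: the ranges are the ones quoted in the review, while `PROGRESS_BUDGET`, `BudgetedPhase`, and `phasePercent` are hypothetical names that do not appear in the PR.

```typescript
type BudgetedPhase = 'parse' | 'crossFile' | 'mro' | 'communities' | 'processes';
type ProgressRange = readonly [number, number];

// Ranges copied from the review comment; single-point phases use start === end.
const PROGRESS_BUDGET: Record<BudgetedPhase, ProgressRange> = {
  parse: [20, 82],
  crossFile: [82, 82],
  mro: [83, 83],
  communities: [84, 93],
  processes: [94, 99],
};

// Map a phase-local fraction (0..1) onto that phase's slice of the global progress bar.
function phasePercent(phase: BudgetedPhase, fraction: number): number {
  const [start, end] = PROGRESS_BUDGET[phase];
  const clamped = Math.min(1, Math.max(0, fraction));
  return Math.round(start + (end - start) * clamped);
}
```

With one table, a phase would report only its local fraction, and rebalancing the budget would touch a single file instead of every phase.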
…ror progress event
Restores phase diagnostics at CLI/MCP boundary. runPipeline now wraps
phase.execute() in try/catch and rethrows with 'Phase <name> failed: ...'
preserving the original via { cause }. Also emits a terminal
{ phase: 'error' } progress event so subscribers see the failure before
the rejection propagates. Handler errors during error reporting are
swallowed to keep the original cause authoritative.
Plan: docs/plans/2026-04-13-001-fix-pipeline-dag-refactor-review-findings-plan.md (U1)
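The wrapping behaviour described in this commit can be sketched roughly as follows. The real `runPipeline` is not shown on this page, so `runPhase`, `PhaseSketch`, and `PipelineProgress` are assumed names, and the cause is attached manually to approximate `new Error(msg, { cause })` without requiring an ES2022 target.

```typescript
// Hypothetical shapes for illustration only.
type PipelineProgress = { phase: string; message: string };
interface PhaseSketch {
  name: string;
  execute(): Promise<unknown>;
}

async function runPhase(
  phase: PhaseSketch,
  onProgress: (event: PipelineProgress) => void,
): Promise<unknown> {
  try {
    return await phase.execute();
  } catch (err) {
    const detail = err instanceof Error ? err.message : String(err);
    const wrapped = new Error(`Phase ${phase.name} failed: ${detail}`);
    // Keep the original error reachable; approximates the `{ cause }` constructor option.
    Object.assign(wrapped, { cause: err });
    try {
      // Terminal event, emitted before the rejection propagates.
      onProgress({ phase: 'error', message: wrapped.message });
    } catch {
      // A failing progress handler must not replace the original failure.
    }
    throw wrapped;
  }
}
```

Swallowing handler errors in the inner `try` is what keeps the original cause authoritative: the caller always sees the phase failure, never a secondary reporting failure.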
…ally; make single-use
crossFile.execute() now wraps its body in try/finally so the accumulator is released on both the happy path and when runCrossFileBindingPropagation throws. Dev-mode telemetry stays inside the try block before dispose (all three counters return 0 after dispose clears internal maps).
BindingAccumulator becomes single-use: appendFile after dispose now throws 'BindingAccumulator: use after dispose' instead of silently re-animating via the old _disposed auto-clear. Docs updated; the only production construction site (parse-impl) always creates a fresh instance per run, so no caller relied on the re-use contract.
Residual risk documented in crossFile module JSDoc: a future phase inserted between parse and crossFile that throws would still leak the accumulator. Any such phase must manage accumulator lifetime explicitly.
Plan: docs/plans/2026-04-13-001-fix-pipeline-dag-refactor-review-findings-plan.md (U2)
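A minimal sketch of the single-use, dispose-in-finally contract described above. `AccumulatorSketch`, `fileCount`, and `runCrossFile` are stand-ins invented for illustration; only the error string is taken from the commit message.

```typescript
// Hypothetical stand-in for BindingAccumulator; only the single-use contract is modeled.
class AccumulatorSketch {
  private files = new Map<string, string[]>();
  private disposed = false;

  appendFile(path: string, bindings: string[]): void {
    if (this.disposed) {
      throw new Error('BindingAccumulator: use after dispose');
    }
    this.files.set(path, bindings);
  }

  get fileCount(): number {
    return this.files.size;
  }

  dispose(): void {
    this.files.clear();
    this.disposed = true;
  }
}

// Single disposer: the consumer releases the accumulator whether or not propagation throws.
function runCrossFile(acc: AccumulatorSketch, propagate: () => void): number {
  try {
    propagate();
    // Read telemetry before dispose; counters report 0 once the maps are cleared.
    return acc.fileCount;
  } finally {
    acc.dispose();
  }
}
```

Throwing on use-after-dispose turns a silent lifetime bug into an immediate failure, at the cost of forbidding instance reuse.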
Investigation (plan U3) confirms: `importCtx` (ImportResolutionContext) is a scratch workspace with no downstream consumer after parse. `resolutionContext` (returned to crossFile) is a distinct object that owns importMap / namedImportMap / packageMap / moduleAliasMap / model, and never closes over importCtx. cross-file-impl consumes only that ctx via processCalls.
The two confusingly-similar "context" names were the root of the adversarial reviewer's concern; the comment locks in the invariant so the next reader sees it. No behavioral change.
Plan: docs/plans/2026-04-13-001-fix-pipeline-dag-refactor-review-findings-plan.md (U3)

…rseOutput
totalFiles was a hidden mutable field on PipelineContext written by parse and read by mro/communities/processes. Five reviewers flagged this as a violation of the immutable-context invariant. Removed from PipelineContext, which is now fully readonly, and made the implicit temporal dep explicit: mro/communities/processes now declare 'parse' as a dep and read totalFiles via getPhaseOutput<ParseOutput>(...).
No behavior change. Topo-sort unchanged because parse was already a transitive dep through crossFile.
Plan: docs/plans/2026-04-13-001-fix-pipeline-dag-refactor-review-findings-plan.md (U4)

…kepoint
createMethodExtractor now rejects MethodExtractionConfigs that list companion_object / singleton_class / object_declaration in typeDeclarationNodes but omit the matching entry from staticOwnerTypes. Fails loudly at provider construction time instead of producing silent isStatic=false on the 50000th file analyzed.
Opt-out convention preserved: an explicit `new Set()` (empty Set) signals intentional exclusion and passes the guard (memory obs #30588).
All 13 existing language configs pass the guard; the new negative test fails without it. Test-first.
Plan: docs/plans/2026-04-13-001-fix-pipeline-dag-refactor-review-findings-plan.md (U5)
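The guard described in U5 might look roughly like the sketch below. The config shape, the optionality of `staticOwnerTypes`, the function name `assertStaticOwnerTypes`, and the error text are all assumptions; only the three node-type names and the empty-Set opt-out come from the commit message.

```typescript
// Hypothetical, reduced config shape; the real MethodExtractionConfig has more fields.
interface MethodExtractionConfigSketch {
  typeDeclarationNodes: string[];
  staticOwnerTypes?: Set<string>;
}

// Node types that imply static ownership, as listed in the commit message.
const STATIC_OWNER_CANDIDATES = new Set<string>([
  'companion_object',
  'singleton_class',
  'object_declaration',
]);

function assertStaticOwnerTypes(config: MethodExtractionConfigSketch): void {
  const owners = config.staticOwnerTypes;
  // Opt-out convention: an explicit empty Set signals intentional exclusion.
  if (owners !== undefined && owners.size === 0) return;

  const missing = config.typeDeclarationNodes.filter(
    (nodeType) =>
      STATIC_OWNER_CANDIDATES.has(nodeType) &&
      !(owners !== undefined && owners.has(nodeType)),
  );
  if (missing.length > 0) {
    throw new Error(`staticOwnerTypes is missing: ${missing.join(', ')}`);
  }
}
```

Running this once at factory time means a misconfigured provider fails at construction rather than silently mislabeling methods during analysis.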
…vives throws
The sequential-fallback block in runChunkedParseAndResolve now runs inside a try/finally that guarantees astCache.clear(), accumulator finalize, and enrichExportedTypeMap execute even if readFileContents or processCalls throws mid-fallback. Cleanup failures are caught inside the finally so they can't mask the original error.
Accumulator disposal ownership remains with crossFile (U2); U6 only adds astCache cleanup and preserves finalize ordering on the error path.
Plan: docs/plans/2026-04-13-001-fix-pipeline-dag-refactor-review-findings-plan.md (U6)

…-file-impl
Both modules previously had zero direct unit coverage: branches were exercised only through integration tests' happy paths.
wildcard-synthesis.test.ts covers: Go graph-IMPORTS fallback, Python moduleAliasMap build, MAX_SYNTHETIC_BINDINGS_PER_FILE cap, dedup against existing namedImportMap entries, and empty-exportedSymbols early return.
cross-file-impl.test.ts covers: gapRatio below threshold no-op, MAX_CROSS_FILE_REPROCESS cap, graph-only exportedTypeMap fallback, and empty namedImportMap short-circuit.
Tests assert current behavior; any future regression flips them.
Plan: docs/plans/2026-04-13-001-fix-pipeline-dag-refactor-review-findings-plan.md (U7)

…o fixture
Pins the current post-P1/P2 graph output (57 symbols, 92 relationships, 4 processes, deterministic edge digest) so future silent refactors cannot drift behavior unnoticed. If any count changes or any edge rewires, the test fails with a readable diff listing what changed and a copy-pasteable UPDATE_GOLDEN=1 regen command.
Edge digest keyed by symbolic (label, name, filePath) triples rather than raw generateId output, so it stays meaningful across id-encoding refactors while still catching real semantic rewiring.
Plan: docs/plans/2026-04-13-001-fix-pipeline-dag-refactor-review-findings-plan.md (U8)
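One way to build an edge digest keyed by symbolic (label, name, filePath) triples is sketched below. The node and edge shapes, the key format, and the choice of a sorted list of strings (rather than a hash) are assumptions; the golden-file test itself is not shown on this page.

```typescript
// Hypothetical graph shapes for illustration.
interface NodeSketch {
  id: string;
  label: string;
  name: string;
  filePath: string;
}
interface EdgeSketch {
  sourceId: string;
  targetId: string;
  type: string;
}

// Describe each edge by what its endpoints are, not by how their ids are encoded.
function edgeDigest(nodes: NodeSketch[], edges: EdgeSketch[]): string[] {
  const keyById = new Map<string, string>();
  for (const n of nodes) {
    keyById.set(n.id, `${n.label}:${n.name}@${n.filePath}`);
  }
  return edges
    .map((e) => {
      const source = keyById.get(e.sourceId) ?? '<unknown>';
      const target = keyById.get(e.targetId) ?? '<unknown>';
      return `${source} -[${e.type}]-> ${target}`;
    })
    .sort(); // sorting makes the digest independent of edge insertion order
}
```

Two runs that encode ids differently but describe the same symbols and edges produce the same digest, while a rewired edge changes at least one line, which also gives a readable diff.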
…afeguards
U9: runner cycle detection now reports only the SCC members via DFS
back-edge trace ('Cycle detected: A -> B -> C -> A') rather than
everything with inDegree > 0 (which mixed cycle members with blocked
dependents). Also emits the 'error' progress event for graph-
validation failures, symmetric with U1's runtime-error path.
U16: findEnclosingClassInfo now defends against language-provider
hooks that return non-container nodes — visitedContainers Set breaks
repeat-visit loops, MAX_ENCLOSING_WALK_ITERATIONS is belt-and-braces.
Documented the hook contract invariant so future provider authors
know the walk-continues-upward expectation.
Plan: docs/plans/2026-04-13-001-fix-pipeline-dag-refactor-review-findings-plan.md (U9, U16)
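The U9 change, reporting only the cycle members via a DFS back-edge trace, can be illustrated with the sketch below. It is not the runner's implementation; `findCycle` and the map-of-deps input are assumptions, and only the message format is taken from the commit.

```typescript
// deps maps each phase name to the names it depends on.
function findCycle(deps: Map<string, string[]>): string[] | null {
  const state = new Map<string, 'visiting' | 'done'>();
  const stack: string[] = [];

  function visit(node: string): string[] | null {
    state.set(node, 'visiting');
    stack.push(node);
    for (const next of deps.get(node) ?? []) {
      const seen = state.get(next);
      if (seen === 'visiting') {
        // Back edge: the cycle is the stack suffix starting at `next`, closed by `next`.
        return stack.slice(stack.indexOf(next)).concat(next);
      }
      if (seen === undefined) {
        const found = visit(next);
        if (found !== null) return found;
      }
    }
    stack.pop();
    state.set(node, 'done');
    return null;
  }

  for (const node of Array.from(deps.keys())) {
    if (!state.has(node)) {
      const found = visit(node);
      if (found !== null) return found;
    }
  }
  return null;
}

const cycle = findCycle(
  new Map<string, string[]>([
    ['A', ['B']],
    ['B', ['C']],
    ['C', ['A']],
    ['D', ['A']], // blocked by the cycle, but not a member of it
  ]),
);
if (cycle !== null) console.log(`Cycle detected: ${cycle.join(' -> ')}`);
```

Unlike listing every node with a non-zero in-degree, the trace leaves out `D`, which is merely blocked behind the cycle.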
…t, graph-sort naming
Bundles plan units U10, U11, U12, U14, U15:
U10 (type hygiene): readonly ParseOutput arrays (allExtractedRoutes, allDecoratorRoutes, allToolDefs, allORMQueries, allPaths); removed redundant 'as string[] | undefined' cast in routes.ts and 'as URL' in parse-impl.ts; WorkerPool is now 'import type'. Readonly contract propagated into processORMQueries (only iterates).
U11 (dead code & shims): deleted constants.ts shim (AST_CACHE_CAP inlined into its sole real consumer cross-file-impl.ts; isDev consumers now import directly from ../utils/env.js). Removed internal utility re-exports from pipeline-phases/index.ts (no external consumers). Removed topologicalLevelSort re-export from pipeline.ts; updated topological-sort.test.ts to import from the canonical utils/graph-sort.js. Stripped 'Phase 3+4:' stale JSDoc from parse-impl.ts.
U12 (perf): StructureOutput now carries allPathSet (ReadonlySet<string>) built once; cobol, markdown, and cross-file-impl consume the shared set instead of allocating their own. Parse forwards it via ParseOutput.allPathSet; processCobol/processMarkdown widened to ReadonlySet<string>.
U14 (graph-sort.ts): renamed local 'inDegree' to 'pendingImportsPerFile' with expanded JSDoc explaining the reverse-graph Kahn's formulation and warning future maintainers not to 'correct' it to standard in-degree semantics. Added self-edge test.
U15 (unconditional worker-fallback logging): removed isDev guard on the worker-pool-creation-failure console.warn so operators can diagnose perf degradations in production.
No behavior change. U8 golden-file test confirms pipeline output is byte-identical.
Plan: docs/plans/2026-04-13-001-fix-pipeline-dag-refactor-review-findings-plan.md (U10, U11, U12, U14, U15)

…o 1.3.0
U13 (documentation fixes):
ARCHITECTURE.md: the prior insertion of the 'Pipeline Phase DAG' section orphaned 7 rows from the 'Where to change what' header. Moved those 7 rows back up under their header so the table reads contiguously; DAG section now follows the completed table.
AGENTS.md + CLAUDE.md: bumped version 1.2.0 -> 1.3.0, updated Last reviewed to 2026-04-13, added matching Changelog row documenting the GitNexus index stats refresh after the DAG refactor. Stat bumps (symbols/relationships/execution flows) that were sitting uncommitted in the working tree are now landed under a proper changelog entry per each file's own documented schema.
Plan: docs/plans/2026-04-13-001-fix-pipeline-dag-refactor-review-findings-plan.md (U13)
CI Report
✅ All checks passed
Test Results
✅ All 6181 tests passed, 97 test(s) skipped
@claude review this from a performance and architecture perspective. Are there any bugs? Does it follow clean code principles?
Claude finished @magyargergo's task in 2m 20s
PR Review: Pass 5
Summary
Pass 4 declared "ready to merge"; this pass verified that all four critical issues are still cleanly resolved. However, three items that pass 4 marked as ✅ Fixed are actually still present in the code, and a new mutable-output issue is worth flagging.
❌ Pass 4 Incorrectly Marked as Fixed
🟡 Redundant …
| Issue | Location | Priority |
|---|---|---|
| `Array.shift()` O(n) dequeue | `runner.ts:60` | 🟢 |
| File contents re-read in cross-file propagation | `cross-file-impl.ts:150` | 🟡 |
| `for (const item of arr) target.push(item)` pattern | `parse-impl.ts:315–382` | 🟢 |
| Progress percentage budget implicit and scattered | All phase files | 🟢 |
Overall Assessment
The four critical issues from passes 1–2 remain fixed. The PR is functionally sound. Three items that pass 4 incorrectly marked as resolved are still present — the most actionable is the redundant parse dep in three phases, which could be addressed with the ctx.totalFiles approach or alternatively by promoting totalFiles directly from StructureOutput into the context at pipeline setup time. The exportedTypeMap mutability asymmetry is a new 🟡 worth addressing before the pattern proliferates.
…ut.exportedTypeMap, skip redundant wildcard synth
- mro/communities/processes: switch redundant `parse` dep to `structure`: totalFiles originates in structure, so depending on parse for it was a spurious data dep that obscured the real DAG.
- ParseOutput.exportedTypeMap: typed as truly ReadonlyMap<..., ReadonlyMap<...>>; graph→exports enrichment moved into parse-impl so the snapshot is fully populated at parse return. crossFile builds its own local mutable working copy for per-file re-resolution writes, with no cast at the boundary.
- parse-impl: hasSynthesized flag guards the unconditional final synthesizeWildcardImportBindings call when per-chunk/fallback synthesis already ran (graph-global + idempotent across chunks).
- cross-file-impl: documented the intentional `phase: 'parsing'` progress label so telemetry bucketing stays consistent with the parse phase.
- cross-file-impl test: replaced the now-moved fallback-enrichment assertion with a stronger one: crossFile must not mutate the parse-supplied map.
Addresses PR #809 review pass 5 carry-overs.
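The "readonly snapshot plus local mutable working copy" pattern from this commit can be sketched as follows. The key and value types (file path → symbol name → type name) and the helper name `toWorkingCopy` are assumptions; the commit only states that the map is a `ReadonlyMap` of `ReadonlyMap`s and that crossFile copies it before writing.

```typescript
// Assumed nesting: file path -> exported symbol name -> type name.
type ExportedTypeSnapshot = ReadonlyMap<string, ReadonlyMap<string, string>>;

// Deep-copy both levels so writes to the working copy never reach the snapshot.
function toWorkingCopy(snapshot: ExportedTypeSnapshot): Map<string, Map<string, string>> {
  const copy = new Map<string, Map<string, string>>();
  snapshot.forEach((inner, filePath) => {
    copy.set(filePath, new Map(inner));
  });
  return copy;
}

const parseSnapshot: ExportedTypeSnapshot = new Map([
  ['src/a.ts', new Map([['Foo', 'class']])],
]);

const working = toWorkingCopy(parseSnapshot);
// Re-resolution writes go to the local copy only.
working.get('src/a.ts')?.set('Bar', 'interface');
working.set('src/b.ts', new Map([['Baz', 'function']]));
```

Copying costs one pass over the map, but it removes the need for a cast at the phase boundary and lets a test assert that the parse-supplied map is unchanged.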
@claude review this from a performance and architecture perspective. Are there any bugs? Does it follow clean code principles?
Claude finished @magyargergo's task in 2m 11s
PR Review: Pass 6
Summary
All previously-flagged issues from Pass 5 are now resolved. The PR is clean and ready to merge. Below is a verification of those fixes plus a handful of new minor observations.
✅ Pass 5 Items: All Resolved
Remaining Observations
🟡 …
…traction to LanguageProvider (abhigyanpatwari#809)

* Initial plan

* refactor: move language-specific container node logic into LanguageProvider
- Add resolveEnclosingOwner hook to LanguageProviderConfig
- Add staticOwnerTypes to MethodExtractionConfig
- Implement Ruby resolveEnclosingOwner (singleton_class → class/module)
- Replace hardcoded STATIC_OWNER_TYPES with config.staticOwnerTypes
- Move Ruby static types to rubyMethodConfig
- Move Kotlin static types to kotlinMethodConfig
- Remove Ruby singleton_class branch from findEnclosingClassInfo
- Collapse seqFindEnclosingClassNode/seqFindRawEnclosingContainerNode into single provider-aware seqFindEnclosingOwnerNode
- Update worker path to pass provider.resolveEnclosingOwner
Agent-Logs-Url: https://github.com/abhigyanpatwari/GitNexus/sessions/bc9f9d4d-f749-4872-9ff2-17fc86e08787
Co-authored-by: magyargergo <11230420+magyargergo@users.noreply.github.com>

* test: add regression tests for config-driven staticOwnerTypes and resolveEnclosingOwner hook
Agent-Logs-Url: https://github.com/abhigyanpatwari/GitNexus/sessions/bc9f9d4d-f749-4872-9ff2-17fc86e08787
Co-authored-by: magyargergo <11230420+magyargergo@users.noreply.github.com>

* refactor: implement DAG-based pipeline architecture with phase extraction
Restructure the ingestion pipeline from a ~1800-line monolithic orchestrator into a DAG (Directed Acyclic Graph) of named phases with explicit dependencies.
New files under pipeline-phases/:
- types.ts: PipelinePhase, PipelineContext, PhaseResult contracts
- runner.ts: DAG runner with topological sort validation
- scan.ts, structure.ts, markdown.ts, cobol.ts: early phases
- parse.ts + parse-impl.ts: chunked parse + resolve (the core)
- routes.ts, tools.ts, orm.ts: post-parse enrichment phases
- cross-file.ts + cross-file-impl.ts: cross-file binding propagation
- mro.ts, communities.ts, processes.ts: graph analysis phases
- index.ts: barrel export
pipeline.ts reduced from ~1960 lines to ~184 lines:
- DAG phase array declaration
- runPipelineFromRepo as thin orchestrator
- topologicalLevelSort retained for backward compat
Agent-Logs-Url: https://github.com/abhigyanpatwari/GitNexus/sessions/136bf9c3-2f4f-449b-9fff-001332c8371c

* test: add DAG runner unit tests, update ARCHITECTURE.md with phase DAG docs
Agent-Logs-Url: https://github.com/abhigyanpatwari/GitNexus/sessions/136bf9c3-2f4f-449b-9fff-001332c8371c

* fix: address code review - pass resolutionContext through parse output, fix worker URL path
Agent-Logs-Url: https://github.com/abhigyanpatwari/GitNexus/sessions/136bf9c3-2f4f-449b-9fff-001332c8371c

* fix: declare transitive parse dependency explicitly in mro/communities/processes phases
Agent-Logs-Url: https://github.com/abhigyanpatwari/GitNexus/sessions/136bf9c3-2f4f-449b-9fff-001332c8371c

* refactor: improve pipeline-phases clean code and folder structure
- Extract synthesizeWildcardImportBindings to wildcard-synthesis.ts
- Extract extractORMQueriesInline to orm-extraction.ts
- Create shared constants.ts for AST_CACHE_CAP
- Fix inline type import in orm.ts (use proper top-level import)
- Add comprehensive JSDoc to getPhaseOutput explaining type safety
- Move isDev to module level in cross-file.ts (consistency)
- Improve module-level documentation across files
- Organize barrel exports in index.ts with section comments
Agent-Logs-Url: https://github.com/abhigyanpatwari/GitNexus/sessions/2bd6d4aa-6271-4009-8dd2-332ea8ec73ab
Co-authored-by: magyargergo <11230420+magyargergo@users.noreply.github.com>

* address review feedback: fix circular dep, allFetchCalls mutation, progress bugs, remove DAG naming, extract isDev, fix _item naming, fix O(n²) line calc
Agent-Logs-Url: https://github.com/abhigyanpatwari/GitNexus/sessions/6cf53c9b-d55d-4c6f-bf3d-7bfb82d512b6
Co-authored-by: magyargergo <11230420+magyargergo@users.noreply.github.com>

* improve JSDoc on lineNumberAtOffset binary search
Agent-Logs-Url: https://github.com/abhigyanpatwari/GitNexus/sessions/6cf53c9b-d55d-4c6f-bf3d-7bfb82d512b6
Co-authored-by: magyargergo <11230420+magyargergo@users.noreply.github.com>

* address review: filter deps in runner, move totalFiles to ctx, fix cycle JSDoc, centralize isDev, remove DAG naming
Agent-Logs-Url: https://github.com/abhigyanpatwari/GitNexus/sessions/b388424f-b939-4a94-97de-3855f9465564
Co-authored-by: magyargergo <11230420+magyargergo@users.noreply.github.com>

* fix doc consistency in graph-sort.ts module-level and function-level JSDoc
Agent-Logs-Url: https://github.com/abhigyanpatwari/GitNexus/sessions/b388424f-b939-4a94-97de-3855f9465564
Co-authored-by: magyargergo <11230420+magyargergo@users.noreply.github.com>

* fix(pipeline): wrap phase errors with phase name and emit terminal error progress event
Restores phase diagnostics at CLI/MCP boundary. runPipeline now wraps phase.execute() in try/catch and rethrows with 'Phase <name> failed: ...' preserving the original via { cause }. Also emits a terminal { phase: 'error' } progress event so subscribers see the failure before the rejection propagates. Handler errors during error reporting are swallowed to keep the original cause authoritative.
Plan: docs/plans/2026-04-13-001-fix-pipeline-dag-refactor-review-findings-plan.md (U1)

* fix(pipeline): move bindingAccumulator dispose into crossFile try/finally; make single-use
crossFile.execute() now wraps its body in try/finally so the accumulator is released on both the happy path and when runCrossFileBindingPropagation throws. Dev-mode telemetry stays inside the try block before dispose (all three counters return 0 after dispose clears internal maps).
BindingAccumulator becomes single-use: appendFile after dispose now throws 'BindingAccumulator: use after dispose' instead of silently re-animating via the old _disposed auto-clear. Docs updated; the only production construction site (parse-impl) always creates a fresh instance per run, so no caller relied on the re-use contract.
Residual risk documented in crossFile module JSDoc: a future phase inserted between parse and crossFile that throws would still leak the accumulator. Any such phase must manage accumulator lifetime explicitly.
Plan: docs/plans/2026-04-13-001-fix-pipeline-dag-refactor-review-findings-plan.md (U2)

* docs(pipeline): explain why importCtx teardown is safe before crossFile
Investigation (plan U3) confirms: `importCtx` (ImportResolutionContext) is a scratch workspace with no downstream consumer after parse. `resolutionContext` (returned to crossFile) is a distinct object that owns importMap / namedImportMap / packageMap / moduleAliasMap / model, and never closes over importCtx. cross-file-impl consumes only that ctx via processCalls.
The two confusingly-similar "context" names were the root of the adversarial reviewer's concern; the comment locks in the invariant so the next reader sees it. No behavioral change.
Plan: docs/plans/2026-04-13-001-fix-pipeline-dag-refactor-review-findings-plan.md (U3)

* refactor(pipeline): remove ctx.totalFiles side-channel; promote to ParseOutput
totalFiles was a hidden mutable field on PipelineContext written by parse and read by mro/communities/processes. Five reviewers flagged this as a violation of the immutable-context invariant. Removed from PipelineContext, which is now fully readonly, and made the implicit temporal dep explicit: mro/communities/processes now declare 'parse' as a dep and read totalFiles via getPhaseOutput<ParseOutput>(...).
No behavior change. Topo-sort unchanged because parse was already a transitive dep through crossFile.
Plan: docs/plans/2026-04-13-001-fix-pipeline-dag-refactor-review-findings-plan.md (U4)

* feat(method-extractor): runtime staticOwnerTypes guard at factory chokepoint
createMethodExtractor now rejects MethodExtractionConfigs that list companion_object / singleton_class / object_declaration in typeDeclarationNodes but omit the matching entry from staticOwnerTypes. Fails loudly at provider construction time instead of producing silent isStatic=false on the 50000th file analyzed.
Opt-out convention preserved: an explicit `new Set()` (empty Set) signals intentional exclusion and passes the guard (memory obs #30588).
All 13 existing language configs pass the guard; the new negative test fails without it. Test-first.
Plan: docs/plans/2026-04-13-001-fix-pipeline-dag-refactor-review-findings-plan.md (U5)

* fix(pipeline): wrap sequential-fallback in try/finally so cleanup survives throws
The sequential-fallback block in runChunkedParseAndResolve now runs inside a try/finally that guarantees astCache.clear(), accumulator finalize, and enrichExportedTypeMap execute even if readFileContents or processCalls throws mid-fallback. Cleanup failures are caught inside the finally so they can't mask the original error.
Accumulator disposal ownership remains with crossFile (U2); U6 only adds astCache cleanup and preserves finalize ordering on the error path.
Plan: docs/plans/2026-04-13-001-fix-pipeline-dag-refactor-review-findings-plan.md (U6)

* test(pipeline): direct unit coverage for wildcard-synthesis and cross-file-impl
Both modules previously had zero direct unit coverage: branches were exercised only through integration tests' happy paths.
wildcard-synthesis.test.ts covers: Go graph-IMPORTS fallback, Python moduleAliasMap build, MAX_SYNTHETIC_BINDINGS_PER_FILE cap, dedup against existing namedImportMap entries, and empty-exportedSymbols early return.
cross-file-impl.test.ts covers: gapRatio below threshold no-op, MAX_CROSS_FILE_REPROCESS cap, graph-only exportedTypeMap fallback, and empty namedImportMap short-circuit.
Tests assert current behavior; any future regression flips them.
Plan: docs/plans/2026-04-13-001-fix-pipeline-dag-refactor-review-findings-plan.md (U7)

* test(pipeline): golden-file graph-parity regression guard on mini-repo fixture
Pins the current post-P1/P2 graph output (57 symbols, 92 relationships, 4 processes, deterministic edge digest) so future silent refactors cannot drift behavior unnoticed. If any count changes or any edge rewires, the test fails with a readable diff listing what changed and a copy-pasteable UPDATE_GOLDEN=1 regen command.
Edge digest keyed by symbolic (label, name, filePath) triples rather than raw generateId output, so it stays meaningful across id-encoding refactors while still catching real semantic rewiring.
Plan: docs/plans/2026-04-13-001-fix-pipeline-dag-refactor-review-findings-plan.md (U8)

* fix(pipeline): minimal cycle reporting + resolveEnclosingOwner loop safeguards
U9: runner cycle detection now reports only the SCC members via DFS back-edge trace ('Cycle detected: A -> B -> C -> A') rather than everything with inDegree > 0 (which mixed cycle members with blocked dependents). Also emits the 'error' progress event for graph-validation failures, symmetric with U1's runtime-error path.
U16: findEnclosingClassInfo now defends against language-provider hooks that return non-container nodes: visitedContainers Set breaks repeat-visit loops, MAX_ENCLOSING_WALK_ITERATIONS is belt-and-braces. Documented the hook contract invariant so future provider authors know the walk-continues-upward expectation.
Plan: docs/plans/2026-04-13-001-fix-pipeline-dag-refactor-review-findings-plan.md (U9, U16)

* refactor(pipeline): type hygiene, dead code cleanup, shared allPathSet, graph-sort naming
Bundles plan units U10, U11, U12, U14, U15:
U10 (type hygiene): readonly ParseOutput arrays (allExtractedRoutes, allDecoratorRoutes, allToolDefs, allORMQueries, allPaths); removed redundant 'as string[] | undefined' cast in routes.ts and 'as URL' in parse-impl.ts; WorkerPool is now 'import type'. Readonly contract propagated into processORMQueries (only iterates).
U11 (dead code & shims): deleted constants.ts shim (AST_CACHE_CAP inlined into its sole real consumer cross-file-impl.ts; isDev consumers now import directly from ../utils/env.js). Removed internal utility re-exports from pipeline-phases/index.ts (no external consumers). Removed topologicalLevelSort re-export from pipeline.ts; updated topological-sort.test.ts to import from the canonical utils/graph-sort.js. Stripped 'Phase 3+4:' stale JSDoc from parse-impl.ts.
U12 (perf): StructureOutput now carries allPathSet (ReadonlySet<string>) built once; cobol, markdown, and cross-file-impl consume the shared set instead of allocating their own. Parse forwards it via ParseOutput.allPathSet; processCobol/processMarkdown widened to ReadonlySet<string>.
U14 (graph-sort.ts): renamed local 'inDegree' to 'pendingImportsPerFile' with expanded JSDoc explaining the reverse-graph Kahn's formulation and warning future maintainers not to 'correct' it to standard in-degree semantics. Added self-edge test.
U15 (unconditional worker-fallback logging): removed isDev guard on the worker-pool-creation-failure console.warn so operators can diagnose perf degradations in production.
No behavior change. U8 golden-file test confirms pipeline output is byte-identical.
Plan: docs/plans/2026-04-13-001-fix-pipeline-dag-refactor-review-findings-plan.md (U10, U11, U12, U14, U15)

* docs: fix ARCHITECTURE.md table integrity; bump AGENTS.md/CLAUDE.md to 1.3.0
U13 (documentation fixes):
ARCHITECTURE.md: the prior insertion of the 'Pipeline Phase DAG' section orphaned 7 rows from the 'Where to change what' header. Moved those 7 rows back up under their header so the table reads contiguously; DAG section now follows the completed table.
AGENTS.md + CLAUDE.md: bumped version 1.2.0 -> 1.3.0, updated Last reviewed to 2026-04-13, added matching Changelog row documenting the GitNexus index stats refresh after the DAG refactor. Stat bumps (symbols/relationships/execution flows) that were sitting uncommitted in the working tree are now landed under a proper changelog entry per each file's own documented schema.
Plan: docs/plans/2026-04-13-001-fix-pipeline-dag-refactor-review-findings-plan.md (U13)

* refactor(pipeline): drop spurious parse deps, true-readonly ParseOutput.exportedTypeMap, skip redundant wildcard synth
- mro/communities/processes: switch redundant `parse` dep to `structure`: totalFiles originates in structure, so depending on parse for it was a spurious data dep that obscured the real DAG.
- ParseOutput.exportedTypeMap: typed as truly ReadonlyMap<..., ReadonlyMap<...>>; graph→exports enrichment moved into parse-impl so the snapshot is fully populated at parse return. crossFile builds its own local mutable working copy for per-file re-resolution writes, with no cast at the boundary.
- parse-impl: hasSynthesized flag guards the unconditional final synthesizeWildcardImportBindings call when per-chunk/fallback synthesis already ran (graph-global + idempotent across chunks).
- cross-file-impl: documented the intentional `phase: 'parsing'` progress label so telemetry bucketing stays consistent with the parse phase.
- cross-file-impl test: replaced the now-moved fallback-enrichment assertion with a stronger one: crossFile must not mutate the parse-supplied map.
Addresses PR abhigyanpatwari#809 review pass 5 carry-overs.

---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: magyargergo <11230420+magyargergo@users.noreply.github.com>
Co-authored-by: Gergo Magyar <gergomagyar@icloud.com>
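Among the commits above, one fixes an O(n²) line calculation and another documents the binary search in `lineNumberAtOffset`. The usual shape of that technique is sketched below; the signature, the 1-based convention, and the helper `buildLineStarts` are assumptions rather than the code in the repository.

```typescript
// Precompute the offset at which each line starts: one O(n) pass per file.
function buildLineStarts(text: string): number[] {
  const starts = [0];
  for (let i = 0; i < text.length; i++) {
    if (text.charCodeAt(i) === 10) starts.push(i + 1); // 10 is '\n'
  }
  return starts;
}

// Return the 1-based line containing `offset`: the last line start that is <= offset.
function lineNumberAtOffset(lineStarts: number[], offset: number): number {
  let lo = 0;
  let hi = lineStarts.length - 1;
  while (lo < hi) {
    const mid = (lo + hi + 1) >>> 1; // upper midpoint, so `lo = mid` always makes progress
    if (lineStarts[mid] <= offset) {
      lo = mid;
    } else {
      hi = mid - 1;
    }
  }
  return lo + 1;
}
```

Each lookup is O(log n) after the single pass, whereas rescanning from the start of the file for every match is what makes a naive line calculation quadratic.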
Overview
Two interleaved refactors landed on this branch:
- Container-logic extraction to `LanguageProvider`: language-specific knowledge of which AST nodes count as containers (classes, interfaces, modules, etc.) is now owned by each language provider rather than scattered through generic processors.
- DAG-based phase architecture: `pipeline.ts` shrank from ~1,950 lines to ~130, with each phase extracted into `pipeline-phases/` as a typed node in a dependency graph, scheduled by a generic runner via topological sort.

A long review series (passes 1–5) drove sustained cleanup of correctness, type, and lifecycle issues across both refactors.

Architecture changes
- Each phase declares `deps` and a typed `output`; the runner resolves deps via topological sort and passes `PhaseResult<T>` maps.
- Runner now filters `results` to only the declared deps before passing to `execute()`, so phases can't accidentally read undeclared upstream output.
- `PipelineContext` is fully `readonly`: phase outputs flow only through typed `deps`, not through context mutation.
- Phases: `scan`, `structure`, `markdown`, `cobol`, `parse`, `routes`, `tools`, `orm`, `crossFile`, `mro`, `communities`, `processes`.
- `LanguageProvider` owns container-node detection; `resolveEnclosingOwner` is config-driven with a runtime `staticOwnerTypes` guard at the factory chokepoint.

Correctness, lifecycle & type safety
- `bindingAccumulator` disposal moved into `crossFile`'s `try/finally` (single-disposer ownership documented; sequential fallback also wrapped in `try/finally` so cleanup survives throws).
- `resolveEnclosingOwner` loop safeguards.
- `allFetchCalls` and other parse outputs typed `readonly` to prevent downstream mutation.
- `ParseOutput.exportedTypeMap` exposed as truly `ReadonlyMap<…, ReadonlyMap<…>>`. Worker-path graph→exports enrichment moved into `parse-impl` so the snapshot is fully populated at parse return; `crossFile` builds its own local mutable working copy for per-file re-resolution writes, with no cast at the boundary.
- `parse` deps dropped from `mro`/`communities`/`processes` (they only needed `totalFiles`, which originates in `structure`).
- `hasSynthesized` flag guards the unconditional final `synthesizeWildcardImportBindings` call when per-chunk/fallback synthesis already ran.
- `isDev` centralized in `utils/env.ts` (no more duplication across processors).
- `lineNumberAtOffset` binary-search JSDoc improved; O(n²) line calc fixed.
- `allPathSet` built once in `structure` and consumed across phases (was rebuilt per phase).
- `graph-sort.ts` made consistent across module-level and function-level docs.

Testing
- Direct unit coverage for `wildcard-synthesis` and `cross-file-impl` (incl. assertion that `crossFile` does NOT mutate the parse-supplied `exportedTypeMap`).
- Regression tests for config-driven `staticOwnerTypes` and `resolveEnclosingOwner` hook.
- `phase: 'parsing'` progress label on cross-file re-resolution so telemetry bucketing stays consistent.
- `tsc --noEmit` clean. Full Vitest run green except for one pre-existing unrelated `dart` parser-loader native-module failure.

Test plan
- `tsc --noEmit` clean across `gitnexus/` and `gitnexus-shared/`
- … (`dart` query-compilation failure is pre-existing, unrelated to this branch)
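The dep-isolation rule from the architecture changes above (a phase sees only the outputs of phases it declares in `deps`) can be sketched as follows. `getPhaseOutput` is named in the PR, but its signature here is an assumption, and `PhaseResultSketch`, `ResultMap`, and `filterToDeclaredDeps` are hypothetical names.

```typescript
// Hypothetical result envelope; the real PhaseResult<T> is defined in types.ts.
interface PhaseResultSketch<T> {
  phaseName: string;
  output: T;
}
type ResultMap = ReadonlyMap<string, PhaseResultSketch<unknown>>;

// Runner side: hand a phase only the results of the phases it declared.
function filterToDeclaredDeps(all: ResultMap, deps: readonly string[]): ResultMap {
  const visible = new Map<string, PhaseResultSketch<unknown>>();
  for (const dep of deps) {
    const result = all.get(dep);
    if (result !== undefined) visible.set(dep, result);
  }
  return visible;
}

// Phase side: typed accessor. The cast is unchecked, so the caller must name the right type.
function getPhaseOutput<T>(results: ResultMap, phaseName: string): T {
  const result = results.get(phaseName);
  if (result === undefined) {
    throw new Error(`No output for phase "${phaseName}"; is it declared in deps?`);
  }
  return result.output as T;
}

const allResults: ResultMap = new Map<string, PhaseResultSketch<unknown>>([
  ['structure', { phaseName: 'structure', output: { totalFiles: 3 } }],
  ['parse', { phaseName: 'parse', output: { allPaths: ['a.ts'] } }],
]);

// A phase that declares only `structure` cannot reach the parse output.
const visibleToMro = filterToDeclaredDeps(allResults, ['structure']);
const structureOut = getPhaseOutput<{ totalFiles: number }>(visibleToMro, 'structure');
console.log(structureOut.totalFiles);
```

Filtering at the runner turns an undeclared read into an immediate error, which is what lets the declared `deps` double as an accurate picture of the data flow.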