feat(ingestion): wire ScopeExtractor into parse-worker + processor (#920, RFC #909 Ring 2 PKG)#969
Conversation
, RFC #909 Ring 2 PKG) Plumbs the ScopeExtractor (#919) into the real parsing pipeline. `ParsedFile` artifacts now flow from workers to the parsing-processor without changing any legacy-DAG behavior. ## Shipped ### `gitnexus/src/core/ingestion/scope-extractor-bridge.ts` (new) - `extractParsedFile(provider, sourceText, filePath, onWarn?)` - Short-circuits (returns `undefined`) when the provider has not implemented `emitScopeCaptures`. True for every language today — this is the default no-op path. - Invokes the hook + `ScopeExtractor.extract`, returns a `ParsedFile`. - **Swallows exceptions on both sides.** Failures route through the optional `onWarn` callback (or `console.warn`) and return `undefined`. Scope-extraction errors NEVER break legacy parsing on the same file. - Standalone module (not nested in `parse-worker.ts`) so tests can import it directly without triggering the worker's top-level `parentPort!.on(...)`. ### `gitnexus/src/core/ingestion/workers/parse-worker.ts` - `ParseWorkerResult.parsedFiles: ParsedFile[]` added. - `processFileGroup` calls `extractParsedFile` AFTER tree parse, BEFORE legacy extraction. Worker provides an `onWarn` callback that routes bridge warnings through `parentPort.postMessage({ type: 'warning', message })`. - `mergeResult` includes `parsedFiles` in the sub-batch merge. - Initial + reset accumulator templates include `parsedFiles: []`. ### `gitnexus/src/core/ingestion/parsing-processor.ts` - `WorkerExtractedData.parsedFiles: ParsedFile[]` added. - Empty-result branch and the across-chunk aggregation both include `parsedFiles`. Aggregation is tolerant of workers that don't emit the field (older builds / partial rollouts). ### Ring 1 tweak: `emitScopeCaptures` sync return `readonly CaptureMatch[]` (was `Promise<readonly CaptureMatch[]>`). Tree-sitter and COBOL's regex tagger are both synchronous; no foreseeable need for async work inside this hook. Sync lets the already-sync worker pipeline invoke it inline without cascading `async` up through the batch driver + IPC handler. ## Tests (9 new; full suite 311/311) `gitnexus/test/unit/scope-resolution/parse-worker-scope-integration.test.ts`: - Not-migrated (2): undefined-returning hook · never-invokes-extractor - Migrated (3): happy path · argument threading · honors `shouldCreateScope` override - Error resilience (4): hook throws · extractor throws (no Module) · extractor throws (sibling overlap) · `onWarn` gets routed message with filePath + error body ## Verification - `tsc --noEmit` clean in both packages - `gitnexus-shared` build clean - 311/311 combined scope-resolution / shadow / model / flag suite - 9/9 new bridge tests ## What's NOT in this PR (still deferred to #921) - Actually using the `parsedFiles` — that's the finalize orchestrator. - `ModuleScopeIndex.byFilePath` materialization — belongs alongside the rest of the SemanticModel indexes in #921. ## Closes part of #909. Unblocks - #921 finalize-orchestrator — consumes `WorkerExtractedData.parsedFiles`
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
|
@claude Act as a senior reviewer for GitNexus. Your job is to determine whether this PR is production-ready for this repo, not to give a generic code review. You are reviewing a PR in the GitNexus monorepo:
Your task has 2 phases, in this exact order: PHASE 1 — DEFINE THE BAR PHASE 2 — REVIEW THE PR AGAINST THAT BAR CONTEXT TO LOAD FIRST
Additional context:
PRIMARY OBJECTIVE Do not optimize for completeness at the expense of signal. REVIEW RULES
For each finding, assign one severity:
REPO-SPECIFIC REVIEW CHECKLIST 1. Correctness & functional completenessCheck:
2. Code clarity & clean codeCheck:
3. Test coverage & change safetyEvaluate against TESTING.md:
4. PerformanceInspect for:
5. Operational riskCheck:
6. MaintainabilityCheck:
NON-NEGOTIABLES TO VERIFY
OUTPUT FORMAT Production-ready bar for this PRGive a concise repo-specific definition in 5–10 bullets. FindingsFor each finding, use this structure exactly: [BLOCKING|NON-BLOCKING|NIT] Short title
Only include findings that are real and supported.
What I could not fully verifyList any behavior you could not prove from the diff, tests, or available context. Final verdictChoose exactly one:
If verdict is not ready, list the blocking items in priority order. IMPORTANT REVIEW BEHAVIOR
You must also ensure we are on track with the RFC plan! Check the issue tickets involved! |
|
Claude finished @magyargergo's task in 3m 33s —— View job PR #969 — Senior Review: feat(ingestion): wire ScopeExtractor into parse-worker + processorTasks:
Production-ready bar for this PRBased on repo docs and the affected area (ingestion →
Findings1. Correctness & functional completenessNo blocking issues. Verified:
2. Code clarity & clean codeNo issues found in this area based on the diff and reviewed context. The separation of
3. Test coverage & change safetyNine tests in
The overlap test ( NON-BLOCKING — Worker-level
|
| Ticket | Status | Notes |
|---|---|---|
| #910, #911, #912 (Ring 2 SHARED deps) | Merged | Pre-requisites ✓ |
| #917 ClassRegistry/MethodRegistry/FieldRegistry | Merged (1bf9fb4) |
✓ |
| #919 ScopeExtractor driver | Merged (c6a291d) |
Direct dependency ✓ |
| #920 (this PR) | Open | Wires bridge into parse-worker |
| #924 REGISTRY_PRIMARY_ flag | Merged (eece634) |
✓ |
| #921 finalize-orchestrator | Blocked on this PR | Unblocked once merged ✓ |
Ring 2 PKG is on track. This PR correctly unblocks #921 by providing WorkerExtractedData.parsedFiles.
What I could not fully verify
tsc --noEmitresult: CI is still in progress. The PR description claims clean typecheck in both packages, but I cannot confirm from the diff alone. The type signatures look correct.- 311/311 test pass: Claimed by PR description; CI pending. The 9 new tests look correctly structured for Vitest.
console.warnfallback inextractParsedFile(scope-extractor-bridge.ts:51) whenonWarnis absent: correctly falls back, but there's no test assertingconsole.warnis called in that case (only theonWarnpath is tested). Minor omission — not a blocking concern.
Final verdict
✅ Ready to merge — no blocking issues.
The PR is a clean, well-isolated additive step. The bridge is correct, the legacy DAG is provably unaffected (zero emitScopeCaptures implementations today), IPC is wired into an existing handler, and the 9 tests pin all three invariants stated in the PR description. The one non-blocking gap (worker-level aggregation test) and one nit (sequential fallback gap documentation) are both appropriate to defer given the dormant state of the feature.
CI Report❌ Some checks failed Pipeline Status
Test Results
✅ All 6749 tests passed 97 test(s) skipped — expand for details
Code CoverageTests
📋 View full run · Generated by CI |
…bhigyanpatwari#920, RFC abhigyanpatwari#909 Ring 2 PKG) (abhigyanpatwari#969) Plumbs the ScopeExtractor (abhigyanpatwari#919) into the real parsing pipeline. `ParsedFile` artifacts now flow from workers to the parsing-processor without changing any legacy-DAG behavior. ## Shipped ### `gitnexus/src/core/ingestion/scope-extractor-bridge.ts` (new) - `extractParsedFile(provider, sourceText, filePath, onWarn?)` - Short-circuits (returns `undefined`) when the provider has not implemented `emitScopeCaptures`. True for every language today — this is the default no-op path. - Invokes the hook + `ScopeExtractor.extract`, returns a `ParsedFile`. - **Swallows exceptions on both sides.** Failures route through the optional `onWarn` callback (or `console.warn`) and return `undefined`. Scope-extraction errors NEVER break legacy parsing on the same file. - Standalone module (not nested in `parse-worker.ts`) so tests can import it directly without triggering the worker's top-level `parentPort!.on(...)`. ### `gitnexus/src/core/ingestion/workers/parse-worker.ts` - `ParseWorkerResult.parsedFiles: ParsedFile[]` added. - `processFileGroup` calls `extractParsedFile` AFTER tree parse, BEFORE legacy extraction. Worker provides an `onWarn` callback that routes bridge warnings through `parentPort.postMessage({ type: 'warning', message })`. - `mergeResult` includes `parsedFiles` in the sub-batch merge. - Initial + reset accumulator templates include `parsedFiles: []`. ### `gitnexus/src/core/ingestion/parsing-processor.ts` - `WorkerExtractedData.parsedFiles: ParsedFile[]` added. - Empty-result branch and the across-chunk aggregation both include `parsedFiles`. Aggregation is tolerant of workers that don't emit the field (older builds / partial rollouts). ### Ring 1 tweak: `emitScopeCaptures` sync return `readonly CaptureMatch[]` (was `Promise<readonly CaptureMatch[]>`). Tree-sitter and COBOL's regex tagger are both synchronous; no foreseeable need for async work inside this hook. Sync lets the already-sync worker pipeline invoke it inline without cascading `async` up through the batch driver + IPC handler. ## Tests (9 new; full suite 311/311) `gitnexus/test/unit/scope-resolution/parse-worker-scope-integration.test.ts`: - Not-migrated (2): undefined-returning hook · never-invokes-extractor - Migrated (3): happy path · argument threading · honors `shouldCreateScope` override - Error resilience (4): hook throws · extractor throws (no Module) · extractor throws (sibling overlap) · `onWarn` gets routed message with filePath + error body ## Verification - `tsc --noEmit` clean in both packages - `gitnexus-shared` build clean - 311/311 combined scope-resolution / shadow / model / flag suite - 9/9 new bridge tests ## What's NOT in this PR (still deferred to abhigyanpatwari#921) - Actually using the `parsedFiles` — that's the finalize orchestrator. - `ModuleScopeIndex.byFilePath` materialization — belongs alongside the rest of the SemanticModel indexes in abhigyanpatwari#921. ## Closes part of abhigyanpatwari#909. Unblocks - abhigyanpatwari#921 finalize-orchestrator — consumes `WorkerExtractedData.parsedFiles`
Closes #920. Plumbs the ScopeExtractor (#919) into the real parsing pipeline. `ParsedFile` artifacts now flow from workers to the parsing-processor without changing any legacy-DAG behavior.
What ships
`scope-extractor-bridge.ts` (new)
`parse-worker.ts`
`parsing-processor.ts`
Ring 1 tweak: `emitScopeCaptures` sync return
`readonly CaptureMatch[]` (was `Promise<readonly CaptureMatch[]>`). Tree-sitter and COBOL's regex tagger are both synchronous; no foreseeable async need. Sync lets the worker invoke it inline without cascading `async` through the batch pipeline + IPC handler. No consumers yet, so no breakage.
Design constraints honored
Tests (9 new; full suite 311/311)
`gitnexus/test/unit/scope-resolution/parse-worker-scope-integration.test.ts`:
Verification
What's deferred (not this PR)
Part of