[group/sync] Fix ManifestExtractor never called — config.links always produced 0 cross-links#827

Merged
magyargergo merged 6 commits into
abhigyanpatwari:mainfrom
jonasvanderhaegen-xve:fix/manifest-extractor-not-called-in-sync
Apr 15, 2026

@jonasvanderhaegen-xve

Contributor

What changed

Wire ManifestExtractor into syncGroup so links declared in group.yaml actually produce cross-links.

Files changed:

  • gitnexus/src/core/group/sync.ts — import + call ManifestExtractor
  • gitnexus/test/unit/group/sync.test.ts — new test covering manifest cross-link generation

Why

ManifestExtractor (extractors/manifest-extractor.ts) is fully implemented: it converts GroupManifestLink[] into provider/consumer StoredContract pairs and direct CrossLink entries with matchType: 'manifest'. However sync.ts never imported or called it.

config.links was parsed and validated by config-parser.ts but then silently dropped — syncGroup never accessed it.

This bug most visibly affects repos where HTTP clients use absolute URLs (e.g. PHP file_get_contents('https://api.example.com/...')). The HTTP extractor only emits providers (it normalises call URLs starting with /), so these consumer calls are invisible to auto-detection. The links block in group.yaml is the documented workaround — but it did nothing.

How to verify

# Add a links block to an existing group.yaml, then sync:
gitnexus group sync <your-group>
# Before this fix: 0 cross-links
# After this fix: N manifest cross-links (one per declared link, confidence 1.0)

Or run the new unit test:

cd gitnexus && npx vitest run test/unit/group/sync.test.ts

Design note

The manifest extraction is placed after the DB try/finally block rather than inside it. ManifestExtractor.resolveSymbol gracefully falls back to synthetic UIDs (manifest::<repo>::<contractId>) when dbExecutors is undefined or a pool is closed — cross-links are always generated regardless. This also makes the code testable without mocked DB pools.
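The fallback described in the design note can be sketched roughly as follows. This is a minimal illustration, not the code in manifest-extractor.ts: the `Executor` type, the `resolveUid` helper, and the placeholder query string are assumptions made for the example. Only the synthetic UID format `manifest::<repo>::<contractId>` comes from the PR description.

```typescript
// Hypothetical executor shape; the real type in gitnexus may differ.
type Executor = (query: string) => Promise<Array<{ uid: string }>>;

// Synthetic UID in the format quoted in the design note.
function manifestSymbolUid(repo: string, contractId: string): string {
  return `manifest::${repo}::${contractId}`;
}

// Try the repo's executor; fall back to the synthetic UID when there is
// no executor, the query throws (e.g. a closed pool), or nothing matches.
async function resolveUid(
  repo: string,
  contractId: string,
  dbExecutors?: Map<string, Executor>,
): Promise<string> {
  const executor = dbExecutors?.get(repo);
  if (!executor) return manifestSymbolUid(repo, contractId);
  try {
    // Placeholder query text; the real lookup query is not shown in this thread.
    const rows = await executor(`/* lookup for ${contractId} */`);
    return rows[0]?.uid ?? manifestSymbolUid(repo, contractId);
  } catch {
    return manifestSymbolUid(repo, contractId);
  }
}
```

Because every path returns a UID, a caller built on this pattern can always emit a cross-link, which is what makes the extraction testable without mocked DB pools.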

Related

Closes #826
Patch gist: https://gist.github.com/jonasvanderhaegen-xve/e2bf066d842f676aa16400e10ae5114f

Checklist

  • Unit tests pass (npx vitest run test/unit/group/sync.test.ts — 10/10)
  • New test covers the manifest cross-link path
  • No secrets or machine-specific paths committed
  • Typecheck clean on touched files (pre-existing gitnexus-shared workspace errors unrelated to this PR)

jonasvanderhaegen-xve added 2 commits April 14, 2026 21:54
ManifestExtractor was fully implemented in extractors/manifest-extractor.ts
but never imported or called in sync.ts. As a result, any links declared in
group.yaml were parsed and validated by config-parser.ts but silently dropped
— config.links was always an empty dead-end as far as syncGroup was concerned.

Changes:
- Import ManifestExtractor in sync.ts
- Call extractFromManifest(config.links, dbExecutors) inside the outer try
  block, after all repos are processed but before the finally closes the DB
  pools (symbol resolution via resolveSymbol requires open executors)
- Collect the resulting contracts into autoContracts and the cross-links into
  a separate manifestCrossLinks array
- Merge manifestCrossLinks into the final crossLinks alongside runExactMatch
  results

Without this fix, users who declare explicit service dependencies in
group.yaml links (the documented workaround for HTTP clients that use absolute
URLs and are invisible to the auto-extractors) get 0 cross-links regardless of
what they configure.

Add a unit test that asserts config.links entries produce contract pairs
and a manifest cross-link (matchType: 'manifest') via syncGroup.

Also refactors the manifest extraction call to sit outside the else/try
block so it runs regardless of extractorOverride arity — makes the code
testable without mocked DB pools and ensures links work when callers supply
a zero-arity override (e.g. in tests or programmatic usage).
@vercel

vercel Bot commented Apr 14, 2026

Someone is attempting to deploy a commit to the NexusCore Team on Vercel.

A member of the Team first needs to authorize it.

@xkonjin xkonjin left a comment

Code Review: Fix ManifestExtractor never called

This PR fixes a real functional gap: config.links were parsed but silently dropped because ManifestExtractor was implemented but never imported or called in the sync pipeline.

What's good

  • Clean placement after the DB try/finally block. Using synthetic UIDs as fallback when dbExecutors is unavailable is the right resilience pattern.
  • Test covers the happy path: manifest links produce cross-links with matchType: 'manifest' and confidence 1.0.

Potential issues

  1. Duplicate contracts when manifest links overlap with auto-extracted contracts. If an HTTP route extractor also produces a contract with the same ID, autoContracts will contain duplicates. runExactMatch and downstream consumers should be checked to see if they tolerate duplicate contract IDs.

  2. Error handling in manifestEx.extractFromManifest is opaque. If a link references a repo not present in config.repos, what happens? The test uses repos that exist in the config. Consider adding a test for a dangling manifest link (e.g., from: 'app/missing').

  3. Synthetic UID determinism. The comment says resolveSymbol falls back to synthetic UIDs when dbExecutors is undefined. If dbExecutors is defined but a specific repo pool is closed or missing, do all synthetic UIDs for the same symbol remain stable across runs? Non-deterministic UIDs would break cross-link stability.

Suggestions

  • Add a test verifying behavior when dbExecutors is undefined to ensure synthetic fallback is exercised.
  • Consider logging unmatched manifest links at warn level rather than silently skipping them.

Overall: solid bug fix, well-scoped. Just verify duplicate contract tolerance.

@xkonjin xkonjin left a comment

Code Review — PR #827

Overall this is a solid, well-documented fix for a real gap: ManifestExtractor was implemented but never wired into syncGroup, so config.links were silently dropped.

Bugs / edge cases

  • The new manifestCrossLinks variable is declared but not used beyond merging into crossLinks. That usage is fine, but double-check that downstream consumers of crossLinks (e.g., the registry writer or any exact-match dedup logic) won't create duplicate entries if runExactMatch already produced a contract that overlaps with a manifest-declared one. Consider adding a cross-link dedup step or a unit test for overlap.
  • dbExecutors is captured after the try/finally block. The comment says resolveSymbol falls back to synthetic UIDs when undefined, but verify that ManifestExtractor.extractFromManifest handles an empty Map correctly and never throws on undefined.

Security

  • No concerns. No new inputs are eval'd or executed. config.links is already validated at the config layer.

Test coverage

  • The new test covers the happy path well. I'd suggest one additional test: "manifest links still produce cross-links when dbExecutors is unavailable" (e.g., pass opts that causes all pools to fail or mock a closed pool). That would validate the fallback behavior described in the code comment.

Nit

  • Empty line added inside the finally block (after the missingRepos.push block) is harmless but unrelated to this change. Consider removing to keep the diff minimal.

Nice catch shipping this — it closes a real functional hole in the pipeline.

@github-actions

github-actions Bot commented Apr 14, 2026

Contributor

CI Report

All checks passed

Pipeline Status

| Stage | Status | Details |
|---|---|---|
| ✅ Typecheck | success | tsc --noEmit |
| ✅ Tests | success | unit tests, 3 platforms |
| ✅ E2E | success | gitnexus-web changes only |

Test Results

| Tests | Passed | Failed | Skipped | Duration |
|---|---|---|---|---|
| 6336 | 6239 | 0 | 97 | 234s |

✅ All 6239 tests passed

97 test(s) skipped:
  • Swift MethodExtractor > isTypeDeclaration > recognizes class_declaration
  • Swift MethodExtractor > isTypeDeclaration > recognizes protocol_declaration
  • Swift MethodExtractor > isTypeDeclaration > rejects import_declaration
  • Swift MethodExtractor > visibility > extracts public method
  • Swift MethodExtractor > visibility > extracts private method
  • Swift MethodExtractor > visibility > defaults to internal when no modifier
  • Swift MethodExtractor > protocol methods > marks protocol method as abstract
  • Swift MethodExtractor > static and class methods > detects static func as isStatic
  • Swift MethodExtractor > static and class methods > detects class func as isStatic
  • Swift MethodExtractor > parameters > extracts parameters with types and default values
  • Swift MethodExtractor > return type > extracts return type from -> annotation
  • Swift MethodExtractor > annotations > extracts @objc attribute
  • Swift MethodExtractor > isFinal > detects final func
  • Swift MethodExtractor > isFinal > is false when not final
  • Swift MethodExtractor > isAsync > detects async func
  • Swift MethodExtractor > isOverride > detects override method
  • buildTypeEnv > constructor inference (Tier 1 fallback) > lookupClassByName regression coverage > Swift lookupClassByName regression coverage > Swift cross-file constructor inference uses lookupClassByName
  • buildTypeEnv > constructor inference (Tier 1 fallback) > lookupClassByName regression coverage > Swift lookupClassByName regression coverage > Swift explicit init inference uses lookupClassByName
  • buildTypeEnv > constructor inference (Tier 1 fallback) > lookupClassByName regression coverage > Swift lookupClassByName regression coverage > Swift cross-file constructor inference does not bind plain functions
  • buildTypeEnv > known limitations (documented skip tests) > Ruby block parameter: users.each { |user| } — closure param inference, different feature
  • Swift constructor-inferred type resolution > detects User and Repo classes, both with save methods
  • Swift constructor-inferred type resolution > resolves user.save() to Models/User.swift via constructor-inferred type
  • Swift constructor-inferred type resolution > resolves repo.save() to Models/Repo.swift via constructor-inferred type
  • Swift constructor-inferred type resolution > emits exactly 2 save() CALLS edges (one per receiver type)
  • Swift self resolution > detects User and Repo classes, each with a save function
  • Swift self resolution > resolves self.save() inside User.process to User.save, not Repo.save
  • Swift parent resolution > detects BaseModel and User classes plus Serializable protocol
  • Swift parent resolution > emits EXTENDS edge: User → BaseModel
  • Swift parent resolution > emits IMPLEMENTS edge: User → Serializable (protocol conformance)
  • Swift cross-file User.init() inference > resolves user.save() via User.init(name:) inference
  • Swift cross-file User.init() inference > resolves user.greet() via User.init(name:) inference
  • Swift return type inference > detects User class and getUser function
  • Swift return type inference > detects save function on User (Swift class methods are Function nodes)
  • Swift return type inference > resolves user.save() to User#save via return type of getUser() -> User
  • Swift return-type inference via function return type > resolves user.save() to User#save via return type of getUser()
  • Swift return-type inference via function return type > user.save() does NOT resolve to Repo#save
  • Swift return-type inference via function return type > resolves repo.save() to Repo#save via return type of getRepo()
  • Swift implicit imports (cross-file visibility) > detects UserService class in Models.swift
  • Swift implicit imports (cross-file visibility) > resolves UserService() constructor call across files (no explicit import)
  • Swift implicit imports (cross-file visibility) > resolves service.fetchUser() member call across files
  • Swift implicit imports (cross-file visibility) > creates IMPORTS edges between files in the same module
  • Swift extension deduplication > detects Product class
  • Swift extension deduplication > resolves Product() constructor despite extension creating duplicate class node
  • Swift extension deduplication > resolves product.save() to Product.swift (primary definition)
  • Swift constructor call fallback (no new keyword) > resolves OCRService() as constructor call across files
  • Swift constructor call fallback (no new keyword) > resolves ocr.recognize() member call via constructor-inferred type
  • Swift export visibility (internal vs private) > resolves PublicService() constructor across files
  • Swift export visibility (internal vs private) > resolves internalHelper() across files (internal = module-scoped)
  • Swift if let / guard let binding resolution > detects User and Repo classes
  • Swift if let / guard let binding resolution > resolves user.save() inside if-let to User#save
  • Swift if let / guard let binding resolution > resolves repo.save() inside guard-let to Repo#save
  • Swift if let / guard let binding resolution > user.save() in if-let does NOT resolve to Repo#save
  • Swift await / try expression unwrapping > resolves user.save() via await fetchUser() return type
  • Swift await / try expression unwrapping > resolves repo.save() via try parseRepo() return type
  • Swift await / try expression unwrapping > detects fetchUser and parseRepo as functions
  • Swift for-in loop element type inference > detects User and Repo classes
  • Swift for-in loop element type inference > creates implicit import edges between files
  • Swift field-type resolution > detects classes and their properties
  • Swift field-type resolution > emits HAS_PROPERTY edges from class to field
  • Swift field-type resolution > resolves field-chain call user.address.save() → Address#save
  • Swift field-type resolution > emits ACCESSES edges for field reads in chains
  • Swift field-type resolution > populates field metadata (visibility, declaredType) on Property nodes
  • Swift call-result binding > resolves call-result-bound method call user.save() → User#save
  • Swift call-result binding > getUser() is present as a defined function
  • Swift call-result binding > emits processUser -> getUser CALLS edge for let-assigned free function call
  • Swift method enrichment > detects Animal protocol and Dog class
  • Swift method enrichment > emits IMPLEMENTS edge Dog -> Animal
  • Swift method enrichment > emits HAS_METHOD edges for Dog methods
  • Swift method enrichment > marks protocol Animal.speak as isAbstract
  • Swift method enrichment > marks Dog.speak as NOT isAbstract
  • Swift method enrichment > marks breathe as isFinal
  • Swift method enrichment > marks classify as isStatic
  • Swift method enrichment > captures @objc annotation on breathe
  • Swift method enrichment > populates parameterTypes for classify(_ name: String)
  • Swift method enrichment > records parameterCount for classify
  • Swift method enrichment > records returnType for speak
  • Swift method enrichment > resolves dog.speak() CALLS edge
  • Swift method enrichment > resolves Dog.classify("dog") CALLS edge
  • Swift abstract dispatch > detects Repository protocol and SqlRepository class
  • Swift abstract dispatch > emits IMPLEMENTS edge SqlRepository -> Repository
  • Swift abstract dispatch > emits HAS_METHOD edges for Repository.find and Repository.save
  • Swift abstract dispatch > emits HAS_METHOD edges for SqlRepository.find and SqlRepository.save
  • Swift abstract dispatch > marks base Repository.find as isAbstract
  • Swift abstract dispatch > marks base Repository.save as isAbstract
  • Swift abstract dispatch > marks concrete SqlRepository.find as NOT isAbstract
  • Swift abstract dispatch > resolves repo.find(id: 42) CALLS edge
  • Swift abstract dispatch > resolves repo.save(entity: user) CALLS edge
  • Swift abstract dispatch > populates parameterTypes for Repository.find
  • Swift abstract dispatch > populates parameterTypes for Repository.save
  • Swift abstract dispatch > records returnType for SqlRepository.find
  • Swift abstract dispatch > emits METHOD_IMPLEMENTS edges from SqlRepository methods → Repository protocol methods
  • Swift overloaded method disambiguation > detects 2 distinct find Method nodes on SqlRepository
  • Swift overloaded method disambiguation > emits METHOD_IMPLEMENTS edges for both find overloads
  • Swift overloaded method disambiguation > emits METHOD_IMPLEMENTS edge for save
  • Swift overloaded method disambiguation > emits exactly 3 METHOD_IMPLEMENTS edges total
  • Swift Child extends Parent — inherited method resolution (SM-9) > detects Parent and Child classes
  • Swift Child extends Parent — inherited method resolution (SM-9) > resolves c.parentMethod() to Parent.parentMethod via first-wins MRO walk

Code Coverage

Tests

| Metric | Coverage | Covered | Base | Delta | Status |
|---|---|---|---|---|---|
| Statements | 72.89% | 16867/23140 | 72.87% | 📈 +0.0 | 🟢 |
| Branches | 61.73% | 10725/17372 | 61.76% | 📉 -0.0 | 🔴 |
| Functions | 78.18% | 1577/2017 | 78.2% | 📉 -0.0 | 🔴 |
| Lines | 75.28% | 15316/20345 | 75.27% | 📈 +0.0 | 🟢 |


@magyargergo

Collaborator

@claude try to find bugs, performance issues, clean code violations.

@github-actions

github-actions Bot commented Apr 15, 2026

Contributor

Claude finished @magyargergo's task in 2m 24s


Code Review — PR #827

  • Read sync.ts and sync.test.ts changes
  • Read ManifestExtractor source and matching.ts
  • Analyze for bugs, performance issues, clean code violations

Bugs

1. Duplicate cross-links for every manifest link (High)

sync.ts lines 173, 182–183

autoContracts.push(...manifestResult.contracts);   // ← manifest contracts injected here
// ...
const { matched, unmatched } = runExactMatch(autoContracts);          // ← also sees them
const crossLinks: CrossLink[] = [...matched, ...manifestCrossLinks];  // ← both included

ManifestExtractor always produces a paired provider + consumer for each manifest link (same contractId, different repos). runExactMatch then picks up those pairs and generates an additional matchType: 'exact' cross-link for each of them. The final crossLinks array therefore contains two entries per manifest link: one 'exact' (from runExactMatch) and one 'manifest' (from manifestCrossLinks).

The new test at line 234 only checks filter((cl) => cl.matchType === 'manifest'), so it never asserts the total length and misses the duplicate. If you add expect(result.crossLinks).toHaveLength(1) to the test you'll see it fail with 2.

Fix: Exclude manifest contracts from the runExactMatch input — keep them in allContracts for storage but don't feed them to the matcher:

// separate auto-detected from manifest contracts before matching
const { matched, unmatched } = runExactMatch(autoContracts.filter(c => c.meta?.source !== 'manifest'));

Or keep track of the manifest contracts separately instead of merging them into autoContracts before matching.
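To make the duplication concrete, here is a small self-contained sketch. The `Contract` and `Link` shapes, the `source` field, and the toy `exactMatch` function are all hypothetical stand-ins; the real `runExactMatch` in matching.ts is not reproduced in this thread and may behave differently in detail.

```typescript
// Hypothetical, simplified shapes for illustration only.
interface Contract {
  contractId: string;
  repo: string;
  role: 'provider' | 'consumer';
  source: 'auto' | 'manifest';
}
interface Link {
  from: string;
  to: string;
  contractId: string;
  matchType: 'exact' | 'manifest';
}

// Toy stand-in for runExactMatch: pairs each consumer with every
// provider in another repo that shares the same contractId.
function exactMatch(contracts: Contract[]): Link[] {
  const links: Link[] = [];
  const providers = contracts.filter((c) => c.role === 'provider');
  for (const consumer of contracts.filter((c) => c.role === 'consumer')) {
    for (const provider of providers) {
      if (provider.contractId === consumer.contractId && provider.repo !== consumer.repo) {
        links.push({ from: consumer.repo, to: provider.repo, contractId: consumer.contractId, matchType: 'exact' });
      }
    }
  }
  return links;
}

// One declared manifest link always yields a provider/consumer pair.
const manifestContracts: Contract[] = [
  { contractId: 'http::GET::/users', repo: 'app/api', role: 'provider', source: 'manifest' },
  { contractId: 'http::GET::/users', repo: 'app/web', role: 'consumer', source: 'manifest' },
];
const manifestLinks: Link[] = [
  { from: 'app/web', to: 'app/api', contractId: 'http::GET::/users', matchType: 'manifest' },
];

// Feeding the manifest pair to the matcher adds a second, 'exact' link.
const naive = [...exactMatch(manifestContracts), ...manifestLinks];
// Keeping manifest contracts out of the matcher leaves only the declared link.
const separated = [
  ...exactMatch(manifestContracts.filter((c) => c.source !== 'manifest')),
  ...manifestLinks,
];
```

In this toy setup `naive` holds two links for one declared dependency while `separated` holds one, which mirrors the failure the total-length assertion would expose.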


2. Stale executor references passed after pool close (Medium)

sync.ts lines 157–172

} finally {
  for (const id of [...new Set(openPoolIds)]) {
    await closeLbug(id).catch(() => {});   // ← all pools closed here
  }
}
// ...
const manifestEx = new ManifestExtractor();
const manifestResult = await manifestEx.extractFromManifest(config.links, dbExecutors);  // ← stale executors

dbExecutors is populated inside the try block, but all pools are closed in the finally. The map object still exists and holds references to executor functions that now target closed pools. ManifestExtractor.resolveSymbol calls executor(...), which will fail, get caught, and log a console.warn for every manifest link × every repo pair. This silently degrades to synthetic UIDs (which is safe), but generates spurious warnings and wastes one failed DB round-trip per link.

The comment says "resolveSymbol falls back to synthetic UIDs when dbExecutors is undefined or a pool is closed" — the undefined path is clean, but the closed-pool path goes through exception handling. Passing undefined explicitly when pools are already closed would avoid the warnings:

// Option A: clear the map so resolveSymbol sees no executor
dbExecutors?.clear();
// or
// Option B: pass undefined
await manifestEx.extractFromManifest(config.links, undefined);

Unless there is a concrete future plan to run extractFromManifest inside the try block (before pool close), the current design guarantees the symbol resolution always fails.


Performance

3. Sequential symbol resolution in ManifestExtractor.extractFromManifest

manifest-extractor.ts lines 85–132

The for loop makes two await resolveSymbol(...) calls per link sequentially. For a config with 20 links, that's 40 sequential round-trips. Both resolutions for a single link are independent:

// current — sequential
const providerSymbol = await this.resolveSymbol(providerRepo, link, dbExecutors);
const consumerSymbol = await this.resolveSymbol(consumerRepo, link, dbExecutors);

// better — parallel
const [providerSymbol, consumerSymbol] = await Promise.all([
  this.resolveSymbol(providerRepo, link, dbExecutors),
  this.resolveSymbol(consumerRepo, link, dbExecutors),
]);

This halves DB resolution time at zero complexity cost.


Clean Code

4. allContracts is an unnecessary alias

sync.ts line 184

const allContracts: StoredContract[] = autoContracts;

This is a pure reference alias: it's the same array object. Using two names for the same thing throughout lines 184–201 makes readers wonder if allContracts is supposed to contain something autoContracts doesn't (e.g., "does it include manifest contracts too?"). Just use autoContracts directly.

5. Test missing total length assertion

sync.test.ts line 233

The manifest test only checks filtered cross-links, not the full array:

const manifestLinks = result.crossLinks.filter((cl) => cl.matchType === 'manifest');
expect(manifestLinks).toHaveLength(1);
// missing: expect(result.crossLinks).toHaveLength(1)

This allows the duplicate cross-link bug (#1 above) to go undetected. Adding the total-length assertion would have caught it.


Summary

| | Issue | Severity |
|---|---|---|
| Bug | Manifest contracts fed into runExactMatch → duplicate cross-links per manifest link | High |
| Bug | Stale pool executors passed post-close → spurious warn logs + failed DB calls | Medium |
| Perf | Sequential resolveSymbol per link, parallelizable with Promise.all | Medium |
| Clean | allContracts alias is redundant | Low |
| Test | Missing toHaveLength(1) on total crossLinks hides the duplicate bug | Low |

The core fix (wiring ManifestExtractor) is correct and the placement decision is sound. Issue #1 is the most important to address before merging.

Also removes the stray empty line in the finally block (noted in review).

Addresses review feedback on PR #827:

1. Dedupe cross-links. Manifest contracts participate in runExactMatch, so a
   manifest-declared link also emitted a duplicate matchType:'exact' CrossLink
   for the same endpoint pair. Dedupe by (from, to, type, contractId) and
   prefer manifest (operator-declared intent).

2. Warn on dangling repos. When a manifest link references a repo not in
   config.repos, log a warning. Synthetic UIDs keep the cross-link
   deterministic, but the operator probably meant something else.

3. Tests:
   - Assert no duplicate 'exact' CrossLink is emitted alongside the manifest one.
   - Assert synthetic UID format when no DB executors are available.
   - New test: dangling manifest repo still produces a cross-link + logs a warning.
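
A dedupe step keyed on (from, to, type, contractId) that prefers manifest links could look roughly like this. It is a sketch under assumptions: the `CrossLink` fields shown are inferred from the commit message, and the body is not the actual `dedupeCrossLinks` in sync.ts. Preference is expressed purely by ordering, with the first occurrence of a key winning.

```typescript
// Hypothetical CrossLink shape, limited to the fields named in the commit message.
interface CrossLink {
  from: string;
  to: string;
  type: string;
  contractId: string;
  matchType: 'exact' | 'manifest';
}

// Keep the first link seen for each (from, to, type, contractId) key.
function dedupeCrossLinks(links: CrossLink[]): CrossLink[] {
  const seen = new Set<string>();
  const result: CrossLink[] = [];
  for (const link of links) {
    // JSON.stringify on a tuple avoids ambiguity from separators inside values.
    const key = JSON.stringify([link.from, link.to, link.type, link.contractId]);
    if (seen.has(key)) continue;
    seen.add(key);
    result.push(link);
  }
  return result;
}

const manifestCrossLinks: CrossLink[] = [
  { from: 'app/web', to: 'app/api', type: 'http', contractId: 'http::GET::/users', matchType: 'manifest' },
];
const matched: CrossLink[] = [
  { from: 'app/web', to: 'app/api', type: 'http', contractId: 'http::GET::/users', matchType: 'exact' },
];

// Manifest links go first so they win over the equivalent 'exact' link.
const crossLinks = dedupeCrossLinks([...manifestCrossLinks, ...matched]);
```

Putting the manifest links first in the input is what encodes "operator-declared intent wins"; reversing the spread order would keep the 'exact' entry instead.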
Previous implementation ran 2N sequential Cypher round-trips per
manifest (one for provider side, one for consumer, awaited in-order
per link). For manifests with tens of links this dominated syncGroup
latency in groups with many declared cross-repo contracts.

Changes:
- Resolve provider + consumer in parallel per link (Promise.all).
- Resolve all links in parallel (outer Promise.all over links.map).
  Each repo's executor pool is independent, so cross-repo fan-out
  scales with the number of distinct repos in the manifest.
- Memoize by (repo, type, contract). Manifests frequently declare
  the same contract from both directions or across sibling groups,
  so duplicate triples now hit the DB once instead of 2× per link.

Correctness:
- resolveSymbol is a pure LIMIT 1 read, so caching + concurrent
  invocation is safe.
- Iteration order over links is preserved in the final
  contracts / crossLinks arrays — result shape is identical.

Test:
- New test asserts that two links sharing (repo, type, contract)
  produce exactly one DB call per distinct repo-tuple.
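
The parallel, memoized resolution described in this commit message can be sketched as below. The `ManifestLink` fields, the `Resolver` signature, and the `resolveAll` helper are assumptions for illustration; the actual code in manifest-extractor.ts is not shown in this thread. The sketch caches the in-flight promise rather than the resolved value, so concurrent lookups for the same (repo, type, contract) triple share one call.

```typescript
// Hypothetical shapes; field names are illustrative, not taken from gitnexus.
interface ManifestLink {
  provider: string;
  consumer: string;
  type: string;
  contract: string;
}
type Resolver = (repo: string, type: string, contract: string) => Promise<string | null>;

async function resolveAll(links: ManifestLink[], resolve: Resolver) {
  // Cache promises so duplicate triples reuse the same in-flight lookup.
  const cache = new Map<string, Promise<string | null>>();
  const cached = (repo: string, type: string, contract: string): Promise<string | null> => {
    const key = JSON.stringify([repo, type, contract]);
    let pending = cache.get(key);
    if (!pending) {
      pending = resolve(repo, type, contract);
      cache.set(key, pending);
    }
    return pending;
  };

  // Outer Promise.all fans out over links; inner one resolves both sides together.
  // Promise.all preserves input order, so results line up with `links`.
  return Promise.all(
    links.map(async (link) => {
      const [providerSymbol, consumerSymbol] = await Promise.all([
        cached(link.provider, link.type, link.contract),
        cached(link.consumer, link.type, link.contract),
      ]);
      return { link, providerSymbol, consumerSymbol };
    }),
  );
}
```

Caching is only safe here because the lookup is a pure read, as the commit message notes; a resolver with side effects would need a different approach.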
@magyargergo

Collaborator

@claude are we ready for production? review other comments if we delivered them

@github-actions

github-actions Bot commented Apr 15, 2026

Contributor

Claude finished @magyargergo's task in 1m 39s


Production Readiness Check

  • Reading current state of changed files
  • Checking each issue from prior reviews (Claude + xkonjin)
  • Final production-readiness verdict

Issue Tracker — Prior Review Feedback

The branch includes 5 commits since the initial fix. Here's the status of every raised issue:

| # | Severity | Issue | Status |
|---|---|---|---|
| 1 | High | Duplicate cross-links: manifest contracts fed into runExactMatch → both 'exact' + 'manifest' entries | Fixed. dedupeCrossLinks() added; manifest links take precedence ([...manifestCrossLinks, ...matched]) |
| 2 | Medium | Stale executor refs passed to extractFromManifest after pool close | ⚠️ Partially addressed. resolveSymbol catches executor errors and falls back to synthetic UIDs gracefully, but still logs spurious console.warn per failed round-trip. dbExecutors map is never cleared before being passed |
| 3 | Medium (perf) | Sequential resolveSymbol per link | Fixed. Promise.all + memoization (resolveCache) added in manifest-extractor.ts:91–113 |
| 4 | Low | allContracts alias is a redundant reference to autoContracts | Not fixed. sync.ts:218 still has const allContracts: StoredContract[] = autoContracts |
| 5 | Low (test) | Missing toHaveLength(1) on total crossLinks | Fixed. sync.test.ts:259–263 now explicitly asserts no matchType:'exact' duplicate and result.crossLinks.toHaveLength(1) |
| 6 | xkonjin | Dangling manifest link handling | Fixed. sync.ts:190–198 warns with repo names and still generates synthetic-UID cross-links |
| 7 | xkonjin | dbExecutors undefined coverage | Fixed. Test uses extractorOverride path (leaving dbExecutors = undefined), and sync.test.ts:253–255 explicitly validates synthetic UID format |
| 8 | xkonjin | Synthetic UID determinism | Addressed. manifestSymbolUid (format manifest::<repo>::<contractId>) is deterministic; documented in code comments |

Remaining Concerns

1. CI Typecheck Failure (Blocker)
The pipeline shows ❌ Typecheck: failure (tsc --noEmit). The PR description attributes this to "pre-existing gitnexus-shared workspace errors unrelated to this PR." Regardless of origin, CI must be green before merging to main. This needs to be confirmed as a pre-existing failure on main (run tsc --noEmit on main to verify) or fixed.

2. Stale dbExecutors passed post-close (Medium — functional but noisy)
sync.ts:201 calls manifestEx.extractFromManifest(config.links, dbExecutors) after all pools are closed in the finally block. When real DB executors exist, resolveSymbol will attempt DB queries on closed pools, fail, catch the error, log a console.warn, and fall back to synthetic UIDs. The fallback is safe but produces warn noise. A one-liner fix: dbExecutors?.clear() before calling extractFromManifest (or pass undefined). Not a correctness issue, but worth cleaning up.

3. allContracts alias (Low — cosmetic)
sync.ts:218: const allContracts: StoredContract[] = autoContracts is still an alias to the same array. It creates reader confusion about whether the two names have different contents. Minor but easy to fix.


Verdict

Not quite ready to merge — one CI blocker remains.

The critical duplicate cross-link bug (#1) is fixed, performance is improved, and test coverage is solid. The core fix is production-quality. But the typecheck CI failure must be resolved (or confirmed pre-existing on main) before merging. The stale executor issue (#2) is safe to ship but creates log noise that will show up in production.

To get this to green:

  1. Confirm/fix the typecheck CI failure on this branch
  2. Optionally: dbExecutors?.clear() before extractFromManifest to silence spurious warnings
  3. Optionally: remove the allContracts alias

@magyargergo

Collaborator

@claude Before we decide whether this PR is ready for production, we should first define what “production-ready” means in this context.

I think the review criteria should be something along these lines:

  • correctness and functional completeness
  • code clarity and adherence to clean code principles
  • test coverage and confidence in change safety
  • performance characteristics and any potential bottlenecks
  • operational risk, including failure modes and rollback safety
  • maintainability and whether the structure supports future contributions

Once those criteria are explicit, it will be much easier to assess the PR in a consistent and meaningful way.

@github-actions

github-actions Bot commented Apr 15, 2026

Contributor

Claude finished @magyargergo's task in 2m 11s


Production-Readiness Assessment — PR #827

  • Define production-readiness criteria
  • Read current state of changed files
  • Assess: Correctness & functional completeness
  • Assess: Code clarity & clean code
  • Assess: Test coverage & change safety
  • Assess: Performance
  • Assess: Operational risk & rollback safety
  • Assess: Maintainability
  • Render final verdict

Criteria Definition

Before scoring: what "production-ready" means for this PR across six dimensions.

| Dimension | Definition |
|---|---|
| Correctness | The change does what it claims, edge cases are handled, no regressions introduced |
| Code clarity | Logic is readable, names are unambiguous, comments justify non-obvious decisions |
| Test coverage | New paths are exercised, assertions are tight, CI is green |
| Performance | No unnecessary latency introduced; hot paths are efficient |
| Operational risk | Failure modes are safe, rollback is feasible, production noise is bounded |
| Maintainability | Future contributors can extend the code without traps or undocumented invariants |

1. Correctness & Functional Completeness — ✅ Ready

Core bug is fixed. ManifestExtractor is now called (sync.ts:200–201), config.links are no longer silently dropped.

Duplicate cross-link bug (raised in prior review) is fixed. dedupeCrossLinks() at sync.ts:69–79 merges manifest and exact-match results with manifest-first precedence ([...manifestCrossLinks, ...matched] at line 217). The test at sync.test.ts:259–263 explicitly asserts no duplicate matchType:'exact' entry and toHaveLength(1).
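To make the precedence rule concrete, here is a minimal sketch of first-wins deduplication keyed on (from, to, type, contractId). The `CrossLink` shape and its field names are assumptions for illustration and may not match the real type in gitnexus; only the key tuple and the manifest-first ordering come from the review above.

```typescript
// Hypothetical shapes: the real CrossLink type in gitnexus may carry more fields.
type MatchType = "manifest" | "exact";

interface CrossLink {
  from: string; // consumer symbol UID (assumed field name)
  to: string; // provider symbol UID (assumed field name)
  type: string; // contract type, e.g. "http"
  contractId: string;
  matchType: MatchType;
}

// Keep the first link seen for each (from, to, type, contractId) key.
function dedupeCrossLinks(links: CrossLink[]): CrossLink[] {
  const seen = new Map<string, CrossLink>();
  for (const link of links) {
    const key = JSON.stringify([link.from, link.to, link.type, link.contractId]);
    if (!seen.has(key)) seen.set(key, link);
  }
  return [...seen.values()];
}

const declared: CrossLink = {
  from: "manifest::web::http::GET::/users",
  to: "manifest::api::http::GET::/users",
  type: "http",
  contractId: "http::GET::/users",
  matchType: "manifest",
};
const manifestCrossLinks: CrossLink[] = [declared];
const matched: CrossLink[] = [{ ...declared, matchType: "exact" }];

// Manifest entries come first, so they win over the duplicate exact match.
const crossLinks = dedupeCrossLinks([...manifestCrossLinks, ...matched]);
```

Because a `Map` preserves insertion order, the output keeps the order of the first occurrences, which is why placing the manifest links first in the spread is enough to give them precedence.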

Dangling repos are handled. A warn loop at sync.ts:190–198 fires for repos not in config.repos, and synthetic UIDs are still generated, so cross-links are produced regardless. Validated by sync.test.ts:266–312.
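A rough sketch of what such a dangling-repo check could look like follows. The `ManifestLink` shape, its field names, and the warning text are hypothetical; the only behaviour taken from the review is that a repo missing from the group's repo list produces a warning rather than an error.

```typescript
// Hypothetical, simplified shape for illustration only.
interface ManifestLink {
  from: string; // consumer repo name (assumed field name)
  to: string; // provider repo name (assumed field name)
  contract: string;
}

// Return one warning message per reference to a repo the group does not declare.
function findDanglingRepos(links: ManifestLink[], knownRepos: Set<string>): string[] {
  const warnings: string[] = [];
  for (const link of links) {
    for (const repo of [link.from, link.to]) {
      if (!knownRepos.has(repo)) {
        warnings.push(`manifest link references unknown repo "${repo}"`);
      }
    }
  }
  return warnings;
}

const danglingWarnings = findDanglingRepos(
  [{ from: "web", to: "billing", contract: "/invoices" }],
  new Set(["web", "api"]),
);
// Warn only; the cross-link itself is still generated elsewhere.
danglingWarnings.forEach((message) => console.warn(message));
```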

One known correctness gap — stale executor map (Medium). After finally closes all pools (sync.ts:173–177), the dbExecutors map still holds references to closed executors. resolveSymbol tries to use them, catches the exception, warns, and falls back to synthetic UIDs (manifest-extractor.ts:281–290). Behavior is correct (always produces cross-links), but it means every manifest link triggers a failed DB call followed by a console.warn in any real (non-test) invocation. This degrades observability signal. One-liner fix: dbExecutors?.clear() before line 201.
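The effect of the stale map can be sketched as below. The `Executor` signature and the `resolveOrSynthesize` helper are hypothetical stand-ins rather than the real `resolveSymbol`; the synthetic UID format `manifest::<repo>::<contractId>` is the one given in the PR description.

```typescript
// Hypothetical executor signature: runs a lookup and returns a symbol UID, or null.
type Executor = (query: string) => Promise<string | null>;

function manifestSymbolUid(repo: string, contractId: string): string {
  return `manifest::${repo}::${contractId}`;
}

async function resolveOrSynthesize(
  repo: string,
  contractId: string,
  dbExecutors: Map<string, Executor> | undefined,
  warn: (message: string) => void,
): Promise<string> {
  const executor = dbExecutors?.get(repo);
  if (executor) {
    try {
      const uid = await executor(contractId);
      if (uid) return uid;
    } catch {
      // A closed pool lands here: warn, then fall through to the synthetic UID.
      warn(`symbol resolution failed for ${repo}`);
    }
  }
  return manifestSymbolUid(repo, contractId);
}

async function demo(): Promise<{ staleWarnings: number; clearedWarnings: number; uid: string }> {
  const closedPool: Executor = async () => {
    throw new Error("pool closed");
  };
  const dbExecutors = new Map<string, Executor>([["api", closedPool]]);

  const stale: string[] = [];
  await resolveOrSynthesize("api", "http::GET::/users", dbExecutors, (m) => stale.push(m));

  dbExecutors.clear(); // the suggested one-liner: drop references to closed executors
  const cleared: string[] = [];
  const uid = await resolveOrSynthesize("api", "http::GET::/users", dbExecutors, (m) => cleared.push(m));

  return { staleWarnings: stale.length, clearedWarnings: cleared.length, uid };
}
```

In this sketch both calls return the same synthetic UID; the only difference after `clear()` is that no failed lookup is attempted and no warning is logged.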

Undocumented behavior — custom contract type (Low). buildContractId handles custom (manifest-extractor.ts:337), but resolveSymbol silently returns null for it without a comment. This is fine today, but leaves a trap for whoever adds custom resolution later.

http::*:: wildcard is documented-future. Method-agnostic manifest links produce http::*::/path contract IDs that won't match http::GET::/path from auto-extractors in runExactMatch. The behavior is correct (manifest cross-link is still emitted), and it's documented in the buildContractId JSDoc. No action needed now, but it's worth tracking.
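A small illustration of why the wildcard ID does not pair with a concrete method, assuming IDs are compared as plain strings during exact matching. The `httpContractId` helper is hypothetical; its output layout follows the `http::GET::/path` and `http::*::/path` examples quoted above.

```typescript
// Assumed ID layout, based on the `http::GET::/path` and `http::*::/path` examples above.
function httpContractId(path: string, method?: string): string {
  const verb = method ? method.toUpperCase() : "*";
  return `http::${verb}::${path}`;
}

const fromManifest: string = httpContractId("/users"); // method-agnostic manifest link
const fromExtractor: string = httpContractId("/users", "get"); // auto-extracted route

// String equality: the wildcard ID never equals a concrete-method ID.
const exactMatch = fromManifest === fromExtractor;
```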


2. Code Clarity & Clean Code — ⚠️ Minor gap

Good: dedupeCrossLinks is well-named with a clear docstring on precedence. manifestSymbolUid has a thorough JSDoc explaining why synthetic rather than empty UIDs. buildContractId uses TypeScript's never exhaustiveness guard so adding a new ContractType is a compile error.
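For readers unfamiliar with the `never` exhaustiveness pattern mentioned here, a generic sketch follows. The union members and the `describeContract` function are illustrative assumptions and are not the actual `buildContractId` implementation.

```typescript
// Hypothetical union; the real ContractType may list different members.
type ContractType = "http" | "grpc" | "lib" | "custom";

function describeContract(type: ContractType): string {
  switch (type) {
    case "http":
      return "HTTP route";
    case "grpc":
      return "gRPC method";
    case "lib":
      return "library export";
    case "custom":
      return "custom contract";
    default: {
      // If a new member is added to ContractType without a case above,
      // `type` is no longer `never` here and this assignment fails to compile.
      const unreachable: never = type;
      throw new Error(`unhandled contract type: ${String(unreachable)}`);
    }
  }
}
```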

Remaining issue — allContracts alias (sync.ts:218). This has been flagged twice in prior reviews and is still unaddressed:

const allContracts: StoredContract[] = autoContracts;  // ← same array object, different name

The alias makes readers wonder whether allContracts is supposed to contain something autoContracts doesn't (e.g. manifest + auto). This is low-severity but causes genuine reader confusion and is trivial to remove.


3. Test Coverage & Change Safety — ✅ Ready (with one CI mystery)

Coverage delta is positive across all metrics (+0.04% statements, +0.01% branches, +0.05% functions, +0.03% lines). The increment is small because the change wires existing code, not new branches.

New test surface is solid:

  • sync.test.ts:210–264 — manifest happy path: asserts contract count, UID format, matchType:'manifest', no duplicate 'exact' entry, total length
  • sync.test.ts:266–312 — dangling repo: asserts warning fired and synthetic UID generated
  • manifest-extractor.test.ts additions (30 new assertions) cover: gRPC exact resolution, lib exact resolution, HTTP path normalization variants, METHOD prefix stripping, parameterised paths, method casing canonicalization, memoization

CI status requires clarification. The latest CI comment shows ❌ Tests: failure yet reports 6239 passed / 0 failed / 97 skipped. The 97 skipped entries are all Swift tests — if the CI runner on at least one platform skips an entire test file (e.g. due to a missing Swift toolchain), some CI configurations report that as a job failure. This should be confirmed: if it's a pre-existing infra constraint unrelated to this PR, it shouldn't block the merge, but it should be acknowledged explicitly rather than assumed.


4. Performance — ✅ Ready

Both performance issues raised in prior reviews are fixed:

  • Sequential resolveSymbol → Promise.all (manifest-extractor.ts:107–110): provider and consumer resolution are now parallel, halving per-link DB latency.
  • Memoization via resolveCache (manifest-extractor.ts:91–100): repeated (repo, type, contract) tuples hit the DB once. Verified by the memoization test at manifest-extractor.test.ts:587–615.
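The two optimisations can be sketched together as below. This is a simplified approximation, not the code in manifest-extractor.ts: the `Link` shape, the `Lookup` signature, and the helper names are assumptions. It caches the in-flight promise per (repo, type, contract) key so that concurrent callers share a single lookup.

```typescript
// Hypothetical shapes for illustration; field names are assumptions.
interface Link {
  type: string;
  contract: string;
  provider: string; // provider repo
  consumer: string; // consumer repo
}

type Lookup = (repo: string, type: string, contract: string) => Promise<string | null>;

// Cache the in-flight promise so concurrent callers share one DB round-trip.
function memoize(lookup: Lookup): Lookup {
  const cache = new Map<string, Promise<string | null>>();
  return (repo, type, contract) => {
    const key = JSON.stringify([repo, type, contract]);
    let pending = cache.get(key);
    if (!pending) {
      pending = lookup(repo, type, contract);
      cache.set(key, pending);
    }
    return pending;
  };
}

async function resolveAll(links: Link[], lookup: Lookup) {
  const resolve = memoize(lookup);
  // Outer Promise.all fans out over links; the inner one resolves both sides of a link together.
  // Promise.all preserves input order, so results line up with `links`.
  return Promise.all(
    links.map(async (link) => {
      const [provider, consumer] = await Promise.all([
        resolve(link.provider, link.type, link.contract),
        resolve(link.consumer, link.type, link.contract),
      ]);
      return { link, provider, consumer };
    }),
  );
}
```

Storing the promise rather than the resolved value matters here: with everything started concurrently, a value-only cache would still be empty when the second caller arrives.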

No new hot-path regressions introduced.


5. Operational Risk & Rollback Safety — ✅ Ready

Rollback is safe. The change is purely additive: it produces cross-links that were previously silently dropped. Reverting the PR restores the silent-drop behavior; no data migrations, no schema changes.

Graceful degradation is strong:

  • dbExecutors undefined → synthetic UIDs, no throw
  • Pool closed → exception caught, warn logged, synthetic UIDs
  • Repo not in config → warn logged, synthetic UIDs still generated
  • All cross-links are always produced regardless of symbol resolution success

Production noise concern (Medium, not blocking). As noted above, stale executors passed after pool close will generate console.warn entries for every manifest link in any deployment that has real DB executors. In a large group config with 20+ manifest links, that's 20+ spurious warnings per sync. This doesn't cause incorrect behavior but will make production log analysis noisier.


6. Maintainability — ✅ Ready (with one trap to document)

custom type silent null in resolveSymbol deserves a one-line comment so the next person extending it doesn't wonder if it's a bug. Currently resolveSymbol hits the final return null for custom with no explanation.

http::*:: wildcard is documented in buildContractId's JSDoc but not surfaced in dedupeCrossLinks. A future contributor who extends dedup logic will need to understand that wildcard contract IDs intentionally don't dedup against specific-method ones.

Overall structure is clean: dedupeCrossLinks is a pure function, ManifestExtractor is stateless, and the extractorOverride seam makes the whole pipeline testable without mocked DB pools.


Final Verdict

| Dimension | Status | Notes |
| --- | --- | --- |
| Correctness | ✅ | Core fix correct; stale executor warn noise is safe but present |
| Code clarity | ⚠️ | allContracts alias unresolved; otherwise clear |
| Test coverage | ✅ | Happy path, edge cases, and regression tests all present |
| Performance | ✅ | Promise.all + memoization in place |
| Operational risk | ✅ | Fully graceful degradation; safe rollback |
| Maintainability | ✅ | Clean structure; one undocumented custom type null case |

This PR is production-ready with one recommended change before merging:

  1. Remove allContracts alias (low effort, removes reader confusion — flagged 3× across reviews)

Two optional improvements that don't block the merge but would reduce production noise:

  1. dbExecutors?.clear() before extractFromManifest (one line, eliminates spurious warn logs from closed pools)
  2. Add a comment in resolveSymbol explaining the custom type null return (one line)

The CI ❌ Tests flag with 0 actual failures should be confirmed as pre-existing before merge. If it reflects a platform where Swift tests can't run (not introduced by this PR), that's acceptable to document and proceed. If it's new, it needs investigation.


@magyargergo magyargergo merged commit 385ee03 into abhigyanpatwari:main Apr 15, 2026
11 of 12 checks passed
@magyargergo

Collaborator

@jonasvanderhaegen-xve Thank you for your contribution!

jyhk1314 pushed a commit to jyhk1314/GitNexus that referenced this pull request Apr 15, 2026
… RC, group sync

- Take upstream splitRelCsvByLabelPair + tests (abhigyanpatwari#818/abhigyanpatwari#832); preserve fork
  closeLbugForPath and import evictPoolsForDbPath from pool-adapter.
- Fix nightly-refresh evictPools import path to ../core/lbug/pool-adapter.js.
- Includes abhigyanpatwari#818 drain fix, abhigyanpatwari#823 embeddings PK, abhigyanpatwari#825 RC workflow, abhigyanpatwari#827 manifest sync.
github714801013 pushed a commit to github714801013/GitNexus that referenced this pull request Apr 28, 2026
… produced 0 cross-links (abhigyanpatwari#827)

* fix(group/sync): wire ManifestExtractor into syncGroup pipeline

ManifestExtractor was fully implemented in extractors/manifest-extractor.ts
but never imported or called in sync.ts. As a result, any links declared in
group.yaml were parsed and validated by config-parser.ts but silently dropped
— config.links was always an empty dead-end as far as syncGroup was concerned.

Changes:
- Import ManifestExtractor in sync.ts
- Call extractFromManifest(config.links, dbExecutors) inside the outer try
  block, after all repos are processed but before the finally closes the DB
  pools (symbol resolution via resolveSymbol requires open executors)
- Collect the resulting contracts into autoContracts and the cross-links into
  a separate manifestCrossLinks array
- Merge manifestCrossLinks into the final crossLinks alongside runExactMatch
  results

Without this fix, users who declare explicit service dependencies in
group.yaml links (the documented workaround for HTTP clients that use absolute
URLs and are invisible to the auto-extractors) get 0 cross-links regardless of
what they configure.

* test(group/sync): cover manifest links producing cross-links

Add a unit test that asserts config.links entries produce contract pairs
and a manifest cross-link (matchType: 'manifest') via syncGroup.

Also refactors the manifest extraction call to sit outside the else/try
block so it runs regardless of extractorOverride arity — makes the code
testable without mocked DB pools and ensures links work when callers supply
a zero-arity override (e.g. in tests or programmatic usage).

* style: prettier format sync.ts and sync.test.ts

Also removes the stray empty line in the finally block (noted in review).

* fix(group/sync): dedupe cross-links and warn on dangling manifest repos

Addresses review feedback on PR abhigyanpatwari#827:

1. Dedupe cross-links. Manifest contracts participate in runExactMatch, so a
   manifest-declared link also emitted a duplicate matchType:'exact' CrossLink
   for the same endpoint pair. Dedupe by (from, to, type, contractId) and
   prefer manifest (operator-declared intent).

2. Warn on dangling repos. When a manifest link references a repo not in
   config.repos, log a warning. Synthetic UIDs keep the cross-link
   deterministic, but the operator probably meant something else.

3. Tests:
   - Assert no duplicate 'exact' CrossLink is emitted alongside the manifest one.
   - Assert synthetic UID format when no DB executors are available.
   - New test: dangling manifest repo still produces a cross-link + logs a warning.

* perf(group/manifest): parallelize and memoize symbol resolution

Previous implementation ran 2N sequential Cypher round-trips per
manifest (one for provider side, one for consumer, awaited in-order
per link). For manifests with tens of links this dominated syncGroup
latency in groups with many declared cross-repo contracts.

Changes:
- Resolve provider + consumer in parallel per link (Promise.all).
- Resolve all links in parallel (outer Promise.all over links.map).
  Each repo's executor pool is independent, so cross-repo fan-out
  scales with the number of distinct repos in the manifest.
- Memoize by (repo, type, contract). Manifests frequently declare
  the same contract from both directions or across sibling groups,
  so duplicate triples now hit the DB once instead of 2× per link.

Correctness:
- resolveSymbol is a pure LIMIT 1 read, so caching + concurrent
  invocation is safe.
- Iteration order over links is preserved in the final
  contracts / crossLinks arrays — result shape is identical.

Test:
- New test asserts that two links sharing (repo, type, contract)
  produce exactly one DB call per distinct repo-tuple.

---------

Co-authored-by: jonasvanderhaegen-xve <>
Co-authored-by: Gergo Magyar <gergomagyar@icloud.com>

Development

Successfully merging this pull request may close these issues.

bug: config.links in group.yaml produces 0 cross-links — ManifestExtractor never called in syncGroup

3 participants