diff --git a/CHANGELOG.md b/CHANGELOG.md index 1b6298d78c..b0ecf38598 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -4,6 +4,12 @@ All notable changes to GitNexus will be documented in this file. ## [Unreleased] +### Added +- Cross-repo impact analysis (`group_impact` tool and CLI command) +- Bridge.lbug storage: contract registry stored in LadybugDB instead of `contracts.json` +- gRPC canonical ID normalization: proto-aware extraction with wildcard matching +- Backward compatibility: automatic `contracts.json` fallback with deprecation warning + ### Changed - Migrated from KuzuDB to LadybugDB v0.15 (`@ladybugdb/core`, `@ladybugdb/wasm-core`) - Renamed all internal paths from `kuzu` to `lbug` (storage: `.gitnexus/kuzu` → `.gitnexus/lbug`) diff --git a/README.md b/README.md index 482b496bc4..88003bc0b0 100644 --- a/README.md +++ b/README.md @@ -214,13 +214,14 @@ gitnexus group remove # Remove a repo from a group gitnexus group list [name] # List groups, or show one group's config gitnexus group sync # Extract contracts and match across repos/services gitnexus group contracts # Inspect extracted contracts and cross-links +gitnexus group impact # Cross-repo blast radius analysis gitnexus group query # Search execution flows across all repos in a group gitnexus group status # Check staleness of repos in a group ``` ### What Your AI Agent Gets -**16 tools** exposed via MCP (11 per-repo + 5 group): +**17 tools** exposed via MCP (11 per-repo + 6 group): | Tool | What It Does | `repo` Param | | ------------------ | ----------------------------------------------------------------- | -------------- | @@ -234,11 +235,14 @@ gitnexus group status # Check staleness of repos in a group | `group_list` | List configured repository groups | — | | `group_sync` | Extract contracts and match across repos/services | — | | `group_contracts`| Inspect extracted contracts and cross-links | — | +| `group_impact` | Cross-repo blast radius analysis | — | | `group_query` | Search execution 
flows across all repos in a group | — | | `group_status` | Check staleness of repos in a group | — | > When only one repo is indexed, the `repo` parameter is optional. With multiple repos, specify which one: `query({query: "auth", repo: "my-app"})`. +> **Storage migration:** Group contract data is now stored in `bridge.lbug` (LadybugDB) instead of `contracts.json`. Existing groups with `contracts.json` are supported via automatic fallback with a deprecation warning. Run `gitnexus group sync ` to migrate. + **Resources** for instant context: | Resource | Purpose | diff --git a/docs/specs/2026-03-31-cross-index-impact-design.md b/docs/specs/2026-03-31-cross-index-impact-design.md new file mode 100644 index 0000000000..e90a5c43c1 --- /dev/null +++ b/docs/specs/2026-03-31-cross-index-impact-design.md @@ -0,0 +1,1043 @@ +# RFC: Cross-Index Impact Analysis — Repository Groups + +> **Superseded:** Contract storage migrated from `contracts.json` to `bridge.lbug` (LadybugDB). See [`docs/superpowers/specs/2026-04-03-bridge-lbug-grpc-normalization-design.md`](../superpowers/specs/2026-04-03-bridge-lbug-grpc-normalization-design.md) for the current design. + +**Date:** 2026-03-31 +**Status:** Superseded +**Author:** @ivkond +**Related Issues:** [#256](https://github.com/abhigyanpatwari/GitNexus/issues/256), [#306](https://github.com/abhigyanpatwari/GitNexus/issues/306), [#77](https://github.com/abhigyanpatwari/GitNexus/issues/77) + +## Summary + +Add cross-repository impact analysis to GitNexus by allowing users to organize repositories into logical groups with hierarchical naming (e.g., `hr/hiring/backend`, `hr/hiring/ui`). When analyzing blast radius, the system looks not only into the current repo's index but also into neighboring repos in the group, connected through a Contract Registry of shared touch-points (HTTP routes, gRPC services, message topics, shared library exports). 
+ +## Motivation + +Modern applications are split across multiple repositories: frontend, BFF, backend, ML pipeline, workflow engines, shared libraries. GitNexus currently indexes each repo in isolation — the knowledge graph captures call chains within each repo, but connections across repo boundaries are lost. + +When a developer changes a DTO in the backend, they have no way to know which frontend components, BFF handlers, or downstream services will break — without manually grepping across repos or relying on LLM inference. + +### Use Cases + +1. **Developer:** "I'm changing `UserDTO.email` in backend — what breaks in the UI and BFF?" +2. **Architect:** "Show me all dependencies between services in the `hr` group." +3. **CI/CD:** Pre-merge check — does this PR affect contracts consumed by other repos? + +## Design: Hybrid — Lazy Virtual Graph (Approach C) + +Each repo keeps its own isolated index (`.gitnexus/lbug`). A lightweight metadata layer on top stores group configuration and a Contract Registry of extracted touch-points. Cross-repo impact works by fan-out: local impact in the current repo, then follow cross-links to run local impact in neighboring repos. + +### Why Not a Unified Super-Graph? + +Merging all indexes into one KuzuDB/LadybugDB would give full Cypher across the group, but: +- O(n) rebuild time when any single repo is re-indexed +- Multiplicative graph size growth +- Name collisions between repos +- Breaks the current "each repo is independent" model + +The hybrid approach is incremental, non-destructive, and minimizes changes to the core indexing pipeline. 
+ +### Prerequisites and Current State + +**Current public surface (no `group` surface exists today):** +- CLI commands: `analyze`, `serve`, `wiki`, `status`, `clean`, `list`, `impact`, `cypher`, `mcp` ([index.ts](gitnexus/src/cli/index.ts)) +- MCP tools: 7 tools — `list_repos`, `query`, `cypher`, `context`, `impact`, `detect_changes`, `rename` ([tools.ts](gitnexus/src/mcp/tools.ts)) +- Shape guards: tool count asserted in [tools.test.ts](gitnexus/test/unit/tools.test.ts), resource count in [resources.test.ts](gitnexus/test/unit/resources.test.ts) + +**Graph schema gaps** — the following entities referenced in this RFC do NOT currently exist in the LadybugDB schema ([schema.ts](gitnexus/src/core/lbug/schema.ts)): +- No `Route` node label +- No `HANDLES_ROUTE` or `FETCHES` relation types +- Route data during ingestion is ephemeral — reduced to `CALLS` edges with `reason: "laravel-route"` ([parse-worker.ts:1145](gitnexus/src/core/ingestion/workers/parse-worker.ts), [call-processor.ts:642](gitnexus/src/core/ingestion/call-processor.ts)) +- Import graph stores file→file edges, not raw package coordinates ([import-processor.ts:343](gitnexus/src/core/ingestion/import-processor.ts)) +- `.proto` files are not a supported language ([supported-languages.ts](gitnexus/src/config/supported-languages.ts)) + +**Impact tool limitation** — current `impact` resolves symbols by name with `LIMIT 1` ([local-backend.ts:1347-1352](gitnexus/src/mcp/local/local-backend.ts)), not by UID. This creates false positives for common names. + +**Existing tech debt** — `impact` tool description documents `HAS_METHOD/OVERRIDES` relation types ([tools.ts:203](gitnexus/src/mcp/tools.ts)) but runtime filter only allows 4 types ([local-backend.ts:50](gitnexus/src/mcp/local/local-backend.ts)). Should be resolved before extending `group_impact`. + +These gaps define the **prerequisite work** required before the group features can function. See Section 8: Implementation Prerequisites. 
+ +--- + +## Section 1: Concepts and Terminology + +### Repository Group + +A logical group of related repositories with hierarchical path-like addressing. Names support arbitrary nesting depth: + +``` +company/ + hr/ + hiring/ + backend <- repo (leaf) + ui <- repo (leaf) + payroll/ + backend <- repo (leaf) + camunda <- repo (leaf) + sales/ + admin/ + ui <- repo (leaf) + bff <- repo (leaf) + crm/ + backend <- repo (leaf) +``` + +A **group** is any non-leaf node in the hierarchy. `company/hr` is a group, `company/hr/hiring` is also a group, `company/hr/hiring/backend` is a repo (leaf). Operations on a group (impact, query) cascade to all nested repos. + +### Contract + +A touch-point between repositories. Types: + +| Type | Description | Example | +|------|-------------|---------| +| **HTTP Route** | REST/GraphQL endpoint | `GET /api/users` | +| **gRPC Service** | Proto service + method | `UserService.GetUser` | +| **Message Topic** | Kafka/RabbitMQ topic | `user.created` | +| **Shared Library Export** | Package export | `@hr/common::UserDTO` | +| **Custom** | User-defined from manifest | `custom::payroll-calc-v2` | + +### Contract Registry + +A lightweight JSON index of extracted contracts from all repos in a group. Stored at `~/.gitnexus/groups//contracts.json`. + +### Cross-Link + +An edge between a contract provider in one repo and a contract consumer in another. 
Confidence levels: + +| Match Type | Confidence | Source | +|-----------|-----------|--------| +| `exact` | 1.0 | Identical contract IDs | +| `manifest` | 1.0 | Explicitly declared in group.yaml (bypasses matching cascade) | +| `bm25` | 0.85-0.95 | BM25 text similarity | +| `embedding` | 0.5-0.85 | Semantic vector similarity | + +--- + +## Section 2: Storage and Group Configuration + +### File Structure + +``` +~/.gitnexus/ + registry.json <- existing (unchanged) + groups/ + company/ + group.yaml <- root group definition + contracts.json <- Contract Registry for entire subtree + embeddings.bin <- embedding vectors for contracts (optional) +``` + +### group.yaml + +```yaml +version: 1 +name: company +description: "All company microservices" + +# Mapping: path in group -> repo name from registry.json +# Repo can also be referenced by filesystem path or git remote URL +# for portability across machines where registry names may differ. +repos: + hr/hiring/backend: hr-hiring-backend + hr/hiring/ui: hr-hiring-ui + hr/payroll/backend: hr-payroll-api + sales/admin/ui: sales-admin-frontend + sales/admin/bff: sales-admin-bff + sales/crm/backend: sales-crm + +# Explicit links (manifest) for connections auto-detect can't find. +# Manifest links bypass the matching cascade entirely (confidence 1.0). +# `role` specifies the role of the `from` repo in this contract. +links: + - from: hr/payroll/backend + to: hr/hiring/backend + type: topic + contract: "employee.hired" + role: provider # provider | consumer (role of `from` repo) + + - from: sales/admin/bff + to: sales/crm/backend + type: http + contract: "/api/v2/leads/*" + role: consumer # sales/admin/bff consumes this route from sales/crm/backend + +# Cross-language package coordinate mapping. +# Keys under each repo are free-form ecosystem identifiers, +# supporting any package manager (npm, maven, pypi, go, nuget, crates, gems, etc.) 
+packages: + hr/common: + npm: "@hr/common" + maven: "com.hr.common" + pypi: "hr-common" + go: "github.com/hr/common" + nuget: "Hr.Common" + +# Auto-detection settings +detect: + http: true + grpc: true + topics: true + shared_libs: true + embedding_fallback: true + +# Matching cascade tuning +matching: + bm25_threshold: 0.7 + embedding_threshold: 0.65 + max_candidates_per_step: 3 +``` + +### contracts.json + +Auto-generated by `gitnexus group sync`. Written atomically (write to temp file, then rename) to prevent torn reads during concurrent `group_impact` queries. + +**Version migration:** On version mismatch, `group_impact` and `group_contracts` fail with a message to re-run `group sync`. No automatic migration — the file is fully regenerated on each sync anyway. + +**Staleness detection** uses two complementary checks for different purposes: + +1. **Repo index staleness** (commit-based, consistent with existing [staleness.ts:20](gitnexus/src/mcp/staleness.ts)): compares `meta.json.lastCommit` vs `git rev-parse HEAD`. Answers: "is this repo's index behind its own HEAD?" Used by `group sync` before extraction and by `group status`. + +2. **Contract Registry staleness** (indexedAt-based): compares `repoSnapshots[repo].indexedAt` in `contracts.json` against the repo's current `meta.json.indexedAt`. Answers: "was this repo re-indexed after the last `group sync`?" Used by `group_impact` before fan-out and by `group status`. 
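The two staleness checks above can be sketched as a pair of pure functions. This is a minimal illustration under stated assumptions, not the project's implementation: the `RepoMeta` and `RepoSnapshot` shapes and both function names are hypothetical, `headCommit` stands in for the output of `git rev-parse HEAD` (obtained by the caller), and both commit hashes are assumed to use the same length.

```typescript
// Hypothetical shapes; field names mirror meta.json and contracts.json as described above.
interface RepoMeta { lastCommit: string; indexedAt: string }
interface RepoSnapshot { lastCommit: string; indexedAt: string }

// Check 1: the repo index is behind HEAD when the indexed commit differs from the current one.
// Assumes `headCommit` and `meta.lastCommit` are hashes of the same length.
function isRepoIndexStale(meta: RepoMeta, headCommit: string): boolean {
  return meta.lastCommit !== headCommit;
}

// Check 2: the Contract Registry is stale when the repo was indexed after the
// snapshot recorded at the last `group sync`.
// Treating a missing snapshot as stale is an assumption of this sketch.
function isContractRegistryStale(snapshot: RepoSnapshot | undefined, meta: RepoMeta): boolean {
  if (!snapshot) return true;
  return Date.parse(meta.indexedAt) > Date.parse(snapshot.indexedAt);
}
```

Keeping the checks separate means a caller such as `group status` could report each condition on its own, while `group_impact` would only need the second one.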
+ +These are different questions and intentionally use different heuristics: +- Check 1 catches: repo has new commits but hasn't been re-indexed +- Check 2 catches: repo was re-indexed (possibly with schema changes) but `group sync` hasn't been re-run + +```json +{ + "version": 1, + "generatedAt": "2026-03-31T10:00:00Z", + "repoSnapshots": { + "sales/crm/backend": { "indexedAt": "2026-03-30T21:14:14Z", "lastCommit": "5838fb8d" }, + "sales/admin/bff": { "indexedAt": "2026-03-30T19:05:00Z", "lastCommit": "a1b2c3d4" } + }, + "contracts": [ + { + "id": "http::GET::/api/v2/leads", + "type": "http", + "repo": "sales/crm/backend", + "symbolName": "LeadController.list", + "symbolUid": "abc123", + "symbolRef": { "filePath": "src/controller/LeadController.java", "name": "LeadController.list" }, + "role": "provider", + "meta": { + "method": "GET", + "path": "/api/v2/leads", + "pathSegments": ["api", "v2", "leads"], + "extractionStrategy": "source_scan" + } + }, + { + "id": "http::GET::/api/v2/leads", + "type": "http", + "repo": "sales/admin/bff", + "symbolName": "fetchLeads", + "symbolUid": "def456", + "symbolRef": { "filePath": "src/api/leads.ts", "name": "fetchLeads" }, + "role": "consumer", + "meta": { + "method": "GET", + "path": "/api/v2/leads", + "pathSegments": ["api", "v2", "leads"], + "extractionStrategy": "source_scan" + } + } + ], + "crossLinks": [ + { + "from": { "repo": "sales/admin/bff", "symbolUid": "def456", "symbolRef": { "filePath": "src/api/leads.ts", "name": "fetchLeads" } }, + "to": { "repo": "sales/crm/backend", "symbolUid": "abc123", "symbolRef": { "filePath": "src/controller/LeadController.java", "name": "LeadController.list" } }, + "type": "http", + "contractId": "http::GET::/api/v2/leads", + "matchType": "exact", + "confidence": 1.0 + } + ] +} +``` + +### Contract ID Format + +`::`: + +| Type | Discriminator | Example | +|------|---------------|---------| +| http | `METHOD::path` | `http::GET::/api/v2/leads` | +| grpc | `package.Service/Method` | 
`grpc::hr.UserService/GetUser` | +| topic | `topic_name` | `topic::employee.hired` | +| lib | `package::export` | `lib::@hr/common::UserDTO` | +| custom | free-form from manifest | `custom::payroll-calc-v2` | + +--- + +## Section 3: Contract Extraction + +### Extractor Architecture + +Contract extraction uses a **two-tier strategy**: graph queries where the data is available in LadybugDB, and lightweight source scanning where it is not. This is necessary because the current graph does not store all the data extractors need (see Prerequisites). + +``` +ContractExtractor (interface) + |-- HttpRouteExtractor <- graph (CALLS with route reason) + source scan (fetch/axios patterns) + |-- GrpcExtractor <- source scan only (.proto files, not in supported languages) + |-- MessageTopicExtractor <- graph (CALLS) + source scan (publish/subscribe patterns) + |-- SharedLibExtractor <- graph (IMPORTS file->file) + packages map from group.yaml + |-- ManifestExtractor <- group.yaml links (no graph/source access) +``` + +### ContractExtractor Interface + +```typescript +interface ContractExtractor { + type: ContractType; // 'http' | 'grpc' | 'topic' | 'lib' | 'custom' + canExtract(repoHandle: RepoHandle): Promise<boolean>; + /** + * Extract contracts. Gets both db connection (for graph queries) + * and repoPath (for source file scanning when graph data is insufficient). + */ + extract(db: LbugConnection, repoPath: string): Promise<ExtractedContract[]>; +} + +interface ExtractedContract { + contractId: string; + type: ContractType; + role: 'provider' | 'consumer'; + symbolUid: string; // may be empty for source-scan-only results + symbolRef: { filePath: string; name: string }; // stable fallback for UID + symbolName: string; // human-readable, used in BM25/embedding + confidence: number; // extraction confidence (1.0 graph, 0.3-0.8 source scan) + meta: Record<string, any>; +} +``` + +Note: `symbolName` is a denormalized convenience field (same as `symbolRef.name`) used in BM25 document building and embedding input. 
`confidence` reflects extraction quality — graph-derived contracts get 1.0, source-scanned get lower confidence depending on pattern reliability. + +### HttpRouteExtractor + +**Current graph state:** No `Route` nodes, no `HANDLES_ROUTE`/`FETCHES` edges. Route handlers are recorded as `CALLS` edges with `reason: "laravel-route"` (Laravel only). Frontend fetch calls are not in the graph at all. + +**Provider extraction (backend)** — two strategies: + +Strategy A — Graph query for existing route-annotated CALLS edges (auxiliary only): +```cypher +MATCH (source)-[r:CodeRelation {type: 'CALLS'}]->(target) +WHERE r.reason CONTAINS 'route' +RETURN source.name, source.uid, source.filePath, + target.name, target.uid, target.filePath, + r.reason, r.confidence +``` +**Limitation:** Current graph stores only `reason: "laravel-route"` without HTTP method or path ([call-processor.ts:685](gitnexus/src/core/ingestion/call-processor.ts)). Strategy A can identify that a symbol is a route handler (useful for filtering) but **cannot reconstruct the contract ID** (`http::METHOD::path`). Contract ID must come from Strategy B (source scan). Strategy A serves as a hint to narrow source scan scope. 
+ +Strategy B — Source scan for route decorator/annotation patterns (primary source of contract ID): +- Scan files matching common patterns: `*Controller.*`, `*Router.*`, `routes/*` +- Regex match for route annotations: `@GetMapping`, `@app.route`, `router.get`, `@Controller`/`@RequestMapping` (Java/Spring), `Route::get` (Laravel) +- Extract HTTP method + path from annotation arguments +- Resolve to nearest symbol in graph by file + line number + +**Consumer extraction (frontend/BFF)** — source scan only (not in graph): +- Scan `.ts`, `.tsx`, `.js`, `.jsx`, `.vue`, `.svelte` files +- Regex match for fetch patterns: `fetch('...')`, `axios.get('...')`, `$.ajax`, `http.get` +- Extract HTTP method (from function name or options) + URL path +- Resolve calling function from graph by file + approximate line range + +**Path normalization:** strip trailing slash, collapse path params (`/users/:id`, `/users/{id}`, `/users/[id]` -> `/users/{param}`). + +**Meta fields:** +```json +{ + "method": "GET", + "path": "/api/v2/users", + "pathSegments": ["api", "v2", "users"], + "extractionStrategy": "source_scan", + "handlerName": "UserController.list", + "paramNames": ["limit", "offset"] +} +``` + +Note: `responseKeys`/`accessedKeys` from the original design require response shape tracking which does not exist in the current graph. These fields are omitted from MVP and listed in Future Work as a prerequisite for shape_check cross-repo integration. + +**Contract ID:** `http::{METHOD}::{normalized_path}` + +### GrpcExtractor + +`.proto` files are not a supported language in GitNexus — no symbols are extracted during indexing. This extractor is **source-scan only**. 
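As a rough illustration of what source-scan-only extraction could look like for a single `.proto` file, the sketch below derives provider contract IDs in the `grpc::{package}.{Service}/{Method}` form with a line-based regex scan. The function name is hypothetical, and the scan makes simplifying assumptions: one `package` statement per file, `service` and `rpc` declarations each starting on their own line, and no handling of comments or imported proto files.

```typescript
// Hypothetical helper: derive provider contract IDs from the text of one .proto file.
// Regex-only sketch; it does not parse comments, options, or imports.
function extractGrpcContractIds(protoSource: string): string[] {
  const pkgMatch = /^\s*package\s+([\w.]+)\s*;/m.exec(protoSource);
  const prefix = pkgMatch ? `${pkgMatch[1]}.` : "";
  const ids: string[] = [];
  let currentService: string | null = null;

  for (const line of protoSource.split("\n")) {
    // A `service Name` line opens a new service; later rpc lines belong to it.
    const svc = /^\s*service\s+(\w+)/.exec(line);
    if (svc) {
      currentService = svc[1] ?? null;
      continue;
    }
    // An `rpc Method (` line inside a service yields one contract ID.
    const rpc = /^\s*rpc\s+(\w+)\s*\(/.exec(line);
    if (rpc && currentService) {
      ids.push(`grpc::${prefix}${currentService}/${rpc[1]}`);
    }
  }
  return ids;
}
```

A scan of this kind misses declarations that a real proto parser would find, which is consistent with assigning regex-derived gRPC contracts a reduced extraction confidence.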
+ +- Scan for `.proto` files in repo directory +- Parse `service` and `rpc` declarations with regex, inheriting package context through imported proto files when definitions are split +- For consumers: scan source files for generated stub/client class usage patterns (`@GrpcClient`, `ClientGrpc.getService(...)`, `new XxxServiceClient(...)`, `grpc.loadPackageDefinition(...)`-based construction) + +In MVP, `canExtract` returns `true` only when `.proto` files exist in the repo's file tree. Extraction confidence is lower (0.7) due to regex-only parsing. + +Explicitly unsupported in this extractor today: `C#`, `Ruby`, and `Rust` gRPC client/server ecosystems. Those require follow-up extractor coverage and should not be treated as implicitly supported by the TypeScript/Go/Java/Python patterns above. + +**Contract ID:** `grpc::{package}.{Service}/{Method}` + +### MessageTopicExtractor + +**Current graph state:** `ACCESSES` relation type does not exist in the current schema ([schema.ts:29](gitnexus/src/core/lbug/schema.ts)). This extractor is **source-scan only**. + +Topic name resolution cascade (all via source scanning): + +1. **String literal** — `publish("user.created", ...)` -> topic name directly (confidence 1.0) +2. **Constant** — `publish(TOPICS.USER_CREATED, ...)` -> source-scan the constant definition file (follow import path from source, not graph ACCESSES edge) to find the string value. If constant is in the same file or a direct import, resolution succeeds (confidence 0.9). If indirect or dynamic import chain, falls through to step 4. +3. **Env variable** — `publish(process.env.USER_TOPIC, ...)` -> cannot auto-resolve, recorded as `topic::${USER_TOPIC}` with `confidence: 0.3` and env name in meta +4. **Dynamic / unresolvable** — `publish(getTopicName(), ...)` -> **no contract created**, warning emitted: + +``` +WARNING: sales/crm/backend: found publish() call at src/events/publisher.ts:42 + but could not resolve topic name (dynamic expression). 
+ -> Add this link explicitly in group.yaml: + links: + - from: sales/crm/backend + to: + type: topic + contract: "" + role: provider +``` + +Meta includes `resolution` field for transparency: `"literal"`, `"constant"`, `"env"`, `"unresolved"`. + +### SharedLibExtractor + +**Current graph state:** IMPORTS edges are file→file only ([import-processor.ts:343](gitnexus/src/core/ingestion/import-processor.ts)), without raw package coordinates. External imports (to packages outside the repo) are not in the graph. + +**Strategy — hybrid graph + source scan:** + +1. **Source scan** — read import statements from source files to get raw package coordinates (e.g., `import { UserDTO } from '@hr/common'`, `import com.hr.common.UserDTO`) +2. **Match against `packages` map** from group.yaml: + +```yaml +packages: + hr/common: + npm: "@hr/common" + maven: "com.hr.common" + pypi: "hr-common" + go: "github.com/hr/common" +``` + +3. **Resolve symbols** — once the target repo is identified via package match, look up the imported symbol name in that repo's graph to get the `symbolUid` +4. Without packages map — fallback: search import strings containing another group repo's name (fuzzy, low confidence 0.6) + +Contract ID normalized to group path: `lib::hr/common::UserDTO` — same contract regardless of import language. + +Warning for unmatched imports: + +``` +WARNING: hr/hiring/backend: import "com.acme.utils.DateHelper" at src/Main.java:3 + matches no known package in group. If this is a shared library, add to packages: + packages: + : + maven: "com.acme.utils" +``` + +### ManifestExtractor + +Reads `links` section from `group.yaml` directly. Manifest links **bypass the matching cascade entirely** — they are pre-matched by definition and always produce cross-links with confidence 1.0. 
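Because `role` in `group.yaml` always describes the `from` repo, turning a manifest link into a directed cross-link is a small mapping. The sketch below assumes the direction used by the `crossLinks` entries in `contracts.json` (`from` is the consumer, `to` is the provider). The `ManifestLink` and `ManifestCrossLink` shapes and the function name are hypothetical, and the raw `contract` string is passed through rather than converted to a contract ID.

```typescript
type Role = "provider" | "consumer";

// Hypothetical shape of one entry under `links:` in group.yaml.
interface ManifestLink {
  from: string;
  to: string;
  type: string;
  contract: string;
  role: Role; // role of the `from` repo
}

// Hypothetical directed link: `from` is the consumer, `to` is the provider.
interface ManifestCrossLink {
  from: string;
  to: string;
  type: string;
  contract: string;
  matchType: "manifest";
  confidence: number;
}

function manifestLinkToCrossLink(link: ManifestLink): ManifestCrossLink {
  // If `from` is the consumer the direction is kept; otherwise it is flipped.
  const consumer = link.role === "consumer" ? link.from : link.to;
  const provider = link.role === "consumer" ? link.to : link.from;
  return {
    from: consumer,
    to: provider,
    type: link.type,
    contract: link.contract,
    matchType: "manifest",
    confidence: 1.0, // manifest links are pre-matched
  };
}
```

Applied to the two example links in `group.yaml`, the `employee.hired` link (where `hr/payroll/backend` is the provider) is flipped so that `hr/hiring/backend` becomes `from`, while the `/api/v2/leads/*` link keeps `sales/admin/bff` as `from`.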
+ +Role mapping from `group.yaml`: +- `role: provider` on `from` repo means `from` provides the contract, `to` consumes it +- `role: consumer` on `from` repo means `from` consumes the contract, `to` provides it + +For each manifest link, the extractor creates two `ExtractedContract` entries (one provider, one consumer) and one pre-matched cross-link. + +**Symbol resolution during sync** (not deferred): When sync processes manifest links, it attempts to resolve the contract identifier against the graph of each referenced repo. For example, manifest `contract: "employee.hired"` with `type: topic` — sync searches for symbols in the `from` and `to` repos that reference this topic name (using the same strategies as MessageTopicExtractor). If resolution succeeds, `symbolUid` and `symbolRef` are populated. If it fails, the contract is created with empty `symbolUid` and `symbolRef: { filePath: "", name: contract }`. During `group_impact` Phase 2, step 4c-iv handles these entries: it searches the target repo's graph for symbols matching the contractId pattern (e.g., route handlers matching the HTTP path). + +### Sync Execution Order + +``` +1. For each repo in group (sequential, LadybugDB pool limit): + a. Open LadybugDB (read-only) + b. For each enabled extractor (http, grpc, topic, lib): + canExtract(repo)? -> extract(db, repoPath) + c. Close connection + +2. ManifestExtractor — add links from group.yaml + +3. Matching cascade: + a. Exact match by contract ID -> crossLinks (confidence 1.0) + b. BM25 by contract ID + symbol + meta -> crossLinks (confidence 0.85-0.95) + c. Embedding fallback by symbol + meta -> crossLinks (confidence 0.5-0.85) + +4. Write contracts.json +``` + +--- + +## Section 4: Cross-Index Impact + +### Algorithm — Two Phases + +**Prerequisite: UID-based impact resolution.** The current `impact` tool resolves symbols by name with `LIMIT 1` ([local-backend.ts:1347-1352](gitnexus/src/mcp/local/local-backend.ts)). 
For `group_impact` Phase 2 fan-out, where the target symbol is known by UID from the Contract Registry, this creates false positives on common names (e.g., `getUser` may match the wrong overload). Phase 2 requires an internal `impactByUid(uid, direction)` variant that resolves by UID directly. This is listed as a prerequisite in Section 8. + +**Phase 1: Local impact** — blast radius within current repo (existing `impact` tool, unchanged): + +``` +UserDTO (hr/hiring/backend) + d=1: UserController.getUser, UserMapper.toDTO, UserService.findById + d=2: HiringRouter (/api/v2/users/:id) +``` + +**Phase 2: Cross-boundary fan-out** — expand through Contract Registry: + +``` +1. Close Phase 1 db connection (free pool slot for fan-out) +2. Collect all symbol UIDs from Phase 1 result +3. For each symbol — lookup in contracts.json: + a. Primary: match by symbolUid + b. Fallback: if UID not found (repo re-indexed since last sync), + match by symbolRef (filePath + name) + c. If neither matches — skip with warning "contracts.json may be stale, re-run group sync" +4. For each found crossLink (sorted by confidence desc): + a. Determine traversal side based on direction: + - upstream ("what depends on me"): follow links where the changed symbol + is the PROVIDER — look up consumers in other repos + (crossLinks where `to.symbolUid` matches phase1 symbol → fan-out to `from.repo`) + - downstream ("what do I depend on"): follow links where the changed symbol + is the CONSUMER — look up providers in other repos + (crossLinks where `from.symbolUid` matches phase1 symbol → fan-out to `to.repo`) + b. Open LadybugDB of target repo (sequential, reusing pool) + c. Resolve target symbol in target repo's graph: + i. Try symbolUid (fast, exact) + ii. Fallback: symbolRef filePath + name + iii. Fallback: symbolRef name only (warn if ambiguous) + iv. 
Fallback (for manifest links with empty symbolRef): search target repo's graph + for symbols matching the contractId pattern (e.g., for http contract, find + route handlers matching the path; for topic, find publish/subscribe calls + matching the topic name). This is a slower text-based search in the graph. + v. If all fail: skip with staleness warning + d. Run local impactByUid(resolvedUid, direction) in that repo + e. Tag results as cross-repo (with crossLink confidence) + f. Close connection before opening next repo +5. DO NOT recurse further — one hop through boundary (default) +``` + +### Symbol UID Stability + +Symbol UIDs in LadybugDB may change when a repo is re-indexed. Cross-links store both `symbolUid` (fast lookup) and `symbolRef` (stable fallback: filePath + name). + +Resolution cascade when `symbolUid` lookup fails: +1. **filePath + name** — match by both fields together (stable unless file was moved) +2. **name only** — if filePath match fails, search by name alone. If multiple candidates found, emit warning: "Ambiguous symbol resolution for {name}, {count} candidates — re-run `group sync`" and skip the cross-link +3. **Neither** — skip with staleness warning + +### Why One Hop Default + +Transitive cross-boundary chains (UI -> BFF -> Backend -> ML) are exponentially expensive and yield decreasing confidence. One hop covers the primary scenario. `--cross-depth 2+` is reserved for Future Work — the flag is accepted but capped at 1 in MVP with a message: "Multi-hop cross-boundary traversal is not yet implemented. Using --cross-depth 1." + +### Response Format + +**`impact` tool** — unchanged. Returns exactly the same response as today. 
+ +**`group_impact` tool** — new tool, new response type: + +```typescript +interface GroupImpactResult { + local: ImpactResult; // everything the existing impact returns + group: string; // group name + cross: CrossRepoImpact[]; // empty if no cross-links found + outOfScope: OutOfScopeLink[]; // cross-links not followed due to --subgroup filter + truncated: boolean; // true if timeout reached before all repos processed + truncatedRepos: string[]; // repos not reached due to timeout + summary: { + direct: number; // existing impact field name + processes_affected: number; // existing impact field name ([local-backend.ts:1477]) + modules_affected: number; // existing impact field name + cross_repo_hits: number; // new field, 0 if no cross-links + }; + risk: RiskLevel; // recalculated with cross-repo factors +} + +interface OutOfScopeLink { + from: string; // repo path + to: string; // repo path + contractId: string; + confidence: number; +} + +interface CrossRepoImpact { + repo: string; // registry name + repo_path: string; // path in group hierarchy + contract: { + id: string; + type: ContractType; + match_type: 'exact' | 'manifest' | 'bm25' | 'embedding'; + confidence: number; + }; + by_depth: Record; + affected_processes: string[]; +} +// Note: JSON field naming uses snake_case to match existing impact output style +// ([local-backend.ts:1477](gitnexus/src/mcp/local/local-backend.ts)) +``` + +Clients using `impact` are unaffected. Two fully independent tool registrations in MCP. + +### Risk Scoring + +| Factor | Risk contribution | +|--------|------------------| +| d=1 local callers > 5 | +MEDIUM | +| Any cross-repo hit with confidence >= 0.85 | +HIGH | +| Cross-repo hit with confidence < 0.85 | +MEDIUM + "verify manually" warning | +| Cross-repo hits in >= 3 repos | +CRITICAL | +| Affected process with > 10 steps | +HIGH | + +### Concurrency and Performance + +LadybugDB pool limit = 5 simultaneous databases. 
Fan-out strategy: + +``` +Phase 1: local impact -> 1 db connection, released after completion +Phase 2: contract registry -> in-memory (JSON), no db +Phase 3: fan-out impacts -> sequential, one connection at a time + (Phase 1 connection already released, + so full pool of 5 available for fan-out) + +Fan-out order: sorted by confidence desc + -> exact matches (1.0) first — most likely real impact + -> embedding matches (0.5-0.85) last — can be interrupted by timeout +``` + +Timeout budget: +- Total wall time: 30s default (configurable via `--timeout`) +- Phase 1: max 5s +- Remaining budget = total - Phase1_elapsed +- Each fan-out hop: `min(5s, remaining_budget / remaining_hops)` +- On timeout: return partial result with `"truncated": true` and list of repos not reached + +--- + +## Section 5: Matching Cascade + +### Overview + +During `group sync`, after extracting contracts from all repos, the cascade finds provider-consumer pairs. Each step processes only **unmatched** contracts remaining from the previous step: + +``` +Extracted contracts (all repos) + | + +- Step 1: Exact match by contract ID + | matched -> crossLinks (confidence 1.0) + | unmatched | + | + +- Step 2: BM25 by contract ID + symbol + meta + | score >= threshold -> crossLinks (confidence 0.85-0.95) + | unmatched | + | + +- Step 3: Embedding similarity by symbol + meta + | score >= threshold -> crossLinks (confidence 0.5-0.85) + | unmatched | + | + +- Unmatched -> report (warnings) +``` + +### Step 1: Exact Match + +``` +providers = contracts.filter(c => c.role === 'provider') +consumers = contracts.filter(c => c.role === 'consumer') +index = Map + +for each consumer: + if index.has(consumer.contractId): + emit crossLink(consumer -> provider, matchType: 'exact', confidence: 1.0) +``` + +Contract ID normalization before comparison: +- HTTP: lowercase method, strip trailing slash, collapse path params +- gRPC: lowercase package name +- Topic: trim whitespace, lowercase +- Lib: lowercase package 
coordinates + +Time: O(n) — single hashmap pass. + +### Step 2: BM25 + +Document for indexing — concatenation of contract fields: + +```typescript +function contractToDocument(c: ExtractedContract): string { + const parts = [ + c.contractId, + c.symbolRef.name, + c.type, + ]; + if (c.meta.path) parts.push(c.meta.path); + if (c.meta.pathSegments) parts.push(...c.meta.pathSegments); + if (c.meta.responseKeys) parts.push(...c.meta.responseKeys); + if (c.meta.accessedKeys) parts.push(...c.meta.accessedKeys); + if (c.meta.paramNames) parts.push(...c.meta.paramNames); + if (c.meta.topicName) parts.push(c.meta.topicName); + return parts.join(' '); +} +``` + +Only type-compatible pairs match: http <-> http, topic <-> topic, lib <-> lib, grpc <-> grpc. Within `lib` type, cross-ecosystem matching is allowed (e.g., a `maven` provider can match an `npm` consumer for the same logical package — this is exactly what the `packages` map in group.yaml enables). + +Thresholds — BM25 raw scores are unbounded and corpus-dependent, so we use **relative scoring** (score / max_score in result set) rather than an absolute threshold: +- `BM25_RELATIVE_THRESHOLD = 0.7` (min ratio of score to top result's score) +- `BM25_TOP_K = 3` (candidates per query) +- Configurable via `matching.bm25_threshold` in `group.yaml` — may need tuning per deployment + +Confidence mapping (based on relative score): +- relative 0.7-0.8 -> confidence 0.85 +- relative 0.8-0.9 -> confidence 0.90 +- relative 0.9-1.0 -> confidence 0.95 + +What BM25 catches well: +- Versioned paths (`/api/v1/users` <-> `/api/v2/users`) +- Partial name matches (`UserDTO` <-> `UserResponseDTO`) +- Matching meta fields (e.g., same pathSegments, paramNames; `responseKeys`/`accessedKeys` are future work — see Section 3 HttpRouteExtractor note) + +What BM25 does NOT catch: +- Cross-language naming (`IUser` in TypeScript <-> `UserDTO` in Java) +- Synonymous concepts (`fetchPeople` <-> `GET /api/employees`) + +These cases fall through to Step 
3. + +### Step 3: Embedding Similarity + +Embedding input — richer than BM25, includes structural context: + +```typescript +function contractToEmbeddingInput(c: ExtractedContract): string { + const parts = [ + `${c.type} contract`, + c.role, + c.symbolRef.name, + c.contractId, + ]; + if (c.meta.responseKeys) { + parts.push(`fields: ${c.meta.responseKeys.join(', ')}`); + } + if (c.meta.accessedKeys) { + parts.push(`accesses: ${c.meta.accessedKeys.join(', ')}`); + } + return parts.join(' | '); +} +``` + +Model: Snowflake/snowflake-arctic-embed-xs (384 dims) — same as GitNexus internal semantic search. + +Thresholds: +- `EMBEDDING_THRESHOLD = 0.65` (min cosine similarity) +- `EMBEDDING_MAX_CONFIDENCE = 0.85` (cap — never higher than BM25 min) + +Confidence mapping (linear): +- cosine 0.65 -> confidence 0.50 +- cosine 0.75 -> confidence 0.67 +- cosine 0.85 -> confidence 0.85 + +Storage: `~/.gitnexus/groups//embeddings.bin` — flat binary, ~1.5KB per contract. + +### Interaction with Existing Flags + +Embedding fallback respects flags and index state: + +| Flag | Behavior | +|------|----------| +| (default) | Exact -> BM25 -> Embedding fallback | +| `--skip-embeddings` | Exact -> BM25 only. No model load. No embeddings.bin | +| `--exact-only` | Exact match only. Fastest, strictest | +| `--force-embeddings` | Regenerate embeddings.bin even if fresh | + +Automatic skip when model unavailable: + +``` +WARNING: Embedding model unavailable (onnxruntime not found). + Matching cascade limited to: exact -> BM25. + Unmatched contracts may increase. Use group.yaml links for manual linking. +``` + +Note: `group sync --skip-embeddings` and `gitnexus analyze --embeddings` are independent. Per-repo embeddings (for `query` semantic search) and group embeddings (for cross-repo contract matching) are separate concerns. 
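The confidence mappings for Steps 2 and 3 can be sketched as two small functions. This is a minimal illustration, not the shipped implementation: the function names, the half-open bucket boundaries for BM25, and the `EMBEDDING_MIN_CONFIDENCE` / `COSINE_AT_CAP` constants are assumptions read off the tables above. A strictly linear embedding mapping yields 0.675 at cosine 0.75, which the table shows rounded to 0.67.

```typescript
// Constants quoted from Steps 2 and 3 of the matching cascade.
const BM25_RELATIVE_THRESHOLD = 0.7;
const EMBEDDING_THRESHOLD = 0.65;
const EMBEDDING_MAX_CONFIDENCE = 0.85;

// Assumptions: lowest confidence and the cosine at which the cap is reached,
// both inferred from the "Confidence mapping (linear)" table.
const EMBEDDING_MIN_CONFIDENCE = 0.5;
const COSINE_AT_CAP = 0.85;

// Step 2: map a raw BM25 score to a confidence via its ratio to the top score.
// Returns null when the candidate falls below the relative threshold.
function bm25Confidence(score: number, maxScore: number): number | null {
  if (maxScore <= 0) return null; // no usable top score, so no match
  const relative = score / maxScore;
  if (relative < BM25_RELATIVE_THRESHOLD) return null;
  if (relative < 0.8) return 0.85; // assumed half-open bucket [0.7, 0.8)
  if (relative < 0.9) return 0.9; // assumed half-open bucket [0.8, 0.9)
  return 0.95; // [0.9, 1.0]
}

// Step 3: linear interpolation from (0.65 -> 0.50) to (0.85 -> 0.85), capped.
function embeddingConfidence(cosine: number): number | null {
  if (cosine < EMBEDDING_THRESHOLD) return null;
  const t = (cosine - EMBEDDING_THRESHOLD) / (COSINE_AT_CAP - EMBEDDING_THRESHOLD);
  const confidence =
    EMBEDDING_MIN_CONFIDENCE + t * (EMBEDDING_MAX_CONFIDENCE - EMBEDDING_MIN_CONFIDENCE);
  return Math.min(confidence, EMBEDDING_MAX_CONFIDENCE);
}
```

Keeping the embedding cap equal to the lowest BM25 confidence means an embedding match never outranks a BM25 match when fan-out is ordered by confidence.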
+ +### Unmatched Report + +``` +WARNING: Unmatched contracts (4): + + PROVIDERS without consumers: + http::DELETE::/api/v2/users/{param} (hr/hiring/backend) + topic::employee.terminated (hr/hiring/backend) + + CONSUMERS without providers: + http::GET::/api/v2/departments (hr/hiring/ui) + lib::hr/common::DepartmentDTO (hr/payroll/backend) + + -> To link manually, add to group.yaml links section. +``` + +--- + +## Section 6: CLI Commands and MCP Tools + +### CLI Commands + +All commands live under `gitnexus group`: + +#### `gitnexus group create ` + +Creates `~/.gitnexus/groups//group.yaml` with a template. Errors if group exists (use `--force` to overwrite). + +#### `gitnexus group add ` + +Adds a repo to a group. `` is a name from registry.json, `` is the path in group hierarchy. + +Validations: +- `` must exist in registry.json +- `` must not duplicate within group +- A repo can be in multiple groups (e.g., shared lib) + +#### `gitnexus group remove ` + +Removes a repo from a group. Prompts to re-sync. + +#### `gitnexus group sync [flags]` + +Main command — runs contract extraction and matching cascade. + +``` +$ gitnexus group sync company + +Syncing group "company" (6 repos)... + + [1/6] hr/hiring/backend 12 contracts (8 provider, 4 consumer) + [2/6] hr/hiring/ui 7 contracts (0 provider, 7 consumer) + ... 
+ +Matching cascade: + exact: 18 cross-links (confidence 1.0) + bm25: 4 cross-links (confidence 0.85-0.95) + embedding: 2 cross-links (confidence 0.62-0.78) + unmatched: 3 contracts + +Wrote ~/.gitnexus/groups/company/contracts.json (41 contracts, 24 cross-links) +``` + +Flags: + +| Flag | Default | Description | +|------|---------|-------------| +| `--skip-embeddings` | false | Exact + BM25 only | +| `--exact-only` | false | Exact match only | +| `--force-embeddings` | false | Regenerate embeddings.bin | +| `--allow-stale` | false | Skip stale index warnings | +| `--verbose` | false | Show each cross-link detail | +| `--json` | false | JSON output | + +**Stale index detection** uses the same heuristic as the existing staleness system ([staleness.ts:20](gitnexus/src/mcp/staleness.ts)): compare `meta.json.lastCommit` vs `git rev-parse HEAD`. This is commit-based, not time-based — consistent with the staleness heuristic defined in Section 2 for `contracts.json` (which compares `repoSnapshots[repo].indexedAt` against current `meta.json.indexedAt`). Two complementary checks: + +1. **Repo index staleness** (commit-based): is the repo's own index behind HEAD? → warn before extraction +2. **Contract Registry staleness** (indexedAt-based): was the repo re-indexed since last `group sync`? → warn during `group_impact` + +**Missing repo handling:** If a repo listed in `group.yaml` is not found in `registry.json` (not indexed, deleted, different machine), sync **skips it with a warning** and continues with remaining repos. The missing repo is listed in the sync summary. Contracts from a missing repo are **dropped** from the regenerated `contracts.json` (since sync is a full rebuild, not a patch). The missing repo is recorded in a top-level `missingRepos` array in `contracts.json` for transparency: + +```json +{ + "missingRepos": ["sales/admin/bff"], + ... +} +``` + +#### `gitnexus group list [name]` + +Without argument — all groups. 
With name — details including repo list and subgroup tree. + +#### `gitnexus group contracts ` + +Debug/inspect view of Contract Registry. Flags: `--type`, `--repo`, `--unmatched`, `--json`. + +#### `gitnexus group impact [flags]` + +``` +$ gitnexus group impact company --target UserDTO --repo hr/hiring/backend + +Target: UserDTO (hr/hiring/backend) +Risk: HIGH (cross-repo hits in 2 repos) + +Local (hr/hiring/backend): + d=1 WILL BREAK: + UserController.getUser src/controller/UserController.java:42 + ... + +Cross-repo: + hr/hiring/ui (via http::GET::/api/v2/users/{param}, exact, conf=1.0): + d=1 WILL BREAK: + fetchUser src/api/users.ts:18 + d=2 LIKELY AFFECTED: + UserProfile src/components/UserProfile.tsx:7 + + hr/payroll/backend (via topic::employee.updated, bm25, conf=0.88): + d=1 WILL BREAK: + EmployeeEventHandler src/events/EmployeeEventHandler.java:31 +``` + +Flags: + +| Flag | Default | Description | +|------|---------|-------------| +| `--target` | required | Symbol name | +| `--repo` | required | Repo in group (path or registry name) | +| `--direction` | upstream | upstream / downstream | +| `--cross-depth` | 1 | Hops through boundaries (MVP: capped at 1) | +| `--max-depth` | 3 | Max depth within each repo | +| `--min-confidence` | 0.5 | Min confidence for cross-links | +| `--subgroup` | (all) | Limit fan-out scope: `--subgroup hr/hiring` | +| `--timeout` | 30000 | Total wall time budget in ms | +| `--json` | false | JSON output | + +#### `gitnexus group query ` + +Fan-out of existing `query` across all repos in group, results merged via RRF, grouped by repo path. 
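The RRF merge used by `group query` can be sketched as follows. This is an illustration under stated assumptions: `FlowHit`, `mergeWithRrf`, and `groupByRepo` are hypothetical names, the real `query` result type is not shown in this RFC, and `k = 60` is the commonly used Reciprocal Rank Fusion constant rather than a value confirmed by the source.

```typescript
// Hypothetical shape of one search hit returned by a per-repo `query`.
interface FlowHit {
  repoPath: string;
  flowId: string;
}

interface ScoredHit extends FlowHit {
  score: number;
}

// Reciprocal Rank Fusion: every list a hit appears in contributes
// 1 / (k + rank), where rank is 1-based. k = 60 is an assumed default.
function mergeWithRrf(rankedLists: FlowHit[][], k = 60): ScoredHit[] {
  const merged = new Map<string, ScoredHit>();
  for (const list of rankedLists) {
    list.forEach((hit, index) => {
      const key = `${hit.repoPath}::${hit.flowId}`;
      const entry = merged.get(key) ?? { ...hit, score: 0 };
      entry.score += 1 / (k + index + 1);
      merged.set(key, entry);
    });
  }
  // Array.prototype.sort is stable, so ties keep their insertion order.
  return Array.from(merged.values()).sort((a, b) => b.score - a.score);
}

// Group the fused ranking by repo path, keeping the fused order in each group.
function groupByRepo(hits: ScoredHit[]): Map<string, ScoredHit[]> {
  const groups = new Map<string, ScoredHit[]>();
  for (const hit of hits) {
    const bucket = groups.get(hit.repoPath) ?? [];
    bucket.push(hit);
    groups.set(hit.repoPath, bucket);
  }
  return groups;
}
```

Because each repo contributes a disjoint result list, RRF here mostly interleaves hits by rank; it only sums contributions if the same flow is returned by more than one ranked list.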
+ +#### `gitnexus group status ` + +Quick health check — shows staleness of Contract Registry relative to repo indexes: + +``` +$ gitnexus group status company + +Group: company (last sync: 2026-03-31T10:00:00Z) + + Repo index staleness (meta.lastCommit vs HEAD): + hr/hiring/backend OK (index at HEAD 5838fb8d) + hr/hiring/ui STALE (index at a1b2c3d4, HEAD is e5f6g7h8 — 2 commits behind) + hr/payroll/backend OK + + Contract Registry staleness (repoSnapshot.indexedAt vs meta.indexedAt): + hr/hiring/backend OK (sync matches index) + hr/hiring/ui STALE (re-indexed after last sync) + hr/payroll/backend OK + + Missing repos: + sales/admin/bff MISSING (not in registry.json) +``` + +#### Subgroup Boundary Behavior + +When `--subgroup hr/hiring` is specified, fan-out only follows cross-links where the **target repo** is within the subgroup. Cross-links pointing outside (e.g., from `hr/hiring/backend` to `hr/payroll/backend`) are **not followed** but are listed in the output as "out-of-scope" for transparency. + +### MCP Tools + +| MCP Tool | Parameters | Description | +|----------|-----------|-------------| +| `group_list` | `name?` | List groups or details of one | +| `group_sync` | `name`, `skipEmbeddings?`, `exactOnly?` | Sync Contract Registry | +| `group_contracts` | `name`, `type?`, `repo?`, `unmatchedOnly?` | Show contracts and cross-links | +| `group_impact` | `name`, `target`, `repo`, `direction?`, `crossDepth?`, `maxDepth?`, `minConfidence?`, `subgroup?`, `timeout?` | Cross-index blast radius | +| `group_query` | `name`, `query`, `subgroup?`, `limit?` | Search flows across group | +| `group_status` | `name` | Staleness check for group and repos | + +Mutating operations (`group_create`, `group_add`, `group_remove`) are CLI-only — not exposed as MCP tools. 
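The subgroup boundary rule described under "Subgroup Boundary Behavior" can be sketched as a partition over cross-links. This is a hedged illustration: `isInSubgroup` and `partitionBySubgroup` are hypothetical helpers, and treating subgroup membership as a path-segment prefix test is an assumption. The link shape mirrors the `OutOfScopeLink` interface defined earlier for `GroupImpactResult`.

```typescript
// Same fields as OutOfScopeLink in the GroupImpactResult definition.
interface OutOfScopeLink {
  from: string; // repo path
  to: string; // repo path
  contractId: string;
  confidence: number;
}

// Assumption: a repo is inside a subgroup when its path equals the subgroup
// or starts with "<subgroup>/", so "hr/hiring-legacy" is NOT inside "hr/hiring".
function isInSubgroup(repoPath: string, subgroup: string): boolean {
  return repoPath === subgroup || repoPath.startsWith(`${subgroup}/`);
}

// Split cross-links into those fan-out should follow and those that are only
// reported. Only the target repo (`to`) decides membership.
function partitionBySubgroup(
  links: OutOfScopeLink[],
  subgroup?: string,
): { followed: OutOfScopeLink[]; outOfScope: OutOfScopeLink[] } {
  const followed: OutOfScopeLink[] = [];
  const outOfScope: OutOfScopeLink[] = [];
  for (const link of links) {
    if (!subgroup || isInSubgroup(link.to, subgroup)) {
      followed.push(link);
    } else {
      outOfScope.push(link);
    }
  }
  return { followed, outOfScope };
}
```

With no `subgroup` given, every link is followed and `outOfScope` stays empty, matching the `(all)` default of the `--subgroup` flag.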
+ +### AI Context Integration (CLAUDE.md + AGENTS.md) + +When groups exist, `gitnexus analyze` appends to both `CLAUDE.md` and `AGENTS.md` (consistent with the existing generation pattern in [ai-context.ts:293](gitnexus/src/cli/ai-context.ts)): + +```markdown +## Cross-Repo Groups + +This repo is part of group **company** as `hr/hiring/backend`. +Use `group_impact` instead of `impact` when changes may affect other repos in the group. +``` + +--- + +## Section 7: Limitations and Future Work + +### Known Limitations (MVP) + +1. **Single hop only (MVP)** — cross-boundary traversal is capped at 1 hop. `--cross-depth` flag is accepted but values >1 are ignored with a message. Full E2E chain (UI -> BFF -> Backend -> ML) requires a future `group trace` command. +2. **Contract Registry is full-rebuild** — `group sync` regenerates entirely. Incremental sync (only re-extract changed repos) is a future optimization. +3. **LadybugDB pool limit** — max 5 databases open simultaneously. Groups with 10+ repos will see sequential fan-out with queuing. +4. **Runtime-only connections** — service discovery, feature flags, A/B routing are invisible to static analysis. +5. **Embedding quality for short names** — symbol names like `IUser` vs `UserDTO` may not embed well without field context. +6. **No unified Cypher** — cannot run a single Cypher query across the entire group (each repo has its own database). 
+ +### Future Work + +- **Incremental sync** — detect which repos changed since last sync, re-extract only those +- **`group trace`** — full E2E flow tracing across multiple boundaries +- **Virtual Cypher** — Cypher-like query language that transparently fans out across group databases +- **CI integration** — `group impact` as a PR check ("this change affects 3 other repos") +- **OpenAPI/AsyncAPI import** — generate contracts from spec files instead of extracting from code +- **Dependency drift detection** — alert when a consumer accesses fields that the provider no longer returns +- **Web UI visualization** — graph view showing cross-repo connections in gitnexus-web + +### Demo PR Scope + +A minimal demonstration PR to validate the concept: + +1. **`group.yaml` parser** — read/validate group configuration +2. **`group list`** and **`group status`** CLI commands — show groups, repos, and staleness +3. **`group sync`** with exact-match only — extract HTTP contracts via source scan, build cross-links +4. **`group_impact`** MCP tool — Phase 1 (local) + Phase 2 (cross-boundary fan-out, exact match only) +5. **Tests** — integration test with two small fixture repos (frontend + backend) +6. **Test migration** — update tool/resource count assertions (see Section 9) + +BM25 and embedding matching are out of scope for the demo PR but designed in from the start. + +--- + +## Section 8: Implementation Prerequisites + +Changes required in GitNexus core before group features can function correctly. These should be separate PRs merged before the group feature PR. + +### P1: `impactByUid` — UID-based symbol resolution for impact + +**Problem:** Current `impact` resolves by name with `LIMIT 1` ([local-backend.ts:1347-1352](gitnexus/src/mcp/local/local-backend.ts)). For `group_impact` fan-out, the target symbol is known by UID from the Contract Registry. Name-based resolution creates false positives for common symbol names. 
+ +**Change:** Add an internal `impactByUid(uid: string, direction: string, opts)` function alongside the existing name-based `impact`. The public MCP `impact` tool is unchanged — `impactByUid` is internal-only, called by `group_impact` during Phase 2. + +**Scope:** ~50 lines in `local-backend.ts`. No public API change. No schema change. + +### P2: Impact relationTypes runtime filter alignment + +**Problem:** Impact tool description documents 6 relation types as valid for `relationTypes` parameter: `CALLS`, `IMPORTS`, `EXTENDS`, `IMPLEMENTS`, `HAS_METHOD`, `OVERRIDES` ([tools.ts:203](gitnexus/src/mcp/tools.ts)). But runtime filter only allows 4: `CALLS`, `IMPORTS`, `EXTENDS`, `IMPLEMENTS` ([local-backend.ts:50](gitnexus/src/mcp/local/local-backend.ts)). The additional types `HAS_METHOD` and `OVERRIDES` are silently ignored at runtime. + +**Change:** Expand runtime filter to accept the documented types (`HAS_METHOD`, `OVERRIDES`). This is existing tech debt, not introduced by this RFC, but should be resolved to avoid confusion when `group_impact` inherits the same filter. + +**Scope:** ~10 lines in `local-backend.ts`. + +### P3: (Future, not MVP) Route/FETCHES schema extension + +For full-fidelity HTTP contract extraction from the graph (without source scanning), the ingestion pipeline would need: +- `Route` node label in schema.ts +- `HANDLES_ROUTE` and `FETCHES` relation types in schema.ts +- Route extraction generalized beyond Laravel (Spring Boot, Express, FastAPI, etc.) +- Consumer-side fetch detection persisted as `FETCHES` edges + +This is a **significant change to the ingestion pipeline** and is NOT a prerequisite for MVP. The demo PR uses source-scan extraction instead. This is listed here as future optimization path — once the graph has this data, extractors can switch from source scan to Cypher queries (faster, more accurate). 
+ +--- + +## Section 9: Test Migration Plan + +### Affected test guards + +Adding group CLI commands and MCP tools will break existing shape assertions: + +| Test | Current assertion | After change | +|------|-------------------|--------------| +| [tools.test.ts:14](gitnexus/test/unit/tools.test.ts) | Tool count = 7 | Tool count = 7 + 6 group tools = 13 | +| [resources.test.ts:40](gitnexus/test/unit/resources.test.ts) | Resource count assertion | Unchanged (no new resources) | +| [resources.test.ts:66](gitnexus/test/unit/resources.test.ts) | Resource template count assertion | Unchanged | + +### New tests required + +| Test | Type | Description | +|------|------|-------------| +| `group-config.test.ts` | Unit | Parse/validate group.yaml, handle missing repos, nested paths | +| `contract-extractor.test.ts` | Unit | Each extractor against fixture source files | +| `matching-cascade.test.ts` | Unit | Exact match, BM25 relative scoring, confidence mapping | +| `group-impact.test.ts` | Integration | Two fixture repos (TS frontend + Java backend), end-to-end group_impact | +| `group-cli.test.ts` | Integration | CLI commands: create, add, list, status, sync | +| `group-tools.test.ts` | Unit | MCP tool registration, parameter validation | + +### Fixture repos for integration tests + +Two minimal repos checked into `test/fixtures/group/`: +- `test-frontend/` — TypeScript, contains `fetch('/api/users')` call +- `test-backend/` — Java/TypeScript, contains route handler for `/api/users` + +Both pre-indexed with `.gitnexus/` directories committed as test fixtures. diff --git a/docs/superpowers/plans/2026-04-02-pr626-high-fixes.md b/docs/superpowers/plans/2026-04-02-pr626-high-fixes.md index 0c9204e8c6..42a419b36a 100644 --- a/docs/superpowers/plans/2026-04-02-pr626-high-fixes.md +++ b/docs/superpowers/plans/2026-04-02-pr626-high-fixes.md @@ -1,5 +1,7 @@ # PR #626 HIGH-Priority Fixes Implementation Plan +> **Historical:** This plan was executed for PR #626. 
Contract storage has since migrated from `contracts.json` to `bridge.lbug`. See `2026-04-04-bridge-lbug-grpc-normalization.md` for the current plan. + > **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking. **Goal:** Fix 4 HIGH-priority issues from PR #626 code review before merge. diff --git a/docs/superpowers/plans/2026-04-04-bridge-lbug-grpc-normalization.md b/docs/superpowers/plans/2026-04-04-bridge-lbug-grpc-normalization.md new file mode 100644 index 0000000000..1cfe3675e4 --- /dev/null +++ b/docs/superpowers/plans/2026-04-04-bridge-lbug-grpc-normalization.md @@ -0,0 +1,1328 @@ +# Bridge.lbug & gRPC Canonical ID Implementation Plan + +> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking. + +**Goal:** Migrate contract storage from `contracts.json` to LadybugDB (`bridge.lbug`) and fix gRPC normalization mismatch via proto-aware extraction. + +**Architecture:** Two independent components: (1) bridge.lbug: new `bridge-db.ts` module for LadybugDB lifecycle, `bridge-schema.ts` for DDL, consumer migration in storage/service/cross-impact/sync/CLI; (2) gRPC: `buildProtoMap()` + `resolveProtoConflict()` in grpc-extractor, `serviceContractId()`, wildcard matching in matching.ts. Both integrate at `sync.ts` where matching → write happens. + +**Tech Stack:** TypeScript, LadybugDB (embedded graph DB queried via Cypher), Vitest, Node.js `node:crypto` for SHA-256.
+ +**Spec:** [`docs/superpowers/specs/2026-04-03-bridge-lbug-grpc-normalization-design.md`](../specs/2026-04-03-bridge-lbug-grpc-normalization-design.md) + +--- + +## File Map + +### New Files +| File | Responsibility | +|------|---------------| +| `src/core/group/bridge-schema.ts` | DDL constants for Contract, RepoSnapshot, ContractLink tables; `BRIDGE_SCHEMA_VERSION` | +| `src/core/group/bridge-db.ts` | `openBridgeDb`, `ensureBridgeSchema`, `writeBridge`, `queryBridge`, `closeBridgeDb`, `openBridgeDbReadOnly`, `bridgeExists`, `readBridgeMeta`, `writeBridgeMeta`, `retryRename`, `contractNodeId` | +| `test/unit/group/bridge-db.test.ts` | Unit tests for bridge-db.ts | +| `test/integration/group/bridge-sync.test.ts` | Integration tests for bridge.lbug through syncGroup | + +### Modified Files +| File | What Changes | +|------|-------------| +| `src/core/group/types.ts` | Add `BridgeHandle`, `BridgeMeta`, `LegacyContractRegistry`; add `'wildcard'` to `MatchType` | +| `src/core/group/storage.ts` | Remove `writeContractRegistry`, `readContractRegistry`, `CONTRACTS_FILE`; keep as `readContractRegistryJson` (private); add `openBridgeOrFallback` (imports from bridge-db.ts) | +| `src/core/group/matching.ts` | Export `buildProviderIndex`; `runExactMatch` skips gRPC `/*`; add `runWildcardMatch` | +| `src/core/group/extractors/grpc-extractor.ts` | Add `buildProtoMap`, `resolveProtoConflict`, `serviceContractId`; modify 4 source scanners | +| `src/core/group/sync.ts` | Replace `writeContractRegistry` with `writeBridge`; add wildcard pass | +| `src/core/group/cross-impact.ts` | New `runGroupImpact` with `bridgeQuery`; rename old to `runGroupImpactLegacy` | +| `src/core/group/service.ts` | Use `openBridgeOrFallback`; extend `crossImpactFn` with hint | +| `src/cli/group.ts` | Update sync/impact/status commands | +| `src/mcp/tools.ts` | Update tool descriptions (remove `contracts.json` references) | +| `src/mcp/local/local-backend.ts` | Update `groupImpact`, `groupContracts` | +| 
`test/unit/group/matching.test.ts` | Add wildcard match tests | +| `test/unit/group/grpc-extractor.test.ts` | Add proto map + canonical ID tests | +| `test/unit/group/cross-impact.test.ts` | Direction-dependent Cypher, ref fallback, hint | +| `test/unit/group/sync.test.ts` | Update for bridge.lbug + wildcard pass | +| `test/unit/group/service.test.ts` | Update for openBridgeOrFallback | +| `test/unit/group/storage.test.ts` | Update for removed functions | +| `test/unit/tools.test.ts` | Update tool description assertions if any | +| `test/integration/group/group-impact.test.ts` | Update for bridge.lbug | + +--- + +## Task 1: Types & Schema Foundation + +**Files:** +- Modify: `gitnexus/src/core/group/types.ts` +- Create: `gitnexus/src/core/group/bridge-schema.ts` + +- [ ] **Step 1: Add new types to `types.ts`** + +At the top of `gitnexus/src/core/group/types.ts`, after the existing `MatchType`: + +```typescript +// Line 2: update MatchType +export type MatchType = 'exact' | 'manifest' | 'wildcard' | 'bm25' | 'embedding'; +``` + +At the end of `types.ts`, after `OutOfScopeLink`: + +```typescript +/** + * @deprecated Use bridge.lbug instead. Kept for JSON fallback during migration. + * This is a type alias — ContractRegistry is NOT removed yet. + * In Task 10 (cleanup), ContractRegistry will be renamed to LegacyContractRegistry + * and all imports updated. For now, both names work. + */ +export type LegacyContractRegistry = ContractRegistry; + +/** Opaque handle to an open bridge LadybugDB. */ +export interface BridgeHandle { + /** Internal — do not access directly. */ + readonly _db: unknown; + readonly _conn: unknown; + readonly groupDir: string; +} + +export interface BridgeMeta { + version: number; + generatedAt: string; + missingRepos: string[]; +} +``` + +- [ ] **Step 2: Create `bridge-schema.ts`** + +Create `gitnexus/src/core/group/bridge-schema.ts`: + +```typescript +/** + * Bridge LadybugDB schema for cross-repo Contract Registry. 
+ * Separate from per-repo schema in lbug/schema.ts. + */ + +export const BRIDGE_SCHEMA_VERSION = 1; + +export const CONTRACT_SCHEMA = ` +CREATE NODE TABLE Contract ( + id STRING, + contractId STRING, + type STRING, + role STRING, + repo STRING, + service STRING DEFAULT '', + symbolUid STRING DEFAULT '', + filePath STRING DEFAULT '', + symbolName STRING DEFAULT '', + confidence DOUBLE DEFAULT 0.0, + meta STRING DEFAULT '{}', + PRIMARY KEY (id) +)`; + +export const REPO_SNAPSHOT_SCHEMA = ` +CREATE NODE TABLE RepoSnapshot ( + id STRING, + indexedAt STRING DEFAULT '', + lastCommit STRING DEFAULT '', + PRIMARY KEY (id) +)`; + +export const CONTRACT_LINK_SCHEMA = ` +CREATE REL TABLE ContractLink ( + FROM Contract TO Contract, + matchType STRING, + confidence DOUBLE, + contractId STRING, + fromRepo STRING, + toRepo STRING +)`; + +export const BRIDGE_SCHEMA_QUERIES = [ + CONTRACT_SCHEMA, + REPO_SNAPSHOT_SCHEMA, + CONTRACT_LINK_SCHEMA, +]; +``` + +- [ ] **Step 3: Verify build** + +Run: `cd gitnexus && npm run build` +Expected: Clean build, no errors. 
+ +- [ ] **Step 4: Commit** + +```bash +git add gitnexus/src/core/group/types.ts gitnexus/src/core/group/bridge-schema.ts +git commit -m "feat(group): add BridgeHandle/BridgeMeta types and bridge schema DDL" +``` + +--- + +## Task 2: Bridge DB Core — Open, Schema, Query, Close + +**Files:** +- Create: `gitnexus/src/core/group/bridge-db.ts` +- Create: `gitnexus/test/unit/group/bridge-db.test.ts` + +- [ ] **Step 1: Write failing tests for open/schema/query/close** + +Create `gitnexus/test/unit/group/bridge-db.test.ts`: + +```typescript +import { describe, it, expect, beforeEach, afterEach } from 'vitest'; +import * as fs from 'node:fs/promises'; +import * as path from 'node:path'; +import * as os from 'node:os'; +import { + openBridgeDb, + ensureBridgeSchema, + queryBridge, + closeBridgeDb, +} from '../../../src/core/group/bridge-db.js'; + +describe('bridge-db core', () => { + let tmpDir: string; + + beforeEach(async () => { + tmpDir = await fs.mkdtemp(path.join(os.tmpdir(), 'bridge-test-')); + }); + + afterEach(async () => { + await fs.rm(tmpDir, { recursive: true, force: true }); + }); + + it('test_openBridgeDb_creates_file_and_closes', async () => { + const dbPath = path.join(tmpDir, 'test.lbug'); + const handle = await openBridgeDb(dbPath); + expect(handle).toBeDefined(); + expect(handle.groupDir).toBe(tmpDir); + await closeBridgeDb(handle); + // File should exist after close + await expect(fs.access(dbPath)).resolves.toBeUndefined(); + }); + + it('test_ensureBridgeSchema_creates_tables_idempotent', async () => { + const dbPath = path.join(tmpDir, 'test.lbug'); + const handle = await openBridgeDb(dbPath); + await ensureBridgeSchema(handle); + // Run again — should not throw + await ensureBridgeSchema(handle); + // Verify tables exist by inserting a dummy node + const rows = await queryBridge<{ cnt: number }>( + handle, + 'MATCH (c:Contract) RETURN count(c) AS cnt', + ); + expect(rows[0].cnt).toBe(0); + await closeBridgeDb(handle); + }); + + 
it('test_queryBridge_returns_inserted_data', async () => { + const dbPath = path.join(tmpDir, 'test.lbug'); + const handle = await openBridgeDb(dbPath); + await ensureBridgeSchema(handle); + await queryBridge(handle, `CREATE (c:Contract { + id: 'abc123', contractId: 'http::GET::/api', type: 'http', role: 'provider', + repo: 'backend', confidence: 0.9 + })`); + const rows = await queryBridge<{ repo: string; confidence: number }>( + handle, + 'MATCH (c:Contract) RETURN c.repo AS repo, c.confidence AS confidence', + ); + expect(rows).toHaveLength(1); + expect(rows[0].repo).toBe('backend'); + expect(rows[0].confidence).toBe(0.9); + await closeBridgeDb(handle); + }); +}); +``` + +- [ ] **Step 2: Run tests to verify they fail** + +Run: `cd gitnexus && npx vitest run test/unit/group/bridge-db.test.ts` +Expected: FAIL — `bridge-db.js` doesn't exist. + +- [ ] **Step 3: Implement bridge-db.ts core functions** + +Create `gitnexus/src/core/group/bridge-db.ts`: + +```typescript +import * as fsp from 'node:fs/promises'; +import * as path from 'node:path'; +import { createHash } from 'node:crypto'; +import type { BridgeHandle, BridgeMeta, StoredContract, CrossLink, RepoSnapshot } from './types.js'; +import { BRIDGE_SCHEMA_QUERIES, BRIDGE_SCHEMA_VERSION } from './bridge-schema.js'; + +// LadybugDB native binding — same import path as pool-adapter.ts:19 +import lbug from '@ladybugdb/core'; + +export function contractNodeId( + repo: string, contractId: string, role: string, filePath: string, +): string { + return createHash('sha256') + .update(`${repo}\0${contractId}\0${role}\0${filePath}`) + .digest('hex'); +} + +export async function openBridgeDb(dbPath: string): Promise { + const parentDir = path.dirname(dbPath); + await fsp.mkdir(parentDir, { recursive: true }); + // LadybugDB constructor: (path, bufferManagerSize, enableCompression, readOnly) + // See pool-adapter.ts:265-270 for reference + const db = new lbug.Database(dbPath, 0, false, false); // writable + const conn = new 
lbug.Connection(db); + return { _db: db, _conn: conn, groupDir: parentDir } as BridgeHandle; +} + +export async function ensureBridgeSchema(handle: BridgeHandle): Promise { + const conn = handle._conn as any; + for (const q of BRIDGE_SCHEMA_QUERIES) { + try { + await conn.query(q); + } catch (err: any) { + const msg = err?.message ?? ''; + if (!msg.includes('already exists')) throw err; + } + } +} + +export async function queryBridge( + handle: BridgeHandle, + cypher: string, + params?: Record, +): Promise { + const conn = handle._conn as any; + if (params && Object.keys(params).length > 0) { + // Parameterized query — same pattern as pool-adapter.ts:524-532 + const stmt = await conn.prepare(cypher); + if (!stmt.isSuccess()) { + const errMsg = await stmt.getErrorMessage(); + throw new Error(`Prepare failed: ${errMsg}`); + } + const queryResult = await conn.execute(stmt, params); + const result = Array.isArray(queryResult) ? queryResult[0] : queryResult; + return (await result.getAll()) as T[]; + } + const result = await conn.query(cypher); + return (Array.isArray(result) ? await result[0].getAll() : await result.getAll()) as T[]; +} + +export async function closeBridgeDb(handle: BridgeHandle): Promise { + try { + const conn = handle._conn as any; + await conn.close(); // async — must await before renaming files on Windows + } catch { /* ignore */ } + try { + const db = handle._db as any; + await db.close(); + } catch { /* ignore */ } +} +``` + +- [ ] **Step 4: Run tests to verify they pass** + +Run: `cd gitnexus && npx vitest run test/unit/group/bridge-db.test.ts` +Expected: 3 tests PASS. 
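The `contractNodeId` helper in `bridge-db.ts` above derives the `Contract.id` primary key from four fields. The sketch below repeats that function body and shows, under the assumption that these four fields are the intended identity of a contract node, why the NUL separators matter: without them, different field splits could concatenate to the same string. The sample repo names and paths are illustrative only.

```typescript
import { createHash } from 'node:crypto';

// Same body as `contractNodeId` in bridge-db.ts: SHA-256 over the four
// identity fields, joined with NUL so field boundaries stay unambiguous.
function contractNodeId(
  repo: string,
  contractId: string,
  role: string,
  filePath: string,
): string {
  return createHash('sha256')
    .update(`${repo}\0${contractId}\0${role}\0${filePath}`)
    .digest('hex');
}

// Illustrative inputs: a provider and a consumer of the same HTTP contract
// get distinct node ids, so both rows can coexist in the Contract table.
const providerId = contractNodeId('backend', 'http::GET::/api/users', 'provider', 'src/routes.ts');
const consumerId = contractNodeId('frontend', 'http::GET::/api/users', 'consumer', 'src/api.ts');

// Shifting a character across a field boundary changes the id,
// which plain string concatenation would not guarantee.
const shiftedA = contractNodeId('ab', 'c', 'provider', 'f.ts');
const shiftedB = contractNodeId('a', 'bc', 'provider', 'f.ts');

console.log(providerId !== consumerId, shiftedA !== shiftedB);
```

Since the id is a pure function of its inputs, re-running a sync over unchanged contracts produces the same primary keys.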
+ +- [ ] **Step 5: Commit** + +```bash +git add gitnexus/src/core/group/bridge-db.ts gitnexus/test/unit/group/bridge-db.test.ts +git commit -m "feat(group): bridge-db core — open, schema, query, close" +``` + +--- + +## Task 3: Bridge DB — writeBridge, readBridgeMeta, openBridgeDbReadOnly + +**Files:** +- Modify: `gitnexus/src/core/group/bridge-db.ts` +- Modify: `gitnexus/test/unit/group/bridge-db.test.ts` + +- [ ] **Step 1: Write failing tests for writeBridge round-trip** + +Append to `gitnexus/test/unit/group/bridge-db.test.ts`: + +```typescript +import { + writeBridge, + openBridgeDbReadOnly, + readBridgeMeta, + bridgeExists, +} from '../../../src/core/group/bridge-db.js'; +import type { StoredContract, CrossLink, RepoSnapshot } from '../../../src/core/group/types.js'; + +describe('writeBridge + read', () => { + let tmpDir: string; + + beforeEach(async () => { + tmpDir = await fs.mkdtemp(path.join(os.tmpdir(), 'bridge-write-')); + }); + + afterEach(async () => { + await fs.rm(tmpDir, { recursive: true, force: true }); + }); + + const makeContract = (overrides: Partial = {}): StoredContract => ({ + contractId: 'http::GET::/api/users', + type: 'http', + role: 'provider', + symbolUid: 'uid-1', + symbolRef: { filePath: 'src/routes.ts', name: 'getUsers' }, + symbolName: 'getUsers', + confidence: 0.85, + meta: {}, + repo: 'backend', + ...overrides, + }); + + it('test_writeBridge_creates_bridge_lbug_file', async () => { + await writeBridge(tmpDir, { + contracts: [makeContract()], + crossLinks: [], + repoSnapshots: { backend: { indexedAt: '2026-01-01', lastCommit: 'abc' } }, + missingRepos: ['missing-repo'], + }); + const exists = await bridgeExists(tmpDir); + expect(exists).toBe(true); + }); + + it('test_writeBridge_contracts_queryable', async () => { + await writeBridge(tmpDir, { + contracts: [makeContract(), makeContract({ repo: 'frontend', role: 'consumer' })], + crossLinks: [], + repoSnapshots: {}, + missingRepos: [], + }); + const handle = await 
openBridgeDbReadOnly(tmpDir); + expect(handle).not.toBeNull(); + const rows = await queryBridge<{ repo: string }>(handle!, 'MATCH (c:Contract) RETURN c.repo AS repo'); + expect(rows).toHaveLength(2); + await closeBridgeDb(handle!); + }); + + it('test_writeBridge_meta_json_persists_missingRepos', async () => { + await writeBridge(tmpDir, { + contracts: [], + crossLinks: [], + repoSnapshots: {}, + missingRepos: ['repo-a', 'repo-b'], + }); + const meta = await readBridgeMeta(tmpDir); + expect(meta.missingRepos).toEqual(['repo-a', 'repo-b']); + expect(meta.version).toBeGreaterThan(0); + expect(meta.generatedAt).toBeTruthy(); + }); + + it('test_writeBridge_repoSnapshots_queryable', async () => { + await writeBridge(tmpDir, { + contracts: [], + crossLinks: [], + repoSnapshots: { 'hr/backend': { indexedAt: '2026-01-01', lastCommit: 'abc' } }, + missingRepos: [], + }); + const handle = await openBridgeDbReadOnly(tmpDir); + const rows = await queryBridge<{ id: string; indexedAt: string }>( + handle!, + 'MATCH (s:RepoSnapshot) RETURN s.id AS id, s.indexedAt AS indexedAt', + ); + expect(rows).toHaveLength(1); + expect(rows[0].id).toBe('hr/backend'); + expect(rows[0].indexedAt).toBe('2026-01-01'); + await closeBridgeDb(handle!); + }); + + it('test_writeBridge_crossLinks_queryable', async () => { + const provider = makeContract({ repo: 'backend', role: 'provider' }); + // symbolRef.filePath must match link.from.symbolRef.filePath, because writeBridge derives node ids from it + const consumer = makeContract({ repo: 'frontend', role: 'consumer', symbolRef: { filePath: 'src/api.ts', name: 'fetchUsers' }, symbolName: 'fetchUsers' }); + const link: CrossLink = { + from: { repo: 'frontend', symbolUid: '', symbolRef: { filePath: 'src/api.ts', name: 'fetchUsers' } }, + to: { repo: 'backend', symbolUid: 'uid-1', symbolRef: { filePath: 'src/routes.ts', name: 'getUsers' } }, + type: 'http', + contractId: 'http::GET::/api/users', + matchType: 'exact', + confidence: 1.0, + }; + await writeBridge(tmpDir, { + contracts: [provider, consumer], + crossLinks: [link], + repoSnapshots: {}, + missingRepos: [], + }); + const handle = await 
openBridgeDbReadOnly(tmpDir); + const rows = await queryBridge<{ fromRepo: string; toRepo: string; matchType: string }>( + handle!, + 'MATCH (a:Contract)-[l:ContractLink]->(b:Contract) RETURN l.fromRepo AS fromRepo, l.toRepo AS toRepo, l.matchType AS matchType', + ); + expect(rows).toHaveLength(1); + expect(rows[0].fromRepo).toBe('frontend'); + expect(rows[0].toRepo).toBe('backend'); + expect(rows[0].matchType).toBe('exact'); + await closeBridgeDb(handle!); + }); + + it('test_openBridgeDbReadOnly_returns_null_for_missing', async () => { + const handle = await openBridgeDbReadOnly(path.join(tmpDir, 'nonexistent')); + expect(handle).toBeNull(); + }); + + it('test_bridgeExists_false_for_missing', async () => { + expect(await bridgeExists(path.join(tmpDir, 'nonexistent'))).toBe(false); + }); + + it('test_writeBridge_overwrites_previous', async () => { + await writeBridge(tmpDir, { + contracts: [makeContract()], + crossLinks: [], + repoSnapshots: {}, + missingRepos: [], + }); + await writeBridge(tmpDir, { + contracts: [makeContract({ repo: 'new-repo' })], + crossLinks: [], + repoSnapshots: {}, + missingRepos: [], + }); + const handle = await openBridgeDbReadOnly(tmpDir); + const rows = await queryBridge<{ repo: string }>(handle!, 'MATCH (c:Contract) RETURN c.repo AS repo'); + expect(rows).toHaveLength(1); + expect(rows[0].repo).toBe('new-repo'); + await closeBridgeDb(handle!); + }); + + it('test_readBridgeMeta_returns_defaults_for_missing', async () => { + const meta = await readBridgeMeta(path.join(tmpDir, 'nonexistent')); + expect(meta.version).toBe(0); + expect(meta.generatedAt).toBe(''); + expect(meta.missingRepos).toEqual([]); + }); +}); +``` + +- [ ] **Step 2: Run tests to verify they fail** + +Run: `cd gitnexus && npx vitest run test/unit/group/bridge-db.test.ts` +Expected: FAIL — `writeBridge` etc. not exported. 
+ +- [ ] **Step 3: Implement writeBridge, readBridgeMeta, openBridgeDbReadOnly, bridgeExists** + +Append to `gitnexus/src/core/group/bridge-db.ts`: + +```typescript +const RETRY_CODES = new Set(['EBUSY', 'EPERM', 'EACCES']); + +async function retryRename(src: string, dst: string, attempts = 3): Promise<void> { + for (let i = 1; i <= attempts; i++) { + try { await fsp.rename(src, dst); return; } catch (err: any) { + if (!RETRY_CODES.has(err.code) || i === attempts) throw err; + await new Promise(r => setTimeout(r, 100 * Math.pow(2, i - 1))); + } + } +} + +export async function writeBridgeMeta(groupDir: string, meta: BridgeMeta): Promise<void> { + const target = path.join(groupDir, 'meta.json'); + const tmp = `${target}.tmp.${Date.now()}`; + await fsp.writeFile(tmp, JSON.stringify(meta, null, 2), 'utf-8'); + await fsp.rename(tmp, target); +} + +export async function readBridgeMeta(groupDir: string): Promise<BridgeMeta> { + try { + const content = await fsp.readFile(path.join(groupDir, 'meta.json'), 'utf-8'); + return JSON.parse(content) as BridgeMeta; + } catch { + return { version: 0, generatedAt: '', missingRepos: [] }; + } +} + +export async function writeBridge( + groupDir: string, + data: { + contracts: StoredContract[]; + crossLinks: CrossLink[]; + repoSnapshots: Record<string, RepoSnapshot>; + missingRepos: string[]; + }, +): Promise<void> { + const tempPath = path.join(groupDir, 'bridge.lbug.tmp'); + const finalPath = path.join(groupDir, 'bridge.lbug'); + + await fsp.rm(tempPath, { force: true }); + const tempHandle = await openBridgeDb(tempPath); + try { + await ensureBridgeSchema(tempHandle); + + // Insert contracts + for (const c of data.contracts) { + const id = contractNodeId(c.repo, c.contractId, c.role, c.symbolRef.filePath); + await queryBridge(tempHandle, `CREATE (n:Contract { + id: $id, contractId: $contractId, type: $type, role: $role, + repo: $repo, service: $service, symbolUid: $symbolUid, + filePath: $filePath, symbolName: $symbolName, + confidence: $confidence, meta: $meta + })`, { + id, 
contractId: c.contractId, type: c.type, role: c.role, + repo: c.repo, service: c.service ?? '', symbolUid: c.symbolUid, + filePath: c.symbolRef.filePath, symbolName: c.symbolName, + confidence: c.confidence, meta: JSON.stringify(c.meta), + }); + } + + // Insert cross-links + for (const link of data.crossLinks) { + const fromId = contractNodeId( + link.from.repo, link.contractId, 'consumer', link.from.symbolRef.filePath, + ); + const toId = contractNodeId( + link.to.repo, link.contractId, 'provider', link.to.symbolRef.filePath, + ); + await queryBridge(tempHandle, ` + MATCH (a:Contract), (b:Contract) + WHERE a.id = $fromId AND b.id = $toId + CREATE (a)-[:ContractLink { + matchType: $matchType, confidence: $confidence, + contractId: $contractId, fromRepo: $fromRepo, toRepo: $toRepo + }]->(b) + `, { + fromId, toId, + matchType: link.matchType, confidence: link.confidence, + contractId: link.contractId, + fromRepo: link.from.repo, toRepo: link.to.repo, + }); + } + + // Insert repo snapshots + for (const [repoPath, snap] of Object.entries(data.repoSnapshots)) { + await queryBridge(tempHandle, `CREATE (s:RepoSnapshot { + id: $id, indexedAt: $indexedAt, lastCommit: $lastCommit + })`, { id: repoPath, indexedAt: snap.indexedAt, lastCommit: snap.lastCommit }); + } + + await closeBridgeDb(tempHandle); + } catch (err) { + await closeBridgeDb(tempHandle).catch(() => {}); + await fsp.rm(tempPath, { force: true }); + throw err; + } + + // Atomic swap + const bakPath = path.join(groupDir, 'bridge.lbug.bak'); + await fsp.rm(bakPath, { force: true }); + try { await fsp.access(finalPath); await retryRename(finalPath, bakPath); } catch {} + await retryRename(tempPath, finalPath); + await fsp.rm(bakPath, { force: true }); + + // Write meta.json + await writeBridgeMeta(groupDir, { + version: BRIDGE_SCHEMA_VERSION, + generatedAt: new Date().toISOString(), + missingRepos: data.missingRepos, + }); +} + +export async function openBridgeDbReadOnly(groupDir: string): Promise<BridgeHandle | null> { + const dbPath 
= path.join(groupDir, 'bridge.lbug'); + try { + await fsp.access(dbPath); + } catch { + // Check for .bak recovery + const bakPath = path.join(groupDir, 'bridge.lbug.bak'); + try { + await fsp.access(bakPath); + await fsp.rename(bakPath, dbPath); + } catch { + return null; + } + } + try { + const db = new lbug.Database(dbPath, 0, false, true); // readOnly + const conn = new lbug.Connection(db); + // Version check + const meta = await readBridgeMeta(groupDir); + if (meta.version > 0 && meta.version !== BRIDGE_SCHEMA_VERSION) { + // close() is async (see closeBridgeDb), so await both before returning + await conn.close(); await db.close(); + return null; + } + return { _db: db, _conn: conn, groupDir } as BridgeHandle; + } catch { + return null; + } +} + +export async function bridgeExists(groupDir: string): Promise<boolean> { + const handle = await openBridgeDbReadOnly(groupDir); + if (!handle) return false; + await closeBridgeDb(handle); + return true; +} + +// NOTE: openBridgeOrFallback lives in storage.ts (not bridge-db.ts) per spec. +// It uses readContractRegistryJson (private in storage.ts) for JSON fallback. +// See Task 7 Step 4 for the implementation. +``` + +- [ ] **Step 4: Run tests to verify they pass** + +Run: `cd gitnexus && npx vitest run test/unit/group/bridge-db.test.ts` +Expected: All tests PASS. + +- [ ] **Step 5: Commit** + +```bash +git add gitnexus/src/core/group/bridge-db.ts gitnexus/test/unit/group/bridge-db.test.ts +git commit -m "feat(group): bridge-db writeBridge, readBridgeMeta, openBridgeDbReadOnly" +``` + +--- + +## Task 4: gRPC Proto Map — buildProtoMap & resolveProtoConflict + +**Files:** +- Modify: `gitnexus/src/core/group/extractors/grpc-extractor.ts` +- Modify: `gitnexus/test/unit/group/grpc-extractor.test.ts` + +- [ ] **Step 1: Write failing tests for buildProtoMap** + +Add to `gitnexus/test/unit/group/grpc-extractor.test.ts` a new `describe('buildProtoMap')` block. 
Tests: + +- `test_buildProtoMap_single_proto_parses_package_service_methods` — create a temp dir with a `.proto` file containing `package com.example; service UserService { rpc GetUser(...) returns (...); rpc ListUsers(...) returns (...); }`, call `buildProtoMap(tmpDir)`, assert map has key `'UserService'` with one entry: `{ package: 'com.example', serviceName: 'UserService', methods: ['GetUser', 'ListUsers'], protoPath: ... }`. +- `test_buildProtoMap_no_package_declaration` — proto without `package` → `package: ''`. +- `test_buildProtoMap_no_protos_returns_empty` — empty dir → empty map. +- `test_buildProtoMap_conflicting_names` — two protos with same service name different packages → array of 2. + +- [ ] **Step 2: Run tests to verify they fail** + +Run: `cd gitnexus && npx vitest run test/unit/group/grpc-extractor.test.ts -t "buildProtoMap"` +Expected: FAIL. + +- [ ] **Step 3: Implement buildProtoMap** + +Add to `gitnexus/src/core/group/extractors/grpc-extractor.ts`: + +```typescript +export interface ProtoServiceInfo { + package: string; + serviceName: string; + methods: string[]; + protoPath: string; +} + +export async function buildProtoMap(repoPath: string): Promise<Map<string, ProtoServiceInfo[]>> { + const map = new Map<string, ProtoServiceInfo[]>(); + const protoFiles = await glob('**/*.proto', { cwd: repoPath, absolute: false, nodir: true }); + + for (const rel of protoFiles) { + const content = readSafe(repoPath, rel); + if (!content) continue; + + const pkgMatch = content.match(/^\s*package\s+([\w.]+)\s*;/m); + const pkg = pkgMatch?.[1] ?? ''; + + const serviceBlocks = extractServiceBlocks(content); + for (const block of serviceBlocks) { + const rpcRe = /rpc\s+(\w+)\s*\(/g; + const methods: string[] = []; + let m: RegExpExecArray | null; + while ((m = rpcRe.exec(block.body)) !== null) { + methods.push(m[1]); + } + const info: ProtoServiceInfo = { + package: pkg, + serviceName: block.name, + methods, + protoPath: rel, + }; + const existing = map.get(block.name) ?? 
[]; + existing.push(info); + map.set(block.name, existing); + } + } + return map; +} +``` + +- [ ] **Step 4: Write failing tests for resolveProtoConflict** + +Tests for: single candidate → returns it; multiple → directory proximity; no candidates → null. + +- [ ] **Step 5: Implement resolveProtoConflict** + +```typescript +export function resolveProtoConflict( + serviceName: string, + sourceFilePath: string, + candidates: ProtoServiceInfo[], +): ProtoServiceInfo | null { + if (candidates.length === 0) return null; + if (candidates.length === 1) return candidates[0]; + + const sourceDir = path.dirname(sourceFilePath); + let best = candidates[0]; + let bestScore = 0; + for (const c of candidates) { + const protoDir = path.dirname(c.protoPath); + let shared = 0; + const min = Math.min(sourceDir.length, protoDir.length); + for (let i = 0; i < min; i++) { + if (sourceDir[i] === protoDir[i]) shared++; else break; + } + if (shared > bestScore) { bestScore = shared; best = c; } + } + return best; +} +``` + +- [ ] **Step 6: Add serviceContractId helper** + +```typescript +export function serviceContractId(pkg: string, serviceName: string): string { + const prefix = pkg ? `${pkg}.${serviceName}` : serviceName; + return `grpc::${prefix}/*`; +} +``` + +- [ ] **Step 7: Run all grpc-extractor tests** + +Run: `cd gitnexus && npx vitest run test/unit/group/grpc-extractor.test.ts` +Expected: All PASS. 
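`resolveProtoConflict` in Step 5 scores proximity by counting shared leading characters, so a source file in `services/user` scores 13 against both `services/user` and `services/users-admin`, and the first candidate wins the tie. If that turns out to matter in practice, one possible alternative is to compare whole directory segments. The sketch below is not part of the plan: `sharedDirSegments` is a hypothetical helper, and it assumes POSIX-style relative paths.

```typescript
// Hypothetical alternative scoring: count leading directory segments that two
// POSIX-style paths have in common, instead of leading characters.
function sharedDirSegments(dirA: string, dirB: string): number {
  const a = dirA.split('/').filter(Boolean);
  const b = dirB.split('/').filter(Boolean);
  let shared = 0;
  while (shared < a.length && shared < b.length && a[shared] === b[shared]) {
    shared++;
  }
  return shared;
}
```

With segment scoring, `services/user` shares two segments with `services/user` but only one with `services/users-admin`, so the sibling directory no longer ties.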
+ +- [ ] **Step 8: Commit** + +```bash +git add gitnexus/src/core/group/extractors/grpc-extractor.ts gitnexus/test/unit/group/grpc-extractor.test.ts +git commit -m "feat(group): buildProtoMap, resolveProtoConflict, serviceContractId" +``` + +--- + +## Task 5: gRPC Source Scanners — Proto-Aware Resolution + +**Files:** +- Modify: `gitnexus/src/core/group/extractors/grpc-extractor.ts` +- Modify: `gitnexus/test/unit/group/grpc-extractor.test.ts` + +- [ ] **Step 1: Write failing tests for proto-resolved Go provider** + +Test: given a temp dir with a `.proto` (`package com.example; service UserService { rpc GetUser... }`) and a Go source file with `RegisterUserServiceServer`, calling `extract()` should produce a contract with `contractId: 'grpc::com.example.UserService/*'` and `confidence: 0.8`. + +- [ ] **Step 2: Run test to verify it fails** + +- [ ] **Step 3: Modify `GrpcExtractor.extract()` to build protoMap and pass to scanners** + +In `GrpcExtractor.extract()`, add at the top: +```typescript +const protoMap = await buildProtoMap(repoPath); +``` + +Then pass `protoMap` to each scanner method. Modify each scanner (Go, Java, Python, TS) to accept `protoMap` as a parameter and resolve via `resolveProtoConflict()`. + +For Go provider (`scanGoProviders`): +```typescript +private scanGoProviders(content: string, filePath: string, protoMap: Map<string, ProtoServiceInfo[]>): ExtractedContract[] { + // ... existing regex ... + const serviceName = m[1]; + const candidates = protoMap.get(serviceName); + const proto = resolveProtoConflict(serviceName, filePath, candidates ?? []); + const cid = proto + ? serviceContractId(proto.package, proto.serviceName) + : serviceOnlyContractId(serviceName); + const conf = proto ? 0.8 : 0.65; + // ... push makeContract(cid, 'provider', filePath, ..., conf, ...) ... +} +``` + +Apply similar changes to Go consumer (conf: proto ? 0.75 : 0.55), Java, Python, TS scanners. 
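The Go confidence values quoted in this step could be kept in one lookup table rather than scattered ternaries. This is only a sketch: `GO_GRPC_CONFIDENCE` and `goGrpcConfidence` are hypothetical names, and the table covers only the Go figures stated above, since the plan does not give numbers for the Java and Python scanners.

```typescript
type GrpcRole = 'provider' | 'consumer';

// Confidence values quoted in this step for the Go scanners only.
const GO_GRPC_CONFIDENCE: Record<GrpcRole, { withProto: number; withoutProto: number }> = {
  provider: { withProto: 0.8, withoutProto: 0.65 },
  consumer: { withProto: 0.75, withoutProto: 0.55 },
};

// Pick the confidence for a contract depending on whether a .proto resolved.
function goGrpcConfidence(role: GrpcRole, protoResolved: boolean): number {
  const row = GO_GRPC_CONFIDENCE[role];
  return protoResolved ? row.withProto : row.withoutProto;
}
```

Inside a scanner this would replace `proto ? 0.8 : 0.65` with `goGrpcConfidence('provider', proto !== null)`.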
+ +For TS scanner — keep per-method contracts but add package: +```typescript +const proto = resolveProtoConflict(serviceName, filePath, protoMap.get(serviceName) ?? []); +const pkg = proto?.package ?? ''; +const cid = contractId(pkg, serviceName, methodName); +``` + +- [ ] **Step 4: Write test for fallback (no proto → reduced confidence)** + +Test: Go source with `RegisterFooServer` but no `.proto` file → `contractId: 'grpc::Foo/*'`, `confidence: 0.65`. + +- [ ] **Step 5: Run all grpc-extractor tests** + +Run: `cd gitnexus && npx vitest run test/unit/group/grpc-extractor.test.ts` +Expected: All PASS. + +- [ ] **Step 6: Commit** + +```bash +git add gitnexus/src/core/group/extractors/grpc-extractor.ts gitnexus/test/unit/group/grpc-extractor.test.ts +git commit -m "feat(group): proto-aware gRPC source scanners with confidence adjustments" +``` + +--- + +## Task 6: Matching — buildProviderIndex, runExactMatch skip, runWildcardMatch + +**Files:** +- Modify: `gitnexus/src/core/group/matching.ts` +- Modify: `gitnexus/test/unit/group/matching.test.ts` + +- [ ] **Step 1: Write failing tests for wildcard matching** + +Add tests to `gitnexus/test/unit/group/matching.test.ts`: + +- `test_runExactMatch_skips_grpc_wildcard_contracts` — consumer with `grpc::com.example.UserService/*` and provider with same → NOT matched as exact (both in unmatched). +- `test_runExactMatch_does_not_skip_http_wildcards` — HTTP wildcards still work. +- `test_runWildcardMatch_fq_service_match` — consumer `grpc::com.example.userservice/*` matches provider `grpc::com.example.userservice/GetUser`. +- `test_runWildcardMatch_bare_name_match` — consumer `grpc::userservice/*` matches provider `grpc::com.example.userservice/GetUser`. +- `test_runWildcardMatch_skips_wildcard_providers` — wildcard consumer vs wildcard provider → no match. +- `test_runWildcardMatch_confidence_min` — confidence = min(provider, consumer). +- `test_runWildcardMatch_matchType_is_wildcard` — CrossLink has `matchType: 'wildcard'`. 
+- `test_runWildcardMatch_contractId_is_consumers` — CrossLink has consumer's contractId. + +- [ ] **Step 2: Run tests to verify they fail** + +- [ ] **Step 3: Extract `buildProviderIndex` from `runExactMatch`** + +```typescript +export function buildProviderIndex(contracts: StoredContract[]): Map<string, StoredContract[]> { + const providers = contracts.filter((c) => c.role === 'provider'); + const index = new Map<string, StoredContract[]>(); + for (const p of providers) { + const key = normalizeContractId(p.contractId); + const list = index.get(key) || []; + list.push(p); + index.set(key, list); + } + return index; +} +``` + +- [ ] **Step 4: Modify `runExactMatch` to skip gRPC wildcards and accept optional index** + +```typescript +function isGrpcWildcard(contractId: string): boolean { + return contractId.startsWith('grpc::') && contractId.endsWith('/*'); +} + +export function runExactMatch( + contracts: StoredContract[], + providerIndex?: Map<string, StoredContract[]>, +): MatchResult { + const index = providerIndex ?? buildProviderIndex(contracts); + // Filter OUT gRPC wildcard consumers from exact matching — they go to wildcard pass + const consumers = contracts.filter((c) => c.role === 'consumer' && !isGrpcWildcard(c.contractId)); + // ... rest same as before, using `index` instead of building one ... + // normalUnmatched already excludes matched contracts. + // gRPC wildcards were never passed to exact matching, so they're NOT in normalUnmatched. + // Re-add them to unmatched for the wildcard pass. 
+ const grpcWildcardContracts = contracts.filter((c) => isGrpcWildcard(c.contractId)); + // Dedup: normalUnmatched won't contain wildcards (they were filtered from consumers/providers) + const unmatched = [...normalUnmatched, ...grpcWildcardContracts]; + return { matched, unmatched }; +} +``` + +- [ ] **Step 5: Implement `runWildcardMatch`** + +```typescript +export function runWildcardMatch( + unmatched: StoredContract[], + providerIndex: Map<string, StoredContract[]>, +): { matched: CrossLink[]; remaining: StoredContract[] } { + const wildcardConsumers = unmatched.filter( + (c) => c.role === 'consumer' && isGrpcWildcard(c.contractId), + ); + const matched: CrossLink[] = []; + const matchedConsumerIds = new Set<string>(); + + for (const consumer of wildcardConsumers) { + const normalized = normalizeContractId(consumer.contractId); + const fqService = normalized.slice(normalized.indexOf('::') + 2, -2); + + for (const [key, providers] of providerIndex) { + if (!key.startsWith('grpc::') || key.endsWith('/*')) continue; + const afterPrefix = key.slice(6); + const slashIdx = afterPrefix.indexOf('/'); + if (slashIdx < 0) continue; + const providerFqService = afterPrefix.slice(0, slashIdx); + + const isMatch = providerFqService === fqService + || (!fqService.includes('.') && providerFqService.endsWith('.' 
+ fqService)); + + if (!isMatch) continue; + + for (const provider of providers) { + if (provider.repo === consumer.repo) { + if (!provider.service || !consumer.service || provider.service === consumer.service) continue; + } + matched.push({ + from: { repo: consumer.repo, service: consumer.service, symbolUid: consumer.symbolUid, symbolRef: consumer.symbolRef }, + to: { repo: provider.repo, service: provider.service, symbolUid: provider.symbolUid, symbolRef: provider.symbolRef }, + type: consumer.type, + contractId: consumer.contractId, + matchType: 'wildcard', + confidence: Math.min(provider.confidence, consumer.confidence), + }); + matchedConsumerIds.add(`${consumer.repo}::${consumer.contractId}`); + } + } + } + + const remaining = unmatched.filter((c) => { + if (c.role !== 'consumer' || !isGrpcWildcard(c.contractId)) return true; + return !matchedConsumerIds.has(`${c.repo}::${c.contractId}`); + }); + + return { matched, remaining }; +} +``` + +- [ ] **Step 6: Run tests** + +Run: `cd gitnexus && npx vitest run test/unit/group/matching.test.ts` +Expected: All PASS. + +- [ ] **Step 7: Commit** + +```bash +git add gitnexus/src/core/group/matching.ts gitnexus/test/unit/group/matching.test.ts +git commit -m "feat(group): buildProviderIndex, runExactMatch gRPC skip, runWildcardMatch" +``` + +--- + +## Task 7: Sync — writeBridge + Wildcard Pass + +**Files:** +- Modify: `gitnexus/src/core/group/sync.ts` +- Modify: `gitnexus/src/core/group/storage.ts` +- Modify: `gitnexus/test/unit/group/sync.test.ts` + +- [ ] **Step 1: Write failing test for sync creating bridge.lbug** + +In `gitnexus/test/unit/group/sync.test.ts`, add test: `syncGroup` with `groupDir` option produces a `bridge.lbug` file (use `bridgeExists()`), NOT a `contracts.json`. + +- [ ] **Step 2: Run test to verify it fails** + +- [ ] **Step 3: Update sync.ts** + +Replace `writeContractRegistry` import with `writeBridge` from bridge-db. 
Update the matching + write section: + +```typescript +import { writeBridge } from './bridge-db.js'; +import { buildProviderIndex, runExactMatch, runWildcardMatch } from './matching.js'; + +// ... inside syncGroup, after extraction ... +const providerIndex = buildProviderIndex(autoContracts); +const { matched: exactLinks, unmatched } = runExactMatch(autoContracts, providerIndex); +const { matched: wildcardLinks } = runWildcardMatch(unmatched, providerIndex); +const crossLinks: CrossLink[] = [...manifestResult.crossLinks, ...exactLinks, ...wildcardLinks]; +const allContracts: StoredContract[] = [...manifestResult.contracts, ...autoContracts]; + +if (opts?.groupDir && !opts.skipWrite) { + await writeBridge(opts.groupDir, { + contracts: allContracts, + crossLinks, + repoSnapshots, + missingRepos, + }); +} +``` + +- [ ] **Step 4: Update storage.ts — add openBridgeOrFallback, keep old API temporarily** + +Do NOT remove `writeContractRegistry`/`readContractRegistry` yet — `service.ts` and `cli/group.ts` still depend on them. They will be removed in Task 10 after all consumers are migrated. 
+ +Add `openBridgeOrFallback` to `storage.ts` (per spec — it lives here, not in bridge-db.ts, because it uses private `readContractRegistryJson`): + +```typescript +import { openBridgeDbReadOnly, closeBridgeDb, readBridgeMeta } from './bridge-db.js'; +import type { BridgeHandle, BridgeMeta, LegacyContractRegistry } from './types.js'; + +// Rename existing readContractRegistry to readContractRegistryJson (keep export temporarily) +async function readContractRegistryJson(groupDir: string): Promise<LegacyContractRegistry | null> { + const filePath = path.join(groupDir, CONTRACTS_FILE); + try { + const content = await fsp.readFile(filePath, 'utf-8'); + return JSON.parse(content) as LegacyContractRegistry; + } catch (err: unknown) { + if ((err as NodeJS.ErrnoException).code === 'ENOENT') return null; + throw err; + } +} + +export async function openBridgeOrFallback(groupDir: string): Promise< + { type: 'bridge'; handle: BridgeHandle; meta: BridgeMeta } + | { type: 'json'; registry: LegacyContractRegistry; deprecationWarning: string } + | { type: 'none' } +> { + const handle = await openBridgeDbReadOnly(groupDir); + if (handle) { + const meta = await readBridgeMeta(groupDir); + return { type: 'bridge', handle, meta }; + } + const registry = await readContractRegistryJson(groupDir); + if (registry) { + return { + type: 'json', + registry, + deprecationWarning: 'contracts.json is deprecated. Run "gitnexus group sync " to migrate to bridge.lbug.', + }; + } + return { type: 'none' }; +} +``` + +- [ ] **Step 5: Update storage.test.ts — add openBridgeOrFallback tests** + +Add tests for bridge/json/none fallback paths. Keep existing `writeContractRegistry`/`readContractRegistry` tests (they still pass, functions not removed yet). + +- [ ] **Step 6: Run sync tests** + +Run: `cd gitnexus && npx vitest run test/unit/group/sync.test.ts` +Expected: All PASS. 
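The three-way result of `openBridgeOrFallback` from Step 4 is a discriminated union, so callers can branch on `type` and let the compiler flag a missed case. The sketch below uses simplified stand-ins for `BridgeHandle`, `BridgeMeta`, and `LegacyContractRegistry` (their real shapes live in `types.ts` and are assumed here). `BridgeSource` and `describeSource` are hypothetical names.

```typescript
// Simplified stand-ins for the plan's types (assumptions, not the real definitions).
interface BridgeHandle { groupDir: string }
interface BridgeMeta { version: number; generatedAt: string; missingRepos: string[] }
interface LegacyContractRegistry { contracts: unknown[] }

type BridgeSource =
  | { type: 'bridge'; handle: BridgeHandle; meta: BridgeMeta }
  | { type: 'json'; registry: LegacyContractRegistry; deprecationWarning: string }
  | { type: 'none' };

// Summarise which storage path a caller ended up on.
function describeSource(src: BridgeSource): string {
  switch (src.type) {
    case 'bridge':
      return `bridge.lbug v${src.meta.version}`;
    case 'json':
      return `legacy JSON (${src.deprecationWarning})`;
    case 'none':
      return 'no contract data';
    default: {
      // Compile-time exhaustiveness check: a new variant would fail to assign here.
      const unreachable: never = src;
      return unreachable;
    }
  }
}
```

The `never` assignment in the default branch is a common TypeScript idiom: adding a fourth variant to the union without handling it becomes a compile error rather than a silent fall-through.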
+ +- [ ] **Step 7: Commit** + +```bash +git add gitnexus/src/core/group/sync.ts gitnexus/src/core/group/storage.ts gitnexus/test/unit/group/sync.test.ts gitnexus/test/unit/group/storage.test.ts +git commit -m "feat(group): sync writes to bridge.lbug with wildcard matching pass" +``` + +--- + +## Task 8: Cross-Impact — Cypher-Based Phase 2 + +**Files:** +- Modify: `gitnexus/src/core/group/cross-impact.ts` +- Modify: `gitnexus/test/unit/group/cross-impact.test.ts` + +- [ ] **Step 1: Write failing tests for new runGroupImpact with bridgeQuery** + +Tests for: +- Upstream direction: Cypher matches provider side, fans out to consumer repo. +- Downstream direction: Cypher matches consumer side, fans out to provider repo. +- Ref fallback: empty symbolUid still matches via filePath+symbolName. +- Subgroup filtering on fan-out side. +- Confidence ordering (high-confidence links processed first). + +- [ ] **Step 2: Run tests to verify they fail** + +- [ ] **Step 3: Rename current `runGroupImpact` → `runGroupImpactLegacy`** + +Keep the current function unchanged but renamed. Export both. 
+ +- [ ] **Step 4: Implement new `runGroupImpact` with `bridgeQuery`** + +Update `GroupImpactOptions` interface in `cross-impact.ts`: +```typescript +export interface GroupImpactOptions { + groupName: string; + target: string; + repoPath: string; + direction: 'upstream' | 'downstream'; + bridgeQuery: (cypher: string, params: Record) => Promise; + localImpactFn: (target: string, direction: string) => Promise; + crossImpactFn: ( + targetGroupPath: string, + symbolUid: string, + direction: string, + hint?: { filePath: string; symbolName: string }, + ) => Promise; + maxDepth?: number; + minConfidence?: number; + subgroup?: string; + timeout?: number; + crossDepth?: number; +} +``` + +Phase 2 loop must pass `hint` when `fanOutUid` is empty: +```typescript +for (const row of rows) { + if (Date.now() > wallDeadline) { truncated = true; break; } + if (crossDepth < 1) break; + + // Pass hint for name-based fallback when UID is empty (gRPC contracts) + const hint = row.fanOutUid + ? undefined + : { filePath: row.fanOutFilePath, symbolName: row.fanOutSymbolName }; + const remote = await opts.crossImpactFn(row.fanOutRepo, row.fanOutUid, opts.direction, hint); + + // Guard: impact() returns { error: ... } on not-found (truthy but not a real result) + if (remote && typeof remote === 'object' && !('error' in (remote as Record))) { + // ... count as successful fan-out ... 
+ } +} +``` + +Key constants: +```typescript +const UPSTREAM_QUERY = ` +MATCH (consumer:Contract)-[l:ContractLink]->(provider:Contract) +WHERE provider.repo = $sourceRepo + AND (provider.symbolUid IN $localUids + OR (NOT provider.symbolUid IN $localUids AND (provider.filePath + '::' + provider.symbolName) IN $localRefs)) + AND l.confidence >= $minConfidence + AND ($subgroup IS NULL OR consumer.repo = $subgroup OR consumer.repo STARTS WITH $subgroup + '/') +RETURN consumer.repo AS fanOutRepo, consumer.symbolUid AS fanOutUid, + consumer.filePath AS fanOutFilePath, consumer.symbolName AS fanOutSymbolName, + provider.symbolUid AS matchedLocalUid, + l.matchType AS matchType, l.confidence AS confidence, l.contractId AS contractId +ORDER BY l.confidence DESC`; + +const DOWNSTREAM_QUERY = `...`; // mirror with consumer/provider swapped +``` + +- [ ] **Step 5: Run tests** + +Run: `cd gitnexus && npx vitest run test/unit/group/cross-impact.test.ts` +Expected: All PASS. + +- [ ] **Step 6: Commit** + +```bash +git add gitnexus/src/core/group/cross-impact.ts gitnexus/test/unit/group/cross-impact.test.ts +git commit -m "feat(group): Cypher-based cross-impact with direction-dependent queries" +``` + +--- + +## Task 9: Service Layer — openBridgeOrFallback Integration + +**Files:** +- Modify: `gitnexus/src/core/group/service.ts` +- Modify: `gitnexus/test/unit/group/service.test.ts` + +- [ ] **Step 1: Write failing tests** + +Tests for `groupImpact`: +- With bridge.lbug → uses new `runGroupImpact` with `bridgeQuery`. +- With JSON fallback → uses `runGroupImpactLegacy` with registry. +- With no data → returns error. +- `crossImpactFn` hint param: empty UID → name-based fallback; error-object guard. + +Tests for `groupStatus`: +- With bridge → reads meta.json + RepoSnapshot Cypher query. +- With JSON → reads from legacy registry. + +Tests for `groupContracts`: +- With bridge → Cypher query with type/repo filters. 
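The error-object guard referenced in the `crossImpactFn` tests could be factored into a small type guard so that the Phase 2 loop and the service layer share one definition. This is a sketch under that assumption. `isImpactResult` is a hypothetical name, and it only checks for the absence of an `error` key, which is the condition the plan describes.

```typescript
// Hypothetical helper: true only for object results that do not carry an `error` key.
function isImpactResult(value: unknown): value is Record<string, unknown> {
  return typeof value === 'object' && value !== null && !('error' in value);
}
```

A call site would then read `if (isImpactResult(remote)) { ... }` in place of the inline `remote && typeof remote === 'object' && !('error' in ...)` check.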
+ +- [ ] **Step 2: Run tests to verify they fail** + +- [ ] **Step 3: Update service.ts** + +Import `openBridgeOrFallback`, `queryBridge`, `closeBridgeDb`, `readBridgeMeta` from `bridge-db.ts`. Update `groupImpact()`, `groupContracts()`, `groupStatus()` to use `openBridgeOrFallback` with bridge/json/none branching. + +Extend `crossImpactFn` with `hint` parameter and error-object guard (as specified in the design). + +- [ ] **Step 4: Run tests** + +Run: `cd gitnexus && npx vitest run test/unit/group/service.test.ts` +Expected: All PASS. + +- [ ] **Step 5: Commit** + +```bash +git add gitnexus/src/core/group/service.ts gitnexus/test/unit/group/service.test.ts +git commit -m "feat(group): service layer uses openBridgeOrFallback with bridge/json/none" +``` + +--- + +## Task 10: CLI, MCP Tools, Remaining Tests & Cleanup + +**Files:** +- Modify: `gitnexus/src/cli/group.ts` +- Modify: `gitnexus/src/mcp/tools.ts` +- Modify: `gitnexus/src/mcp/local/local-backend.ts` +- Modify: `gitnexus/test/unit/tools.test.ts` +- Modify: `gitnexus/test/integration/group/group-impact.test.ts` +- Create: `gitnexus/test/integration/group/bridge-sync.test.ts` + +- [ ] **Step 1: Update CLI group.ts** + +Update `sync` command: remove `readContractRegistry` references. The sync command calls `syncGroup` which now writes to bridge.lbug internally. + +Update `impact` command: use `openBridgeOrFallback` check instead of `readContractRegistry`. Print deprecation warning from result if JSON fallback. + +Update `status` command: similar migration. + +- [ ] **Step 2: Update MCP tool descriptions** + +In `gitnexus/src/mcp/tools.ts`: + +```typescript +// group_sync description: +'Rebuild the Contract Registry (bridge.lbug) for a group: extract HTTP/gRPC/topic contracts, apply manifest links, exact-match and wildcard cross-links.' + +// group_contracts description: +'Inspect contracts and cross-links from the group bridge graph.' 
+``` + +- [ ] **Step 3: Update local-backend.ts (verify pass-through)** + +`groupImpact` and `groupContracts` in `local-backend.ts` already delegate to `GroupService` (lines 2469, 2473). Verify they don't directly use `readContractRegistry` — if not, no changes needed here. + +- [ ] **Step 4: Cleanup storage.ts — remove old API** + +Now that all consumers (sync, service, CLI) use the new bridge path: +- Remove `writeContractRegistry` (public export) +- Remove `readContractRegistry` (public export) — `readContractRegistryJson` (private) remains for fallback +- Remove `CONTRACTS_FILE` constant +- Rename `ContractRegistry` to `LegacyContractRegistry` in `types.ts` (the alias added in Task 1 becomes the only name; update all imports in cross-impact.ts, storage.ts, service.ts) +- Update `storage.test.ts`: remove tests for deleted functions, add/keep tests for `openBridgeOrFallback` +- Update `test/unit/group/types.test.ts` if it asserts on `ContractRegistry` name + +- [ ] **Step 5: Write bridge-sync integration test** + +Create `gitnexus/test/integration/group/bridge-sync.test.ts` with end-to-end test: create group config → sync → verify bridge.lbug exists → query contracts via Cypher → verify cross-links. + +- [ ] **Step 6: Update existing integration tests** + +Update `test/integration/group/group-impact.test.ts` to work with bridge.lbug instead of contracts.json. + +- [ ] **Step 7: Run full test suite** + +Run: `cd gitnexus && npx vitest run test/unit/group/ test/unit/tools.test.ts test/integration/group/` +Expected: All PASS. + +- [ ] **Step 8: Build check** + +Run: `cd gitnexus && npm run build` +Expected: Clean build. 
+ +- [ ] **Step 9: Commit** + +```bash +git add -A +git commit -m "feat(group): CLI/MCP cleanup, old API removal, integration tests" +``` + +--- + +## Execution Notes + +### LadybugDB API Reference (from pool-adapter.ts) +- **Import:** `import lbug from '@ladybugdb/core'` (pool-adapter.ts:19) +- **Constructor:** `new lbug.Database(path, bufferManagerSize, enableCompression, readOnly)` — 4 positional args (pool-adapter.ts:265-270) +- **Writable:** `new lbug.Database(path, 0, false, false)` +- **Read-only:** `new lbug.Database(path, 0, false, true)` +- **Parameterized query:** `const stmt = await conn.prepare(cypher); stmt.isSuccess(); const result = await conn.execute(stmt, params)` (pool-adapter.ts:524-532) +- **Result extraction:** `const rows = await result.getAll()` (pool-adapter.ts:531) +- **stdout suppression:** pool-adapter uses `silenceStdout()`/`restoreStdout()` around DB operations — bridge-db should do the same if LadybugDB prints warnings + +### DDL Syntax +- Follow `schema.ts` pattern: backtick-wrapped template literals, `PRIMARY KEY (id)` as separate clause (not inline), try/catch for "already exists" errors instead of `IF NOT EXISTS` +- Verify actual DDL syntax works during Task 1 Step 3 (build check) + +### Known Issues to Address During Implementation +- **CrossLink dedup:** If proto+source contracts create duplicate CrossLinks (same `from.repo, to.repo, contractId`), dedup in `writeBridge` before inserting +- **gRPC symbolName quality:** When passing `hint.symbolName` for fan-out, strip technical prefixes (`Register`, `New`, `Server`, `Client`, `Stub`) to get the bare service name for `impact()` resolution. Add a `stripGrpcPrefix(name: string)` helper in grpc-extractor.ts +- **`BridgeHandle` typing:** `_db` and `_conn` are `unknown` in the interface (opaque). Cast to `any` in bridge-db.ts internally. 
If better typing is needed, import `lbug.Database`/`lbug.Connection` types +- **`LegacyContractRegistry` imports:** Update imports in `cross-impact.ts`, `storage.ts`, `service.ts` to use `LegacyContractRegistry` instead of `ContractRegistry` +- **`fromRepo`/`toRepo` denormalization:** Included in schema but not used by current Cypher queries. Keep for now; remove if not needed after all tests pass +- **`readSafe` reuse:** `buildProtoMap` uses `readSafe` which is module-level in grpc-extractor.ts (not exported). This is fine since `buildProtoMap` lives in the same file +- **`groupStatus` contractsStale:** Must query `RepoSnapshot` nodes from bridge.lbug and compare `indexedAt` with per-repo `meta.json`. Implement in Task 9 service layer +- **Integration test fixtures:** Use existing `test/fixtures/group/test-monorepo/` for bridge-sync integration tests +- **Validation command:** Use `npm run build` (which runs `node scripts/build.js` including tsc) for build validation +- **Cypher RETURN completeness:** Ensure both Cypher queries return `matchedLocalFilePath` and `matchedLocalSymbolName` per spec (Task 8) +- **`local-backend.ts` may need no changes:** `groupImpact`/`groupContracts` already delegate to `GroupService` (lines 2469, 2473). Verify during Task 10 +- **Test coverage gaps vs spec:** Spec lists corrupted bridge, fallback precedence, read-only rejection scenarios. Ensure bridge-db.test.ts covers all of them (Task 3 tests partially cover; add missing during Task 10) +- **`types.test.ts` updates:** If any tests assert on `ContractRegistry` name, update to `LegacyContractRegistry` during Task 10 cleanup +- **UX standardization:** All commands return `{ error: "No contract data. Run 'gitnexus group sync '." 
}` when no data source — implement consistently in Task 9 (service layer) diff --git a/docs/superpowers/specs/2026-04-03-bridge-lbug-grpc-normalization-design.md b/docs/superpowers/specs/2026-04-03-bridge-lbug-grpc-normalization-design.md new file mode 100644 index 0000000000..d72c3724c5 --- /dev/null +++ b/docs/superpowers/specs/2026-04-03-bridge-lbug-grpc-normalization-design.md @@ -0,0 +1,922 @@ +# Design: Bridge.lbug Storage & gRPC Canonical ID Normalization + +**Date:** 2026-04-03 +**PR:** #606 (cross-repo impact analysis via repository groups) +**Trigger:** Review feedback from abhigyanpatwari on PR #626 ([review](https://github.com/abhigyanpatwari/GitNexus/pull/626#pullrequestreview-4055362547)) +**Status:** Draft + +## Context + +PR #606 adds cross-repo impact analysis (`group_impact`) using a Contract Registry stored as `contracts.json`. The PR #626 reviewer requested two changes before merge: + +1. **Contract storage → bridge.lbug**: The virtual bridge graph needs Cypher-queryable edges for cross-repo impact traversal instead of static JSON. +2. **gRPC normalization mismatch**: `grpc::ServiceName/*` (from source code scanners) vs `grpc::pkg.Service/Method` (from .proto files) normalize differently, causing silent matching failures in cross-repo scenarios. + +## Component 1: Bridge.lbug — Contract Storage in LadybugDB + +### Architecture + +One writable LadybugDB per group at `groups//bridge.lbug`. This DB is a **single file** (not a directory — LadybugDB/DuckDB uses file-based storage; see `lbug-adapter.ts:140`). It contains the Contract Registry as a queryable graph — contracts as nodes, cross-links as edges. 
+ +``` +groups/ + my-group/ + group.yaml # Group config (unchanged) + bridge.lbug # LadybugDB file (replaces contracts.json) + meta.json # Bridge metadata: { version, generatedAt, missingRepos } +``` + +### Schema + +New tables in bridge.lbug (separate from per-repo schema in `schema.ts`): + +**Node tables:** + +```sql +CREATE NODE TABLE IF NOT EXISTS Contract ( + id STRING PRIMARY KEY, -- full SHA-256 hex (64 chars) of "{repo}\0{contractId}\0{role}\0{filePath}" + contractId STRING, -- "http::GET /api/users", "grpc::pkg.Svc/Method" + type STRING, -- "http" | "grpc" | "topic" | "lib" | "custom" + role STRING, -- "provider" | "consumer" + repo STRING, -- repo path within group (e.g. "hr/hiring/backend") + service STRING DEFAULT '', -- service boundary within monorepo (from StoredContract.service, set by assignService() in sync.ts) + symbolUid STRING DEFAULT '', + filePath STRING DEFAULT '', + symbolName STRING DEFAULT '', + confidence DOUBLE DEFAULT 0.0, + meta STRING DEFAULT '{}' -- JSON-serialized Record +); + +CREATE NODE TABLE IF NOT EXISTS RepoSnapshot ( + id STRING PRIMARY KEY, -- repo path within group (e.g. "hr/hiring/backend") + indexedAt STRING DEFAULT '', + lastCommit STRING DEFAULT '' +); +``` + +**Relation table:** + +```sql +CREATE REL TABLE IF NOT EXISTS ContractLink ( + FROM Contract TO Contract, + matchType STRING, -- "exact" | "manifest" | "wildcard" | "bm25" | "embedding" + confidence DOUBLE, + contractId STRING, -- consumer's contractId (consistent with runExactMatch which stores consumer.contractId) + fromRepo STRING, -- denormalized source repo for index-only lookups + toRepo STRING -- denormalized target repo for index-only lookups +); +``` + +`fromRepo` / `toRepo` are denormalized onto ContractLink to avoid expensive JOINs when filtering by repo. + +**Bridge metadata** is stored as `meta.json` in the group directory (alongside `bridge.lbug`). 
Contains: + +```typescript +interface BridgeMeta { + version: number; // BRIDGE_SCHEMA_VERSION from bridge-schema.ts + // On version mismatch: writeBridge() recreates from scratch (no ALTER TABLE). + // openBridgeDbReadOnly() checks version; returns null if incompatible. + generatedAt: string; // ISO timestamp of last sync + missingRepos: string[]; // repos that failed to sync (preserved for groupStatus) +} +``` + +`missingRepos` is stored in `meta.json` rather than as DB nodes because it's metadata about the sync process, not queryable graph data. `groupStatus()` reads it from `meta.json` (matching the current behavior where it reads `registry.missingRepos`). + +### Contract Primary Key + +The `Contract.id` uses a **full SHA-256 hash** (64 hex chars) to avoid both delimiter collisions and birthday-problem collisions. Contract IDs contain `::` and `/` which make composite string keys ambiguous. + +```typescript +import { createHash } from 'node:crypto'; + +function contractNodeId(repo: string, contractId: string, role: string, filePath: string): string { + return createHash('sha256') + .update(`${repo}\0${contractId}\0${role}\0${filePath}`) + .digest('hex'); // full 64 hex chars — no truncation +} +``` + +**Why full hash:** LadybugDB PRIMARY KEY is the only uniqueness enforcement mechanism (no UNIQUE constraints on columns). A truncated hash risks silent overwrites on collision. Full SHA-256 makes collisions astronomically improbable. + +**Why `filePath` in the hash:** A proto-file contract (`filePath: "proto/user.proto"`) and a source-resolved contract (`filePath: "src/server.go"`) for the same `contractId` + `role` + `repo` are **different Contract nodes**. Both are stored — matching works by `contractId`, not by `id`. See "No Dedup Between Proto and Source" below. + +### New Module: `bridge-db.ts` + +Location: `gitnexus/src/core/group/bridge-db.ts` + +**Public API:** + +```typescript +/** + * Open or create a LadybugDB at the given file path (writable mode). 
+ * Used internally by writeBridge() for temp DB. Not typically called by consumers. + */ +export async function openBridgeDb(dbPath: string): Promise<BridgeHandle>; + +/** Apply schema (CREATE TABLE IF NOT EXISTS). Idempotent. */ +export async function ensureBridgeSchema(handle: BridgeHandle): Promise<void>; + +/** + * Write contract data to a new bridge.lbug, then atomically swap it into place. + * Creates a temporary DB at bridge.lbug.tmp, inserts all data, then renames + * tmp → final. If insertion fails, the existing bridge.lbug is untouched. + */ +export async function writeBridge( + groupDir: string, + data: { + contracts: StoredContract[]; + crossLinks: CrossLink[]; + // Keyed by repo path within the group; fields mirror the RepoSnapshot node table. + repoSnapshots: Record<string, { indexedAt: string; lastCommit: string }>; + missingRepos: string[]; + }, +): Promise<void>; + +/** Execute a read query against the bridge graph. */ +export async function queryBridge( + handle: BridgeHandle, + cypher: string, + params?: Record<string, unknown>, +): Promise<Record<string, unknown>[]>; + +/** Close the bridge DB connection. Must be called after openBridgeDbReadOnly(). */ +export async function closeBridgeDb(handle: BridgeHandle): Promise<void>; + +/** + * Open bridge.lbug in read-only mode (for MCP/CLI reads). + * Returns null if file is missing or corrupt (wraps open in try/catch). + * Caller MUST call closeBridgeDb() when done. + * + * Usage (read path): + * const handle = await openBridgeDbReadOnly(groupDir); + * if (!handle) { ... } // fallback: bridge missing or corrupt + * const rows = await queryBridge(handle, cypher, params); + * await closeBridgeDb(handle); + */ +export async function openBridgeDbReadOnly(groupDir: string): Promise<BridgeHandle | null>; + +/** + * Check if bridge.lbug exists and is openable. + * Delegates to openBridgeDbReadOnly + closeBridgeDb. + */ +export async function bridgeExists(groupDir: string): Promise<boolean>; +``` + +**BridgeHandle** is an opaque wrapper around a LadybugDB `Database` + `Connection`, similar to how `pool-adapter.ts` manages per-repo DBs but simpler (single connection, no pool needed: bridge writes are sequential during sync, reads are single-query). 
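The read path in the usage comment above does not close the handle if the query throws. A minimal sketch of one way to guarantee the close, assuming callers want try/finally semantics: `withBridgeRead` and its parameter names are hypothetical and not part of the `bridge-db.ts` API. The opener and closer are passed in as functions so the sketch does not depend on LadybugDB.

```typescript
// Hypothetical helper (not part of bridge-db.ts): runs `fn` against a handle
// and always closes it, mirroring openBridgeDbReadOnly() + closeBridgeDb().
async function withBridgeRead<H, T>(
  open: () => Promise<H | null>,        // e.g. () => openBridgeDbReadOnly(groupDir)
  close: (handle: H) => Promise<void>,  // e.g. closeBridgeDb
  fn: (handle: H) => Promise<T>,        // the queries to run while the handle is open
  onMissing: () => T,                   // result when the bridge is missing or corrupt
): Promise<T> {
  const handle = await open();
  if (handle === null) return onMissing();
  try {
    return await fn(handle);
  } finally {
    // Runs on success and on error, so the handle is never leaked.
    await close(handle);
  }
}
```

With such a wrapper, a consumer like `groupContracts()` would only supply the query callback and a fallback value. Whether the fallback belongs here or in `openBridgeOrFallback()` is a design choice this sketch leaves open.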
+ +**Lifecycle clarification:** +- **Write path:** `writeBridge(groupDir, data)` — manages its own DB lifecycle internally (open temp → write → close → rename). Callers don't need open/close. +- **Read path:** `openBridgeDbReadOnly()` + `queryBridge()` + `closeBridgeDb()` — callers manage the handle. This is used by `groupContracts()`, `groupImpact()`, etc. + +### Transaction Safety + +`writeBridge()` uses a **write-to-temp-then-rename** strategy. DuckDB/LadybugDB does not support rollback of DDL operations (DROP/CREATE TABLE), so we cannot rely on database transactions for atomic replacement. Since LadybugDB stores databases as single files, `fs.rename` is an atomic file operation on POSIX and near-atomic on Windows. + +**Strategy:** + +```typescript +async function writeBridge(groupDir: string, data: ...): Promise { + const tempPath = path.join(groupDir, 'bridge.lbug.tmp'); + const finalPath = path.join(groupDir, 'bridge.lbug'); + + // 1. Write to a temporary bridge DB file + await fs.rm(tempPath, { force: true }); + const tempHandle = await openBridgeDb(tempPath); + try { + await ensureBridgeSchema(tempHandle); + // Bulk insert Contract nodes (COPY or individual inserts) + // Bulk insert ContractLink edges + // Insert RepoSnapshot nodes + await closeBridgeDb(tempHandle); + } catch (err) { + await closeBridgeDb(tempHandle).catch(() => {}); + await fs.rm(tempPath, { force: true }); + throw err; + } + + // 2. Atomic swap: rename temp → final (move old to .bak, rename temp, remove .bak) + // retryRename() handles Windows EBUSY/EPERM/EACCES with exponential backoff. + const bakPath = path.join(groupDir, 'bridge.lbug.bak'); + await fs.rm(bakPath, { force: true }); + try { await fs.access(finalPath); await retryRename(finalPath, bakPath); } catch {} + await retryRename(tempPath, finalPath); + await fs.rm(bakPath, { force: true }); + + // 3. 
Write meta.json via atomic temp-file rename (meta.json.tmp → meta.json) + await writeBridgeMeta(groupDir, { + version: BRIDGE_SCHEMA_VERSION, + generatedAt: new Date().toISOString(), + missingRepos: data.missingRepos, + }); +} + +/** Rename with retry for Windows file-locking errors. */ +const RETRY_CODES = new Set(['EBUSY', 'EPERM', 'EACCES']); +async function retryRename(src: string, dst: string, attempts = 3): Promise<void> { + for (let i = 1; i <= attempts; i++) { + try { await fs.rename(src, dst); return; } catch (err: any) { + if (!RETRY_CODES.has(err.code) || i === attempts) throw err; + await new Promise(r => setTimeout(r, 100 * Math.pow(2, i - 1))); + } + } +} +``` + +**Guarantees:** +- If insertion fails, `bridge.lbug.tmp` is cleaned up; `bridge.lbug` is untouched +- The rename sequence (old→bak, tmp→final, rm bak) minimizes the window where neither exists +- Windows EBUSY/EPERM/EACCES: `retryRename()` makes up to 3 attempts with exponential backoff between them (100ms after the first failure, 200ms after the second) +- If a crash occurs between renames: `bridge.lbug.bak` exists and can be restored manually; `openBridgeDbReadOnly()` checks for `.bak` as a recovery hint +- meta.json is written last via atomic temp-file rename (`meta.json.tmp` → `meta.json`); if the meta write fails after a successful DB swap, data is fresh but `generatedAt` and `missingRepos` are stale, and the next sync fixes both +- If EBUSY/EPERM/EACCES persists after 3 attempts, sync fails with an explicit error suggesting that the user close MCP readers and retry + +**Corruption recovery:** If `bridge.lbug` exists but is unreadable, `openBridgeDbReadOnly()` returns `null`. Callers treat this the same as "no bridge" and follow the backward compatibility flow (see below). + +### Lifecycle + +**Write path** (`group sync`): +1. `syncGroup()` extracts contracts and runs matching (unchanged) +2. Instead of `writeContractRegistry(groupDir, registry)` → calls `writeBridge(groupDir, data)` +3. 
`writeBridge()` manages its own DB lifecycle internally (open temp → write → close → rename) + +**Read path** (`group_impact`, `group_contracts`, `group_status`): +1. Open bridge.lbug in read-only mode +2. Execute Cypher queries +3. Close when done + +### Consumer Migration + +| Consumer | Before | After | +|----------|--------|-------| +| `cross-impact.ts: runGroupImpact()` | Iterates `registry.crossLinks[]` in JS | Cypher query against bridge.lbug | +| `service.ts: groupContracts()` | `readContractRegistry()` → filter in JS | Cypher: `MATCH (c:Contract) WHERE c.type = $type RETURN ...`. Note: pre-existing bug where `--unmatched` uses `consumer.contractId` to check providers — out of scope for this PR but noted for follow-up. | +| `service.ts: groupImpact()` | Passes `registry` object | Passes `BridgeHandle` (or bridge executor fn) | +| `service.ts: groupStatus()` | `readContractRegistry()` for generatedAt + missingRepos + repoSnapshots | Uses `openBridgeOrFallback()`: bridge → reads `meta.json` + Cypher `MATCH (s:RepoSnapshot) RETURN s`; json → reads from `LegacyContractRegistry`; none → returns empty status | +| `cli/group.ts: status` | `readContractRegistry()` | Uses service layer (unchanged CLI output) | +| `cli/group.ts: impact` | Checks `readContractRegistry()` | Uses `openBridgeOrFallback()` (see backward compat) | + +### Deleted / Renamed Code + +- `storage.ts`: `writeContractRegistry()`, `readContractRegistry()` removed (public API); `CONTRACTS_FILE` removed. `readContractRegistryJson()` kept as private fallback. +- `types.ts`: `ContractRegistry` **renamed to `LegacyContractRegistry`** (not deleted). It's still used by: `cross-impact.ts` (JSON fallback path in `openBridgeOrFallback`), `storage.ts` (private `readContractRegistryJson`), and potentially `service.ts` (backward compat). All imports updated to use the new name. The type is marked `@deprecated`. 
+- `contracts.json` files no longer created by `group sync` + +### Cross-Impact: New `GroupImpactOptions` Interface + +The current `GroupImpactOptions` has `registry: ContractRegistry` for JS iteration. After migration, it receives a bridge query function instead: + +```typescript +export interface GroupImpactOptions { + groupName: string; + target: string; + repoPath: string; + direction: 'upstream' | 'downstream'; + // CHANGED: replaces `registry: ContractRegistry` + bridgeQuery: (cypher: string, params: Record) => Promise; + localImpactFn: (target: string, direction: string) => Promise; + crossImpactFn: ( + targetGroupPath: string, + symbolUid: string, + direction: string, + hint?: { filePath: string; symbolName: string }, + ) => Promise; + maxDepth?: number; + minConfidence?: number; + subgroup?: string; + timeout?: number; + crossDepth?: number; +} +``` + +**How the caller connects bridge to `runGroupImpact`** (in `service.ts: groupImpact()`): + +```typescript +const result = await openBridgeOrFallback(groupDir); +if (result.type === 'none') return { error: 'Run group_sync first.' }; +if (result.type === 'json') { + // Legacy path: current runGroupImpact is renamed to runGroupImpactLegacy + // and preserved unchanged (accepts `registry: LegacyContractRegistry`). + // New runGroupImpact accepts `bridgeQuery`. 
+ return runGroupImpactLegacy({ ...opts, registry: result.registry }); +} +// Bridge path: +const handle = result.handle; +try { + return await runGroupImpact({ + ...opts, + bridgeQuery: (cypher, params) => queryBridge(handle, cypher, params), + }); +} finally { + await closeBridgeDb(handle); +} +``` + +**How `runGroupImpact` builds Cypher parameters from Phase 1 results:** + +```typescript +// After Phase 1 local impact: +const uids = collectPhase1Uids(local); // Set of symbol IDs +const phase1Refs = collectPhase1Refs(local); // Set of "filePath::symbolName" + +// Normalize subgroup before passing to Cypher +const normalizedSubgroup = opts.subgroup?.trim().replace(/\/+$/, '') || null; + +// Phase 2: Execute direction-dependent Cypher query +interface CrossImpactRow { + fanOutRepo: string; fanOutUid: string; fanOutFilePath: string; fanOutSymbolName: string; + matchedLocalUid: string; matchedLocalFilePath: string; matchedLocalSymbolName: string; + matchType: string; confidence: number; contractId: string; +} +const rows = await opts.bridgeQuery( + direction === 'upstream' ? UPSTREAM_QUERY : DOWNSTREAM_QUERY, + { + sourceRepo: opts.repoPath, + localUids: [...uids], + localRefs: [...phase1Refs], + minConfidence: opts.minConfidence ?? 0.5, + subgroup: normalizedSubgroup, + }, +); +``` + +### Cross-Impact Cypher Queries + +Phase 2 of `runGroupImpact()` needs **two direction-dependent queries** to preserve the current upstream/downstream semantics from `cross-impact.ts:142-161`. + +**Upstream query** (direction = 'upstream'): "I'm changing this symbol — who consumes it?" +The local impact found symbols in the source repo. 
We look for cross-links where the **target** (provider) matches a local symbol, and fan out to the **source** (consumer) side: + +```cypher +MATCH (consumer:Contract)-[l:ContractLink]->(provider:Contract) +WHERE provider.repo = $sourceRepo + AND (provider.symbolUid IN $localUids + OR (NOT provider.symbolUid IN $localUids AND (provider.filePath + '::' + provider.symbolName) IN $localRefs)) + AND l.confidence >= $minConfidence + AND ($subgroup IS NULL OR consumer.repo = $subgroup OR consumer.repo STARTS WITH $subgroup + '/') +RETURN consumer.repo AS fanOutRepo, + consumer.symbolUid AS fanOutUid, + consumer.filePath AS fanOutFilePath, + consumer.symbolName AS fanOutSymbolName, + provider.symbolUid AS matchedLocalUid, + provider.filePath AS matchedLocalFilePath, + provider.symbolName AS matchedLocalSymbolName, + l.matchType AS matchType, + l.confidence AS confidence, + l.contractId AS contractId +ORDER BY l.confidence DESC +``` + +**Downstream query** (direction = 'downstream'): "I'm changing this symbol — what does it consume?" +The local impact found symbols in the source repo. 
We look for cross-links where the **source** (consumer) matches a local symbol, and fan out to the **target** (provider) side: + +```cypher +MATCH (consumer:Contract)-[l:ContractLink]->(provider:Contract) +WHERE consumer.repo = $sourceRepo + AND (consumer.symbolUid IN $localUids + OR (NOT consumer.symbolUid IN $localUids AND (consumer.filePath + '::' + consumer.symbolName) IN $localRefs)) + AND l.confidence >= $minConfidence + AND ($subgroup IS NULL OR provider.repo = $subgroup OR provider.repo STARTS WITH $subgroup + '/') +RETURN provider.repo AS fanOutRepo, + provider.symbolUid AS fanOutUid, + provider.filePath AS fanOutFilePath, + provider.symbolName AS fanOutSymbolName, + consumer.symbolUid AS matchedLocalUid, + consumer.filePath AS matchedLocalFilePath, + consumer.symbolName AS matchedLocalSymbolName, + l.matchType AS matchType, + l.confidence AS confidence, + l.contractId AS contractId +ORDER BY l.confidence DESC +``` + +**Key differences from current JS code preserved:** +- **UID matching** (`cross-impact.ts:142-145`): Cypher checks `symbolUid IN $localUids` +- **Ref fallback** (`cross-impact.ts:146`, `linkMatchesRefs` at line 58-71): Cypher checks `filePath + '::' + symbolName IN $localRefs` when `symbolUid NOT IN $localUids`. This **preserves exact current semantics**: the JS code does `const refMatch = !uidMatch && linkMatchesRefs(...)` where `!uidMatch` means "UID didn't match the affected set" (regardless of whether UID is empty or stale). The Cypher `NOT ... IN $localUids` is equivalent. +- **Direction-dependent fan-out** (`cross-impact.ts:160-161`): Upstream fans out to `consumer.repo`, downstream fans out to `provider.repo` +- **Subgroup filter** (`cross-impact.ts:163`): Applied to the fan-out side, not the source side +- **Subgroup normalization**: `$subgroup` is normalized by the caller before passing to Cypher: `subgroup?.trim().replace(/\/+$/, '') || null`. 
This matches `inSubgroup()` from `cross-impact.ts:73-77` which does `subgroup.replace(/\/+$/, '')`. The Cypher prefix check also needs the `=` case, which is why both queries use `repo = $subgroup OR repo STARTS WITH $subgroup + '/'` on the fan-out side (`consumer.repo` for upstream, `provider.repo` for downstream) +- **Fan-out side info**: Query returns `fanOutFilePath` and `fanOutSymbolName` in addition to `fanOutUid` to support name-based fallback when UID is empty + +### Fan-Out with Empty symbolUid (gRPC contracts) + +The current `crossImpactFn` in `service.ts:189-199` uses `impactByUid()` which requires a valid UID (`MATCH (n) WHERE n.id = $uid`). gRPC contracts have `symbolUid: ''`, so `impactByUid('')` returns null and the fan-out silently fails. + +**Solution:** Extend `crossImpactFn` to fall back to name-based impact when UID is empty: + +```typescript +crossImpactFn: async (targetGroupPath: string, uid: string, d: string, hint?: { filePath: string; symbolName: string }) => { + const registryName = config.repos[targetGroupPath]; + if (!registryName) return null; + try { + const repoObj = await this.port.resolveRepo(registryName); + // If UID is available, use it (existing path). + // `await` is required so that a rejection is caught by the catch below. + if (uid) { + return await this.port.impactByUid(repoObj.id, uid, d, impactOpts); + } + // Fallback: search by symbol name in the target repo. + // impact() resolves by name with priority ordering (Class > Interface > Function > ...), + // see local-backend.ts:1896. It does NOT support filePath scoping: if two symbols + // share the same name, the highest-priority label wins. This is a known limitation + // for gRPC fan-out: if "UserService" exists as both a Class and a Function in the + // target repo, the Class is chosen. In practice, gRPC service implementations are + // typically unique names within a repo, so this is acceptable. + if (hint?.symbolName) { + const result = await this.port.impact(repoObj, { + target: hint.symbolName, + direction: d as 'upstream' | 'downstream', + ...impactOpts, + }); + // impact() returns { error: ... } on not-found instead of null. 
+ // Must check for error to avoid counting failures as successful fan-out + // (runGroupImpact truthy-checks the result at cross-impact.ts:176). + if (result && typeof result === 'object' && 'error' in result) return null; + return result; + } + return null; + } catch { + return null; + } +}, +``` + +The `hint` parameter carries `fanOutFilePath` and `fanOutSymbolName` from the Cypher query result. The caller (`runGroupImpact`) passes it when `fanOutUid` is empty: + +```typescript +// In cross-impact.ts Phase 2 loop: +const result = await opts.crossImpactFn( + row.fanOutRepo, + row.fanOutUid, + opts.direction, + row.fanOutUid ? undefined : { filePath: row.fanOutFilePath, symbolName: row.fanOutSymbolName }, +); +``` + +This ensures gRPC cross-links actually trigger remote impact analysis via name-based search, not silent null returns. + +### Future: Multi-Hop Traversal (crossDepth > 1) + +With bridge.lbug, multi-hop becomes a recursive Cypher query. Out of scope for this PR but the schema supports it. + +### Backward Compatibility + +**Unified fallback function:** All consumers (CLI, service, MCP) use a single `openBridgeOrFallback(groupDir)` helper: + +```typescript +async function openBridgeOrFallback(groupDir: string): Promise< + { type: 'bridge'; handle: BridgeHandle; meta: BridgeMeta } + | { type: 'json'; registry: LegacyContractRegistry; deprecationWarning: string } + | { type: 'none' } +> { + // 1. Try bridge.lbug + const handle = await openBridgeDbReadOnly(groupDir); + if (handle) { + const meta = await readBridgeMeta(groupDir); + return { type: 'bridge', handle, meta }; + } + // 2. Fallback to contracts.json (with deprecation warning) + const registry = await readContractRegistryJson(groupDir); + if (registry) { + // Return deprecation info in result — caller decides how to surface it. + // CLI prints warning; MCP includes it in response metadata (NOT console.warn, + // which would corrupt JSON-RPC protocol stream). 
+ return { type: 'json', registry, deprecationWarning: 'contracts.json is deprecated. Run "gitnexus group sync " to migrate to bridge.lbug.' }; + } + // 3. Nothing found + return { type: 'none' }; +} +``` + +**Edge cases** (consistent behavior): +- Group has `contracts.json` but no `bridge.lbug` → fallback to JSON with deprecation warning +- Group has both → bridge.lbug wins (tried first), JSON ignored +- Group has corrupted `bridge.lbug` → `openBridgeDbReadOnly()` returns null → fallback to JSON if available, otherwise error +- Group has corrupted `bridge.lbug` + no JSON → error: "Run group_sync first" +- No `contracts.json` and no `bridge.lbug` → error: "Run group_sync first" + +**Key principle:** Corrupted bridge is NOT a hard error — it falls through to JSON like any "missing" bridge. Only when no data source is available does the user see an error. + +## Component 2: gRPC Canonical ID — Proto-Aware Extraction + +### Problem + +The gRPC extractor generates two incompatible contract ID formats: + +| Source | Format | Example | +|--------|--------|---------| +| `.proto` file | `grpc::package.Service/Method` | `grpc::com.example.UserService/GetUser` | +| Go `NewXClient()` | `grpc::ServiceName/*` | `grpc::UserService/*` | +| Java `ImplBase` | `grpc::ServiceName/*` | `grpc::UserService/*` | +| Python `Stub()` | `grpc::ServiceName/*` | `grpc::UserService/*` | +| TS `@GrpcMethod()` | `grpc::Service/Method` | `grpc::UserService/GetUser` | + +After normalization, `grpc::userservice/*` never matches `grpc::com.example.userservice/GetUser`. + +### Solution: Proto Map + +**New function** in `grpc-extractor.ts`: + +```typescript +interface ProtoServiceInfo { + package: string; // "com.example" (empty string if no package declaration) + serviceName: string; // "UserService" + methods: string[]; // ["GetUser", "ListUsers", ...] + protoPath: string; // "proto/user.proto" (for disambiguation) +} + +/** Scan .proto files in repo, build serviceName → package+methods map. 
*/ +export async function buildProtoMap(repoPath: string): Promise<Map<string, ProtoServiceInfo[]>>; +``` + +**Implementation:** +1. Glob `**/*.proto` in repoPath (reuse existing .proto scanning logic from lines 80-155) +2. Parse `package`, `service`, `rpc` declarations (regex-based, same as current .proto parsing) +3. If `.proto` has no `package` declaration, `package` is the empty string `''` and `contractId` becomes `grpc::ServiceName/Method` (no dot prefix) +4. Build `Map<string, ProtoServiceInfo[]>`: the key is the bare service name (e.g. "UserService"), the value is an array (handles conflicts where two .proto files define the same service name with different packages) + +**Modified extraction flow:** + +``` +GrpcExtractor.extract(executor, repoPath, handle) + ├─ protoMap = buildProtoMap(repoPath) // NEW: build once per repo + ├─ extractFromProtoFiles(executor, repoPath) // unchanged: already full IDs + ├─ extractFromGoSource(executor, protoMap) // CHANGED: resolve via protoMap + ├─ extractFromJavaSource(executor, protoMap) // CHANGED + ├─ extractFromPythonSource(executor, protoMap) // CHANGED + ├─ extractFromTsSource(executor, protoMap) // CHANGED: resolve package via protoMap + └─ dedupe() // unchanged (see below) +``` + +### symbolUid for gRPC Contracts + +Currently, all gRPC contracts have `symbolUid: ''` (see `grpc-extractor.ts:70`). This is by design: the gRPC extractor doesn't query the graph for symbol UIDs because the contract represents a network boundary, not a code-level symbol. + +**This means `impactByUid` fan-out won't work for gRPC contracts.** Two levels of fallback are needed: + +1. **Bridge Cypher queries** (finding cross-links): Use the ref fallback, `filePath + '::' + symbolName IN $localRefs`, when the UID does not match, which includes the empty-UID case (see Cross-Impact Cypher Queries above). +2. **Remote fan-out** (`crossImpactFn`): Current `impactByUid()` in `service.ts:189-199` requires a valid UID (`MATCH (n) WHERE n.id = $uid`). With an empty UID, it returns null and fan-out silently fails. 
**This must be extended** with a name-based fallback (see "Fan-Out with Empty symbolUid" in the Cross-Impact section above). + +### No Dedup Between Proto and Source Contracts + +Proto-file contracts and source-resolved contracts are **both kept as separate Contract nodes** in bridge.lbug, even when they share the same `contractId`. This is correct because: + +1. They have different `filePath` values (e.g., `proto/user.proto` vs `src/server.go`) +2. They have different `symbolRef`: the proto entry points to the `.proto` definition, the source entry points to the actual Go/Java/Python code +3. During cross-impact analysis, developers need to see the **source code** file that implements/calls the service, not the `.proto` definition file +4. ContractLink matching works by `contractId`, not by Contract node `id`, so both nodes participate in the same cross-links + +The existing `dedupe()` key `contractId|role|filePath` already handles this correctly: proto (`filePath: "proto/user.proto"`) and source (`filePath: "src/server.go"`) have different keys and both survive. + +**What dedupe still prevents:** True duplicates, e.g., if the same Go file is scanned twice, or if two different regex patterns match the same `RegisterXServer()` call. 
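The effect of the `contractId|role|filePath` key can be sketched as follows. This is an illustration, not the actual `dedupe()` in `grpc-extractor.ts`: the `ContractLike` shape, the first-occurrence-wins policy, and the `src/user.controller.ts` path are assumptions made for the example.

```typescript
// Assumed minimal shape; the real contract type carries more fields.
interface ContractLike {
  contractId: string;
  role: 'provider' | 'consumer';
  filePath: string;
}

// Sketch of a dedupe keyed on contractId|role|filePath (first occurrence wins).
function dedupeContracts(contracts: ContractLike[]): ContractLike[] {
  const seen = new Map<string, ContractLike>();
  for (const c of contracts) {
    const key = `${c.contractId}|${c.role}|${c.filePath}`;
    if (!seen.has(key)) seen.set(key, c);
  }
  return Array.from(seen.values());
}

const kept = dedupeContracts([
  // Same contractId and role, different files: both entries survive.
  { contractId: 'grpc::com.example.UserService/GetUser', role: 'provider', filePath: 'proto/user.proto' },
  { contractId: 'grpc::com.example.UserService/GetUser', role: 'provider', filePath: 'src/user.controller.ts' },
  // True duplicate of the previous entry: dropped.
  { contractId: 'grpc::com.example.UserService/GetUser', role: 'provider', filePath: 'src/user.controller.ts' },
]);
console.log(kept.map((c) => c.filePath));
```

Because `filePath` is part of the key, the proto entry and the source entry never collapse into one node, which is the property the section above relies on.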
+ +### Proto Map Disambiguation + +When multiple .proto files define the same service name with different packages, disambiguation uses **directory proximity** heuristic (not import path parsing, since Go/Java/Python scanners don't track imports): + +```typescript +function resolveProtoConflict( + serviceName: string, + sourceFilePath: string, + candidates: ProtoServiceInfo[], +): ProtoServiceInfo | null { + if (candidates.length === 0) return null; + if (candidates.length === 1) return candidates[0]; + + // Score by directory proximity: shared path prefix length + const sourceDir = path.dirname(sourceFilePath); + let best = candidates[0]; + let bestScore = 0; + for (const c of candidates) { + const protoDir = path.dirname(c.protoPath); + const shared = commonPrefixLength(sourceDir, protoDir); + if (shared > bestScore) { + bestScore = shared; + best = c; + } + } + return best; +} +``` + +### Per-Scanner Resolution: Service-Level vs Method-Level Contracts + +**Design decision:** When a source scanner (Go/Java/Python) resolves via proto map, it generates a **single service-level contract** with a synthesized `contractId`, NOT one contract per method. This avoids false-positive explosion. + +**Rationale:** `RegisterUserServiceServer()` or `NewUserServiceClient()` is evidence that the code *uses the service* — not evidence that it calls *every method*. Expanding to per-method contracts with high confidence would be misleading. + +**Provider resolution (example: Go):** + +```typescript +// Before: +contracts.push({ contractId: serviceOnlyContractId('UserService'), confidence: 0.8 }); +// → grpc::UserService/* + +// After: +const candidates = protoMap.get('UserService'); +const proto = resolveProtoConflict('UserService', sourceFilePath, candidates ?? 
[]); +if (proto) { + // Single service-level contract with canonical package prefix + contracts.push({ + contractId: serviceContractId(proto.package, proto.serviceName), + // → grpc::com.example.UserService/* + confidence: 0.8, // unchanged — still service-level evidence + filePath: sourceFilePath, + }); +} else { + contracts.push({ + contractId: serviceOnlyContractId('UserService'), + confidence: 0.65, // reduced — unresolved, no package + filePath: sourceFilePath, + }); +} +``` + +**New helper:** +```typescript +function serviceContractId(pkg: string, serviceName: string): string { + const prefix = pkg ? `${pkg}.${serviceName}` : serviceName; + return `grpc::${prefix}/*`; +} +``` + +This produces `grpc::com.example.UserService/*` — a wildcard with the correct package prefix. It will match `.proto`-extracted `grpc::com.example.UserService/GetUser` via wildcard matching (see below). + +**Consumer resolution (example: Go consumer):** + +```typescript +const candidates = protoMap.get('UserService'); +const proto = resolveProtoConflict('UserService', sourceFilePath, candidates ?? []); +if (proto) { + contracts.push({ + contractId: serviceContractId(proto.package, proto.serviceName), + confidence: 0.75, // proto-resolved consumer + role: 'consumer', + filePath: sourceFilePath, + }); +} else { + contracts.push({ + contractId: serviceOnlyContractId('UserService'), + confidence: 0.55, // reduced — unresolved consumer + role: 'consumer', + filePath: sourceFilePath, + }); +} +``` + +**TS `@GrpcMethod` resolution:** TS already has method-level info — keep producing per-method contracts: + +```typescript +const candidates = protoMap.get(serviceName); +const proto = resolveProtoConflict(serviceName, sourceFilePath, candidates ?? []); +const pkg = proto?.package ?? 
''; +const cid = contractId(pkg, serviceName, methodName); +// → grpc::com.example.UserService/GetUser (per-method, with package) +``` + +Four source scanners (Go, Java, Python, TS) receive `protoMap` and use `resolveProtoConflict()`. The `.proto` extraction is unchanged — it already produces full canonical IDs. + +### Confidence Adjustments + +| Scenario | Before | After | Rationale | +|----------|--------|-------|-----------| +| .proto rpc (provider) | 0.85 | 0.85 | Unchanged — gold standard, per-method | +| Go/Java/Python register (provider), proto-resolved | 0.8 | 0.8 | Service-level with package prefix | +| Go/Java/Python register (provider), no proto | 0.8 | 0.65 | Reduced — wildcard, no package | +| Go/Java/Python client (consumer), proto-resolved | 0.7 | 0.75 | Slight boost — canonical package known | +| Go/Java/Python client (consumer), no proto | 0.7 | 0.55 | Reduced — wildcard, no package | +| TS `@GrpcMethod`, proto-resolved | 0.8 | 0.8 | Per-method with package from proto | +| TS `@GrpcMethod`, no proto | 0.8 | 0.8 | Per-method, no package — unchanged | + +### Matching: Wildcard Fallback and `matchType` + +For `grpc::*/*` contracts (service-level wildcards), add wildcard matching as a **separate pass** after `runExactMatch()`. + +**Critical: `runExactMatch` must exclude gRPC wildcard contracts.** If both consumer and provider have `grpc::com.example.UserService/*`, they would match as `exact` with `confidence: 1.0` — false positive. Wildcard-to-wildcard and wildcard-to-method matching is handled exclusively by `runWildcardMatch()`. + +`runExactMatch` is modified to **skip gRPC wildcard contracts** (contracts where `contractId` starts with `grpc::` AND ends with `/*`). These contracts are passed through to `unmatched` and handled by the wildcard pass. HTTP wildcard contracts (`http::*::/path`) are NOT affected — they don't end with `/*` and continue to use existing `findMatchingKeys` logic. 
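The exclusion rule above amounts to a small predicate plus a partition step. The sketch below is illustrative only; `isGrpcWildcard` and `partitionForExactMatch` are hypothetical helper names, and the real `runExactMatch` may structure this differently.

```typescript
// Hypothetical helper: true only for service-level gRPC wildcards such as
// "grpc::com.example.UserService/*". HTTP wildcards do not end with "/*".
function isGrpcWildcard(contractId: string): boolean {
  return contractId.startsWith('grpc::') && contractId.endsWith('/*');
}

// Split contract IDs into those eligible for the exact pass and those
// deferred to the wildcard pass.
function partitionForExactMatch(ids: string[]): { exact: string[]; deferred: string[] } {
  const exact: string[] = [];
  const deferred: string[] = [];
  for (const id of ids) {
    (isGrpcWildcard(id) ? deferred : exact).push(id);
  }
  return { exact, deferred };
}

const parts = partitionForExactMatch([
  'grpc::com.example.UserService/*',
  'grpc::com.example.UserService/GetUser',
  'http::*::/path',
]);
```

Only the first ID is deferred; the per-method gRPC ID and the HTTP wildcard stay in the exact pass.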
+ +**New and modified functions** in `matching.ts`: + +```typescript +/** Build a normalized contractId → contracts index. Exported for reuse by wildcard pass. */ +export function buildProviderIndex( + contracts: StoredContract[], +): Map<string, StoredContract[]>; + +/** + * Updated signature: accepts optional pre-built index. + * Skips gRPC wildcard contracts (contractId starting with "grpc::" and ending with "/*"). + * These appear in `unmatched` for the wildcard pass. + */ +export function runExactMatch( + contracts: StoredContract[], + providerIndex?: Map<string, StoredContract[]>, +): MatchResult; + +interface WildcardMatchResult { + matched: CrossLink[]; + remaining: StoredContract[]; +} + +export function runWildcardMatch( + unmatched: StoredContract[], + providerIndex: Map<string, StoredContract[]>, +): WildcardMatchResult; +``` + +`buildProviderIndex()` is extracted from the existing private logic inside `runExactMatch()` and exported. **Keys in the returned Map are `normalizeContractId(contract.contractId)`**, i.e., lowercased package parts for gRPC. This is critical for case-insensitive matching in `runWildcardMatch()`. The index includes gRPC wildcard providers (they won't match in the exact pass, and `runWildcardMatch` explicitly skips them via a `key.endsWith('/*')` check). `runExactMatch()` is updated to accept an optional pre-built index. + +**Implementation:** + +1. Filter `unmatched` for consumers with `contractId` ending in `/*` +2. For each wildcard consumer, extract the service name (fully qualified if the consumer was proto-resolved, bare otherwise): + ```typescript + // "grpc::com.example.userservice/*" → "com.example.userservice" + // "grpc::userservice/*" → "userservice" + const normalizedWildcard = normalizeContractId(consumer.contractId); + const fqServiceFromConsumer = normalizedWildcard.slice( + normalizedWildcard.indexOf('::') + 2, -2); // strip "grpc::" and "/*" + ``` +3. 
Search `providerIndex` for **non-wildcard** providers whose FQ service matches: + ```typescript + // Only match providers that have actual method-level IDs (not wildcards themselves) + for (const [key, providers] of providerIndex) { + if (!key.startsWith('grpc::') || key.endsWith('/*')) continue; // skip non-grpc and wildcards + const afterPrefix = key.slice(6); // strip "grpc::" + const slashIdx = afterPrefix.indexOf('/'); + if (slashIdx < 0) continue; + const fqServiceFromProvider = afterPrefix.slice(0, slashIdx); + // Exact match on FQ service, or bare-name match if consumer has no package + if (fqServiceFromProvider === fqServiceFromConsumer + || (!fqServiceFromConsumer.includes('.') && fqServiceFromProvider.endsWith('.' + fqServiceFromConsumer))) { + // Match found + } + } + ``` +4. Create CrossLink with `matchType: 'wildcard'`, `contractId: consumer.contractId` (the wildcard ID — consistent with `runExactMatch` which always stores `consumer.contractId`) +5. **Confidence:** `min(provider.confidence, consumer.confidence)` — no additional penalty. The wildcard penalty is already baked into the consumer's reduced confidence (0.55-0.75 depending on proto resolution). Applying an additional 0.5× multiplier would push values below `minConfidence` threshold. + +**Why no 0.5× multiplier:** With consumer confidence 0.55 (no proto) and provider 0.85, `0.5 × min(0.55, 0.85) = 0.275` — well below the default `minConfidence: 0.5`. The wildcard feature would be dead by default. Instead, the penalty is in the source confidence itself (0.55 vs 0.7 baseline), which keeps wildcard matches viable at default thresholds. 
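The matching steps and the confidence rule above can be sketched together as a pair of pure functions. This is a simplified illustration, assuming both IDs are already normalized (lowercased); `wildcardMatches` and `linkConfidence` are hypothetical names and do not reflect the actual `runWildcardMatch()` signature.

```typescript
// Hypothetical helper: does a wildcard consumer ID match a method-level provider key?
// Assumes both strings have already been passed through normalization.
function wildcardMatches(consumerId: string, providerKey: string): boolean {
  if (!consumerId.startsWith('grpc::') || !consumerId.endsWith('/*')) return false;
  // Providers must be method-level gRPC IDs, never wildcards themselves.
  if (!providerKey.startsWith('grpc::') || providerKey.endsWith('/*')) return false;

  const fqConsumer = consumerId.slice(6, -2); // strip "grpc::" and "/*"
  const afterPrefix = providerKey.slice(6); // strip "grpc::"
  const slashIdx = afterPrefix.indexOf('/');
  if (slashIdx < 0) return false;
  const fqProvider = afterPrefix.slice(0, slashIdx);

  // Exact FQ match, or bare-name match when the consumer has no package.
  return (
    fqProvider === fqConsumer ||
    (!fqConsumer.includes('.') && fqProvider.endsWith('.' + fqConsumer))
  );
}

// Link confidence is the weaker of the two sides, with no extra multiplier.
function linkConfidence(provider: number, consumer: number): number {
  return Math.min(provider, consumer);
}

const unresolvedConsumerLink = linkConfidence(0.85, 0.55);
const withHalfMultiplier = 0.5 * unresolvedConsumerLink;
```

For an unresolved consumer (0.55) against a `.proto` provider (0.85), the link keeps confidence 0.55 and stays above the default `minConfidence` of 0.5, whereas the halved value would fall below it.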
+ +**Sync integration:** + +```typescript +// In syncGroup(): +const providerIndex = buildProviderIndex(autoContracts); +const { matched: exactLinks, unmatched } = runExactMatch(autoContracts, providerIndex); +const { matched: wildcardLinks, remaining } = runWildcardMatch(unmatched, providerIndex); +const crossLinks = [...manifestResult.crossLinks, ...exactLinks, ...wildcardLinks]; +``` + +**Note:** `runExactMatch` and `runWildcardMatch` operate on `autoContracts` only — manifest contracts are already matched by `ManifestExtractor` and added separately. This prevents duplicate links. + +## Testing Strategy + +### Bridge.lbug Tests + +**Unit tests** (`test/unit/group/bridge-db.test.ts`): +- Schema creation (idempotent, re-run safe) +- Write + read contracts round-trip (verify all fields including filePath) +- Write + read cross-links round-trip (verify fromRepo/toRepo denormalization) +- Multiple contracts with same contractId but different filePath → both stored +- RepoSnapshot persistence (keyed by repo path within group) +- meta.json persistence (version from BRIDGE_SCHEMA_VERSION, generatedAt, missingRepos) +- missingRepos survives write + read cycle via meta.json +- Write-to-temp-then-rename: old data fully replaced, new data correct +- Read-only mode rejects writes (error thrown) +- Failed insert: bridge.lbug untouched, bridge.lbug.tmp cleaned up +- Atomic rename: .bak created during swap, removed after success +- Windows EBUSY/EPERM/EACCES retry: retryRename handles concurrent read-only handles +- Full SHA-256 PK: verify no truncation (64 hex chars) +- Corrupted bridge.lbug: `openBridgeDbReadOnly()` returns null +- `bridgeExists()`: true when DB opens, false when missing or corrupt +- Bridge.lbug is a file, not a directory + +**Integration tests** (`test/integration/group/bridge-sync.test.ts`): +- `syncGroup()` creates bridge.lbug with correct data +- `groupContracts()` reads from bridge.lbug (filtered by type, repo) +- `groupContracts()` with `--unmatched` 
flag after wildcard canonicalization +- `groupImpact()` traverses bridge.lbug cross-links (upstream direction) +- `groupImpact()` traverses bridge.lbug cross-links (downstream direction) +- `groupStatus()` returns missingRepos from meta.json and repoSnapshots from Cypher query +- `groupStatus()` contractsStale check uses RepoSnapshot nodes from bridge.lbug +- Backward compat: group with only contracts.json → fallback with deprecation warning +- Backward compat: group with both contracts.json and bridge.lbug → bridge.lbug wins +- Backward compat: corrupted bridge.lbug + contracts.json exists → fallback to JSON +- Backward compat: corrupted bridge.lbug + no JSON → error "Run group_sync first" +- Backward compat: no contracts.json and no bridge.lbug → error "Run group_sync first" +- Re-sync overwrites previous bridge.lbug data + +### gRPC Canonical ID Tests + +**Unit tests** (`test/unit/group/grpc-extractor.test.ts` — extend existing): +- `buildProtoMap()`: single proto, multiple protos, no protos in repo +- `buildProtoMap()`: proto without `package` declaration → `package: ''`, contractId = `grpc::ServiceName/Method` +- `buildProtoMap()`: conflicting service names (same name, different packages) → array with both entries +- `resolveProtoConflict()`: single candidate → returns it +- `resolveProtoConflict()`: multiple candidates → picks closest by directory +- `resolveProtoConflict()`: no candidates → returns null +- Proto-resolved extraction: Go provider + .proto → `grpc::pkg.Service/*` (service-level, NOT per-method) +- Proto-resolved extraction: Java consumer + .proto → `grpc::pkg.Service/*` (service-level) +- Proto-resolved extraction: TS `@GrpcMethod` + .proto → `grpc::pkg.Service/Method` (per-method, with package) +- Fallback: no .proto → `grpc::ServiceName/*` with reduced confidence (0.65 provider, 0.55 consumer) +- No dedup between proto and source: both kept as separate entries with different filePaths +- symbolUid is empty for all gRPC contracts (by design) + 
+**Unit tests** (`test/unit/group/matching.test.ts` — extend existing): +- `runWildcardMatch()`: `grpc::com.example.userservice/*` matches `grpc::com.example.userservice/GetUser` (FQ match) +- `runWildcardMatch()`: `grpc::userservice/*` matches `grpc::com.example.userservice/GetUser` (bare-name match) +- `runWildcardMatch()`: does NOT match `grpc::com.example.otherservice/GetUser` +- `runWildcardMatch()`: does NOT match non-grpc contracts +- `runWildcardMatch()`: does NOT match wildcard providers (`grpc::Service/*` consumer vs `grpc::Service/*` provider → skip) +- `runWildcardMatch()`: confidence = min(provider, consumer) — no 0.5× multiplier +- `runWildcardMatch()`: contractId on link = consumer's contractId (consistent with runExactMatch) +- `runWildcardMatch()`: matchType = 'wildcard' +- `runExactMatch()`: skips gRPC `/*` contracts (no wildcard-wildcard exact match); HTTP wildcards unaffected +- `runExactMatch()`: wildcard contracts appear in unmatched output +- Exact match runs first; wildcard only processes remaining unmatched +- Manifest contracts not passed to exact/wildcard match (no duplicate links) + +### Cross-Impact Tests + +**Unit tests** (`test/unit/group/cross-impact.test.ts` — extend existing): +- Phase 2 upstream: query matches on provider side, fans out to consumer repo +- Phase 2 downstream: query matches on consumer side, fans out to provider repo +- UID matching works against bridge DB +- Ref fallback matching works when symbolUid is empty (gRPC contracts) +- Ref fallback matching works when symbolUid is non-empty but stale (not in localUids) — preserves current `!uidMatch` semantics +- Subgroup filtering applied to fan-out side, not source side +- Confidence filtering via Cypher WHERE clause +- Direction-specific Cypher query equivalence with current JS loop +- Fan-out with empty symbolUid: crossImpactFn falls back to name-based search via hint +- Fan-out with valid symbolUid: crossImpactFn uses impactByUid (existing path) +- Subgroup 
normalization: trailing `/` stripped before Cypher, `team/a` matches `team/a` and `team/a/sub` +- Deprecation warning returned in result (not console.warn) when JSON fallback used + +## Files Changed + +### New Files +| File | Purpose | +|------|---------| +| `src/core/group/bridge-db.ts` | Bridge LadybugDB lifecycle, schema, read/write, atomic rename | +| `src/core/group/bridge-schema.ts` | Schema DDL constants + BRIDGE_SCHEMA_VERSION for bridge.lbug | +| `test/unit/group/bridge-db.test.ts` | Bridge DB unit tests | +| `test/integration/group/bridge-sync.test.ts` | Bridge DB integration tests | + +### Modified Files +| File | Changes | +|------|---------| +| `src/core/group/sync.ts` | Replace `writeContractRegistry()` with `writeBridge()`; add `runWildcardMatch()` pass on `autoContracts` only (not manifest) | +| `src/core/group/service.ts` | Use `openBridgeOrFallback()`; extend `crossImpactFn` with `hint` param for name-based fallback when UID empty | +| `src/core/group/cross-impact.ts` | Accept bridge executor; two direction-dependent Cypher queries with UID+ref fallback | +| `src/core/group/storage.ts` | Remove `writeContractRegistry()`, `readContractRegistry()`, `CONTRACTS_FILE`; keep `readContractRegistryJson()` as private fallback; add `openBridgeOrFallback()` (imports `openBridgeDbReadOnly`, `closeBridgeDb`, `readBridgeMeta` from `bridge-db.ts`) | +| `src/core/group/types.ts` | Add `BridgeHandle`, `BridgeMeta` types; rename `ContractRegistry` → `LegacyContractRegistry` (@deprecated) | +| `src/core/group/extractors/grpc-extractor.ts` | Add `buildProtoMap()`, `resolveProtoConflict()`, `serviceContractId()`; modify 4 source scanners; keep service-level contracts (no per-method expansion for Go/Java/Python) | +| `src/core/group/matching.ts` | Export `buildProviderIndex()` (normalized keys); `runExactMatch` skips gRPC `/*` contracts; add `runWildcardMatch()` excluding wildcard providers; add `'wildcard'` to MatchType | +| `src/cli/group.ts` | Update `sync`, 
`impact`, `status` to use `openBridgeOrFallback()` | +| `src/mcp/local/local-backend.ts` | Update `groupImpact()`, `groupContracts()` to use `openBridgeOrFallback()` | +| `test/unit/group/grpc-extractor.test.ts` | Add proto map, service-level canonical ID, no-package proto, TS per-method resolution tests | +| `test/unit/group/matching.test.ts` | Add `runWildcardMatch()` tests: wildcard-provider exclusion, no multiplier, matchType checks | +| `test/unit/group/cross-impact.test.ts` | Direction-specific Cypher equivalence, ref fallback for empty UID | +| `test/unit/group/sync.test.ts` | Update for bridge.lbug writes + wildcard on autoContracts only | +| `test/unit/group/service.test.ts` | Update for bridge.lbug reads + openBridgeOrFallback | + +### Deleted / Renamed +| File/Symbol | Reason | +|-------------|--------| +| `storage.ts: writeContractRegistry()` | Replaced by bridge-db.ts | +| `storage.ts: readContractRegistry()` | Replaced by bridge-db.ts (JSON fallback kept as private `readContractRegistryJson()`) | +| `storage.ts: CONTRACTS_FILE` | No longer used | +| `types.ts: ContractRegistry` | **Renamed** to `LegacyContractRegistry` (@deprecated); still used by JSON fallback path | +| `contracts.json` files | No longer created by `group sync` | + +## Risks & Mitigations + +| Risk | Mitigation | +|------|------------| +| LadybugDB write perf during sync | Bulk COPY (same pattern as lbug-adapter.ts), not individual inserts | +| Bridge DB data loss on failed sync | Write-to-temp-then-rename: old file untouched until new file fully written; .bak recovery hint | +| Bridge DB lock during concurrent reads | Write-to-temp eliminates lock contention; retryRename for Windows EBUSY/EPERM/EACCES | +| Bridge DB corruption | Falls through to JSON fallback (not a hard error); only errors if no data source available | +| Proto map parse errors on malformed .proto | Regex-based parsing with try/catch, skip unparseable files | +| Proto map conflicts (same service name) | Directory 
proximity heuristic; no import path parsing needed | +| Proto without package declaration | `package: ''` → contractId = `grpc::ServiceName/Method` (tested) | +| Backward compat: existing groups have contracts.json | `openBridgeOrFallback()`: bridge.lbug first, then JSON, then error | +| ContractLink JOIN perf for repo filtering | Denormalized `fromRepo`/`toRepo` on relation avoid JOINs | +| Wildcard false positives (general) | No per-method expansion for service-level evidence; confidence penalty in source extraction | +| Wildcard bare-name cross-package false positive | `grpc::userservice/*` can match `grpc::other.userservice/GetUser` (different package). Accepted risk: bare-name consumers (no proto) already have reduced confidence (0.55); false positives surface as low-confidence cross-links. Users can raise `minConfidence` to filter. A future improvement could require FQ match only when consumer has package prefix. | +| gRPC symbolUid empty | Ref fallback in Cypher queries (filePath+symbolName match); name-based fan-out with error-object guard | +| missingRepos data loss | Stored in meta.json alongside bridge.lbug | + +## Implementation Notes + +Issues identified during spec review that are best addressed during TDD implementation rather than in the design doc: + +1. **Swap partial failure recovery (#3):** If `temp→final` rename fails after `final→bak`, `bridge.lbug` is missing. Implementation should check for `.bak` in `openBridgeDbReadOnly()` and auto-restore if `bridge.lbug` is absent. +2. **`meta.json` read failure (#4):** `readBridgeMeta()` should return sensible defaults (`{ version: 0, generatedAt: '', missingRepos: [] }`) if file is missing/corrupt, not throw. `openBridgeOrFallback` should handle this gracefully. +3. **Proto+source dedup and matching identity (#5):** Two Contract nodes with same `(repo, contractId)` but different `filePath` will both mark as matched via `${repo}::${contractId}`. 
This means both proto and source entries are marked as matched simultaneously, which is correct for `unmatched` filtering. But `runExactMatch` may create duplicate CrossLinks (one per node). Implementation should dedup CrossLinks by `(from.repo, to.repo, contractId)`. +4. **gRPC symbolName quality (#6):** Extractors store technical names (`RegisterUserServiceServer`, `NewUserServiceClient`). The name-based fallback may not find the actual symbol in the target repo. Implementation should extract the service name from these patterns (strip the `Register`/`New` prefixes and the `Server`/`Client` suffixes) before passing as `hint.symbolName`. +5. **DDL syntax compatibility (#11):** Schema DDL in this spec uses `IF NOT EXISTS` and inline `PRIMARY KEY`. Verify against actual LadybugDB/DuckDB version used in the project. If incompatible, adapt to match `schema.ts` conventions (separate PRIMARY KEY clause; no `IF NOT EXISTS`, with a try/catch wrapper instead). +6. **`fromRepo`/`toRepo` denormalization (#9):** Current Cypher queries don't use them. Remove from schema if no concrete use case emerges during implementation. Keep if needed for future index-only scans. +7. **`no data source` UX per command (#12):** `groupStatus` returns empty status; `groupImpact`/`groupContracts` return error. Standardize during implementation: all commands that require data return `{ error: "No contract data. Run 'gitnexus group sync <name>'." }`. +8. **Blast radius (#10):** `src/mcp/tools.ts` tool descriptions reference `contracts.json` and need their text updated. Tests `storage.test.ts`, `types.test.ts`, `group-impact.test.ts` also need updates; add them to the implementation plan. +9. **Fallback ownership (#8):** `openBridgeOrFallback()` lives in `storage.ts`. `cross-impact.ts` does NOT do fallback; it receives either `bridgeQuery` or `registry` from `service.ts`. Service layer owns the fallback decision. 
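The symbolName cleanup suggested in note 4 might look like the sketch below. It is an assumption about one possible approach, not the implemented behaviour; `serviceNameFromSymbol` is a hypothetical helper, and it only handles the two Go-style patterns named in the note.

```typescript
// Hypothetical helper: derive the service name from a generated gRPC symbol.
// "RegisterUserServiceServer" and "NewUserServiceClient" both yield "UserService".
// Names that do not follow either pattern are returned unchanged.
function serviceNameFromSymbol(symbolName: string): string {
  const m = /^(?:Register|New)(.+?)(?:Server|Client)$/.exec(symbolName);
  return m ? m[1] : symbolName;
}

const fromProvider = serviceNameFromSymbol('RegisterUserServiceServer');
const fromConsumer = serviceNameFromSymbol('NewUserServiceClient');
const untouched = serviceNameFromSymbol('GetUser');
```

Returning the input unchanged for non-matching names keeps the name-based fallback usable for symbols that are already meaningful.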
diff --git a/gitnexus/README.md b/gitnexus/README.md index 7e87c93b42..5c1e9576f8 100644 --- a/gitnexus/README.md +++ b/gitnexus/README.md @@ -172,6 +172,7 @@ gitnexus group remove # Remove a repo from a group gitnexus group list [name] # List groups, or show one group's config gitnexus group sync # Extract contracts and match across repos/services gitnexus group contracts # Inspect extracted contracts and cross-links +gitnexus group impact # Cross-repo blast radius analysis gitnexus group query # Search execution flows across all repos in a group gitnexus group status # Check staleness of repos in a group ``` diff --git a/gitnexus/src/cli/group.ts b/gitnexus/src/cli/group.ts index 70ca9537a3..6bd25b7997 100644 --- a/gitnexus/src/cli/group.ts +++ b/gitnexus/src/cli/group.ts @@ -97,16 +97,23 @@ export function registerGroupCommands(program: Command): void { .command('status ') .description('Check staleness of group and repos') .action(async (name: string) => { - const { readContractRegistry, getGroupDir, getDefaultGitnexusDir } = + const { getGroupDir, getDefaultGitnexusDir, openBridgeOrFallback } = await import('../core/group/storage.js'); + const { closeBridgeDb } = await import('../core/group/bridge-db.js'); const { LocalBackend } = await import('../mcp/local/local-backend.js'); const groupDir = getGroupDir(getDefaultGitnexusDir(), name); - const registry = await readContractRegistry(groupDir); + const fallback = await openBridgeOrFallback(groupDir); + const lastSync = + fallback.type === 'bridge' + ? fallback.meta.generatedAt + : fallback.type === 'json' + ? fallback.registry.generatedAt + : null; + if (fallback.type === 'json') console.warn(fallback.deprecationWarning); + if (fallback.type === 'bridge') await closeBridgeDb(fallback.handle); - console.log( - `Group: ${name}${registry ? ` (last sync: ${registry.generatedAt})` : ' (never synced)'}\n`, - ); + console.log(`Group: ${name}${lastSync ? 
` (last sync: ${lastSync})` : ' (never synced)'}\n`); const backend = new LocalBackend(); try { @@ -157,31 +164,146 @@ export function registerGroupCommands(program: Command): void { const { getGroupDir, getDefaultGitnexusDir } = await import('../core/group/storage.js'); const { loadGroupConfig } = await import('../core/group/config-parser.js'); const { syncGroup } = await import('../core/group/sync.js'); + const { closeLbug } = await import('../core/lbug/pool-adapter.js'); + + try { + const groupDir = getGroupDir(getDefaultGitnexusDir(), name); + const config = await loadGroupConfig(groupDir); + + console.log(`Syncing group "${name}" (${Object.keys(config.repos).length} repos)...\n`); + + const result = await syncGroup(config, { + groupDir, + allowStale: Boolean(opts.allowStale), + verbose: Boolean(opts.verbose), + skipEmbeddings: Boolean(opts.skipEmbeddings), + exactOnly: Boolean(opts.exactOnly), + }); + + if (opts.json) { + console.log(JSON.stringify(result, null, 2)); + } else { + console.log(`\nMatching cascade:`); + const exactLinks = result.crossLinks.filter((l) => l.matchType === 'exact'); + console.log(` exact: ${exactLinks.length} cross-links (confidence 1.0)`); + console.log(` unmatched: ${result.unmatched.length} contracts`); + console.log( + `\nWrote bridge.lbug (${result.contracts.length} contracts, ${result.crossLinks.length} cross-links)`, + ); + } + } finally { + await closeLbug().catch(() => {}); + } + }); + + group + .command('impact ') + .description('Cross-index blast radius analysis') + .requiredOption('--target ', 'Symbol name to analyze') + .requiredOption('--repo ', 'Repo group path (e.g. 
hr/hiring/backend)') + .option('--direction ', 'upstream or downstream', 'upstream') + .option('--cross-depth ', 'Hops through boundaries (MVP: capped at 1)', '1') + .option('--max-depth ', 'Max depth within each repo', '3') + .option('--min-confidence ', 'Min confidence for cross-links', '0.5') + .option('--subgroup ', 'Limit fan-out scope') + .option('--timeout ', 'Total wall time budget in ms', '30000') + .option('--json', 'JSON output') + .action(async (name: string, opts: Record) => { + const { getGroupDir, getDefaultGitnexusDir, openBridgeOrFallback } = + await import('../core/group/storage.js'); + const { closeBridgeDb } = await import('../core/group/bridge-db.js'); + const { LocalBackend } = await import('../mcp/local/local-backend.js'); const groupDir = getGroupDir(getDefaultGitnexusDir(), name); - const config = await loadGroupConfig(groupDir); + const fallback = await openBridgeOrFallback(groupDir); + if (fallback.type === 'none') { + console.error(`No contract data found. Run: gitnexus group sync ${name}`); + process.exitCode = 1; + return; + } + if (fallback.type === 'json') console.warn(fallback.deprecationWarning); + if (fallback.type === 'bridge') await closeBridgeDb(fallback.handle); + + const repoGroupPath = opts.repo as string; + const targetSymbol = opts.target as string; + if (!repoGroupPath || !targetSymbol) { + console.error('Both --target and --repo are required.'); + process.exitCode = 1; + return; + } + const direction = (opts.direction as string) ?? 'upstream'; + const maxDepth = opts.maxDepth != null ? parseInt(String(opts.maxDepth), 10) : 3; + const minConfidence = + opts.minConfidence != null ? parseFloat(String(opts.minConfidence)) : 0.5; + const timeout = opts.timeout != null ? parseInt(String(opts.timeout), 10) : 30000; + const subgroup = opts.subgroup as string | undefined; + const requestedCrossDepth = + opts.crossDepth != null ? 
parseInt(String(opts.crossDepth), 10) : 1; + const crossDepth = Math.min(requestedCrossDepth, 1); + + const crossDepthWarning = + requestedCrossDepth > 1 + ? `Multi-hop cross-boundary traversal is not yet implemented. Using --cross-depth 1 (requested: ${requestedCrossDepth}).` + : undefined; - console.log(`Syncing group "${name}" (${Object.keys(config.repos).length} repos)...\n`); - - const result = await syncGroup(config, { - groupDir, - allowStale: Boolean(opts.allowStale), - verbose: Boolean(opts.verbose), - skipEmbeddings: Boolean(opts.skipEmbeddings), - exactOnly: Boolean(opts.exactOnly), - }); - - if (opts.json) { - console.log(JSON.stringify(result, null, 2)); - } else { - console.log(`\nMatching cascade:`); - const exactLinks = result.crossLinks.filter((l) => l.matchType === 'exact'); - console.log(` exact: ${exactLinks.length} cross-links (confidence 1.0)`); - console.log(` unmatched: ${result.unmatched.length} contracts`); + if (crossDepthWarning && !opts.json) { + console.log(`WARNING: ${crossDepthWarning}\n`); + } + + if (!opts.json) { console.log( - `\nWrote contracts.json (${result.contracts.length} contracts, ${result.crossLinks.length} cross-links)`, + `Analyzing impact of "${targetSymbol}" in ${repoGroupPath} (group: ${name})...\n`, ); } + + const backend = new LocalBackend(); + try { + await backend.init(); + const result = (await backend.getGroupService().groupImpact({ + name, + target: targetSymbol, + repo: repoGroupPath, + direction, + crossDepth, + maxDepth, + minConfidence, + subgroup, + timeout, + })) as import('../core/group/types.js').GroupImpactResult & { crossDepthWarning?: string }; + + if (opts.json) { + const jsonOutput = crossDepthWarning ? 
{ ...result, crossDepthWarning } : result; + console.log(JSON.stringify(jsonOutput, null, 2)); + } else { + console.log(`Target: ${targetSymbol} (${repoGroupPath})`); + console.log(`Risk: ${result.risk}`); + console.log(`\nLocal impact: ${result.summary.direct} direct callers`); + if (result.cross.length > 0) { + console.log(`\nCross-repo impact (${result.cross.length} repos):`); + for (const cr of result.cross) { + console.log( + ` ${cr.repo_path} (via ${cr.contract.id}, ${cr.contract.match_type}, conf=${cr.contract.confidence}):`, + ); + for (const [depth, symbols] of Object.entries(cr.by_depth)) { + console.log(` d=${depth}: ${(symbols as unknown[]).length} symbols`); + } + } + } + if (result.outOfScope.length > 0) { + console.log(`\nOut of scope (${result.outOfScope.length} cross-links not followed):`); + for (const oos of result.outOfScope) { + console.log(` ${oos.from} -> ${oos.to} [${oos.contractId}]`); + } + } + if (result.truncated) { + console.log( + `\nWARNING: Timeout reached. 
Repos not analyzed: ${result.truncatedRepos.join(', ')}`, + ); + } + } + } finally { + await backend.dispose().catch(() => {}); + } }); group diff --git a/gitnexus/src/core/group/bridge-db.ts b/gitnexus/src/core/group/bridge-db.ts new file mode 100644 index 0000000000..27446a0ab9 --- /dev/null +++ b/gitnexus/src/core/group/bridge-db.ts @@ -0,0 +1,464 @@ +import fsp from 'node:fs/promises'; +import path from 'node:path'; +import { createHash } from 'node:crypto'; +import lbug from '@ladybugdb/core'; +import type { LbugValue } from '@ladybugdb/core'; +import type { BridgeHandle, BridgeMeta, StoredContract, CrossLink, RepoSnapshot } from './types.js'; +import { BRIDGE_SCHEMA_QUERIES, BRIDGE_SCHEMA_VERSION } from './bridge-schema.js'; +import { dedupeContracts, dedupeCrossLinks } from './normalization.js'; + +export function contractNodeId( + repo: string, + contractId: string, + role: string, + filePath: string, +): string { + return createHash('sha256').update(`${repo}\0${contractId}\0${role}\0${filePath}`).digest('hex'); +} + +export async function openBridgeDb(dbPath: string): Promise { + const parentDir = path.dirname(dbPath); + await fsp.mkdir(parentDir, { recursive: true }); + const db = new lbug.Database(dbPath, 0, false, false); // writable + const conn = new lbug.Connection(db); + return { _db: db, _conn: conn, groupDir: parentDir } as BridgeHandle; +} + +export async function ensureBridgeSchema(handle: BridgeHandle): Promise { + const conn = handle._conn as lbug.Connection; + for (const q of BRIDGE_SCHEMA_QUERIES) { + try { + await conn.query(q); + } catch (err: any) { + const msg = err?.message ?? 
''; + if (!msg.includes('already exists')) throw err; + } + } +} + +export async function queryBridge( + handle: BridgeHandle, + cypher: string, + params?: Record, +): Promise { + const conn = handle._conn as lbug.Connection; + if (params && Object.keys(params).length > 0) { + const stmt = await conn.prepare(cypher); + if (!stmt.isSuccess()) { + const errMsg = await stmt.getErrorMessage(); + throw new Error(`Bridge query prepare failed: ${errMsg}`); + } + const queryResult = await conn.execute(stmt, params); + const result = Array.isArray(queryResult) ? queryResult[0] : queryResult; + return (await result.getAll()) as T[]; + } + const queryResult = await conn.query(cypher); + const result = Array.isArray(queryResult) ? queryResult[0] : queryResult; + return (await result.getAll()) as T[]; +} + +export async function closeBridgeDb(handle: BridgeHandle): Promise { + try { + await (handle._conn as lbug.Connection).close(); + } catch { + /* ignore */ + } + try { + await (handle._db as lbug.Database).close(); + } catch { + /* ignore */ + } +} + +/* ------------------------------------------------------------------ */ +/* retryRename — handles transient EBUSY/EPERM/EACCES on Windows */ +/* ------------------------------------------------------------------ */ + +const RETRY_CODES = new Set(['EBUSY', 'EPERM', 'EACCES']); + +export async function retryRename(src: string, dst: string, attempts = 3): Promise { + for (let i = 1; i <= attempts; i++) { + try { + await fsp.rename(src, dst); + return; + } catch (err: unknown) { + const code = (err as NodeJS.ErrnoException).code; + if (!code || !RETRY_CODES.has(code) || i === attempts) throw err; + await new Promise((r) => setTimeout(r, 100 * Math.pow(2, i - 1))); + } + } +} + +/* ------------------------------------------------------------------ */ +/* writeBridgeMeta / readBridgeMeta */ +/* ------------------------------------------------------------------ */ + +export async function writeBridgeMeta(groupDir: string, meta: 
BridgeMeta): Promise { + const target = path.join(groupDir, 'meta.json'); + const tmp = `${target}.tmp.${Date.now()}`; + await fsp.writeFile(tmp, JSON.stringify(meta, null, 2), 'utf-8'); + // Use retryRename for consistency with writeBridge's atomic swap — on + // Windows a concurrent reader can cause EBUSY/EPERM even on a tiny + // meta.json, and we don't want meta write to be less robust than the + // bridge.lbug swap it accompanies. + await retryRename(tmp, target); +} + +export async function readBridgeMeta(groupDir: string): Promise { + try { + const content = await fsp.readFile(path.join(groupDir, 'meta.json'), 'utf-8'); + return JSON.parse(content) as BridgeMeta; + } catch { + return { version: 0, generatedAt: '', missingRepos: [] }; + } +} + +/* ------------------------------------------------------------------ */ +/* writeBridge — atomic write-to-temp-then-rename */ +/* ------------------------------------------------------------------ */ + +export interface WriteBridgeInput { + contracts: StoredContract[]; + crossLinks: CrossLink[]; + repoSnapshots: Record; + missingRepos: string[]; +} + +/** + * Non-fatal issues encountered during writeBridge. Callers can log these to + * surface partial-success state without aborting the whole sync. + * `sampleErrors` is capped at MAX_SAMPLE_ERRORS per category to bound memory. + */ +export interface WriteBridgeReport { + contractsInserted: number; + contractsFailed: number; + snapshotsInserted: number; + snapshotsFailed: number; + linksInserted: number; + linksFailed: number; + /** Cross-links skipped because their from/to contract nodes weren't found. 
*/ + linksDroppedMissingNode: number; + sampleErrors: Array<{ + kind: 'contract' | 'snapshot' | 'link'; + id: string; + message: string; + }>; +} + +const MAX_SAMPLE_ERRORS = 10; + +function errMessage(err: unknown): string { + if (err instanceof Error) return err.message; + try { + return String(err); + } catch { + return 'unknown error'; + } +} + +export async function writeBridge( + groupDir: string, + input: WriteBridgeInput, +): Promise { + await fsp.mkdir(groupDir, { recursive: true }); + const contracts = dedupeContracts(input.contracts); + const crossLinks = dedupeCrossLinks(input.crossLinks); + + const finalPath = path.join(groupDir, 'bridge.lbug'); + const tmpPath = path.join(groupDir, 'bridge.lbug.tmp'); + const bakPath = path.join(groupDir, 'bridge.lbug.bak'); + + const report: WriteBridgeReport = { + contractsInserted: 0, + contractsFailed: 0, + snapshotsInserted: 0, + snapshotsFailed: 0, + linksInserted: 0, + linksFailed: 0, + linksDroppedMissingNode: 0, + sampleErrors: [], + }; + + const recordError = (kind: 'contract' | 'snapshot' | 'link', id: string, err: unknown) => { + if (report.sampleErrors.length < MAX_SAMPLE_ERRORS) { + report.sampleErrors.push({ kind, id, message: errMessage(err) }); + } + }; + + // Clean up any leftover tmp + try { + await fsp.rm(tmpPath, { recursive: true, force: true }); + } catch { + /* ignore */ + } + + // 1. Create temp DB, insert all data. + // + // Everything after `openBridgeDb` must run inside a try/finally so that + // if ANY step before the explicit `closeBridgeDb` throws — schema + // creation, a contract insert loop that rethrows, a snapshot write, the + // cross-link loop, or anything else — the handle is still released. A + // leaked handle holds the native LadybugDB file lock on tmpPath, which + // (a) leaks a FD and (b) prevents the next writeBridge call from + // reusing the same tmp slot. 
+ const handle = await openBridgeDb(tmpPath); + let handleClosed = false; + try { + await ensureBridgeSchema(handle); + + // Insert contracts — tolerate individual failures (e.g., a corrupt meta + // that can't be serialized). The whole sync must not fail because one + // contract is broken. + for (const c of contracts) { + const id = contractNodeId(c.repo, c.contractId, c.role, c.symbolRef.filePath); + try { + await queryBridge( + handle, + `CREATE (n:Contract { + id: $id, + contractId: $contractId, + type: $type, + role: $role, + repo: $repo, + service: $service, + symbolUid: $symbolUid, + filePath: $filePath, + symbolName: $symbolName, + confidence: $confidence, + meta: $meta + })`, + { + id, + contractId: c.contractId, + type: c.type, + role: c.role, + repo: c.repo, + service: c.service ?? '', + symbolUid: c.symbolUid, + filePath: c.symbolRef.filePath, + symbolName: c.symbolName, + confidence: c.confidence, + meta: JSON.stringify(c.meta), + }, + ); + report.contractsInserted++; + } catch (err) { + report.contractsFailed++; + recordError('contract', id, err); + } + } + + // Insert repo snapshots + for (const [repoId, snap] of Object.entries(input.repoSnapshots)) { + try { + await queryBridge( + handle, + `CREATE (s:RepoSnapshot { + id: $id, + indexedAt: $indexedAt, + lastCommit: $lastCommit + })`, + { + id: repoId, + indexedAt: snap.indexedAt, + lastCommit: snap.lastCommit, + }, + ); + report.snapshotsInserted++; + } catch (err) { + report.snapshotsFailed++; + recordError('snapshot', repoId, err); + } + } + + // Insert cross-links (tolerating missing nodes). + // Use repo-scoped matching: find FROM node by (repo, role=consumer) and TO by (repo, role=provider) + // with symbolRef matching, because link.contractId is the consumer's ID which may differ + // from the provider's contractId (e.g. wildcard consumer vs method-level provider). 
+ const findContractNode = async ( + repo: string, + role: 'consumer' | 'provider', + symbolUid: string, + filePath: string, + symbolName: string, + ): Promise => { + if (symbolUid) { + const uidRows = await queryBridge<{ id: string }>( + handle, + `MATCH (c:Contract) WHERE c.repo = $repo AND c.role = $role + AND c.symbolUid = $symbolUid RETURN c.id AS id LIMIT 1`, + { repo, role, symbolUid }, + ); + if (uidRows.length > 0) return uidRows[0].id; + } + + const refRows = await queryBridge<{ id: string }>( + handle, + `MATCH (c:Contract) WHERE c.repo = $repo AND c.role = $role + AND c.filePath = $filePath AND c.symbolName = $symbolName + RETURN c.id AS id LIMIT 1`, + { repo, role, filePath, symbolName }, + ); + if (refRows.length > 0) return refRows[0].id; + + const fileRows = await queryBridge<{ id: string }>( + handle, + `MATCH (c:Contract) WHERE c.repo = $repo AND c.role = $role + AND c.filePath = $filePath + RETURN c.id AS id LIMIT 2`, + { repo, role, filePath }, + ); + if (fileRows.length === 1) return fileRows[0].id; + return null; + }; + + for (const link of crossLinks) { + const linkId = `${link.from.repo}::${link.contractId}->${link.to.repo}::${link.contractId}`; + try { + const fromId = await findContractNode( + link.from.repo, + 'consumer', + link.from.symbolUid, + link.from.symbolRef.filePath, + link.from.symbolRef.name, + ); + const toId = await findContractNode( + link.to.repo, + 'provider', + link.to.symbolUid, + link.to.symbolRef.filePath, + link.to.symbolRef.name, + ); + if (!fromId || !toId) { + report.linksDroppedMissingNode++; + continue; + } + await queryBridge( + handle, + ` + MATCH (a:Contract), (b:Contract) + WHERE a.id = $fromId AND b.id = $toId + CREATE (a)-[:ContractLink { + matchType: $matchType, + confidence: $confidence, + contractId: $contractId, + fromRepo: $fromRepo, + toRepo: $toRepo + }]->(b) + `, + { + fromId, + toId, + matchType: link.matchType, + confidence: link.confidence, + contractId: link.contractId, + fromRepo: 
link.from.repo, + toRepo: link.to.repo, + }, + ); + report.linksInserted++; + } catch (err) { + report.linksFailed++; + recordError('link', linkId, err); + } + } + + // 2. Close temp DB (happy path). The finally block also calls + // closeBridgeDb if we threw above; `handleClosed` prevents a + // double-close on the native handle. + await closeBridgeDb(handle); + handleClosed = true; + } finally { + if (!handleClosed) { + await closeBridgeDb(handle).catch(() => { + /* ignore: cleanup path, best effort */ + }); + } + } + + // 3. Atomic swap: old→.bak, tmp→final, rm .bak + try { + await fsp.access(finalPath); + await retryRename(finalPath, bakPath); + } catch { + /* no existing db */ + } + await retryRename(tmpPath, finalPath); + try { + await fsp.rm(bakPath, { recursive: true, force: true }); + } catch { + /* ignore */ + } + + // 4. Write meta.json + await writeBridgeMeta(groupDir, { + version: BRIDGE_SCHEMA_VERSION, + generatedAt: new Date().toISOString(), + missingRepos: input.missingRepos, + }); + + return report; +} + +/* ------------------------------------------------------------------ */ +/* openBridgeDbReadOnly */ +/* ------------------------------------------------------------------ */ + +export async function openBridgeDbReadOnly(groupDir: string): Promise { + const dbPath = path.join(groupDir, 'bridge.lbug'); + try { + await fsp.access(dbPath); + } catch { + // Check for .bak recovery + const bakPath = path.join(groupDir, 'bridge.lbug.bak'); + try { + await fsp.access(bakPath); + await fsp.rename(bakPath, dbPath); + } catch { + return null; + } + } + // Version gate: check meta.json version compatibility + const meta = await readBridgeMeta(groupDir); + if (meta.version > 0 && meta.version !== BRIDGE_SCHEMA_VERSION) { + return null; // incompatible schema version — fallback to JSON or re-sync + } + + // Open the native handle. If Connection construction throws AFTER + // Database was successfully allocated, we'd leak the native Database + // object. 
Wrap each step separately and tear down the partial handle. + let db: lbug.Database | undefined; + let conn: lbug.Connection | undefined; + try { + db = new lbug.Database(dbPath, 0, false, true); // readOnly + conn = new lbug.Connection(db); + return { _db: db, _conn: conn, groupDir } as BridgeHandle; + } catch { + if (conn) { + try { + await conn.close(); + } catch { + /* ignore */ + } + } + if (db) { + try { + await db.close(); + } catch { + /* ignore */ + } + } + return null; + } +} + +/* ------------------------------------------------------------------ */ +/* bridgeExists */ +/* ------------------------------------------------------------------ */ + +export async function bridgeExists(groupDir: string): Promise { + const handle = await openBridgeDbReadOnly(groupDir); + if (!handle) return false; + await closeBridgeDb(handle); + return true; +} diff --git a/gitnexus/src/core/group/bridge-schema.ts b/gitnexus/src/core/group/bridge-schema.ts new file mode 100644 index 0000000000..7998e28231 --- /dev/null +++ b/gitnexus/src/core/group/bridge-schema.ts @@ -0,0 +1,42 @@ +/** + * Bridge LadybugDB schema for cross-repo Contract Registry. + * Separate from per-repo schema in lbug/schema.ts. 
+ */ + +export const BRIDGE_SCHEMA_VERSION = 1; + +export const CONTRACT_SCHEMA = ` +CREATE NODE TABLE Contract ( + id STRING, + contractId STRING, + type STRING, + role STRING, + repo STRING, + service STRING DEFAULT '', + symbolUid STRING DEFAULT '', + filePath STRING DEFAULT '', + symbolName STRING DEFAULT '', + confidence DOUBLE DEFAULT 0.0, + meta STRING DEFAULT '{}', + PRIMARY KEY (id) +)`; + +export const REPO_SNAPSHOT_SCHEMA = ` +CREATE NODE TABLE RepoSnapshot ( + id STRING, + indexedAt STRING DEFAULT '', + lastCommit STRING DEFAULT '', + PRIMARY KEY (id) +)`; + +export const CONTRACT_LINK_SCHEMA = ` +CREATE REL TABLE ContractLink ( + FROM Contract TO Contract, + matchType STRING, + confidence DOUBLE, + contractId STRING, + fromRepo STRING, + toRepo STRING +)`; + +export const BRIDGE_SCHEMA_QUERIES = [CONTRACT_SCHEMA, REPO_SNAPSHOT_SCHEMA, CONTRACT_LINK_SCHEMA]; diff --git a/gitnexus/src/core/group/cross-impact.ts b/gitnexus/src/core/group/cross-impact.ts new file mode 100644 index 0000000000..a0713d9dcf --- /dev/null +++ b/gitnexus/src/core/group/cross-impact.ts @@ -0,0 +1,518 @@ +import type { + ContractRegistry, + CrossLink, + GroupImpactResult, + CrossRepoImpact, + OutOfScopeLink, + TruncationReason, +} from './types.js'; + +export interface LegacyGroupImpactOptions { + groupName: string; + target: string; + repoPath: string; + direction: 'upstream' | 'downstream'; + registry: ContractRegistry; + localImpactFn: (target: string, direction: string) => Promise; + crossImpactFn: ( + targetGroupPath: string, + symbolUid: string, + direction: string, + ) => Promise; + maxDepth?: number; + minConfidence?: number; + subgroup?: string; + timeout?: number; + crossDepth?: number; +} + +export interface GroupImpactOptions { + groupName: string; + target: string; + repoPath: string; + direction: 'upstream' | 'downstream'; + bridgeQuery: (cypher: string, params: Record) => Promise; + localImpactFn: (target: string, direction: string) => Promise; + crossImpactFn: ( + 
targetGroupPath: string, + symbolUid: string, + direction: string, + hint?: { filePath: string; symbolName: string }, + ) => Promise; + maxDepth?: number; + minConfidence?: number; + subgroup?: string; + timeout?: number; + crossDepth?: number; +} + +function collectPhase1Uids(local: Record): Set { + const uids = new Set(); + const target = local.target as { id?: string } | undefined; + if (target?.id) uids.add(String(target.id)); + const byDepth = (local.byDepth || {}) as Record; + for (const arr of Object.values(byDepth)) { + for (const item of arr || []) { + if (item?.id) uids.add(String(item.id)); + } + } + return uids; +} + +function refKey(filePath: string, name: string): string { + return `${filePath}::${name}`; +} + +function collectPhase1Refs(local: Record): Set { + const refs = new Set(); + const t = local.target as { filePath?: string; name?: string } | undefined; + if (t?.filePath?.length && t?.name?.length) refs.add(refKey(t.filePath, t.name)); + const byDepth = (local.byDepth || {}) as Record; + for (const arr of Object.values(byDepth)) { + for (const item of arr || []) { + if (item.filePath?.length && item.name?.length) refs.add(refKey(item.filePath, item.name)); + } + } + return refs; +} + +function linkMatchesRefs( + link: CrossLink, + refs: Set, + direction: 'upstream' | 'downstream', +): boolean { + if (direction === 'upstream') { + const r = link.to.symbolRef; + if (!r.filePath?.length || !r.name?.length) return false; + return refs.has(refKey(r.filePath, r.name)); + } + const r = link.from.symbolRef; + if (!r.filePath?.length || !r.name?.length) return false; + return refs.has(refKey(r.filePath, r.name)); +} + +function inSubgroup(repoPath: string, subgroup?: string): boolean { + if (!subgroup?.trim()) return true; + const s = subgroup.replace(/\/+$/, ''); + return repoPath === s || repoPath.startsWith(`${s}/`); +} + +function mergeRisk( + base: string, + crossHits: number, + maxCrossConf: number, + distinctCrossRepos: number, +): string { + 
const order = ['LOW', 'MEDIUM', 'HIGH', 'CRITICAL']; + let idx = Math.max(0, order.indexOf(base)); + + if (crossHits > 0 && maxCrossConf >= 0.85) { + idx = Math.max(idx, order.indexOf('HIGH')); + } else if (crossHits > 0 && maxCrossConf > 0) { + idx = Math.max(idx, order.indexOf('MEDIUM')); + } + if (distinctCrossRepos >= 3) { + idx = Math.max(idx, order.indexOf('CRITICAL')); + } + + return order[idx] ?? base; +} + +function computePhase1Timeout(timeout: number): number { + if (timeout <= 1000) return timeout; + return Math.min(Math.ceil(timeout * 0.3), 10000); +} + +async function runPhase1WithTimeout( + timeout: number, + localImpactFn: (target: string, direction: string) => Promise, + target: string, + direction: 'upstream' | 'downstream', +): Promise<{ ok: true; v: unknown } | { ok: false }> { + let timer: ReturnType | undefined; + try { + return await Promise.race([ + localImpactFn(target, direction).then((v) => ({ ok: true as const, v })), + new Promise<{ ok: false }>((resolve) => { + timer = setTimeout(() => resolve({ ok: false }), timeout); + }), + ]); + } finally { + if (timer) clearTimeout(timer); + } +} + +// Multi-hop cross-boundary traversal is not implemented yet; both entry points +// enforce the 1-hop limit. When multi-hop lands, remove this clamp here AND in +// GroupService.groupImpact (which surfaces crossDepthWarning to the caller). +const MAX_SUPPORTED_CROSS_DEPTH = 1; + +export async function runGroupImpactLegacy( + opts: LegacyGroupImpactOptions, +): Promise { + const timeout = opts.timeout ?? 30000; + const minConfidence = opts.minConfidence ?? 0.5; + const crossDepth = Math.min(MAX_SUPPORTED_CROSS_DEPTH, opts.crossDepth ?? 
1); + + const tStart = Date.now(); + const wallDeadline = tStart + timeout; + const phase1Timeout = computePhase1Timeout(timeout); + + const localResult = await runPhase1WithTimeout( + phase1Timeout, + opts.localImpactFn, + opts.target, + opts.direction, + ); + + let truncated = !localResult.ok; + let truncationReason: TruncationReason | undefined = localResult.ok + ? undefined + : 'phase1_timeout'; + const local = localResult.ok + ? (localResult.v as Record) + : ({ + target: { id: '', name: opts.target, filePath: '' }, + direction: opts.direction, + impactedCount: 0, + risk: 'LOW', + summary: { direct: 0, processes_affected: 0, modules_affected: 0 }, + affected_processes: [], + affected_modules: [], + byDepth: {}, + // Marks the local block as a placeholder produced by the Phase-1 timeout path. + // Consumers should treat zero counts as "unknown" rather than "verified empty". + phase1TimedOut: true, + } as Record); + + const uids = collectPhase1Uids(local); + const phase1Refs = collectPhase1Refs(local); + const cross: CrossRepoImpact[] = []; + const outOfScope: OutOfScopeLink[] = []; + const truncatedRepos: string[] = []; + if (!localResult.ok) { + truncatedRepos.push(opts.repoPath); + } + + const links = [...opts.registry.crossLinks] + .filter((l) => l.confidence >= minConfidence) + .sort((a, b) => b.confidence - a.confidence); + + const applicable: CrossLink[] = []; + for (const link of links) { + const uidMatch = + opts.direction === 'upstream' + ? 
Boolean(link.to.symbolUid && uids.has(link.to.symbolUid)) + : Boolean(link.from.symbolUid && uids.has(link.from.symbolUid)); + const refMatch = !uidMatch && linkMatchesRefs(link, phase1Refs, opts.direction); + if (!uidMatch && !refMatch) continue; + applicable.push(link); + } + + let maxCrossConf = 0; + const distinctRepos = new Set(); + + for (const link of applicable) { + if (Date.now() > wallDeadline) { + truncated = true; + truncationReason ??= 'wall_deadline'; + break; + } + + const fanOutRepo = opts.direction === 'upstream' ? link.from.repo : link.to.repo; + const symbolUid = opts.direction === 'upstream' ? link.from.symbolUid : link.to.symbolUid; + + if (!inSubgroup(fanOutRepo, opts.subgroup)) { + outOfScope.push({ + from: link.from.repo, + to: link.to.repo, + contractId: link.contractId, + matchType: link.matchType, + confidence: link.confidence, + }); + continue; + } + + if (crossDepth < 1) break; + + const remote = await opts.crossImpactFn(fanOutRepo, symbolUid, opts.direction); + if (remote) { + maxCrossConf = Math.max(maxCrossConf, link.confidence); + distinctRepos.add(fanOutRepo); + const r = remote as Record; + cross.push({ + repo: fanOutRepo, + repo_path: fanOutRepo, + contract: { + id: link.contractId, + type: link.type, + match_type: link.matchType, + confidence: link.confidence, + }, + by_depth: (r.byDepth || {}) as Record, + affected_processes: (r.affected_processes || []) as string[], + }); + } + + if (Date.now() > wallDeadline) { + truncated = true; + truncationReason ??= 'wall_deadline'; + truncatedRepos.push(fanOutRepo); + break; + } + } + + const summaryLocal = (local.summary || {}) as { + direct?: number; + processes_affected?: number; + modules_affected?: number; + }; + + const baseRisk = String(local.risk || 'LOW'); + const risk = mergeRisk(baseRisk, cross.length, maxCrossConf, distinctRepos.size); + + return { + local, + group: opts.groupName, + cross, + outOfScope, + truncated, + truncatedRepos, + ...(truncationReason ? 
{ truncationReason } : {}), + summary: { + direct: summaryLocal.direct ?? 0, + processes_affected: summaryLocal.processes_affected ?? 0, + modules_affected: summaryLocal.modules_affected ?? 0, + cross_repo_hits: cross.length, + }, + risk, + }; +} + +/* ------------------------------------------------------------------ */ +/* Cypher-based Phase 2 */ +/* ------------------------------------------------------------------ */ + +const UPSTREAM_QUERY_RETURN = ` +RETURN consumer.repo AS fanOutRepo, consumer.symbolUid AS fanOutUid, + consumer.filePath AS fanOutFilePath, consumer.symbolName AS fanOutSymbolName, + provider.symbolUid AS matchedLocalUid, + provider.filePath AS matchedLocalFilePath, + provider.symbolName AS matchedLocalSymbolName, + l.matchType AS matchType, l.confidence AS confidence, l.contractId AS contractId, + consumer.type AS contractType +ORDER BY l.confidence DESC`; + +const DOWNSTREAM_QUERY_RETURN = ` +RETURN provider.repo AS fanOutRepo, provider.symbolUid AS fanOutUid, + provider.filePath AS fanOutFilePath, provider.symbolName AS fanOutSymbolName, + consumer.symbolUid AS matchedLocalUid, + consumer.filePath AS matchedLocalFilePath, + consumer.symbolName AS matchedLocalSymbolName, + l.matchType AS matchType, l.confidence AS confidence, l.contractId AS contractId, + consumer.type AS contractType +ORDER BY l.confidence DESC`; + +function buildBridgeQuery( + direction: 'upstream' | 'downstream', + hasUids: boolean, + hasRefs: boolean, + subgroup?: string, +): string | null { + const isUpstream = direction === 'upstream'; + const sourceAlias = isUpstream ? 'provider' : 'consumer'; + const fanOutAlias = isUpstream ? 
'consumer' : 'provider'; + const localMatchers: string[] = []; + + if (hasUids) { + localMatchers.push(`${sourceAlias}.symbolUid IN $localUids`); + } + if (hasRefs) { + localMatchers.push( + `(${sourceAlias}.filePath + '::' + ${sourceAlias}.symbolName) IN $localRefs`, + ); + } + if (localMatchers.length === 0) { + return null; + } + + const whereClauses = [ + `${sourceAlias}.repo = $sourceRepo`, + `(${localMatchers.join(' OR ')})`, + 'l.confidence >= $minConfidence', + ]; + const normalizedSubgroup = subgroup?.trim().replace(/\/+$/, ''); + if (normalizedSubgroup) { + whereClauses.push( + `(${fanOutAlias}.repo = $subgroup OR ${fanOutAlias}.repo STARTS WITH $subgroup + '/')`, + ); + } + + const returnClause = isUpstream ? UPSTREAM_QUERY_RETURN : DOWNSTREAM_QUERY_RETURN; + return `MATCH (consumer:Contract)-[l:ContractLink]->(provider:Contract)\nWHERE ${whereClauses.join('\n AND ')}${returnClause}`; +} + +interface CrossImpactRow { + fanOutRepo: string; + fanOutUid: string; + fanOutFilePath: string; + fanOutSymbolName: string; + matchedLocalUid: string; + matchedLocalFilePath: string; + matchedLocalSymbolName: string; + matchType: string; + confidence: number; + contractId: string; + contractType: string; +} + +export async function runGroupImpact(opts: GroupImpactOptions): Promise { + const timeout = opts.timeout ?? 30000; + const minConfidence = opts.minConfidence ?? 0.5; + const crossDepth = Math.min(MAX_SUPPORTED_CROSS_DEPTH, opts.crossDepth ?? 1); + + const tStart = Date.now(); + const wallDeadline = tStart + timeout; + const phase1Timeout = computePhase1Timeout(timeout); + + const localResult = await runPhase1WithTimeout( + phase1Timeout, + opts.localImpactFn, + opts.target, + opts.direction, + ); + + let truncated = !localResult.ok; + let truncationReason: TruncationReason | undefined = localResult.ok + ? undefined + : 'phase1_timeout'; + const local = localResult.ok + ? 
(localResult.v as Record) + : ({ + target: { id: '', name: opts.target, filePath: '' }, + direction: opts.direction, + impactedCount: 0, + risk: 'LOW', + summary: { direct: 0, processes_affected: 0, modules_affected: 0 }, + affected_processes: [], + affected_modules: [], + byDepth: {}, + // Marks the local block as a placeholder produced by the Phase-1 timeout path. + // Consumers should treat zero counts as "unknown" rather than "verified empty". + phase1TimedOut: true, + } as Record); + + const uids = collectPhase1Uids(local); + const phase1Refs = collectPhase1Refs(local); + const cross: CrossRepoImpact[] = []; + const outOfScope: OutOfScopeLink[] = []; + const truncatedRepos: string[] = []; + if (!localResult.ok) { + truncatedRepos.push(opts.repoPath); + } + + /* Phase 2 — Cypher bridge query */ + const normalizedSubgroup = opts.subgroup?.trim().replace(/\/+$/, '') || null; + const localUids = [...uids]; + const localRefs = [...phase1Refs]; + const cypher = buildBridgeQuery( + opts.direction, + localUids.length > 0, + localRefs.length > 0, + normalizedSubgroup ?? undefined, + ); + const queryParams: Record = { + sourceRepo: opts.repoPath, + minConfidence, + }; + if (localUids.length > 0) { + queryParams.localUids = localUids; + } + if (localRefs.length > 0) { + queryParams.localRefs = localRefs; + } + if (normalizedSubgroup) { + queryParams.subgroup = normalizedSubgroup; + } + + const rows = cypher ? await opts.bridgeQuery(cypher, queryParams) : []; + + let maxCrossConf = 0; + const distinctRepos = new Set(); + + for (const row of rows) { + if (!inSubgroup(row.fanOutRepo, opts.subgroup)) { + outOfScope.push({ + from: row.fanOutRepo, + to: opts.repoPath, + contractId: row.contractId, + matchType: row.matchType as OutOfScopeLink['matchType'], + confidence: row.confidence, + }); + continue; + } + + if (Date.now() > wallDeadline) { + truncated = true; + truncationReason ??= 'wall_deadline'; + break; + } + if (crossDepth < 1) break; + + const hint = row.fanOutUid + ? 
undefined + : { filePath: row.fanOutFilePath, symbolName: row.fanOutSymbolName }; + const remote = await opts.crossImpactFn(row.fanOutRepo, row.fanOutUid, opts.direction, hint); + + if (remote && typeof remote === 'object' && !('error' in (remote as Record))) { + maxCrossConf = Math.max(maxCrossConf, row.confidence); + distinctRepos.add(row.fanOutRepo); + const r = remote as Record; + cross.push({ + repo: row.fanOutRepo, + repo_path: row.fanOutRepo, + contract: { + id: row.contractId, + type: row.contractType as CrossRepoImpact['contract']['type'], + match_type: row.matchType as CrossRepoImpact['contract']['match_type'], + confidence: row.confidence, + }, + by_depth: (r.byDepth || {}) as Record, + affected_processes: (r.affected_processes || []) as string[], + }); + } + + if (Date.now() > wallDeadline) { + truncated = true; + truncationReason ??= 'wall_deadline'; + truncatedRepos.push(row.fanOutRepo); + break; + } + } + + const summaryLocal = (local.summary || {}) as { + direct?: number; + processes_affected?: number; + modules_affected?: number; + }; + + const baseRisk = String(local.risk || 'LOW'); + const risk = mergeRisk(baseRisk, cross.length, maxCrossConf, distinctRepos.size); + + return { + local, + group: opts.groupName, + cross, + outOfScope, + truncated, + truncatedRepos, + ...(truncationReason ? { truncationReason } : {}), + summary: { + direct: summaryLocal.direct ?? 0, + processes_affected: summaryLocal.processes_affected ?? 0, + modules_affected: summaryLocal.modules_affected ?? 
0, + cross_repo_hits: cross.length, + }, + risk, + }; +} diff --git a/gitnexus/src/core/group/extractors/grpc-extractor.ts b/gitnexus/src/core/group/extractors/grpc-extractor.ts index b4cefadc51..6e52c0b746 100644 --- a/gitnexus/src/core/group/extractors/grpc-extractor.ts +++ b/gitnexus/src/core/group/extractors/grpc-extractor.ts @@ -25,20 +25,110 @@ function serviceOnlyContractId(serviceName: string): string { return `grpc::${serviceName}/*`; } +/** + * Replace all .proto comments and string literals with spaces, preserving the + * original length and character offsets of the input. This lets downstream + * regex / brace-depth parsers run on a "sanitized" copy without having to + * understand proto syntax, while any RegExp.exec/index-based lookups that + * were already positional against `content` continue to work against the + * original string. + * + * Supported comment forms: `// line comment`, `/* block comment * /`. + * Supported strings: double-quoted ("…") and single-quoted ('…') with `\` + * escape handling. Raw/unterminated strings are not supported — we stop + * on a line break for line-style comments and on EOF for unterminated + * strings/blocks, which matches how most real proto files parse. + */ +function stripProtoCommentsAndStrings(content: string): string { + const out = new Array(content.length); + let i = 0; + while (i < content.length) { + const ch = content[i]; + const next = content[i + 1]; + + // Line comment: // ... \n + if (ch === '/' && next === '/') { + out[i] = ' '; + out[i + 1] = ' '; + i += 2; + while (i < content.length && content[i] !== '\n') { + out[i] = content[i] === '\r' ? '\r' : ' '; + i++; + } + continue; + } + + // Block comment: /* ... 
*/ + if (ch === '/' && next === '*') { + out[i] = ' '; + out[i + 1] = ' '; + i += 2; + while (i < content.length) { + if (content[i] === '*' && content[i + 1] === '/') { + out[i] = ' '; + out[i + 1] = ' '; + i += 2; + break; + } + // Preserve newlines so line numbers stay stable for downstream code. + out[i] = content[i] === '\n' || content[i] === '\r' ? content[i] : ' '; + i++; + } + continue; + } + + // String literal: "..." or '...' + if (ch === '"' || ch === "'") { + const quote = ch; + out[i] = ' '; // replace opening quote + i++; + while (i < content.length) { + const c = content[i]; + if (c === '\\' && i + 1 < content.length) { + // Skip escaped pair (e.g. \" \n \\) + out[i] = ' '; + out[i + 1] = ' '; + i += 2; + continue; + } + if (c === quote) { + out[i] = ' '; + i++; + break; + } + // Preserve newlines; proto technically disallows unescaped newlines + // inside strings, but real files occasionally have them. + out[i] = c === '\n' || c === '\r' ? c : ' '; + i++; + } + continue; + } + + out[i] = ch; + i++; + } + return out.join(''); +} + function extractServiceBlocks(content: string): Array<{ name: string; body: string }> { const results: Array<{ name: string; body: string }> = []; - // v1: brace-depth only — braces inside comments or string literals are not filtered (see spec Fix 2) + // Sanitize comments and string literals so braces inside them don't + // throw off the depth counter. The sanitized copy has the same length + // and offsets as the original, so we use it ONLY to scan for service + // headers and braces; the service body we return is sliced from the + // ORIGINAL content to preserve exact source text for downstream use. 
+ const sanitized = stripProtoCommentsAndStrings(content); const headerRe = /service\s+(\w+)\s*\{/g; let headerMatch: RegExpExecArray | null; - while ((headerMatch = headerRe.exec(content)) !== null) { + while ((headerMatch = headerRe.exec(sanitized)) !== null) { const serviceName = headerMatch[1]; const bodyStart = headerMatch.index + headerMatch[0].length; let depth = 1; let pos = bodyStart; - while (pos < content.length && depth > 0) { - const ch = content[pos]; + while (pos < sanitized.length && depth > 0) { + const ch = sanitized[pos]; if (ch === '{') depth++; else if (ch === '}') depth--; pos++; @@ -75,6 +165,163 @@ function makeContract( }; } +export interface ProtoServiceInfo { + package: string; + serviceName: string; + methods: string[]; + protoPath: string; +} + +function normalizeProtoPath(rel: string): string { + return rel.replace(/\\/g, '/'); +} + +function extractProtoImports(content: string): string[] { + const imports: string[] = []; + const re = /^\s*import\s+"([^"]+)"\s*;/gm; + let match: RegExpExecArray | null; + while ((match = re.exec(content)) !== null) { + imports.push(match[1]); + } + return imports; +} + +function longestSharedSegmentRun(aPath: string, bPath: string): number { + const a = aPath.split('/').filter(Boolean); + const b = bPath.split('/').filter(Boolean); + let best = 0; + + for (let i = 0; i < a.length; i++) { + for (let j = 0; j < b.length; j++) { + let run = 0; + while (a[i + run] && b[j + run] && a[i + run] === b[j + run]) { + run++; + } + if (run > best) best = run; + } + } + + return best; +} + +async function buildProtoContext(repoPath: string): Promise<{ + packagesByProto: Map; + servicesByName: Map; +}> { + const servicesByName = new Map(); + const protoFiles = await glob('**/*.proto', { + cwd: repoPath, + absolute: false, + nodir: true, + ignore: ['**/node_modules/**', '**/.git/**', '**/vendor/**'], + }); + const contents = new Map(); + + for (const rel of protoFiles) { + const content = readSafe(repoPath, rel); + if 
(!content) continue; + contents.set(normalizeProtoPath(rel), content); + } + + const packagesByProto = new Map(); + + const resolvePackage = (protoPath: string, seen = new Set()): string => { + if (packagesByProto.has(protoPath)) return packagesByProto.get(protoPath) ?? ''; + if (seen.has(protoPath)) return ''; + + const content = contents.get(protoPath); + if (!content) return ''; + + seen.add(protoPath); + const pkgMatch = content.match(/^\s*package\s+([\w.]+)\s*;/m); + if (pkgMatch?.[1]) { + packagesByProto.set(protoPath, pkgMatch[1]); + return pkgMatch[1]; + } + + for (const importPath of extractProtoImports(content)) { + const normalizedImport = normalizeProtoPath(importPath); + const candidates = [ + normalizeProtoPath( + path.posix.normalize(path.posix.join(path.posix.dirname(protoPath), normalizedImport)), + ), + normalizedImport, + ]; + for (const candidate of candidates) { + if (!contents.has(candidate)) continue; + const inheritedPackage = resolvePackage(candidate, seen); + if (inheritedPackage) { + packagesByProto.set(protoPath, inheritedPackage); + return inheritedPackage; + } + } + } + + packagesByProto.set(protoPath, ''); + return ''; + }; + + for (const rel of protoFiles) { + const normalizedRel = normalizeProtoPath(rel); + const content = contents.get(normalizedRel); + if (!content) continue; + const pkg = resolvePackage(normalizedRel); + + const serviceBlocks = extractServiceBlocks(content); + for (const block of serviceBlocks) { + const rpcRe = /rpc\s+(\w+)\s*\(/g; + const methods: string[] = []; + let m: RegExpExecArray | null; + while ((m = rpcRe.exec(block.body)) !== null) { + methods.push(m[1]); + } + const info: ProtoServiceInfo = { + package: pkg, + serviceName: block.name, + methods, + protoPath: normalizedRel, + }; + const existing = servicesByName.get(block.name) ?? 
[]; + existing.push(info); + servicesByName.set(block.name, existing); + } + } + + return { packagesByProto, servicesByName }; +} + +export async function buildProtoMap(repoPath: string): Promise<Map<string, ProtoServiceInfo[]>> { + const { servicesByName } = await buildProtoContext(repoPath); + return servicesByName; +} + +export function resolveProtoConflict( + _serviceName: string, + sourceFilePath: string, + candidates: ProtoServiceInfo[], +): ProtoServiceInfo | null { + if (candidates.length === 0) return null; + if (candidates.length === 1) return candidates[0]; + + const sourceDir = normalizeProtoPath(path.dirname(sourceFilePath)); + let best = candidates[0]; + let bestScore = -1; + for (const c of candidates) { + const protoDir = normalizeProtoPath(path.dirname(c.protoPath)); + const sharedRun = longestSharedSegmentRun(sourceDir, protoDir); + if (sharedRun > bestScore) { + bestScore = sharedRun; + best = c; + } + } + return best; +} + +export function serviceContractId(pkg: string, serviceName: string): string { + const prefix = pkg ? `${pkg}.${serviceName}` : serviceName; + return `grpc::${prefix}/*`; +} + export class GrpcExtractor implements ContractExtractor { type = 'grpc' as const; @@ -88,6 +335,7 @@ export class GrpcExtractor implements ContractExtractor { _repo: RepoHandle, ): Promise<ExtractedContract[]> { const out: ExtractedContract[] = []; + const protoContext = await buildProtoContext(repoPath); // Proto files — definitive provider source const protoFiles = await glob('**/*.proto', { @@ -97,8 +345,17 @@ export class GrpcExtractor implements ContractExtractor { }); for (const rel of protoFiles) { const content = readSafe(repoPath, rel); - if (content) out.push(...this.parseProtoFile(content, rel)); + if (content) { + out.push( + ...this.parseProtoFile( + content, + rel, + protoContext.packagesByProto.get(normalizeProtoPath(rel)) ?? 
'', + ), + ); + } } + const protoMap = protoContext.servicesByName; // Source files — server/client detection const sourceFiles = await glob('**/*.{go,java,py,ts,tsx,js,jsx}', { @@ -112,28 +369,26 @@ export class GrpcExtractor implements ContractExtractor { const ext = path.extname(rel).toLowerCase(); if (ext === '.go') { - out.push(...this.scanGoProviders(content, rel)); - out.push(...this.scanGoConsumers(content, rel)); + out.push(...this.scanGoProviders(content, rel, protoMap)); + out.push(...this.scanGoConsumers(content, rel, protoMap)); } else if (ext === '.java') { - out.push(...this.scanJavaProviders(content, rel)); - out.push(...this.scanJavaConsumers(content, rel)); + out.push(...this.scanJavaProviders(content, rel, protoMap)); + out.push(...this.scanJavaConsumers(content, rel, protoMap)); } else if (ext === '.py') { - out.push(...this.scanPythonProviders(content, rel)); - out.push(...this.scanPythonConsumers(content, rel)); + out.push(...this.scanPythonProviders(content, rel, protoMap)); + out.push(...this.scanPythonConsumers(content, rel, protoMap)); } else if (['.ts', '.tsx', '.js', '.jsx'].includes(ext)) { - out.push(...this.scanTsProviders(content, rel)); + out.push(...this.scanTsProviders(content, rel, protoMap)); + out.push(...this.scanTsConsumers(content, rel, protoMap)); } } return this.dedupe(out); } - private parseProtoFile(content: string, filePath: string): ExtractedContract[] { + private parseProtoFile(content: string, filePath: string, pkg: string): ExtractedContract[] { const out: ExtractedContract[] = []; - const pkgMatch = content.match(/^package\s+([\w.]+)\s*;/m); - const pkg = pkgMatch ? 
pkgMatch[1] : ''; - for (const { name: serviceName, body } of extractServiceBlocks(content)) { const rpcRe = /rpc\s+(\w+)\s*\(/g; let rpcMatch: RegExpExecArray | null; @@ -154,7 +409,11 @@ export class GrpcExtractor implements ContractExtractor { return out; } - private scanGoProviders(content: string, filePath: string): ExtractedContract[] { + private scanGoProviders( + content: string, + filePath: string, + protoMap: Map, + ): ExtractedContract[] { const out: ExtractedContract[] = []; // pb.RegisterXxxServer( @@ -162,15 +421,17 @@ export class GrpcExtractor implements ContractExtractor { let m: RegExpExecArray | null; while ((m = registerRe.exec(content)) !== null) { const serviceName = m[1]; + const candidates = protoMap.get(serviceName); + const proto = resolveProtoConflict(serviceName, filePath, candidates ?? []); + const cid = proto + ? serviceContractId(proto.package, proto.serviceName) + : serviceOnlyContractId(serviceName); + const conf = proto ? 0.8 : 0.65; out.push( - makeContract( - serviceOnlyContractId(serviceName), - 'provider', - filePath, - `Register${serviceName}Server`, - 0.8, - { service: serviceName, source: 'go_register' }, - ), + makeContract(cid, 'provider', filePath, `Register${serviceName}Server`, conf, { + service: serviceName, + source: 'go_register', + }), ); } @@ -178,51 +439,74 @@ export class GrpcExtractor implements ContractExtractor { const unimplRe = /\w+\.Unimplemented(\w+)Server\b/g; while ((m = unimplRe.exec(content)) !== null) { const serviceName = m[1]; + const candidates = protoMap.get(serviceName); + const proto = resolveProtoConflict(serviceName, filePath, candidates ?? []); + const cid = proto + ? serviceContractId(proto.package, proto.serviceName) + : serviceOnlyContractId(serviceName); + const conf = proto ? 
0.8 : 0.65; out.push( - makeContract( - serviceOnlyContractId(serviceName), - 'provider', - filePath, - `Unimplemented${serviceName}Server`, - 0.8, - { service: serviceName, source: 'go_unimplemented' }, - ), + makeContract(cid, 'provider', filePath, `Unimplemented${serviceName}Server`, conf, { + service: serviceName, + source: 'go_unimplemented', + }), ); } return out; } - private scanGoConsumers(content: string, filePath: string): ExtractedContract[] { + private scanGoConsumers( + content: string, + filePath: string, + protoMap: Map, + ): ExtractedContract[] { const out: ExtractedContract[] = []; const re = /\w+\.New(\w+)Client\s*\(/g; let m: RegExpExecArray | null; while ((m = re.exec(content)) !== null) { const serviceName = m[1]; + const candidates = protoMap.get(serviceName); + const proto = resolveProtoConflict(serviceName, filePath, candidates ?? []); + const cid = proto + ? serviceContractId(proto.package, proto.serviceName) + : serviceOnlyContractId(serviceName); + const conf = proto ? 0.75 : 0.55; out.push( - makeContract( - serviceOnlyContractId(serviceName), - 'consumer', - filePath, - `New${serviceName}Client`, - 0.7, - { service: serviceName, source: 'go_client' }, - ), + makeContract(cid, 'consumer', filePath, `New${serviceName}Client`, conf, { + service: serviceName, + source: 'go_client', + }), ); } return out; } - private scanJavaProviders(content: string, filePath: string): ExtractedContract[] { + private scanJavaProviders( + content: string, + filePath: string, + protoMap: Map, + ): ExtractedContract[] { const out: ExtractedContract[] = []; + const resolveJava = (svcName: string): { cid: string; conf: number } => { + const candidates = protoMap.get(svcName); + const proto = resolveProtoConflict(svcName, filePath, candidates ?? []); + const cid = proto + ? serviceContractId(proto.package, proto.serviceName) + : serviceOnlyContractId(svcName); + const conf = proto ? 
0.8 : 0.65; + return { cid, conf }; + }; + // @GrpcService if (content.includes('@GrpcService')) { const implBaseRe = /extends\s+(\w+)Grpc\.(\w+)ImplBase/; const m = content.match(implBaseRe); if (m) { + const { cid, conf } = resolveJava(m[1]); out.push( - makeContract(serviceOnlyContractId(m[1]), 'provider', filePath, m[2], 0.8, { + makeContract(cid, 'provider', filePath, m[2], conf, { service: m[1], source: 'java_grpc_service', }), @@ -234,8 +518,9 @@ export class GrpcExtractor implements ContractExtractor { const cm = content.match(classRe); if (cm) { const svcName = cm[2].replace(/Grpc$/, ''); + const { cid, conf } = resolveJava(svcName); out.push( - makeContract(serviceOnlyContractId(svcName), 'provider', filePath, cm[1], 0.8, { + makeContract(cid, 'provider', filePath, cm[1], conf, { service: svcName, source: 'java_grpc_service', }), @@ -250,8 +535,9 @@ export class GrpcExtractor implements ContractExtractor { const m = content.match(implRe); if (m) { const svcName = m[2] || m[1].replace(/Grpc$/, ''); + const { cid, conf } = resolveJava(svcName); out.push( - makeContract(serviceOnlyContractId(svcName), 'provider', filePath, svcName, 0.8, { + makeContract(cid, 'provider', filePath, svcName, conf, { service: svcName, source: 'java_impl_base', }), @@ -262,49 +548,65 @@ export class GrpcExtractor implements ContractExtractor { return out; } - private scanJavaConsumers(content: string, filePath: string): ExtractedContract[] { + private scanJavaConsumers( + content: string, + filePath: string, + protoMap: Map, + ): ExtractedContract[] { const out: ExtractedContract[] = []; // XxxGrpc.newBlockingStub( or XxxGrpc.newStub( const re = /(\w+)Grpc\.new(?:Blocking)?Stub\s*\(/g; let m: RegExpExecArray | null; while ((m = re.exec(content)) !== null) { const serviceName = m[1]; + const candidates = protoMap.get(serviceName); + const proto = resolveProtoConflict(serviceName, filePath, candidates ?? []); + const cid = proto + ? 
serviceContractId(proto.package, proto.serviceName) + : serviceOnlyContractId(serviceName); + const conf = proto ? 0.75 : 0.55; out.push( - makeContract( - serviceOnlyContractId(serviceName), - 'consumer', - filePath, - `${serviceName}Stub`, - 0.7, - { service: serviceName, source: 'java_stub' }, - ), + makeContract(cid, 'consumer', filePath, `${serviceName}Stub`, conf, { + service: serviceName, + source: 'java_stub', + }), ); } return out; } - private scanPythonProviders(content: string, filePath: string): ExtractedContract[] { + private scanPythonProviders( + content: string, + filePath: string, + protoMap: Map, + ): ExtractedContract[] { const out: ExtractedContract[] = []; // add_XxxServicer_to_server( const re = /add_(\w+?)Servicer_to_server\s*\(/g; let m: RegExpExecArray | null; while ((m = re.exec(content)) !== null) { const serviceName = m[1]; + const candidates = protoMap.get(serviceName); + const proto = resolveProtoConflict(serviceName, filePath, candidates ?? []); + const cid = proto + ? serviceContractId(proto.package, proto.serviceName) + : serviceOnlyContractId(serviceName); + const conf = proto ? 
0.8 : 0.65; out.push( - makeContract( - serviceOnlyContractId(serviceName), - 'provider', - filePath, - `add_${serviceName}Servicer_to_server`, - 0.8, - { service: serviceName, source: 'python_servicer' }, - ), + makeContract(cid, 'provider', filePath, `add_${serviceName}Servicer_to_server`, conf, { + service: serviceName, + source: 'python_servicer', + }), ); } return out; } - private scanPythonConsumers(content: string, filePath: string): ExtractedContract[] { + private scanPythonConsumers( + content: string, + filePath: string, + protoMap: Map, + ): ExtractedContract[] { const out: ExtractedContract[] = []; // XxxStub( const re = /(\w+)Stub\s*\(/g; @@ -313,8 +615,14 @@ export class GrpcExtractor implements ContractExtractor { const name = m[1]; // Filter out common false positives if (['Mock', 'Test', 'Fake', 'Stub'].includes(name)) continue; + const candidates = protoMap.get(name); + const proto = resolveProtoConflict(name, filePath, candidates ?? []); + const cid = proto + ? serviceContractId(proto.package, proto.serviceName) + : serviceOnlyContractId(name); + const conf = proto ? 
0.75 : 0.55; out.push( - makeContract(serviceOnlyContractId(name), 'consumer', filePath, `${name}Stub`, 0.7, { + makeContract(cid, 'consumer', filePath, `${name}Stub`, conf, { service: name, source: 'python_stub', }), @@ -323,7 +631,11 @@ export class GrpcExtractor implements ContractExtractor { return out; } - private scanTsProviders(content: string, filePath: string): ExtractedContract[] { + private scanTsProviders( + content: string, + filePath: string, + protoMap: Map, + ): ExtractedContract[] { const out: ExtractedContract[] = []; // @GrpcMethod('ServiceName', 'MethodName') const re = /@GrpcMethod\s*\(\s*['"](\w+)['"]\s*,\s*['"](\w+)['"]\s*\)/g; @@ -331,7 +643,10 @@ export class GrpcExtractor implements ContractExtractor { while ((m = re.exec(content)) !== null) { const serviceName = m[1]; const methodName = m[2]; - const cid = contractId('', serviceName, methodName); + const candidates = protoMap.get(serviceName); + const proto = resolveProtoConflict(serviceName, filePath, candidates ?? []); + const pkg = proto?.package ?? ''; + const cid = contractId(pkg, serviceName, methodName); out.push( makeContract(cid, 'provider', filePath, `${serviceName}.${methodName}`, 0.8, { service: serviceName, @@ -343,15 +658,74 @@ export class GrpcExtractor implements ContractExtractor { return out; } - private dedupe(items: ExtractedContract[]): ExtractedContract[] { - const seen = new Set(); + private scanTsConsumers( + content: string, + filePath: string, + protoMap: Map, + ): ExtractedContract[] { const out: ExtractedContract[] = []; + const pushConsumer = ( + serviceName: string, + symbolName: string, + source: string, + confidenceWithProto = 0.75, + confidenceWithoutProto = 0.55, + ): void => { + const candidates = protoMap.get(serviceName); + const proto = resolveProtoConflict(serviceName, filePath, candidates ?? []); + const cid = proto + ? serviceContractId(proto.package, proto.serviceName) + : serviceOnlyContractId(serviceName); + const conf = proto ? 
confidenceWithProto : confidenceWithoutProto; + out.push( + makeContract(cid, 'consumer', filePath, symbolName, conf, { + service: serviceName, + source, + }), + ); + }; + + const grpcClientDecoratorRe = + /@GrpcClient\s*\([^)]*\)\s*(?:private|protected|public)?\s*(?:readonly\s+)?\w+[!?]?\s*:\s*(\w+Service)Client\b/g; + let match: RegExpExecArray | null; + while ((match = grpcClientDecoratorRe.exec(content)) !== null) { + pushConsumer(match[1], `${match[1]}Client`, 'ts_grpc_client_decorator'); + } + + const getServiceRe = /\.getService(?:<[^>]+>)?\s*\(\s*['"](\w+)['"]\s*\)/g; + while ((match = getServiceRe.exec(content)) !== null) { + pushConsumer(match[1], `${match[1]}Client`, 'ts_client_grpc_get_service'); + } + + const clientCtorRe = /new\s+(\w+Service)Client\s*\(/g; + while ((match = clientCtorRe.exec(content)) !== null) { + pushConsumer(match[1], `${match[1]}Client`, 'ts_generated_client'); + } + + if (content.includes('loadPackageDefinition')) { + const packageCtorRe = /new\s+[\w$.]*\.([A-Z]\w+)\s*\(/g; + while ((match = packageCtorRe.exec(content)) !== null) { + pushConsumer(match[1], `${match[1]}Client`, 'ts_load_package_definition'); + } + } + + return out; + } + + private dedupe(items: ExtractedContract[]): ExtractedContract[] { + const byKey = new Map(); for (const c of items) { const k = `${c.contractId}|${c.role}|${c.symbolRef.filePath}`; - if (seen.has(k)) continue; - seen.add(k); - out.push(c); + const existing = byKey.get(k); + if ( + !existing || + c.confidence > existing.confidence || + (c.confidence === existing.confidence && + String(c.meta.source) < String(existing.meta.source)) + ) { + byKey.set(k, c); + } } - return out; + return Array.from(byKey.values()); } } diff --git a/gitnexus/src/core/group/extractors/http-route-extractor.ts b/gitnexus/src/core/group/extractors/http-route-extractor.ts index 8dfb242bfc..2a9f662c3b 100644 --- a/gitnexus/src/core/group/extractors/http-route-extractor.ts +++ 
b/gitnexus/src/core/group/extractors/http-route-extractor.ts @@ -226,7 +226,7 @@ export class HttpRouteExtractor implements ContractExtractor { } private async extractProvidersSourceScan(repoPath: string): Promise { - const files = await glob('**/*.{ts,tsx,js,jsx,java,vue,svelte,php,py}', { + const files = await glob('**/*.{ts,tsx,js,jsx,java,vue,svelte,php,py,go}', { cwd: repoPath, ignore: ['**/node_modules/**', '**/.git/**', '**/dist/**', '**/build/**'], nodir: true, @@ -236,7 +236,9 @@ export class HttpRouteExtractor implements ContractExtractor { const content = readSafe(repoPath, rel); if (!content) continue; out.push(...this.scanSpringProviders(content, rel)); + out.push(...this.scanNestProviders(content, rel)); out.push(...this.scanExpressProviders(content, rel)); + out.push(...this.scanGoProviders(content, rel)); out.push(...this.scanLaravelProviders(content, rel)); out.push(...this.scanFastApiProviders(content, rel)); } @@ -312,6 +314,48 @@ export class HttpRouteExtractor implements ContractExtractor { return out; } + private scanNestProviders(content: string, filePath: string): ExtractedContract[] { + const out: ExtractedContract[] = []; + const controllerMatch = content.match(/@Controller\s*\(\s*['"`]([^'"`]+)['"`]\s*\)/); + const controllerPrefix = controllerMatch ? controllerMatch[1].replace(/\/+$/, '') : ''; + const re = /@(Get|Post|Put|Delete|Patch)\s*\(\s*['"`]?([^'"`)]*)['"`]?\s*\)/gi; + let m: RegExpExecArray | null; + while ((m = re.exec(content)) !== null) { + const method = m[1].toUpperCase(); + const routePath = String(m[2] || ''); + const fullPath = controllerPrefix + ? `${controllerPrefix}/${routePath.replace(/^\/+/, '')}` + : routePath; + const pathNorm = normalizeHttpPath(fullPath.startsWith('/') ? fullPath : `/${fullPath}`); + const sub = content.slice(m.index); + const nameMatch = sub.match( + /(?:public|protected|private)?\s*(?:async\s+)?(\w+)\s*\([^)]*\)\s*\{/, + ); + const name = nameMatch ? 
nameMatch[1] : m[0]; + out.push(this.makeProvider(filePath, method, pathNorm, name, 0.8)); + } + return out; + } + + private scanGoProviders(content: string, filePath: string): ExtractedContract[] { + const out: ExtractedContract[] = []; + const frameworkRe = + /(?:^|\W)\w+\.(GET|POST|PUT|DELETE|PATCH)\s*\(\s*['"]([^'"]+)['"]\s*,\s*(\w+)/gim; + let m: RegExpExecArray | null; + while ((m = frameworkRe.exec(content)) !== null) { + out.push(this.makeProvider(filePath, m[1].toUpperCase(), normalizeHttpPath(m[2]), m[3], 0.8)); + } + + const handleFuncRe = + /(?:http|\w+)\.HandleFunc\s*\(\s*['"]([^'"]+)['"]\s*,\s*(\w+)\s*\)(?:\s*\.\s*Methods\s*\(\s*['"](\w+)['"]\s*\))?/gim; + while ((m = handleFuncRe.exec(content)) !== null) { + const method = (m[3] || 'GET').toUpperCase(); + out.push(this.makeProvider(filePath, method, normalizeHttpPath(m[1]), m[2], 0.8)); + } + + return out; + } + private makeProvider( filePath: string, method: string, @@ -407,7 +451,7 @@ export class HttpRouteExtractor implements ContractExtractor { } private async extractConsumersSourceScan(repoPath: string): Promise { - const files = await glob('**/*.{ts,tsx,js,jsx,vue,svelte}', { + const files = await glob('**/*.{ts,tsx,js,jsx,vue,svelte,py,java,go}', { cwd: repoPath, ignore: ['**/node_modules/**', '**/.git/**'], nodir: true, @@ -418,6 +462,9 @@ export class HttpRouteExtractor implements ContractExtractor { if (!content) continue; out.push(...this.scanFetchConsumers(content, rel)); out.push(...this.scanAxiosConsumers(content, rel)); + out.push(...this.scanPythonRequestsConsumers(content, rel)); + out.push(...this.scanJavaConsumers(content, rel)); + out.push(...this.scanGoConsumers(content, rel)); } return this.dedupeContracts(out); } @@ -439,18 +486,127 @@ export class HttpRouteExtractor implements ContractExtractor { return url.replace(/\$\{[^}]+\}/g, '{param}'); } + private normalizeConsumerPath(url: string): string { + const templated = this.templateToPattern(url.trim()); + let pathOnly = 
templated; + if (/^https?:\/\//i.test(templated)) { + try { + pathOnly = new URL(templated).pathname; + } catch { + pathOnly = templated.replace(/^https?:\/\/[^/]+/i, ''); + } + } + + const normalized = normalizeHttpPath(pathOnly || '/'); + const segments = normalized + .split('/') + .filter(Boolean) + .map((segment) => { + if (/^\d+$/.test(segment)) return '{param}'; + return segment; + }); + return `/${segments.join('/')}`.replace(/\/+$/, '') || '/'; + } + private scanAxiosConsumers(content: string, filePath: string): ExtractedContract[] { const out: ExtractedContract[] = []; const re = /axios\.(get|post|put|delete|patch)\s*\(\s*[`'"]([^`'"]+)[`'"]/gi; let m: RegExpExecArray | null; while ((m = re.exec(content)) !== null) { const method = m[1].toUpperCase(); - const pathNorm = normalizeHttpPath(this.templateToPattern(m[2])); + const pathNorm = this.normalizeConsumerPath(m[2]); out.push(this.makeConsumer(filePath, method, pathNorm, 0.7)); } return out; } + private scanPythonRequestsConsumers(content: string, filePath: string): ExtractedContract[] { + const out: ExtractedContract[] = []; + const methodRe = /requests\.(get|post|put|delete|patch)\s*\(\s*['"]([^'"]+)['"]/gi; + let m: RegExpExecArray | null; + while ((m = methodRe.exec(content)) !== null) { + out.push( + this.makeConsumer(filePath, m[1].toUpperCase(), this.normalizeConsumerPath(m[2]), 0.7), + ); + } + + const genericRe = /requests\.request\s*\(\s*['"](\w+)['"]\s*,\s*['"]([^'"]+)['"]/gi; + while ((m = genericRe.exec(content)) !== null) { + out.push( + this.makeConsumer(filePath, m[1].toUpperCase(), this.normalizeConsumerPath(m[2]), 0.7), + ); + } + + return out; + } + + private scanJavaConsumers(content: string, filePath: string): ExtractedContract[] { + const out: ExtractedContract[] = []; + const restTemplateMethods: Array<[RegExp, string]> = [ + [/restTemplate\.getFor(?:Object|Entity)\s*\(\s*['"]([^'"]+)['"]/gi, 'GET'], + [/restTemplate\.postFor(?:Object|Entity)\s*\(\s*['"]([^'"]+)['"]/gi, 'POST'], + 
[/restTemplate\.put\s*\(\s*['"]([^'"]+)['"]/gi, 'PUT'], + [/restTemplate\.delete\s*\(\s*['"]([^'"]+)['"]/gi, 'DELETE'], + [/restTemplate\.patchForObject\s*\(\s*['"]([^'"]+)['"]/gi, 'PATCH'], + ]; + for (const [re, method] of restTemplateMethods) { + let m: RegExpExecArray | null; + while ((m = re.exec(content)) !== null) { + out.push(this.makeConsumer(filePath, method, this.normalizeConsumerPath(m[1]), 0.7)); + } + } + + const webClientMethodRe = + /webClient\.method\s*\(\s*HttpMethod\.(GET|POST|PUT|DELETE|PATCH)\s*,\s*['"]([^'"]+)['"]/gi; + let m: RegExpExecArray | null; + while ((m = webClientMethodRe.exec(content)) !== null) { + out.push( + this.makeConsumer(filePath, m[1].toUpperCase(), this.normalizeConsumerPath(m[2]), 0.7), + ); + } + + const okHttpRe = + /new\s+Request\.Builder\s*\(\)\s*\.url\s*\(\s*['"]([^'"]+)['"]\s*\)(?:\s*\.\s*method\s*\(\s*['"](\w+)['"])?/gim; + while ((m = okHttpRe.exec(content)) !== null) { + out.push( + this.makeConsumer( + filePath, + (m[2] || 'GET').toUpperCase(), + this.normalizeConsumerPath(m[1]), + 0.7, + ), + ); + } + + return out; + } + + private scanGoConsumers(content: string, filePath: string): ExtractedContract[] { + const out: ExtractedContract[] = []; + const httpMethodRe = /\bhttp\.(Get|Post|Head)\s*\(\s*['"]([^'"]+)['"]/gi; + let m: RegExpExecArray | null; + while ((m = httpMethodRe.exec(content)) !== null) { + const method = m[1].toUpperCase() === 'HEAD' ? 
'GET' : m[1].toUpperCase(); + out.push(this.makeConsumer(filePath, method, this.normalizeConsumerPath(m[2]), 0.7)); + } + + const newRequestRe = /\bhttp\.NewRequest\s*\(\s*['"](\w+)['"]\s*,\s*['"]([^'"]+)['"]/gi; + while ((m = newRequestRe.exec(content)) !== null) { + out.push( + this.makeConsumer(filePath, m[1].toUpperCase(), this.normalizeConsumerPath(m[2]), 0.7), + ); + } + + const restyRe = /\b\w+\.R\(\)\.(Get|Post|Put|Delete|Patch)\s*\(\s*['"]([^'"]+)['"]/gi; + while ((m = restyRe.exec(content)) !== null) { + out.push( + this.makeConsumer(filePath, m[1].toUpperCase(), this.normalizeConsumerPath(m[2]), 0.7), + ); + } + + return out; + } + private makeConsumer( filePath: string, method: string, diff --git a/gitnexus/src/core/group/extractors/manifest-extractor.ts b/gitnexus/src/core/group/extractors/manifest-extractor.ts new file mode 100644 index 0000000000..09817b356c --- /dev/null +++ b/gitnexus/src/core/group/extractors/manifest-extractor.ts @@ -0,0 +1,228 @@ +import type { ContractType, CrossLink, GroupManifestLink, StoredContract } from '../types.js'; +import type { CypherExecutor } from '../contract-extractor.js'; + +export interface ManifestExtractResult { + contracts: StoredContract[]; + crossLinks: CrossLink[]; +} + +/** + * Canonicalize an HTTP path for matching against Route.name in the graph. + * Mirrors core/ingestion/pipeline.ts ensureSlash semantics: + * - Ensures a leading slash. + * - Strips trailing slashes (except the root "/"). + * - Normalizes consecutive slashes. + * - Does NOT lowercase (route matching is case-sensitive). + */ +function normalizeRoutePath(raw: string): string { + const trimmed = raw.trim(); + if (!trimmed) return '/'; + const withLeading = trimmed.startsWith('/') ? 
trimmed : `/${trimmed}`; + const collapsed = withLeading.replace(/\/+/g, '/'); + if (collapsed === '/') return '/'; + return collapsed.replace(/\/+$/, ''); +} + +/** + * Stable synthetic symbolUid for a manifest-declared contract whose target + * symbol could not be resolved against the per-repo graph (resolveSymbol + * returned null). Two reasons we don't leave the uid empty: + * + * 1. The bridge stores Contract nodes keyed in part by symbolUid; an empty + * uid means downstream Cypher queries that anchor on `provider.symbolUid` + * can't tell two different unresolved manifest contracts apart. + * 2. The cross-impact bridge query in cross-impact.ts joins local impact + * results to bridge contracts via `WHERE provider.symbolUid IN $localUids`. + * If the local impact engine produces a deterministic identifier for the + * unresolved target, it must agree with the value the bridge stored. A + * synthetic uid keyed off (repo, contractId) is the only thing both sides + * can derive without knowing about each other. + * + * Format: `manifest::<repo>::<contractId>`. Stable across syncs, scoped to a + * single repo within a group, and never collides with real indexer uids + * (which never start with `manifest::`). + */ +export function manifestSymbolUid(repo: string, contractId: string): string { + return `manifest::${repo}::${contractId}`; +} + +export class ManifestExtractor { + async extractFromManifest( + links: GroupManifestLink[], + dbExecutors?: Map<string, CypherExecutor>, + ): Promise<ManifestExtractResult> { + const contracts: StoredContract[] = []; + const crossLinks: CrossLink[] = []; + + for (const link of links) { + const contractId = this.buildContractId(link.type, link.contract); + + const providerRepo = link.role === 'provider' ? link.from : link.to; + const consumerRepo = link.role === 'provider' ? 
link.to : link.from; + + const providerSymbol = await this.resolveSymbol(providerRepo, link, dbExecutors); + const consumerSymbol = await this.resolveSymbol(consumerRepo, link, dbExecutors); + const providerRef = providerSymbol || { filePath: '', name: link.contract }; + const consumerRef = consumerSymbol || { filePath: '', name: link.contract }; + // When the resolver finds a real graph symbol we keep its uid, otherwise + // fall back to the deterministic synthetic uid (see manifestSymbolUid). + const providerUid = providerSymbol?.uid || manifestSymbolUid(providerRepo, contractId); + const consumerUid = consumerSymbol?.uid || manifestSymbolUid(consumerRepo, contractId); + + contracts.push({ + contractId, + type: link.type, + role: 'provider', + symbolUid: providerUid, + symbolRef: providerRef, + symbolName: link.contract, + confidence: 1.0, + meta: { source: 'manifest' }, + repo: providerRepo, + }); + + contracts.push({ + contractId, + type: link.type, + role: 'consumer', + symbolUid: consumerUid, + symbolRef: consumerRef, + symbolName: link.contract, + confidence: 1.0, + meta: { source: 'manifest' }, + repo: consumerRepo, + }); + + crossLinks.push({ + from: { repo: consumerRepo, symbolUid: consumerUid, symbolRef: consumerRef }, + to: { repo: providerRepo, symbolUid: providerUid, symbolRef: providerRef }, + type: link.type, + contractId, + matchType: 'manifest', + confidence: 1.0, + }); + } + + return { contracts, crossLinks }; + } + + private async resolveSymbol( + repoPathKey: string, + link: GroupManifestLink, + dbExecutors?: Map<string, CypherExecutor>, + ): Promise<{ filePath: string; name: string; uid: string } | null> { + const executor = dbExecutors?.get(repoPathKey); + if (!executor) return null; + + // NOTE: All lookups use EXACT equality on the relevant name field and + // deterministic ORDER BY before LIMIT 1. Previous versions used CONTAINS + // for fuzzy matching (plus an unconditional ".proto" fallback for gRPC) + // which produced silent false positives: e.g. 
manifest "/orders" would + // match "/suborders", and a gRPC manifest entry in a repo with any + // .proto file would attach to a random proto symbol. + // + // If resolveSymbol returns null, the extractor creates a contract with + // a synthetic symbolUid (see manifestSymbolUid) and an empty-filePath ref. + // Cross-impact still works via name-based matching through the `hint` + // path in runGroupImpact. + try { + let rows: Record<string, unknown>[]; + if (link.type === 'http') { + // Route.name is the canonicalized URL path (see + // core/ingestion/pipeline.ts ensureSlash + generateId('Route', ...)). + // Normalize the manifest contract the same way so a user-written + // "api/orders" matches "/api/orders" in the graph. + const normalized = normalizeRoutePath(link.contract); + rows = await executor( + `MATCH (handler)-[r:CodeRelation {type: 'HANDLES_ROUTE'}]->(route:Route) + WHERE route.name = $normalized + RETURN handler.id AS uid, handler.name AS name, handler.filePath AS filePath + ORDER BY handler.filePath ASC + LIMIT 1`, + { normalized }, + ); + } else if (link.type === 'topic') { + rows = await executor( + `MATCH (n) WHERE n.name = $contract + RETURN n.id AS uid, n.name AS name, n.filePath AS filePath + ORDER BY n.filePath ASC + LIMIT 1`, + { contract: link.contract }, + ); + } else if (link.type === 'grpc') { + // Contract is "Service/Method" or just "Service" (or package.Service + // variants). Prefer matching by method name when present, otherwise + // by service name. NO .proto path fallback — that's guaranteed to + // return a wrong symbol in any repo with more than one proto file. + const parts = link.contract.split('/'); + const serviceName = parts[0]?.trim() ?? ''; + const methodName = parts[1]?.trim() ?? 
''; + if (methodName) { + rows = await executor( + `MATCH (n) WHERE n.name = $methodName + RETURN n.id AS uid, n.name AS name, n.filePath AS filePath + ORDER BY n.filePath ASC + LIMIT 1`, + { methodName }, + ); + } else if (serviceName) { + rows = await executor( + `MATCH (n) WHERE n.name = $serviceName + RETURN n.id AS uid, n.name AS name, n.filePath AS filePath + ORDER BY n.filePath ASC + LIMIT 1`, + { serviceName }, + ); + } else { + rows = []; + } + } else if (link.type === 'lib') { + // Only exact match on the symbol's name. Previous fallback to + // CONTAINS on n.filePath would promote "react" to "react-native" + // or "@types/react" — silent wrong attribution. + rows = await executor( + `MATCH (n) WHERE n.name = $contract + RETURN n.id AS uid, n.name AS name, n.filePath AS filePath + ORDER BY n.filePath ASC + LIMIT 1`, + { contract: link.contract }, + ); + } else { + return null; + } + if (rows.length > 0) { + return { + filePath: rows[0].filePath as string, + name: rows[0].name as string, + uid: String(rows[0].uid ?? ''), + }; + } + } catch (err) { + // Log but don't throw: a broken graph query in one repo shouldn't + // fail the whole manifest extraction. Unresolved contracts still + // get a synthetic symbolUid in extractFromManifest, so cross-impact + // can proceed. + const message = err instanceof Error ? 
err.message : String(err); + console.warn( + `[manifest-extractor] resolveSymbol failed for ${link.type}:${link.contract} ` + + `in ${repoPathKey}: ${message}`, + ); + } + return null; + } + + private buildContractId(type: ContractType, contract: string): string { + switch (type) { + case 'http': { + if (/^[A-Za-z]+::/.test(contract)) return `http::${contract}`; + return `http::*::${contract}`; + } + case 'grpc': + return `grpc::${contract}`; + case 'topic': + return `topic::${contract}`; + case 'lib': + return `lib::${contract}`; + case 'custom': + return `custom::${contract}`; + } + } +} diff --git a/gitnexus/src/core/group/extractors/topic-extractor.ts b/gitnexus/src/core/group/extractors/topic-extractor.ts index c27b419bbd..3d5280d862 100644 --- a/gitnexus/src/core/group/extractors/topic-extractor.ts +++ b/gitnexus/src/core/group/extractors/topic-extractor.ts @@ -6,6 +6,9 @@ import type { ExtractedContract, RepoHandle } from '../types.js'; type Broker = 'kafka' | 'rabbitmq' | 'nats'; +const KAFKAJS_CONSUMER_RUN_RE = /consumer\.run\s*\(\s*\{\s*eachMessage:/; +const KAFKAJS_SUBSCRIBE_RE = /consumer\.subscribe\s*\(\s*\{\s*topic:\s*['"]([^'"]+)['"]/g; + function readSafe(repoPath: string, rel: string): string | null { const abs = path.resolve(repoPath, rel); const base = path.resolve(repoPath); @@ -116,6 +119,54 @@ const KAFKA_PATTERNS: PatternDef[] = [ topicGroup: 1, symbolName: 'producer.send', }, + // Go: sarama.ProducerMessage{Topic: "xxx"} struct literal (emitted by + // both NewSyncProducer and NewAsyncProducer client code paths). + // + // Previous pattern was `sarama.NewSyncProducer[\s\S]{0,300}?Topic:...` + // which anchored to the producer constructor and used a 300-char + // lookahead. In a loop like + // producer := sarama.NewSyncProducer(...) 
+ // for _, item := range items { + // msg1 := &sarama.ProducerMessage{Topic: "order.created"} + // msg2 := &sarama.ProducerMessage{Topic: "order.shipped"} + // } + // the regex captured only "order.created" (first Topic after the + // constructor) and silently missed "order.shipped". Matching on the + // struct literal directly fixes both the false negative in loops and + // the spurious cross-message capture when multiple unrelated messages + // sit within 300 chars of the constructor. + { + regex: /sarama\.ProducerMessage\s*\{[\s\S]{0,200}?Topic:\s*"([^"]+)"/g, + role: 'provider', + broker: 'kafka', + confidence: 0.75, + topicGroup: 1, + symbolName: 'sarama.ProducerMessage', + }, + // Go: kafka-go writer construction. kafka-go does NOT wrap messages in + // a struct with a Topic field (the writer owns the topic), so we match + // the Writer itself. A 200-char window bridges the gap between + // `kafka.NewWriter(...)` / `kafka.Writer{` and the Topic field inside + // the config literal — kafka-go writer configs are small and rarely + // contain more than one Topic field, so the risk of cross-message + // capture is low here. + { + regex: /kafka\.(?:NewWriter|Writer)\b[\s\S]{0,200}?Topic:\s*"([^"]+)"/g, + role: 'provider', + broker: 'kafka', + confidence: 0.75, + topicGroup: 1, + symbolName: 'kafka.Writer', + }, + // Go: kafka-go reader construction, mirrors Writer above. 
+ { + regex: /kafka\.(?:NewReader|Reader)\b[\s\S]{0,200}?Topic:\s*"([^"]+)"/g, + role: 'consumer', + broker: 'kafka', + confidence: 0.75, + topicGroup: 1, + symbolName: 'kafka.Reader', + }, ]; // --- RabbitMQ patterns --- @@ -205,6 +256,42 @@ const NATS_PATTERNS: PatternDef[] = [ topicGroup: 1, symbolName: 'nc.Publish', }, + // Go/Node JetStream: js.Subscribe("xxx" + { + regex: /js\.(?:S|s)ubscribe\s*\(\s*"([^"]+)"/g, + role: 'consumer', + broker: 'nats', + confidence: 0.8, + topicGroup: 1, + symbolName: 'js.Subscribe', + }, + // Go/Node JetStream: js.Publish("xxx" + { + regex: /js\.(?:P|p)ublish\s*\(\s*"([^"]+)"/g, + role: 'provider', + broker: 'nats', + confidence: 0.8, + topicGroup: 1, + symbolName: 'js.Publish', + }, + // Python: await nc.subscribe("xxx") + { + regex: /await\s+nc\.subscribe\s*\(\s*['"]([^'"]+)['"]/g, + role: 'consumer', + broker: 'nats', + confidence: 0.75, + topicGroup: 1, + symbolName: 'nc.subscribe', + }, + // Python: await nc.publish("xxx") + { + regex: /await\s+nc\.publish\s*\(\s*['"]([^'"]+)['"]/g, + role: 'provider', + broker: 'nats', + confidence: 0.75, + topicGroup: 1, + symbolName: 'nc.publish', + }, ]; const ALL_PATTERNS: PatternDef[] = [...KAFKA_PATTERNS, ...RABBITMQ_PATTERNS, ...NATS_PATTERNS]; @@ -229,6 +316,7 @@ export class TopicExtractor implements ContractExtractor { const out: ExtractedContract[] = []; for (const rel of files) { + if (rel.endsWith('_test.go')) continue; const content = readSafe(repoPath, rel); if (!content) continue; out.push(...this.scanFile(content, rel)); @@ -260,6 +348,16 @@ export class TopicExtractor implements ContractExtractor { } } + if (KAFKAJS_CONSUMER_RUN_RE.test(content)) { + const subscribeRe = new RegExp(KAFKAJS_SUBSCRIBE_RE.source, KAFKAJS_SUBSCRIBE_RE.flags); + let subscribeMatch: RegExpExecArray | null; + while ((subscribeMatch = subscribeRe.exec(content)) !== null) { + const topicName = subscribeMatch[1]; + if (!topicName) continue; + out.push(makeContract(topicName, 'consumer', filePath, 
'consumer.run', 0.75, 'kafka')); + } + } + return out; } diff --git a/gitnexus/src/core/group/matching.ts b/gitnexus/src/core/group/matching.ts index 6d39f4ce4d..ec793968bd 100644 --- a/gitnexus/src/core/group/matching.ts +++ b/gitnexus/src/core/group/matching.ts @@ -5,6 +5,15 @@ export interface MatchResult { unmatched: StoredContract[]; } +export interface WildcardMatchResult { + matched: CrossLink[]; + remaining: StoredContract[]; +} + +function isGrpcWildcard(cid: string): boolean { + return cid.startsWith('grpc::') && cid.endsWith('/*'); +} + export function normalizeContractId(id: string): string { const colonIdx = id.indexOf('::'); if (colonIdx === -1) return id; @@ -24,6 +33,22 @@ export function normalizeContractId(id: string): string { return id; } case 'grpc': { + // Canonical form: `grpc::`, then the package/service segment, then an optional `/` and RPC method. + // + // The package/service segment is lowercased because gRPC package + // names are effectively case-insensitive across language bindings + // (`auth.AuthService`, `auth.authservice`, `AUTH.AUTHSERVICE` all + // describe the same wire protocol service). The RPC method segment + // is preserved as-is because the HTTP/2 path used on the wire is + // case-sensitive per the gRPC spec (`/Service/MethodName`), and + // method names in generated clients match the proto source exactly. + // + // A package-only id (no slash) and a package/method id are treated + // as DISTINCT canonical forms: `grpc::userservice` does not match + // `grpc::userservice/Login`. That's by design — callers that want + // service-level manifest matching against method-level providers + // should use the gRPC wildcard form `grpc::UserService/*` which is + // handled by runWildcardMatch below. 
const slashIdx = rest.indexOf('/'); if (slashIdx > 0) { const pkg = rest.substring(0, slashIdx).toLowerCase(); @@ -31,12 +56,12 @@ export function normalizeContractId(id: string): string { return `grpc::${pkg}${method}`; } if (slashIdx === 0) { - // Malformed "package/method" with leading slash — do not lowercase the whole string - // (method segment is case-sensitive per spec). + // Malformed "/method" with leading slash — keep as-is so two + // equally malformed ids can still match each other. return `grpc::${rest}`; } - // No slash: spec is ambiguous (package-only vs full service.method). MVP: lowercase - // the whole token; differs from pkg/method split above where RPC method keeps case. + // No slash: package/service only. Lowercase to match the package + // segment produced by the pkg/method branch above. return `grpc::${rest.toLowerCase()}`; } case 'topic': @@ -66,27 +91,36 @@ function findMatchingKeys(contractId: string, index: Map { const providers = contracts.filter((c) => c.role === 'provider'); - const consumers = contracts.filter((c) => c.role === 'consumer'); - - const providerIndex = new Map(); + const index = new Map(); for (const p of providers) { const key = normalizeContractId(p.contractId); - const list = providerIndex.get(key) || []; + const list = index.get(key) || []; list.push(p); - providerIndex.set(key, list); + index.set(key, list); } + return index; +} + +export function runExactMatch( + contracts: StoredContract[], + providerIndex?: Map, +): MatchResult { + const index = providerIndex ?? 
buildProviderIndex(contracts); + + // Skip gRPC wildcard consumers — they go to wildcard pass only + const consumers = contracts.filter((c) => c.role === 'consumer' && !isGrpcWildcard(c.contractId)); const matched: CrossLink[] = []; const matchedConsumerIds = new Set(); const matchedProviderIds = new Set(); for (const consumer of consumers) { - const matchingKeys = findMatchingKeys(consumer.contractId, providerIndex); + const matchingKeys = findMatchingKeys(consumer.contractId, index); if (matchingKeys.length === 0) continue; - const allMatchingProviders = matchingKeys.flatMap((k) => providerIndex.get(k) || []); + const allMatchingProviders = matchingKeys.flatMap((k) => index.get(k) || []); for (const provider of allMatchingProviders) { if (provider.repo === consumer.repo) { if (!provider.service || !consumer.service || provider.service === consumer.service) { @@ -118,10 +152,86 @@ export function runExactMatch(contracts: StoredContract[]): MatchResult { } } - const unmatched = contracts.filter((c) => { + // normalUnmatched: contracts that weren't matched in exact pass + const normalUnmatched = contracts.filter((c) => { + if (isGrpcWildcard(c.contractId)) return false; // excluded from exact, handled separately const id = `${c.repo}::${c.contractId}`; return c.role === 'provider' ? 
!matchedProviderIds.has(id) : !matchedConsumerIds.has(id); }); + // Re-add gRPC wildcard contracts — they were never in exact matching + const grpcWildcards = contracts.filter((c) => isGrpcWildcard(c.contractId)); + const unmatched = [...normalUnmatched, ...grpcWildcards]; + return { matched, unmatched }; } + +export function runWildcardMatch( + unmatched: StoredContract[], + providerIndex: Map, +): WildcardMatchResult { + const wildcardConsumers = unmatched.filter( + (c) => c.role === 'consumer' && isGrpcWildcard(c.contractId), + ); + const matched: CrossLink[] = []; + const matchedConsumerIds = new Set(); + + for (const consumer of wildcardConsumers) { + const normalized = normalizeContractId(consumer.contractId); + // "grpc::com.example.userservice/*" → "com.example.userservice" + // "grpc::userservice/*" → "userservice" + const fqService = normalized.slice(normalized.indexOf('::') + 2, -2); // strip "grpc::" and "/*" + + for (const [key, providers] of providerIndex) { + // Only match against non-wildcard gRPC providers (method-level IDs) + if (!key.startsWith('grpc::') || key.endsWith('/*')) continue; + const afterPrefix = key.slice(6); // strip "grpc::" + const slashIdx = afterPrefix.indexOf('/'); + if (slashIdx < 0) continue; + const providerFqService = afterPrefix.slice(0, slashIdx); + + // Match: exact FQ service, or bare-name match when consumer has no package + const isMatch = + providerFqService === fqService || + (!fqService.includes('.') && providerFqService.endsWith('.' 
+ fqService)); + + if (!isMatch) continue; + + for (const provider of providers) { + // Skip same-repo same-service (same logic as runExactMatch) + if (provider.repo === consumer.repo) { + if (!provider.service || !consumer.service || provider.service === consumer.service) { + continue; + } + } + + matched.push({ + from: { + repo: consumer.repo, + service: consumer.service, + symbolUid: consumer.symbolUid, + symbolRef: consumer.symbolRef, + }, + to: { + repo: provider.repo, + service: provider.service, + symbolUid: provider.symbolUid, + symbolRef: provider.symbolRef, + }, + type: consumer.type, + contractId: consumer.contractId, // consumer's wildcard ID + matchType: 'wildcard', + confidence: Math.min(provider.confidence, consumer.confidence), + }); + matchedConsumerIds.add(`${consumer.repo}::${consumer.contractId}`); + } + } + } + + const remaining = unmatched.filter((c) => { + if (c.role !== 'consumer' || !isGrpcWildcard(c.contractId)) return true; + return !matchedConsumerIds.has(`${c.repo}::${c.contractId}`); + }); + + return { matched, remaining }; +} diff --git a/gitnexus/src/core/group/normalization.ts b/gitnexus/src/core/group/normalization.ts new file mode 100644 index 0000000000..192aec5b9d --- /dev/null +++ b/gitnexus/src/core/group/normalization.ts @@ -0,0 +1,105 @@ +import type { CrossLink, CrossLinkEndpoint, StoredContract } from './types.js'; + +function contractKey(contract: StoredContract): string { + return [contract.repo, contract.contractId, contract.role, contract.symbolRef.filePath].join( + '\0', + ); +} + +function endpointKey(endpoint: CrossLinkEndpoint): string { + return [ + endpoint.repo, + endpoint.service ?? 
'', + endpoint.symbolRef.filePath, + endpoint.symbolRef.name, + ].join('\0'); +} + +function contractRichness(contract: StoredContract): number { + let score = 0; + if (contract.symbolUid) score += 3; + if (contract.symbolRef.filePath) score += 2; + if (contract.symbolRef.name && contract.symbolRef.name !== contract.contractId) score += 2; + if (contract.symbolName && contract.symbolName !== contract.contractId) score += 2; + if (contract.service) score += 1; + if (contract.meta.source !== 'manifest') score += 1; + return score; +} + +function mergeContracts(existing: StoredContract, incoming: StoredContract): StoredContract { + const [primary, secondary] = + contractRichness(incoming) > contractRichness(existing) + ? [incoming, existing] + : [existing, incoming]; + const symbolRefName = primary.symbolRef.name || secondary.symbolRef.name; + return { + ...secondary, + ...primary, + symbolUid: primary.symbolUid || secondary.symbolUid, + symbolRef: { + filePath: primary.symbolRef.filePath || secondary.symbolRef.filePath, + name: symbolRefName, + }, + symbolName: primary.symbolName || secondary.symbolName || symbolRefName, + confidence: Math.max(existing.confidence, incoming.confidence), + service: primary.service ?? secondary.service, + meta: { ...secondary.meta, ...primary.meta }, + }; +} + +function mergeEndpoints( + existing: CrossLinkEndpoint, + incoming: CrossLinkEndpoint, +): CrossLinkEndpoint { + return { + repo: existing.repo, + service: existing.service ?? 
incoming.service, + symbolUid: existing.symbolUid || incoming.symbolUid, + symbolRef: { + filePath: existing.symbolRef.filePath || incoming.symbolRef.filePath, + name: existing.symbolRef.name || incoming.symbolRef.name, + }, + }; +} + +function crossLinkKey(link: CrossLink): string { + return [ + link.type, + link.contractId, + link.matchType, + endpointKey(link.from), + endpointKey(link.to), + ].join('\0'); +} + +export function dedupeContracts(items: StoredContract[]): StoredContract[] { + const deduped = new Map(); + for (const contract of items) { + const key = contractKey(contract); + const existing = deduped.get(key); + deduped.set(key, existing ? mergeContracts(existing, contract) : contract); + } + return [...deduped.values()]; +} + +export function dedupeCrossLinks(items: CrossLink[]): CrossLink[] { + const deduped = new Map(); + for (const link of items) { + const key = crossLinkKey(link); + const existing = deduped.get(key); + if (!existing) { + deduped.set(key, link); + continue; + } + const keepIncoming = link.confidence > existing.confidence; + const primary = keepIncoming ? link : existing; + const secondary = keepIncoming ? existing : link; + deduped.set(key, { + ...primary, + confidence: Math.max(existing.confidence, link.confidence), + from: mergeEndpoints(primary.from, secondary.from), + to: mergeEndpoints(primary.to, secondary.to), + }); + } + return [...deduped.values()]; +} diff --git a/gitnexus/src/core/group/service.ts b/gitnexus/src/core/group/service.ts index 1530cd6ddc..8d532f6802 100644 --- a/gitnexus/src/core/group/service.ts +++ b/gitnexus/src/core/group/service.ts @@ -1,11 +1,13 @@ /** - * Group orchestration shared by MCP (LocalBackend) and CLI. + * Cross-repo group orchestration shared by MCP (LocalBackend) and CLI. * DB access is injected via GroupToolPort so this module stays free of LocalBackend private API. 
*/ import { checkStaleness } from '../git-staleness.js'; +import { queryBridge, closeBridgeDb } from './bridge-db.js'; import { loadGroupConfig } from './config-parser.js'; -import { getDefaultGitnexusDir, getGroupDir, listGroups, readContractRegistry } from './storage.js'; +import { runGroupImpact, runGroupImpactLegacy } from './cross-impact.js'; +import { getDefaultGitnexusDir, getGroupDir, listGroups, openBridgeOrFallback } from './storage.js'; import { syncGroup } from './sync.js'; export interface GroupRepoHandle { @@ -103,23 +105,379 @@ export class GroupService { const name = String(params.name ?? '').trim(); if (!name) return { error: 'name is required' }; const groupDir = getGroupDir(getDefaultGitnexusDir(), name); - const registry = await readContractRegistry(groupDir); - if (!registry) { - return { error: `No contracts.json for group "${name}". Run group_sync first.` }; - } - let contracts = registry.contracts; - if (params.type) contracts = contracts.filter((c) => c.type === params.type); - if (params.repo) contracts = contracts.filter((c) => c.repo === params.repo); - if (params.unmatchedOnly) { - const matchedIds = new Set( - registry.crossLinks.flatMap((l) => [ - `${l.from.repo}::${l.contractId}`, - `${l.to.repo}::${l.contractId}`, - ]), + + const fallback = await openBridgeOrFallback(groupDir); + if (fallback.type === 'none') { + return { error: `No contract data for group "${name}". 
Run group_sync first.` }; + } + + if (fallback.type === 'json') { + const registry = fallback.registry; + let contracts = registry.contracts; + if (params.type) contracts = contracts.filter((c) => c.type === params.type); + if (params.repo) contracts = contracts.filter((c) => c.repo === params.repo); + if (params.unmatchedOnly) { + const matchedIds = new Set( + registry.crossLinks.flatMap((l) => [ + `${l.from.repo}::${l.contractId}`, + `${l.to.repo}::${l.contractId}`, + ]), + ); + contracts = contracts.filter((c) => !matchedIds.has(`${c.repo}::${c.contractId}`)); + } + return { contracts, crossLinks: registry.crossLinks }; + } + + // Bridge path — query Contract nodes with flat field projection + const handle = fallback.handle; + try { + let cypher = 'MATCH (c:Contract)'; + const queryParams: Record = {}; + const whereClauses: string[] = []; + if (params.type) { + whereClauses.push('c.type = $type'); + queryParams.type = params.type; + } + if (params.repo) { + whereClauses.push('c.repo = $repo'); + queryParams.repo = params.repo; + } + if (whereClauses.length > 0) { + cypher += ` WHERE ${whereClauses.join(' AND ')}`; + } + cypher += + ' RETURN c.contractId AS contractId, c.type AS type, c.role AS role, c.repo AS repo,' + + ' c.service AS service, c.symbolUid AS symbolUid, c.filePath AS filePath,' + + ' c.symbolName AS symbolName, c.confidence AS confidence, c.meta AS meta'; + + const rawContracts = await queryBridge<{ + contractId: string; + type: string; + role: string; + repo: string; + service: string; + symbolUid: string; + filePath: string; + symbolName: string; + confidence: number; + meta: string; + }>(handle, cypher, queryParams as Record); + + // Reconstruct StoredContract shape for CLI compatibility. + // meta is stored in the bridge as a JSON-stringified blob (see + // writeBridge()). A single corrupted or non-JSON meta row must not + // take down the whole groupContracts() call — degrade to an empty + // object for that row and keep going. 
The error is swallowed + // intentionally here because there is no per-row logger yet; the + // metaParseFailures counter is returned so callers can surface it. + let metaParseFailures = 0; + const safeParseMeta = (raw: unknown): Record => { + if (raw == null) return {}; + if (typeof raw !== 'string') return raw as Record; + try { + return JSON.parse(raw) as Record; + } catch { + metaParseFailures++; + return {}; + } + }; + let contracts = rawContracts.map((r) => ({ + contractId: r.contractId, + type: r.type, + role: r.role, + repo: r.repo, + service: r.service, + symbolUid: r.symbolUid, + symbolRef: { filePath: r.filePath, name: r.symbolName }, + symbolName: r.symbolName, + confidence: r.confidence, + meta: safeParseMeta(r.meta), + })); + + // Query cross-links + const rawLinks = await queryBridge<{ + fromRepo: string; + toRepo: string; + matchType: string; + confidence: number; + linkContractId: string; + }>( + handle, + `MATCH (a:Contract)-[l:ContractLink]->(b:Contract) + RETURN l.fromRepo AS fromRepo, l.toRepo AS toRepo, + l.matchType AS matchType, l.confidence AS confidence, + l.contractId AS linkContractId`, ); - contracts = contracts.filter((c) => !matchedIds.has(`${c.repo}::${c.contractId}`)); + const crossLinks = rawLinks.map((l) => ({ + from: { repo: l.fromRepo }, + to: { repo: l.toRepo }, + matchType: l.matchType, + confidence: l.confidence, + contractId: l.linkContractId, + })); + + // Apply unmatchedOnly filter + if (params.unmatchedOnly) { + const matchedIds = new Set( + crossLinks.flatMap((l) => [ + `${l.from.repo}::${l.contractId}`, + `${l.to.repo}::${l.contractId}`, + ]), + ); + contracts = contracts.filter((c) => !matchedIds.has(`${c.repo}::${c.contractId}`)); + } + + return { + contracts, + crossLinks, + ...(metaParseFailures > 0 ? { metaParseFailures } : {}), + }; + } finally { + await closeBridgeDb(handle); + } + } + + async groupImpact(params: Record): Promise { + const name = String(params.name ?? 
'').trim(); + const targetSymbol = String(params.target ?? '').trim(); + const repoGroupPath = String(params.repo ?? '').trim(); + if (!name || !targetSymbol || !repoGroupPath) { + return { error: 'name, target, and repo are required' }; + } + + // Strict validation for numeric/enum params — MCP is a public interface + // and the worker may be invoked by untrusted LLMs. Bounds are chosen + // conservatively to prevent DoS (unbounded impact walks, long timeouts) + // while still allowing reasonable traversal. + if ( + params.direction !== undefined && + params.direction !== 'upstream' && + params.direction !== 'downstream' + ) { + return { + error: `direction must be 'upstream' or 'downstream', got ${JSON.stringify(params.direction)}`, + }; + } + const direction: 'upstream' | 'downstream' = + params.direction === 'downstream' ? 'downstream' : 'upstream'; + + const maxDepthRaw = params.maxDepth; + if (maxDepthRaw !== undefined) { + if ( + typeof maxDepthRaw !== 'number' || + !Number.isFinite(maxDepthRaw) || + !Number.isInteger(maxDepthRaw) || + maxDepthRaw < 1 || + maxDepthRaw > 10 + ) { + return { + error: `maxDepth must be an integer in [1, 10], got ${JSON.stringify(maxDepthRaw)}`, + }; + } + } + const maxDepth = typeof maxDepthRaw === 'number' ? maxDepthRaw : 3; + + const minConfidenceRaw = params.minConfidence; + if (minConfidenceRaw !== undefined) { + if ( + typeof minConfidenceRaw !== 'number' || + !Number.isFinite(minConfidenceRaw) || + minConfidenceRaw < 0 || + minConfidenceRaw > 1 + ) { + return { + error: `minConfidence must be a number in [0, 1], got ${JSON.stringify(minConfidenceRaw)}`, + }; + } + } + const minConfidence = typeof minConfidenceRaw === 'number' ? 
minConfidenceRaw : 0.5; + + const timeoutRaw = params.timeout; + if (timeoutRaw !== undefined) { + if ( + typeof timeoutRaw !== 'number' || + !Number.isFinite(timeoutRaw) || + timeoutRaw < 100 || + timeoutRaw > 300000 + ) { + return { + error: `timeout must be a number in [100, 300000] ms, got ${JSON.stringify(timeoutRaw)}`, + }; + } + } + const timeout = typeof timeoutRaw === 'number' ? timeoutRaw : 30000; + + const crossDepthRaw = params.crossDepth; + if (crossDepthRaw !== undefined) { + if ( + typeof crossDepthRaw !== 'number' || + !Number.isFinite(crossDepthRaw) || + !Number.isInteger(crossDepthRaw) || + crossDepthRaw < 0 || + crossDepthRaw > 10 + ) { + return { + error: `crossDepth must be an integer in [0, 10], got ${JSON.stringify(crossDepthRaw)}`, + }; + } + } + const requestedCrossDepth = typeof crossDepthRaw === 'number' ? crossDepthRaw : 1; + + const subgroup = typeof params.subgroup === 'string' ? params.subgroup : undefined; + const groupDir = getGroupDir(getDefaultGitnexusDir(), name); + + const config = await loadGroupConfig(groupDir); + + const fallback = await openBridgeOrFallback(groupDir); + if (fallback.type === 'none') { + return { error: `No contract data for group "${name}". Run group_sync first.` }; + } + // NOTE: crossDepth is clamped to 1 by runGroupImpact itself + // (MAX_SUPPORTED_CROSS_DEPTH); we only surface a warning here. + const crossDepth = Math.max(0, Math.min(requestedCrossDepth, 1)); + const crossDepthWarning = + requestedCrossDepth > 1 + ? `Multi-hop cross-boundary traversal is not yet implemented. Using --cross-depth 1 (requested: ${requestedCrossDepth}).` + : undefined; + + const defaultRelTypes = ['CALLS', 'IMPORTS', 'EXTENDS', 'IMPLEMENTS']; + + // impactOpts.minConfidence is intentionally 0: it's applied to the + // intra-repo impact walk where edges (CALLS/IMPORTS/EXTENDS/IMPLEMENTS) + // don't carry a meaningful confidence score. 
The user-facing + // `minConfidence` param filters CROSS-REPO contract links in + // runGroupImpact/runGroupImpactLegacy (see minConfidence passed below). + const impactOpts = { + maxDepth, + relationTypes: defaultRelTypes, + minConfidence: 0, + includeTests: false, + }; + + const resolveGroupRepo = async (groupPath: string): Promise => { + const registryName = config.repos[groupPath]; + if (!registryName) throw new Error(`Repo "${groupPath}" not found in group "${name}"`); + return this.port.resolveRepo(registryName); + }; + + // Wrap localImpactFn callbacks so any exception (e.g. resolveGroupRepo + // throwing on a missing repo) becomes a null/error result instead of + // bubbling past runPhase1WithTimeout (which only catches timeouts, not + // rejections). Without this wrap, an unhandled rejection from the + // callback would crash the MCP tool handler. + const safeLocalImpact = async (t: string, d: string): Promise => { + try { + const repoObj = await resolveGroupRepo(repoGroupPath); + return await this.port.impact(repoObj, { + target: t, + direction: d as 'upstream' | 'downstream', + ...impactOpts, + }); + } catch (err) { + return { + error: `local impact failed: ${err instanceof Error ? 
err.message : String(err)}`, + target: { id: '', name: t, filePath: '' }, + direction: d, + impactedCount: 0, + risk: 'LOW', + summary: { direct: 0, processes_affected: 0, modules_affected: 0 }, + affected_processes: [], + affected_modules: [], + byDepth: {}, + }; + } + }; + + if (fallback.type === 'json') { + // Legacy JSON path + const result = await runGroupImpactLegacy({ + groupName: name, + target: targetSymbol, + repoPath: repoGroupPath, + direction, + registry: fallback.registry, + localImpactFn: safeLocalImpact, + crossImpactFn: async (targetGroupPath: string, uid: string, d: string) => { + const registryName = config.repos[targetGroupPath]; + if (!registryName) return null; + try { + const repoObj = await this.port.resolveRepo(registryName); + return this.port.impactByUid(repoObj.id, uid, d, impactOpts); + } catch { + return null; + } + }, + maxDepth, + minConfidence, + subgroup, + timeout, + crossDepth, + }); + + if (crossDepthWarning) { + result.crossDepthWarning = crossDepthWarning; + } + return result; + } + + // Bridge path + const handle = fallback.handle; + try { + const result = await runGroupImpact({ + groupName: name, + target: targetSymbol, + repoPath: repoGroupPath, + direction, + bridgeQuery: (cypher, p) => + queryBridge(handle, cypher, p as Record), + localImpactFn: safeLocalImpact, + crossImpactFn: async ( + targetGroupPath: string, + uid: string, + d: string, + hint?: { filePath: string; symbolName: string }, + ) => { + const registryName = config.repos[targetGroupPath]; + if (!registryName) return null; + try { + const repoObj = await this.port.resolveRepo(registryName); + if (uid) { + return this.port.impactByUid(repoObj.id, uid, d, impactOpts); + } + // Name-based fallback for empty UID (gRPC contracts) + if (hint?.symbolName) { + const hintResult = await this.port.impact(repoObj, { + target: hint.symbolName, + direction: d as 'upstream' | 'downstream', + ...impactOpts, + }); + if ( + hintResult && + typeof hintResult === 'object' && + 
'error' in (hintResult as Record) + ) + return null; + return hintResult; + } + return null; + } catch { + return null; + } + }, + maxDepth, + minConfidence, + subgroup, + timeout, + crossDepth, + }); + + if (crossDepthWarning) { + result.crossDepthWarning = crossDepthWarning; + } + return result; + } finally { + await closeBridgeDb(handle); } - return { contracts, crossLinks: registry.crossLinks }; } async groupQuery(params: Record): Promise { @@ -172,52 +530,118 @@ export class GroupService { if (!name) return { error: 'name is required' }; const groupDir = getGroupDir(getDefaultGitnexusDir(), name); const config = await loadGroupConfig(groupDir); - const registry = await readContractRegistry(groupDir); - const repoStatuses: Record< - string, - { - indexStale: boolean; - contractsStale: boolean; - missing: boolean; - commitsBehind?: number; - } - > = {}; + const fallback = await openBridgeOrFallback(groupDir); + if (fallback.type === 'none') { + return { group: name, lastSync: null, missingRepos: [], repos: {} }; + } const fsp = await import('node:fs/promises'); const pathMod = await import('node:path'); - for (const [repoPath, registryName] of Object.entries(config.repos)) { - try { - const repoObj = await this.port.resolveRepo(registryName); - const metaPath = pathMod.join(repoObj.storagePath, 'meta.json'); - const metaRaw = await fsp.readFile(metaPath, 'utf-8').catch(() => '{}'); - const meta = JSON.parse(metaRaw) as { lastCommit?: string; indexedAt?: string }; - - const staleness = meta.lastCommit - ? checkStaleness(repoObj.repoPath, meta.lastCommit) - : { isStale: true, commitsBehind: -1 }; - - const snapshot = registry?.repoSnapshots[repoPath]; - const contractsStale = - snapshot && meta.indexedAt ? 
snapshot.indexedAt !== meta.indexedAt : !snapshot; - - repoStatuses[repoPath] = { - indexStale: staleness.isStale, - contractsStale: Boolean(contractsStale), - missing: false, - commitsBehind: staleness.commitsBehind, - }; - } catch { - repoStatuses[repoPath] = { indexStale: false, contractsStale: false, missing: true }; + if (fallback.type === 'json') { + const registry = fallback.registry; + const repoStatuses: Record< + string, + { + indexStale: boolean; + contractsStale: boolean; + missing: boolean; + commitsBehind?: number; + } + > = {}; + + for (const [repoPath, registryName] of Object.entries(config.repos)) { + try { + const repoObj = await this.port.resolveRepo(registryName); + const metaPath = pathMod.join(repoObj.storagePath, 'meta.json'); + const metaRaw = await fsp.readFile(metaPath, 'utf-8').catch(() => '{}'); + const meta = JSON.parse(metaRaw) as { lastCommit?: string; indexedAt?: string }; + + const staleness = meta.lastCommit + ? checkStaleness(repoObj.repoPath, meta.lastCommit) + : { isStale: true, commitsBehind: -1 }; + + const snapshot = registry.repoSnapshots[repoPath]; + const contractsStale = + snapshot && meta.indexedAt ? 
snapshot.indexedAt !== meta.indexedAt : !snapshot; + + repoStatuses[repoPath] = { + indexStale: staleness.isStale, + contractsStale: Boolean(contractsStale), + missing: false, + commitsBehind: staleness.commitsBehind, + }; + } catch { + repoStatuses[repoPath] = { indexStale: false, contractsStale: false, missing: true }; + } } + + return { + group: name, + lastSync: registry.generatedAt || null, + missingRepos: registry.missingRepos || [], + repos: repoStatuses, + }; } - return { - group: name, - lastSync: registry?.generatedAt || null, - missingRepos: registry?.missingRepos || [], - repos: repoStatuses, - }; + // Bridge path + const handle = fallback.handle; + const meta = fallback.meta; + try { + const snapshots = await queryBridge<{ id: string; indexedAt: string; lastCommit: string }>( + handle, + 'MATCH (s:RepoSnapshot) RETURN s.id AS id, s.indexedAt AS indexedAt, s.lastCommit AS lastCommit', + ); + const bridgeSnapshots: Record = {}; + for (const s of snapshots) { + bridgeSnapshots[s.id] = { indexedAt: s.indexedAt, lastCommit: s.lastCommit }; + } + + const repoStatuses: Record< + string, + { + indexStale: boolean; + contractsStale: boolean; + missing: boolean; + commitsBehind?: number; + } + > = {}; + + for (const [repoPath, registryName] of Object.entries(config.repos)) { + try { + const repoObj = await this.port.resolveRepo(registryName); + const metaPath = pathMod.join(repoObj.storagePath, 'meta.json'); + const metaRaw = await fsp.readFile(metaPath, 'utf-8').catch(() => '{}'); + const repoMeta = JSON.parse(metaRaw) as { lastCommit?: string; indexedAt?: string }; + + const staleness = repoMeta.lastCommit + ? checkStaleness(repoObj.repoPath, repoMeta.lastCommit) + : { isStale: true, commitsBehind: -1 }; + + const snapshot = bridgeSnapshots[repoPath]; + const contractsStale = + snapshot && repoMeta.indexedAt ? 
snapshot.indexedAt !== repoMeta.indexedAt : !snapshot; + + repoStatuses[repoPath] = { + indexStale: staleness.isStale, + contractsStale: Boolean(contractsStale), + missing: false, + commitsBehind: staleness.commitsBehind, + }; + } catch { + repoStatuses[repoPath] = { indexStale: false, contractsStale: false, missing: true }; + } + } + + return { + group: name, + lastSync: meta.generatedAt || null, + missingRepos: meta.missingRepos || [], + repos: repoStatuses, + }; + } finally { + await closeBridgeDb(handle); + } } } diff --git a/gitnexus/src/core/group/storage.ts b/gitnexus/src/core/group/storage.ts index aa6a781a53..7d48cfdb12 100644 --- a/gitnexus/src/core/group/storage.ts +++ b/gitnexus/src/core/group/storage.ts @@ -2,9 +2,13 @@ import * as fs from 'node:fs'; import * as fsp from 'node:fs/promises'; import * as path from 'node:path'; import * as os from 'node:os'; -import type { ContractRegistry } from './types.js'; - -const CONTRACTS_FILE = 'contracts.json'; +import type { + ContractRegistry, + BridgeHandle, + BridgeMeta, + LegacyContractRegistry, +} from './types.js'; +import { closeBridgeDb, openBridgeDbReadOnly, readBridgeMeta } from './bridge-db.js'; export function getDefaultGitnexusDir(): string { return process.env.GITNEXUS_HOME || path.join(os.homedir(), '.gitnexus'); @@ -29,19 +33,12 @@ export function getGroupDir(gitnexusDir: string, groupName: string): string { return path.join(gitnexusDir, 'groups', groupName); } -export async function writeContractRegistry( - groupDir: string, - registry: ContractRegistry, -): Promise { - const targetPath = path.join(groupDir, CONTRACTS_FILE); - const tmpPath = `${targetPath}.tmp.${Date.now()}`; - - await fsp.writeFile(tmpPath, JSON.stringify(registry, null, 2), 'utf-8'); - await fsp.rename(tmpPath, targetPath); -} - -export async function readContractRegistry(groupDir: string): Promise { - const filePath = path.join(groupDir, CONTRACTS_FILE); +/** + * @deprecated Used only as internal JSON fallback for 
openBridgeOrFallback. + * New data is written to bridge.lbug via writeBridge. + */ +async function readContractRegistryJson(groupDir: string): Promise { + const filePath = path.join(groupDir, 'contracts.json'); try { const content = await fsp.readFile(filePath, 'utf-8'); return JSON.parse(content) as ContractRegistry; @@ -107,3 +104,38 @@ matching: await fsp.writeFile(path.join(groupDir, 'group.yaml'), template, 'utf-8'); return groupDir; } + +export async function openBridgeOrFallback( + groupDir: string, +): Promise< + | { type: 'bridge'; handle: BridgeHandle; meta: BridgeMeta } + | { type: 'json'; registry: LegacyContractRegistry; deprecationWarning: string } + | { type: 'none' } +> { + const handle = await openBridgeDbReadOnly(groupDir); + if (handle) { + // readBridgeMeta has its own try/catch and returns a default when + // meta.json is missing, but defensively guard against any other + // failure so we never leak an opened bridge handle. + try { + const meta = await readBridgeMeta(groupDir); + return { type: 'bridge', handle, meta }; + } catch (err) { + await closeBridgeDb(handle).catch(() => { + /* ignore: cleanup path, best effort */ + }); + throw err; + } + } + // JSON fallback + const registry = await readContractRegistryJson(groupDir); + if (registry) { + return { + type: 'json', + registry, + deprecationWarning: + 'contracts.json is deprecated. 
Run "gitnexus group sync " to migrate to bridge.lbug.', + }; + } + return { type: 'none' }; +} diff --git a/gitnexus/src/core/group/sync.ts b/gitnexus/src/core/group/sync.ts index 92cd9fe5f1..f53a146711 100644 --- a/gitnexus/src/core/group/sync.ts +++ b/gitnexus/src/core/group/sync.ts @@ -7,11 +7,12 @@ import type { GroupConfig, RepoHandle, RepoSnapshot, StoredContract, CrossLink } import { HttpRouteExtractor } from './extractors/http-route-extractor.js'; import { GrpcExtractor } from './extractors/grpc-extractor.js'; import { TopicExtractor } from './extractors/topic-extractor.js'; -import { runExactMatch } from './matching.js'; +import { ManifestExtractor } from './extractors/manifest-extractor.js'; +import { buildProviderIndex, runExactMatch, runWildcardMatch } from './matching.js'; import { detectServiceBoundaries, assignService } from './service-boundary-detector.js'; import type { CypherExecutor } from './contract-extractor.js'; -import { writeContractRegistry } from './storage.js'; -import type { ContractRegistry } from './types.js'; +import { writeBridge } from './bridge-db.js'; +import { dedupeContracts, dedupeCrossLinks } from './normalization.js'; export interface SyncOptions { extractorOverride?: @@ -26,12 +27,49 @@ export interface SyncOptions { skipEmbeddings?: boolean; } +/** + * Per-repo failure kind captured during syncGroup. A non-empty array on + * the result means at least one repo had something fail mid-pipeline; the + * repo was NOT marked missing (we kept whatever the other steps produced), + * but the user should see these to debug incomplete coverage. + * + * Label meanings: + * - `init` — opening the per-repo LadybugDB pool failed; repo + * gets added to missingRepos and the other steps are + * skipped for that repo. + * - `boundaries` — detectServiceBoundaries() threw; contracts are + * still extracted but without service attribution. + * - `http|grpc|topic` — the named extractor threw; the other extractors + * in the same repo still run. 
+ * - `manifest` — ManifestExtractor.extractFromManifest() threw. + * - `bridge_write` — a non-fatal error inside writeBridge (individual + * contracts/links/snapshots that failed to insert). + * The bridge is still written; `message` includes a + * summary of the partial-failure counts. + */ +export type ExtractorKind = + | 'init' + | 'boundaries' + | 'http' + | 'grpc' + | 'topic' + | 'manifest' + | 'bridge_write'; + +export interface ExtractorFailure { + repo: string; + extractor: ExtractorKind; + message: string; +} + export interface SyncResult { contracts: StoredContract[]; crossLinks: CrossLink[]; unmatched: StoredContract[]; missingRepos: string[]; repoSnapshots: Record; + /** Populated when individual extractors threw. See ExtractorFailure. */ + extractorFailures?: ExtractorFailure[]; } export function stableRepoPoolId(entry: RegistryEntry, allEntries: RegistryEntry[]): string { @@ -60,15 +98,27 @@ function defaultResolveHandle(allEntries: RegistryEntry[]) { }; } +function errMessage(err: unknown): string { + if (err instanceof Error) return err.message; + try { + return String(err); + } catch { + return 'unknown error'; + } +} + export async function syncGroup(config: GroupConfig, opts?: SyncOptions): Promise { const missingRepos: string[] = []; const repoSnapshots: Record = {}; + const extractorFailures: ExtractorFailure[] = []; let autoContracts: StoredContract[] = []; let dbExecutors: Map | undefined; + let manifestResult: Awaited>; const eo = opts?.extractorOverride; if (eo && eo.length === 0) { autoContracts = await (eo as () => Promise)(); + manifestResult = await new ManifestExtractor().extractFromManifest(config.links); } else { const entries = await readRegistry(); const resolve = opts?.resolveRepoHandle ?? 
defaultResolveHandle(entries); @@ -88,18 +138,45 @@ export async function syncGroup(config: GroupConfig, opts?: SyncOptions): Promis const poolId = handle.id; const lbugPath = path.join(handle.storagePath, 'lbug'); + + // Step 1: open the per-repo LadybugDB pool. Failure here means the + // repo itself is broken/unindexed — mark missing and skip entirely. try { await initLbug(poolId, lbugPath); openPoolIds.push(poolId); + } catch (err) { + missingRepos.push(groupPath); + extractorFailures.push({ + repo: groupPath, + extractor: 'init', + message: errMessage(err), + }); + continue; + } - const executor: CypherExecutor = (query, params) => - executeParameterized(poolId, query, params ?? {}); + const executor: CypherExecutor = (query, params) => + executeParameterized(poolId, query, params ?? {}); - dbExecutors.set(groupPath, executor); + dbExecutors.set(groupPath, executor); - const boundaries = await detectServiceBoundaries(handle.repoPath); + // Step 2: service boundary detection. Degrade gracefully to empty + // boundaries on failure — contracts will still be extracted, just + // without service attribution. + let boundaries: Awaited> = []; + try { + boundaries = await detectServiceBoundaries(handle.repoPath); + } catch (err) { + extractorFailures.push({ + repo: groupPath, + extractor: 'boundaries', + message: errMessage(err), + }); + } - if (config.detect.http) { + // Step 3: run each extractor in isolation. One failure must not + // cascade to the others in the same repo. 
+ if (config.detect.http) { + try { const extracted = await httpEx.extract(executor, handle.repoPath, handle); for (const c of extracted) { autoContracts.push({ @@ -108,9 +185,17 @@ export async function syncGroup(config: GroupConfig, opts?: SyncOptions): Promis service: assignService(c.symbolRef.filePath, boundaries), }); } + } catch (err) { + extractorFailures.push({ + repo: groupPath, + extractor: 'http', + message: errMessage(err), + }); } + } - if (config.detect.grpc) { + if (config.detect.grpc) { + try { const extracted = await grpcEx.extract(executor, handle.repoPath, handle); for (const c of extracted) { autoContracts.push({ @@ -119,9 +204,17 @@ export async function syncGroup(config: GroupConfig, opts?: SyncOptions): Promis service: assignService(c.symbolRef.filePath, boundaries), }); } + } catch (err) { + extractorFailures.push({ + repo: groupPath, + extractor: 'grpc', + message: errMessage(err), + }); } + } - if (config.detect.topics) { + if (config.detect.topics) { + try { const extracted = await topicEx.extract(executor, handle.repoPath, handle); for (const c of extracted) { autoContracts.push({ @@ -130,27 +223,46 @@ export async function syncGroup(config: GroupConfig, opts?: SyncOptions): Promis service: assignService(c.symbolRef.filePath, boundaries), }); } + } catch (err) { + extractorFailures.push({ + repo: groupPath, + extractor: 'topic', + message: errMessage(err), + }); } + } - const metaPath = path.join(handle.storagePath, 'meta.json'); - try { - const raw = await fs.readFile(metaPath, 'utf-8'); - const m = JSON.parse(raw) as { indexedAt?: string; lastCommit?: string }; - repoSnapshots[groupPath] = { - indexedAt: m.indexedAt || '', - lastCommit: m.lastCommit || '', - }; - } catch { - const e = entries.find((en) => en.name === regName); - repoSnapshots[groupPath] = { - indexedAt: e?.indexedAt || '', - lastCommit: e?.lastCommit || '', - }; - } + // Step 4: read repo snapshot meta. Pre-existing fallback is fine. 
+ const metaPath = path.join(handle.storagePath, 'meta.json'); + try { + const raw = await fs.readFile(metaPath, 'utf-8'); + const m = JSON.parse(raw) as { indexedAt?: string; lastCommit?: string }; + repoSnapshots[groupPath] = { + indexedAt: m.indexedAt || '', + lastCommit: m.lastCommit || '', + }; } catch { - missingRepos.push(groupPath); + const e = entries.find((en) => en.name === regName); + repoSnapshots[groupPath] = { + indexedAt: e?.indexedAt || '', + lastCommit: e?.lastCommit || '', + }; } } + + try { + manifestResult = await new ManifestExtractor().extractFromManifest( + config.links, + dbExecutors, + ); + } catch (err) { + extractorFailures.push({ + repo: '*', + extractor: 'manifest', + message: errMessage(err), + }); + manifestResult = { contracts: [], crossLinks: [] }; + } } finally { for (const id of [...new Set(openPoolIds)]) { await closeLbug(id).catch(() => {}); @@ -158,28 +270,57 @@ export async function syncGroup(config: GroupConfig, opts?: SyncOptions): Promis } } - const { matched, unmatched } = runExactMatch(autoContracts); - const crossLinks: CrossLink[] = matched; - const allContracts: StoredContract[] = autoContracts; - - const registry: ContractRegistry = { - version: 1, - generatedAt: new Date().toISOString(), - repoSnapshots, - missingRepos, - contracts: allContracts, - crossLinks, + autoContracts = dedupeContracts(autoContracts); + manifestResult = { + contracts: dedupeContracts(manifestResult.contracts), + crossLinks: dedupeCrossLinks(manifestResult.crossLinks), }; + const providerIndex = buildProviderIndex(autoContracts); + const { matched: exactLinks, unmatched } = runExactMatch(autoContracts, providerIndex); + const { matched: wildcardLinks, remaining } = runWildcardMatch(unmatched, providerIndex); + const crossLinks: CrossLink[] = dedupeCrossLinks([ + ...manifestResult.crossLinks, + ...exactLinks, + ...wildcardLinks, + ]); + const allContracts: StoredContract[] = dedupeContracts([ + ...manifestResult.contracts, + ...autoContracts, 
+ ]); + if (opts?.groupDir && !opts.skipWrite) { - await writeContractRegistry(opts.groupDir, registry); + const writeReport = await writeBridge(opts.groupDir, { + contracts: allContracts, + crossLinks, + repoSnapshots, + missingRepos, + }); + // Surface per-item write failures as sync-level extractorFailures so the + // user sees them alongside extractor errors. Repo='*' because the error + // is at the bridge layer, not tied to a single source repo. + if ( + writeReport.contractsFailed > 0 || + writeReport.linksFailed > 0 || + writeReport.snapshotsFailed > 0 + ) { + const summary = + `bridge write: ${writeReport.contractsFailed} contracts, ` + + `${writeReport.snapshotsFailed} snapshots, ` + + `${writeReport.linksFailed} links failed to insert` + + (writeReport.sampleErrors.length > 0 + ? `; first error: ${writeReport.sampleErrors[0].kind}[${writeReport.sampleErrors[0].id}]: ${writeReport.sampleErrors[0].message}` + : ''); + extractorFailures.push({ repo: '*', extractor: 'bridge_write', message: summary }); + } } return { contracts: allContracts, crossLinks, - unmatched, + unmatched: remaining, missingRepos, repoSnapshots, + ...(extractorFailures.length > 0 ? 
{ extractorFailures } : {}), }; } diff --git a/gitnexus/src/core/group/types.ts b/gitnexus/src/core/group/types.ts index 7ab0f071a2..8795a3995a 100644 --- a/gitnexus/src/core/group/types.ts +++ b/gitnexus/src/core/group/types.ts @@ -1,5 +1,5 @@ export type ContractType = 'http' | 'grpc' | 'topic' | 'lib' | 'custom'; -export type MatchType = 'exact' | 'manifest' | 'bm25' | 'embedding'; +export type MatchType = 'exact' | 'manifest' | 'wildcard' | 'bm25' | 'embedding'; export type ContractRole = 'provider' | 'consumer'; export interface GroupConfig { @@ -96,6 +96,8 @@ export interface RepoHandle { storagePath: string; } +export type TruncationReason = 'phase1_timeout' | 'wall_deadline'; + export interface GroupImpactResult { local: unknown; group: string; @@ -103,6 +105,23 @@ export interface GroupImpactResult { outOfScope: OutOfScopeLink[]; truncated: boolean; truncatedRepos: string[]; + /** + * Why the result is partial. Absent when `truncated` is false. + * - `phase1_timeout` — local impact walk hit the Phase-1 timeout; Phase-2 + * continued with empty local seed, so cross-repo fanout is almost certainly + * empty. Local stub fields (`impactedCount: 0`, `risk: 'LOW'`, `byDepth: {}`) + * are placeholders, NOT real zero-impact results. + * - `wall_deadline` — Phase-2 ran out of wall-clock time while iterating + * cross-link candidates; some results may be present but more could exist. + */ + truncationReason?: TruncationReason; + /** + * Populated when the caller requested a crossDepth greater than the + * MVP-supported max (currently 1). The traversal still runs at the + * supported depth, but the warning is echoed back so the caller (CLI, + * MCP, test) can surface it to the user. + */ + crossDepthWarning?: string; summary: { direct: number; processes_affected: number; @@ -129,5 +148,28 @@ export interface OutOfScopeLink { from: string; to: string; contractId: string; + matchType: MatchType; confidence: number; } + +/** + * @deprecated Use bridge.lbug instead. 
Kept for JSON fallback during migration. + * This is a type alias — ContractRegistry is NOT removed yet. + * In Task 10 (cleanup), ContractRegistry will be renamed to LegacyContractRegistry + * and all imports updated. For now, both names work. + */ +export type LegacyContractRegistry = ContractRegistry; + +/** Opaque handle to an open bridge LadybugDB. */ +export interface BridgeHandle { + /** Internal — do not access directly. */ + readonly _db: unknown; + readonly _conn: unknown; + readonly groupDir: string; +} + +export interface BridgeMeta { + version: number; + generatedAt: string; + missingRepos: string[]; +} diff --git a/gitnexus/src/mcp/local/local-backend.ts b/gitnexus/src/mcp/local/local-backend.ts index 8f8b4d94b7..a8537d4115 100644 --- a/gitnexus/src/mcp/local/local-backend.ts +++ b/gitnexus/src/mcp/local/local-backend.ts @@ -2447,6 +2447,8 @@ export class LocalBackend { return this.groupSync(params); case 'group_contracts': return this.groupContracts(params); + case 'group_impact': + return this.groupImpact(params); case 'group_query': return this.groupQuery(params); case 'group_status': @@ -2468,6 +2470,11 @@ export class LocalBackend { return this.getGroupService().groupContracts(params); } + private async groupImpact(params: Record): Promise { + await this.refreshRepos(); + return this.getGroupService().groupImpact(params); + } + private async groupQuery(params: Record): Promise { await this.refreshRepos(); return this.getGroupService().groupQuery(params); diff --git a/gitnexus/src/mcp/tools.ts b/gitnexus/src/mcp/tools.ts index c74b19464f..8f5fa1410f 100644 --- a/gitnexus/src/mcp/tools.ts +++ b/gitnexus/src/mcp/tools.ts @@ -18,6 +18,11 @@ export interface ToolDefinition { default?: any; items?: { type: string }; enum?: string[]; + // Numeric bounds (JSON Schema draft 7). Clients that honor these + // reject out-of-range values before the request hits the server. + // The server still validates server-side — these are belt-and-suspenders. 
+ minimum?: number; + maximum?: number; } >; required: string[]; @@ -381,7 +386,7 @@ Returns: single route object when one match, or { routes: [...], total: N } for name: 'group_list', description: `List all configured repository groups, or return details for one group (repos, manifest links). -WHEN TO USE: Discover groups before group_sync. Optional "name" returns a single group's config.`, +WHEN TO USE: Discover groups before group_sync or group_impact. Optional "name" returns a single group's config.`, inputSchema: { type: 'object', properties: { @@ -392,7 +397,7 @@ WHEN TO USE: Discover groups before group_sync. Optional "name" returns a single }, { name: 'group_sync', - description: `Rebuild the Contract Registry (contracts.json) for a group: extract HTTP contracts, apply manifest links, exact-match cross-links. + description: `Rebuild the Contract Registry (bridge.lbug) for a group: extract HTTP/gRPC/topic contracts, apply manifest links, exact-match and wildcard cross-links. WHEN TO USE: After changing group.yaml or re-indexing member repos.`, inputSchema: { @@ -410,7 +415,7 @@ WHEN TO USE: After changing group.yaml or re-indexing member repos.`, }, { name: 'group_contracts', - description: `Inspect contracts and cross-links from the group's contracts.json. + description: `Inspect contracts and cross-links from the group bridge graph. WHEN TO USE: Debug cross-repo links after group_sync.`, inputSchema: { @@ -424,6 +429,63 @@ WHEN TO USE: Debug cross-repo links after group_sync.`, required: ['name'], }, }, + { + name: 'group_impact', + description: `Cross-repository blast radius: local impact in the source repo, then one-hop fan-out via Contract Registry (exact/manifest links). + +WHEN TO USE: When a symbol may affect other repos in the same group. 
Multi-hop cross-boundary is not implemented; crossDepth is capped at 1.`, + inputSchema: { + type: 'object', + properties: { + name: { type: 'string', description: 'Group name' }, + target: { type: 'string', description: 'Symbol name (same as impact tool)' }, + repo: { + type: 'string', + description: 'Group path of the source repo (e.g. hr/hiring/backend)', + }, + direction: { + type: 'string', + description: 'upstream or downstream', + enum: ['upstream', 'downstream'], + }, + // Numeric bounds mirror the server-side validation in + // GroupService.groupImpact(). Keep them in sync: clients that + // honor the JSON Schema get a client-side rejection before the + // request round-trips, and clients that don't still hit the + // server-side guard. + crossDepth: { + type: 'integer', + minimum: 0, + maximum: 10, + description: + 'Cross-boundary hops (MVP: capped at 1; values above 1 are ignored with a warning)', + }, + maxDepth: { + type: 'integer', + minimum: 1, + maximum: 10, + description: 'Max graph depth within each repo (default 3)', + }, + minConfidence: { + type: 'number', + minimum: 0, + maximum: 1, + description: 'Minimum cross-link confidence (default 0.5)', + }, + subgroup: { + type: 'string', + description: 'Only fan out into repos under this group path prefix', + }, + timeout: { + type: 'integer', + minimum: 100, + maximum: 300000, + description: 'Wall-clock budget in ms (default 30000)', + }, + }, + required: ['name', 'target', 'repo'], + }, + }, { name: 'group_query', description: `Run the query tool across all repos in a group and merge process results via reciprocal rank fusion. 
diff --git a/gitnexus/test/fixtures/group/group-cross-repo.yaml b/gitnexus/test/fixtures/group/group-cross-repo.yaml new file mode 100644 index 0000000000..e59a807dcb --- /dev/null +++ b/gitnexus/test/fixtures/group/group-cross-repo.yaml @@ -0,0 +1,29 @@ +version: 1 +name: cross-repo-fixture +description: "Cross-repo fixture backed by split monorepo services" + +repos: + platform/auth: test-monorepo/services/auth + platform/orders: test-monorepo/services/orders + platform/gateway: test-monorepo/services/gateway + +links: + - from: platform/orders + to: platform/auth + type: grpc + contract: auth.AuthService/Login + role: consumer + +packages: {} + +detect: + http: true + grpc: true + topics: false + shared_libs: false + embedding_fallback: false + +matching: + bm25_threshold: 0.7 + embedding_threshold: 0.65 + max_candidates_per_step: 3 diff --git a/gitnexus/test/integration/group/group-cli.test.ts b/gitnexus/test/integration/group/group-cli.test.ts index 02da904dc4..20c9fcdf38 100644 --- a/gitnexus/test/integration/group/group-cli.test.ts +++ b/gitnexus/test/integration/group/group-cli.test.ts @@ -49,20 +49,4 @@ describe('group CLI', () => { expect(l.status).toBe(0); expect(l.stdout).toContain('acme'); }); - - it('test_create_with_invalid_name_fails', () => { - const result = runGroup(['create', '../../evil']); - expect(result.status).not.toBe(0); - expect(result.stderr).toContain('Invalid group name'); - }); - - it('test_sync_command_source_does_not_call_blanket_closeLbug', () => { - const cliGroupPath = path.join(repoRoot, 'src', 'cli', 'group.ts'); - const source = fs.readFileSync(cliGroupPath, 'utf-8'); - - // closeLbug() without arguments (blanket close) must not appear. 
- // Match closeLbug() but not closeLbug(someArg) - const blanketClosePattern = /closeLbug\s*\(\s*\)/; - expect(source).not.toMatch(blanketClosePattern); - }); }); diff --git a/gitnexus/test/integration/group/group-impact.test.ts b/gitnexus/test/integration/group/group-impact.test.ts new file mode 100644 index 0000000000..1f8e64d39e --- /dev/null +++ b/gitnexus/test/integration/group/group-impact.test.ts @@ -0,0 +1,370 @@ +/** + * Group impact integration keeps a cheap mocked wiring test and adds a real + * bridge-backed fixture path through GroupService using indexed fixture repos. + */ +import { describe, it, expect } from 'vitest'; +import * as fs from 'node:fs'; +import * as os from 'node:os'; +import * as path from 'node:path'; +import { fileURLToPath } from 'node:url'; +import { runGroupImpactLegacy } from '../../../src/core/group/cross-impact.js'; +import { parseGroupConfig } from '../../../src/core/group/config-parser.js'; +import { GroupService } from '../../../src/core/group/service.js'; +import { syncGroup } from '../../../src/core/group/sync.js'; +import { runFullAnalysis } from '../../../src/core/run-analyze.js'; +import type { + ContractRegistry, + RepoHandle, + StoredContract, +} from '../../../src/core/group/types.js'; + +const __dirname = path.dirname(fileURLToPath(import.meta.url)); +const FIXTURES_DIR = path.resolve(__dirname, '../../fixtures/group'); +const FIXTURE_REPOS = [ + ['platform/auth', 'test-monorepo/services/auth'], + ['platform/orders', 'test-monorepo/services/orders'], + ['platform/gateway', 'test-monorepo/services/gateway'], +] as const; +const ANALYZE_CALLBACKS = { + onProgress: () => {}, + onLog: () => {}, +}; + +function minimalRegistry(crossLinks: ContractRegistry['crossLinks']): ContractRegistry { + return { + version: 1, + generatedAt: new Date().toISOString(), + repoSnapshots: {}, + missingRepos: [], + contracts: [], + crossLinks, + }; +} + +function withIsolatedHomes(run: (tempRoot: string) => Promise): Promise { + const 
previous = { + GITNEXUS_HOME: process.env.GITNEXUS_HOME, + USERPROFILE: process.env.USERPROFILE, + HOME: process.env.HOME, + }; + const tempRoot = fs.mkdtempSync(path.join(os.tmpdir(), 'gitnexus-group-impact-')); + process.env.GITNEXUS_HOME = path.join(tempRoot, '.gitnexus-home'); + process.env.USERPROFILE = tempRoot; + process.env.HOME = tempRoot; + + return run(tempRoot).finally(async () => { + if (previous.GITNEXUS_HOME === undefined) delete process.env.GITNEXUS_HOME; + else process.env.GITNEXUS_HOME = previous.GITNEXUS_HOME; + if (previous.USERPROFILE === undefined) delete process.env.USERPROFILE; + else process.env.USERPROFILE = previous.USERPROFILE; + if (previous.HOME === undefined) delete process.env.HOME; + else process.env.HOME = previous.HOME; + }); +} + +function stageFixtureRepos(tempRoot: string): string { + const fixtureRoot = path.join(tempRoot, 'fixture-root'); + for (const [, relPath] of FIXTURE_REPOS) { + const sourcePath = path.join(FIXTURES_DIR, relPath); + const targetPath = path.join(fixtureRoot, relPath); + fs.mkdirSync(path.dirname(targetPath), { recursive: true }); + fs.cpSync(sourcePath, targetPath, { recursive: true }); + fs.rmSync(path.join(targetPath, '.gitnexus'), { recursive: true, force: true }); + } + return fixtureRoot; +} + +async function analyzeStagedFixtureRepos(fixtureRoot: string): Promise { + for (const [, relPath] of FIXTURE_REPOS) { + const repoPath = path.join(fixtureRoot, relPath); + await runFullAnalysis( + repoPath, + { + force: true, + embeddings: false, + skipAgentsMd: true, + }, + ANALYZE_CALLBACKS, + ); + } +} + +function makeRepoHandle( + fixtureRoot: string, + groupPath: (typeof FIXTURE_REPOS)[number][0], + registryName: string, + fixtureRelPath = registryName, +): RepoHandle { + const repoPath = path.join(fixtureRoot, fixtureRelPath); + return { + id: registryName, + path: groupPath, + repoPath, + storagePath: path.join(repoPath, '.gitnexus'), + }; +} + +function makeMinimalImpact(result: { id: string; name: 
string; filePath: string; type?: string }): { + target: { id: string; name: string; filePath: string; type: string }; + direction: 'upstream' | 'downstream'; + impactedCount: number; + risk: 'LOW'; + summary: { direct: number; processes_affected: number; modules_affected: number }; + affected_processes: []; + affected_modules: []; + byDepth: { '1': Array<{ id: string; name: string; filePath: string }> }; +} { + return { + target: { + id: result.id, + name: result.name, + filePath: result.filePath, + type: result.type ?? 'Function', + }, + direction: 'upstream', + impactedCount: 1, + risk: 'LOW', + summary: { direct: 1, processes_affected: 0, modules_affected: 0 }, + affected_processes: [], + affected_modules: [], + byDepth: { + '1': [{ id: result.id, name: result.name, filePath: result.filePath }], + }, + }; +} + +describe('Group impact integration', () => { + it('runs phase 1 and fan-out when cross-link matches UID', async () => { + const registry = minimalRegistry([ + { + from: { + repo: 'app/frontend', + symbolUid: 'remote-1', + symbolRef: { filePath: 'f.ts', name: 'x' }, + }, + to: { + repo: 'app/backend', + symbolUid: 'local-target', + symbolRef: { filePath: 'b.ts', name: 'y' }, + }, + type: 'http', + contractId: 'http::GET::/x', + matchType: 'exact', + confidence: 1.0, + }, + ]); + + const localImpactFn = async () => ({ + target: { id: 'local-target', name: 'T', filePath: 'b.ts' }, + direction: 'upstream', + impactedCount: 1, + risk: 'LOW', + summary: { direct: 1, processes_affected: 0, modules_affected: 0 }, + affected_processes: [], + affected_modules: [], + byDepth: { '1': [{ id: 'local-target', name: 'T', filePath: 'b.ts' }] }, + }); + + let fanOutCalls = 0; + const crossImpactFn = async (groupPath: string, uid: string, _direction: string) => { + fanOutCalls++; + expect(groupPath).toBe('app/frontend'); + expect(uid).toBe('remote-1'); + return { byDepth: {}, affected_processes: [] }; + }; + + const result = await runGroupImpactLegacy({ + groupName: 'g', + 
target: 'T', + repoPath: 'app/backend', + direction: 'upstream', + registry, + localImpactFn, + crossImpactFn, + crossDepth: 1, + timeout: 5000, + }); + + expect(result.cross.length).toBe(1); + expect(fanOutCalls).toBe(1); + expect(result.summary.cross_repo_hits).toBe(1); + }); + + it('runs real cross-repo impact through bridge and indexed fixture repos', async () => { + await withIsolatedHomes(async () => { + const groupHome = process.env.GITNEXUS_HOME!; + const groupDir = path.join(groupHome, 'groups', 'cross-repo-fixture'); + const fixtureRoot = stageFixtureRepos(groupHome); + const groupYaml = `version: 1 +name: cross-repo-fixture +description: "Cross-repo fixture backed by split monorepo services" + +repos: + platform/auth: auth + platform/orders: orders + platform/gateway: gateway + +links: + - from: platform/orders + to: platform/auth + type: grpc + contract: "auth.AuthService/Login" + role: consumer + +packages: {} + +detect: + http: true + grpc: false + topics: false + shared_libs: false + embedding_fallback: false + +matching: + bm25_threshold: 0.7 + embedding_threshold: 0.65 + max_candidates_per_step: 3 +`; + fs.mkdirSync(groupDir, { recursive: true }); + fs.writeFileSync(path.join(groupDir, 'group.yaml'), groupYaml); + + await analyzeStagedFixtureRepos(fixtureRoot); + + const config = parseGroupConfig(groupYaml); + const handles = new Map( + FIXTURE_REPOS.map(([groupPath, relPath]) => [ + path.basename(relPath), + makeRepoHandle(fixtureRoot, groupPath, path.basename(relPath), relPath), + ]), + ); + + const syncResult = await syncGroup(config, { + groupDir, + resolveRepoHandle: async (registryName, groupPath) => { + const fixtureRelPath = FIXTURE_REPOS.find( + ([, relPath]) => path.basename(relPath) === registryName, + )?.[1]; + if (!fixtureRelPath) return null; + return makeRepoHandle( + fixtureRoot, + groupPath as (typeof FIXTURE_REPOS)[number][0], + registryName, + fixtureRelPath, + ); + }, + }); + + 
expect(syncResult.crossLinks.length).toBeGreaterThan(0); + + const bootstrapService = new GroupService({ + resolveRepo: async (repoParam?: string) => { + const handle = handles.get(repoParam ?? ''); + if (!handle) throw new Error(`Unknown repo: ${repoParam ?? ''}`); + return { + id: handle.id, + name: repoParam ?? handle.id, + repoPath: handle.repoPath, + storagePath: handle.storagePath, + }; + }, + impact: async () => ({ error: 'bootstrap-only' }), + query: async () => ({ processes: [] }), + impactByUid: async () => null, + }); + + const bootstrapContracts = (await bootstrapService.groupContracts({ + name: 'cross-repo-fixture', + })) as { contracts?: StoredContract[] }; + const contractRows = bootstrapContracts.contracts ?? []; + const authProvider = contractRows.find( + (contract) => + contract.repo === 'platform/auth' && + contract.role === 'provider' && + contract.contractId === 'grpc::auth.AuthService/Login', + ); + const ordersConsumer = contractRows.find( + (contract) => + contract.repo === 'platform/orders' && + contract.role === 'consumer' && + contract.contractId === 'grpc::auth.AuthService/Login', + ); + + expect(authProvider).toBeDefined(); + expect(ordersConsumer).toBeDefined(); + + const groupService = new GroupService({ + resolveRepo: async (repoParam?: string) => { + const handle = handles.get(repoParam ?? ''); + if (!handle) throw new Error(`Unknown repo: ${repoParam ?? ''}`); + return { + id: handle.id, + name: repoParam ?? 
handle.id, + repoPath: handle.repoPath, + storagePath: handle.storagePath, + }; + }, + impact: async (_repo, params) => { + const contract = contractRows.find( + (row) => + row.repo === 'platform/auth' && + row.role === 'provider' && + row.symbolName === params.target, + ); + if (!contract) { + return { error: `Target "${params.target}" not found in bridge contracts` }; + } + return { + ...makeMinimalImpact({ + id: contract.symbolUid || `${contract.repo}::${contract.contractId}`, + name: contract.symbolName, + filePath: contract.symbolRef.filePath, + }), + direction: params.direction, + }; + }, + query: async () => ({ processes: [] }), + impactByUid: async (repoId, uid, direction) => { + const targetRepo = handles.get(repoId)?.path; + const contract = contractRows.find( + (row) => + row.repo === targetRepo && + (row.symbolUid === uid || + (row.repo === ordersConsumer?.repo && + row.contractId === ordersConsumer?.contractId && + row.symbolName === ordersConsumer?.symbolName)), + ); + if (!contract) return null; + return { + ...makeMinimalImpact({ + id: contract.symbolUid || `${contract.repo}::${contract.contractId}`, + name: contract.symbolName, + filePath: contract.symbolRef.filePath, + }), + direction, + }; + }, + }); + + const result = (await groupService.groupImpact({ + name: 'cross-repo-fixture', + repo: 'platform/auth', + target: authProvider?.symbolName ?? 
'AuthService.Login', + direction: 'upstream', + })) as { + error?: string; + cross?: Array<{ repo_path: string; contract: { id: string } }>; + summary?: { cross_repo_hits: number }; + }; + + expect(result.error).toBeUndefined(); + expect(result.summary?.cross_repo_hits).toBeGreaterThan(0); + expect( + result.cross?.some( + (hit) => + hit.repo_path === 'platform/orders' && + hit.contract.id === 'grpc::auth.AuthService/Login', + ), + ).toBe(true); + }); + }, 120000); +}); diff --git a/gitnexus/test/integration/group/group-sync.test.ts b/gitnexus/test/integration/group/group-sync.test.ts index 3cceb200b6..0cd2e7f527 100644 --- a/gitnexus/test/integration/group/group-sync.test.ts +++ b/gitnexus/test/integration/group/group-sync.test.ts @@ -1,17 +1,101 @@ /** - * Group sync integration — uses `extractorOverride` / parsed YAML only (no LadybugDB). - * Full pipeline with indexed fixture repos is a follow-up (needs `.gitnexus/lbug`). + * Group sync integration — keeps a cheap mocked orchestration check and adds + * a real indexed-fixture path for bridge generation. 
 */ import { describe, it, expect } from 'vitest'; import * as path from 'node:path'; import * as fs from 'node:fs'; +import * as os from 'node:os'; import { fileURLToPath } from 'node:url'; import { parseGroupConfig } from '../../../src/core/group/config-parser.js'; import { syncGroup } from '../../../src/core/group/sync.js'; -import type { StoredContract } from '../../../src/core/group/types.js'; +import { runFullAnalysis } from '../../../src/core/run-analyze.js'; +import { bridgeExists } from '../../../src/core/group/bridge-db.js'; +import type { RepoHandle, StoredContract } from '../../../src/core/group/types.js'; const __dirname = path.dirname(fileURLToPath(import.meta.url)); const FIXTURES_DIR = path.resolve(__dirname, '../../fixtures/group'); +const CROSS_REPO_FIXTURE = path.join(FIXTURES_DIR, 'group-cross-repo.yaml'); +const FIXTURE_REPOS = [ + 'test-monorepo/services/auth', + 'test-monorepo/services/orders', + 'test-monorepo/services/gateway', +] as const; + +const ANALYZE_CALLBACKS = { + onProgress: () => {}, + onLog: () => {}, +}; + +function withIsolatedHomes<T>(run: (tempRoot: string) => Promise<T>): Promise<T> { + const previous = { + GITNEXUS_HOME: process.env.GITNEXUS_HOME, + USERPROFILE: process.env.USERPROFILE, + HOME: process.env.HOME, + }; + const tempRoot = fs.mkdtempSync(path.join(os.tmpdir(), 'gitnexus-group-sync-')); + process.env.GITNEXUS_HOME = path.join(tempRoot, '.gitnexus-home'); + process.env.USERPROFILE = tempRoot; + process.env.HOME = tempRoot; + + return run(tempRoot).finally(async () => { + if (previous.GITNEXUS_HOME === undefined) delete process.env.GITNEXUS_HOME; + else process.env.GITNEXUS_HOME = previous.GITNEXUS_HOME; + if (previous.USERPROFILE === undefined) delete process.env.USERPROFILE; + else process.env.USERPROFILE = previous.USERPROFILE; + if (previous.HOME === undefined) delete process.env.HOME; + else process.env.HOME = previous.HOME; + await removeTree(tempRoot); + }); +} + +function stageFixtureRepos(tempRoot: string): string 
{ + const fixtureRoot = path.join(tempRoot, 'fixture-root'); + for (const relPath of FIXTURE_REPOS) { + const sourcePath = path.join(FIXTURES_DIR, relPath); + const targetPath = path.join(fixtureRoot, relPath); + fs.mkdirSync(path.dirname(targetPath), { recursive: true }); + fs.cpSync(sourcePath, targetPath, { recursive: true }); + fs.rmSync(path.join(targetPath, '.gitnexus'), { recursive: true, force: true }); + } + return fixtureRoot; +} + +async function analyzeFixtureRepos(fixtureRoot: string): Promise<void> { + for (const relPath of FIXTURE_REPOS) { + const repoPath = path.join(fixtureRoot, relPath); + await runFullAnalysis( + repoPath, + { + force: true, + embeddings: false, + skipAgentsMd: true, + }, + ANALYZE_CALLBACKS, + ); + } +} + +async function removeTree(targetPath: string): Promise<void> { + for (let attempt = 0; attempt < 5; attempt++) { + try { + fs.rmSync(targetPath, { recursive: true, force: true }); + return; + } catch { + await new Promise((resolve) => setTimeout(resolve, 50 * (attempt + 1))); + } + } +} + +function resolveFixtureRepoHandle(registryName: string, groupPath: string): RepoHandle { + const repoPath = path.join(FIXTURES_DIR, registryName); + return { + id: groupPath.replace(/[^a-z0-9]+/gi, '-').toLowerCase(), + path: groupPath, + repoPath, + storagePath: path.join(repoPath, '.gitnexus'), + }; +} describe('Group sync integration', () => { it('parses fixture group.yaml', () => { @@ -76,4 +160,44 @@ describe('Group sync integration', () => { const healthUnmatched = result.unmatched.some((c) => c.contractId.includes('/api/health')); expect(healthUnmatched).toBe(true); }); + + it('builds bridge.lbug from real per-repo indexes for cross-repo fixture', async () => { + await withIsolatedHomes(async (tempRoot) => { + const config = parseGroupConfig(fs.readFileSync(CROSS_REPO_FIXTURE, 'utf-8')); + const groupDir = path.join(tempRoot, 'group-output'); + const fixtureRoot = stageFixtureRepos(tempRoot); + fs.mkdirSync(groupDir, { recursive: true }); + + 
await analyzeFixtureRepos(fixtureRoot); + + const result = await syncGroup(config, { + groupDir, + resolveRepoHandle: async (registryName, groupPath) => ({ + ...resolveFixtureRepoHandle(registryName, groupPath), + repoPath: path.join(fixtureRoot, registryName), + storagePath: path.join(fixtureRoot, registryName, '.gitnexus'), + }), + }); + + const grpcLink = result.crossLinks.find( + (link) => + link.type === 'grpc' && + link.matchType === 'wildcard' && + link.from.repo === 'platform/orders' && + link.to.repo === 'platform/auth', + ); + expect(grpcLink).toBeDefined(); + + const httpLink = result.crossLinks.find( + (link) => + link.type === 'http' && + link.matchType === 'exact' && + link.contractId === 'http::POST::/api/orders' && + link.from.repo === 'platform/gateway' && + link.to.repo === 'platform/orders', + ); + expect(httpLink).toBeDefined(); + expect(await bridgeExists(groupDir)).toBe(true); + }); + }, 120000); }); diff --git a/gitnexus/test/integration/group/monorepo-sync.test.ts b/gitnexus/test/integration/group/monorepo-sync.test.ts index a9dc38ceca..15901af1da 100644 --- a/gitnexus/test/integration/group/monorepo-sync.test.ts +++ b/gitnexus/test/integration/group/monorepo-sync.test.ts @@ -15,7 +15,11 @@ import { detectServiceBoundaries, assignService, } from '../../../src/core/group/service-boundary-detector.js'; -import { runExactMatch } from '../../../src/core/group/matching.js'; +import { + buildProviderIndex, + runExactMatch, + runWildcardMatch, +} from '../../../src/core/group/matching.js'; import type { RepoHandle, StoredContract } from '../../../src/core/group/types.js'; const __dirname = path.dirname(fileURLToPath(import.meta.url)); @@ -117,4 +121,51 @@ describe('Monorepo sync integration', () => { // Summary: we should have at least 2 cross-links expect(matched.length).toBeGreaterThanOrEqual(2); }); + + it('matches wildcard gRPC consumers to method providers with degraded confidence', async () => { + const handle = makeHandle(); + const 
boundaries = await detectServiceBoundaries(MONOREPO_DIR); + const grpcContracts = await new GrpcExtractor().extract(null, MONOREPO_DIR, handle); + const allContracts: StoredContract[] = grpcContracts.map((contract) => ({ + ...contract, + repo: REPO_GROUP_PATH, + service: assignService(contract.symbolRef.filePath, boundaries), + })); + + const providerIndex = buildProviderIndex(allContracts); + const { unmatched } = runExactMatch(allContracts, providerIndex); + const { matched } = runWildcardMatch(unmatched, providerIndex); + + const methodProvider = allContracts.find( + (contract) => + contract.role === 'provider' && + contract.contractId === 'grpc::auth.AuthService/Login' && + contract.symbolRef.filePath.includes('services/auth/'), + ); + + expect(methodProvider).toBeDefined(); + + const wildcardLink = matched.find( + (link) => + link.matchType === 'wildcard' && + link.type === 'grpc' && + link.contractId.endsWith('/*') && + link.to.symbolRef.filePath.includes('services/auth/') && + link.to.symbolRef.name === methodProvider?.symbolRef.name, + ); + + expect(wildcardLink).toBeDefined(); + const wildcardConsumer = allContracts.find( + (contract) => + contract.role === 'consumer' && + contract.contractId === wildcardLink?.contractId && + contract.symbolRef.filePath === wildcardLink?.from.symbolRef.filePath, + ); + + expect(wildcardConsumer).toBeDefined(); + expect(wildcardLink?.confidence).toBe( + Math.min(wildcardConsumer?.confidence ?? 1, methodProvider?.confidence ?? 1), + ); + expect(wildcardLink?.confidence).toBeLessThan(methodProvider?.confidence ?? 
1); + }); }); diff --git a/gitnexus/test/unit/group/bridge-db-edge.test.ts b/gitnexus/test/unit/group/bridge-db-edge.test.ts new file mode 100644 index 0000000000..da1033d130 --- /dev/null +++ b/gitnexus/test/unit/group/bridge-db-edge.test.ts @@ -0,0 +1,190 @@ +import { describe, it, expect, beforeEach, afterEach } from 'vitest'; +import fsp from 'node:fs/promises'; +import path from 'node:path'; +import os from 'node:os'; +import { + writeBridge, + openBridgeDbReadOnly, + queryBridge, + closeBridgeDb, +} from '../../../src/core/group/bridge-db.js'; +import type { StoredContract, CrossLink } from '../../../src/core/group/types.js'; + +const makeContract = (overrides: Partial<StoredContract> = {}): StoredContract => ({ + contractId: 'http::GET::/api/users', + type: 'http', + role: 'provider', + symbolUid: 'uid-1', + symbolRef: { filePath: 'src/routes.ts', name: 'getUsers' }, + symbolName: 'getUsers', + confidence: 0.85, + meta: {}, + repo: 'backend', + ...overrides, +}); + +describe('bridge-db edge cases', () => { + let tmpDir: string; + + beforeEach(async () => { + tmpDir = await fsp.mkdtemp(path.join(os.tmpdir(), 'bridge-edge-')); + }); + + afterEach(async () => { + await fsp.rm(tmpDir, { recursive: true, force: true }); + }); + + it('test_openBridgeDbReadOnly_version_gate_returns_null_for_incompatible', async () => { + // Create a dummy bridge.lbug file so the access check passes + await fsp.writeFile(path.join(tmpDir, 'bridge.lbug'), 'dummy'); + // Write meta.json with an incompatible version (999) + await fsp.writeFile( + path.join(tmpDir, 'meta.json'), + JSON.stringify({ version: 999, generatedAt: '', missingRepos: [] }), + ); + + const handle = await openBridgeDbReadOnly(tmpDir); + expect(handle).toBeNull(); + }); + + it('test_openBridgeDbReadOnly_bak_recovery_restores_bridge', async () => { + // Write a valid bridge + await writeBridge(tmpDir, { + contracts: [makeContract()], + crossLinks: [], + repoSnapshots: {}, + missingRepos: [], + }); + // Move bridge.lbug → 
bridge.lbug.bak (simulating interrupted swap) + const dbPath = path.join(tmpDir, 'bridge.lbug'); + const bakPath = path.join(tmpDir, 'bridge.lbug.bak'); + await fsp.rename(dbPath, bakPath); + + // openBridgeDbReadOnly should auto-recover from .bak + const handle = await openBridgeDbReadOnly(tmpDir); + expect(handle).not.toBeNull(); + const rows = await queryBridge<{ repo: string }>( + handle!, + 'MATCH (c:Contract) RETURN c.repo AS repo', + ); + expect(rows).toHaveLength(1); + await closeBridgeDb(handle!); + }); + + it('test_writeBridge_crossLink_with_missing_to_node_silently_skipped', async () => { + const provider = makeContract({ repo: 'backend', role: 'provider' }); + const consumer = makeContract({ + repo: 'frontend', + role: 'consumer', + symbolRef: { filePath: 'src/api.ts', name: 'fetchUsers' }, + symbolName: 'fetchUsers', + }); + // CrossLink referencing a 'to' endpoint that doesn't match any contract node + const link: CrossLink = { + from: { + repo: 'frontend', + symbolUid: '', + symbolRef: { filePath: 'src/api.ts', name: 'fetchUsers' }, + }, + to: { + repo: 'nonexistent-repo', + symbolUid: 'uid-missing', + symbolRef: { filePath: 'src/missing.ts', name: 'missingFn' }, + }, + type: 'http', + contractId: 'http::GET::/api/users', + matchType: 'exact', + confidence: 1.0, + }; + + // Should not throw — the link is silently skipped + await writeBridge(tmpDir, { + contracts: [provider, consumer], + crossLinks: [link], + repoSnapshots: {}, + missingRepos: [], + }); + + const handle = await openBridgeDbReadOnly(tmpDir); + expect(handle).not.toBeNull(); + // No cross-links should exist since 'to' node was missing + const rows = await queryBridge<{ matchType: string }>( + handle!, + 'MATCH (a:Contract)-[l:ContractLink]->(b:Contract) RETURN l.matchType AS matchType', + ); + expect(rows).toHaveLength(0); + // But contracts should still be present + const contractRows = await queryBridge<{ repo: string }>( + handle!, + 'MATCH (c:Contract) RETURN c.repo AS repo', + ); + 
expect(contractRows).toHaveLength(2); + await closeBridgeDb(handle!); + }); + + it('test_writeBridge_manifest_grpc_link_with_symbol_uids_persists_queryable_contract_edge', async () => { + const provider = makeContract({ + contractId: 'grpc::auth.AuthService/Login', + type: 'grpc', + role: 'provider', + repo: 'platform/auth', + symbolUid: 'uid-auth-login', + symbolRef: { filePath: 'src/auth.proto', name: 'Login' }, + symbolName: 'auth.AuthService/Login', + }); + const consumer = makeContract({ + contractId: 'grpc::auth.AuthService/Login', + type: 'grpc', + role: 'consumer', + repo: 'platform/orders', + symbolUid: 'uid-orders-client', + symbolRef: { filePath: 'src/client.ts', name: 'AuthServiceClient' }, + symbolName: 'auth.AuthService/Login', + }); + const link: CrossLink = { + from: { + repo: 'platform/orders', + symbolUid: 'uid-orders-client', + symbolRef: { filePath: 'src/client.ts', name: 'AuthServiceClient' }, + }, + to: { + repo: 'platform/auth', + symbolUid: 'uid-auth-login', + symbolRef: { filePath: 'src/auth.proto', name: 'Login' }, + }, + type: 'grpc', + contractId: 'grpc::auth.AuthService/Login', + matchType: 'manifest', + confidence: 1.0, + }; + + await writeBridge(tmpDir, { + contracts: [provider, consumer], + crossLinks: [link], + repoSnapshots: {}, + missingRepos: [], + }); + + const handle = await openBridgeDbReadOnly(tmpDir); + expect(handle).not.toBeNull(); + const rows = await queryBridge<{ + contractId: string; + matchType: string; + fromRepo: string; + toRepo: string; + }>( + handle!, + `MATCH (a:Contract)-[l:ContractLink]->(b:Contract) + RETURN l.contractId AS contractId, l.matchType AS matchType, l.fromRepo AS fromRepo, l.toRepo AS toRepo`, + ); + expect(rows).toEqual([ + { + contractId: 'grpc::auth.AuthService/Login', + matchType: 'manifest', + fromRepo: 'platform/orders', + toRepo: 'platform/auth', + }, + ]); + await closeBridgeDb(handle!); + }); +}); diff --git a/gitnexus/test/unit/group/bridge-db.test.ts 
b/gitnexus/test/unit/group/bridge-db.test.ts new file mode 100644 index 0000000000..31add36bdd --- /dev/null +++ b/gitnexus/test/unit/group/bridge-db.test.ts @@ -0,0 +1,468 @@ +import { describe, it, expect, beforeEach, afterEach, vi } from 'vitest'; +import fsp from 'node:fs/promises'; +import path from 'node:path'; +import os from 'node:os'; +import { + openBridgeDb, + ensureBridgeSchema, + queryBridge, + closeBridgeDb, + contractNodeId, + retryRename, + writeBridge, + openBridgeDbReadOnly, + readBridgeMeta, + bridgeExists, +} from '../../../src/core/group/bridge-db.js'; +import type { StoredContract, CrossLink } from '../../../src/core/group/types.js'; + +describe('bridge-db core', () => { + let tmpDir: string; + + beforeEach(async () => { + tmpDir = await fsp.mkdtemp(path.join(os.tmpdir(), 'bridge-test-')); + }); + + afterEach(async () => { + await fsp.rm(tmpDir, { recursive: true, force: true }); + }); + + it('test_openBridgeDb_returns_handle_and_closes', async () => { + const dbPath = path.join(tmpDir, 'test.lbug'); + const handle = await openBridgeDb(dbPath); + expect(handle).toBeDefined(); + expect(handle._db).toBeDefined(); + expect(handle._conn).toBeDefined(); + expect(handle.groupDir).toBe(tmpDir); + // Close should not throw + await closeBridgeDb(handle); + }); + + it('test_ensureBridgeSchema_creates_tables_idempotent', async () => { + const dbPath = path.join(tmpDir, 'test.lbug'); + const handle = await openBridgeDb(dbPath); + await ensureBridgeSchema(handle); + // Run again — should not throw + await ensureBridgeSchema(handle); + const rows = await queryBridge<{ cnt: number }>( + handle, + 'MATCH (c:Contract) RETURN count(c) AS cnt', + ); + expect(rows[0].cnt).toBe(0); + await closeBridgeDb(handle); + }); + + it('test_queryBridge_returns_inserted_data', async () => { + const dbPath = path.join(tmpDir, 'test.lbug'); + const handle = await openBridgeDb(dbPath); + await ensureBridgeSchema(handle); + await queryBridge( + handle, + `CREATE (c:Contract { + 
id: 'abc123', contractId: 'http::GET::/api', type: 'http', role: 'provider', + repo: 'backend', confidence: 0.9 + })`, + ); + const rows = await queryBridge<{ repo: string; confidence: number }>( + handle, + 'MATCH (c:Contract) RETURN c.repo AS repo, c.confidence AS confidence', + ); + expect(rows).toHaveLength(1); + expect(rows[0].repo).toBe('backend'); + expect(rows[0].confidence).toBe(0.9); + await closeBridgeDb(handle); + }); + + it('test_queryBridge_parameterized', async () => { + const dbPath = path.join(tmpDir, 'test.lbug'); + const handle = await openBridgeDb(dbPath); + await ensureBridgeSchema(handle); + await queryBridge( + handle, + `CREATE (c:Contract { + id: 'p1', contractId: 'http::GET::/api', type: 'http', role: 'provider', + repo: 'backend', confidence: 0.9 + })`, + ); + const rows = await queryBridge<{ repo: string }>( + handle, + 'MATCH (c:Contract) WHERE c.repo = $r RETURN c.repo AS repo', + { r: 'backend' }, + ); + expect(rows).toHaveLength(1); + expect(rows[0].repo).toBe('backend'); + await closeBridgeDb(handle); + }); + + it('test_contractNodeId_full_sha256', () => { + const id = contractNodeId('backend', 'http::GET::/api', 'provider', 'src/routes.ts'); + expect(id).toHaveLength(64); // full SHA-256 hex + // Same inputs → same hash + const id2 = contractNodeId('backend', 'http::GET::/api', 'provider', 'src/routes.ts'); + expect(id).toBe(id2); + // Different filePath → different hash + const id3 = contractNodeId('backend', 'http::GET::/api', 'provider', 'src/other.ts'); + expect(id).not.toBe(id3); + }); +}); + +describe('writeBridge + read', () => { + let tmpDir: string; + + beforeEach(async () => { + tmpDir = await fsp.mkdtemp(path.join(os.tmpdir(), 'bridge-write-')); + }); + + afterEach(async () => { + await fsp.rm(tmpDir, { recursive: true, force: true }); + }); + + const makeContract = (overrides: Partial<StoredContract> = {}): StoredContract => ({ + contractId: 'http::GET::/api/users', + type: 'http', + role: 'provider', + symbolUid: 'uid-1', + symbolRef: 
{ filePath: 'src/routes.ts', name: 'getUsers' }, + symbolName: 'getUsers', + confidence: 0.85, + meta: {}, + repo: 'backend', + ...overrides, + }); + + it('test_writeBridge_creates_bridge_lbug_file', async () => { + await writeBridge(tmpDir, { + contracts: [makeContract()], + crossLinks: [], + repoSnapshots: { backend: { indexedAt: '2026-01-01', lastCommit: 'abc' } }, + missingRepos: ['missing-repo'], + }); + const exists = await bridgeExists(tmpDir); + expect(exists).toBe(true); + }); + + it('test_writeBridge_returns_report_with_insert_counts', async () => { + const report = await writeBridge(tmpDir, { + contracts: [makeContract(), makeContract({ repo: 'frontend', role: 'consumer' })], + crossLinks: [], + repoSnapshots: { backend: { indexedAt: '2026-01-01', lastCommit: 'abc' } }, + missingRepos: [], + }); + expect(report.contractsInserted).toBe(2); + expect(report.contractsFailed).toBe(0); + expect(report.snapshotsInserted).toBe(1); + expect(report.snapshotsFailed).toBe(0); + expect(report.linksInserted).toBe(0); + expect(report.linksFailed).toBe(0); + expect(report.linksDroppedMissingNode).toBe(0); + expect(report.sampleErrors).toHaveLength(0); + }); + + it('test_writeBridge_counts_dropped_links_with_missing_nodes', async () => { + // Provider + cross-link that references a non-existent consumer node → + // findContractNode returns null for `from`, link gets dropped. 
+ const provider = makeContract({ role: 'provider' }); + const report = await writeBridge(tmpDir, { + contracts: [provider], + crossLinks: [ + { + from: { + repo: 'ghost', + symbolUid: '', + symbolRef: { filePath: 'nowhere.ts', name: 'ghostFn' }, + }, + to: { + repo: provider.repo, + symbolUid: provider.symbolUid, + symbolRef: provider.symbolRef, + }, + type: 'http', + contractId: provider.contractId, + matchType: 'exact', + confidence: 1.0, + }, + ], + repoSnapshots: {}, + missingRepos: [], + }); + expect(report.linksInserted).toBe(0); + expect(report.linksDroppedMissingNode).toBe(1); + expect(report.linksFailed).toBe(0); + expect(report.contractsInserted).toBe(1); + }); + + it('test_writeBridge_contracts_queryable', async () => { + await writeBridge(tmpDir, { + contracts: [makeContract(), makeContract({ repo: 'frontend', role: 'consumer' })], + crossLinks: [], + repoSnapshots: {}, + missingRepos: [], + }); + const handle = await openBridgeDbReadOnly(tmpDir); + expect(handle).not.toBeNull(); + const rows = await queryBridge<{ repo: string }>( + handle!, + 'MATCH (c:Contract) RETURN c.repo AS repo', + ); + expect(rows).toHaveLength(2); + await closeBridgeDb(handle!); + }); + + it('test_writeBridge_meta_json_persists_missingRepos', async () => { + await writeBridge(tmpDir, { + contracts: [], + crossLinks: [], + repoSnapshots: {}, + missingRepos: ['repo-a', 'repo-b'], + }); + const meta = await readBridgeMeta(tmpDir); + expect(meta.missingRepos).toEqual(['repo-a', 'repo-b']); + expect(meta.version).toBeGreaterThan(0); + expect(meta.generatedAt).toBeTruthy(); + }); + + it('test_writeBridge_repoSnapshots_queryable', async () => { + await writeBridge(tmpDir, { + contracts: [], + crossLinks: [], + repoSnapshots: { 'hr/backend': { indexedAt: '2026-01-01', lastCommit: 'abc' } }, + missingRepos: [], + }); + const handle = await openBridgeDbReadOnly(tmpDir); + const rows = await queryBridge<{ id: string; indexedAt: string }>( + handle!, + 'MATCH (s:RepoSnapshot) RETURN s.id 
AS id, s.indexedAt AS indexedAt', + ); + expect(rows).toHaveLength(1); + expect(rows[0].id).toBe('hr/backend'); + expect(rows[0].indexedAt).toBe('2026-01-01'); + await closeBridgeDb(handle!); + }); + + it('test_writeBridge_crossLinks_queryable', async () => { + const provider = makeContract({ repo: 'backend', role: 'provider' }); + const consumer = makeContract({ + repo: 'frontend', + role: 'consumer', + symbolRef: { filePath: 'src/api.ts', name: 'fetchUsers' }, + symbolName: 'fetchUsers', + }); + const link: CrossLink = { + from: { + repo: 'frontend', + symbolUid: '', + symbolRef: { filePath: 'src/api.ts', name: 'fetchUsers' }, + }, + to: { + repo: 'backend', + symbolUid: 'uid-1', + symbolRef: { filePath: 'src/routes.ts', name: 'getUsers' }, + }, + type: 'http', + contractId: 'http::GET::/api/users', + matchType: 'exact', + confidence: 1.0, + }; + await writeBridge(tmpDir, { + contracts: [provider, consumer], + crossLinks: [link], + repoSnapshots: {}, + missingRepos: [], + }); + const handle = await openBridgeDbReadOnly(tmpDir); + const rows = await queryBridge<{ fromRepo: string; toRepo: string; matchType: string }>( + handle!, + 'MATCH (a:Contract)-[l:ContractLink]->(b:Contract) RETURN l.fromRepo AS fromRepo, l.toRepo AS toRepo, l.matchType AS matchType', + ); + expect(rows).toHaveLength(1); + expect(rows[0].fromRepo).toBe('frontend'); + expect(rows[0].toRepo).toBe('backend'); + expect(rows[0].matchType).toBe('exact'); + await closeBridgeDb(handle!); + }); + + it('test_writeBridge_duplicate_contracts_and_links_are_deduped', async () => { + const provider = makeContract({ + repo: 'backend', + role: 'provider', + symbolUid: '', + symbolName: 'auth.AuthService/Login', + symbolRef: { filePath: 'src/auth.proto', name: 'Login' }, + contractId: 'grpc::auth.AuthService/Login', + type: 'grpc', + meta: { source: 'manifest' }, + }); + const concreteProvider = makeContract({ + ...provider, + symbolUid: 'uid-auth-login', + symbolName: 'Login', + confidence: 0.85, + meta: { 
source: 'analyze' }, + }); + const consumer = makeContract({ + repo: 'frontend', + role: 'consumer', + symbolUid: '', + symbolName: 'auth.AuthService/Login', + symbolRef: { filePath: 'src/client.ts', name: 'AuthServiceClient' }, + contractId: 'grpc::auth.AuthService/Login', + type: 'grpc', + meta: { source: 'manifest' }, + }); + const link: CrossLink = { + from: { + repo: 'frontend', + symbolUid: '', + symbolRef: { filePath: 'src/client.ts', name: 'AuthServiceClient' }, + }, + to: { + repo: 'backend', + symbolUid: '', + symbolRef: { filePath: 'src/auth.proto', name: 'Login' }, + }, + type: 'grpc', + contractId: 'grpc::auth.AuthService/Login', + matchType: 'manifest', + confidence: 1, + }; + + await writeBridge(tmpDir, { + contracts: [provider, concreteProvider, consumer], + crossLinks: [link, { ...link }], + repoSnapshots: {}, + missingRepos: [], + }); + + const handle = await openBridgeDbReadOnly(tmpDir); + const contracts = await queryBridge<{ repo: string; symbolUid: string; symbolName: string }>( + handle!, + 'MATCH (c:Contract) RETURN c.repo AS repo, c.symbolUid AS symbolUid, c.symbolName AS symbolName ORDER BY c.repo', + ); + const links = await queryBridge<{ fromRepo: string; toRepo: string }>( + handle!, + 'MATCH (a:Contract)-[l:ContractLink]->(b:Contract) RETURN l.fromRepo AS fromRepo, l.toRepo AS toRepo', + ); + + expect(contracts).toHaveLength(2); + expect(contracts[0]).toEqual({ + repo: 'backend', + symbolUid: 'uid-auth-login', + symbolName: 'Login', + }); + expect(links).toHaveLength(1); + await closeBridgeDb(handle!); + }); + + it('test_openBridgeDbReadOnly_returns_null_for_missing', async () => { + const handle = await openBridgeDbReadOnly(path.join(tmpDir, 'nonexistent')); + expect(handle).toBeNull(); + }); + + it('test_bridgeExists_false_for_missing', async () => { + expect(await bridgeExists(path.join(tmpDir, 'nonexistent'))).toBe(false); + }); + + it('test_writeBridge_overwrites_previous', async () => { + await writeBridge(tmpDir, { + contracts: 
[makeContract()], + crossLinks: [], + repoSnapshots: {}, + missingRepos: [], + }); + await writeBridge(tmpDir, { + contracts: [makeContract({ repo: 'new-repo' })], + crossLinks: [], + repoSnapshots: {}, + missingRepos: [], + }); + const handle = await openBridgeDbReadOnly(tmpDir); + const rows = await queryBridge<{ repo: string }>( + handle!, + 'MATCH (c:Contract) RETURN c.repo AS repo', + ); + expect(rows).toHaveLength(1); + expect(rows[0].repo).toBe('new-repo'); + await closeBridgeDb(handle!); + }); + + it('test_readBridgeMeta_returns_defaults_for_missing', async () => { + const meta = await readBridgeMeta(path.join(tmpDir, 'nonexistent')); + expect(meta.version).toBe(0); + expect(meta.generatedAt).toBe(''); + expect(meta.missingRepos).toEqual([]); + }); +}); + +describe('retryRename', () => { + afterEach(() => { + vi.restoreAllMocks(); + }); + + it('retries on EBUSY and eventually succeeds', async () => { + // Spy on fs.promises.rename and make the first two attempts fail with + // EBUSY, then succeed on the third. Verifies that Windows-style + // transient rename failures don't immediately bubble up. + const attempts: Array<[string, string]> = []; + let calls = 0; + const spy = vi.spyOn(fsp, 'rename').mockImplementation(async (src, dst) => { + attempts.push([String(src), String(dst)]); + calls++; + if (calls < 3) { + const err = new Error('resource busy or locked') as NodeJS.ErrnoException; + err.code = 'EBUSY'; + throw err; + } + // Third attempt: pretend the rename worked. + return undefined; + }); + + await retryRename('/src/a', '/dst/b', 3); + + expect(spy).toHaveBeenCalledTimes(3); + expect(attempts.every(([s, d]) => s === '/src/a' && d === '/dst/b')).toBe(true); + }); + + it('rethrows non-retryable errors immediately', async () => { + // A non-retryable code (e.g. ENOENT) should NOT be swallowed into a + // retry loop — that would mask real bugs and waste time. 
+ let calls = 0; + vi.spyOn(fsp, 'rename').mockImplementation(async () => { + calls++; + const err = new Error('no such file') as NodeJS.ErrnoException; + err.code = 'ENOENT'; + throw err; + }); + + await expect(retryRename('/src/a', '/dst/b', 5)).rejects.toMatchObject({ code: 'ENOENT' }); + expect(calls).toBe(1); + }); + + it('gives up after the configured number of attempts', async () => { + let calls = 0; + vi.spyOn(fsp, 'rename').mockImplementation(async () => { + calls++; + const err = new Error('locked') as NodeJS.ErrnoException; + err.code = 'EPERM'; + throw err; + }); + + await expect(retryRename('/src/a', '/dst/b', 3)).rejects.toMatchObject({ code: 'EPERM' }); + expect(calls).toBe(3); + }); + + it('retries on EACCES as well', async () => { + let calls = 0; + vi.spyOn(fsp, 'rename').mockImplementation(async () => { + calls++; + if (calls < 2) { + const err = new Error('permission denied') as NodeJS.ErrnoException; + err.code = 'EACCES'; + throw err; + } + return undefined; + }); + + await retryRename('/src/a', '/dst/b', 3); + expect(calls).toBe(2); + }); +}); diff --git a/gitnexus/test/unit/group/cross-impact.test.ts b/gitnexus/test/unit/group/cross-impact.test.ts new file mode 100644 index 0000000000..b9efd7685b --- /dev/null +++ b/gitnexus/test/unit/group/cross-impact.test.ts @@ -0,0 +1,562 @@ +import { describe, it, expect, vi } from 'vitest'; +import { runGroupImpactLegacy, runGroupImpact } from '../../../src/core/group/cross-impact.js'; +import type { ContractRegistry } from '../../../src/core/group/types.js'; + +describe('runGroupImpactLegacy', () => { + const mockRegistry: ContractRegistry = { + version: 1, + generatedAt: '2026-03-31T10:00:00Z', + repoSnapshots: { + 'app/backend': { indexedAt: '2026-03-31T09:00:00Z', lastCommit: 'abc123' }, + 'app/frontend': { indexedAt: '2026-03-31T09:00:00Z', lastCommit: 'def456' }, + }, + missingRepos: [], + contracts: [], + crossLinks: [ + { + from: { + repo: 'app/frontend', + symbolUid: 'uid-fetch', + symbolRef: 
{ filePath: 'src/api.ts', name: 'fetchUsers' }, + }, + to: { + repo: 'app/backend', + symbolUid: 'uid-ctrl', + symbolRef: { filePath: 'src/ctrl.ts', name: 'UserController.list' }, + }, + type: 'http', + contractId: 'http::GET::/api/users', + matchType: 'exact', + confidence: 1.0, + }, + ], + }; + + it('returns local impact when no cross-links match', async () => { + const result = await runGroupImpactLegacy({ + groupName: 'test', + target: 'SomeUnrelatedFn', + repoPath: 'app/backend', + direction: 'upstream', + registry: mockRegistry, + localImpactFn: async () => ({ + target: { id: 'uid-x', name: 'SomeUnrelatedFn', filePath: 'src/x.ts' }, + direction: 'upstream', + impactedCount: 1, + risk: 'LOW', + summary: { direct: 1, processes_affected: 0, modules_affected: 0 }, + affected_processes: [], + affected_modules: [], + byDepth: { '1': [{ id: 'uid-y', name: 'CallerFn', filePath: 'src/y.ts' }] }, + }), + crossImpactFn: async () => null, + }); + + expect(result.cross).toHaveLength(0); + expect(result.summary.cross_repo_hits).toBe(0); + expect(result.risk).toBe('LOW'); + }); + + it('fans out through cross-links for upstream direction', async () => { + const result = await runGroupImpactLegacy({ + groupName: 'test', + target: 'UserController.list', + repoPath: 'app/backend', + direction: 'upstream', + registry: mockRegistry, + localImpactFn: async () => ({ + target: { id: 'uid-ctrl', name: 'UserController.list', filePath: 'src/ctrl.ts' }, + direction: 'upstream', + impactedCount: 2, + risk: 'LOW', + summary: { direct: 2, processes_affected: 0, modules_affected: 0 }, + affected_processes: [], + affected_modules: [], + byDepth: { + '1': [{ id: 'uid-ctrl', name: 'UserController.list', filePath: 'src/ctrl.ts' }], + }, + }), + crossImpactFn: async () => ({ + target: { id: 'uid-fetch', name: 'fetchUsers', filePath: 'src/api.ts' }, + direction: 'upstream', + impactedCount: 1, + risk: 'LOW', + summary: { direct: 1, processes_affected: 0, modules_affected: 0 }, + 
affected_processes: [], + affected_modules: [], + byDepth: { + '1': [ + { id: 'uid-profile', name: 'UserProfile', filePath: 'src/components/UserProfile.tsx' }, + ], + }, + }), + }); + + expect(result.cross).toHaveLength(1); + expect(result.cross[0].repo_path).toBe('app/frontend'); + expect(result.cross[0].contract.match_type).toBe('exact'); + expect(result.summary.cross_repo_hits).toBe(1); + expect(['HIGH', 'CRITICAL']).toContain(result.risk); + }); + + it('fans out for downstream direction (consumer repo → provider repo)', async () => { + const result = await runGroupImpactLegacy({ + groupName: 'test', + target: 'fetchUsers', + repoPath: 'app/frontend', + direction: 'downstream', + registry: mockRegistry, + localImpactFn: async () => ({ + target: { id: 'uid-fetch', name: 'fetchUsers', filePath: 'src/api.ts' }, + direction: 'downstream', + impactedCount: 1, + risk: 'LOW', + summary: { direct: 1, processes_affected: 0, modules_affected: 0 }, + affected_processes: [], + affected_modules: [], + byDepth: { '1': [{ id: 'uid-fetch', name: 'fetchUsers', filePath: 'src/api.ts' }] }, + }), + crossImpactFn: async (groupPath, uid, _direction) => { + expect(groupPath).toBe('app/backend'); + expect(uid).toBe('uid-ctrl'); + expect(_direction).toBe('downstream'); + return { + byDepth: { + '1': [{ id: 'uid-ctrl', name: 'UserController.list', filePath: 'src/ctrl.ts' }], + }, + affected_processes: [], + }; + }, + }); + + expect(result.cross).toHaveLength(1); + expect(result.cross[0].repo_path).toBe('app/backend'); + expect(result.summary.cross_repo_hits).toBe(1); + }); + + it('respects subgroup filter', async () => { + const result = await runGroupImpactLegacy({ + groupName: 'test', + target: 'UserController.list', + repoPath: 'app/backend', + direction: 'upstream', + registry: mockRegistry, + subgroup: 'other/team', + localImpactFn: async () => ({ + target: { id: 'uid-ctrl', name: 'UserController.list', filePath: 'src/ctrl.ts' }, + direction: 'upstream', + impactedCount: 1, + risk: 
'LOW', + summary: { direct: 1, processes_affected: 0, modules_affected: 0 }, + affected_processes: [], + affected_modules: [], + byDepth: { + '1': [{ id: 'uid-ctrl', name: 'UserController.list', filePath: 'src/ctrl.ts' }], + }, + }), + crossImpactFn: async () => null, + }); + + expect(result.cross).toHaveLength(0); + expect(result.outOfScope).toHaveLength(1); + expect(result.outOfScope[0].from).toBe('app/frontend'); + }); + + it('respects timeout and returns truncated result', async () => { + const result = await runGroupImpactLegacy({ + groupName: 'test', + target: 'UserController.list', + repoPath: 'app/backend', + direction: 'upstream', + timeout: 1, + registry: mockRegistry, + localImpactFn: async () => { + await new Promise((r) => setTimeout(r, 50)); + return { + target: { id: 'uid-ctrl', name: 'UserController.list', filePath: 'src/ctrl.ts' }, + direction: 'upstream', + impactedCount: 0, + risk: 'LOW', + summary: { direct: 0, processes_affected: 0, modules_affected: 0 }, + affected_processes: [], + affected_modules: [], + byDepth: {}, + }; + }, + crossImpactFn: async () => null, + }); + + expect(result.truncated).toBe(true); + }); + + it.each([ + [500, 500], + [5000, 1500], + [30000, 9000], + [60000, 10000], + ])( + 'test_runGroupImpactLegacy_phase1_timeout_contract_%i_to_%i', + async (timeout, expectedPhase1Timeout) => { + const t0 = Date.now(); + const result = await runGroupImpactLegacy({ + groupName: 'test', + target: 'slowTarget', + repoPath: 'app/backend', + direction: 'upstream', + timeout, + registry: mockRegistry, + localImpactFn: async () => { + await new Promise((resolve) => setTimeout(resolve, expectedPhase1Timeout + 50)); + return { + target: { id: '', name: 'slowTarget', filePath: '' }, + direction: 'upstream', + impactedCount: 0, + risk: 'LOW', + summary: {}, + affected_processes: [], + affected_modules: [], + byDepth: {}, + }; + }, + crossImpactFn: async () => null, + }); + expect(result.truncated).toBe(true); + expect(Date.now() - 
t0).toBeLessThan(expectedPhase1Timeout + 250); + }, + ); +}); + +describe('runGroupImpact (Cypher-based)', () => { + function makeBridgeQuery(rows: Record[]) { + return vi.fn().mockResolvedValue(rows); + } + + const localImpactFn = async () => ({ + target: { id: 'uid-ctrl', name: 'UserController.list', filePath: 'src/ctrl.ts' }, + direction: 'upstream', + impactedCount: 2, + risk: 'LOW', + summary: { direct: 2, processes_affected: 0, modules_affected: 0 }, + affected_processes: [], + affected_modules: [], + byDepth: { + '1': [{ id: 'uid-ctrl', name: 'UserController.list', filePath: 'src/ctrl.ts' }], + }, + }); + + it('test_runGroupImpact_upstream_calls_bridgeQuery_with_correct_params', async () => { + const bridgeQuery = makeBridgeQuery([ + { + fanOutRepo: 'app/frontend', + fanOutUid: 'uid-fetch', + fanOutFilePath: 'src/api.ts', + fanOutSymbolName: 'fetchUsers', + matchedLocalUid: 'uid-ctrl', + matchedLocalFilePath: 'src/ctrl.ts', + matchedLocalSymbolName: 'UserController.list', + matchType: 'exact', + confidence: 1.0, + contractId: 'http::GET::/api/users', + contractType: 'http', + }, + ]); + const crossImpactFn = vi.fn().mockResolvedValue({ + byDepth: { + '1': [ + { id: 'uid-profile', name: 'UserProfile', filePath: 'src/components/UserProfile.tsx' }, + ], + }, + affected_processes: [], + }); + + const result = await runGroupImpact({ + groupName: 'test', + target: 'UserController.list', + repoPath: 'app/backend', + direction: 'upstream', + bridgeQuery, + localImpactFn, + crossImpactFn, + }); + + expect(bridgeQuery).toHaveBeenCalledOnce(); + const [cypher, params] = bridgeQuery.mock.calls[0]; + expect(cypher).toContain('provider.repo = $sourceRepo'); + expect(params.sourceRepo).toBe('app/backend'); + expect(params.localUids).toContain('uid-ctrl'); + expect(params.minConfidence).toBe(0.5); + expect(params.subgroup).toBeUndefined(); + + expect(result.cross).toHaveLength(1); + expect(result.cross[0].repo_path).toBe('app/frontend'); + 
expect(result.cross[0].contract.match_type).toBe('exact'); + expect(result.summary.cross_repo_hits).toBe(1); + expect(result.outOfScope).toHaveLength(0); + }); + + it('test_runGroupImpact_refs_only_bridge_query_omits_empty_uid_clause', async () => { + const bridgeQuery = makeBridgeQuery([ + { + fanOutRepo: 'app/backend', + fanOutUid: '', + fanOutFilePath: 'src/ctrl.ts', + fanOutSymbolName: 'UserController.list', + matchedLocalUid: '', + matchedLocalFilePath: 'src/api.ts', + matchedLocalSymbolName: 'fetchUsers', + matchType: 'exact', + confidence: 1, + contractId: 'http::GET::/api/users', + contractType: 'http', + }, + ]); + const crossImpactFn = vi.fn().mockResolvedValue({ + byDepth: {}, + affected_processes: [], + }); + + const result = await runGroupImpact({ + groupName: 'test', + target: 'fetchUsers', + repoPath: 'app/frontend', + direction: 'downstream', + bridgeQuery, + localImpactFn: async () => ({ + target: { id: '', name: 'fetchUsers', filePath: 'src/api.ts' }, + direction: 'downstream', + impactedCount: 1, + risk: 'LOW', + summary: { direct: 1, processes_affected: 0, modules_affected: 0 }, + affected_processes: [], + affected_modules: [], + byDepth: {}, + }), + crossImpactFn, + }); + + expect(bridgeQuery).toHaveBeenCalledOnce(); + const [cypher, params] = bridgeQuery.mock.calls[0]; + expect(cypher).toContain("(consumer.filePath + '::' + consumer.symbolName) IN $localRefs"); + expect(cypher).not.toContain('consumer.symbolUid IN $localUids'); + expect(params.localRefs).toEqual(['src/api.ts::fetchUsers']); + expect(params.localUids).toBeUndefined(); + expect(result.summary.cross_repo_hits).toBe(1); + }); + + it('test_runGroupImpact_downstream_fans_out_to_provider', async () => { + const bridgeQuery = makeBridgeQuery([ + { + fanOutRepo: 'app/backend', + fanOutUid: 'uid-ctrl', + fanOutFilePath: 'src/ctrl.ts', + fanOutSymbolName: 'UserController.list', + matchedLocalUid: 'uid-fetch', + matchedLocalFilePath: 'src/api.ts', + matchedLocalSymbolName: 'fetchUsers', + 
matchType: 'exact', + confidence: 0.9, + contractId: 'http::GET::/api/users', + contractType: 'http', + }, + ]); + const crossImpactFn = vi.fn().mockResolvedValue({ + byDepth: { '1': [{ id: 'uid-ctrl', name: 'UserController.list', filePath: 'src/ctrl.ts' }] }, + affected_processes: [], + }); + + const result = await runGroupImpact({ + groupName: 'test', + target: 'fetchUsers', + repoPath: 'app/frontend', + direction: 'downstream', + bridgeQuery, + localImpactFn: async () => ({ + target: { id: 'uid-fetch', name: 'fetchUsers', filePath: 'src/api.ts' }, + direction: 'downstream', + impactedCount: 1, + risk: 'LOW', + summary: { direct: 1, processes_affected: 0, modules_affected: 0 }, + affected_processes: [], + affected_modules: [], + byDepth: { '1': [{ id: 'uid-fetch', name: 'fetchUsers', filePath: 'src/api.ts' }] }, + }), + crossImpactFn, + }); + + expect(bridgeQuery).toHaveBeenCalledOnce(); + const [cypher] = bridgeQuery.mock.calls[0]; + expect(cypher).toContain('consumer.repo = $sourceRepo'); + + expect(crossImpactFn).toHaveBeenCalledWith('app/backend', 'uid-ctrl', 'downstream', undefined); + expect(result.cross).toHaveLength(1); + expect(result.cross[0].repo_path).toBe('app/backend'); + expect(result.summary.cross_repo_hits).toBe(1); + }); + + it('test_runGroupImpact_hint_passed_when_uid_empty', async () => { + const bridgeQuery = makeBridgeQuery([ + { + fanOutRepo: 'app/frontend', + fanOutUid: '', + fanOutFilePath: 'src/api.ts', + fanOutSymbolName: 'fetchUsers', + matchedLocalUid: 'uid-ctrl', + matchedLocalFilePath: 'src/ctrl.ts', + matchedLocalSymbolName: 'UserController.list', + matchType: 'bm25', + confidence: 0.7, + contractId: 'grpc::UserService', + contractType: 'grpc', + }, + ]); + const crossImpactFn = vi.fn().mockResolvedValue({ + byDepth: {}, + affected_processes: [], + }); + + await runGroupImpact({ + groupName: 'test', + target: 'UserController.list', + repoPath: 'app/backend', + direction: 'upstream', + bridgeQuery, + localImpactFn, + crossImpactFn, 
+ }); + + expect(crossImpactFn).toHaveBeenCalledWith('app/frontend', '', 'upstream', { + filePath: 'src/api.ts', + symbolName: 'fetchUsers', + }); + }); + + it('test_runGroupImpact_subgroup_passed_to_bridgeQuery', async () => { + const bridgeQuery = makeBridgeQuery([]); + const crossImpactFn = vi.fn(); + + await runGroupImpact({ + groupName: 'test', + target: 'UserController.list', + repoPath: 'app/backend', + direction: 'upstream', + subgroup: 'team/backend', + bridgeQuery, + localImpactFn, + crossImpactFn, + }); + + expect(bridgeQuery).toHaveBeenCalledOnce(); + const [cypher, params] = bridgeQuery.mock.calls[0]; + // When subgroup is provided, the query should include the subgroup filter clause + expect(params.subgroup).toBe('team/backend'); + // The Cypher should include the subgroup WHERE clause + expect(cypher).toContain('$subgroup'); + }); + + it('test_runGroupImpact_subgroup_filtered_rows_are_reported_out_of_scope', async () => { + const bridgeQuery = makeBridgeQuery([ + { + fanOutRepo: 'other/team/frontend', + fanOutUid: 'uid-fetch', + fanOutFilePath: 'src/api.ts', + fanOutSymbolName: 'fetchUsers', + matchedLocalUid: 'uid-ctrl', + matchedLocalFilePath: 'src/ctrl.ts', + matchedLocalSymbolName: 'UserController.list', + matchType: 'exact', + confidence: 1.0, + contractId: 'http::GET::/api/users', + contractType: 'http', + }, + ]); + + const result = await runGroupImpact({ + groupName: 'test', + target: 'UserController.list', + repoPath: 'app/backend', + direction: 'upstream', + subgroup: 'team/backend', + bridgeQuery, + localImpactFn, + crossImpactFn: vi.fn().mockResolvedValue(null), + }); + + expect(result.cross).toHaveLength(0); + expect(result.outOfScope).toEqual([ + expect.objectContaining({ + from: 'other/team/frontend', + to: 'app/backend', + contractId: 'http::GET::/api/users', + matchType: 'exact', + }), + ]); + }); + + it('test_runGroupImpact_error_object_not_counted_as_hit', async () => { + const bridgeQuery = makeBridgeQuery([ + { + fanOutRepo: 
'app/frontend', + fanOutUid: 'uid-fetch', + fanOutFilePath: 'src/api.ts', + fanOutSymbolName: 'fetchUsers', + matchedLocalUid: 'uid-ctrl', + matchedLocalFilePath: 'src/ctrl.ts', + matchedLocalSymbolName: 'UserController.list', + matchType: 'exact', + confidence: 1.0, + contractId: 'http::GET::/api/users', + contractType: 'http', + }, + ]); + const crossImpactFn = vi.fn().mockResolvedValue({ error: 'repo not indexed' }); + + const result = await runGroupImpact({ + groupName: 'test', + target: 'UserController.list', + repoPath: 'app/backend', + direction: 'upstream', + bridgeQuery, + localImpactFn, + crossImpactFn, + }); + + expect(result.cross).toHaveLength(0); + expect(result.summary.cross_repo_hits).toBe(0); + }); + + it.each([ + [500, 500], + [5000, 1500], + [30000, 9000], + [60000, 10000], + ])( + 'test_runGroupImpact_phase1_timeout_contract_%i_to_%i', + async (timeout, expectedPhase1Timeout) => { + const t0 = Date.now(); + const result = await runGroupImpact({ + groupName: 'test', + target: 'slowTarget', + repoPath: 'app/backend', + direction: 'upstream', + timeout, + bridgeQuery: vi.fn().mockResolvedValue([]), + localImpactFn: async () => { + await new Promise((resolve) => setTimeout(resolve, expectedPhase1Timeout + 50)); + return { + target: { id: '', name: 'slowTarget', filePath: '' }, + direction: 'upstream', + impactedCount: 0, + risk: 'LOW', + summary: {}, + affected_processes: [], + affected_modules: [], + byDepth: {}, + }; + }, + crossImpactFn: vi.fn().mockResolvedValue(null), + }); + expect(result.truncated).toBe(true); + expect(Date.now() - t0).toBeLessThan(expectedPhase1Timeout + 250); + }, + ); +}); diff --git a/gitnexus/test/unit/group/group-tools.test.ts b/gitnexus/test/unit/group/group-tools.test.ts index e58077442c..0c2b3a7cd9 100644 --- a/gitnexus/test/unit/group/group-tools.test.ts +++ b/gitnexus/test/unit/group/group-tools.test.ts @@ -6,12 +6,13 @@ const GROUP_TOOL_NAMES = [ 'group_list', 'group_sync', 'group_contracts', + 'group_impact', 
'group_query', 'group_status', ]; describe('Group MCP tools', () => { - it('all 5 group tools are registered', () => { + it('all 6 group tools are registered', () => { for (const name of GROUP_TOOL_NAMES) { const tool = GITNEXUS_TOOLS.find((t) => t.name === name); expect(tool, `tool ${name} should be registered`).toBeDefined(); @@ -20,8 +21,22 @@ describe('Group MCP tools', () => { } }); + it('group_impact requires name, target, repo', () => { + const tool = GITNEXUS_TOOLS.find((t) => t.name === 'group_impact')!; + expect(tool.inputSchema.required).toContain('name'); + expect(tool.inputSchema.required).toContain('target'); + expect(tool.inputSchema.required).toContain('repo'); + }); + it('group_sync requires name', () => { const tool = GITNEXUS_TOOLS.find((t) => t.name === 'group_sync')!; expect(tool.inputSchema.required).toContain('name'); }); + + it('group_impact has crossDepth param with max 1 note in description', () => { + const tool = GITNEXUS_TOOLS.find((t) => t.name === 'group_impact')!; + const crossDepth = tool.inputSchema.properties.crossDepth as { description?: string }; + expect(crossDepth).toBeDefined(); + expect(crossDepth.description).toContain('capped at 1'); + }); }); diff --git a/gitnexus/test/unit/group/grpc-extractor.test.ts b/gitnexus/test/unit/group/grpc-extractor.test.ts index b4fc63b5c7..82d79cbd69 100644 --- a/gitnexus/test/unit/group/grpc-extractor.test.ts +++ b/gitnexus/test/unit/group/grpc-extractor.test.ts @@ -1,17 +1,23 @@ import { describe, it, expect, beforeEach, afterEach } from 'vitest'; import * as fs from 'node:fs'; +import fsp from 'node:fs/promises'; import * as path from 'node:path'; import * as os from 'node:os'; -import { GrpcExtractor } from '../../../src/core/group/extractors/grpc-extractor.js'; +import { + GrpcExtractor, + buildProtoMap, + resolveProtoConflict, + serviceContractId, +} from '../../../src/core/group/extractors/grpc-extractor.js'; +import type { ProtoServiceInfo } from 
'../../../src/core/group/extractors/grpc-extractor.js'; import type { RepoHandle } from '../../../src/core/group/types.js'; describe('GrpcExtractor', () => { let tmpDir: string; let extractor: GrpcExtractor; - beforeEach(() => { - tmpDir = path.join(os.tmpdir(), `gitnexus-grpc-${Date.now()}`); - fs.mkdirSync(tmpDir, { recursive: true }); + beforeEach(async () => { + tmpDir = await fsp.mkdtemp(path.join(os.tmpdir(), 'gitnexus-grpc-')); extractor = new GrpcExtractor(); }); @@ -205,6 +211,66 @@ service IncompleteService { // The old regex would find partial match; the new parser should skip it expect(providers).toHaveLength(0); }); + + it('test_extract_proto_ignores_braces_inside_string_literals', async () => { + // Regression for a known parser limitation: braces inside string + // literals used to be counted as real service-body braces, which + // would terminate the service early and drop methods after the + // offending string. + writeFile( + 'api/strings.proto', + `syntax = "proto3"; +package strings; + +service TrickyService { + rpc First (Req) returns (Res) { + option (google.api.http).additional_bindings = { + post: "/v1/first"; + }; + } + // Previously the "{" inside this literal would close the service body. + option deprecated_reason = "use NewService { instead"; + rpc Second (Req) returns (Res); + rpc Third (Req) returns (Res); +} +`, + ); + + const contracts = await extractor.extract(null, tmpDir, makeRepo(tmpDir)); + const protoProviders = contracts.filter( + (c) => c.role === 'provider' && c.symbolRef.filePath === 'api/strings.proto', + ); + // All three methods must be extracted even though a string literal + // contains an unbalanced "{". 
+ expect(protoProviders.map((c) => c.symbolName).sort()).toEqual([ + 'TrickyService.First', + 'TrickyService.Second', + 'TrickyService.Third', + ]); + }); + + it('test_extract_proto_ignores_braces_inside_comments', async () => { + writeFile( + 'api/commented.proto', + `syntax = "proto3"; +package commented; + +service Svc { + // TODO: move { or } from this comment — parser used to count them + /* A block comment with { unbalanced braces } */ + rpc Alpha (Req) returns (Res); + // }} end of the method block (in comment) + rpc Beta (Req) returns (Res); +} +`, + ); + + const contracts = await extractor.extract(null, tmpDir, makeRepo(tmpDir)); + const protoProviders = contracts.filter( + (c) => c.role === 'provider' && c.symbolRef.filePath === 'api/commented.proto', + ); + expect(protoProviders.map((c) => c.symbolName).sort()).toEqual(['Svc.Alpha', 'Svc.Beta']); + }); }); describe('Go server detection', () => { @@ -228,7 +294,7 @@ func main() { expect(providers.length).toBeGreaterThanOrEqual(1); expect(providers[0].contractId).toContain('grpc::'); expect(providers[0].contractId).toContain('AuthService'); - expect(providers[0].confidence).toBe(0.8); + expect(providers[0].confidence).toBe(0.65); }); it('test_extract_go_unimplemented_server_returns_provider', async () => { @@ -267,7 +333,7 @@ func NewAuthClient(conn *grpc.ClientConn) pb.AuthServiceClient { expect(consumers.length).toBeGreaterThanOrEqual(1); expect(consumers[0].contractId).toContain('AuthService'); - expect(consumers[0].confidence).toBe(0.7); + expect(consumers[0].confidence).toBe(0.55); }); }); @@ -287,7 +353,7 @@ public class AuthGrpcService extends AuthServiceGrpc.AuthServiceImplBase { expect(providers.length).toBeGreaterThanOrEqual(1); expect(providers[0].contractId).toContain('AuthService'); - expect(providers[0].confidence).toBe(0.8); + expect(providers[0].confidence).toBe(0.65); }); it('test_extract_java_blocking_stub_returns_consumer', async () => { @@ -306,7 +372,7 @@ public class AuthGrpcService 
extends AuthServiceGrpc.AuthServiceImplBase { expect(consumers.length).toBeGreaterThanOrEqual(1); expect(consumers[0].contractId).toContain('AuthService'); - expect(consumers[0].confidence).toBe(0.7); + expect(consumers[0].confidence).toBe(0.55); }); }); @@ -328,7 +394,7 @@ def serve(): expect(providers.length).toBeGreaterThanOrEqual(1); expect(providers[0].contractId).toContain('AuthService'); - expect(providers[0].confidence).toBe(0.8); + expect(providers[0].confidence).toBe(0.65); }); it('test_extract_python_stub_returns_consumer', async () => { @@ -346,7 +412,7 @@ stub = auth_pb2_grpc.AuthServiceStub(channel)`, expect(consumers.length).toBeGreaterThanOrEqual(1); expect(consumers[0].contractId).toContain('AuthService'); - expect(consumers[0].confidence).toBe(0.7); + expect(consumers[0].confidence).toBe(0.55); }); }); @@ -372,6 +438,165 @@ export class AuthController { expect(providers[0].contractId).toContain('Login'); expect(providers[0].confidence).toBe(0.8); }); + + it('test_extract_ts_grpc_client_decorator_returns_consumer', async () => { + writeFile( + 'proto/auth.proto', + `syntax = "proto3"; +package auth.v1; +service AuthService { + rpc Login (LoginRequest) returns (LoginResponse); +}`, + ); + writeFile( + 'src/auth.client.ts', + `import { GrpcClient } from '@nestjs/microservices'; +import type { AuthServiceClient } from './generated/auth'; + +export class AuthGateway { + @GrpcClient({ package: 'auth.v1', protoPath: 'proto/auth.proto' }) + private readonly authClient!: AuthServiceClient; +}`, + ); + + const contracts = await extractor.extract(null, tmpDir, makeRepo(tmpDir)); + const consumers = contracts.filter((c) => c.role === 'consumer'); + + expect(consumers).toHaveLength(1); + expect(consumers[0].contractId).toBe('grpc::auth.v1.AuthService/*'); + }); + + it('test_extract_ts_getService_without_decorator_returns_consumer', async () => { + writeFile( + 'proto/auth.proto', + `syntax = "proto3"; +package auth.v1; +service AuthService { + rpc Login 
(LoginRequest) returns (LoginResponse); +}`, + ); + writeFile( + 'src/auth.client.ts', + `import type { ClientGrpc } from '@nestjs/microservices'; + +export function createAuthClient(client: ClientGrpc) { + return client.getService('AuthService'); +}`, + ); + + const contracts = await extractor.extract(null, tmpDir, makeRepo(tmpDir)); + const consumers = contracts.filter((c) => c.role === 'consumer'); + + expect(consumers).toHaveLength(1); + expect(consumers[0].contractId).toBe('grpc::auth.v1.AuthService/*'); + }); + + it('test_extract_ts_generated_client_constructor_returns_consumer', async () => { + writeFile( + 'proto/auth.proto', + `syntax = "proto3"; +package auth.v1; +service AuthService { + rpc Login (LoginRequest) returns (LoginResponse); +}`, + ); + writeFile( + 'src/auth.client.ts', + `import { credentials } from '@grpc/grpc-js'; +import { AuthServiceClient } from './generated/auth'; + +export const authClient = new AuthServiceClient('localhost:50051', credentials.createInsecure());`, + ); + + const contracts = await extractor.extract(null, tmpDir, makeRepo(tmpDir)); + const consumers = contracts.filter((c) => c.role === 'consumer'); + + expect(consumers).toHaveLength(1); + expect(consumers[0].contractId).toBe('grpc::auth.v1.AuthService/*'); + }); + + it('test_extract_ts_non_service_client_constructor_is_ignored', async () => { + writeFile( + 'proto/auth.proto', + `syntax = "proto3"; +package auth.v1; +service AuthService { + rpc Login (LoginRequest) returns (LoginResponse); +}`, + ); + writeFile( + 'src/auth.client.ts', + `import { AuthClient } from './generated/auth'; + +export const authClient = new AuthClient('localhost:50051');`, + ); + + const contracts = await extractor.extract(null, tmpDir, makeRepo(tmpDir)); + const consumers = contracts.filter((c) => c.role === 'consumer'); + + expect(consumers).toHaveLength(0); + }); + + it('test_extract_ts_loadPackageDefinition_constructor_returns_consumer', async () => { + writeFile( + 'proto/auth.proto', + 
`syntax = "proto3"; +package auth.v1; +service AuthService { + rpc Login (LoginRequest) returns (LoginResponse); +}`, + ); + writeFile( + 'src/auth.client.ts', + `import * as grpc from '@grpc/grpc-js'; +import * as protoLoader from '@grpc/proto-loader'; + +const definition = protoLoader.loadSync('proto/auth.proto'); +const authProto = grpc.loadPackageDefinition(definition) as any; +export const authClient = new authProto.auth.v1.AuthService( + 'localhost:50051', + grpc.credentials.createInsecure(), +);`, + ); + + const contracts = await extractor.extract(null, tmpDir, makeRepo(tmpDir)); + const consumers = contracts.filter((c) => c.role === 'consumer'); + + expect(consumers).toHaveLength(1); + expect(consumers[0].contractId).toBe('grpc::auth.v1.AuthService/*'); + }); + + it('test_extract_ts_duplicate_consumer_patterns_in_one_file_dedupes_deterministically', async () => { + writeFile( + 'proto/auth.proto', + `syntax = "proto3"; +package auth.v1; +service AuthService { + rpc Login (LoginRequest) returns (LoginResponse); +}`, + ); + writeFile( + 'src/auth.client.ts', + `import * as grpc from '@grpc/grpc-js'; +import type { ClientGrpc } from '@nestjs/microservices'; +import { AuthServiceClient } from './generated/auth'; + +export class AuthGateway { + constructor(private readonly client: ClientGrpc) {} + + connect() { + this.client.getService('AuthService'); + return new AuthServiceClient('localhost:50051', grpc.credentials.createInsecure()); + } +}`, + ); + + const contracts = await extractor.extract(null, tmpDir, makeRepo(tmpDir)); + const consumers = contracts.filter((c) => c.role === 'consumer'); + + expect(consumers).toHaveLength(1); + expect(consumers[0].contractId).toBe('grpc::auth.v1.AuthService/*'); + }); }); describe('edge cases', () => { @@ -389,3 +614,297 @@ export class AuthController { }); }); }); + +describe('buildProtoMap', () => { + let tmpDir: string; + beforeEach(async () => { + tmpDir = await fsp.mkdtemp(path.join(os.tmpdir(), 'proto-test-')); + }); 
+ afterEach(async () => { + await fsp.rm(tmpDir, { recursive: true, force: true }); + }); + + it('test_buildProtoMap_single_proto_parses_package_service_methods', async () => { + const protoContent = ` +syntax = "proto3"; +package com.example; + +service UserService { + rpc GetUser (GetUserRequest) returns (GetUserResponse); + rpc ListUsers (ListUsersRequest) returns (ListUsersResponse); +}`; + await fsp.mkdir(path.join(tmpDir, 'proto'), { recursive: true }); + await fsp.writeFile(path.join(tmpDir, 'proto', 'user.proto'), protoContent); + + const map = await buildProtoMap(tmpDir); + expect(map.has('UserService')).toBe(true); + const entries = map.get('UserService')!; + expect(entries).toHaveLength(1); + expect(entries[0].package).toBe('com.example'); + expect(entries[0].serviceName).toBe('UserService'); + expect(entries[0].methods).toEqual(['GetUser', 'ListUsers']); + expect(entries[0].protoPath).toBe('proto/user.proto'); + }); + + it('test_buildProtoMap_no_package_declaration', async () => { + const protoContent = ` +syntax = "proto3"; +service Foo { rpc Bar (Req) returns (Res); }`; + await fsp.writeFile(path.join(tmpDir, 'foo.proto'), protoContent); + + const map = await buildProtoMap(tmpDir); + const entries = map.get('Foo')!; + expect(entries[0].package).toBe(''); + }); + + it('test_buildProtoMap_no_protos_returns_empty', async () => { + const map = await buildProtoMap(tmpDir); + expect(map.size).toBe(0); + }); + + it('test_buildProtoMap_conflicting_names', async () => { + await fsp.mkdir(path.join(tmpDir, 'a'), { recursive: true }); + await fsp.mkdir(path.join(tmpDir, 'b'), { recursive: true }); + await fsp.writeFile( + path.join(tmpDir, 'a', 'svc.proto'), + 'package pkg.a;\nservice Svc { rpc Do (R) returns (R); }', + ); + await fsp.writeFile( + path.join(tmpDir, 'b', 'svc.proto'), + 'package pkg.b;\nservice Svc { rpc Do (R) returns (R); }', + ); + + const map = await buildProtoMap(tmpDir); + expect(map.get('Svc')).toHaveLength(2); + }); + + 
it('test_buildProtoMap_imported_package_is_inherited_for_split_service_definition', async () => { + await fsp.mkdir(path.join(tmpDir, 'proto', 'shared'), { recursive: true }); + await fsp.mkdir(path.join(tmpDir, 'proto', 'services'), { recursive: true }); + await fsp.writeFile( + path.join(tmpDir, 'proto', 'shared', 'package.proto'), + 'package auth.v1;\nmessage LoginRequest {}', + ); + await fsp.writeFile( + path.join(tmpDir, 'proto', 'services', 'auth.proto'), + 'import "../shared/package.proto";\nservice AuthService { rpc Login (LoginRequest) returns (LoginRequest); }', + ); + + const map = await buildProtoMap(tmpDir); + const entries = map.get('AuthService')!; + + expect(entries).toHaveLength(1); + expect(entries[0].package).toBe('auth.v1'); + }); +}); + +describe('resolveProtoConflict', () => { + const makeInfo = (pkg: string, protoPath: string): ProtoServiceInfo => ({ + package: pkg, + serviceName: 'Svc', + methods: ['Do'], + protoPath, + }); + + it('test_single_candidate_returns_it', () => { + const result = resolveProtoConflict('Svc', 'src/main.go', [makeInfo('pkg', 'proto/svc.proto')]); + expect(result?.package).toBe('pkg'); + }); + + it('test_multiple_candidates_picks_closest_directory', () => { + const candidates = [ + makeInfo('far', 'other/dir/svc.proto'), + makeInfo('close', 'src/proto/svc.proto'), + ]; + const result = resolveProtoConflict('Svc', 'src/server.go', candidates); + expect(result?.package).toBe('close'); + }); + + it('test_centralized_proto_layout_prefers_shared_path_segments_over_prefix_only', () => { + const candidates = [ + makeInfo('billing', 'proto/services/billing/svc.proto'), + makeInfo('auth', 'proto/services/auth/svc.proto'), + ]; + const result = resolveProtoConflict('Svc', 'services/auth/src/server.ts', candidates); + expect(result?.package).toBe('auth'); + }); + + it('test_no_candidates_returns_null', () => { + expect(resolveProtoConflict('Svc', 'src/main.go', [])).toBeNull(); + }); +}); + +describe('serviceContractId', () => 
{ + it('test_with_package', () => { + expect(serviceContractId('com.example', 'UserService')).toBe('grpc::com.example.UserService/*'); + }); + + it('test_without_package', () => { + expect(serviceContractId('', 'UserService')).toBe('grpc::UserService/*'); + }); +}); + +describe('proto-aware source scanners', () => { + let tmpDir: string; + let extractor: GrpcExtractor; + + beforeEach(async () => { + tmpDir = await fsp.mkdtemp(path.join(os.tmpdir(), 'scanner-test-')); + extractor = new GrpcExtractor(); + }); + afterEach(async () => { + await fsp.rm(tmpDir, { recursive: true, force: true }); + }); + + const makeRepo = (repoPath: string): RepoHandle => ({ + id: 'test-repo', + path: '', + repoPath, + storagePath: '', + }); + + it('test_go_provider_with_proto_uses_canonical_service_id', async () => { + await fsp.mkdir(path.join(tmpDir, 'proto'), { recursive: true }); + await fsp.writeFile( + path.join(tmpDir, 'proto', 'user.proto'), + 'package com.example;\nservice UserService { rpc GetUser (R) returns (R); }', + ); + await fsp.mkdir(path.join(tmpDir, 'src'), { recursive: true }); + await fsp.writeFile( + path.join(tmpDir, 'src', 'server.go'), + 'package main\nfunc init() { pb.RegisterUserServiceServer(srv, &impl{}) }', + ); + + const contracts = await extractor.extract(null, tmpDir, makeRepo(tmpDir)); + + const goProvider = contracts.find((c) => c.meta.source === 'go_register'); + expect(goProvider).toBeDefined(); + expect(goProvider!.contractId).toBe('grpc::com.example.UserService/*'); + expect(goProvider!.confidence).toBe(0.8); + }); + + it('test_go_provider_without_proto_reduced_confidence', async () => { + await fsp.mkdir(path.join(tmpDir, 'src'), { recursive: true }); + await fsp.writeFile( + path.join(tmpDir, 'src', 'server.go'), + 'package main\nfunc init() { pb.RegisterFooServer(srv, &impl{}) }', + ); + + const contracts = await extractor.extract(null, tmpDir, makeRepo(tmpDir)); + + const goProvider = contracts.find((c) => c.meta.source === 'go_register'); + 
expect(goProvider).toBeDefined(); + expect(goProvider!.contractId).toBe('grpc::Foo/*'); + expect(goProvider!.confidence).toBe(0.65); + }); + + it('test_go_consumer_with_proto_uses_canonical_service_id', async () => { + await fsp.mkdir(path.join(tmpDir, 'proto'), { recursive: true }); + await fsp.writeFile( + path.join(tmpDir, 'proto', 'user.proto'), + 'package com.example;\nservice UserService { rpc GetUser (R) returns (R); }', + ); + await fsp.mkdir(path.join(tmpDir, 'src'), { recursive: true }); + await fsp.writeFile( + path.join(tmpDir, 'src', 'client.go'), + 'package main\nfunc init() { client := pb.NewUserServiceClient(conn) }', + ); + + const contracts = await extractor.extract(null, tmpDir, makeRepo(tmpDir)); + + const goConsumer = contracts.find((c) => c.meta.source === 'go_client'); + expect(goConsumer).toBeDefined(); + expect(goConsumer!.contractId).toBe('grpc::com.example.UserService/*'); + expect(goConsumer!.confidence).toBe(0.75); + }); + + it('test_java_provider_with_proto_uses_canonical_service_id', async () => { + await fsp.mkdir(path.join(tmpDir, 'proto'), { recursive: true }); + await fsp.writeFile( + path.join(tmpDir, 'proto', 'user.proto'), + 'package com.example;\nservice UserService { rpc GetUser (R) returns (R); }', + ); + await fsp.mkdir(path.join(tmpDir, 'src', 'main', 'java'), { recursive: true }); + await fsp.writeFile( + path.join(tmpDir, 'src', 'main', 'java', 'UserGrpcService.java'), + `@GrpcService +public class UserGrpcService extends UserServiceGrpc.UserServiceImplBase { + @Override + public void getUser(GetUserRequest req, StreamObserver obs) {} +}`, + ); + + const contracts = await extractor.extract(null, tmpDir, makeRepo(tmpDir)); + + const javaProvider = contracts.find((c) => c.meta.source === 'java_grpc_service'); + expect(javaProvider).toBeDefined(); + expect(javaProvider!.contractId).toBe('grpc::com.example.UserService/*'); + expect(javaProvider!.confidence).toBe(0.8); + }); + + 
it('test_python_consumer_with_proto_uses_canonical_service_id', async () => { + await fsp.mkdir(path.join(tmpDir, 'proto'), { recursive: true }); + await fsp.writeFile( + path.join(tmpDir, 'proto', 'user.proto'), + 'package com.example;\nservice UserService { rpc GetUser (R) returns (R); }', + ); + await fsp.writeFile( + path.join(tmpDir, 'client.py'), + `import grpc +channel = grpc.insecure_channel('localhost:50051') +stub = UserServiceStub(channel)`, + ); + + const contracts = await extractor.extract(null, tmpDir, makeRepo(tmpDir)); + + const pyConsumer = contracts.find((c) => c.meta.source === 'python_stub'); + expect(pyConsumer).toBeDefined(); + expect(pyConsumer!.contractId).toBe('grpc::com.example.UserService/*'); + expect(pyConsumer!.confidence).toBe(0.75); + }); + + it('test_ts_provider_with_proto_adds_package', async () => { + await fsp.mkdir(path.join(tmpDir, 'proto'), { recursive: true }); + await fsp.writeFile( + path.join(tmpDir, 'proto', 'user.proto'), + 'package com.example;\nservice UserService { rpc GetUser (R) returns (R); }', + ); + await fsp.mkdir(path.join(tmpDir, 'src'), { recursive: true }); + await fsp.writeFile( + path.join(tmpDir, 'src', 'controller.ts'), + "@GrpcMethod('UserService', 'GetUser')\nasync getUser() {}", + ); + + const contracts = await extractor.extract(null, tmpDir, makeRepo(tmpDir)); + + const tsProvider = contracts.find((c) => c.meta.source === 'ts_grpc_method'); + expect(tsProvider).toBeDefined(); + expect(tsProvider!.contractId).toBe('grpc::com.example.UserService/GetUser'); + expect(tsProvider!.confidence).toBe(0.8); + }); + + it('test_proto_provider_inherits_package_from_imported_definition', async () => { + await fsp.mkdir(path.join(tmpDir, 'proto', 'shared'), { recursive: true }); + await fsp.mkdir(path.join(tmpDir, 'proto', 'services'), { recursive: true }); + await fsp.writeFile( + path.join(tmpDir, 'proto', 'shared', 'package.proto'), + 'package auth.v1;\nmessage LoginRequest {}', + ); + await fsp.writeFile( + 
path.join(tmpDir, 'proto', 'services', 'auth.proto'), + `syntax = "proto3"; +import "../shared/package.proto"; +service AuthService { + rpc Login (LoginRequest) returns (LoginRequest); +}`, + ); + + const contracts = await extractor.extract(null, tmpDir, makeRepo(tmpDir)); + + const protoProvider = contracts.find( + (c) => c.symbolRef.filePath === 'proto/services/auth.proto', + ); + expect(protoProvider).toBeDefined(); + expect(protoProvider!.contractId).toBe('grpc::auth.v1.AuthService/Login'); + }); +}); diff --git a/gitnexus/test/unit/group/http-route-extractor.test.ts b/gitnexus/test/unit/group/http-route-extractor.test.ts index 2290c806a8..653b4952c4 100644 --- a/gitnexus/test/unit/group/http-route-extractor.test.ts +++ b/gitnexus/test/unit/group/http-route-extractor.test.ts @@ -157,6 +157,89 @@ export default router; providers.find((c) => c.contractId === 'http::DELETE::/api/users/{param}'), ).toBeDefined(); }); + + it('extracts Go Gin and Echo route registrations', async () => { + const dir = path.join(tmpDir, 'go-frameworks'); + fs.mkdirSync(path.join(dir, 'cmd'), { recursive: true }); + fs.writeFileSync( + path.join(dir, 'cmd', 'server.go'), + ` +package main + +func createOrder(c *gin.Context) {} +func listOrders(c echo.Context) error { return nil } + +func main() { + r := gin.Default() + r.POST("/api/orders/:id", createOrder) + + e := echo.New() + e.GET("/api/orders", listOrders) +} +`, + ); + + const contracts = await extractor.extract(null, dir, makeRepo(dir)); + const providers = contracts.filter((c) => c.role === 'provider'); + + const ginRoute = providers.find((c) => c.contractId === 'http::POST::/api/orders/{param}'); + expect(ginRoute).toBeDefined(); + expect(ginRoute?.symbolName).toBe('createOrder'); + + const echoRoute = providers.find((c) => c.contractId === 'http::GET::/api/orders'); + expect(echoRoute).toBeDefined(); + expect(echoRoute?.symbolName).toBe('listOrders'); + }); + + it('extracts stdlib HandleFunc providers', async () => { + const 
dir = path.join(tmpDir, 'go-stdlib-provider'); + fs.mkdirSync(path.join(dir, 'cmd'), { recursive: true }); + fs.writeFileSync( + path.join(dir, 'cmd', 'server.go'), + ` +package main + +func healthHandler(w http.ResponseWriter, r *http.Request) {} + +func main() { + http.HandleFunc("/api/health", healthHandler) +} +`, + ); + + const contracts = await extractor.extract(null, dir, makeRepo(dir)); + const providers = contracts.filter((c) => c.role === 'provider'); + + const healthRoute = providers.find((c) => c.contractId === 'http::GET::/api/health'); + expect(healthRoute).toBeDefined(); + expect(healthRoute?.symbolName).toBe('healthHandler'); + }); + + it('extracts NestJS controller decorators', async () => { + const dir = path.join(tmpDir, 'nestjs'); + fs.mkdirSync(path.join(dir, 'src'), { recursive: true }); + fs.writeFileSync( + path.join(dir, 'src', 'orders.controller.ts'), + ` +import { Controller, Patch } from '@nestjs/common'; + +@Controller('orders') +export class OrdersController { + @Patch(':id') + updateOrder() { + return {}; + } +} +`, + ); + + const contracts = await extractor.extract(null, dir, makeRepo(dir)); + const providers = contracts.filter((c) => c.role === 'provider'); + + const patchRoute = providers.find((c) => c.contractId === 'http::PATCH::/orders/{param}'); + expect(patchRoute).toBeDefined(); + expect(patchRoute?.symbolName).toBe('updateOrder'); + }); }); describe('consumer extraction — fetch patterns', () => { @@ -206,6 +289,91 @@ export const deleteUser = (id: string) => axios.delete(\`/api/users/\${id}\`); consumers.find((c) => c.contractId === 'http::DELETE::/api/users/{param}'), ).toBeDefined(); }); + + it('extracts Python requests calls', async () => { + const dir = path.join(tmpDir, 'python-consumer'); + fs.mkdirSync(path.join(dir, 'src'), { recursive: true }); + fs.writeFileSync( + path.join(dir, 'src', 'client.py'), + ` +import requests + +def create_order(): + return requests.post("https://svc.local/api/orders/42", json={"id": 
42}) +`, + ); + + const contracts = await extractor.extract(null, dir, makeRepo(dir)); + const consumers = contracts.filter((c) => c.role === 'consumer'); + + expect( + consumers.find((c) => c.contractId === 'http::POST::/api/orders/{param}'), + ).toBeDefined(); + }); + + it('extracts Java RestTemplate, WebClient and OkHttp calls', async () => { + const dir = path.join(tmpDir, 'java-consumer'); + fs.mkdirSync(path.join(dir, 'src'), { recursive: true }); + fs.writeFileSync( + path.join(dir, 'src', 'ApiClient.java'), + ` +import org.springframework.http.HttpMethod; +import org.springframework.web.client.RestTemplate; +import org.springframework.web.reactive.function.client.WebClient; +import okhttp3.Request; + +class ApiClient { + void run(RestTemplate restTemplate, WebClient webClient) { + restTemplate.getForObject("/api/users/{id}", String.class, 42); + webClient.method(HttpMethod.PATCH, "/api/users/42"); + new Request.Builder().url("/api/orders/42").build(); + } +} +`, + ); + + const contracts = await extractor.extract(null, dir, makeRepo(dir)); + const consumers = contracts.filter((c) => c.role === 'consumer'); + + expect(consumers.find((c) => c.contractId === 'http::GET::/api/users/{param}')).toBeDefined(); + expect( + consumers.find((c) => c.contractId === 'http::PATCH::/api/users/{param}'), + ).toBeDefined(); + expect( + consumers.find((c) => c.contractId === 'http::GET::/api/orders/{param}'), + ).toBeDefined(); + }); + + it('extracts Go stdlib and resty calls', async () => { + const dir = path.join(tmpDir, 'go-consumer'); + fs.mkdirSync(path.join(dir, 'cmd'), { recursive: true }); + fs.writeFileSync( + path.join(dir, 'cmd', 'client.go'), + ` +package main + +import ( + "net/http" + + "github.com/go-resty/resty/v2" +) + +func main() { + http.Get("/api/health") + client := resty.New() + client.R().Delete("/api/orders/42") +} +`, + ); + + const contracts = await extractor.extract(null, dir, makeRepo(dir)); + const consumers = contracts.filter((c) => c.role === 
'consumer'); + + expect(consumers.find((c) => c.contractId === 'http::GET::/api/health')).toBeDefined(); + expect( + consumers.find((c) => c.contractId === 'http::DELETE::/api/orders/{param}'), + ).toBeDefined(); + }); }); describe('provider extraction — Laravel', () => { diff --git a/gitnexus/test/unit/group/manifest-extractor.test.ts b/gitnexus/test/unit/group/manifest-extractor.test.ts new file mode 100644 index 0000000000..c2c67a33ad --- /dev/null +++ b/gitnexus/test/unit/group/manifest-extractor.test.ts @@ -0,0 +1,308 @@ +import { describe, it, expect } from 'vitest'; +import { ManifestExtractor } from '../../../src/core/group/extractors/manifest-extractor.js'; +import type { GroupManifestLink } from '../../../src/core/group/types.js'; + +describe('ManifestExtractor', () => { + const extractor = new ManifestExtractor(); + + it('creates provider + consumer contracts and a cross-link for each manifest link', async () => { + const links: GroupManifestLink[] = [ + { + from: 'hr/payroll/backend', + to: 'hr/hiring/backend', + type: 'topic', + contract: 'employee.hired', + role: 'provider', + }, + ]; + + const result = await extractor.extractFromManifest(links); + + expect(result.contracts).toHaveLength(2); + + const provider = result.contracts.find((c) => c.role === 'provider'); + expect(provider).toBeDefined(); + expect(provider!.contractId).toBe('topic::employee.hired'); + expect(provider!.type).toBe('topic'); + expect(provider!.confidence).toBe(1.0); + + const consumer = result.contracts.find((c) => c.role === 'consumer'); + expect(consumer).toBeDefined(); + expect(consumer!.contractId).toBe('topic::employee.hired'); + + expect(result.crossLinks).toHaveLength(1); + expect(result.crossLinks[0].matchType).toBe('manifest'); + expect(result.crossLinks[0].confidence).toBe(1.0); + expect(result.crossLinks[0].from.repo).toBe('hr/hiring/backend'); + expect(result.crossLinks[0].to.repo).toBe('hr/payroll/backend'); + }); + + it('handles role: consumer (from-repo is 
consumer)', async () => { + const links: GroupManifestLink[] = [ + { + from: 'sales/admin/bff', + to: 'sales/crm/backend', + type: 'http', + contract: '/api/v2/leads/*', + role: 'consumer', + }, + ]; + + const result = await extractor.extractFromManifest(links); + + const provider = result.contracts.find((c) => c.role === 'provider'); + const consumer = result.contracts.find((c) => c.role === 'consumer'); + + expect(consumer!.contractId).toBe('http::*::/api/v2/leads/*'); + expect(provider!.contractId).toBe('http::*::/api/v2/leads/*'); + + expect(result.crossLinks[0].from.repo).toBe('sales/admin/bff'); + expect(result.crossLinks[0].to.repo).toBe('sales/crm/backend'); + }); + + it('resolves grpc manifest provider by exact method name (no .proto fallback)', async () => { + const links: GroupManifestLink[] = [ + { + from: 'platform/orders', + to: 'platform/auth', + type: 'grpc', + contract: 'auth.AuthService/Login', + role: 'consumer', + }, + ]; + + const dbExecutors = new Map< + string, + (cypher: string, params?: Record) => Promise[]> + >([ + [ + 'platform/auth', + async (_cypher, params) => { + // Exact match on method name. + if (params?.methodName === 'Login') { + return [ + { + uid: 'uid-auth-login', + name: 'Login', + filePath: 'src/auth.proto', + }, + ]; + } + return []; + }, + ], + [ + 'platform/orders', + async (_cypher, params) => { + // No symbol with the exact method name — resolve returns null and + // the consumer contract gets an empty symbolUid, falling back to + // name-based hint at cross-impact time. + if (params?.methodName === 'Login') return []; + return []; + }, + ], + ]); + + const result = await extractor.extractFromManifest(links, dbExecutors); + + const provider = result.contracts.find((c) => c.role === 'provider'); + const consumer = result.contracts.find((c) => c.role === 'consumer'); + + // Provider resolved to the concrete proto symbol. 
+ expect(provider?.symbolUid).toBe('uid-auth-login'); + expect(provider?.symbolRef.filePath).toBe('src/auth.proto'); + + // Consumer falls back to a deterministic synthetic uid + name-based ref. + // The synthetic uid lets the bridge cross-impact query anchor on it + // even when the indexer doesn't expose a matching symbol. + expect(consumer?.symbolUid).toBe('manifest::platform/orders::grpc::auth.AuthService/Login'); + expect(consumer?.symbolRef.name).toBe('auth.AuthService/Login'); + + expect(result.crossLinks[0].to.symbolRef.filePath).toBe('src/auth.proto'); + expect(result.crossLinks[0].from.symbolUid).toBe( + 'manifest::platform/orders::grpc::auth.AuthService/Login', + ); + }); + + it('does NOT resolve grpc manifest to an arbitrary .proto file', async () => { + // Regression test for a previous bug: the extractor had an unconditional + // `OR n.filePath ENDS WITH '.proto'` fallback that returned the first + // proto symbol in the repo, regardless of whether it matched the contract. + const links: GroupManifestLink[] = [ + { + from: 'platform/orders', + to: 'platform/auth', + type: 'grpc', + contract: 'auth.AuthService/Login', + role: 'consumer', + }, + ]; + + const dbExecutors = new Map< + string, + (cypher: string, params?: Record) => Promise[]> + >([ + [ + 'platform/auth', + // Executor returns matches for ANY query (simulates the old buggy + // fallback that returned a random .proto file). The new code must + // only accept a hit when the method/service name matches exactly. 
+ async (_cypher, params) => { + if (params?.methodName === 'Login' || params?.serviceName === 'auth.AuthService') { + return [ + { + uid: 'uid-correct-login', + name: 'Login', + filePath: 'src/auth.proto', + }, + ]; + } + return []; + }, + ], + ['platform/orders', async () => []], + ]); + + const result = await extractor.extractFromManifest(links, dbExecutors); + const provider = result.contracts.find((c) => c.role === 'provider'); + // Must resolve to the correct symbol (not a random proto one). + expect(provider?.symbolUid).toBe('uid-correct-login'); + }); + + it('resolves lib manifest links by exact name only', async () => { + const links: GroupManifestLink[] = [ + { + from: 'platform/web', + to: 'platform/shared-lib', + type: 'lib', + contract: '@platform/contracts', + role: 'consumer', + }, + ]; + + const dbExecutors = new Map< + string, + (cypher: string, params?: Record) => Promise[]> + >([ + [ + 'platform/shared-lib', + async (_cypher, params) => { + if (params?.contract !== '@platform/contracts') return []; + return [ + { + uid: 'uid-lib', + name: '@platform/contracts', + filePath: 'src/index.ts', + }, + ]; + }, + ], + [ + 'platform/web', + async (_cypher, params) => { + if (params?.contract !== '@platform/contracts') return []; + return []; + }, + ], + ]); + + const result = await extractor.extractFromManifest(links, dbExecutors); + + const provider = result.contracts.find((c) => c.role === 'provider'); + const consumer = result.contracts.find((c) => c.role === 'consumer'); + + expect(provider?.symbolUid).toBe('uid-lib'); + // Consumer doesn't have a symbol named exactly '@platform/contracts' — + // exact matching returns null, falling back to the synthetic manifest uid. + expect(consumer?.symbolUid).toBe('manifest::platform/web::lib::@platform/contracts'); + }); + + it('does NOT resolve lib manifest via CONTAINS on name', async () => { + // Regression test: previous CONTAINS fallback would match "react" to + // "react-native" or "@types/react". 
Exact matching must reject both. + const links: GroupManifestLink[] = [ + { + from: 'web', + to: 'packages/ui', + type: 'lib', + contract: 'react', + role: 'consumer', + }, + ]; + + const dbExecutors = new Map< + string, + (cypher: string, params?: Record) => Promise[]> + >([ + [ + 'packages/ui', + async (_cypher, params) => { + // Executor is called with contract='react'. Only exact matches + // should come back; return only wrong candidates to verify the + // Cypher uses `=` not `CONTAINS`. + if (params?.contract === 'react') { + // Simulated DB returns nothing because it has only "react-native" + // and "@types/react" — neither is an exact match for "react". + return []; + } + return []; + }, + ], + ['web', async () => []], + ]); + + const result = await extractor.extractFromManifest(links, dbExecutors); + const provider = result.contracts.find((c) => c.role === 'provider'); + // No exact match → synthetic manifest uid, not a wrong real one. + expect(provider?.symbolUid).toBe('manifest::packages/ui::lib::react'); + }); + + it('normalizes http contract path for exact Route.name match', async () => { + // Manifest may be written as "/api/orders/" or "api/orders"; both should + // match the canonical "/api/orders" stored in the graph. 
+ const variants = ['/api/orders', '/api/orders/', 'api/orders', '//api//orders']; + for (const raw of variants) { + const links: GroupManifestLink[] = [ + { + from: 'gateway', + to: 'orders-svc', + type: 'http', + contract: raw, + role: 'consumer', + }, + ]; + + let seenParam: string | undefined; + const dbExecutors = new Map< + string, + (cypher: string, params?: Record) => Promise[]> + >([ + [ + 'orders-svc', + async (_cypher, params) => { + seenParam = params?.normalized as string; + return [ + { + uid: 'uid-orders-list', + name: 'listOrders', + filePath: 'src/orders.ts', + }, + ]; + }, + ], + ['gateway', async () => []], + ]); + + const result = await extractor.extractFromManifest(links, dbExecutors); + expect(seenParam).toBe('/api/orders'); + const provider = result.contracts.find((c) => c.role === 'provider'); + expect(provider?.symbolUid).toBe('uid-orders-list'); + } + }); + + it('returns empty for no links', async () => { + const result = await extractor.extractFromManifest([]); + expect(result.contracts).toHaveLength(0); + expect(result.crossLinks).toHaveLength(0); + }); +}); diff --git a/gitnexus/test/unit/group/matching.test.ts b/gitnexus/test/unit/group/matching.test.ts index bbbe5f6642..c5713d9093 100644 --- a/gitnexus/test/unit/group/matching.test.ts +++ b/gitnexus/test/unit/group/matching.test.ts @@ -1,5 +1,10 @@ import { describe, it, expect } from 'vitest'; -import { runExactMatch, normalizeContractId } from '../../../src/core/group/matching.js'; +import { + runExactMatch, + normalizeContractId, + buildProviderIndex, + runWildcardMatch, +} from '../../../src/core/group/matching.js'; import type { StoredContract } from '../../../src/core/group/types.js'; describe('normalizeContractId', () => { @@ -21,6 +26,16 @@ describe('normalizeContractId', () => { expect(normalizeContractId('grpc::/MyPkg/DoThing')).toBe('grpc::/MyPkg/DoThing'); }); + it('handles malformed grpc with leading slash and no package', () => { + // grpc::/Method — leading slash, no 
package + expect(normalizeContractId('grpc::/Method')).toBe('grpc::/Method'); + }); + + it('handles grpc with no slash at all', () => { + // grpc::ServiceName — no slash, ambiguous; MVP: lowercase entire token + expect(normalizeContractId('grpc::ServiceName')).toBe('grpc::servicename'); + }); + it('trims and lowercases topic', () => { expect(normalizeContractId('topic:: Employee.Hired ')).toBe('topic::employee.hired'); }); @@ -180,3 +195,211 @@ describe('runExactMatch', () => { expect(unmatched).toHaveLength(0); }); }); + +// --------------------------------------------------------------------------- +// Helpers for Task 6 tests +// --------------------------------------------------------------------------- +function makeGrpcContract( + id: string, + role: 'provider' | 'consumer', + repo: string, + overrides: Partial<StoredContract> = {}, +): StoredContract { + return { + contractId: id, + type: 'grpc', + role, + symbolUid: `uid-${repo}-${id}`, + symbolRef: { filePath: `src/${repo}.ts`, name: `fn-${id}` }, + symbolName: `fn-${id}`, + confidence: 0.9, + meta: {}, + repo, + ...overrides, + }; +} + +// --------------------------------------------------------------------------- +// buildProviderIndex +// --------------------------------------------------------------------------- +describe('buildProviderIndex', () => { + it('test_buildProviderIndex_creates_normalized_keys', () => { + const contracts: StoredContract[] = [ + makeGrpcContract('grpc::Com.Example.UserService/GetUser', 'provider', 'backend'), + makeGrpcContract('grpc::Com.Example.UserService/GetUser', 'consumer', 'frontend'), + ]; + + const index = buildProviderIndex(contracts); + + // Only providers should be in the index + expect(index.size).toBe(1); + // Key should be normalized (lowercased package) + expect(index.has('grpc::com.example.userservice/GetUser')).toBe(true); + expect(index.get('grpc::com.example.userservice/GetUser')).toHaveLength(1); + 
expect(index.get('grpc::com.example.userservice/GetUser')![0].role).toBe('provider'); + }); +}); + +// --------------------------------------------------------------------------- +// runExactMatch — gRPC wildcard skip +// --------------------------------------------------------------------------- +describe('runExactMatch — gRPC wildcard handling', () => { + it('test_runExactMatch_skips_grpc_wildcard_contracts', () => { + const contracts: StoredContract[] = [ + makeGrpcContract('grpc::com.example.UserService/*', 'consumer', 'frontend'), + makeGrpcContract('grpc::com.example.UserService/*', 'provider', 'backend'), + ]; + + const { matched, unmatched } = runExactMatch(contracts); + + // gRPC wildcards should NOT be matched in exact pass + expect(matched).toHaveLength(0); + // Both should appear in unmatched + expect(unmatched).toHaveLength(2); + }); + + it('test_runExactMatch_does_not_skip_http_wildcards', () => { + const contracts: StoredContract[] = [ + { + contractId: 'http::GET::/api/users', + type: 'http', + role: 'provider', + symbolUid: 'uid-backend-http', + symbolRef: { filePath: 'src/backend.ts', name: 'fn-http' }, + symbolName: 'fn-http', + confidence: 0.9, + meta: {}, + repo: 'backend', + }, + { + contractId: 'http::*::/api/users', + type: 'http', + role: 'consumer', + symbolUid: 'uid-frontend-http', + symbolRef: { filePath: 'src/frontend.ts', name: 'fn-http' }, + symbolName: 'fn-http', + confidence: 0.9, + meta: {}, + repo: 'frontend', + }, + ]; + + const { matched } = runExactMatch(contracts); + // HTTP wildcard should still match via findMatchingKeys + expect(matched).toHaveLength(1); + expect(matched[0].contractId).toBe('http::*::/api/users'); + }); +}); + +// --------------------------------------------------------------------------- +// runWildcardMatch +// --------------------------------------------------------------------------- +describe('runWildcardMatch', () => { + it('test_runWildcardMatch_fq_service_match', () => { + const consumer = 
makeGrpcContract('grpc::com.example.UserService/*', 'consumer', 'frontend'); + const provider = makeGrpcContract( + 'grpc::com.example.UserService/GetUser', + 'provider', + 'backend', + ); + + const providerIndex = buildProviderIndex([provider]); + const { matched } = runWildcardMatch([consumer], providerIndex); + + expect(matched).toHaveLength(1); + expect(matched[0].from.repo).toBe('frontend'); + expect(matched[0].to.repo).toBe('backend'); + }); + + it('test_runWildcardMatch_bare_name_match', () => { + const consumer = makeGrpcContract('grpc::UserService/*', 'consumer', 'frontend'); + const provider = makeGrpcContract( + 'grpc::com.example.UserService/GetUser', + 'provider', + 'backend', + ); + + const providerIndex = buildProviderIndex([provider]); + const { matched } = runWildcardMatch([consumer], providerIndex); + + expect(matched).toHaveLength(1); + expect(matched[0].from.repo).toBe('frontend'); + expect(matched[0].to.repo).toBe('backend'); + }); + + it('test_runWildcardMatch_no_match_different_service', () => { + const consumer = makeGrpcContract('grpc::UserService/*', 'consumer', 'frontend'); + const provider = makeGrpcContract( + 'grpc::com.example.OtherService/GetUser', + 'provider', + 'backend', + ); + + const providerIndex = buildProviderIndex([provider]); + const { matched, remaining } = runWildcardMatch([consumer], providerIndex); + + expect(matched).toHaveLength(0); + expect(remaining).toContainEqual(consumer); + }); + + it('test_runWildcardMatch_skips_wildcard_providers', () => { + const consumer = makeGrpcContract('grpc::com.example.UserService/*', 'consumer', 'frontend'); + const provider = makeGrpcContract('grpc::com.example.UserService/*', 'provider', 'backend'); + + const providerIndex = buildProviderIndex([provider]); + const { matched } = runWildcardMatch([consumer], providerIndex); + + // Wildcard provider key ends with /*, so it should be skipped + expect(matched).toHaveLength(0); + }); + + it('test_runWildcardMatch_confidence_min', () => { 
+ const consumer = makeGrpcContract('grpc::com.example.UserService/*', 'consumer', 'frontend', { + confidence: 0.7, + }); + const provider = makeGrpcContract( + 'grpc::com.example.UserService/GetUser', + 'provider', + 'backend', + { + confidence: 0.5, + }, + ); + + const providerIndex = buildProviderIndex([provider]); + const { matched } = runWildcardMatch([consumer], providerIndex); + + expect(matched).toHaveLength(1); + expect(matched[0].confidence).toBe(0.5); + }); + + it('test_runWildcardMatch_matchType_wildcard', () => { + const consumer = makeGrpcContract('grpc::com.example.UserService/*', 'consumer', 'frontend'); + const provider = makeGrpcContract( + 'grpc::com.example.UserService/GetUser', + 'provider', + 'backend', + ); + + const providerIndex = buildProviderIndex([provider]); + const { matched } = runWildcardMatch([consumer], providerIndex); + + expect(matched).toHaveLength(1); + expect(matched[0].matchType).toBe('wildcard'); + }); + + it('test_runWildcardMatch_contractId_is_consumers', () => { + const consumer = makeGrpcContract('grpc::com.example.UserService/*', 'consumer', 'frontend'); + const provider = makeGrpcContract( + 'grpc::com.example.UserService/GetUser', + 'provider', + 'backend', + ); + + const providerIndex = buildProviderIndex([provider]); + const { matched } = runWildcardMatch([consumer], providerIndex); + + expect(matched).toHaveLength(1); + expect(matched[0].contractId).toBe('grpc::com.example.UserService/*'); + }); +}); diff --git a/gitnexus/test/unit/group/service.test.ts b/gitnexus/test/unit/group/service.test.ts index e4b10443ca..940c79d1c7 100644 --- a/gitnexus/test/unit/group/service.test.ts +++ b/gitnexus/test/unit/group/service.test.ts @@ -7,11 +7,23 @@ import { type GroupToolPort, type GroupRepoHandle, } from '../../../src/core/group/service.js'; -import { writeContractRegistry } from '../../../src/core/group/storage.js'; +import { writeBridge } from '../../../src/core/group/bridge-db.js'; import type { ContractRegistry, 
StoredContract, CrossLink } from '../../../src/core/group/types.js'; +/** Test helper: write legacy contracts.json for JSON-fallback tests */ +async function writeContractRegistryJson( + groupDir: string, + registry: ContractRegistry, +): Promise<void> { + const targetPath = path.join(groupDir, 'contracts.json'); + fs.writeFileSync(targetPath, JSON.stringify(registry, null, 2), 'utf-8'); +} + function makeTmpGroup(): { tmpDir: string; groupDir: string; cleanup: () => void } { - const tmpDir = path.join(os.tmpdir(), `gitnexus-svc-${Date.now()}`); + const tmpDir = path.join( + os.tmpdir(), + `gitnexus-svc-${Date.now()}-${Math.random().toString(36).slice(2, 8)}`, + ); const groupDir = path.join(tmpDir, 'groups', 'test-group'); fs.mkdirSync(groupDir, { recursive: true }); @@ -113,20 +125,20 @@ describe('GroupService', () => { expect(result.error).toContain('name is required'); }); - it('test_groupContracts_returns_error_when_no_registry', async () => { + it('test_groupContracts_no_data_returns_error', async () => { const { cleanup, tmpDir } = makeTmpGroup(); try { vi.stubEnv('GITNEXUS_HOME', tmpDir); const svc = new GroupService(makePort()); const result = (await svc.groupContracts({ name: 'test-group' })) as { error: string }; - expect(result.error).toContain('No contracts.json'); + expect(result.error).toContain('No contract data'); } finally { vi.unstubAllEnvs(); cleanup(); } }); - it('test_groupContracts_returns_all_contracts', async () => { + it('test_groupContracts_json_fallback_returns_all_contracts', async () => { const { groupDir, cleanup, tmpDir } = makeTmpGroup(); try { vi.stubEnv('GITNEXUS_HOME', tmpDir); @@ -134,7 +146,7 @@ describe('GroupService', () => { makeContract('http::GET::/api/users', 'provider', 'app/backend'), makeContract('http::GET::/api/users', 'consumer', 'app/frontend'), ]; - await writeContractRegistry(groupDir, makeRegistry(contracts)); + await writeContractRegistryJson(groupDir, makeRegistry(contracts)); const svc = new 
GroupService(makePort()); const result = (await svc.groupContracts({ name: 'test-group' })) as { @@ -147,7 +159,7 @@ describe('GroupService', () => { } }); - it('test_groupContracts_filters_by_type', async () => { + it('test_groupContracts_json_fallback_filters_by_type', async () => { const { groupDir, cleanup, tmpDir } = makeTmpGroup(); try { vi.stubEnv('GITNEXUS_HOME', tmpDir); @@ -158,7 +170,7 @@ describe('GroupService', () => { type: 'grpc' as const, }, ]; - await writeContractRegistry(groupDir, makeRegistry(contracts)); + await writeContractRegistryJson(groupDir, makeRegistry(contracts)); const svc = new GroupService(makePort()); const result = (await svc.groupContracts({ name: 'test-group', type: 'grpc' })) as { @@ -172,7 +184,7 @@ describe('GroupService', () => { } }); - it('test_groupContracts_filters_by_repo', async () => { + it('test_groupContracts_json_fallback_filters_by_repo', async () => { const { groupDir, cleanup, tmpDir } = makeTmpGroup(); try { vi.stubEnv('GITNEXUS_HOME', tmpDir); @@ -180,7 +192,7 @@ describe('GroupService', () => { makeContract('http::GET::/api/users', 'provider', 'app/backend'), makeContract('http::GET::/api/users', 'consumer', 'app/frontend'), ]; - await writeContractRegistry(groupDir, makeRegistry(contracts)); + await writeContractRegistryJson(groupDir, makeRegistry(contracts)); const svc = new GroupService(makePort()); const result = (await svc.groupContracts({ name: 'test-group', repo: 'app/backend' })) as { @@ -194,7 +206,7 @@ describe('GroupService', () => { } }); - it('test_groupContracts_unmatchedOnly_filters_matched', async () => { + it('test_groupContracts_json_fallback_unmatchedOnly_filters_matched', async () => { const { groupDir, cleanup, tmpDir } = makeTmpGroup(); try { vi.stubEnv('GITNEXUS_HOME', tmpDir); @@ -217,7 +229,7 @@ describe('GroupService', () => { matchType: 'exact', confidence: 1.0, }; - await writeContractRegistry( + await writeContractRegistryJson( groupDir, makeRegistry([provider, consumer, orphan], 
[crossLink]), ); @@ -233,6 +245,107 @@ describe('GroupService', () => { cleanup(); } }); + + it('test_groupContracts_with_bridge_returns_contracts', async () => { + const { groupDir, cleanup, tmpDir } = makeTmpGroup(); + try { + vi.stubEnv('GITNEXUS_HOME', tmpDir); + const contracts = [ + makeContract('http::GET::/api/users', 'provider', 'app/backend'), + makeContract('http::GET::/api/users', 'consumer', 'app/frontend'), + ]; + await writeBridge(groupDir, { + contracts, + crossLinks: [], + repoSnapshots: {}, + missingRepos: [], + }); + + const svc = new GroupService(makePort()); + const result = (await svc.groupContracts({ name: 'test-group' })) as { + contracts: unknown[]; + crossLinks: unknown[]; + }; + expect(result.contracts).toHaveLength(2); + expect(result.crossLinks).toEqual([]); + } finally { + vi.unstubAllEnvs(); + cleanup(); + } + }); + + it('test_groupContracts_bridge_path_filters_by_type', async () => { + const { groupDir, cleanup, tmpDir } = makeTmpGroup(); + try { + vi.stubEnv('GITNEXUS_HOME', tmpDir); + const contracts = [ + makeContract('http::GET::/api/users', 'provider', 'app/backend'), + { + ...makeContract('grpc::auth.AuthService/Login', 'provider', 'app/backend'), + type: 'grpc' as const, + }, + ]; + await writeBridge(groupDir, { + contracts, + crossLinks: [], + repoSnapshots: {}, + missingRepos: [], + }); + + const svc = new GroupService(makePort()); + const result = (await svc.groupContracts({ name: 'test-group', type: 'grpc' })) as { + contracts: { type: string }[]; + }; + expect(result.contracts).toHaveLength(1); + expect(result.contracts[0].type).toBe('grpc'); + } finally { + vi.unstubAllEnvs(); + cleanup(); + } + }); + + it('test_groupContracts_bridge_path_unmatchedOnly_filters_matched', async () => { + const { groupDir, cleanup, tmpDir } = makeTmpGroup(); + try { + vi.stubEnv('GITNEXUS_HOME', tmpDir); + const provider = makeContract('http::GET::/api/users', 'provider', 'app/backend'); + const consumer = 
makeContract('http::GET::/api/users', 'consumer', 'app/frontend'); + const orphan = makeContract('http::GET::/api/health', 'provider', 'app/backend'); + const crossLink: CrossLink = { + from: { + repo: 'app/frontend', + symbolUid: consumer.symbolUid, + symbolRef: consumer.symbolRef, + }, + to: { + repo: 'app/backend', + symbolUid: provider.symbolUid, + symbolRef: provider.symbolRef, + }, + type: 'http', + contractId: 'http::GET::/api/users', + matchType: 'exact', + confidence: 1.0, + }; + await writeBridge(groupDir, { + contracts: [provider, consumer, orphan], + crossLinks: [crossLink], + repoSnapshots: {}, + missingRepos: [], + }); + + const svc = new GroupService(makePort()); + const result = (await svc.groupContracts({ name: 'test-group', unmatchedOnly: true })) as { + contracts: { contractId: string }[]; + }; + // Only the orphan should remain after filtering out matched ones + expect(result.contracts).toHaveLength(1); + expect(result.contracts[0].contractId).toBe('http::GET::/api/health'); + } finally { + vi.unstubAllEnvs(); + cleanup(); + } + }); }); describe('groupSync', () => { @@ -330,6 +443,201 @@ describe('GroupService', () => { }); }); + describe('groupImpact', () => { + it('test_groupImpact_no_data_returns_error', async () => { + const { cleanup, tmpDir } = makeTmpGroup(); + try { + vi.stubEnv('GITNEXUS_HOME', tmpDir); + const svc = new GroupService(makePort()); + const result = (await svc.groupImpact({ + name: 'test-group', + target: 'someSymbol', + repo: 'app/backend', + })) as { error: string }; + expect(result.error).toContain('No contract data'); + } finally { + vi.unstubAllEnvs(); + cleanup(); + } + }); + + it('test_groupImpact_with_json_fallback_uses_legacy', async () => { + const { groupDir, cleanup, tmpDir } = makeTmpGroup(); + try { + vi.stubEnv('GITNEXUS_HOME', tmpDir); + const contracts = [makeContract('http::GET::/api/users', 'provider', 'app/backend')]; + await writeContractRegistryJson(groupDir, makeRegistry(contracts)); + + const port = 
makePort({ + impact: vi.fn(async () => ({ + target: { id: 'uid-1', name: 'someSymbol', filePath: 'src/app.ts' }, + direction: 'upstream', + impactedCount: 0, + risk: 'LOW', + summary: { direct: 0, processes_affected: 0, modules_affected: 0 }, + affected_processes: [], + affected_modules: [], + byDepth: {}, + })), + }); + + const svc = new GroupService(port); + const result = (await svc.groupImpact({ + name: 'test-group', + target: 'someSymbol', + repo: 'app/backend', + })) as { local: unknown; group: string; risk: string }; + + expect(result.group).toBe('test-group'); + expect(result.local).toBeDefined(); + expect(result.risk).toBeDefined(); + expect(port.impact).toHaveBeenCalled(); + } finally { + vi.unstubAllEnvs(); + cleanup(); + } + }); + + it('test_groupImpact_with_bridge_uses_new_runGroupImpact', async () => { + const { groupDir, cleanup, tmpDir } = makeTmpGroup(); + try { + vi.stubEnv('GITNEXUS_HOME', tmpDir); + const contracts = [makeContract('http::GET::/api/users', 'provider', 'app/backend')]; + await writeBridge(groupDir, { + contracts, + crossLinks: [], + repoSnapshots: {}, + missingRepos: [], + }); + + const port = makePort({ + impact: vi.fn(async () => ({ + target: { id: 'uid-1', name: 'someSymbol', filePath: 'src/app.ts' }, + direction: 'upstream', + impactedCount: 0, + risk: 'LOW', + summary: { direct: 0, processes_affected: 0, modules_affected: 0 }, + affected_processes: [], + affected_modules: [], + byDepth: {}, + })), + }); + + const svc = new GroupService(port); + const result = (await svc.groupImpact({ + name: 'test-group', + target: 'someSymbol', + repo: 'app/backend', + })) as { local: unknown; group: string; risk: string; cross: unknown[] }; + + expect(result.group).toBe('test-group'); + expect(result.local).toBeDefined(); + expect(result.risk).toBeDefined(); + expect(result.cross).toEqual([]); + expect(port.impact).toHaveBeenCalled(); + } finally { + vi.unstubAllEnvs(); + cleanup(); + } + }); + + 
it('test_groupImpact_returns_error_when_params_missing', async () => { + const svc = new GroupService(makePort()); + const result = (await svc.groupImpact({})) as { error: string }; + expect(result.error).toContain('name, target, and repo are required'); + }); + + it('test_groupImpact_rejects_unknown_direction', async () => { + const svc = new GroupService(makePort()); + const result = (await svc.groupImpact({ + name: 'x', + target: 'x', + repo: 'x', + direction: 'upstreem', // typo + })) as { error: string }; + expect(result.error).toMatch(/direction must be/); + }); + + it('test_groupImpact_rejects_out_of_range_maxDepth', async () => { + const svc = new GroupService(makePort()); + for (const bad of [-1, 0, 11, 1000, 1.5, Number.NaN]) { + const result = (await svc.groupImpact({ + name: 'x', + target: 'x', + repo: 'x', + maxDepth: bad, + })) as { error?: string }; + expect(result.error).toMatch(/maxDepth must be/); + } + }); + + it('test_groupImpact_rejects_out_of_range_minConfidence', async () => { + const svc = new GroupService(makePort()); + for (const bad of [-0.1, 1.1, -5, 10]) { + const result = (await svc.groupImpact({ + name: 'x', + target: 'x', + repo: 'x', + minConfidence: bad, + })) as { error?: string }; + expect(result.error).toMatch(/minConfidence must be/); + } + }); + + it('test_groupImpact_rejects_out_of_range_timeout', async () => { + const svc = new GroupService(makePort()); + for (const bad of [0, 99, 300001, 1e9]) { + const result = (await svc.groupImpact({ + name: 'x', + target: 'x', + repo: 'x', + timeout: bad, + })) as { error?: string }; + expect(result.error).toMatch(/timeout must be/); + } + }); + + it('test_groupImpact_rejects_out_of_range_crossDepth', async () => { + const svc = new GroupService(makePort()); + for (const bad of [-1, 11, 1.5, Number.NaN]) { + const result = (await svc.groupImpact({ + name: 'x', + target: 'x', + repo: 'x', + crossDepth: bad, + })) as { error?: string }; + expect(result.error).toMatch(/crossDepth must 
be/); + } + }); + + it('test_groupImpact_wraps_localImpactFn_exception_from_missing_repo', async () => { + // If the configured repoGroupPath is not in the group's config, the + // resolveGroupRepo helper throws. That exception must NOT bubble past + // runPhase1WithTimeout — it should be caught inside safeLocalImpact + // and surfaced as a local.error field on the result. + const { groupDir, cleanup, tmpDir } = makeTmpGroup(); + try { + vi.stubEnv('GITNEXUS_HOME', tmpDir); + await writeContractRegistryJson( + groupDir, + makeRegistry([makeContract('http::GET::/api/x', 'provider', 'app/backend')]), + ); + const svc = new GroupService(makePort()); + const result = (await svc.groupImpact({ + name: 'test-group', + target: 'whatever', + repo: 'not/in/config', + })) as { local?: { error?: string } }; + // Should not throw; instead local.error is populated. + expect(result).toBeDefined(); + expect(result.local?.error).toMatch(/local impact failed/); + } finally { + vi.unstubAllEnvs(); + cleanup(); + } + }); + }); + describe('groupStatus', () => { it('test_groupStatus_returns_error_when_name_empty', async () => { const svc = new GroupService(makePort()); @@ -337,10 +645,105 @@ describe('GroupService', () => { expect(result.error).toContain('name is required'); }); - it('test_groupStatus_marks_unresolvable_repos_as_missing', async () => { + it('test_groupStatus_no_data_returns_empty', async () => { const { cleanup, tmpDir } = makeTmpGroup(); try { vi.stubEnv('GITNEXUS_HOME', tmpDir); + const svc = new GroupService(makePort()); + const result = (await svc.groupStatus({ name: 'test-group' })) as { + group: string; + lastSync: null; + missingRepos: string[]; + repos: Record; + }; + expect(result.group).toBe('test-group'); + expect(result.lastSync).toBeNull(); + expect(result.missingRepos).toEqual([]); + expect(result.repos).toEqual({}); + } finally { + vi.unstubAllEnvs(); + cleanup(); + } + }); + + it('test_groupStatus_json_fallback_marks_unresolvable_repos_as_missing', async 
() => { + const { groupDir, cleanup, tmpDir } = makeTmpGroup(); + try { + vi.stubEnv('GITNEXUS_HOME', tmpDir); + await writeContractRegistryJson(groupDir, makeRegistry([])); + + const port = makePort({ + resolveRepo: vi.fn(async () => { + throw new Error('repo not found'); + }), + }); + + const svc = new GroupService(port); + const result = (await svc.groupStatus({ name: 'test-group' })) as { + group: string; + repos: Record; + }; + + expect(result.group).toBe('test-group'); + expect(result.repos['app/backend'].missing).toBe(true); + expect(result.repos['app/frontend'].missing).toBe(true); + } finally { + vi.unstubAllEnvs(); + cleanup(); + } + }); + + it('test_groupStatus_reads_from_bridge_meta_and_snapshots', async () => { + const { groupDir, cleanup, tmpDir } = makeTmpGroup(); + try { + vi.stubEnv('GITNEXUS_HOME', tmpDir); + await writeBridge(groupDir, { + contracts: [], + crossLinks: [], + repoSnapshots: { + 'app/backend': { indexedAt: '2026-01-01T00:00:00Z', lastCommit: 'abc123' }, + }, + missingRepos: ['app/frontend'], + }); + + const port = makePort({ + resolveRepo: vi.fn(async () => { + throw new Error('repo not found'); + }), + }); + + const svc = new GroupService(port); + const result = (await svc.groupStatus({ name: 'test-group' })) as { + group: string; + lastSync: string; + missingRepos: string[]; + repos: Record; + }; + + expect(result.group).toBe('test-group'); + expect(result.lastSync).toBeTruthy(); + expect(result.missingRepos).toContain('app/frontend'); + expect(result.repos['app/backend'].missing).toBe(true); + expect(result.repos['app/frontend'].missing).toBe(true); + } finally { + vi.unstubAllEnvs(); + cleanup(); + } + }); + + it('test_groupStatus_bridge_path_reads_repoSnapshots', async () => { + const { groupDir, cleanup, tmpDir } = makeTmpGroup(); + try { + vi.stubEnv('GITNEXUS_HOME', tmpDir); + await writeBridge(groupDir, { + contracts: [], + crossLinks: [], + repoSnapshots: { + 'app/backend': { indexedAt: '2026-02-01T00:00:00Z', lastCommit: 
'abc123' }, + 'app/frontend': { indexedAt: '2026-02-01T00:00:00Z', lastCommit: 'def456' }, + }, + missingRepos: [], + }); const port = makePort({ resolveRepo: vi.fn(async () => { @@ -351,10 +754,15 @@ describe('GroupService', () => { const svc = new GroupService(port); const result = (await svc.groupStatus({ name: 'test-group' })) as { group: string; + lastSync: string; + missingRepos: string[]; repos: Record; }; expect(result.group).toBe('test-group'); + expect(result.lastSync).toBeTruthy(); + expect(result.missingRepos).toEqual([]); + // Both repos should be marked missing since resolveRepo throws expect(result.repos['app/backend'].missing).toBe(true); expect(result.repos['app/frontend'].missing).toBe(true); } finally { diff --git a/gitnexus/test/unit/group/storage.test.ts b/gitnexus/test/unit/group/storage.test.ts index 8a4500e053..c526cbc14b 100644 --- a/gitnexus/test/unit/group/storage.test.ts +++ b/gitnexus/test/unit/group/storage.test.ts @@ -5,12 +5,12 @@ import * as os from 'node:os'; import { getGroupDir, getGroupsBaseDir, - writeContractRegistry, - readContractRegistry, listGroups, createGroupDir, validateGroupName, + openBridgeOrFallback, } from '../../../src/core/group/storage.js'; +import { writeBridge, closeBridgeDb } from '../../../src/core/group/bridge-db.js'; import type { ContractRegistry } from '../../../src/core/group/types.js'; describe('Group storage', () => { @@ -34,36 +34,6 @@ describe('Group storage', () => { expect(dir).toBe(path.join(tmpDir, 'groups', 'company')); }); - it('writeContractRegistry writes atomically and readContractRegistry reads back', async () => { - const groupDir = path.join(tmpDir, 'groups', 'test-group'); - fs.mkdirSync(groupDir, { recursive: true }); - - const registry: ContractRegistry = { - version: 1, - generatedAt: '2026-03-31T10:00:00Z', - repoSnapshots: {}, - missingRepos: [], - contracts: [], - crossLinks: [], - }; - - await writeContractRegistry(groupDir, registry); - - const filePath = path.join(groupDir, 
'contracts.json'); - expect(fs.existsSync(filePath)).toBe(true); - - const loaded = await readContractRegistry(groupDir); - expect(loaded?.version).toBe(1); - expect(loaded?.generatedAt).toBe('2026-03-31T10:00:00Z'); - }); - - it('readContractRegistry returns null when file does not exist', async () => { - const groupDir = path.join(tmpDir, 'groups', 'nonexistent'); - fs.mkdirSync(groupDir, { recursive: true }); - const result = await readContractRegistry(groupDir); - expect(result).toBeNull(); - }); - it('listGroups returns group names', async () => { const groupsDir = path.join(tmpDir, 'groups'); fs.mkdirSync(path.join(groupsDir, 'company'), { recursive: true }); @@ -135,4 +105,61 @@ describe('Group storage', () => { await expect(createGroupDir(tmpDir, '../evil')).rejects.toThrow(/Invalid group name/); }); }); + + describe('openBridgeOrFallback', () => { + it('test_openBridgeOrFallback_bridge_exists_returns_bridge', async () => { + const groupDir = path.join(tmpDir, 'bridge-test'); + fs.mkdirSync(groupDir, { recursive: true }); + + await writeBridge(groupDir, { + contracts: [], + crossLinks: [], + repoSnapshots: {}, + missingRepos: [], + }); + + const result = await openBridgeOrFallback(groupDir); + expect(result.type).toBe('bridge'); + if (result.type === 'bridge') { + expect(result.handle).toBeDefined(); + expect(result.meta).toBeDefined(); + await closeBridgeDb(result.handle); + } + }); + + it('test_openBridgeOrFallback_json_only_returns_json_with_deprecation', async () => { + const groupDir = path.join(tmpDir, 'json-test'); + fs.mkdirSync(groupDir, { recursive: true }); + + const registry: ContractRegistry = { + version: 1, + generatedAt: '2026-04-01T00:00:00Z', + repoSnapshots: {}, + missingRepos: [], + contracts: [], + crossLinks: [], + }; + fs.writeFileSync( + path.join(groupDir, 'contracts.json'), + JSON.stringify(registry, null, 2), + 'utf-8', + ); + + const result = await openBridgeOrFallback(groupDir); + expect(result.type).toBe('json'); + if 
(result.type === 'json') { + expect(result.registry.version).toBe(1); + expect(result.deprecationWarning).toContain('deprecated'); + expect(result.deprecationWarning).toContain('bridge.lbug'); + } + }); + + it('test_openBridgeOrFallback_neither_exists_returns_none', async () => { + const groupDir = path.join(tmpDir, 'empty-test'); + fs.mkdirSync(groupDir, { recursive: true }); + + const result = await openBridgeOrFallback(groupDir); + expect(result.type).toBe('none'); + }); + }); }); diff --git a/gitnexus/test/unit/group/sync.test.ts b/gitnexus/test/unit/group/sync.test.ts index 50c9093b93..0ecb7f9829 100644 --- a/gitnexus/test/unit/group/sync.test.ts +++ b/gitnexus/test/unit/group/sync.test.ts @@ -3,6 +3,7 @@ import * as fs from 'node:fs'; import * as path from 'node:path'; import * as os from 'node:os'; import { syncGroup, stableRepoPoolId } from '../../../src/core/group/sync.js'; +import { bridgeExists } from '../../../src/core/group/bridge-db.js'; import type { GroupConfig, StoredContract, RepoHandle } from '../../../src/core/group/types.js'; import type { RegistryEntry } from '../../../src/storage/repo-manager.js'; @@ -113,6 +114,28 @@ describe('syncGroup', () => { expect(result.crossLinks[0].to.service).toBe('services/auth'); }); + it('deduplicates duplicate contracts and links before returning', async () => { + const config = makeConfig({ 'app/backend': 'backend-repo', 'app/frontend': 'frontend-repo' }); + + const duplicateProvider = makeContract('http::GET::/api/users', 'provider', 'app/backend'); + const duplicateConsumer = makeContract('http::GET::/api/users', 'consumer', 'app/frontend'); + + const result = await syncGroup(config, { + extractorOverride: async () => [ + duplicateProvider, + { ...duplicateProvider, confidence: 0.9, meta: { source: 'manifest' } }, + duplicateConsumer, + { ...duplicateConsumer, confidence: 0.75, meta: { source: 'manifest' } }, + ], + skipWrite: true, + }); + + expect(result.contracts).toHaveLength(2); + 
expect(result.crossLinks).toHaveLength(1); + expect(result.contracts.find((contract) => contract.role === 'provider')?.confidence).toBe(0.9); + expect(result.contracts.find((contract) => contract.role === 'consumer')?.confidence).toBe(0.8); + }); + function makeContract(id: string, role: 'provider' | 'consumer', repo: string): StoredContract { return { contractId: id, @@ -202,7 +225,50 @@ describe('syncGroup', () => { } }); - it('writes registry to groupDir when skipWrite is false', async () => { + it('reports initLbug failures via extractorFailures and marks repo missing', async () => { + const config = makeConfig({ + 'app/backend': 'backend-repo', + 'app/frontend': 'frontend-repo', + }); + + const { vi } = await import('vitest'); + const poolAdapter = await import('../../../src/core/lbug/pool-adapter.js'); + const initSpy = vi.spyOn(poolAdapter, 'initLbug').mockImplementation(async (id: string) => { + if (id === 'app-backend') throw new Error('lbug corruption: CRC mismatch'); + }); + const closeSpy = vi.spyOn(poolAdapter, 'closeLbug').mockResolvedValue(undefined); + + try { + const result = await syncGroup(config, { + resolveRepoHandle: async (_name, groupPath) => ({ + id: groupPath.replace(/\//g, '-'), + path: groupPath, + repoPath: '/tmp/' + groupPath, + storagePath: '/tmp/' + groupPath + '/.gitnexus', + }), + skipWrite: true, + }).catch(() => undefined); + + expect(result).toBeDefined(); + // app/backend should be missing (initLbug threw) AND reported in + // extractorFailures so the user can see the real reason. + expect(result!.missingRepos).toContain('app/backend'); + expect(result!.extractorFailures).toBeDefined(); + const failure = result!.extractorFailures!.find((f) => f.repo === 'app/backend'); + expect(failure).toBeDefined(); + expect(failure!.message).toMatch(/CRC mismatch/); + // initLbug failure should be labeled 'init', not 'boundaries' + // (which is reserved for detectServiceBoundaries failures). 
+ expect(failure!.extractor).toBe('init'); + // app/frontend init succeeded — it must NOT be in missingRepos. + expect(result!.missingRepos).not.toContain('app/frontend'); + } finally { + initSpy.mockRestore(); + closeSpy.mockRestore(); + } + }); + + it('writes bridge.lbug to groupDir when skipWrite is false', async () => { const tmpDir = path.join(os.tmpdir(), `gitnexus-sync-write-${Date.now()}`); fs.mkdirSync(tmpDir, { recursive: true }); @@ -215,13 +281,7 @@ describe('syncGroup', () => { }); expect(result.contracts).toHaveLength(0); - - const registryPath = path.join(tmpDir, 'contracts.json'); - expect(fs.existsSync(registryPath)).toBe(true); - - const registry = JSON.parse(fs.readFileSync(registryPath, 'utf-8')); - expect(registry.version).toBe(1); - expect(registry.contracts).toHaveLength(0); + expect(await bridgeExists(tmpDir)).toBe(true); } finally { fs.rmSync(tmpDir, { recursive: true, force: true }); } diff --git a/gitnexus/test/unit/group/topic-extractor.test.ts b/gitnexus/test/unit/group/topic-extractor.test.ts index c6a1161a00..bf821de63a 100644 --- a/gitnexus/test/unit/group/topic-extractor.test.ts +++ b/gitnexus/test/unit/group/topic-extractor.test.ts @@ -75,8 +75,7 @@ public void handleUserCreated(ConsumerRecord record) { it('test_extract_kafkajs_subscribe_returns_consumer', async () => { writeFile( 'src/consumer.ts', - `await consumer.subscribe({ topic: 'order.placed', fromBeginning: true }); -await consumer.run({ eachMessage: async ({ message }) => {} });`, + `await consumer.subscribe({ topic: 'order.placed', fromBeginning: true });`, ); const contracts = await extractor.extract(null, tmpDir, makeRepo(tmpDir)); @@ -101,6 +100,23 @@ await consumer.run({ eachMessage: async ({ message }) => {} });`, }); }); + describe('KafkaJS consumer run', () => { + it('test_extract_kafkajs_consumer_run_eachmessage_returns_consumer', async () => { + writeFile( + 'src/consumer.ts', + `await consumer.subscribe({ topic: 'user.logged-in' }); +await consumer.run({ 
eachMessage: async () => {} });`, + ); + + const contracts = await extractor.extract(null, tmpDir, makeRepo(tmpDir)); + const consumers = contracts.filter((c) => c.role === 'consumer'); + + expect(consumers).toHaveLength(1); + expect(consumers[0].contractId).toBe('topic::user.logged-in'); + expect(consumers[0].meta.broker).toBe('kafka'); + }); + }); + describe('RabbitMQ — Java', () => { it('test_extract_rabbit_listener_returns_consumer', async () => { writeFile( @@ -174,6 +190,62 @@ public void processOrder(OrderMessage msg) {}`, }); }); + describe('JetStream', () => { + it('test_extract_jetstream_publish_returns_provider', async () => { + writeFile('src/stream.go', `js.Publish("orders.created", payload)`); + + const contracts = await extractor.extract(null, tmpDir, makeRepo(tmpDir)); + const producers = contracts.filter((c) => c.role === 'provider'); + + expect(producers).toHaveLength(1); + expect(producers[0].contractId).toBe('topic::orders.created'); + expect(producers[0].meta.broker).toBe('nats'); + }); + + it('test_extract_jetstream_subscribe_returns_consumer', async () => { + writeFile('src/stream.go', `js.Subscribe("orders.created", handler)`); + + const contracts = await extractor.extract(null, tmpDir, makeRepo(tmpDir)); + const consumers = contracts.filter((c) => c.role === 'consumer'); + + expect(consumers).toHaveLength(1); + expect(consumers[0].contractId).toBe('topic::orders.created'); + expect(consumers[0].meta.broker).toBe('nats'); + }); + }); + + describe('Python NATS', () => { + it('test_extract_python_nats_subscribe_returns_consumer', async () => { + writeFile( + 'src/subscriber.py', + `nc = await nats.connect() +await nc.subscribe("orders.created", cb=handler)`, + ); + + const contracts = await extractor.extract(null, tmpDir, makeRepo(tmpDir)); + const consumers = contracts.filter((c) => c.role === 'consumer'); + + expect(consumers).toHaveLength(1); + expect(consumers[0].contractId).toBe('topic::orders.created'); + 
expect(consumers[0].meta.broker).toBe('nats'); + }); + + it('test_extract_python_nats_publish_returns_provider', async () => { + writeFile( + 'src/publisher.py', + `nc = await nats.connect() +await nc.publish("orders.created", payload)`, + ); + + const contracts = await extractor.extract(null, tmpDir, makeRepo(tmpDir)); + const producers = contracts.filter((c) => c.role === 'provider'); + + expect(producers).toHaveLength(1); + expect(producers[0].contractId).toBe('topic::orders.created'); + expect(producers[0].meta.broker).toBe('nats'); + }); + }); + describe('NATS', () => { it('test_extract_nats_subscribe_go_returns_consumer', async () => { writeFile( @@ -248,6 +320,96 @@ partConsumer, _ := consumer.ConsumePartition("inventory.update", 0, sarama.Offse expect(consumers[0].contractId).toBe('topic::inventory.update'); expect(consumers[0].meta.broker).toBe('kafka'); }); + + it('test_extract_sarama_sync_producer_returns_provider', async () => { + writeFile( + 'internal/publisher.go', + `package publisher +producer, _ := sarama.NewSyncProducer(brokers, cfg) +producer.SendMessage(&sarama.ProducerMessage{Topic: "inventory.update"})`, + ); + + const contracts = await extractor.extract(null, tmpDir, makeRepo(tmpDir)); + const producers = contracts.filter((c) => c.role === 'provider'); + + expect(producers).toHaveLength(1); + expect(producers[0].contractId).toBe('topic::inventory.update'); + expect(producers[0].meta.broker).toBe('kafka'); + }); + + it('test_extract_sarama_async_producer_returns_provider', async () => { + writeFile( + 'internal/publisher.go', + `package publisher +producer, _ := sarama.NewAsyncProducer(brokers, cfg) +producer.Input() <- &sarama.ProducerMessage{Topic: "inventory.update"}`, + ); + + const contracts = await extractor.extract(null, tmpDir, makeRepo(tmpDir)); + const producers = contracts.filter((c) => c.role === 'provider'); + + expect(producers).toHaveLength(1); + expect(producers[0].contractId).toBe('topic::inventory.update'); + 
expect(producers[0].meta.broker).toBe('kafka'); + }); + + it('test_extract_sarama_producer_in_loop_captures_all_topics', async () => { + // Regression: a for loop that constructs multiple ProducerMessage + // literals inside a single NewSyncProducer scope. The previous + // regex anchored on NewSyncProducer and captured only the first + // Topic within 300 chars, silently dropping the rest. + writeFile( + 'internal/multi-publisher.go', + `package publisher + +func publishAll(producer sarama.SyncProducer, items []Item) error { + _, _ = sarama.NewSyncProducer(brokers, cfg) + for _, item := range items { + msg1 := &sarama.ProducerMessage{Topic: "order.created"} + msg2 := &sarama.ProducerMessage{Topic: "order.shipped"} + _ = msg1 + _ = msg2 + } + return nil +}`, + ); + + const contracts = await extractor.extract(null, tmpDir, makeRepo(tmpDir)); + const producers = contracts.filter((c) => c.role === 'provider'); + const topics = producers.map((c) => c.contractId).sort(); + // Both topics must appear (exact set match to catch any duplicates). 
+ expect(topics).toEqual(['topic::order.created', 'topic::order.shipped']); + }); + + it('test_extract_kafka_go_writer_returns_provider', async () => { + writeFile( + 'internal/writer.go', + `package publisher +writer := &kafka.Writer{Topic: "inventory.update"}`, + ); + + const contracts = await extractor.extract(null, tmpDir, makeRepo(tmpDir)); + const producers = contracts.filter((c) => c.role === 'provider'); + + expect(producers).toHaveLength(1); + expect(producers[0].contractId).toBe('topic::inventory.update'); + expect(producers[0].meta.broker).toBe('kafka'); + }); + + it('test_extract_kafka_go_reader_returns_consumer', async () => { + writeFile( + 'internal/reader.go', + `package consumer +reader := kafka.NewReader(kafka.ReaderConfig{Topic: "inventory.update"})`, + ); + + const contracts = await extractor.extract(null, tmpDir, makeRepo(tmpDir)); + const consumers = contracts.filter((c) => c.role === 'consumer'); + + expect(consumers).toHaveLength(1); + expect(consumers[0].contractId).toBe('topic::inventory.update'); + expect(consumers[0].meta.broker).toBe('kafka'); + }); }); describe('Kafka — Python', () => { @@ -309,5 +471,16 @@ await consumer.subscribe({ topic: 'order.placed' });`, expect(producers).toHaveLength(2); expect(consumers).toHaveLength(1); }); + + it('test_extract_ignores_go_test_files', async () => { + writeFile( + 'src/orders_test.go', + `consumer.ConsumePartition("fake-topic", 0, sarama.OffsetNewest)`, + ); + + const contracts = await extractor.extract(null, tmpDir, makeRepo(tmpDir)); + + expect(contracts).toEqual([]); + }); }); }); diff --git a/gitnexus/test/unit/tools.test.ts b/gitnexus/test/unit/tools.test.ts index 4274716a78..e8939fc96a 100644 --- a/gitnexus/test/unit/tools.test.ts +++ b/gitnexus/test/unit/tools.test.ts @@ -2,7 +2,7 @@ * Unit Tests: MCP Tool Definitions * * Tests: GITNEXUS_TOOLS from tools.ts - * - All 16 tools are defined (per-repo + group_*) + * - All 17 tools are defined (per-repo + group_*) * - Each tool has valid 
name, description, inputSchema * - Required fields are correct * - Optional repo parameter is present on tools that need it @@ -14,13 +14,14 @@ const GROUP_TOOLS = new Set([ 'group_list', 'group_sync', 'group_contracts', + 'group_impact', 'group_query', 'group_status', ]); describe('GITNEXUS_TOOLS', () => { - it('exports all tools (7 base + 3 route/tool/shape + 1 api_impact + 5 group)', () => { - expect(GITNEXUS_TOOLS).toHaveLength(16); + it('exports all tools (7 base + 3 route/tool/shape + 1 api_impact + 6 group)', () => { + expect(GITNEXUS_TOOLS).toHaveLength(17); }); it('contains all expected tool names', () => { @@ -101,6 +102,13 @@ describe('GITNEXUS_TOOLS', () => { } }); + it('group_impact uses repo as required group path', () => { + const groupImpact = GITNEXUS_TOOLS.find((t) => t.name === 'group_impact')!; + expect(groupImpact.inputSchema.required).toContain('repo'); + expect(groupImpact.inputSchema.required).toContain('name'); + expect(groupImpact.inputSchema.required).toContain('target'); + }); + it('group_contracts has optional repo filter', () => { const groupContracts = GITNEXUS_TOOLS.find((t) => t.name === 'group_contracts')!; expect(groupContracts.inputSchema.properties).toHaveProperty('repo');