fix(group): derive grpc consumer FQN from java imports for client-jar consumers#1889
Conversation
…consumers don't fall back to short names
Java gRPC microservices commonly follow the "client-jar" pattern: the
service owner publishes a pre-compiled stub jar to a Maven repository
and consumer repos depend on the jar instead of carrying the
originating `.proto` files. gRPC's official Java quickstart, Alibaba
HSF, ByteDance KiteX-Java and google-cloud-java all document this
shape.
Before this commit, `GrpcExtractor` resolved a fully-qualified
contract id (`grpc::<package>.<Service>/*`) only when the consumer
repo also carried a matching `.proto`. Client-jar consumers had no
proto, so they fell back to a short-name contract id
(`grpc::<Service>/*`) that never matched the provider's contract id.
Cross-repo grpc cross-link counts dropped to zero on every realistic
Java microservice group — including all of crsdp's `crsdp-backend →
unipus_cloud_framework` connections.
Fix: derive the proto package directly from each consumer file's
`import <pkg>.<XxxGrpc>;` statement. The package from the import is
exactly the proto package, so the contract id matches the provider's
verbatim — no `.proto` lookup needed in the consumer repo.
Implementation
--------------
* `grpc-patterns/types.ts` — `GrpcDetection` gains an optional
`protoPackage` field. Plugins set it when the package can be
derived from the source file alone.
* `grpc-patterns/java.ts` — adds `GRPC_CLASS_IMPORT_PATTERNS`, a
tree-sitter query that captures every
`import_declaration > scoped_identifier { scope, name }` pair where
the imported name ends in `Grpc`. `import static …` and
`import w.x.*;` are excluded by tree-sitter shape: the `name:` field
is only present on the non-static, non-wildcard form. The plugin
builds a per-file `XxxGrpc → fullPackage` map and tags every
provider / consumer detection it emits.
* `grpc-extractor.ts` — `detectionToContract()` now resolves the
contract id in three steps:
1. detection-supplied `protoPackage` wins (skips the proto map
entirely so an unrelated same-name service in the consumer
repo can't blur the FQN);
2. otherwise consult the legacy per-repo proto map;
3. otherwise fall back to a short-name contract id, preserving
pre-fix behaviour.
Confidence stays at the "with proto" tier when the import path
resolves: an import statement in real source is at least as
authoritative as a per-repo proto map.
Same-short-name disambiguation
-------------------------------
The motivating case `unipus_cloud_framework` defines two distinct
`ContentRpcService` services in different proto packages
(`cn.unipus.ucf.api.proto.client.service.ContentRpcService` vs
`cn.unipus.ucf.admin.proto.client.service.ContentRpcService`). Two
consumer files importing the two flavours now emit two distinct FQNs;
neither could be told apart from the other under the legacy short-
name fallback.
Out of scope
------------
`import w.x.*;` (wildcard service imports) are left to the legacy
short-name fallback. Wildcard imports are discouraged by Google's
Java style guide and IntelliJ's defaults, and resolving them
unambiguously would require either group-level proto-package
catalogs or per-class disambiguation, both of which are larger
follow-ups. This commit only changes behaviour for the dominant
specific-import case.
Tests
-----
`test/unit/group/grpc-extractor.test.ts` adds a new "Java client-jar
consumer (import-derived FQN)" describe block with 9 cases covering
both the happy paths (consumer/provider FQN derivation, same-short-
name disambiguation, import-vs-local-proto precedence) and the
regression-protection paths (no import + no detection emitted, static
imports / wildcards ignored, mixed-file repos preserved).
End-to-end verification
-----------------------
Ran the patched cli on the real `crsdp-backend` (consumer, no
`.proto`) and `unipus_cloud_framework` (provider, has `.proto`)
repos. Synced as a two-repo group, every `XxxGrpc` referenced via a
specific import in `UcfAdminGrpcClientService.java` produced an FQN
contract id that exact-matched the provider repo's FQN — 9 grpc
cross-links surfaced where there were 0 before.
Verification
------------
* `npx tsc --noEmit`: pass
* `npx tsc` (dist rebuild): pass
* `test/unit/group/grpc-extractor.test.ts`: 60/60 pass (51 existing
+ 9 new)
* `test/unit/group/`: 30 files / 545 tests all green
* `npx prettier --check` on touched files: pass
* `npx eslint` on touched src files: 0 errors / 0 warnings
|
Someone is attempting to deploy a commit to the NexusCore Team on Vercel. A member of the Team first needs to authorize it. |
CI Report✅ All checks passed Pipeline Status
Test Results
✅ All 10075 tests passed 5 test(s) skipped — expand for details
Code CoverageTests
📋 View full run · Generated by CI |
|
Claude finished @magyargergo's task in 8m 30s —— View job PR #1889 Production-Readiness ReviewReviewer persona: staff-level GitNexus production-readiness — adversarial but fair. Tasks
Review bar for this PRReviewed against: DoD.md v2.0.0, GUARDRAILS.md, AGENTS.md, CONTRIBUTING.md. Key gates applied:
Problem being solvedJava gRPC client-jar consumers (no local Current PR state
Merge status and mergeabilityBranch has one merge-from-main commit ( Cannot verify mergeability state, branch protection requirements, or CodeQL/Dependency Review conclusions programmatically from this environment. CI comment says all checks passed. Treat CI green as confirmed unless a check is known to be flaky. Repository history considered
Branch hygiene assessmentClean. The visible commits are Understanding of the changeResolution path added: Before (for Java consumer with no local
After (for Java consumer with
Matching path (confirmed from Service-level contracts ending with
The old short-name Tree-sitter query: The
Static imports are excluded because tree-sitter-java's File-local isolation: The FindingsFinding 1 — Java
|
| Question | Verified answer |
|---|---|
Does buildProtoContext handle option java_package? |
No. grpc-extractor.ts:262 matches package\s+… only. No java_package handling anywhere in the codebase. |
Does runWildcardMatch handle FQN consumer wildcards? |
Yes. fqService = "com.acme.auth.proto.authservice" → exact-compares against providerFqService. Correct when packages match; fails silently when java_package ≠ package. |
| Do service-level wildcards go to exact match or wildcard match? | Wildcard match only. isServiceWildcard at matching.ts:13 returns true for /* suffix. runExactMatch explicitly excludes them. |
Is protoPackageFor only called when a matching stub/provider detection fires? |
Yes. The lookup ${serviceName}Grpc is keyed by the same service name as the stub usage. Unrelated *Grpc imports only pollute if the short name collides AND a detection fires. |
| Is the import map file-local? | Yes. grpcClassImports = new Map() at scan start; rebuilt per scan(tree) call. |
| What confidence values do existing tests assert? | New tests: 0.75 (consumer), 0.8 (provider). Matches confidenceWithProto field values in java.ts. ✓ |
Open questions that remain
None that can be resolved without the author's internal codebase or a concrete option java_package test fixture. The java_package divergence issue is verifiable from the codebase alone — it is a confirmed finding, not a suspicion.
Final verdict
not production-ready
The import-derived package captures the Java compilation package (java_package option), while provider contract IDs are built from the proto package declaration. When these diverge (any project using option java_package), the fix does not merely fail to create cross-links — it actively breaks existing wildcard matches that previously succeeded via the endsWith rule in runWildcardMatch. The !fqService.includes('.') guard at matching.ts:279 means dots-containing FQNs skip the endsWith fallback. Converting grpc::AuthService/* (no dots, endsWith applies) to grpc::java_pkg.AuthService/* (dots, endsWith skipped) is an active regression for any consumer repo where java_package ≠ proto package. No test in the PR covers this divergence. DoD §2.1 ("edge cases relevant to the changed surface are handled or explicitly documented as out of scope") and DoD §2.7 ("tests cover the real changed path") are both unmet for this scenario. The fix is sound and well-implemented for the common case where java_package is absent or equals the proto package; a targeted patch for that scope — or explicit test + doc acknowledgment of the java_package limitation — would move this to production-ready.
|
@henry201605 could you please look into claude's findings? 🙏 |
…grpc detection Addresses Claude bot review on PR abhigyanpatwari#1889: - Finding 1: parse `option java_package` when building proto context; add a reverse index so an import-derived package can be translated back to the proto package. - Finding 2: when same-repo proto map has the service, use the proto package; warn and record `meta.importPackage` if the import disagrees. - Finding 3: add an end-to-end wildcard match test (provider+consumer fixture, runs `buildProviderIndex`+`runWildcardMatch`). Client-jar consumer + diverging `java_package` (no local proto) remains a known limitation; pinned by a dedicated test.
|
Thanks @magyargergo and @claude for the thorough review — pushed What changed
Known limitation (documented in code + tests)Client-jar consumers whose published proto sets Verification
End-to-end re-verified against the two-repo Java group used to validate the original PR; same 9 grpc cross-links surface, no regression. |
|
Can you please benchmark your implementation? 🙏 We need to make sure we won't degrade the performance, especially for big projects. We need good and reliable algorithm to collect these information. |
|
@magyargergo benchmark done — no measurable degradation. Setup
Results
The PR run is, if anything, marginally faster than main; differences are within run-to-run jitter. Output sets are bit-identical. Why the cost is smallThe two added work units are O(1) per detection / O(F) per scan:
No new I/O, no tree-sitter re-parse, no per-file scan amplification. Caveat on scale4 repos / 309 contracts is the largest realistic fixture I could assemble locally — I don't have a 50-repo / 5k-contract group on hand to exercise. The asymptotic argument above (constant per-detection overhead, no I/O) should hold linearly, but if you have a larger benchmark fixture you'd like me to run it against, please point me at it. How to reproduce# baseline
git checkout 2f15c1ec -- gitnexus/src/core/group/extractors/grpc-extractor.ts gitnexus/test/unit/group/grpc-extractor.test.ts
cd gitnexus && npx tsc
for i in 1 2 3 4 5; do node dist/cli/index.js group sync <your-group>; done
# PR head
git checkout 9a3456c6 -- gitnexus/src/core/group/extractors/grpc-extractor.ts gitnexus/test/unit/group/grpc-extractor.test.ts
cd gitnexus && npx tsc
for i in 1 2 3 4 5; do node dist/cli/index.js group sync <your-group>; done |
Problem
Java gRPC microservices commonly follow the client-jar pattern: the service owner publishes a pre-compiled stub jar to a Maven repository, and consumer repos depend on the jar instead of carrying the originating
.protofiles. This is the shape documented by the official gRPC Java quickstart, Alibaba HSF, ByteDance KiteX-Java, and google-cloud-java.Before this PR,
GrpcExtractorcould only produce a fully-qualified contract id (grpc::<package>.<Service>/*) when the consumer repo also carried a matching.proto. Client-jar consumers have no proto, so they fell back to a short-name contract id (grpc::<Service>/*) that never matches the provider's package-qualified id. Cross-repo grpc cross-link counts dropped to zero on every realistic Java microservice group.Concrete repro from a two-repo group (provider + consumer):
Fix
Derive the proto package directly from each consumer file's
import <pkg>.<XxxGrpc>;statement. The package from the import is exactly the proto package, so the contract id matches the provider's verbatim — no.protolookup needed in the consumer repo.Implementation
grpc-patterns/types.ts—GrpcDetectiongains an optionalprotoPackagefield.grpc-patterns/java.ts— addsGRPC_CLASS_IMPORT_PATTERNS, a tree-sitter query that capturesimport_declaration > scoped_identifier { scope, name }pairs where the imported name ends inGrpc. The plugin builds a per-fileXxxGrpc → fullPackagemap and tags every provider / consumer detection it emits.grpc-extractor.ts—detectionToContract()now resolves the contract id in three steps:protoPackagewins (skips the proto map entirely so an unrelated same-name service in the consumer repo can't blur the FQN).Confidence stays at the "with proto" tier when the import path resolves — an import statement in real source code is at least as authoritative as a per-repo proto map.
Same-short-name disambiguation
A real internal codebase defines two distinct
ContentRpcServiceservices in different proto packages (com.example.api.proto.service.ContentRpcServicevscom.example.admin.proto.service.ContentRpcService). Consumer files importing the two flavours now emit two distinct FQNs; neither could be told apart from the other under the legacy short-name fallback.Scope
This PR is Java-only. Other plugins (
go.ts,python.ts,node.ts) keep their existing behaviour and can opt in later by settingprotoPackageon their detections.Tests
test/unit/group/grpc-extractor.test.tsadds a newJava client-jar consumer (import-derived FQN)describe block with 9 cases:End-to-end verification
Ran the patched CLI on a real two-repo Java group: a consumer repo (zero
.proto) plus its provider repo (has.proto). Synced as a group:Each cross-link's
contractIdexactly matches the provider's FQN (e.g.grpc::com.example.admin.proto.service.ApplicationRpcService/*), andmeta.protoPackageSource: "import"confirms the import-derived path resolved them.Verification
npx tsc --noEmit✅npx tsc(dist rebuild) ✅test/unit/group/grpc-extractor.test.ts— 60/60 pass (51 existing + 9 new)test/unit/group/— 30 files / 545 tests all greennpx prettier --checkon touched files ✅npx eslinton touched source files — 0 errors / 0 warnings