fix(java): close parsing-layer coverage gaps F35/F38/F41 (#1928)#2045
Conversation
…wari#1928) Registry-primary scope-resolution path (the live one post-abhigyanpatwari#942/abhigyanpatwari#943): - F35 [HIGH]: qualified / qualified-generic constructor calls. `new pkg.Foo()` parses as a `scoped_type_identifier` that the query bound only as `@reference.call.constructor.qualified` with no `@reference.name`, so the scope extractor fell back to the whole-expression anchor and the reference name became the raw `new pkg.Foo()` text (never resolved). Bind the simple -name tail (end-anchored last child) and add an arm for the previously uncaptured `new pkg.Box<String>()` (qualified + generic) shape. - F38 [MEDIUM]: `super(...)` / `this(...)` explicit constructor invocations, modeled as `explicit_constructor_invocation` and never matched by the scope query, dropped the chained-constructor CALLS edges. Synthesize them with the target resolved structurally (this -> enclosing type name; super -> superclass tail via the shared javaBaseLookupNameNode, skipping implicit Object) plus arity for overload disambiguation. - F41 [LOW]: interpretJavaTypeBinding stripped the qualifier before generics, so a qualified generic type arg (`Map<String, com.example.User>`) was cut inside the generic into `User>`. Strip generics first, then the qualifier; make the erasure fallback qualifier-tolerant. F36/F37 already landed upstream (abhigyanpatwari#1940/abhigyanpatwari#1956); F39/F40 are legacy-bank remnants that are no longer consumed (legacy @import skipped in parse-worker; legacy @call never read in parse-impl) so they are intentionally left untouched. Tests: low-level capture unit tests (constructor shapes incl. double-match guard; super/this/enum/implicit-Object), interpretJavaTypeBinding unit tests (qualified generic args + the corruption case), and end-to-end resolver tests with new fixtures asserting the CALLS edges resolve to the correct constructors. Co-authored-by: Cursor <cursoragent@cursor.com>
|
@Sweetdevil144 is attempting to deploy a commit to the NexusCore Team on Vercel. A member of the Team first needs to authorize it. |
magyargergo
left a comment
There was a problem hiding this comment.
PR #2045 — tri-review (Java F35/F38/F41, #1928)
Methods & engines. Reviewed across three method families and ~9 lanes: GitNexus swarm (risk, test/CI), Compound-Engineering personas (correctness, adversarial, performance, testing), the three Java specialists requested (static-analysis/tree-sitter, JVM/Java-semantics, knowledge-graph builder), and Codex. Independence is asymmetric — every lane except Codex is Claude under a different prompt, so agreement is correlated. Codex was live and acted as the one independent engine (it CONFIRMED the F35 qualifier finding and REFUTED the F41 "regression"). Two swarm lanes (risk, test/CI) ended mid-investigation and contributed only partial notes; their domains were covered by the other lanes. The headline finding comes from a single (Claude) lane but was reproduced end-to-end by the coordinator, so it stands on reproduction, not corroboration count.
Bottom line: the three fixes are well-built and the parsing-layer logic is correct (F41 is provably non-corrupting; F35's query and F38's AST walk are verified against tree-sitter-java 0.23.5). The one thing worth fixing before merge: F38's this(...) chaining edge doesn't land — it materializes as a self-loop, and the new integration test masks it.
🔴 Headline (inline) — F38 this() emits a degenerate self-loop, not a ctor→ctor edge [reproduced]
Child(int x){ this(); } should produce CALLS Child(int)@L7 → Child()@L3; the pipeline actually emits Child()@L3 → Child()@L3 (selfLoop=true) — both endpoints collapse onto the first Child constructor, and the real relationship is lost. (super(1)→Base is correct; the cross-class case works.)
- Root cause (pre-existing, not in this diff):
scope-resolution/graph-bridge/node-lookup.ts:101-135registers overload-disambiguation keys (parameter-types / shape / constraints) only forFunction/Method, neverConstructor, so same-name ctors first-wins-collapse to one node. F38 is the first feature to create same-class ctor→ctor edges, so the first to surface it. - The test masks it:
java-1928.test.tsmatches thethis()edge by name only, which a self-loop satisfies — green on a broken edge. - Suggested: at minimum strengthen the test to assert distinct source/target constructors (by
startLine/arity) — that turns it red; the real fix extends the node-lookup overload keys to includeConstructor. Fix here or track separately, but the test shouldn't certify a self-loop.
Inline (lower) — F41 test can't catch an order regression [reproduced]
java-interpret.test.ts covers List<Map<String, User>> → List, but that input has no dot inside the generic args, so it yields List under both the old and new strip order — it can't detect a regression of the F41 reorder. The case that actually exercises it is a qualified nested element: add List<com.x.Foo<String>> → List (corrupted Foo<String>> under the old order). Low severity; the fix itself is correct.
✅ What's solid (validation is a feature)
- F41 reorder is correct and a strict improvement. A 500-input sweep found 0 regressions / 360 fixes; JVM-semantics confirms
List<com.x.Foo<String>> → Listis the JLS-correct erasure (the oldFoowas an accidental, inconsistent result — unqualifiedList<Foo<String>>already returnedList). Codex refuted the "regression" framing. The earlier-suspected nested-generic regression is not a defect. - F35 query verified against tree-sitter-java 0.23.5: the anchored tail binds the simple name (
new a.b.Foo()→Foo, nevera/b), exactly one ctor ref pernew(no double-match with the simple/generic/type-arg arms), array-creation correctly not matched, annotations robust. - F38 structural targeting is correct across nested/local/enum/record/generic/qualified-super; implicit-
Objectsuper correctly skipped; no crashes across 17 malformed inputs. The synth emits the right capture — the breakage is purely downstream in the graph-bridge. - No worker/non-worker parity gap:
emitJavaScopeCapturesis the single sharedemitScopeCaptureshook used by both pipelines (verified) — the trap that bit #1951/#2038 does not apply here. - New unit tests pass (23/23).
Lower-priority / known limitations (non-blocking)
- F35 qualifier is captured but not threaded to resolution (Codex CONFIRMED).
new pkg.Foo()resolves by the bare simple nameFoo; the captured@reference.call.constructor.qualifiedis inert (no consumer). This is net-positive vs base (base emitted no edge for qualifiednew) and consistent with the codebase-wide simple-name resolution.⚠️ The obvious "just emit@reference.qualified-name" fix is inert for constructor calls — there is no qualified channel in thecall/MethodRegistry.lookuppath (only theinherits/#1982 path consumes it). Real package-disambiguation is a larger, separate change. Only failure mode today:new pkg.Foo()with nopkg.Foobut a same-tailFooelsewhere mis-binds (requires non-compiling source); the two-Foocase drops on tie. - F38
superto a same-tail qualified superclass (extends a.b.Basewith a collidingc.d.Base) drops both EXTENDS and super-ctor edges — symmetric with the existing EXTENDS behavior (Java never threads a qualified base; the #1982 path is dormant for Java). Consistent, not newly broken. - Annotated superclass (
extends @Ann Base) drops both EXTENDS and super-ctor edges — pre-existingjavaBaseLookupNameNodegap (noannotated_typearm); a one-line arm would fix both. - Malformed
extends {emits a@reference.name:""ref on ERROR trees (resolves to nothing; no edge, no crash) — parity with the inheritance synth; an optional empty-name guard would tidy both. - Perf:
emitJavaScopeCapturesnow does the main query + two manual full-tree walks (inheritance + explicit-ctor). All O(n), andfindEnclosingTypeDeclaration's parent walk is constant-depth — no O(n²). Given the active large-repo parse-OOM work (#1983), consider folding the two synth walks into one pass. - Test gaps: enum/record
super(...)skip path, thesuper/thisvsnewparameter-types asymmetry (arity-only), and thethis()distinct-source assertion (the masking gap above) are untested.
CI
gh pr checks: Vercel fails on authorization (deploy auth, not a code issue); "Apply conventional label" passes; "Validate PR title" skipped. No test-suite check is surfaced as a PR check, though the new test files are wired into the vitest default project (run by npm test). Mergeability: BLOCKED (driven by the Vercel auth failure), MERGEABLE otherwise.
Automated multi-tool digest (GitNexus swarm + Compound-Engineering personas + Codex). Verify before acting.
| it('resolves `this()` in Child(int) to the Child constructor', () => { | ||
| const calls = getRelationships(result, 'CALLS'); | ||
| const thisCall = calls.find( | ||
| (c) => c.target === 'Child' && c.targetLabel === 'Constructor' && c.source === 'Child', |
There was a problem hiding this comment.
[P2 · reproduced] F38 this() produces a self-loop, and this assertion masks it.
Running the pipeline on this fixture, Child(int x){ this(); } emits CALLS Child()@L3 → Child()@L3 (selfLoop=true), not the intended Child(int)@L7 → Child()@L3 — both endpoints collapse onto the first Child constructor and the real ctor→ctor edge is lost (super(1)→Base resolves correctly).
This find(...) matches by name only (source==='Child' && target==='Child'), which a self-loop satisfies, so the suite stays green on a broken edge.
Root cause is pre-existing and outside this diff: scope-resolution/graph-bridge/node-lookup.ts:101-135 registers overload-disambiguation keys only for Function/Method, never Constructor, so same-name constructors first-wins-collapse to one graph node.
Fix: at minimum assert the edge connects distinct constructors (e.g. by startLine/arity or node id) so this test goes red and exposes the miss; the full fix extends the node-lookup/ids overload keys to include Constructor.
| it('falls back to the raw class name for an unrecognized nested generic', () => { | ||
| // Nested generic args defeat the single/two-arg element extraction; the | ||
| // erasure fallback keeps the outer class name (pre-existing behavior). | ||
| expect(raw('List<Map<String, User>>')).toBe('List'); |
There was a problem hiding this comment.
[P3 · reproduced] This case can't catch an F41 order-regression.
List<Map<String, User>> has no . inside its generic args, so stripQualifier is a no-op here and it normalizes to List under both the old order (stripGeneric(stripQualifier(x))) and the new order this PR introduces — so it can't detect a regression of the reorder.
The case that actually exercises F41 is a qualified nested element. Add:
expect(raw('List<com.x.Foo<String>>')).toBe('List');which produced a corrupted Foo<String>> under the old order and List after the fix. The fix itself is correct — this only hardens the guard.
…uper() chains don't self-loop (abhigyanpatwari#1928 F38 review) Review of abhigyanpatwari#2045 caught two gaps; both confirmed by reproduction. P2 — F38 this() emitted a self-loop. On the java-explicit-constructor fixture, Child(int){ this(); } produced CALLS Child()#0 -> Child()#0 instead of Child(int)abhigyanpatwari#1 -> Child()#0. Root cause is the language-agnostic graph-bridge: the parse phase mints distinct Constructor nodes (Child#0, Child#1) carrying parameterTypes, but node-lookup.ts registered the parameter-types / shape overload keys only for Function/Method, never Constructor, so both ctors collapsed onto the first-wins qualified/simple key and the caller Child(int) resolved to Child#0 (the this() target). Extend the overload keys to Constructor in both node-lookup.ts (registration) and ids.ts (lookup) via a shared isOverloadableCallable predicate. Verified the edge now connects distinct nodes (Child#1 -> Child#0); super(1)->Base#1 still correct. No cross-language regressions (the 9 worker-path failures reproduce identically on clean HEAD). Also harden the integration test: it matched the this() edge on name only, which a self-loop satisfies; now assert the endpoints are DISTINCT constructors. P3 — F41 order-regression guard was inert (List<Map<String,User>> normalizes to List under both strip orders). Add List<com.x.Foo<String>> -> List, which is corrupted to Foo<String>> under the old order and only correct generics-first. Co-authored-by: Cursor <cursoragent@cursor.com>
CI Report✅ All checks passed Pipeline Status
Test Results
✅ All 10502 tests passed 16 test(s) skipped — expand for details
Code CoverageTests
📋 View full run · Generated by CI |
…tures in baselines.json Updated the fingerprint for the Java section and added detailed notes regarding the enhancements in constructor query captures, including qualified and qualified-generic constructor queries. This change reflects ongoing improvements in the parsing layer coverage and fixture updates.
Summary
Closes the three open findings of #1928 (parent epic #1919) in the registry-primary scope-resolution path — the live one after the legacy call/heritage/import DAG removal (#942/#943). F36/F37 already landed upstream (#1940/#1956); F39/F40 are verified-dead legacy-bank remnants and are intentionally left untouched.
new pkg.Foo()parses as ascoped_type_identifierthat the scope query bound only as@reference.call.constructor.qualifiedwith no@reference.name. The scope extractor then fell back to the whole-expression anchor (scope-extractor.tsnameCap = match['@reference.name'] ?? anchor), so the reference name became the rawnew pkg.Foo()text and never resolved. Now the simple-name tail is bound via an end-anchored last child, plus a new arm for the previously uncapturednew pkg.Box<String>()(qualified and generic) shape. Mirrors the existing TS/JS new-expression qualified-constructor capture.super(...)/this(...)explicit constructor invocations. Modeled by tree-sitter asexplicit_constructor_invocation, never matched by the scope query, so chained-constructorCALLSedges were silently dropped. Synthesized (alongside the existing inheritance synth) with the target resolved structurally —this→ enclosing type name;super→ superclass tail via the sharedjavaBaseLookupNameNode(implicitObjectskipped) — plus arity for overload disambiguation.interpretJavaTypeBindingstrip order. The qualifier was stripped before generics, so a qualified generic type argument (Map<String, com.example.User>) was cut inside the generic intoUser>. Generics are now stripped first, then the qualifier; the erasure fallback is qualifier-tolerant.Why F39/F40 are untouched
Traced end-to-end: the legacy
@importcapture is explicitly skipped (parse-worker.ts— IMPORTS now come from scope-resolution, #942/#943), and the legacy@callresult feedsresult.calls, whichparse-impl.tsnever reads (CALLS come solely from scope-resolution). Editing the legacyJAVA_QUERIESbank would change nothing functionally while churning the parity gate and risking the route/fetch path. Recommend closing F39/F40 as obsoleted by the registry-primary migration.Test plan
New tests cover each finding at the capture level and end-to-end:
test/unit/scope-resolution/java/java-captures.test.ts— every constructor shape (Foo,pkg.Foo,a.b.Foo,Box<T>,pkg.Box<T>), name-tail correctness, qualified text, arity, and an "exactly one ref pernew" double-match guard;super/this/generic-super/enum-this/implicit-Object-skip.test/unit/scope-resolution/java/java-interpret.test.ts— qualified/generic/qualified-generic bases, the exactMap<String, com.example.User>corruption case, containers, nested-generic fallback,var.test/integration/resolvers/java-1928.test.ts(+java-qualified-constructor/java-explicit-constructorfixtures) — asserts theCALLSedges resolve to the correct constructors and no corruptedpkg.Footarget leaks.npx tsc --noEmitclean. Pre-existing unrelated failures (C++/Rust/Ruby worker-path parity fix(ingestion): Ruby same-tail nested mixin-MODULE (Trait) node collision — structure phase never qualifiesmoduleids (#1982 / #1978 follow-up) #1991/fix(ingestion): qualify Rust GENERIC inherent-impl target by mod scope (same-tailimpl Inner<T>collapse) #1992/fix(ingestion): C++ same-tail nested types in sibling unions / anonymous namespaces still merge into one node (#1978 fan-out U1 retro-fix) #1995, Laravel route) confirmed independent by reproducing them with these edits stashed.Made with Cursor