Skip to content

feat: Phase 7 type resolution — return-aware loop inference & PHP class-property iterables#341

Merged
magyargergo merged 11 commits into
mainfrom
feat/phase7-type-resolution
Mar 18, 2026
Merged

feat: Phase 7 type resolution — return-aware loop inference & PHP class-property iterables#341
magyargergo merged 11 commits into
mainfrom
feat/phase7-type-resolution

Conversation

@magyargergo

@magyargergo magyargergo commented Mar 17, 2026

Copy link
Copy Markdown
Collaborator

Summary

  • Phase 7.1+7.2 (foundation): Introduces ReturnTypeLookup interface and buildReturnTypeLookup factory, replacing the bare (node, env) for-loop extractor signature with a ForLoopExtractorContext object. Adds pendingCallResults infrastructure (discriminated union type, processing loop) for future var x = f() return-type propagation. The loop is structurally complete but no extractor populates it yet — const users = getUsers() currently resolves via the existing scanConstructorBinding path in call-processor.ts.
  • Phase 7.3 (7 languages): Extends all typed-iteration for-loop extractors (Go, TypeScript, Python, Rust, Java, Kotlin, C#) to resolve the loop variable type when the iterable is a direct function call (for user in getUsers(), foreach (var u in GetUsers()), etc.) by unwrapping lookupRawReturnType via the shared extractElementTypeFromString helper.
  • Phase 7.4 (PHP): Adds Strategy C to the PHP for-loop extractor — when the iterable is $this->property, walks up the AST to the enclosing class_declaration, scans the declaration_list for a matching property_declaration, and extracts the element type from the @var PHPDoc annotation or PHP 7.4+ native type. Removes the @param workaround that was previously required in the fixture.
  • Docs: type-resolution-roadmap.md updated with implementation notes and resolved-gap strikethroughs for all 7A/7B/7C items.

Test plan

  • npx vitest run test/unit/type-env.test.ts — 288 tests pass (includes new Phase 7.3/7.4 cases)
  • npx vitest run test/unit/call-processor.test.ts — 78 tests pass
  • npx vitest run test/integration/resolvers/go.test.ts — 70 tests pass
  • npx vitest run test/integration/resolvers/php.test.ts — passes
  • Full suite: 2783 tests pass, 0 new failures (only pre-existing infra failures in lbug-pool / local-backend* / cursor skill install)

🤖 Generated with Claude Code

…ontext object, pendingCallResults

- Move extractReturnTypeName + helpers from call-processor.ts to type-extractors/shared.ts
  (breaks circular import risk: call-processor → type-env → type-extractors → call-processor)
- Add SymbolTable.lookupFuzzyCallable(name) — lazy callable-only index, O(1) per call,
  invalidated on add(); avoids per-call .filter() on lookupFuzzy results
- Add ReturnTypeLookup interface (conservative: undefined when 0 or 2+ callables match)
- Add ForLoopExtractorContext interface — replaces 4 positional params with context object;
  update all 10 language extractor implementations (go, ts, py, jvm×2, cs, rs, rb, php, c-cpp)
- Add PendingAssignment discriminated union (kind: 'copy' | 'callResult');
  update PendingAssignmentExtractor in all 9 language extractors that implement it
- Wire buildTypeEnv: build ReturnTypeLookup from optional symbolTable; split pendingAssignments
  into pendingCopies + pendingCallResults; add Tier 2b call-result propagation loop
- Update call-processor.test.ts to import extractReturnTypeName from shared.ts
…oop extractors (7 languages)

Extends for-loop type extraction in all 7 typed-iteration languages to
resolve element types when the iterable is a direct function call.

**New capability**: `for (var u : getUsers())` in Java, `for u in get_users()`
in Python, `for user in getUsers()` in TypeScript, etc. now resolve
`u`/`user` to the callee's return element type via lookupRawReturnType +
extractElementTypeFromString.

Changes per language:
- types.ts: extend ReturnTypeLookup with lookupRawReturnType (raw return
  string for container-type extraction); update ForLoopExtractorContext
  with returnTypeLookup field
- type-env.ts: implement lookupRawReturnType on the concrete ReturnTypeLookup
  built in buildTypeEnv (same guards as lookupReturnType, no extractReturnTypeName)
- go.ts: call_expression branch in range_clause — identifier func or
  selector_expression method; existing isChannelType guards updated
- typescript.ts: identifier fn branch inside call_expression handler
- python.ts: identifier fn branch inside call handler
- jvm.ts (Java): method_invocation without object field in enhanced_for_statement
- jvm.ts (Kotlin): simple_identifier callee branch in call_expression node
- csharp.ts: identifier fn branch in invocation_expression handler
- rust.ts: identifier func branch in call_expression handler (alongside
  existing field_expression/method-call path)

All branches follow the same conservative pattern:
  lookupRawReturnType(callee) → extractElementTypeFromString → bind loop var
…var class property scan

Adds Strategy C to PHP's extractForLoopBinding for the pattern:

  foreach (\$this->property as \$item)

when Strategy A (resolveIterableElementType) and Strategy B (scopeEnv lookup)
both fail to find the element type.

Strategy C: when the iterable is a member_access_expression with object '$this',
walk up the AST to the enclosing class_declaration, scan its declaration_list
for a property_declaration whose variable_name matches the property, and extract
the element type from:
  1. PHPDoc @var annotation on a preceding comment sibling (/** @var User[] */)
  2. PHP 7.4+ native type field (e.g. UserRepo \$repo — skips generic 'array')

This eliminates the @param workaround that was previously required in the
php-foreach-member-access fixture (which used @param User[] \$users on the method
to populate the method's scopeEnv with a \$users binding).

New helpers in php.ts:
- PHPDOC_VAR_RE: regex for @var extraction
- extractClassPropertyElementType: reads @var or native type from a property_declaration
- findClassPropertyElementType: scans class body for a named property

Tests added (type-env.test.ts):
- PHP: resolves from @var User[] without @param workaround
- PHP: conservative — no binding for unknown property
- PHP: multi-class file — both classes resolve independently

Fixture updated (php-foreach-member-access/App.php):
- Removed the @param User[] \$users workaround from processMembers()
- Test now validates the natural class-property-based resolution path
Records that 7A (call_expression iterables, 7 languages), 7B (PHP
$this->property via @var scan), and 7C (ReturnTypeLookup + context object)
are all shipped. Adds implementation notes and strikethroughs on resolved
language-specific gaps.
@vercel

vercel Bot commented Mar 17, 2026

Copy link
Copy Markdown

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
gitnexus Ready Ready Preview, Comment Mar 18, 2026 8:34am

Request Review

@github-actions

github-actions Bot commented Mar 17, 2026

Copy link
Copy Markdown
Contributor

CI Report

All checks passed

Pipeline Status

Stage Status Details
✅ Typecheck success tsc --noEmit
✅ Unit Tests success 3 platforms
✅ Integration success 3 OS x 4 groups = 12 jobs

Test Results

Suite Tests Passed Failed Skipped Duration
Unit 1568 1567 0 1 8s
Integration 1202 1182 0 19 52s
Total 2770 2749 0 20 60s

✅ All 2749 tests passed

20 test(s) skipped — expand for details

Integration:

  • Python match/case as-pattern type binding > resolves u.save() to User#save via match/case as-pattern binding
  • Python match/case as-pattern type binding > does NOT resolve u.save() to Repo#save (negative disambiguation)
  • Swift constructor-inferred type resolution > detects User and Repo classes, both with save methods
  • Swift constructor-inferred type resolution > resolves user.save() to Models/User.swift via constructor-inferred type
  • Swift constructor-inferred type resolution > resolves repo.save() to Models/Repo.swift via constructor-inferred type
  • Swift constructor-inferred type resolution > emits exactly 2 save() CALLS edges (one per receiver type)
  • Swift self resolution > detects User and Repo classes, each with a save function
  • Swift self resolution > resolves self.save() inside User.process to User.save, not Repo.save
  • Swift parent resolution > detects BaseModel and User classes plus Serializable protocol
  • Swift parent resolution > emits EXTENDS edge: User → BaseModel
  • Swift parent resolution > emits IMPLEMENTS edge: User → Serializable (protocol conformance)
  • Swift cross-file User.init() inference > resolves user.save() via User.init(name:) inference
  • Swift cross-file User.init() inference > resolves user.greet() via User.init(name:) inference
  • Swift return type inference > detects User class and getUser function
  • Swift return type inference > detects save function on User (Swift class methods are Function nodes)
  • Swift return type inference > resolves user.save() to User#save via return type of getUser() -> User
  • Swift return-type inference via function return type > resolves user.save() to User#save via return type of getUser()
  • Swift return-type inference via function return type > user.save() does NOT resolve to Repo#save
  • Swift return-type inference via function return type > resolves repo.save() to Repo#save via return type of getRepo()

Unit:

  • buildTypeEnv > known limitations (documented skip tests) > Ruby block parameter: users.each { |user| } — closure param inference, different feature

Code Coverage

Combined (Unit + Integration)

Metric Coverage Covered Base Delta Status
Statements 59.2% 6699/11314 47.15% 📈 +12.1 🟢 ███████████░░░░░░░░░
Branches 53.49% 4590/8581 42.44% 📈 +11.1 🟢 ██████████░░░░░░░░░░
Functions 57.57% 566/983 46.35% 📈 +11.2 🟢 ███████████░░░░░░░░░
Lines 60.96% 5973/9798 48.72% 📈 +12.2 🟢 ████████████░░░░░░░░
Coverage breakdown by test suite

Unit Tests

Metric Coverage Covered Base Delta Status
Statements 44.56% 5042/11314 47.15% 📉 -2.6 🔴 ████████░░░░░░░░░░░░
Branches 39.71% 3408/8581 42.44% 📉 -2.7 🔴 ███████░░░░░░░░░░░░░
Functions 42.52% 418/983 46.35% 📉 -3.8 🔴 ████████░░░░░░░░░░░░
Lines 45.94% 4502/9798 48.72% 📉 -2.8 🔴 █████████░░░░░░░░░░░

Integration Tests

Metric Coverage Covered Base Delta Status
Statements 46.95% 5312/11314 47.15% 📉 -0.2 🔴 █████████░░░░░░░░░░░
Branches 44.2% 3793/8581 42.44% 📈 +1.8 🟢 ████████░░░░░░░░░░░░
Functions 44.45% 437/983 46.35% 📉 -1.9 🔴 ████████░░░░░░░░░░░░
Lines 48.07% 4710/9798 48.72% 📉 -0.6 🔴 █████████░░░░░░░░░░░

📋 View full run · Generated by CI

…gration tests for 7 languages

Add integration test coverage for Phase 7.3's call_expression iterable
resolution across all 7 languages (Go, TypeScript, Python, Java, Kotlin,
PHP, Rust). Each test creates a fixture with competing User/Repo classes
that both define save(), then verifies for-loop iteration over a function
call's return value resolves to the correct class.

PHP was missing function_call_expression support in its for-loop extractor.
Three changes fix this:
- php.ts extractForLoopBinding: handle function_call_expression and
  member_call_expression iterables via returnTypeLookup
- php.ts normalizePhpReturnType: preserve array notation (User[]) in
  SymbolTable so lookupRawReturnType returns useful container types
- parse-worker.ts + parsing-processor.ts: upgrade uninformative AST
  return types (array, iterable) with PHPDoc @return annotations

35 new integration tests (5 per language), 2525 total tests passing.
@magyargergo

Copy link
Copy Markdown
Collaborator Author

@claude ultra-think while identifying per-language gaps for all of our supported languages (except swift). Perfom this analysis on this pull request and verify those findings. Act as a senior compiler font-end engineer and expert with static analysis tools and review this change from that perpective and architectural fit. Also check out the previous comments. You must also cross-reference it with our type-resolution-roadmap.md and see if we need to update type-resolution-system.md after this change.

@claude

claude Bot commented Mar 18, 2026

Copy link
Copy Markdown

Claude finished @magyargergo's task in 7m 4s —— View job


Phase 7 Type Resolution — Deep Review

Tasks

  • Gather context (roadmap, system doc, changed files)
  • Review core architectural changes (types.ts, shared.ts, type-env.ts)
  • Review per-language extractor changes
  • Identify per-language gaps (all languages except Swift)
  • Cross-reference with type-resolution-roadmap.md
  • Determine if type-resolution-system.md needs updating
  • Write final review

Review

Acting as a senior compiler front-end engineer / static analysis tooling specialist. This is a well-structured, incremental expansion of the type resolution substrate. Below is a detailed breakdown.


Architecture

ReturnTypeLookup interface (types.ts:30–37)

The lookupReturnType / lookupRawReturnType split is the right abstraction. Using lookupRawReturnType for iterable extraction (preserving []User, List<User>) and lookupReturnType for variable binding (stripping to User) means callers don't need to know which transformation to apply. The conservative 1-match-only guard in buildTypeEnv:

const callables = symbolTable.lookupFuzzyCallable(callee);
if (callables.length !== 1) return undefined;

is load-bearing and correct — ambiguous overloads produce no binding rather than a wrong one.

ForLoopExtractorContext context object (types.ts:41–50)

Replacing (node, env) with a context object was the right call. Adding the 4th (returnTypeLookup) and potentially future fields (Phase 8 field maps?) would have been impossible ergonomically with positional args. Minor note: the context object is heap-allocated on every for-loop node visit in the hot AST walk (type-env.ts:475). This is negligible given the overall file-processing workload, but worth noting for a future micro-opt pass.

extractReturnTypeName deduplication (call-processor.ts: -137 lines)

The 137 removed lines were an exact duplicate of extractReturnTypeName + its helpers (WRAPPER_GENERICS, extractFirstGenericArg, etc.) that now live canonically in shared.ts. This is a clean DRY win with no behavior change.

⚠️ pendingCallResults / Tier 2b is dormant infrastructure

This is the most significant concern in the PR. The PendingAssignment discriminated union defines a callResult kind, type-env.ts has full Tier 2b wiring:

// type-env.ts:651–658
for (const { scope, lhs, callee } of pendingCallResults) {
    const scopeEnv = env.get(scope);
    if (!scopeEnv || scopeEnv.has(lhs)) continue;
    const typeName = returnTypeLookup.lookupReturnType(callee);
    if (typeName) scopeEnv.set(lhs, typeName);
}

But zero language extractors ever return { kind: 'callResult', ... } from extractPendingAssignment. Every implementation returns only { kind: 'copy', ... } or undefined. The pendingCallResults array is always empty at runtime.

The PR description says "Phase 7.2 added pendingCallResults propagation: var x = f() assignments now propagate the inferred return type of f into the local scope env." This is not accurate as written. What actually works for const users = getUsers() is the existing scanConstructorBindingConstructorBindingcall-processor.ts path. Tier 2b is forward-looking infrastructure but currently dead code.

Options:

  • a) Remove callResult from PendingAssignment and the pendingCallResults processing loop until at least one extractor uses it, to keep the codebase honest
  • b) Wire it up in at least one language (TypeScript is the most obvious — extractPendingAssignment already sees variable_declarator nodes; returning callResult for call_expression RHS when no type annotation is present would be straightforward), with a matching test
  • c) Add a code comment explicitly marking Tier 2b as "infrastructure-ready, not yet wired" with a Phase 9 reference

Fix option (c) →

⚠️ PHP normalizePhpType vs extractElementTypeFromString asymmetry

In the call-expression path, PHP uses normalizePhpType(rawReturn) (php.ts:457,464) while all other 6 languages use extractElementTypeFromString(rawReturn). The behavioral difference: normalizePhpType('User')'User' (scalar passes through), while extractElementTypeFromString('User')undefined (no container syntax).

This means if a PHP function has a @return User annotation (single object, not array), foreach (getUser() as ...) would incorrectly bind the loop variable to User. Other languages would correctly return undefined in this case. Consider making PHP's call-expression path more conservative by using extractElementTypeFromString and only falling back to normalizePhpType when the result is a clear container type.

Review this →


Per-Language Gap Analysis

The 7 languages updated in Phase 7.3 are solid. Below are the remaining gaps per language.

TypeScript / JavaScript

Gap Severity Notes
Method-chained call iterable: for (const u of repo.getUsers()) Medium fn.type === 'member_expression' path treats repo as the container, not calls returnTypeLookup on getUsers. Only bare function calls (getUsers()) use returnTypeLookup (typescript.ts:376–380).
for (const u of this.getUsers()) Medium fn.type === 'member_expression' with obj.type === 'this_expression' — not matched by obj.type === 'identifier' guard (typescript.ts:368)
for...of object destructuring: for (const { id } of users) Low Correctly skipped with a comment; minor missed opportunity
No callResult from extractPendingAssignment Low const users = getUsers() only resolves via call-processor.ts path

Go

Gap Severity Notes
Channel type for call-expression iterables: for v := range getChan() Medium extractElementTypeFromString('chan User')undefined. The identifier-iterable path has isChannelType() handling but the call-expression path doesn't (go.ts:308–326).
Multi-return non-error cases: a, b := getUser() where second var isn't err/ok/_ Low scanConstructorBinding only binds first var when second is error sentinel (go.ts:154–168)

Python

Gap Severity Notes
Method call iterable: for u in self.get_users() Medium fn.type === 'attribute' path sets iterableName = 'self', tries scopeEnv lookup for self as a container — not useful. returnTypeLookup is not consulted for attribute-method calls (python.ts:249–254).
for u in repo.get_users() Medium Same issue — treated as repo container lookup, not return-type lookup on get_users
Generator function return type Low Generator[User, None, None] not parsed by extractElementTypeFromString

PHP

Gap Severity Notes
self::$staticProperty / static::$property iterables Medium Strategy C only handles $this->property via member_access_expression. Static property access (scope_resolution_expression) not covered (php.ts:499–512).
normalizePhpType less conservative than other languages Medium See architecture note above
No pattern binding extractor Low PHP has no extractPatternBinding

Java

Gap Severity Notes
Java 21 switch expression patterns: obj switch { case User u -> ... } Low type_pattern is handled in extractJavaPatternBinding (jvm.ts:222–233). Switch expressions may use different AST structure than switch statements.
Receiver-qualified method call: for (var u : repo.getUsers()) Low Handled! method_invocation with obj field → iterableName = obj.text → container lookup. If repo type is in scopeEnv, should resolve.

Kotlin

Gap Severity Notes
Similar to Java — generally solid Low The for-each path is shared via jvm.ts

C#

Gap Severity Notes
Switch expression patterns: obj switch { User u => ... } Low Switch statement declaration_pattern is handled. Switch expressions may differ in AST.
foreach (var u in this.GetUsers()) Medium invocation_expression with member_access_expression where expression is thisobj.type would be this_expression, not matched by obj.type === 'identifier' guard (csharp.ts:240–243)

Rust

Gap Severity Notes
Nested Option/Result: if let Some(Ok(x)) = nested Medium Only one unwrap level (Some/Ok/Err); nested wrappers not resolved (rust.ts:241–247)
Struct field destructuring: if let User { name, email } = user Medium extractCapturedPatternBindings handles captured_pattern but not struct field extraction — field types are Phase 8
For-loop over impl Iterator: for u in get_users() where return is impl Iterator<Item=User> Low extractElementTypeFromString('impl Iterator<Item=User>')undefined. The impl Trait syntax is not parsed.

Ruby ⚠️ Not updated for Phase 7.3

Gap Severity Notes
No call-expression iterable support at all High extractForLoopBinding doesn't destructure returnTypeLookup from context and has no call-expression path (ruby.ts:345). for user in get_users() is not resolved even if get_users is in the symbol table.
Block parameter inference: `users.each { user }`
for user in method() on call node Medium Only handles method name extraction for method chains like users.each, not standalone function calls

Fix Ruby gap →

C / C++ ⚠️ Not updated for Phase 7.3

Gap Severity Notes
No standalone function call iterable support High extractForLoopBinding doesn't use returnTypeLookup. for (auto& user : getUsers()) where rightNode.type === 'call_expression' and function.type === 'identifier' → not handled (c-cpp.ts:323–337).
Only container.method() call handled Medium Only field_expression-based calls (like users.begin()). Direct calls like getUsers() are silently dropped.

Fix C/C++ gap →


Cross-reference: type-resolution-roadmap.md

The roadmap is accurate. All Phase 7 items correctly show ✓ shipped status. The per-language gaps section correctly omits Ruby and C/C++ call-expression support (they're not listed). Two additions would complete the picture:

  • Under Go: add "channel type for call-expression iterables" as a remaining gap
  • Under Ruby: add "for...in call-expression iterables — excluded from Phase 7.3, no returnTypeLookup support"
  • Under All languages: the pendingCallResults Tier 2b path should be noted as "infrastructure shipped, not yet wired in extractors (Phase 9 prerequisite)"

Does type-resolution-system.md need updating? Yes.

The following sections are stale and need to be updated to reflect Phase 7:

1. Architecture → shared.ts entry (line 72): Add buildReturnTypeLookup / ReturnTypeLookup to the description.

2. Language Feature Matrix (line 369–379): The "For-loop element types" row should be annotated with two capability tiers:

  • Identifier iterables: all languages with for-loop support
  • Call-expression iterables (Phase 7.3): Go, TS, Python, Rust, Java, Kotlin, C# — not Ruby, C/C++

3. "Current Strengths" section (line 386): Add mention of Phase 7 — "call-expression iterables in for-loops" and "PHP class-property iterable resolution."

4. "Current Limitations" section (line 399): These are now resolved and should be struck or updated:

  • "no broad expression-level return-type propagation inside TypeEnv" — partially resolved via Phase 7.3 for-loop paths
  • The $this->property PHP iterable limitation is resolved by Phase 7.4

5. Resolution Tiers → Tier 0b (line 209): Describe the new call-expression iterable path as a 4th strategy in resolveIterableElementType (it's actually a pre-check, not a 4th strategy, but worth documenting).

6. SymbolTable Interaction / Deferred Binding Verification sections (lines 346–363): The ReturnTypeLookup is now also used at TypeEnv build time (for-loop extractors), not just in call-processor.ts. The statement "what does not yet exist is feeding that information back into TypeEnv broadly enough to power loop inference" is now outdated — loop inference is exactly what Phase 7.3 delivers.


Code Quality

Minor nits:

  1. php.ts:127findClassPropertyElementType fallback on last named child:

    ?? classNode.namedChild(classNode.namedChildCount - 1);

    This is a smell — if a class_declaration ever has a non-body last child (e.g., attribute), it would produce wrong results. The childForFieldName('body') path should be sufficient for tree-sitter-php. Consider removing the fallback or restricting it to declaration_list type.

  2. go.ts:359–373 — Channel-type check for single-var range: the call-expression path (when iterableName is undefined) defaults to non-channel treatment conservatively. A comment explaining this assumption would be helpful.

  3. type-env.ts:475 — The forLoopCtx object could be hoisted or reused to avoid per-node allocation, though this is micro-optimization territory.

  4. All extractors returning only copy from extractPendingAssignment — the PendingAssignment type signature permits callResult but it's never emitted. TypeScript's type system won't catch this gap. A runtime assertion or comment in each extractor like // TODO Phase 9: add callResult case for call_expression RHS would help track this.


Summary

Area Assessment
ReturnTypeLookup interface design ✅ Clean, well-separated
ForLoopExtractorContext migration ✅ Good API design
Phase 7.3 (7 languages) ✅ Solid, well-tested
Phase 7.4 PHP Strategy C ✅ Elegant AST walk approach
extractReturnTypeName deduplication ✅ 137 lines cleaned up
pendingCallResults Tier 2b ⚠️ Dead code — no extractor populates it
Ruby Phase 7.3 exclusion ⚠️ For-in call-expression gap
C/C++ Phase 7.3 exclusion ⚠️ For-range-loop call-expression gap
PHP normalizePhpType in call path ⚠️ Less conservative than other languages
type-resolution-system.md ❌ Needs update to reflect Phase 7 additions
type-resolution-roadmap.md ✅ Accurate (minor additions suggested)

The core Phase 7 delivery is well-engineered and the test coverage across all 7 Phase 7.3 languages is comprehensive. The main items to address before merge (in priority order):

  1. Clarify/resolve the pendingCallResults dead-code situation (documentation accuracy or light wire-up)
  2. Update type-resolution-system.md to reflect Phase 7
  3. Consider extending Phase 7.3 to Ruby and C/C++ (or document the exclusion explicitly in the roadmap)
    | Branch

… + dormant infrastructure docs

- Replace normalizePhpType with extractElementTypeFromString in PHP call-expression
  foreach paths, aligning with all 6 other language extractors and preventing
  incorrect binding of bare non-container types like User
- Add NOTE comments clarifying pendingCallResults Tier 2b is infrastructure-ready
  but no extractor populates it yet
- Expand Go channel-type comments explaining why non-channel assumption is safe
@magyargergo

Copy link
Copy Markdown
Collaborator Author

@claude ultra-think while identifying per-language gaps for all of our supported languages (except swift). Perfom this analysis on this pull request and verify those findings. Act as a senior compiler font-end engineer and expert with static analysis tools and review this change from that perpective and architectural fit. Also check out the previous comments. You must also cross-reference it with our type-resolution-roadmap.md and see if we need to update type-resolution-system.md after this change.

@claude

claude Bot commented Mar 18, 2026

Copy link
Copy Markdown

Claude finished @magyargergo's task in 4m 49s —— View job


Phase 7 Type Resolution — Deep Review (Verification Pass)

Tasks

  • Read core types, type-env, all language extractors, roadmap and system docs
  • Verify all previous review findings against actual source
  • Per-language gap analysis (all except Swift)
  • Architectural assessment
  • Cross-reference roadmap & system doc staleness

I read every relevant source file (types.ts, type-env.ts, shared.ts, and all 9 language extractors) and cross-referenced against the roadmap and system docs. I'm going to flag one incorrect finding in the previous review, confirm the accurate ones, add new findings not previously surfaced, and give a full gap inventory. Acting as a senior compiler front-end engineer throughout.


⚠️ Correction: Previous review was wrong about PHP normalizePhpType

The previous review claimed:

"In the call-expression path, PHP uses normalizePhpType(rawReturn) (php.ts:457,464) while all other 6 languages use extractElementTypeFromString(rawReturn)."

This is factually incorrect. The actual code at php.ts:452–465:

} else if (iterableNode?.type === 'function_call_expression') {
    const calleeName = extractCalleeName(iterableNode);
    if (calleeName) {
      const rawReturn = returnTypeLookup.lookupRawReturnType(calleeName);
      if (rawReturn) callExprElementType = extractElementTypeFromString(rawReturn);
    }
} else if (iterableNode?.type === 'member_call_expression') {
    const methodName = iterableNode.childForFieldName('name');
    if (methodName) {
      const rawReturn = returnTypeLookup.lookupRawReturnType(methodName.text);
      if (rawReturn) callExprElementType = extractElementTypeFromString(rawReturn);
    }
}

PHP uses extractElementTypeFromString in both call-expression branches — identical to the other 6 languages. The PHP asymmetry concern about foreach(getUser() as ...) incorrectly binding scalar return types does not exist in this code. extractElementTypeFromString('User') correctly returns undefined for a bare type name with no container syntax. The PHP implementation is symmetric and correct.


Architecture Assessment

ReturnTypeLookup interface design (types.ts:30–37)

The lookupReturnType/lookupRawReturnType split is the correct API boundary:

  • lookupRawReturnType → keeps []User, List<User> so callers can apply extractElementTypeFromString
  • lookupReturnType → applies extractReturnTypeName to strip wrappers, for direct variable binding

The 1-match-only guard at type-env.ts:386:

if (callables.length !== 1) return undefined;

is load-bearing and conservative-by-design. Multiple overloads produce no binding rather than a wrong one. This is the right call.

ForLoopExtractorContext context object (types.ts:41–50)

Replacing positional (node, env) with a context object is the textbook move here. Adding returnTypeLookup as a 4th field would have been an awful positional signature. The context object also leaves room for Phase 8 field-type maps to be threaded through without another breaking signature change.

One micro-note: forLoopCtx is heap-allocated per visited for-loop node (type-env.ts:478). At AST-walk frequency this is fine — the allocation rate is dominated by the symbol table, not this. Worth a comment for future perf auditors.

extractReturnTypeName deduplication

137 lines removed from call-processor.ts, now canonically in shared.ts. Clean DRY win, no behavioral change, easy to verify by diffing WRAPPER_GENERICS and extractFirstGenericArg — they're identical in both the removed and added versions.

SKIP_SUBTREE_TYPES performance optimization (type-env.ts:353–365)

The comment is precise: template_string is correctly excluded from the skip list because interpolated expressions can contain typed arrow function parameters. The documented reasoning is exactly right from a tree-sitter AST semantics perspective.

⚠️ pendingCallResults — Dormant infrastructure (but comments now accurately say so)

Confirmed: no language extractor returns { kind: 'callResult' } from extractPendingAssignment. The pendingCallResults array is always empty at runtime.

However, the previous review stated the code needed comments — those comments are now present and accurate:

At type-env.ts:408–411:

// NOTE: Infrastructure-ready — no language extractor currently returns { kind: 'callResult' }
// from extractPendingAssignment. When one does, this array will bind variables to their
// function return types at TypeEnv build time. See PendingAssignment in types.ts.

And at type-env.ts:656–658:

// NOTE: Currently dormant — no extractPendingAssignment implementation emits 'callResult' yet.
// The loop is structurally complete and will activate when any language extractor starts
// returning { kind: 'callResult', lhs, callee } from extractPendingAssignment.

The code is honest about its state. The remaining issue is the roadmap documentation (see below).

✅ PHP Strategy C — findClassPropertyElementType (php.ts:125–143)

The AST walk approach (up to class_declaration → scan declaration_list → match property_element) is exactly right for PHP's tree-sitter grammar. The @var PHPDoc precedence over native type is the correct priority order — @var is the primary annotation mechanism.

One verified concern: the fallback on line 127:

const declList = classNode.childForFieldName('body')
    ?? classNode.namedChild(classNode.namedChildCount - 1); // fallback: last named child

For tree-sitter-php, class_declaration reliably exposes body as a named field so this fallback is practically dead. But semantically, if any PHP grammar version ever puts a trailing attribute list as the last named child, it would silently return wrong results. A declList?.type === 'declaration_list' guard on the fallback would be defensive and cheap.

Fix this fallback →


Per-Language Gap Analysis (Source-Verified)

TypeScript / JavaScript

Gap Severity Verified? Notes
for (const u of this.getUsers()) Medium ✅ Confirmed fn.type === 'member_expression' with obj.type === 'this_expression' — neither identifier nor nested member_expression guard matches (typescript.ts:368,371)
for (const u of repo.getUsers()) where repo is known Low ✅ Works obj.type === 'identifier'iterableName = 'repo' → scopeEnv lookup on repo's type succeeds if repo is typed
for (const u of this.repos.getUsers()) Low ✅ Confirmed gap obj.type === 'member_expression' extracts the inner property name (repos) → resolves as container, not via return type

Go

Gap Severity Verified? Notes
Channel-typed return: for v := range getChan() Medium ✅ Confirmed Code comment at go.ts:358–365 explicitly acknowledges this: "For call-expression iterables (iterableName undefined), conservative: treat as non-channel." The single-var case returns early when iterableName is undefined. This is a documented, intentional trade-off.

Python

Gap Severity Verified? Notes
for u in self.get_users() Medium ✅ Confirmed fn.type === 'attribute' with obj.type === 'identifier'iterableName = obj.text = 'self' → tries to resolve 'self' as a container in scopeEnv (always fails since self isn't typed as a container)
for u in repo.get_users() Medium ✅ Confirmed Same: iterableName = 'repo' → container lookup. Works only if repo is already in scopeEnv with a container-typed value
Generator[User, None, None] return types Low Likely extractElementTypeFromString('Generator[User, None, None]') — outer [ parses as Swift array sugar branch, hits content check for User, None, None which fails \w+ test. Returns undefined.

PHP

Gap Severity Verified? Notes
static::$property / self::$property iterables Medium ✅ Confirmed Strategy C only guards iterableNode?.type === 'member_access_expression' with obj.text === '$this'. scope_resolution_expression nodes are never checked (php.ts:499–512).
foreach ($this->getUsers() as $user) ✅ Works N/A member_call_expression is handled — lookups returnTypeLookup.lookupRawReturnType(methodName.text) and calls extractElementTypeFromString.

Java

Gap Severity Verified? Notes
for (var u : repo.getUsers()) where repo is in scope ✅ Works N/A method_invocation with obj.type === 'identifier'iterableName = obj.text = 'repo' → falls through to resolveIterableElementType
Direct call for (var u : getUsers()) ✅ Works N/A !obj && name branch → lookupRawReturnType(name.text)extractElementTypeFromString

The JVM extractor is the most complete of the 7 Phase 7.3 languages.

Kotlin

Gap Severity Verified? Notes
Shares JVM for-loop path with Java Confirmed N/A The extractKotlinForLoopBinding delegates upward through jvm.ts. Same coverage as Java.

C#

Gap Severity Verified? Notes
foreach (var u in this.GetUsers()) Medium ✅ Confirmed invocation_expression with fn.type === 'member_access_expression' and obj.type === 'this_expression' — fails the obj.type === 'identifier' guard at csharp.ts:242.
foreach (var u in GetUsers()) ✅ Works N/A fn.type === 'identifier'lookupRawReturnTypeextractElementTypeFromString

Rust

Gap Severity Verified? Notes
impl Iterator<Item=User> return types Low ✅ Confirmed extractElementTypeFromString('impl Iterator<Item=User>') — the impl keyword prefix means no opening < or [ at index 0; openAngle finds the < in Iterator< but the base check fails the \w+ test on impl Iterator. Returns undefined.
Nested Option<Result<User, E>> Low Confirmed extractReturnTypeName unwraps one level (WRAPPER_GENERICS for Option/Result) but not nested wrappers

Ruby ⚠️ Not updated for Phase 7.3

Gap Severity Verified? Notes
for user in get_users() — no call-expression path High ✅ Confirmed extractForLoopBinding at ruby.ts:345 does NOT destructure returnTypeLookup from the context. For call nodes, it extracts method.text as iterableName and runs resolveIterableElementType('get_users', ...) — scopeEnv lookup for get_users returns nothing, YARD param lookup for get_users returns nothing. Silently produces no binding.
Block parameter inference `users.each { user }` High

Fix Ruby gap →

C / C++ ⚠️ Not updated for Phase 7.3

Gap Severity Verified? Notes
for (auto& user : getUsers()) — direct call High ✅ Confirmed extractForLoopBinding at c-cpp.ts:269 does NOT destructure returnTypeLookup. In the call_expression branch (c-cpp.ts:323–331), only field_expression-typed callees (i.e., container.method()) are handled. When fieldExpr.type === 'identifier' (bare function call like getUsers()), the branch falls through with iterableName undefined.

Fix C/C++ gap →


Roadmap Accuracy (type-resolution-roadmap.md)

Two statements are inaccurate and contradict the code comments:

Line 86:

"Phase 7.2 also added pendingCallResults propagation: var x = f() assignments now propagate the inferred return type of f into the local scope env."

Line 100:

"var x = f() assignments propagate the element / receiver type for subsequent use in the same scope"

Both are false. The pendingCallResults processing loop is structurally complete but no extractor emits { kind: 'callResult' }. The code's own comments (added in this PR) explicitly call this out as dormant. The roadmap is contradicting the implementation.

Additionally, the per-language gap section doesn't document that Ruby and C/C++ were excluded from Phase 7.3, even though they support for-loops. A reader would need to infer this from the absence of those languages in the "all 7 typed-iteration languages" wording.

Fix roadmap inaccuracy →


Does type-resolution-system.md Need Updating? Yes — 3 stale sections

1. shared.ts description in the Architecture table (line 72) — stale:

| `shared.ts` | Language-agnostic helpers such as `extractSimpleTypeName`, ...

Missing: extractElementTypeFromString, resolveIterableElementType, buildReturnTypeLookup/ReturnTypeLookup — these are now major residents of shared.ts.

2. "Deferred Binding Verification" paragraph (line 363) — stale:

"What does not yet exist is feeding that information back into TypeEnv broadly enough to power loop inference, general assignment propagation, and wider expression typing."

This sentence is now outdated. Phase 7.3 is exactly "feeding that information back into TypeEnv to power loop inference" — ReturnTypeLookup is threaded into every for-loop extractor via ForLoopExtractorContext. This is a significant correctness issue in the documentation because it describes Phase 7 work as future work.

3. "Current Limitations" (line 409) — partially stale:

"no broad expression-level return-type propagation inside TypeEnv"

This is now partially resolved: for-loop iterables in 7 languages now use return-type propagation inside TypeEnv. It should be updated to distinguish "for-loop call-expression iterables (done, 7 languages)" from "general assignment return-type binding (pending, Phase 9)".

The Language Feature Matrix at line 375 should also add a sub-row or note distinguishing "identifier iterables" vs "call-expression iterables" since Ruby/C++ only support the former.


Code Quality: Additional Nits Not Covered Previously

  1. go.ts:358–365 — channel gap is documented, decision is defensible. The comment is accurate and the conservative trade-off is right. For call-expression iterables (getChan() returning chan User), extractElementTypeFromString wouldn't handle chan User anyway since it has no bracket syntax. Even if isChannelType were called, it would need to inspect the raw return type string — a separate enhancement.

  2. jvm.ts:167!obj && name for direct calls is elegantly correct. In Java's tree-sitter grammar, getUsers() produces a method_invocation where object field is absent and name field holds the method name. The guard is exactly right.

  3. csharp.ts:244fn.type === 'identifier' for GetUsers() works correctly. C#'s invocation_expression for a bare call produces identifier as the first child. But this.GetUsers() produces member_access_expression as first child with expression = this_expression — the gap is confirmed at line 242.

  4. All extractPendingAssignment implementations return only copy kind. This is consistent. A // TODO Phase 9: callResult case comment in even one extractor would make the intention clearer to contributors. Currently only type-env.ts mentions Phase 9.


Summary

Finding Previous Review Verification
pendingCallResults dormant + now has clarifying comments Identified (no comments yet) ✅ Comments added — gap closed
PHP normalizePhpType asymmetry ⚠️ INCORRECT finding ❌ PHP uses extractElementTypeFromString — same as all other languages
Ruby call-expression gap Identified ✅ Confirmed at ruby.ts:345
C/C++ call-expression gap Identified ✅ Confirmed at c-cpp.ts:269
TypeScript this.getUsers() gap Identified ✅ Confirmed at typescript.ts:368
C# this.GetUsers() gap Identified ✅ Confirmed at csharp.ts:242
Go channel-from-call gap Identified ✅ Confirmed + code comment documents this trade-off
Python self.get_users() gap Identified ✅ Confirmed at python.ts:250
type-resolution-system.md stale Identified ✅ Confirmed — 3 specific stale sections
Roadmap pendingCallResults inaccuracy Not flagged 🆕 Lines 86 and 100 are factually wrong
findClassPropertyElementType fallback type smell Identified ✅ Confirmed, low risk, easy guard

Verdict: The Phase 7 core delivery is high quality — the architecture is sound, the test coverage is comprehensive, and the implementation is conservative in the right places. The two action items before merge are: (1) correct the two inaccurate roadmap sentences about pendingCallResults, and (2) update the stale paragraph in type-resolution-system.md at line 363. Extending Phase 7.3 to Ruby and C/C++ is desirable but can be a follow-up.

Branch |

@magyargergo

Copy link
Copy Markdown
Collaborator Author

Response to Verification Pass Review

Re: PHP normalizePhpType — the original finding was correct.

The verification pass quotes lines 452–465 as evidence that PHP already uses extractElementTypeFromString. However, those lines reflect the post-fix code from commit 6fe3568 ("fix(type-resolution): address PR #341 review findings — PHP asymmetry + dormant infrastructure docs"). The pre-fix code at those same lines used normalizePhpType(rawReturn) in both call-expression branches. The diff:

-      if (rawReturn) callExprElementType = normalizePhpType(rawReturn);
+      if (rawReturn) callExprElementType = extractElementTypeFromString(rawReturn);

The behavioral difference matters: normalizePhpType('User') returns 'User' (treating a bare scalar type as a valid element type), while extractElementTypeFromString('User') correctly returns undefined (no container syntax detected). Without the fix, foreach(getUser() as $u) where getUser() returns User would incorrectly bind $u to User as if it were an element of a collection.

Re: shared.ts description (line 72) — the current description is accurate.

The review suggests buildReturnTypeLookup and ReturnTypeLookup are missing from the shared.ts row. These items are not in shared.tsbuildReturnTypeLookup is in type-env.ts and the ReturnTypeLookup interface is in types.ts. Both are already described in their respective file rows.

Fixes applied in this push:

  1. Roadmap lines 86 and 100 — corrected pendingCallResults description. Line 86 now states the infrastructure exists but is dormant (Phase 9 work). Line 100 replaced the false var x = f() propagation claim with an accurate description of the dormant infrastructure.

  2. type-resolution-system.md line 363 — updated to reflect that Phase 7.3 added for-loop call-expression resolution via ReturnTypeLookup, while noting general assignment propagation remains pending.

  3. type-resolution-system.md line 409 — clarified that for-loop call-expression iterables are now resolved in 7 languages, while general var x = f() binding is pending.

  4. PHP fallback type guard (php.ts:127) — added ?.type === 'declaration_list' guard on the fallback to prevent silent wrong results if a future grammar version puts a trailing attribute list as the last named child.

All 2835 tests pass. The PHP integration tests specifically confirm the call-expression paths work correctly with extractElementTypeFromString.

Acknowledged for follow-up (not in this PR):

  • Ruby extractForLoopBinding doesn't destructure returnTypeLookup — call-expression gap confirmed
  • C/C++ extractForLoopBinding same gap — bare getUsers() falls through
  • extractPendingAssignment Phase 9 TODO comments in extractors

…HP fallback guard

- Roadmap lines 86/100: correct pendingCallResults from "active" to "dormant infrastructure (Phase 9)"
- type-resolution-system.md line 363: update to reflect Phase 7.3 loop inference is delivered
- type-resolution-system.md line 409: clarify for-loop call-expression resolution (done) vs general assignment propagation (pending)
- php.ts:127: add declaration_list type guard on fallback to prevent silent wrong results
@magyargergo magyargergo merged commit 604b575 into main Mar 18, 2026
22 checks passed
@magyargergo magyargergo deleted the feat/phase7-type-resolution branch March 18, 2026 08:39
motolese pushed a commit to motolese/datamoto-gitnexus that referenced this pull request Apr 23, 2026
…ss-property iterables (abhigyanpatwari#341)

* feat(type-resolution): Phase 7.1+7.2 foundation — ReturnTypeLookup, context object, pendingCallResults

- Move extractReturnTypeName + helpers from call-processor.ts to type-extractors/shared.ts
  (breaks circular import risk: call-processor → type-env → type-extractors → call-processor)
- Add SymbolTable.lookupFuzzyCallable(name) — lazy callable-only index, O(1) per call,
  invalidated on add(); avoids per-call .filter() on lookupFuzzy results
- Add ReturnTypeLookup interface (conservative: undefined when 0 or 2+ callables match)
- Add ForLoopExtractorContext interface — replaces 4 positional params with context object;
  update all 10 language extractor implementations (go, ts, py, jvm×2, cs, rs, rb, php, c-cpp)
- Add PendingAssignment discriminated union (kind: 'copy' | 'callResult');
  update PendingAssignmentExtractor in all 9 language extractors that implement it
- Wire buildTypeEnv: build ReturnTypeLookup from optional symbolTable; split pendingAssignments
  into pendingCopies + pendingCallResults; add Tier 2b call-result propagation loop
- Update call-processor.test.ts to import extractReturnTypeName from shared.ts

* feat(type-resolution): Phase 7.3 — call_expression iterables in for-loop extractors (7 languages)

Extends for-loop type extraction in all 7 typed-iteration languages to
resolve element types when the iterable is a direct function call.

**New capability**: `for (var u : getUsers())` in Java, `for u in get_users()`
in Python, `for user in getUsers()` in TypeScript, etc. now resolve
`u`/`user` to the callee's return element type via lookupRawReturnType +
extractElementTypeFromString.

Changes per language:
- types.ts: extend ReturnTypeLookup with lookupRawReturnType (raw return
  string for container-type extraction); update ForLoopExtractorContext
  with returnTypeLookup field
- type-env.ts: implement lookupRawReturnType on the concrete ReturnTypeLookup
  built in buildTypeEnv (same guards as lookupReturnType, no extractReturnTypeName)
- go.ts: call_expression branch in range_clause — identifier func or
  selector_expression method; existing isChannelType guards updated
- typescript.ts: identifier fn branch inside call_expression handler
- python.ts: identifier fn branch inside call handler
- jvm.ts (Java): method_invocation without object field in enhanced_for_statement
- jvm.ts (Kotlin): simple_identifier callee branch in call_expression node
- csharp.ts: identifier fn branch in invocation_expression handler
- rust.ts: identifier func branch in call_expression handler (alongside
  existing field_expression/method-call path)

All branches follow the same conservative pattern:
  lookupRawReturnType(callee) → extractElementTypeFromString → bind loop var

* feat(type-resolution): Phase 7.4 — PHP \$this->property iterable via @var class property scan

Adds Strategy C to PHP's extractForLoopBinding for the pattern:

  foreach (\$this->property as \$item)

when Strategy A (resolveIterableElementType) and Strategy B (scopeEnv lookup)
both fail to find the element type.

Strategy C: when the iterable is a member_access_expression with object '$this',
walk up the AST to the enclosing class_declaration, scan its declaration_list
for a property_declaration whose variable_name matches the property, and extract
the element type from:
  1. PHPDoc @var annotation on a preceding comment sibling (/** @var User[] */)
  2. PHP 7.4+ native type field (e.g. UserRepo \$repo — skips generic 'array')

This eliminates the @param workaround that was previously required in the
php-foreach-member-access fixture (which used @param User[] \$users on the method
to populate the method's scopeEnv with a \$users binding).

New helpers in php.ts:
- PHPDOC_VAR_RE: regex for @var extraction
- extractClassPropertyElementType: reads @var or native type from a property_declaration
- findClassPropertyElementType: scans class body for a named property

Tests added (type-env.test.ts):
- PHP: resolves from @var User[] without @param workaround
- PHP: conservative — no binding for unknown property
- PHP: multi-class file — both classes resolve independently

Fixture updated (php-foreach-member-access/App.php):
- Removed the @param User[] \$users workaround from processMembers()
- Test now validates the natural class-property-based resolution path

* docs: mark Phase 7 complete in type-resolution-roadmap.md

Records that 7A (call_expression iterables, 7 languages), 7B (PHP
$this->property via @var scan), and 7C (ReturnTypeLookup + context object)
are all shipped. Adds implementation notes and strikethroughs on resolved
language-specific gaps.

* fix(docs): update project references to feat-phase7-type-resolution in AGENTS.md and CLAUDE.md

* feat(type-resolution): Phase 7.5 — PHP call_expression foreach + integration tests for 7 languages

Add integration test coverage for Phase 7.3's call_expression iterable
resolution across all 7 languages (Go, TypeScript, Python, Java, Kotlin,
PHP, Rust). Each test creates a fixture with competing User/Repo classes
that both define save(), then verifies for-loop iteration over a function
call's return value resolves to the correct class.

PHP was missing function_call_expression support in its for-loop extractor.
Three changes fix this:
- php.ts extractForLoopBinding: handle function_call_expression and
  member_call_expression iterables via returnTypeLookup
- php.ts normalizePhpReturnType: preserve array notation (User[]) in
  SymbolTable so lookupRawReturnType returns useful container types
- parse-worker.ts + parsing-processor.ts: upgrade uninformative AST
  return types (array, iterable) with PHPDoc @return annotations

35 new integration tests (5 per language), 2525 total tests passing.

* fix(type-resolution): address PR abhigyanpatwari#341 review findings — PHP asymmetry + dormant infrastructure docs

- Replace normalizePhpType with extractElementTypeFromString in PHP call-expression
  foreach paths, aligning with all 6 other language extractors and preventing
  incorrect binding of bare non-container types like User
- Add NOTE comments clarifying pendingCallResults Tier 2b is infrastructure-ready
  but no extractor populates it yet
- Expand Go channel-type comments explaining why non-channel assumption is safe

* fix(type-resolution): address verification review — docs accuracy + PHP fallback guard

- Roadmap lines 86/100: correct pendingCallResults from "active" to "dormant infrastructure (Phase 9)"
- type-resolution-system.md line 363: update to reflect Phase 7.3 loop inference is delivered
- type-resolution-system.md line 409: clarify for-loop call-expression resolution (done) vs general assignment propagation (pending)
- php.ts:127: add declaration_list type guard on fallback to prevent silent wrong results
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant