
fix(python): resolve module-qualified constructor calls — 0 CALLS edges (Issue #337)#463

Merged
magyargergo merged 4 commits into
abhigyanpatwari:mainfrom
Miyabi-G-K:fix/python-calls-zero-337
Mar 23, 2026
@ShunsukeHayashi

Contributor

Summary

Fixes #337 — Python repos produce 0 CALLS edges for module-qualified constructor calls like models.User().

Root Causes

Three compounding bugs caused 0 CALLS edges:

  1. SupportedLanguages.Python missing from WILDCARD_IMPORT_LANGUAGES — bare module imports (import models) were never expanded to per-symbol namedImportMap bindings, so Tier 2a-named could never disambiguate models.User() from auth.User().

  2. Synthesis ran after call resolution: synthesizeWildcardImportBindings was called in the Phase 14 pre-pass, after all chunks had already been call-resolved. When models.User() was processed, namedImportMap was empty for Python, so both models.py:User and auth.py:User were returned as ambiguous Tier 2a candidates.

  3. callForm='member' excluded Class nodes: filterCallableCandidates with callForm='member' only accepts Function/Method/Constructor/Macro/Delegate (not Class). Module-qualified class instantiation like models.User() is syntactically an attribute access, but semantically a constructor call. With both ambiguous Class nodes dropped, 0 CALLS edges were emitted.

Fixes

| File | Change |
| --- | --- |
| pipeline.ts | Add SupportedLanguages.Python to WILDCARD_IMPORT_LANGUAGES |
| pipeline.ts | Run synthesizeWildcardImportBindings inline per-chunk, after processImportsFromExtracted but before processCallsFromExtracted. Phase 14 pre-pass retained as safety net. |
| call-processor.ts | Add fallback: if callForm='member' produces 0 filtered candidates, retry with callForm='constructor'. Only triggers for 0-candidate cases, so there is no false-positive risk for normal member calls. |
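The retry in the last row can be sketched roughly as follows. This is a simplified model, not the actual gitnexus source: the `Candidate` shape, the `'free'` call form, the wrapper name `filterWithConstructorFallback`, and the assumption that constructor-form calls accept `Class` nodes are all illustrative. Only `filterCallableCandidates`, `callForm`, and the listed node kinds come from the description above.

```typescript
// Hypothetical, simplified model of the candidate filter; not the real gitnexus code.
type NodeKind = 'Function' | 'Method' | 'Constructor' | 'Macro' | 'Delegate' | 'Class';
type CallForm = 'free' | 'member' | 'constructor'; // 'free' is an assumed third form

interface Candidate {
  name: string;
  filePath: string;
  kind: NodeKind;
}

const CALLABLE_KINDS: ReadonlySet<NodeKind> = new Set<NodeKind>([
  'Function', 'Method', 'Constructor', 'Macro', 'Delegate',
]);

// Assumption: only constructor-form calls additionally accept Class nodes.
function filterCallableCandidates(candidates: Candidate[], callForm: CallForm): Candidate[] {
  return candidates.filter(
    c => CALLABLE_KINDS.has(c.kind) || (callForm === 'constructor' && c.kind === 'Class'),
  );
}

// Retry as a constructor call only when member-form filtering left nothing.
function filterWithConstructorFallback(candidates: Candidate[], callForm: CallForm): Candidate[] {
  const filtered = filterCallableCandidates(candidates, callForm);
  if (filtered.length === 0 && callForm === 'member') {
    return filterCallableCandidates(candidates, 'constructor');
  }
  return filtered;
}

const userClass: Candidate = { name: 'User', filePath: 'models.py', kind: 'Class' };
const saveMethod: Candidate = { name: 'save', filePath: 'models.py', kind: 'Method' };

// models.User(): member form finds no callable, so the Class is rescued by the retry.
const rescued = filterWithConstructorFallback([userClass], 'member');
// A normal member call keeps its Method candidate; the retry never fires.
const untouched = filterWithConstructorFallback([saveMethod, userClass], 'member');
```

Because the retry is gated on an empty result, a member call that already matched a method or function is returned unchanged.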

Test Plan

New fixture: test/fixtures/lang-resolution/python-module-import/

  • models.py: class User (save)

  • auth.py: class User (check), class Admin (login) ← name collision!

  • app.py: import models; import auth; models.User(); auth.Admin()

  • detects User (×2) and Admin classes — symbols present in graph

  • resolves IMPORTS edges from app.py — import resolution works

  • resolves models.User() CALLS edge to models.py:User (key regression test; name collision disambiguated via first-seen synthesis)

  • resolves auth.Admin() CALLS edge to auth.py:Admin — constructor fallback works

  • Full test suite: 3643 passed, 0 regressions (2 pre-existing failures in skip-git-cli.test.ts unrelated to this change)

🤖 Generated with Claude Code

Python repos were producing 0 CALLS edges for module-qualified constructor
calls like `models.User()` where `import models` is a bare module import.

Root causes:
1. `SupportedLanguages.Python` was absent from `WILDCARD_IMPORT_LANGUAGES`,
   so `synthesizeWildcardImportBindings` never ran for Python files — bare
   module imports never received per-symbol namedImportMap bindings.

2. Synthesis only ran in the Phase 14 pre-pass, after all chunks had already
   been call-resolved. When `models.User()` was processed in Phase 3+4,
   `namedImportMap` was empty for Python → Tier 2a-named fell through to
   Tier 2a which found both `models.py:User` and `auth.py:User` (ambiguous).

3. `filterCallableCandidates` with `callForm='member'` excluded `Class` nodes
   (only `CALLABLE_SYMBOL_TYPES` = Function/Method/Constructor/…). With 2
   ambiguous Class candidates both were dropped, producing 0 CALLS edges.

Fixes:
- Add `SupportedLanguages.Python` to `WILDCARD_IMPORT_LANGUAGES` so that
  `import models` expands to per-symbol namedImportMap entries (first-seen
  semantics: `User→models.py:User`, `Admin→auth.py:Admin`).

- Call `synthesizeWildcardImportBindings` inline in the chunk loop, after
  `processImportsFromExtracted` but BEFORE `processCallsFromExtracted`. This
  ensures Tier 2a-named can disambiguate `module.ClassName()` at initial
  call-resolution time. The Phase 14 pre-pass remains as a final safety net.

- Add a fallback in `resolveCallTarget`: if `callForm='member'` yields 0
  filtered candidates, retry with `callForm='constructor'`. This handles the
  case where a module-qualified class instantiation (e.g. `models.User()`)
  is syntactically an attribute-access call but semantically a constructor
  call. The fallback only triggers for 0-candidate member calls, so it
  cannot over-eagerly promote normal member calls.
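
The ordering argument in the second fix can be illustrated with a toy model. Everything here is hypothetical (the `Chunk` shape, `synthesize`, `resolveCalls`, `runChunk`); it only shows why bindings synthesized after call resolution arrive too late to be used.

```typescript
// Toy model, not the gitnexus pipeline: callerFile -> symbol name -> source file.
type NamedImportMap = Map<string, Map<string, string>>;

interface Chunk {
  imports: Array<{ from: string; to: string; exports: string[] }>;
  calls: Array<{ file: string; name: string }>;
}

// Stand-in for synthesizeWildcardImportBindings: expand imports into per-symbol bindings.
function synthesize(chunk: Chunk, map: NamedImportMap): void {
  for (const imp of chunk.imports) {
    const bindings = map.get(imp.from) ?? new Map<string, string>();
    for (const name of imp.exports) {
      if (!bindings.has(name)) bindings.set(name, imp.to);
    }
    map.set(imp.from, bindings);
  }
}

// Stand-in for call resolution: look each call up in the bindings that exist right now.
function resolveCalls(chunk: Chunk, map: NamedImportMap): Array<string | undefined> {
  return chunk.calls.map(call => map.get(call.file)?.get(call.name));
}

function runChunk(chunk: Chunk, synthesizeFirst: boolean): Array<string | undefined> {
  const map: NamedImportMap = new Map();
  if (synthesizeFirst) synthesize(chunk, map);
  const resolved = resolveCalls(chunk, map);
  if (!synthesizeFirst) synthesize(chunk, map); // bindings exist now, but calls are already resolved
  return resolved;
}

const sampleChunk: Chunk = {
  imports: [{ from: 'app.py', to: 'models.py', exports: ['User'] }],
  calls: [{ file: 'app.py', name: 'User' }],
};

const synthesizedBeforeCalls = runChunk(sampleChunk, true);  // call resolves to 'models.py'
const synthesizedAfterCalls = runChunk(sampleChunk, false);  // call stays unresolved
```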

Tests: add `python-module-import` fixture (models.py/auth.py/app.py) with
4 regression tests covering IMPORTS edges, name-collision disambiguation
for `models.User()`, and `auth.Admin()`.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@vercel

vercel Bot commented Mar 22, 2026

@ShunsukeHayashi is attempting to deploy a commit to the NexusCore Team on Vercel.

A member of the Team first needs to authorize it.

@github-actions

github-actions Bot commented Mar 22, 2026

Contributor

CI Report

All checks passed (141f864)

Pipeline

| Stage | Status | Ubuntu | Windows | macOS |
| --- | --- | --- | --- | --- |
| Typecheck | success | | | |
| Tests | success | | | |

Tests

| Metric | Value |
| --- | --- |
| Total | 3710 |
| Passed | 3692 |
| Skipped | 18 |
| Files | 1060 |
| Duration | 2m 1s |

✅ All 3692 tests passed across 1060 files

18 test(s) skipped
  • buildTypeEnv > known limitations (documented skip tests) > Ruby block parameter: users.each { |user| } — closure param inference, different feature
  • Swift constructor-inferred type resolution > detects User and Repo classes, both with save methods
  • Swift constructor-inferred type resolution > resolves user.save() to Models/User.swift via constructor-inferred type
  • Swift constructor-inferred type resolution > resolves repo.save() to Models/Repo.swift via constructor-inferred type
  • Swift constructor-inferred type resolution > emits exactly 2 save() CALLS edges (one per receiver type)
  • Swift self resolution > detects User and Repo classes, each with a save function
  • Swift self resolution > resolves self.save() inside User.process to User.save, not Repo.save
  • Swift parent resolution > detects BaseModel and User classes plus Serializable protocol
  • Swift parent resolution > emits EXTENDS edge: User → BaseModel
  • Swift parent resolution > emits IMPLEMENTS edge: User → Serializable (protocol conformance)
  • Swift cross-file User.init() inference > resolves user.save() via User.init(name:) inference
  • Swift cross-file User.init() inference > resolves user.greet() via User.init(name:) inference
  • Swift return type inference > detects User class and getUser function
  • Swift return type inference > detects save function on User (Swift class methods are Function nodes)
  • Swift return type inference > resolves user.save() to User#save via return type of getUser() -> User
  • Swift return-type inference via function return type > resolves user.save() to User#save via return type of getUser()
  • Swift return-type inference via function return type > user.save() does NOT resolve to Repo#save
  • Swift return-type inference via function return type > resolves repo.save() to Repo#save via return type of getRepo()

Coverage

| Metric | Coverage | Covered | Base (main) | Delta |
| --- | --- | --- | --- | --- |
| Statements | 69.12% | 9242/13370 | 69.06% | 📈 +0.1% |
| Branches | 60.08% | 6251/10403 | 60.01% | 📈 +0.1% |
| Functions | 71.97% | 822/1142 | 71.88% | 📈 +0.1% |
| Lines | 71.41% | 8248/11550 | 71.34% | 📈 +0.1% |

📋 Full run · Coverage from Ubuntu · Generated by CI

@magyargergo

Collaborator

@claude ultra-think while reviewing this PR's architectural fit and whether it delivers what's written in the summary. Does it follow the patterns we've already established without violating the responsibility boundaries? Finally, check for performance issues, gaps, and test coverage completeness.

@github-actions

github-actions Bot commented Mar 22, 2026

Contributor

Claude finished @magyargergo's task in 3m 51s


PR Review: fix(python) — module-qualified constructor calls (Issue #337)

  • Read changed files
  • Review pipeline.ts changes — synthesis ordering & WILDCARD_IMPORT_LANGUAGES
  • Review call-processor.ts changes — callForm fallback logic
  • Review test fixtures & integration test coverage
  • Assess architectural fit & responsibility boundaries
  • Check for performance issues or edge case gaps
  • Post final review

Summary Verdict

The PR correctly diagnoses and fixes the reported bug. The three root causes are real, the fixes are targeted, and the CI is green. There are no architectural violations. However there are two meaningful gaps and one performance concern worth addressing before merge.


1. Architectural Fit — ✅ Correct

The pipeline phase ordering is the right mental model here. The fix correctly places synthesizeWildcardImportBindings inline per-chunk after processImportsFromExtracted but before processCallsFromExtracted (pipeline.ts:585). Import graph edges must exist before synthesis can know what to expand, and namedImportMap must be populated before Tier 2a-named lookup runs. This ordering is enforced correctly for both the worker path (inline) and the sequential fallback path (line 673).

The comment in the Phase 14 pre-pass (pipeline.ts:725-732) accurately describes its new role as a "safety net" rather than the primary synthesis trigger — that's a clean retrofit.


2. Responsibility Boundaries — ✅ Clean, with one note

| Component | Change | Assessment |
| --- | --- | --- |
| pipeline.ts | Python added to WILDCARD_IMPORT_LANGUAGES; inline synthesis call added | Correct: pipeline owns orchestration |
| pipeline.ts | Phase 14 pre-pass retained | Correct: belt-and-suspenders for sequential path edge cases |
| call-processor.ts | callForm='member' → retry with 'constructor' when 0 candidates | Follows existing tiered fallback pattern; only fires on 0-candidate cases |

The fallback in resolveCallTarget (call-processor.ts:836-838) is language-agnostic but implicitly targets the Python module-qualified pattern. This is fine — the guard filteredCandidates.length === 0 && call.callForm === 'member' makes it safe for all languages (only fires when normal member-form filtering already yielded nothing).


3. Pattern Consistency — ✅ Follows Established Conventions

Adding Python to WILDCARD_IMPORT_LANGUAGES follows the exact same pattern already established for Go, Ruby, C/C++, and Swift. No new mechanisms are introduced. The comment at pipeline.ts:124-122 is updated appropriately.

The callForm retry of filterCallableCandidates mirrors existing tiered fallback idioms in resolveCallTarget. No pattern violations.


4. ⚠️ Performance — Repeated Full Graph Traversal

synthesizeWildcardImportBindings now runs N+2 times for large repos processed via the worker path: once per chunk inline (line 585), once for the sequential fallback guard (line 673), and once in the Phase 14 pre-pass (line 729).

Each call starts with a full graph.forEachNode traversal to build exportedSymbolsByFile — that's O(nodes). For a repo with 100k nodes spread across 100 chunks, this is ~100 full traversals of the node store. Later calls are mostly no-ops at the synthesis level (idempotent first-seen check), but the pre-computation cost is paid every time.

Suggested improvement — hoist the exportedSymbolsByFile map out of the function and pass it as an optional parameter, or cache it as a module-level variable that's invalidated when nodes are added. Alternatively, wrap the per-chunk call in a guard that skips if no new Python/wildcard files were in that chunk:

// pipeline.ts ~585 — only synthesize if chunk contains wildcard-import-language files
const chunkHasWildcardLang = chunkFiles.some(f => {
  const lang = getLanguageFromFilename(f.path);
  return lang && WILDCARD_IMPORT_LANGUAGES.has(lang);
});
if (chunkHasWildcardLang) synthesizeWildcardImportBindings(graph, ctx);

This avoids the traversal for TypeScript/JS-only chunks entirely.
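
The first suggestion above (hoisting the exported-symbol map and passing it in as an optional parameter) could look roughly like this. The `Graph` class, `buildExportIndex`, and `synthesizeBindings` are hypothetical stand-ins; the only detail taken from the review is that each synthesis call begins with a full `forEachNode` traversal.

```typescript
// Hypothetical minimal graph; the real graph API is known here only by `forEachNode`.
interface GraphNode {
  name: string;
  filePath: string;
  exported: boolean;
}

class Graph {
  traversals = 0; // counts full node-store walks, for demonstration only
  private readonly nodes: GraphNode[];

  constructor(nodes: GraphNode[]) {
    this.nodes = nodes;
  }

  forEachNode(visit: (node: GraphNode) => void): void {
    this.traversals += 1;
    this.nodes.forEach(visit);
  }
}

type ExportIndex = Map<string, string[]>; // filePath -> exported symbol names

// The O(nodes) pre-computation that the review wants to pay once instead of per chunk.
function buildExportIndex(graph: Graph): ExportIndex {
  const index: ExportIndex = new Map();
  graph.forEachNode(node => {
    if (!node.exported) return;
    const names = index.get(node.filePath) ?? [];
    names.push(node.name);
    index.set(node.filePath, names);
  });
  return index;
}

// Optional parameter: callers that already hold an index skip the rebuild.
function synthesizeBindings(graph: Graph, exportIndex?: ExportIndex): ExportIndex {
  const index = exportIndex ?? buildExportIndex(graph);
  // ... binding synthesis over `index` would go here ...
  return index;
}

const demoGraph = new Graph([
  { name: 'User', filePath: 'models.py', exported: true },
  { name: 'Admin', filePath: 'auth.py', exported: true },
  { name: '_helper', filePath: 'auth.py', exported: false },
]);

const sharedIndex = buildExportIndex(demoGraph); // one traversal
for (let chunkIndex = 0; chunkIndex < 100; chunkIndex++) {
  synthesizeBindings(demoGraph, sharedIndex);    // no further traversals
}
const traversalsWithSharedIndex = demoGraph.traversals;
```

A shared index goes stale as soon as later chunks add nodes, so a real version would need the invalidation the review mentions; this sketch leaves that out.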


5. ⚠️ Coverage Gap — Same-Name Cross-Module Constructor Calls

The synthesis approach uses first-seen semantics: when app.py imports both models and auth, and both export class User, the first-iterated module wins in namedImportMap. The test only exercises the safe case (models.User() resolves correctly because models.py was iterated first; auth.Admin() has no collision).

The untested — and broken — case:

import models
import auth

u = models.User()   # ✅ resolves to models.py:User (first-seen)
v = auth.User()     # ❌ also resolves to models.py:User (first-seen collision)

auth.User() would silently emit a CALLS edge to the wrong target because ctx.resolve('User', 'app.py') returns models.py:User regardless of the receiverName='auth' qualifier.
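
A reduced model makes the collision concrete. The functions `synthesizeFirstSeen` and `resolveByName` are hypothetical simplifications, assuming bindings are keyed by bare symbol name and the receiver plays no part in the lookup, as described above.

```typescript
// Hypothetical reduction of first-seen synthesis: one flat name -> file map for app.py.
function synthesizeFirstSeen(
  modules: Array<{ file: string; exports: string[] }>,
): Map<string, string> {
  const bindings = new Map<string, string>();
  for (const mod of modules) {
    for (const name of mod.exports) {
      if (!bindings.has(name)) bindings.set(name, mod.file); // first module to export a name wins
    }
  }
  return bindings;
}

// The receiver ('models' or 'auth') is accepted but ignored: lookup is by symbol name only.
function resolveByName(
  bindings: Map<string, string>,
  _receiverName: string,
  name: string,
): string | undefined {
  return bindings.get(name);
}

const appBindings = synthesizeFirstSeen([
  { file: 'models.py', exports: ['User'] },
  { file: 'auth.py', exports: ['User', 'Admin'] },
]);

const modelsUser = resolveByName(appBindings, 'models', 'User'); // 'models.py'
const authUser = resolveByName(appBindings, 'auth', 'User');     // 'models.py', the wrong file
const authAdmin = resolveByName(appBindings, 'auth', 'Admin');   // 'auth.py', no collision
```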

The proper fix requires using receiverName (the module name) to look up which file it maps to via IMPORTS edges, then narrowing candidates to that file. This is a deeper change — but the current PR should at minimum:

  1. Add a negative test asserting that auth.User() (if it existed) would NOT link to models.py:User
  2. Document this limitation in a comment near the synthesis call or the test

This is a known limitation, but silently incorrect resolution is worse than 0 edges.


6. Minor Issues

Test assertion weakness — the key regression test doesn't verify the call source:

// python.test.ts (near end of file)
const userCall = calls.find(c =>
  c.target === 'User' && c.targetFilePath === 'models.py',
  // missing: && c.sourceFilePath === 'app.py'
);

If any other file in the fixture produced a User CALLS edge to models.py, this would pass vacuously.
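
A small sketch of the difference, using a made-up edge list. The `CallEdge` shape is an assumption built from the field names in the snippet above, and the stray edge from auth.py is hypothetical.

```typescript
// Assumed edge shape; only target, targetFilePath and sourceFilePath appear in the review.
interface CallEdge {
  sourceFilePath: string;
  target: string;
  targetFilePath: string;
}

// Hypothetical situation: the only User edge comes from auth.py, not from app.py.
const strayCalls: CallEdge[] = [
  { sourceFilePath: 'auth.py', target: 'User', targetFilePath: 'models.py' },
];

// Weak predicate: satisfied by any edge that ends at models.py:User.
const weakMatch = strayCalls.find(
  c => c.target === 'User' && c.targetFilePath === 'models.py',
);

// Guarded predicate: additionally requires the edge to start in app.py.
const guardedMatch = strayCalls.find(
  c =>
    c.target === 'User' &&
    c.targetFilePath === 'models.py' &&
    c.sourceFilePath === 'app.py',
);
```

Here the weak predicate finds an edge even though app.py produced none, while the guarded one correctly finds nothing.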

Stale comment — pipeline.ts:203 says "// Process files from graph IMPORTS edges (Go package imports)" but this now also applies to Python (and all other WILDCARD_IMPORT_LANGUAGES). Minor, but misleading.


Overall

| Category | Status |
| --- | --- |
| Delivers what the summary claims | ✅ Yes |
| Architectural fit | ✅ Clean |
| Responsibility boundaries | ✅ No violations |
| Pattern consistency | ✅ Follows established conventions |
| Performance | ⚠️ Repeated graph traversal per chunk |
| Test coverage completeness | ⚠️ Gap: same-name cross-module collision untested |
| Test assertion quality | ⚠️ Key assertion missing source file guard |

The PR is merge-ready for the specific bug it targets. The performance concern is a secondary issue that won't cause correctness failures. The coverage gap and assertion weakness are the most actionable items — I'd recommend at minimum adding the total-call-count assertion and the comment documenting first-seen semantics before merging.

Copilot AI left a comment

Pull request overview

This PR addresses Issue #337 where Python repositories could produce 0 CALLS edges for module-qualified constructor calls (e.g., models.User()) by improving import binding synthesis timing and expanding callable-candidate filtering to handle class instantiation.

Changes:

  • Add Python to wildcard-import binding synthesis and run synthesizeWildcardImportBindings() earlier (per chunk) so call resolution can use the synthesized bindings.
  • Update call resolution to retry member calls as constructor calls when member filtering yields zero candidates.
  • Add a Python fixture + integration tests covering module imports and module-qualified constructor calls.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| gitnexus/src/core/ingestion/pipeline.ts | Adds Python to wildcard-binding synthesis and runs synthesis before call resolution (per chunk + final safety pass). |
| gitnexus/src/core/ingestion/call-processor.ts | Adds a fallback to treat some member calls as constructor calls when filtering otherwise yields 0 candidates. |
| gitnexus/test/integration/resolvers/python.test.ts | Adds an integration test suite for Python module-import constructor call resolution. |
| gitnexus/test/fixtures/lang-resolution/python-module-import/models.py | Fixture module exporting User. |
| gitnexus/test/fixtures/lang-resolution/python-module-import/auth.py | Fixture module exporting User and Admin (name collision scenario). |
| gitnexus/test/fixtures/lang-resolution/python-module-import/app.py | Fixture that imports both modules and calls models.User() / auth.Admin(). |


Comment thread gitnexus/src/core/ingestion/pipeline.ts
Comment thread gitnexus/src/core/ingestion/pipeline.ts Outdated
Comment on lines +581 to +585
// Synthesize namedImportMap entries for module-qualified calls like Python's
// `models.User()`. Must run after imports are resolved (importMap is populated)
// but BEFORE call resolution so Tier 2a-named can disambiguate `module.Name()`.
// Idempotent: first-seen semantics prevents double-counting across chunks.
synthesizeWildcardImportBindings(graph, ctx);

Copilot AI Mar 22, 2026

synthesizeWildcardImportBindings() scans all graph nodes to rebuild exportedSymbolsByFile and iterates the entire ctx.importMap every time it runs. Calling it once per chunk in the worker path risks O(chunks × graph_size) work on large repos. Consider caching exported symbols (or maintaining them incrementally) and/or synthesizing only for the importing files affected by the current chunk’s newly-resolved imports, then keep the final pass as a safety net.

Suggested change
- // Synthesize namedImportMap entries for module-qualified calls like Python's
- // `models.User()`. Must run after imports are resolved (importMap is populated)
- // but BEFORE call resolution so Tier 2a-named can disambiguate `module.Name()`.
- // Idempotent: first-seen semantics prevents double-counting across chunks.
- synthesizeWildcardImportBindings(graph, ctx);
+ // NOTE: Per-chunk wildcard-import synthesis has been removed from the worker path
+ // to avoid O(chunks × graph_size) behavior. A final/global synthesis pass should
+ // run after all imports are resolved but before any global call resolution.

Comment thread gitnexus/test/integration/resolvers/python.test.ts
@magyargergo

Collaborator

@ShunsukeHayashi can you please address them? 🙏

@magyargergo

Collaborator

I'll take on this today! This is quite important!

Previously, Python was added to WILDCARD_IMPORT_LANGUAGES which expanded
all exported symbols into namedImportMap using first-seen wins. This caused
`auth.User()` to incorrectly resolve to `models.py:User` when both modules
exported a class named User.

Root cause: Python `import models` is a namespace import, not wildcard
symbol expansion. Expanding all symbols produces ambiguous bindings that
cannot be disambiguated later.

Fix:
- Remove Python from WILDCARD_IMPORT_LANGUAGES
- Add ModuleAliasMap (callerFile → alias → sourceFile) to ResolutionContext
- In synthesizeWildcardImportBindings, build moduleAliasMap for Python
  using the filename stem as the module alias
- In resolveCallTarget, add module-alias disambiguation step: when multiple
  candidates survive filtering and the receiver name matches a module alias,
  narrow candidates to the aliased file

Result: `models.User()` → models.py:User, `auth.User()` → auth.py:User
even when both modules export a class named User.

Adds regression test for the ambiguity case.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@ShunsukeHayashi

Contributor Author

Addressing Copilot review comments

Thanks for the detailed review! Here's what was addressed:

1. Correctness: ambiguous resolution when multiple modules export the same name ✅

Root cause confirmed: Adding Python to WILDCARD_IMPORT_LANGUAGES used first-seen wins — auth.User() incorrectly resolved to models.py:User instead of auth.py:User when both exported a class named User.

Fix (commit f90aabf):

  • Removed Python from WILDCARD_IMPORT_LANGUAGES — Python uses namespace imports (import models), not wildcard symbol expansion. Expanding all symbols into namedImportMap produces ambiguous bindings.
  • Added ModuleAliasMap to ResolutionContext (callerFile → alias → sourceFile). In synthesizeWildcardImportBindings, the Python-specific branch populates this map from the filename stem (e.g. models.py → models).
  • Module-alias disambiguation step in resolveCallTarget: when multiple candidates survive filtering and receiverName matches a module alias, narrow to the aliased file. auth.User() with receiverName='auth' → filters to auth.py:User.
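
A minimal sketch of the alias-based narrowing described in these bullets. The helper names (`moduleAlias`, `buildModuleAliasMap`, `narrowByModuleAlias`), the `Candidate` shape, and the choice to keep all candidates when narrowing would leave none are assumptions for illustration, not the code in commit f90aabf.

```typescript
type ModuleAliasMap = Map<string, Map<string, string>>; // callerFile -> alias -> sourceFile

interface Candidate {
  name: string;
  filePath: string;
}

// Alias is the filename stem, e.g. 'pkg/models.py' -> 'models'.
function moduleAlias(filePath: string): string {
  const base = filePath.split('/').pop() ?? filePath;
  return base.replace(/\.py$/, '');
}

function buildModuleAliasMap(importsByCaller: Map<string, string[]>): ModuleAliasMap {
  const result: ModuleAliasMap = new Map();
  importsByCaller.forEach((sources, callerFile) => {
    const aliases = new Map<string, string>();
    for (const source of sources) aliases.set(moduleAlias(source), source);
    result.set(callerFile, aliases);
  });
  return result;
}

// Narrow only when several candidates survive and the receiver is a known module alias.
function narrowByModuleAlias(
  candidates: Candidate[],
  callerFile: string,
  receiverName: string | undefined,
  aliasMap: ModuleAliasMap,
): Candidate[] {
  if (candidates.length <= 1 || receiverName === undefined) return candidates;
  const aliasedFile = aliasMap.get(callerFile)?.get(receiverName);
  if (aliasedFile === undefined) return candidates;
  const narrowed = candidates.filter(c => c.filePath === aliasedFile);
  return narrowed.length > 0 ? narrowed : candidates; // assumed: never narrow to nothing
}

const aliasMap = buildModuleAliasMap(new Map([['app.py', ['models.py', 'auth.py']]]));
const userCandidates: Candidate[] = [
  { name: 'User', filePath: 'models.py' },
  { name: 'User', filePath: 'auth.py' },
];

const viaAuth = narrowByModuleAlias(userCandidates, 'app.py', 'auth', aliasMap);
const viaModels = narrowByModuleAlias(userCandidates, 'app.py', 'models', aliasMap);
const viaVariable = narrowByModuleAlias(userCandidates, 'app.py', 'u', aliasMap);
```

With the receiver taken into account, auth.User() and models.User() each keep only their own module's class, while a receiver that is not a module alias leaves the candidate list untouched.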

2. Regression test for the ambiguity case ✅

Added to app.py:

au = auth.User()  # both models.py and auth.py export User

New test:

it('resolves auth.User() CALLS edge to auth.py:User (not models.py:User) when both modules export User', () => {
  // confirmed passing
});

3. Performance: synthesizeWildcardImportBindings() per-chunk concern

This was already the case before this PR. Addressed in a separate issue — out of scope for this PR.


All 126 Python resolver tests now pass. The failures in hooks-e2e.test.ts are pre-existing and unrelated to this PR.

… perf guard

- Rewrite Issue #337 test suite: 5 tests → 19 tests with exact node/edge
  counts, sourceFilePath guards, negative tests, method call disambiguation
  (u.save(), a.login(), v.verify()), HAS_METHOD verification, and
  cross-module collision assertions
- Un-skip 2 match/case as-pattern tests (they pass now) and remove leftover
  DEBUG test
- Add per-chunk language guard for synthesizeWildcardImportBindings — skips
  full graph traversal for TS/JS-only chunks, avoiding O(chunks × graph_size)
- Rename fixture method check → verify to avoid BUILT_IN_NAMES noise filter
- Expand fixture with method calls on constructor-inferred receivers
- Fix stale comment referencing only "Go package imports"

Replace Set<number> + flatMap with a simple boolean array indexed by
chunk — cleaner data structure for sequential integer keys.
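
The data-structure change in this last commit might look something like the sketch below. The extension set and helper names are hypothetical stand-ins: the real guard checks languages against WILDCARD_IMPORT_LANGUAGES (and possibly others), which is not reproduced here.

```typescript
// Hypothetical stand-in for "chunk contains a file whose language needs synthesis".
const SYNTHESIS_EXTENSIONS: ReadonlySet<string> = new Set(['.go', '.rb', '.c', '.cpp', '.swift']);

function extensionOf(path: string): string {
  const dot = path.lastIndexOf('.');
  return dot === -1 ? '' : path.slice(dot);
}

// One boolean per chunk, indexed by chunk position, instead of a Set of chunk indices.
function chunkNeedsSynthesis(chunks: string[][]): boolean[] {
  return chunks.map(files => files.some(f => SYNTHESIS_EXTENSIONS.has(extensionOf(f))));
}

const synthesisFlags = chunkNeedsSynthesis([
  ['a.ts', 'b.js'],    // TS/JS only: skip synthesis
  ['main.go', 'c.ts'], // contains Go: run synthesis
  [],                  // empty chunk: skip
]);
```

Since chunk indices are dense and sequential, `synthesisFlags[i]` is a direct lookup and the array is built in a single `map` pass.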
@magyargergo magyargergo merged commit a575508 into abhigyanpatwari:main Mar 23, 2026
6 of 7 checks passed
motolese pushed a commit to motolese/datamoto-gitnexus that referenced this pull request Apr 23, 2026
…n-calls-zero-337

fix(python): resolve module-qualified constructor calls — 0 CALLS edges (Issue abhigyanpatwari#337)
Development

Successfully merging this pull request may close these issues.

Python repos produce 0 CALLS edges — call extraction works but resolveCallTarget fails

3 participants