Cicd/claude actions review #3

Merged
zander-raycraft merged 37 commits into main from CICD/claude-actions-review
Mar 18, 2026

Conversation

@zander-raycraft

Owner

No description provided.

JasonOA888 and others added 30 commits March 8, 2026 15:50
Add configurations for DeepSeek-V3 and DeepSeek-Chat models
via OpenRouter integration.

- DeepSeek-V3: Reasoning model (input: $0.27, output: .10)
- DeepSeek-Chat: Chat model (input: $0.14, output: $0.28)

Fixes abhigyanpatwari#215
…-deepseek-model

feat(models): add DeepSeek model configurations
…er initial repo analysis (npx gitnexus analyze --skills) (abhigyanpatwari#171)

* calm fix 4 adding skills to repo [ISSUE abhigyanpatwari#140]

* inspect

* unit and integration tests

* fixed hardcoded cohesion miss

* e2e tests for --skills flag for language/repo support

* Cohesion test e2e tests
…tructor discrimination (abhigyanpatwari#238)

* feat: add Method Resolution Order (MRO) with language-specific rules

Implement full MRO computation for multi-language inheritance hierarchies:

- HAS_METHOD edges: Class→Method ownership edges emitted during parsing
  (both worker pool and sequential fallback paths)
- Method signatures: extract parameterCount and returnType from AST nodes
- C# heritage fix: distinguish EXTENDS vs IMPLEMENTS for base_list captures
  using symbol table lookup + I[A-Z] naming heuristic fallback
- MRO processor (Phase 4.5): walks inheritance DAG, detects method-name
  collisions across parents, applies language-specific resolution:
  - C++: leftmost base class in declaration order wins
  - C#/Java: class method wins over interface default
  - Python: C3 linearization with cycle detection
  - Rust: no auto-resolution (requires qualified syntax)
  - Default: first definition in BFS order wins
- OVERRIDES edges emitted for resolved method collisions
- KuzuDB schema: Method table extended with parameterCount/returnType;
  dedicated CSV writer and COPY query for 10-column Method rows
- MCP tools: updated Cypher examples for HAS_METHOD, OVERRIDES, diamond

72 tests across 5 test files covering MRO resolution, HAS_METHOD edges,
method signature extraction, C# heritage resolution, and integration
tests across C#/Rust/Python/TS/Java/C++.
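
For readers unfamiliar with the Python rule named above, here is a minimal sketch of C3 linearization with cycle detection. It is not the MRO processor from this commit: the `Hierarchy` shape and the `merge` and `c3` names are assumptions made for illustration only.

```typescript
// Hypothetical input shape: class name -> direct parents in declaration order.
type Hierarchy = Map<string, string[]>;

// C3 merge: repeatedly take the first list head that appears in no other list's tail.
function merge(lists: string[][]): string[] | null {
  const result: string[] = [];
  let pending = lists.filter((l) => l.length > 0).map((l) => [...l]);
  while (pending.length > 0) {
    const head = pending
      .map((l) => l[0])
      .find((h) => pending.every((l) => !l.slice(1).includes(h)));
    if (head === undefined) return null; // no valid head: inconsistent hierarchy
    result.push(head);
    pending = pending
      .map((l) => (l[0] === head ? l.slice(1) : l))
      .filter((l) => l.length > 0);
  }
  return result;
}

// Returns the linearization, or null on a cycle or an inconsistent parent order.
function c3(
  name: string,
  h: Hierarchy,
  visiting: Set<string> = new Set<string>(),
): string[] | null {
  if (visiting.has(name)) return null; // cycle detected
  visiting.add(name);
  const parents = h.get(name) ?? [];
  const parentLins: string[][] = [];
  for (const p of parents) {
    const lin = c3(p, h, visiting);
    if (lin === null) return null;
    parentLins.push(lin);
  }
  visiting.delete(name);
  const merged = merge([...parentLins, [...parents]]);
  return merged === null ? null : [name, ...merged];
}
```

For a diamond where `D` extends `B` and `C`, and both extend `A`, `c3("D", h)` yields `D, B, C, A`; a method-name collision would then resolve to the first class in that order that defines the method.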

* feat: add scope-based symbol resolution replacing raw lookupFuzzy

Introduces a shared 3-tier resolveSymbol function used by both
heritage-processor and call-processor:
1. Same-file (lookupExactFull — authoritative)
2. Import-scoped (filtered by ImportMap — high confidence)
3. Global fuzzy (first match — low confidence fallback)

Adds lookupExactFull to SymbolTable returning full SymbolDefinition
with type info needed for heritage Class/Interface disambiguation.

* refactor: tighten symbol resolution — Tier 3 refuses ambiguous matches

- lookupExactFull now O(1) via direct SymbolDefinition storage in fileIndex
  (shared object references with globalIndex — zero additional memory)
- Added resolveSymbolInternal() preserving { definition, tier, candidateCount }
  for test assertions and logging
- Tier 3 now returns null when multiple global candidates exist instead of
  arbitrary allDefs[0] — a wrong edge is worse than no edge
- call-processor: renamed fuzzy-global → unique-global, removed dead branch
- 12 new tests: tier assertions, ambiguous refusal per language family,
  heritage false-positive guard, O(1) shared reference verification
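
The three tiers, including the Tier 3 refusal on ambiguity, can be sketched roughly as follows. The `SymbolDef` shape, the flat `defs` array, and the `resolveSymbolSketch` name are assumptions for illustration; the real `SymbolTable` and `ImportMap` types are not shown in this log.

```typescript
// Hypothetical shapes; not the project's actual SymbolDefinition or ImportMap.
interface SymbolDef {
  name: string;
  filePath: string;
  kind: string;
}
type Tier = "same-file" | "import-scoped" | "unique-global";

function resolveSymbolSketch(
  name: string,
  fromFile: string,
  defs: SymbolDef[],
  importMap: Map<string, Set<string>>, // file -> files it imports
): { definition: SymbolDef; tier: Tier } | null {
  const candidates = defs.filter((d) => d.name === name);

  // Tier 1: a definition in the same file is authoritative.
  const local = candidates.find((d) => d.filePath === fromFile);
  if (local) return { definition: local, tier: "same-file" };

  // Tier 2: restrict to files the caller imports.
  const imported = importMap.get(fromFile) ?? new Set<string>();
  const scoped = candidates.find((d) => imported.has(d.filePath));
  if (scoped) return { definition: scoped, tier: "import-scoped" };

  // Tier 3: accept a global match only when it is unique; otherwise refuse.
  return candidates.length === 1
    ? { definition: candidates[0], tier: "unique-global" }
    : null;
}
```

Returning `null` for two or more global candidates mirrors the stated design choice that a wrong edge is worse than no edge.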

* fix: critical language support bugs in import resolution and MRO

Phase 5 critical fixes from all-language analysis:
- Python: add relative_import query capture (PEP 328) — `.models`, `..utils`
  were silently dropped, producing zero ImportMap entries
- Rust: extract prefix from grouped imports `crate::module::{A, B}` — brace
  groups previously failed resolution entirely
- Swift: use normalizedFileList for Windows path compatibility in module
  import resolution (matches Go's resolveGoPackage pattern)
- MRO: fix c_sharp → csharp language name mismatch (enum is 'csharp'),
  add Kotlin to C#/Java resolution rules (class method wins over interface)

* feat: add strict multi-language integration tests + fix C/C++ import resolution

Add 32 integration tests across 6 language fixtures (TypeScript, C#, C++,
Java, Python, Rust) with exact toBe/toEqual assertions validating heritage
edges, import resolution, and trait implementations.

Fix C/C++ import resolution bug where dot-to-slash conversion mangled
include paths (e.g. "animal.h" became "animal/h"). Now skips conversion
for C/C++ languages which use actual file paths in #include directives.
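
A minimal sketch of the kind of guard this fix describes, assuming a hypothetical `importPathToFilePath` helper and plain string language names (neither is taken from the codebase):

```typescript
// Hypothetical helper: converts an import specifier into a candidate file path.
function importPathToFilePath(importPath: string, language: string): string {
  // C and C++ #include directives already name real files, so dots stay as-is.
  if (language === "c" || language === "cpp") return importPath;
  // Module-style languages use dots as package separators.
  return importPath.replace(/\./g, "/");
}
```

With this guard, `animal.h` is left untouched for C++, while a dotted module path such as `com.example.User` still becomes `com/example/User`.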

* fix: language-gate heritage heuristic, add Swift extension heritage, handle Rust grouped imports

- Gate I[A-Z] naming heuristic to C#/Java only (was firing for all languages)
- Swift unresolved types default to IMPLEMENTS (protocol conformance is the norm)
- Add tree-sitter query for Swift extension protocol conformance (extension Foo: Protocol)
- Handle Rust top-level grouped imports (use {crate::a, crate::b}) in both import loops
- Add 4 new heritage-processor tests (TypeScript refusal, Swift default, Swift Tier 1)

* feat: add Go struct embedding heritage + PackageMap optimization

Add Go struct embedding detection (anonymous fields → EXTENDS edges) via
new tree-sitter heritage query with named-field filtering in both
parse-worker and heritage-processor paths.

Implement PackageMap optimization for Go cross-package resolution:
replace O(N) file-level ImportMap expansion with directory-level suffix
matching (Tier 2b in symbol resolver). Graph IMPORTS edges are preserved
via addImportGraphEdge split.

Remove overly broad @definition.type from GO_QUERIES that was
double-matching structs/interfaces as TypeAlias nodes, breaking Tier 3
unique-global resolution.

Add Go fixture (go-pkg) with Admin→User embedding, cross-package calls,
and 7 integration tests covering structs, functions, imports, calls,
and heritage edges.

* test: add Kotlin heritage integration tests

Adds a kotlin-heritage fixture and 7 integration tests validating
class inheritance, interface implementation, JVM-style import
resolution, and symbol-table-driven EXTENDS/IMPLEMENTS disambiguation
via Kotlin delegation specifiers.

* feat: extract resolvers, add PHP tests, ambiguous tests for all languages

- Extract language-specific resolvers from import-processor.ts into
  resolvers/ directory (P7): jvm, go, csharp, php, rust, standard, utils
- import-processor.ts reduced from 1412 to 711 lines (50% reduction)
- Add comprehensive PHP integration tests: PSR-4 imports, traits, enums,
  heritage edges, method calls, MRO overrides
- Add ambiguous symbol resolution tests for all 9 languages verifying
  correct disambiguation via import chains
- Split monolithic lang-resolution.test.ts (1080 lines) into 9 per-language
  files under test/integration/resolvers/ with shared helpers

* feat: update integration tests to include resolver tests for multiple languages

* fix: address code review — schema gap, Rust impl name, Property OVERRIDES

Bugs fixed:
- Add 13 missing FROM/TO pairs in RELATION_SCHEMA for HAS_METHOD edges
  (Class/Interface/Struct/Trait/Impl/Record to Method/Constructor/Property)
- Fix findEnclosingClassId to pick implementing type for Rust
  impl Trait for Struct blocks (was picking trait name)
- Exclude Property nodes from MRO OVERRIDES collision detection
- Change MRO language fallback from typescript to unknown

Tests added:
- Unit: Property OVERRIDES exclusion (2 tests), Rust impl Trait for
  Struct name resolution (2 tests), schema HAS_METHOD pair coverage
- Integration: no OVERRIDES targets Property nodes across all 9 languages
- PHP fixture: added shared $status property to both traits to create
  real collision scenario for Property OVERRIDES exclusion test

Documentation:
- OVERRIDES edge direction (Class to Method), Go return type gap,
  BFS first-reach heuristic limitation

* feat: harden CALLS-edge resolution — Phase 0 validation

- Fix same-file confidence (0.85 → 0.95) to correctly outrank import-scoped (0.9)
- Fix Tier 1 overload preservation: use globalIndex filter instead of fileIndex lookup
- Add callable-kind guard: refuse CALLS edges to Interface and Enum symbols
- Fix Kotlin countCallArguments: handle call_suffix → value_arguments nesting
- Fix Kotlin extractFunctionName: add simple_identifier to fallback search
- Strictly type findParameterList and countCallArguments (remove all `any`)
- Add arity-based call resolution integration tests for 9 languages
- Add unit regression tests for Interface/Enum CALLS refusal

* chore: remove C# build artifacts from fixtures

* feat: add call-form discrimination and ownerId to symbol table (Phase 1)

Add inferCallForm() and extractReceiverName() to distinguish free/member/constructor
calls at the AST level across all 9 languages. Add ownerId field to SymbolDefinition
linking Method/Constructor/Property to their owning class. Includes 36 unit tests
and member-call integration tests for all 9 languages (132 tests, 0 failures).

* feat: constructor/struct-literal resolution across all languages (Phase 2)

Add constructor discrimination to CALLS-edge resolution: new Foo(),
User{...} struct literals, and C# primary constructors now resolve to
Constructor/Class/Struct/Record nodes instead of being filtered out.

Queries: new_expression (C++), object_creation_expression (PHP),
composite_literal (Go), struct_expression (Rust), primary constructor
and implicit_object_creation_expression (C#).

Relaxes global tier in collectTieredCandidates to pass all candidates
through filterCallableCandidates, allowing kind/arity narrowing to
disambiguate at lower confidence.

* feat: receiver-constrained resolution with integration tests for all 9 languages

Add receiver-type filtering (Phase 3): when a member call like `user.save()`
has a known receiver type from TypeEnv, filter candidates by ownerId to
disambiguate methods with the same name across different classes.

Key changes:
- call-processor: build per-file TypeEnv, pass receiverTypeName to resolveCallTarget
- parse-worker: extract receiverTypeName from TypeEnv in worker thread
- resolveCallTarget: new step D filters by ownerId matching receiver type
- utils: extractReceiverName supports C++ field_expression (argument field)
- utils: findEnclosingClassId extracts Go method receiver types
- type-env: handle Go qualified_type, Kotlin user_type/variable_declaration
- parse-worker + parsing-processor: Function added to needsOwner for
  Kotlin/Rust/Python class methods captured as Function nodes

Integration tests added for receiver-constrained resolution across all 9
languages: TypeScript, Java, Python, Go, Rust, C++, C#, Kotlin, PHP.
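
The ownerId filter can be pictured with the sketch below. The `Candidate` shape and the `filterByReceiver` name are hypothetical, and the fallback to the unfiltered list when nothing matches is an assumption of this sketch, not a documented behaviour of `resolveCallTarget`.

```typescript
// Hypothetical candidate shape; only ownerId is taken from the commit text.
interface Candidate {
  id: string;
  name: string;
  ownerId?: string;
}

function filterByReceiver(
  candidates: Candidate[],
  receiverTypeIds: Set<string>, // node ids the receiver's type resolved to
): Candidate[] {
  // No known receiver type: nothing to narrow by.
  if (receiverTypeIds.size === 0) return candidates;
  const owned = candidates.filter(
    (c) => c.ownerId !== undefined && receiverTypeIds.has(c.ownerId),
  );
  // Assumed fallback: keep the original list if the receiver type owns none of them.
  return owned.length > 0 ? owned : candidates;
}
```

Given two `save` methods owned by `User` and `Repo`, a call on a receiver typed `User` narrows to the single `User`-owned candidate.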

* feat: NamedImportMap, scoped TypeEnv, broadened signatures + TS rest-param variadic fix

Address all 4 PR abhigyanpatwari#238 review items:
1. Remove redundant lookupFuzzy in processRoutesFromExtracted
2. Add NamedImportMap for TS/Python symbol-level import tracking (Tier 2a)
3. Make TypeEnv scope-aware (Map<scopeKey, Map<varName, type>>) to fix
   non-deterministic receiver resolution across functions
4. Broaden extractMethodSignature: Go/Rust/C++ return types, variadic
   detection for Go/Java/Python/C++/Kotlin/TypeScript rest params

Discovered and fixed: TS rest params (...args) were not detected as
variadic — added rest_pattern detection inside required_parameter nodes.

Integration tests added: scoped receiver, named import disambiguation,
and variadic call resolution for both TypeScript and Python.

* fix: alias import resolution, Go multi-assign TypeEnv, dead code removal

- NamedImportMap now stores {sourcePath, exportedName} so aliased imports
  (import { User as U }) resolve U → User in the source file
- Named binding check moved before empty-allDefs early return in both
  call-processor and symbol-resolver, fixing constructor calls via aliases
- Go extractFromGoShortVarDeclaration iterates all LHS/RHS pairs for
  multi-assignment (user, repo := User{}, Repo{}) instead of only first
- Remove unused TYPED_DECLARATION_TYPES set (TYPED_PARAMETER_TYPES kept)
- Integration tests for both fixes (go-multi-assign, typescript-alias-imports)

* feat: alias import extraction for Kotlin, Rust, PHP, C# + integration tests

Add named import alias extraction to both pipeline paths
(import-processor.ts and parse-worker.ts) for Kotlin, Rust, PHP,
and C#. Add integration test fixtures and tests for all 5 languages
(Python alias extraction already worked, just needed the test).

Each test verifies: class detection, member call resolution through
aliases to correct target files, and IMPORTS edge emission.

* refactor: use SupportedLanguages enum everywhere instead of raw strings

Replace all raw language string literals and `language: string` types
with the SupportedLanguages enum across 10 files. This ensures
compile-time safety for language dispatch and eliminates dead
`language === 'tsx'` checks (tsx maps to TypeScript in the enum).

* fix: tier-ordering bug, re-export chains, PHP grouped imports, Java named imports

- Fix collectTieredCandidates tier-ordering: same-file now checked before
  named bindings, preventing imports from shadowing local definitions
  (matches resolveSymbolInternal priority order)
- Add re-export chain resolution for TypeScript/JavaScript barrel files:
  export { X } from './base' and export type { X } from './base' now
  followed up to 5 hops through NamedImportMap
- Fix PHP grouped import alias extraction: use App\Models\{User, Repo as R}
  now correctly handled in both parse-worker and import-processor
- Add Java NamedImportMap support: import com.example.models.User now
  records User as a named binding for precise disambiguation
- Add 16 new integration tests across TypeScript, PHP, and Java resolvers
  (220 total resolver tests, all passing)
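
As a rough illustration of following aliased imports and barrel re-exports with a hop limit, consider the sketch below. The `Binding` fields match the `{sourcePath, exportedName}` pair mentioned earlier in this log; the nested-map layout and the `followBindings` name are assumptions.

```typescript
interface Binding {
  sourcePath: string;
  exportedName: string;
}
// Assumed layout: file -> local name -> where that name comes from.
type NamedImportMapSketch = Map<string, Map<string, Binding>>;

function followBindings(
  file: string,
  name: string,
  map: NamedImportMapSketch,
  maxHops = 5,
): { file: string; name: string } {
  let current = { file, name };
  // Each hop moves from an importing (or re-exporting) file to its source.
  for (let hop = 0; hop < maxHops; hop++) {
    const next = map.get(current.file)?.get(current.name);
    if (!next) break; // reached a file that does not re-bind the name
    current = { file: next.sourcePath, name: next.exportedName };
  }
  return current;
}
```

The hop limit also bounds the walk when two barrel files re-export from each other, so a cyclic chain cannot loop forever.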

* refactor: consolidate alias extraction + add variadic/constructor/shadow integration tests

- Extract shared named-binding-extraction.ts from duplicate logic in
  import-processor.ts and parse-worker.ts (net -200 lines)
- Deduplicate appendKotlinWildcard (now imported from resolvers/index.ts)
- Add integration tests: constructor calls (Kotlin, Python), variadic
  resolution (Go, Java, C#, C++, Kotlin), re-export chains (Python),
  local definition shadowing (Python, Go)
- Add TODO(stack-graph) for TypeEnv scope key collision
- 225 integration tests passing (was 223)

* fix: PHP non-aliased imports, Python node identity, re-export chain dedup + local-shadow tests

- PHP flat non-aliased imports (use App\Models\User) now stored in NamedImportMap
- PHP grouped non-aliased imports ({User} in {User, Repo as R}) now stored in NamedImportMap
- Python: replace non-public child.id with child.startIndex for node identity
- Extract shared walkBindingChain() from symbol-resolver and call-processor
- Add PHP variadic resolution fixture + test (variadic_parameter already covers PHP)
- Add local-shadow integration tests for Java, C#, Kotlin, Rust, PHP, C++ (6 languages)

* feat: Rust non-aliased use bindings, Kotlin non-aliased imports, re-export chain resolution

Extend NamedImportMap coverage for Rust and Kotlin non-aliased imports:

- Rust: rename collectUseAsClauses → collectRustBindings, extract terminal
  scoped_identifier (use crate::models::User) and identifier in use_list
  (use crate::models::{User, Repo}) into NamedImportMap. This also enables
  pub use re-export chain following via walkBindingChain.
- Kotlin: extend extractKotlinNamedBindings to handle non-aliased imports
  (import com.example.User), skipping wildcard imports.
- Add rust-reexport-chain fixture + 3 integration tests verifying Handler{}
  resolves through mod.rs pub use to handler.rs.
- Add Kotlin heritage + constructor-calls reason assertions for non-aliased
  import-resolved resolution.
- Add C# heritage test documenting namespace import tier behavior.

* fix: skip Kotlin lowercase member imports in NamedImportMap

Member imports like `import util.OneArg.writeAudit` (lowercase last
segment) must not populate NamedImportMap — same-named function imports
from different classes collide, breaking arity-based disambiguation.
Apply the same guard Java already uses: skip lowercase last segments.
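
The guard amounts to a check on the last dotted segment. A minimal sketch, with `isTypeLikeImport` as a hypothetical name:

```typescript
// True when the last segment of a dotted import path starts with an uppercase
// letter, which by JVM naming convention suggests a class rather than a member.
function isTypeLikeImport(importPath: string): boolean {
  const last = importPath.split(".").pop() ?? "";
  return /^[A-Z]/.test(last);
}
```

Under this check, `util.OneArg.writeAudit` is skipped and `com.example.User` is recorded.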

* fix: skip spurious path-prefix bindings in Rust grouped imports

collectRustBindings was extracting the path segment (e.g. "models") from
`use crate::models::{User, Repo}` as a spurious NamedImportMap entry.
Skip scoped_identifier nodes that are direct children of scoped_use_list
since they are path prefixes, not importable symbols.

Adds rust-grouped-imports fixture and 4 integration tests verifying both
symbols resolve correctly and no spurious binding leaks through.

* fix: use startIndex in TypeEnv scope key to prevent same-name method collision

Two methods named identically in different classes within the same file
previously shared a scope key, causing non-deterministic type resolution.
Now keys use funcName@startIndex for uniqueness.

Also adds tests documenting destructuring assignment extraction gap.
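
A small sketch of why the `funcName@startIndex` key avoids the collision. The `TypeEnvSketch` layout follows the `Map<scopeKey, Map<varName, type>>` shape mentioned earlier in this log; the `scopeKey` and `bind` helpers are hypothetical.

```typescript
// scope key -> variable name -> type name
type TypeEnvSketch = Map<string, Map<string, string>>;

// Two functions with the same name get distinct keys because their
// start offsets in the source file differ.
function scopeKey(funcName: string, startIndex: number): string {
  return `${funcName}@${startIndex}`;
}

function bind(
  env: TypeEnvSketch,
  scope: string,
  varName: string,
  typeName: string,
): void {
  let vars = env.get(scope);
  if (!vars) {
    vars = new Map<string, string>();
    env.set(scope, vars);
  }
  vars.set(varName, typeName);
}
```

If `User.save` starts at offset 40 and `Repo.save` at offset 200, a variable `x` bound in each lands under `save@40` and `save@200` respectively instead of overwriting a single `save` entry.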

* test: document C# namespace-level import limitation in named binding extraction

* test: document same-arity overload discrimination limitation in call processor

* perf: parallelize calls/heritage/routes processing in worker path

Worker path now runs processCallsFromExtracted, processHeritageFromExtracted,
and processRoutesFromExtracted via Promise.all instead of sequentially.
Safe because all three only read shared state and write via addRelationship's
dedup guard. Sequential fallback path stays sequential (shared LRU astCache).

Also fixes Rust collectRustBindings spurious path-prefix bindings for 3+ level
grouped imports, and adds @param JSDoc for walkBindingChain's allDefs invariant.

* docs: improve Promise.all safety comment and walkBindingChain JSDoc

Clarify that the parallelization safety comes from disjoint relationship
types + idempotent id-keyed Maps, not from lack of shared state (the
graph is shared). Strengthen allDefs JSDoc to describe silent-miss
consequence of passing pre-filtered results.

* refactor: extract language-specific processing into modular dispatch tables

Phase 1: Extract type binding logic from type-env.ts (635→125 LOC) into
type-extractors/ directory with per-language files and Record<SupportedLanguages,
LanguageTypeConfig> + satisfies dispatch.

Phase 2: Extract 5 config loaders from import-processor.ts into
language-config.ts (removed ~196 LOC of inline loaders).

Phase 3: Convert export-detection.ts switch/case to exhaustive
Record<SupportedLanguages, ExportChecker> + satisfies dispatch table,
fix node: any → SyntaxNode.

Also adds language feature matrix to README.

All 1146 unit tests and 433 integration tests pass.
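
The `Record<...> + satisfies` pattern can be sketched as below. This uses a string-literal union instead of the project's `SupportedLanguages` enum, covers only three languages, and invents the `SymbolInfo` shape and the per-language rules for illustration; it requires TypeScript 4.9 or later for `satisfies`.

```typescript
// Stand-in for the SupportedLanguages enum (assumption: reduced to three members).
type LangSketch = "typescript" | "python" | "go";

interface SymbolInfo {
  name: string;
  hasExportKeyword: boolean;
}
type ExportCheckerSketch = (sym: SymbolInfo) => boolean;

// `satisfies` makes the compiler reject the table if any language key is
// missing, while keeping the precise type of each entry.
const exportCheckers = {
  typescript: (sym) => sym.hasExportKeyword,
  python: (sym) => !sym.name.startsWith("_"), // leading underscore is private by convention
  go: (sym) => /^[A-Z]/.test(sym.name), // capitalized identifiers are exported in Go
} satisfies Record<LangSketch, ExportCheckerSketch>;

function isExported(lang: LangSketch, sym: SymbolInfo): boolean {
  return exportCheckers[lang](sym);
}
```

Compared with a `switch`, adding a new member to the language type turns every incomplete table into a compile error rather than a silent fall-through.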

* refactor: extract type binding logic into type-extractors/ directory (Phase 1)

Extract per-language type extraction from type-env.ts (635→125 LOC) into
type-extractors/ with Record<SupportedLanguages, LanguageTypeConfig> + satisfies
dispatch. 9 per-language files, shared helpers, and barrel index.

* refactor: extract config loaders to language-config.ts (Phase 2)

Move 5 language-specific config loaders and their type interfaces from
import-processor.ts into standalone language-config.ts module.
…ch table (abhigyanpatwari#278)

* fix(ruby): method-level call resolution, HAS_METHOD edges, and dispatch table refactoring

- Replace all `if (language === Ruby)` checks in processors with a
  `callRouters` dispatch table in call-routing.ts (renamed from
  ruby-call-routing.ts to preserve git history)
- Add Ruby `method` and `singleton_method` to FUNCTION_NODE_TYPES so
  findEnclosingFunction produces Method-level CALLS sources
- Add Ruby `class` and `module` to CLASS_CONTAINER_TYPES for HAS_METHOD
  edge generation
- Add bare call capture via tree-sitter query `(body_statement (identifier))`
  for Ruby methods called without parentheses
- Add Ruby member call detection (`call` node with `receiver` field) to
  inferCallForm and extractReceiverName
- Wire resolveRubyImport into resolveLanguageImport
- Add 24 integration tests across 5 suites: heritage/properties, arity
  filtering, member calls, ambiguous disambiguation, local shadow
- Add ruby.test.ts to CI integration workflow

* fix(ruby): resolve 6 Ruby resolution gaps from PR review

- Fix singleton_method label mismatch: @definition.function → @definition.method
  so CALLS edges from `def self.foo` bodies get correct sourceId
- Add ownerId and HAS_METHOD edges to attr_* Property nodes by calling
  findEnclosingClassId in both parse-worker and call-processor property branches
- Distinguish include/extend/prepend heritage: add heritageKind to
  RubyHeritageItem, propagate through heritage pipeline as IMPLEMENTS reason
- Document bare call over-capture limitation in tree-sitter query comment
- Add bare `require` (non-relative) import test coverage
- Add prepend/extend test coverage with distinct Loggable/Cacheable modules

31 Ruby integration tests passing, no regressions in other language resolvers.

* fix(ruby): web package parity — heritage reasons, property HAS_METHOD, singleton_method label

- Web call-processor: use item.heritageKind as IMPLEMENTS reason instead of
  hardcoded 'trait-impl', add :${kind} suffix to edge ID for uniqueness
- Web call-processor: port findEnclosingClassId, add HAS_METHOD edges for
  attr_* Property nodes to match CLI fix
- Web call-processor: singleton_method label 'Function' → 'Method' to match
  CLI tree-sitter query fix
- CLI parse-worker: update stale ExtractedHeritage.kind JSDoc to include
  'include' | 'extend' | 'prepend'

* fix(web): add HAS_METHOD to RelationshipType union

Web package was missing HAS_METHOD in the RelationshipType union,
causing a type mismatch with the HAS_METHOD edges emitted by the
attr_* property fix in call-processor.ts.
… resolution (abhigyanpatwari#274)

* feat(type-env): constructor-call type inference for TypeEnv (Phase 1)

Add extractInitializer as a Tier 1 fallback in buildTypeEnv: when a
declaration node has no explicit type annotation, infer the type from
constructor-call patterns (new X(), X::new(), X::default(), $x = new X()).

Languages covered: TypeScript/JS, Java (var), Rust, PHP, C++ (auto).
Python/Kotlin/Swift deferred — need symbol-table access to distinguish
class constructors from function calls.

Adds 20 new unit tests covering constructor inference, annotation
precedence, and known limitations across all supported languages.

* fix(type-env): class-aware constructor resolution, multi-declarator fix

- Add collectClassNames pre-scan: walks AST to build Set<string> of
  class/struct names defined in the file
- C++ extractInitializer uses classNames.has() to verify identifier is
  a known class before inferring (auto x = User() resolves, auto x =
  getUser() does not — no false positives)
- Add InitializerExtractor type that receives classNames parameter
- Fix env.size gating: always call extractInitializer when available,
  so mixed declarators like const a: A = x, b = new B() resolve both
- Add env.has() guard in Java extractInitializer to skip already-bound vars
- Document Rust new/default whitelist rationale
- Pin all test assertions, add mixed multi-declarator test case

* fix(type-env): resolve Self/self/static/parent to actual type names

- Rust: Self::new()/Self::default() resolves to enclosing impl type
- PHP: new self()/static() resolves to enclosing class, parent() to superclass
- Rust: Tier 0 annotation guard prevents overwrite by constructor inference
- Rust: mut_pattern handling in extractVarName for let mut bindings
- TS: fix misleading comment in extractInitializer
- 58 tests passing (3 new Self/self resolution tests)

* perf(type-env): single-pass AST walk with closure-scoped state

Refactors buildTypeEnv to use closures instead of passing mutable state
as parameters. classNames, env, and config are captured by the inner
walk and extractTypeBinding functions — no parameter mutation.

- Eliminates separate collectClassNames pre-scan (O(2n) → O(n))
- config looked up once per file instead of per-node
- 29 fewer lines

* feat(type-env): constructor-inferred type resolution for all languages

Add cross-file constructor type inference to the ingestion pipeline,
enabling receiver-type disambiguation for member calls like
`user.save()` when the variable is assigned from a constructor without
explicit type annotations.

Pipeline changes:
- Add extractInitializer to Python and Swift type extractors
- Add CONSTRUCTOR_BINDING_SCANNERS for Python, Swift, C/C++ in type-env
- Wire constructorBindings through parse-worker → parsing-processor →
  pipeline → processCallsFromExtracted
- Rewrite resolveCallTarget receiver-type filtering (step D) to use
  tiered import resolution (same-file → import-scoped → global) before
  falling back to fuzzy ownerId matching
- Use collectTieredCandidates for constructor binding verification
  instead of raw lookupFuzzy

Bug fixes:
- Fix C++ inline method query: @definition.method was captured on
  field_declaration_list instead of function_definition, causing wrong
  parameterCount for all inline class methods
- Fix parse-worker accumulated/flush results missing constructorBindings

CI changes:
- Add swift.test.ts to ci-integration pipeline group and coverage job
- Update ci-report to fetch base branch (main) coverage for delta
  reporting instead of showing config thresholds
- Add per-suite timing breakdown table (unit/integration/total)
- Add expandable skipped test details section

Tests: 288 passed, 4 skipped (swift — macOS only) across 10 languages
- 36 new constructor-inferred integration tests (4 per language)
- 10 fixture directories with cross-file constructor patterns
- TypeScript, JavaScript, Java, Kotlin, Python, PHP, Rust, Go, C++, Swift

* fix(type-extractors): add type assertion for LanguageTypeConfig

* feat(ruby): constructor-inferred type resolution and self-receiver mapping

Add Ruby User.new constructor binding scanner to type-env, enabling
receiver-type disambiguation for member calls like user.save vs repo.save.
Add self/this → enclosing class resolution in lookupTypeEnv so self.method()
calls resolve to the correct class even when the method name is ambiguous.

* docs: update README with constructor inference and self/this resolution details

* refactor(ingestion): unified ResolutionContext replaces fragmented map passing

Introduce createResolutionContext() as the single resolution API for all
processors. Eliminates duplicated tier-selection logic, fixes heritage
namedImportMap bug, and adds per-file resolution caching.

- NEW resolution-context.ts: closure-factory with resolve(), per-file cache,
  TIER_CONFIDENCE constant, and shared ResolutionTier type
- DELETE symbol-resolver.ts: zero production importers, logic now in
  resolution-context.ts
- call-processor: all functions take ctx instead of 6 separate maps,
  collectTieredCandidates removed (ctx.resolve replaces it),
  D4 redundant re-resolve eliminated
- heritage-processor: takes ctx, resolveHeritageId helper extracts
  repeated 14-line fallback pattern, namedImportMap now included
- import-processor: takes ctx, dead createImportMap/createPackageMap/
  createNamedImportMap factories removed
- pipeline: creates single ctx, wires onProgress to all processors,
  logs cache hit rate in dev mode
- Tier renamed: unique-global → global (honest about returning all candidates)
- Tests migrated: 1178 unit + 84 integration passing

* feat(type-env): self/this/super resolution, TypeEnvironment API, and review fixes

Add cross-language receiver keyword resolution:
- self/this/$this → enclosing class name via AST walk
- super/base/parent → parent class name via heritage AST extraction
  (8 grammar variants: TS/JS, Java, Python, Ruby, C#, PHP, Kotlin, C++, Swift)
- D-phase widening in resolveCallTarget for super→parent method dispatch

Introduce TypeEnvironment API replacing loose TypeEnvResult + lookupTypeEnv:
- buildTypeEnv() returns TypeEnvironment with .lookup() method
- Single-pass AST walk merges constructor binding scan (was separate traversal)
- ClassNameLookup type replaces over-broad ReadonlySet<string> facade
- Memoized class name lookups to avoid redundant SymbolTable scans

Code review fixes (6 agents, 11 findings):
- Replace ctx.resolve(name, '') hack with direct symbols.lookupFuzzy()
- Extract scope key helpers (extractFuncNameFromScope, receiverKey)
- Simplify D-phase from 5 steps to 4 with deduped typeNodeIds
- Remove C from CONSTRUCTOR_BINDING_SCANNERS (YAGNI — C has no constructors)
- Cache Map reuse in ResolutionContext to reduce GC pressure
- Remove unused TieredCandidates import

Integration tests for self/this, parent, and super resolution across all
12 supported languages with per-language fixture directories.

* fix(type-env): generic parent resolution, TS cast inference, C++ brace-init

Fix generic parent class breaking super resolution:
- extractParentClassFromNode now uses extractSimpleTypeName to strip
  generic params (Base<T> → Base) and qualified names (models.Model → Model)
- Affects TS, Java, Python, C# heritage extraction

Fix TypeScript new X() as T / new X()! missed inference:
- Unwrap as_expression and non_null_expression before checking for
  new_expression in extractInitializer

Fix C++ brace-init User{} missed inference:
- Handle compound_literal_expression with type_identifier child
  in extractInitializer

Clean up deprecated lookupTypeEnv:
- Remove standalone lookupTypeEnv export, migrate all callers to
  TypeEnvironment.lookup() method
- Update all 80+ test assertions to use the new API

Integration test fixtures added:
- typescript-cast-constructor-inference (new X() as T, new X()!)
- typescript/java/csharp/kotlin-generic-parent-resolution
- cpp-brace-init-inference (auto x = User{})

* fix(type-extractors): Go &User{}, TS double-cast, Swift .init inference

Fix Go pointer-to-struct literal not inferred:
- Unwrap unary_expression (address-of &) before composite_literal check
- user := &User{} now correctly infers type User

Fix TypeScript double-cast only unwrapping one level:
- Change if to while loop for nested as_expression/non_null_expression
- new User() as unknown as Admin now correctly infers type User

Fix Swift User.init(name:) explicit init call missed:
- Handle navigation_expression callee with .init suffix in extractInitializer

Integration test fixtures:
- go-pointer-constructor-inference (&User{}, &Repo{})
- typescript-double-cast-inference (as unknown as T)
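
The `if` to `while` change for nested casts can be illustrated on a simplified node type. `NodeSketch` is a hypothetical stand-in for a tree-sitter syntax node, and the sketch assumes the wrapped expression is the first child of each wrapper.

```typescript
// Hypothetical, simplified stand-in for a tree-sitter SyntaxNode.
interface NodeSketch {
  type: string;
  children: NodeSketch[];
}

// Peel every enclosing cast or non-null wrapper, not just the outermost one.
function unwrapCasts(node: NodeSketch): NodeSketch {
  let current = node;
  while (
    (current.type === "as_expression" || current.type === "non_null_expression") &&
    current.children.length > 0
  ) {
    current = current.children[0]; // assumed: the wrapped expression comes first
  }
  return current;
}
```

For `new User() as unknown as Admin`, the outer and inner `as_expression` nodes are both removed, leaving the `new_expression` from which `User` can be inferred.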

* feat: Rust struct literal, Python qualified ctor, Go new(), Swift .init scanner

- Rust: handle struct_expression in extractInitializer (User { name: "alice" })
- Python: support attribute nodes in extractInitializer (models.User("alice"))
  and the cross-file scanner — extractSimpleTypeName handles qualified names
- Go: handle new(User) built-in in extractGoShortVarDeclaration
- Swift: extend CONSTRUCTOR_BINDING_SCANNERS to handle navigation_expression
  callee for User.init(name:) cross-file resolution

Unit tests: 87 → 96 (Rust struct literal, Go new(), Python qualified ctor,
Python scanner qualified, plus edge cases)
Integration tests: 4 new describe blocks with fixtures

* fix: Rust Self{} resolution, C++ scoped brace-init, PHP promotion params, Ruby constants

- Rust: resolve Self {} struct literal to enclosing impl type (was stored as "Self")
- C++: replace type_identifier guard with extractSimpleTypeName for compound_literal_expression,
  enabling ns::User{} scoped brace-init (closes previously deferred gap)
- PHP: add property_promotion_parameter to TYPED_PARAMETER_TYPES for PHP 8.0+
  constructor property promotion (__construct(private Foo $x))
- Ruby: extend extractRubyConstructorBinding to accept constant left-hand side
  (REPO = Repo.new)

Unit tests: 96 → 101 (+5: Rust Self{} ×2, C++ ns::User{} ×1, PHP promotion ×1,
Ruby constant ×1)
Integration tests: 4 new describe blocks with fixtures

* feat: Phase 1 type resolution gaps — walrus, PHP properties, nullable, Go make/assert

Phase 1 quick wins from the type resolution gap analysis:

1. Python walrus operator := (named_expression) — extractInitializer + scanner
2. PHP 7.4+ typed class properties — property_declaration in extractDeclaration
3. Nullable union unwrapping — User | null → User in extractSimpleTypeName
4. Go make() builtin — slice/map element type extraction
5. Go type assertions — iface.(User) type extraction

Also: PHP primitive_type handling in extractSimpleTypeName (string, int, etc.)

Unit tests: 101 → 114 (+13)
Integration tests: 8 new describe blocks with fixtures

* feat: Phase 2 type resolution gaps — C++ range-for, Rust if-let, C# pattern matching, Python class annotations

Phase 2 medium-effort improvements:

1. C++ range-for with explicit type — for (User& u : vec) binds u: User
2. Rust if-let/while-let captured_pattern — user @ User { .. } binds user: User
3. C# is-pattern matching — if (obj is User user) binds user: User
4. Python class-level annotations — confirmed already working, added tests

Unit tests: 114 → 127 (+13)
Integration tests: 11 new test cases with fixtures

* refactor: migrate from KuzuDB to LadybugDB v0.15

KuzuDB was archived (Apple acquisition, Oct 2025). LadybugDB is the
community fork with full API compatibility.

- Package swap: kuzu → @ladybugdb/core, kuzu-wasm → @ladybugdb/wasm-core
- Rename all internal paths: kuzu → lbug (adapters, schema, storage)
- Storage path: .gitnexus/kuzu → .gitnexus/lbug (with auto-cleanup)
- Add explicit VECTOR extension loading (required in v0.15)
- Update CI workflow, documentation, and all tests
- 1151 unit + 27 integration tests passing

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: address code review findings (P1-P3)

P1: Fix WASM adapter to use getAll() API, wire cleanupOldKuzuFiles
into analyze command, add symlink path traversal protection.
P2: Cache VECTOR extension load state, batch augmentation engine
queries (20→4), fix web getCopyQuery for multi-language tables,
fix stale KuzuDB references, correct brainstorm package names.
P3: Complete lbug-wasm.d.ts type declarations, batch semantic
search per-label, update stale BM25 comment.

* chore: remove outdated KuzuDB migration brainstorming document

* fix: load FTS extension in MCP pool adapter on init

The read-only pool adapter never loaded the FTS extension, so all
QUERY_FTS_INDEX calls failed silently. This broke search-pool and
augmentation integration tests, and caused empty results in the
web UI server mode.

* feat: implement shared Database caching and connection reference counting

* feat: enhance KuzuDB migration handling and status reporting

* fix: mock cleanupOldKuzuFiles in local backend callTool tests

* fix: update mock for cleanupOldKuzuFiles and adjust imports in callTool tests

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
…pe extractors (abhigyanpatwari#284)

* feat: Phase 3 — return type inference, generic args extraction, Ruby YARD type extractor

Three architectural improvements to the type resolution system:

1. Return type inference — wire extractMethodSignature returnType through
   SymbolDefinition into call-processor. When var = callee() and callee
   has a known return type, bind var to that type. Handles Promise<T>
   unwrapping, nullable stripping, pointer/reference removal.

2. Generic type argument extraction — new extractGenericTypeArgs() utility
   that extracts type parameters from List<User> → ['User']. Handles
   TS/Java/Kotlin/C#/Rust generic syntax. Building block for for-loop
   variable typing.

3. Ruby dedicated type extractor — replaces the stub with YARD annotation
   parsing (@param name [Type]), handling qualified types, nullable types,
   and singleton methods. Ruby now has real type resolution.

Unit tests: 127 → 192+ (type-env) + 65 (symbol-table, call-processor) + 18 (generics)
Integration tests: 8+ new test cases with fixtures across TS/Python/Go/Java/Ruby

* fix: Phase 3 gaps — WRAPPER_GENERICS correctness, Ruby :: qualifier, namespaced constructors

- Remove collection types (List, Array, Vec, Set) from WRAPPER_GENERICS to prevent
  false CALLS edges (e.g. List<User> no longer unwraps to User)
- Add :: qualifier handling in extractReturnTypeName for Ruby/C++/Rust namespaced types
- Add Ruby `constant` and `scope_resolution` node types to shared extractors
- Extract shared extractRubyConstructorAssignment helper (dedup type-env.ts + ruby.ts)
- Add integration tests for return type inference: Python, TypeScript, Go, Java, Ruby
- Add Ruby namespaced constructor fixture (Models::UserService.new)
- Add unit tests for collection reclassification and :: qualifiers
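
A rough sketch of the wrapper-versus-collection distinction described above, working on plain strings rather than AST nodes. The set contents and helper names are assumptions for illustration; in particular, the commit does not say what the real function returns for a collection such as `List<User>`, so this sketch simply keeps the container name.

```typescript
// Assumed subset of async wrappers; the project's WRAPPER_GENERICS set may differ.
const WRAPPER_GENERICS_SKETCH = new Set(["Promise", "Task", "Future", "CompletableFuture"]);

// Models::User -> User (handles the :: qualifier only).
function lastQualifiedSegment(name: string): string | undefined {
  const segments = name.split("::");
  const last = segments[segments.length - 1];
  return last !== undefined && last.length > 0 ? last : undefined;
}

function returnTypeNameSketch(raw: string): string | undefined {
  const text = raw.trim();
  const open = text.indexOf("<");
  if (open === -1) return lastQualifiedSegment(text);
  if (!text.endsWith(">")) return undefined; // malformed generic, refuse to guess
  const base = text.slice(0, open).trim();
  // Collections are not wrappers: methods are called on the container itself.
  if (!WRAPPER_GENERICS_SKETCH.has(base)) return lastQualifiedSegment(base);
  // Wrapper: recurse into the inner type, e.g. Promise<User> -> User.
  return returnTypeNameSketch(text.slice(open + 1, -1));
}
```

Under these assumptions `Promise<User>` resolves to `User`, while `List<User>` stays `List`, so a call on the list can no longer produce a false CALLS edge to a `User` method.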

* feat: Phase 4 — CONSTRUCTOR_BINDING_SCANNERS for all languages + return type inference tests

Add CONSTRUCTOR_BINDING_SCANNERS for 6 missing languages, completing
return type inference coverage across all 11 supported languages:

- TypeScript/JS: variable_declarator with call_expression, unwraps await
- Go: short_var_declaration single-assignment (skips multi-return, new/make)
- Java: local_variable_declaration with `var` type + method_invocation
- C#: variable_declaration with implicit_type (var) + invocation_expression
- Rust: let_declaration without type annotation, handles mut_pattern
- PHP: assignment_expression with function_call_expression

Also adds property_identifier to extractSimpleTypeName for qualified
member calls (repo.getUser → getUser), fixing namespaced constructor
inference that was previously a known limitation.

Integration tests added for all 11 languages with correct label
assertions (Function vs Method per language's tree-sitter queries).

* refactor: merge CONSTRUCTOR_BINDING_SCANNERS into per-language LanguageTypeConfig

Eliminates the parallel dispatch map in type-env.ts by moving all 11
constructor binding scanners into their respective type-extractors/*.ts
files as `scanConstructorBinding` on LanguageTypeConfig.

- Add ConstructorBindingScanner type to types.ts
- Add shared helpers: hasTypeAnnotation, unwrapAwait, extractCalleeName
- Move scanners to typescript.ts, jvm.ts, python.ts, php.ts, go.ts,
  rust.ts, swift.ts, c-cpp.ts, csharp.ts, ruby.ts
- Fix `any` types in C# scanner → SyntaxNode | null
- Delete ~300 lines from type-env.ts (CONSTRUCTOR_BINDING_SCANNERS map)
- Update buildTypeEnv to use config.scanConstructorBinding

All 143 type-env unit tests and all 10 language integration suites pass.

* fix: remove unused import, fix any type in Java scanner, update stale comment

- Remove unused extractCalleeName import from jvm.ts
- Fix (c: any) → (c: SyntaxNode) in Java scanner
- Update stale CONSTRUCTOR_BINDING_SCANNERS reference in ruby.ts comment

* fix: C# and PHP return type inference — scanner fixes, method signature extraction, and cross-file resolution

Addresses code review findings on PR abhigyanpatwari#284:

C# scanner (csharp.ts):
- Fix type node lookup: iterate children instead of childForFieldName('type')
  which returns undefined in tree-sitter-c-sharp
- Fix initializer lookup: handle direct invocation_expression children
  (no equals_value_clause wrapper in tree-sitter-c-sharp)

C# return type extraction (utils.ts):
- Add 'returns' field check to extractMethodSignature — tree-sitter-c-sharp
  uses 'returns', not 'type', for method return types

C# cross-file resolution (call-processor.ts + fixture):
- Add constructor binding verification to sequential processCalls path
  (was only in the worker processCallsFromExtracted path)
- Add ReturnType.csproj to csharp-return-type fixture
- Update fixture namespaces to use ReturnType.Models/ReturnType.Services
  prefix (matches real C# project conventions)

PHP scanner (php.ts):
- Extend scanConstructorBinding to handle member_call_expression
  ($this->getUser() patterns), not just function_call_expression

Shared (shared.ts):
- Add member_access_expression to extractSimpleTypeName qualified-names
  block (C# method calls like svc.GetUser())

Tests:
- Add Repo.cs/Repo.php disambiguation fixtures (two Save methods)
- Strengthen C# and PHP return type tests with hard disambiguation assertions
- Add C# scanner unit tests and return type extraction test

* feat: per-language ReturnTypeExtractor + doc-comment @param parsing for PHP, JS, Ruby

Add ReturnTypeExtractor to LanguageTypeConfig interface with implementations
for Ruby (YARD @return), PHP (PHPDoc @return), and JS/TS (JSDoc @returns).
The fallback is wired in both parsing-processor and parse-worker paths,
activating only when extractMethodSignature finds no AST-based return type.

Also add doc-comment @param type extraction for PHP and JS/TS, following
Ruby's existing collectYardParams pattern. This enables parameter.method()
resolution in loosely-typed codebases using PHPDoc @param or JSDoc @param.

Additional fixes from PR abhigyanpatwari#284 code review:
- Go: add selector_expression + field_identifier to extractSimpleTypeName
  (enables package-qualified factory calls like models.NewUser())
- Ruby: broaden scanConstructorBinding to capture plain call assignments
  (user = get_user()) in addition to Class.new patterns
- Ruby: harden return-type fixture with disambiguation (two save methods)

Test coverage: +14 new integration tests across Go, Ruby, PHP, JS/TS

* fix: JSDoc async return type, PHP attribute walkers, and $this receiver disambiguation

Three fixes from fourth-pass code review on PR abhigyanpatwari#284:

1. JSDoc `@returns {Promise<User>}` no longer stripped to `Promise` — extractReturnType
   now uses sanitizeReturnType (preserves generics) instead of normalizeJsDocType
   (which stripped them before extractReturnTypeName could unwrap WRAPPER_GENERICS).

2. PHP 8+ `#[Attribute]` and JS `@decorator` nodes no longer break doc-comment walkers.
   Both extractReturnType and collect*Params functions now skip attribute_list/decorator
   nodes instead of breaking on them as named siblings.

3. PHP `$this->method()` now provides receiverClassName for disambiguation.
   When two classes define the same method, the enclosing class narrows candidates
   via ownerId matching in call-processor, preventing false no-binding results.

* fix: sanitizeReturnType dot corruption, JS test assertions, Ruby constant receiver

- Remove redundant dot-path stripping from sanitizeReturnType that corrupted
  qualified names inside generics (e.g. Promise<models.User> → User>)
- Split JS async fixture into separate files and add negative assertions
  to properly verify disambiguation (mirroring PHP test pattern)
- Accept 'constant' node type in Ruby scanConstructorBinding for factory
  call assignments (SERVICE = build_service())
- Add 'constant' to SIMPLE_RECEIVER_TYPES so extractReceiverName handles
  Ruby constant receivers (SERVICE.process)

* fix: nested generic arg splitting, JS/Ruby test false positives

- Replace naive comma split in extractReturnTypeName with bracket-balanced
  extractFirstGenericArg so nested types like Future<Result<User, Error>>
  unwrap correctly instead of producing malformed "Result<User"
- Add CompletableFuture to WRAPPER_GENERICS for Java async unwrapping
- Split js-jsdoc-return-type fixture models.js into user.js/repo.js and
  add negative assertions to prove disambiguation (not just file match)
- Split ruby-constant-factory-call fixture into separate service files
  and add negative assertions against AdminService resolution
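
The difference between a naive comma split and a bracket-balanced one can be shown in a few lines. This is a minimal sketch of the idea, not the project's `extractFirstGenericArg`; the function name and string-based input are assumptions.

```typescript
// Return the first top-level argument of a generic argument list.
// Commas nested inside <...> do not terminate the argument.
function firstGenericArgSketch(args: string): string {
  let depth = 0;
  for (let i = 0; i < args.length; i++) {
    const ch = args.charAt(i);
    if (ch === "<") {
      depth++;
    } else if (ch === ">") {
      depth--;
    } else if (ch === "," && depth === 0) {
      return args.slice(0, i).trim();
    }
  }
  return args.trim();
}

// Inner argument text of Future<Result<User, Error>>
const nestedArgs = "Result<User, Error>";
const naiveFirst = nestedArgs.split(",")[0]; // "Result<User" (malformed)
const balancedFirst = firstGenericArgSketch(nestedArgs); // "Result<User, Error>"
```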

* fix: review findings — receiverClassName parity, Rust wrappers, Go multi-return, Kotlin/Swift qualified calls

P1: Sequential path now includes receiverClassName narrowing for PHP
$this->method() disambiguation (was missing vs worker path).

P2: Added Rc/Arc/Weak/MutexGuard/Cow + 6 more Rust Deref types to
WRAPPER_GENERICS (Box excluded — Java Swing collision). Extended
Kotlin/Swift scanners to handle navigation_expression callees.
Added Go multi-return support (user, err := f()) with blank/_/err/ok
guard + AST-level first-return extraction in extractMethodSignature.

P3: Extracted shared verifyConstructorBindings() eliminating 60 lines
of duplication between sequential and worker paths. Added return-type
inference integration tests for C++, Rust, Swift with competing
methods and negative disambiguation assertions.

* fix: Swift navigation_suffix unwrapping, Rust lifetime skipping, Kotlin disambiguation tests

- Swift scanConstructorBinding: handle tree-sitter wrapping qualified
  identifiers in navigation_suffix nodes
- Add extractFirstTypeArg to skip Rust lifetime parameters ('a, '_)
  when unwrapping wrapper generics like Ref<'_, User>
- Kotlin tests: add Repo class fixture with competing save() methods
  to prove disambiguation; assert no spurious edges on known gap
- Remove tree-sitter-kotlin from optionalDependencies (now regular dep)

* fix: C# null-conditional calls, Ruby YARD bracket-balanced split, PHPDoc alternate order, escapeValue hardening

- Add C# null-conditional call support (user?.Save()): tree-sitter query for
  conditional_access_expression, member_binding_expression in MEMBER_ACCESS_NODE_TYPES,
  receiver extraction via conditional_access_expression parent walk
- Fix Ruby YARD type parsing for nested generics (Hash<Symbol, User>): replace
  naive split(',') with bracket-balanced splitter respecting <> depth
- Add alternate YARD format (@param [Type] name) alongside standard (@param name [Type])
- Add alternate PHPDoc format (@param $name Type) alongside standard (@param Type $name)
- Harden escapeValue in kuzu-adapter.ts: escape \n and \r to prevent Cypher injection
- Integration tests: C# null-conditional fixture (5 tests), Ruby YARD generics fixture (6 tests)
- Unit tests: PHPDoc alternate order (2 tests), C# null-conditional call-form (updated)
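
To make the `escapeValue` hardening concrete, here is a hedged sketch of string escaping for a single-quoted Cypher literal. The function name and the exact set of escaped characters are assumptions; the commit only states that `\n` and `\r` are now escaped.

```typescript
// Escape a value for embedding in a single-quoted Cypher string literal.
// Backslashes are handled first so that the backslashes introduced by the
// later replacements are not doubled again.
function escapeValueSketch(value: string): string {
  return value
    .replace(/\\/g, "\\\\")
    .replace(/'/g, "\\'")
    .replace(/\n/g, "\\n")
    .replace(/\r/g, "\\r");
}

// A raw line break inside the value can no longer split the statement.
const escapedMultiline = escapeValueSketch("first line\nsecond line");
```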

* test: add Python static/classmethod integration tests (issue abhigyanpatwari#289)

Verifies that classes using only @staticmethod/@classmethod have HAS_METHOD
edges connecting them to their child methods. This was the root cause of
issue abhigyanpatwari#289 where context() and impact() returned empty for such classes.

Tests cover: HAS_METHOD edge emission, unique static method resolution
(create_user, delete_user), and ambiguous same-named method handling
(find_user on both UserService and AdminService — safely refused).

* fix: lbug batch escapeValue newline hardening, Rust ::default() scanner exclusion

- Apply \n/\r escaping to batch upsert escapeValue in lbug-adapter.ts:429
  (missed instance of the CREATE-path fix from ec4dca4)
- Exclude Rust ::default() from scanConstructorBinding to match
  extractInitializer behavior — avoids wasted cross-file lookups on
  the broadly-implemented Default trait
- Unit tests: 2 new scanner exclusion tests (::default and ::new)
- Integration tests: 6 new Rust ::default() constructor resolution tests
  with disambiguation fixture (User::default vs Repo::default)

* fix: C#/Rust async await unwrap, PHP backslash namespace, fallback escaping

- C# scanConstructorBinding: unwrap await_expression to find invocation_expression
  (var user = await svc.GetUserAsync() now produces constructor binding)
- Rust scanConstructorBinding: unwrap .await postfix via shared unwrapAwait helper
  (let user = get_user().await now produces constructor binding)
- extractReturnTypeName: handle PHP backslash namespace separator (\App\Models\User → User)
- fallbackRelationshipInserts: match batch escapeValue hardening with \n/\r escaping

Tests: 2 unit (type-env), 3 unit (call-processor), 7 integration (csharp+rust), 7 fixtures

* fix: C#/Rust async-binding test false positives — add competing types and negative assertions

C# fixture: add Order.cs with Order.Save(), change OrderService to return
Task<Order> via GetOrderAsync, add negative assertion proving user.Save()
does not resolve to Order#Save.

Rust fixture: split models.rs into user.rs/repo.rs, make process_user and
process_repo async fn, add bidirectional negative assertions proving no
cross-contamination between User#save and Repo#save.

* fix: C# async-binding broken assertion, bare wrapper type leak, JSDoc optional params

- Split Program.cs Main into ProcessUser/ProcessOrder so negative
  assertions use strict toBeUndefined() (matching Rust pattern)
- Guard bare wrapper types (Task, Promise, Option…) in
  extractReturnTypeName — return undefined instead of the wrapper name
- Update JSDOC_PARAM_RE to capture @param {Type} [optionalName] syntax
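
One way a JSDoc `@param` pattern could accept the optional-name bracket syntax is sketched below. The regular expression is an illustration written for this note, not the project's actual `JSDOC_PARAM_RE`.

```typescript
// Matches "@param {Type} name" and "@param {Type} [name]" / "[name=default]".
const JSDOC_PARAM_SKETCH = /@param\s+\{([^}]+)\}\s+\[?([A-Za-z_$][\w$]*)/;

function parseJsDocParam(line: string): { typeName: string; paramName: string } | undefined {
  const match = JSDOC_PARAM_SKETCH.exec(line);
  if (match === null) return undefined;
  const [, typeName, paramName] = match;
  if (typeName === undefined || paramName === undefined) return undefined;
  return { typeName: typeName.trim(), paramName };
}
```

The optional `\[?` consumes the opening bracket, and the identifier match stops before `]` or `=`, so `[user]` and `[user=null]` both yield the name `user`.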

* fix: update symbol and relationship counts in documentation
…iscovery (abhigyanpatwari#231)

* feat(ingestion): respect .gitignore and .gitnexusignore during file discovery

Add support for excluding files from indexing based on .gitignore and
.gitnexusignore patterns. Previously, GitNexus used only a hardcoded
ignore list, causing significant index pollution in repositories with
git-ignored directories containing code (e.g., Docker-mounted volumes).

Changes:
- Add `ignore` package for gitignore-spec pattern matching
- Add `loadIgnoreRules()` to parse .gitignore + .gitnexusignore
- Add `createIgnoreFilter()` returning glob-compatible IgnoreLike object
- Integrate filter into glob's `ignore` option for directory-level pruning
- Remove post-glob `.filter()` call (now handled during traversal)

The hardcoded DEFAULT_IGNORE_LIST remains as fallback for non-git repos.

Closes abhigyanpatwari#228

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(ingestion): address review feedback on ignore filtering

- Distinguish ENOENT vs EACCES in loadIgnoreRules (warn on permission errors)
- Add GITNEXUS_NO_GITIGNORE env var to bypass .gitignore parsing
- Fix bare-name pattern matching in childrenIgnored (check both with/without trailing slash)
- Rename isIgnoredDirectory to isHardcodedIgnoredDirectory for clarity
- Add clarifying comments for design decisions (D2 negation, D3 dot:false redundancy)
- Add tests for bare-name patterns, file-glob patterns, EACCES handling, env var

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(ingestion): address second round of review feedback

- G1: Document GITNEXUS_NO_GITIGNORE in `analyze --help` and log when active
- G2: Add comment clarifying path-scurry POSIX normalization contract
- G3: Add IgnoreOptions interface — env var now falls back, callers can
  pass `noGitignore` explicitly for testability and future CLI flag
- G4: Add integration test verifying walkRepositoryPaths respects the env var

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat(ingestion): gracefully skip files with unavailable tree-sitter grammars

Port unsupported language resilience from PR abhigyanpatwari#301 by @jecanore.
- Make Kotlin import optional (like Swift) in parser-loader and parse-worker
- Add worker-local isLanguageAvailable() with filePath param for tsx distinction
- Track and log skipped files per language in both sequential and worker paths
- Add skippedLanguages to ParseWorkerResult for worker→main aggregation
- Add isLanguageAvailable unit tests

Refs: abhigyanpatwari#301, abhigyanpatwari#155, abhigyanpatwari#228

Co-Authored-By: jecanore <juan@housingbase.io>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* test(e2e): add ignore + language-skip end-to-end test with fixture repo

Add a fixture repo (test/fixtures/ignore-and-skip-repo/) with .gitignore,
.gitnexusignore, TypeScript source files, and a Swift file to exercise
all three features end-to-end:

- File discovery: verifies .gitignore excludes data/ and *.log,
  .gitnexusignore excludes vendor/, source files are discovered
- Parsing: verifies TypeScript files produce Function nodes and DEFINES
  relationships, Swift files are skipped gracefully when grammar is
  unavailable

Add the test to the standalone group in ci-integration.yml and coverage job.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(ci): move ignore-and-skip-e2e test to e2e group per review feedback

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(test): use temp directory instead of fixture for e2e ignore test

The fixture's .gitignore prevented data/seed.json and debug.log from
being committed — these files would be missing after checkout in CI.

Switch to creating the entire test structure in a temp directory via
beforeAll (matching filesystem-walker.test.ts pattern). This ensures
all files exist regardless of git ignore rules.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(test): correct graph API usage in e2e ignore test

Use graph.nodes property getter instead of graph.getNodes(), and check
Function node filePath instead of non-existent File nodes (File nodes
are created by processStructure, not processParsing).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* ci: add workflows permission to ci-integration.yml

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* ci: change workflows permission to write per review

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* ci: move workflows permission from ci-integration.yml to ci.yml caller

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(ci): fix Claude workflows for fork PRs, remove misplaced workflows perm

Three issues prevented Claude from running on fork PRs:

1. claude-code-review.yml lacked workflows:write — push failed when
   fork PRs modify .github/workflows/ files
2. claude.yml had no fork PR support — checked out main and couldn't
   fetch the fork's branch from origin
3. Cleanup step unconditionally deleted branches even when push failed,
   breaking the concurrent claude.yml workflow

Also removes workflows:write from ci.yml's integration job — CI tests
don't need that permission. The permission belongs on the claude
workflows that push fork branches.

Changes:
- Add workflows:write to both claude workflow permissions blocks
- Add fork PR detection + branch push/cleanup to claude.yml
- Add step id to push-fork; cleanup only runs if push succeeded
- Pass branch names via env vars to prevent shell injection (security)
- Add concurrency groups to prevent race conditions between workflows
- Remove misplaced workflows:write from ci.yml integration job

* fix(ci): use GitHub API for fork branch refs instead of git push

GITHUB_TOKEN cannot have 'workflows' permission — it's only valid for
PATs and GitHub Apps. This means git push fails whenever a fork PR
modifies .github/workflows/ files.

Replace git push with the GitHub REST API (POST/PATCH /git/refs) to
create temporary branch refs. The API creates a pointer to the
already-existing PR head commit without triggering the workflow file
push protection. Similarly, cleanup uses DELETE /git/refs instead of
git push --delete.

Also removes the invalid 'workflows: write' from permissions blocks.
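
For orientation, the three REST calls involved can be described as plain request descriptors. This is a sketch only: the workflow's actual steps are not shown in this message, and the `RefRequest` shape and function names are hypothetical. The endpoint paths follow GitHub's documented Git references API.

```typescript
interface RefRequest {
  method: "POST" | "PATCH" | "DELETE";
  path: string;
  body?: { ref?: string; sha: string; force?: boolean };
}

// Create a branch ref pointing at an existing commit (the PR head SHA).
function createRefRequest(owner: string, repo: string, branch: string, sha: string): RefRequest {
  return {
    method: "POST",
    path: `/repos/${owner}/${repo}/git/refs`,
    body: { ref: `refs/heads/${branch}`, sha },
  };
}

// Move an existing ref to a new commit.
function updateRefRequest(owner: string, repo: string, branch: string, sha: string): RefRequest {
  return {
    method: "PATCH",
    path: `/repos/${owner}/${repo}/git/refs/heads/${branch}`,
    body: { sha, force: true },
  };
}

// Remove the temporary ref during cleanup.
function deleteRefRequest(owner: string, repo: string, branch: string): RefRequest {
  return { method: "DELETE", path: `/repos/${owner}/${repo}/git/refs/heads/${branch}` };
}
```

Note the asymmetry: creation takes the fully qualified `refs/heads/...` name in the body, while update and delete address the ref as `heads/...` in the URL path.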

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: jecanore <juan@housingbase.io>
Co-authored-by: Gergo Magyar <gergomagyar@icloud.com>
… assignment chains, code review fixes (abhigyanpatwari#310)

* feat: Phase 4 type resolution — nullable unwrapping, for-loop typing, assignment chains, Kotlin return types

Phase 4.1: Nullable/optional chain unwrapping
- Add stripNullable utility in shared.ts for stripping nullable wrappers
- Apply in lookupInEnv to unwrap User | null → User, User? → User before receiver lookup
- Handles TS union, Kotlin/C#/Swift nullable suffix, Python Union[T, None], Rust Option<T>
- Enables receiver-type disambiguation through ?. optional chaining
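
A string-level sketch of the nullable stripping described above, covering only the union and `?` suffix forms. The keyword set and function name are assumptions, and the `Union[T, None]` and `Option<T>` forms mentioned in the commit are deliberately left out to keep the example short.

```typescript
// Union members treated as "no value" (assumed set).
const NULLABLE_KEYWORDS_SKETCH = new Set(["null", "undefined", "None", "nil", "void"]);

function stripNullableSketch(typeText: string): string {
  let text = typeText.trim();

  // Kotlin / C# / Swift suffix form: User? -> User
  if (text.endsWith("?")) {
    text = text.slice(0, -1).trim();
  }

  // Union form: User | null -> User
  if (text.includes("|")) {
    const members = text
      .split("|")
      .map((member) => member.trim())
      .filter((member) => member.length > 0 && !NULLABLE_KEYWORDS_SKETCH.has(member));
    const [only] = members;
    // Unwrap only when exactly one real type remains; a true union stays untouched.
    if (members.length === 1 && only !== undefined) {
      return only;
    }
  }
  return text;
}
```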

Phase 4.2: For-loop element typing (Tier 0 — Java/C#/Kotlin)
- Add ForLoopExtractor type and forLoopNodeTypes to LanguageTypeConfig
- Java enhanced_for_statement, C# foreach_statement, Kotlin for_statement extractors
- Only explicit element types in AST (Tier 0); inference-based languages deferred

Phase 4.3: Assignment chain propagation (single-pass, depth-1)
- Add PendingAssignmentExtractor to LanguageTypeConfig with per-language implementations
- Handles TS/JS variable_declarator, Rust let_declaration, Python assignment,
  Go short_var_declaration, C# equals_value_clause, Java/Kotlin variable_declarator
- Single post-walk propagation pass (no fixpoint iteration per Sorbet/Pyright design)
- Resolves const b = a; b.save() when a has known type from Tier 0/1/1b
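
The single-pass propagation can be sketched as follows. `PendingAssignment` here is a simplified, hypothetical record (left-hand and right-hand variable names only), and the type environment is modeled as a plain `Map`.

```typescript
interface PendingAssignment {
  lhs: string; // variable being declared
  rhs: string; // variable it is assigned from
}

// One pass over the pending assignments in source order, no fixpoint iteration.
function propagateAssignments(env: Map<string, string>, pending: PendingAssignment[]): void {
  for (const { lhs, rhs } of pending) {
    if (env.has(lhs)) continue; // an already-known type wins
    const rhsType = env.get(rhs);
    if (rhsType !== undefined) {
      env.set(lhs, rhsType);
    }
  }
}

// const b = a; const c = b;  (a: User is already known)
const forwardEnv = new Map<string, string>([["a", "User"]]);
propagateAssignments(forwardEnv, [
  { lhs: "b", rhs: "a" },
  { lhs: "c", rhs: "b" },
]);

// Same chain visited in reverse order: c cannot be resolved in one pass.
const reverseEnv = new Map<string, string>([["a", "User"]]);
propagateAssignments(reverseEnv, [
  { lhs: "c", rhs: "b" },
  { lhs: "b", rhs: "a" },
]);
```

Because each resolved alias is written back into the environment immediately, a chain declared in source order resolves across several hops, while a chain visited in reverse order resolves only the first hop.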

Phase 4.5: Kotlin return type extraction (bug fix)
- Fix extractMethodSignature to handle Kotlin user_type after function_value_parameters
- Remove lenient test assertions, add strict disambiguation proof

Integration tests across 10+ languages with competing same-name methods
and negative assertions proving disambiguation.

* fix: per-language assignment chain gaps from code review

- Kotlin: new extractKotlinPendingAssignment for property_declaration →
  variable_declaration AST (Java's variable_declarator doesn't exist in Kotlin)
- Go: handle var_spec (var b = u) alongside short_var_declaration (:=)
- PHP: add extractPendingAssignment for $alias = $user with $ prefix preserved

Integration tests added for all three languages with competing
same-name methods and negative disambiguation assertions.

* fix: code review fixes — DRY nullable keywords, avoid array allocations, clarify depth comment

Addresses findings from 6-agent code review on PR abhigyanpatwari#310:

- Move stripNullable JSDoc to correct position (was orphaned above NULLABLE_KEYWORDS)
- DRY: reuse NULLABLE_KEYWORDS set in pipe-split filter instead of inline strings
- Replace node.children.find() with findChildByType/manual loops in jvm.ts,
  go.ts, csharp.ts to avoid unnecessary array allocations per tree-sitter call
- Clarify "depth-1" comment in type-env.ts: single-pass resolves multi-hop
  chains when forward-declared; reverse-order is depth-1 only
- Annotate extractGenericTypeArgs as Phase 5 infrastructure (zero production callers)
- Re-export PendingAssignmentExtractor from index.ts for API consistency
- Add explicit return undefined in Go extractPendingAssignment
- Remove redundant child.text === '=' check in Kotlin extractor

Test coverage:
- 20 new unit tests: stripNullable edge cases, per-language assignment chains,
  reverse-order depth limitation, nullable lookup resolution
- 15 new integration tests: multi-hop chains (a→b→c), nullable+chain combined
  (User|null + alias), Python User|None through stripNullable path
- 3 new fixtures: ts-multi-hop-chain, ts-nullable-chain, python-nullable-chain

* fix: third-pass review — walrus chain, scanner allocations, Kotlin variable_declaration, C# type guard

Addresses 4 new findings from third-pass CI review:

1. Python walrus operator (:=) now handled by extractPendingAssignment —
   named_expression nodes propagate alias chains alongside regular assignment
2. Scanner .namedChildren.find()/.some() in jvm.ts replaced with
   findChildByType() — consistent with 98daed4 code review fixes
3. Kotlin extractPendingAssignment extended to handle variable_declaration
   nodes in addition to property_declaration (function-local val/var)
4. C# extractPendingAssignment early-returns for is_pattern_expression and
   field_declaration nodes (never contain variable_declarator children)

Integration tests:
- Python: walrus chain (alias := u) with disambiguation (5 tests, 1 fixture)
- Kotlin: assignment chain with typed declarations (5 tests, 1 fixture)
- C#: assignment chain + is-pattern coexistence (6 tests, 1 fixture)
- Unit: Python walrus propagation (1 test)

* feat: nullable wrapper unwrapping + C++ assignment chains

Gaps 1, 2, 4 from code review — architectural changes to type resolution:

1. extractSimpleTypeName now unwraps nullable wrapper generics:
   - Optional<User> → "User" (Java), Option<User> → "User" (Rust),
     Maybe<User> → "User" (Kotlin Arrow/Haskell-style)
   - Containers (List, Map) and async wrappers (Promise, Future) are NOT
     unwrapped — methods are called on the container, not the inner type
   - Uses existing extractGenericTypeArgs (now production-active, was dead code)
   - NULLABLE_WRAPPER_TYPES set: Optional, Option, Maybe

2. C++ extractPendingAssignment added for auto alias chains:
   - auto alias = user; alias.save() now propagates User type
   - Handles pointer/reference declarators, auto/decltype(auto)

3. Updated existing Rust test: Option<User> parameter now correctly
   stores "User" instead of "Option" in TypeEnv

Integration tests with fixtures for Java Optional, Rust Option, C++ auto
chain. Full pipeline resolution marked .todo — requires call-processor
enhancement (TypeEnv stores correct types but call-processor needs
additional work to produce CALLS edges for these patterns).

Unit tests: 196 passed (7 new). Integration: all 9 languages green.

* fix: resolve .todo tests — stale dist/ was the root cause

The Rust Option<User> and C++ auto assignment chain integration tests
were marked .todo because the pipeline didn't produce CALLS edges.
Root cause: dist/ was compiled from pre-Phase 4 source and lacked:
- NULLABLE_WRAPPER_TYPES unwrapping in extractSimpleTypeName
- C++ extractPendingAssignment

After npm run build, all tests pass as real assertions:
- Rust: alias.save() resolves to User#save via Option<User> unwrap + chain
- C++: alias.save() and rAlias.save() resolve via auto assignment chain
  with correct disambiguation (User vs Repo)

Only remaining .todo: Rust user.unwrap().save() (Phase 5 — chained
return type inference, not a TypeEnv issue).
…s-as-receiver (abhigyanpatwari#315)

* feat: Phase 5 type resolution — chained calls, pattern matching, class-as-receiver, code review fixes

Phase 5.1: Chained method call resolution (depth-capped at 3)
- resolveChainedReceiver() resolves a.getUser().save() by walking the chain
  and looking up intermediate return types from the SymbolTable
- extractReceiverNode() + extractCallChain() shared in utils.ts
- receiverCallChain on ExtractedCall for worker path parity
- MAX_CHAIN_DEPTH=3 enforced in both extraction and resolution
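
As an illustration of loop-based chain resolution with a depth cap, the sketch below looks up each intermediate return type in a table keyed by `Owner#method`. The table format and function name are assumptions; the real implementation consults the SymbolTable, and how it behaves beyond the cap is not stated here (this sketch simply refuses).

```typescript
const MAX_CHAIN_DEPTH_SKETCH = 3;

// Resolve the type of a chained receiver such as `svc.getUser()` in
// `svc.getUser().save()`: start from the base type and follow return types.
function resolveChainSketch(
  baseType: string,
  chain: string[],
  returnTypes: Map<string, string>,
): string | undefined {
  if (chain.length > MAX_CHAIN_DEPTH_SKETCH) return undefined;
  let current = baseType;
  for (const method of chain) {
    const next = returnTypes.get(`${current}#${method}`);
    if (next === undefined) return undefined; // unknown return type, stop
    current = next;
  }
  return current;
}

const returnTypeTable = new Map<string, string>([
  ["UserService#getUser", "User"],
  ["User#getProfile", "Profile"],
]);
```

For `svc.getUser().save()` with `svc: UserService`, the receiver chain is `["getUser"]`, which resolves to `User`, so `save` can then be looked up on `User` rather than on any class that happens to define `save`.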

Phase 5.2: Pattern matching binding extractors
- PatternBindingExtractor type added to LanguageTypeConfig
- declarationTypeNodes map tracks original type AST nodes for generic unwrapping
- Rust: if let Some(x)/Ok(x) unwrapping with extractGenericTypeArgs
- Java: instanceof pattern variables (Java 16+)
- C#: is-pattern disambiguation fixture (already working via extractDeclaration)

Phase 5.5d: Python standalone type annotations (name: str)
- expression_statement with type child now captured in DECLARATION_NODE_TYPES

Phase 5.5e: ReceiverKey collision fix for overloaded methods
- receiverKey preserves @StartIndex to prevent same-name method collisions
- lookupReceiverType does prefix scan with ambiguity refusal

Class-as-receiver for static method calls (abhigyanpatwari#289)
- UserService.find_user() now resolves via ctx.resolve() tiered lookup
- Respects import scoping — no false positives from unrelated packages

Code review fixes:
- Extracted CALL_EXPRESSION_TYPES + extractCallChain to utils.ts (eliminated duplication)
- Converted resolveChainedReceiver from recursion to loop (no exposed depth param)
- Added depth cap to extractReturnTypeName (defense against nested wrapper types)
- Replaced lookupFuzzy with ctx.resolve for class-as-receiver (architecturally consistent)

Closes abhigyanpatwari#289

Test coverage: 6 new fixtures, 12+ new unit tests, 7 new integration test suites

* fix: Ruby chain calls, Rust Err(x) unwrap, Enum class-as-receiver (abhigyanpatwari#315)

Address three per-language gaps identified in Phase 5 code review:

- Ruby: add `method`/`receiver` field fallbacks to extractCallChain
  (tree-sitter-ruby uses different field names than other grammars)
- Rust: handle `Err(e)` pattern binding via typeArgs[1] from Result<T,E>
- Enum: include Enum type in class-as-receiver filter (both paths)

Integration tests added for all three fixes.
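
The `Err(e)` binding rule (second type argument of `Result<T, E>`) can be sketched on type strings. Both helpers below are hypothetical simplifications; the project's `extractGenericTypeArgs` operates on AST nodes.

```typescript
// Split the top-level generic arguments of a type string:
// "Result<User, AppError>" -> ["User", "AppError"]
function genericArgsSketch(typeText: string): string[] {
  const open = typeText.indexOf("<");
  const close = typeText.lastIndexOf(">");
  if (open === -1 || close <= open) return [];
  const inner = typeText.slice(open + 1, close);
  const args: string[] = [];
  let depth = 0;
  let start = 0;
  for (let i = 0; i < inner.length; i++) {
    const ch = inner.charAt(i);
    if (ch === "<") depth++;
    else if (ch === ">") depth--;
    else if (ch === "," && depth === 0) {
      args.push(inner.slice(start, i).trim());
      start = i + 1;
    }
  }
  const last = inner.slice(start).trim();
  if (last.length > 0) args.push(last);
  return args;
}

type PatternCtor = "Some" | "Ok" | "Err";

// Some(x) / Ok(x) bind the first type argument, Err(e) binds the second.
function patternBindingType(declaredType: string, ctor: PatternCtor): string | undefined {
  const args = genericArgsSketch(declaredType);
  return args[ctor === "Err" ? 1 : 0];
}
```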

* fix: chain base type resolution parity between serial and worker paths (abhigyanpatwari#315)

- Worker path: add typeEnv.lookup for chain base receiver after extraction
  (typed parameters like `fn process(svc: &UserService)` were silently lost)
- Serial path: add ctx.resolve class-as-receiver fallback for chain base
  (class-name chains like `UserService.find_user().save()` failed)
- Fix misleading comment in parse-worker.ts that described unimplemented logic
- Integration tests: typed-parameter chain, static class-name chain

* fix: Kotlin chain call extraction, createClassNameLookup Enum/Struct (abhigyanpatwari#315)

- Kotlin: extractCallChain now handles navigation_expression → navigation_suffix
  AST structure (Kotlin's call_expression has no 'function' field)
- createClassNameLookup: include Enum and Struct alongside Class for consistent
  constructor recognition in extractInitializer
- Integration test: kotlin-chain-call fixture verifying svc.getUser().save()
- Add Codex to Editor Support table
- Add Codex manual config example (~/.codex/config.toml)
- Update editor list in usage table

Fixes abhigyanpatwari#131

Made-with: Cursor
…#328)

* fix(resolver): prefer same-directory file for Python bare imports

Python's sys.path searches the importing script's own directory first,
so `import user` from services/auth.py should resolve to services/user.py
even if models/user.py was indexed first in the suffix index.

Add a proximity check in resolveImportPath that consults the existing
dirMap index (O(1)) before falling back to global suffix matching, for
single-segment bare Python imports only.

Made-with: Cursor

* refactor(resolver): replace dirMap scan with O(1) allFiles.has() for proximity check

The previous implementation used index.getFilesInDir() + siblings.find()
which had two issues:
- dirMap stores all suffix levels, so getFilesInDir('services') matched
  files from every directory named 'services/' across the repo — false
  positives in monorepos
- siblings.find() was an O(n) linear scan despite the O(1) claim

Replace with a direct allFiles.has(importerDir + '/' + name + '.py') lookup.
allFiles is a Set<string> of full repo-relative paths, so the lookup is
truly O(1) and exact — no suffix ambiguity possible.

Also fixes: dead code (the '.rb' branch was unreachable since the outer if
gates on Python), and Windows backslash handling via normalize before split.

Made-with: Cursor

* test: remove flag-based demo from unit tests

Made-with: Cursor

* fix(resolver): cover package __init__.py in proximity check and add end-to-end CALLS test

- Also try importerDir/name/__init__.py as a second O(1) candidate so that
  `import user` resolves to services/user/__init__.py when the target is a
  package rather than a bare module file
- Add unit tests for package proximity, __init__.py fallback, and Windows
  backslash path handling
- Add end-to-end CALLS assertion to the bare-import integration test:
  svc.execute() must resolve to UserService#execute in services/user.py,
  proving the fix propagates correctly through the type inference pipeline
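
A minimal sketch of the proximity check described in the commits above, assuming `allFiles` is a `Set<string>` of repo-relative paths with forward slashes. The function name and signature are hypothetical; the candidate order (module file first, then package `__init__.py`) follows this commit's description.

```typescript
// Hypothetical helper; name and signature are assumptions for illustration.
function resolveBarePythonImportSketch(
  importerPath: string,
  moduleName: string,
  allFiles: Set<string>,
): string | null {
  // Normalize Windows separators before splitting off the directory.
  const normalized = importerPath.replace(/\\/g, "/");
  const slash = normalized.lastIndexOf("/");
  const importerDir = slash === -1 ? "" : normalized.slice(0, slash);
  const base = importerDir === "" ? moduleName : `${importerDir}/${moduleName}`;

  // Two exact Set lookups: sibling module file, then sibling package.
  const candidates = [`${base}.py`, `${base}/__init__.py`];
  for (const candidate of candidates) {
    if (allFiles.has(candidate)) return candidate;
  }
  return null; // the caller would fall back to global suffix matching
}
```

Because each candidate is a full path, a directory named `services/` elsewhere in the repo cannot produce a false match.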

Made-with: Cursor

* refactor: extract Python import resolution into resolvers/python.ts

- Move PEP 328 relative import and proximity-based bare import logic
  from standard.ts into a dedicated resolvers/python.ts (resolvePythonImport)
- Dispatch Python imports from resolveLanguageImport in import-processor.ts,
  consistent with how Ruby, PHP, and other languages are handled
- standard.ts is now language-agnostic (TS/JS aliases, Rust paths, suffix fallback)
- Add inline comment on __init__.py vs .py resolution order edge case
- Update unit tests to call resolvePythonImport directly

Made-with: Cursor

* docs: add PEP 302/328/451 references to python.ts comments

Made-with: Cursor

* fix(python): address reviewer comments on PEP compliance

- Guard dirParts.pop() against over-traversal: return null when dot
  count exceeds directory depth, matching CPython's ImportError for
  'attempted relative import beyond top-level package' (PEP 328)
- Swap __init__.py / .py check order to match CPython's finder
  precedence (PEP 451 §4); a module file and a same-named package
  directory rarely coexist, so order mainly matters for spec compliance
- Fix overstated PEP 302 comment: proximity check is a static
  heuristic, not a sys.path[0] implementation
- Acknowledge namespace package gap (PEP 420) in docstring
- Add unit test for over-traversal guard
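
The over-traversal guard can be sketched as follows. This is an illustration under assumptions: the function name is hypothetical, and the exact boundary (here, refusing to pop once no directory segments remain) may differ from the real resolver.

```typescript
// Hypothetical sketch of resolving the base directory of a relative import.
// dotCount is the number of leading dots in `from ..pkg import x`.
function resolveRelativeBaseDirSketch(importerPath: string, dotCount: number): string | null {
  const dirParts = importerPath.replace(/\\/g, "/").split("/");
  dirParts.pop(); // drop the file name; what remains is the importer's directory

  // One dot means the current package; each extra dot climbs one directory.
  for (let i = 1; i < dotCount; i++) {
    if (dirParts.length === 0) return null; // would climb above the repo root
    dirParts.pop();
  }
  return dirParts.join("/");
}
```

Returning `null` mirrors the ImportError that CPython raises for an attempted relative import beyond the top-level package, instead of silently resolving against the wrong directory.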

Made-with: Cursor

* test(python): document namespace package resolution behaviour

Add two unit tests for PEP 420 namespace packages (directory with no
__init__.py): bare import returns null (expected — no file exists to
resolve to, CPython sets __file__ = None), while the submodule form
(import user.model) resolves correctly via suffixResolve fallback.

Made-with: Cursor

---------

Co-authored-by: chirag-nighut <chiragnighut@gmail.com>
…ontainer descriptors, 10-language coverage (abhigyanpatwari#318)

* feat: Phase 6 type resolution — pattern matching, for-loop Tier 1c, coverage completion

- Add patternBindingNodeTypes gate to LanguageTypeConfig for 50% perf improvement
- Expand ForLoopExtractor signature with optional declarationTypeNodes + scope
- Add extractElementTypeFromString shared utility for container type parsing
- Python match/case: extractPatternBinding for `case User() as u:` pattern
- C# refactor: move is_pattern_expression from extractDeclaration to extractPatternBinding
- Ruby: add extractPendingAssignment for assignment chain propagation
- TS/JS: add for-loop Tier 1c for `for (const user of users)` with User[] inference
- Python: add for-loop Tier 1c for `for user in users:` with type annotation inference
- Go: add for-loop Tier 1c for `for _, user := range users` with []User inference
- Fix 'Property' as any stale cast in call-processor.ts
- Add dual return-type string length cap (2048 pre-cap, 512 post-cap)
- Add chain call integration tests for C#, Go, Rust, Python, JS, C++
- Add Python match/case integration test fixtures
- 27 new extractElementTypeFromString unit tests
- 3 for-loop edge cases skipped (declarationTypeNodes scope key lookup)

* fix: address code review findings for Phase 6

- Add missing patternBindingNodeTypes to C# typeConfig (perf gate)
- Add 2048-char input length guard to extractElementTypeFromString
- Skip Python match/case integration tests (call extraction needs query updates)

* reorganise

* fix: Phase 1 bug fixes — Go range semantics, typed_parameter, bracket depth

- Go single-var range correctly returns early for slices/maps (index, not element)
- Go single-var range on channels correctly resolves element type
- Added map_type and channel_type to extractGoElementTypeFromTypeNode
- Added isChannelType helper for channel detection before skip decision
- Added 'typed_parameter' to TYPED_PARAMETER_TYPES for Python annotated params
- Fixed bracket depth tracking in extractElementTypeFromString — only match
  selected closeChar at depth 0, return undefined for mismatched brackets
- Un-skipped 3 prematurely skipped tests (TS local const, Python List/Sequence)
- Added tests for map range, single-var range semantics, bracket edge cases

* refactor: Phase 2 architecture — shared helper, required params, decoupled type nodes

- Extract resolveIterableElementType shared helper in shared.ts implementing
  3-strategy fallback (declarationTypeNodes → scopeEnv string → AST walk)
- Refactor TS, Python, Go extractors to use shared helper (eliminates 3x duplication)
- Make ForLoopExtractor params required (aligned with PatternBindingExtractor)
- Update Java, Kotlin, C# extractor signatures to accept required params
- Decouple declarationTypeNodes from scopeEnv — capture raw type annotation
  nodes BEFORE extractDeclaration for container types (User[], []User, List[User])
- Hybrid approach: direct name extraction + keysBefore fallback for multi-declarator
- Document declarationTypeNodes invariant change (superset of scopeEnv)
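
As a rough illustration of the three-strategy fallback, the sketch below tries each source in order and stops at the first that yields an element type. All names are hypothetical, plain strings stand in for AST nodes, and only the `T[]` form is parsed to keep the example short.

```typescript
// Hypothetical context; Maps of annotation text stand in for real AST nodes.
interface IterableContextSketch {
  declarationTypes: Map<string, string>; // stand-in for declarationTypeNodes
  scopeEnv: Map<string, string>;
  walkAstForType: (iterableName: string) => string | undefined;
}

// Extracts "User" from "User[]"; anything else is treated as unresolved here.
function elementOfArraySketch(typeText: string | undefined): string | undefined {
  if (typeText === undefined || !typeText.endsWith("[]")) return undefined;
  const element = typeText.slice(0, -2).trim();
  return element === "" ? undefined : element;
}

// Strategy order: declaration type nodes, then scopeEnv string, then AST walk.
function resolveIterableElementTypeSketch(
  iterableName: string,
  ctx: IterableContextSketch,
): string | undefined {
  return (
    elementOfArraySketch(ctx.declarationTypes.get(iterableName)) ??
    elementOfArraySketch(ctx.scopeEnv.get(iterableName)) ??
    elementOfArraySketch(ctx.walkAstForType(iterableName))
  );
}
```

The `??` chain evaluates lazily, so the comparatively expensive AST walk only runs when both cheaper lookups fail.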

* feat: Phase 3 partial — Rust for-loop + C# var foreach Tier 1c

- Rust: add extractForLoopBinding with for_expression support
  - Handles &users, &mut users via reference_expression unwrapping
  - extractRustElementTypeFromTypeNode: generic_type, reference_type, slice/array
  - findRustParamElementType: AST walk with reference/mut pattern unwrapping
  - 4 unit tests (Vec<User>, &[User], range expr negative, no-annotation negative)

- C#: upgrade foreach to handle var (implicit_type) via Tier 1c
  - extractCSharpElementTypeFromTypeNode: generic_name, array_type, nullable_type
  - findCSharpParamElementType: AST walk to method_declaration parameters
  - 3 unit tests (var foreach, explicit type regression, no-annotation negative)

* feat: Phase 3 complete — all language gaps + pattern matching

Kotlin Tier 1c:
- Unannotated for-loop resolves via shared helper
- extractKotlinElementTypeFromTypeNode handles type_projection unwrapping
- findKotlinParamElementType walks to function_declaration

Java Tier 1c:
- var foreach resolves via shared helper
- extractJavaElementTypeFromTypeNode handles generic_type, array_type
- findJavaParamElementType walks to method_declaration

TypeScript:
- readonly User[] unwrapped via readonly_type → array_type recursion

C# switch patterns:
- declaration_pattern added to patternBindingNodeTypes
- extractPatternBinding handles standalone declaration_pattern (switch case/expr)

Rust match arms:
- match_arm added to patternBindingNodeTypes
- extractPatternBinding extended with match_arm → match_expression parent traversal

Python:
- as_pattern tries childForFieldName('alias') before positional fallback

Tests: 237 pass (was 224), 13 new tests added

* feat: Phase 4 — known limitation tests, match arm fix, final verification

- Fix Rust match_arm pattern extraction: unwrap match_pattern to get
  tuple_struct_pattern inside (tree-sitter-rust wraps in match_pattern node)
- Add first-writer-wins regression test for match arm scope leakage
- Add 5 documented skip tests for known limitations:
  - TS destructured for-of (tuple destructuring)
  - Python tuple unpacking in for-loops
  - TS instanceof narrowing (block-level scoping)
  - Rust for with .iter() (method call iterable)
  - Ruby block parameters (closure param inference)

Final: 238 passed, 5 skipped (documented limitations), tsc clean

* test: integration tests for all Phase 6 language gaps + fix Rust param pattern field

Integration test fixtures and tests (30 new tests, all with exact match + negative):

Rust for-loop (5 tests):
- for user in &users with Vec<User> → User#save, negative Repo#save
- for repo in &repos with Vec<Repo> → Repo#save, negative User#save

Rust match arm (5 tests):
- match opt { Some(user) => user.save() } → User#save, negative Repo#save
- if let Ok(repo) = res → Repo#save, negative User#save

C# var foreach (5 tests):
- foreach (var user in users) with List<User> → User#Save, negative Repo#Save
- foreach (var repo in repos) with List<Repo> → Repo#Save

C# switch pattern (4 tests):
- is User user → User#Save, case Repo repo → Repo#Save

Kotlin unannotated for (4 tests):
- for (user in users) with List<User> → user.save, negative repo.save

Go map range (3 tests):
- for _, user := range userMap with map[string]User → User#Save, negative

TypeScript readonly (4 tests):
- for (const user of users) with readonly User[] → user.save, negative

Bug fix: type-env.ts parameter branch now falls back to childForFieldName('pattern')
for Rust parameters (Rust uses 'pattern' not 'name' for parameter names)

* test: add assertion bodies to known limitation skip tests

Convert empty skip test stubs to proper tests with parse/buildTypeEnv/expect
assertions following the codebase convention (e.g., call-processor.test.ts:319).
Each skip test now documents the exact expected behavior, so removing .skip
will cause a meaningful failure when the limitation is eventually fixed.

Also clarify Python integration skip tests as call-extraction issues (not
type-env) and Swift integration skips as build-dep issues (self/super
resolution code already exists in type-env.ts).

* feat: resolve 4 known limitation skip tests + method-aware type arg selection

Unskip 4 of 5 type-env known limitations with full integration test coverage:

1. TS destructured for-of: handle array_pattern by binding last named child
   to element type. Fix Map<K,V> to return last generic arg (value type).
2. Python dict.items() loop: handle `call` iterables + `pattern_list` left
   side. Fix dict[K,V] extraction via type_parameter with last-arg heuristic.
   Unwrap `type` wrapper in extractPyElementTypeFromAnnotation.
3. TS instanceof narrowing: add extractPatternBinding for binary_expression
   with positional child access. First-writer-wins (not block-scoped).
4. Rust .iter() for-loops: handle call_expression in for_expression value
   node by extracting receiver from field_expression.

Method-aware type arg resolution:
- Add TypeArgPosition ('first'|'last') to resolveIterableElementType
- .keys()/.keySet()/.Keys → first type arg (key); all else → last (value)
- Thread position through all 3 strategy callbacks in TS/Rust/Python
- Add predefined_type to extractSimpleTypeName for TS primitives (string etc)

New fixtures: rust-iter-for-loop, typescript-destructured-for-of,
typescript-instanceof-narrowing, python-dict-items-loop.
248 unit tests pass (6 new), 1 skip (Ruby block params).

* feat: container descriptor table for generic type arg resolution

Replace simple KEY_METHODS heuristic with CONTAINER_DESCRIPTORS table
that maps 30+ container types across all languages to their type parameter
semantics per access method.

Key improvements:
- Container-aware resolution: HashMap.iter() correctly yields V (arity 2),
  while Vec.iter() yields T (arity 1) — same method, different semantics
- Cross-language coverage: Map/HashMap/BTreeMap/dict/Dict/Dictionary/
  ConcurrentHashMap + List/Vec/Set/HashSet/Queue/Deque/Stack etc.
- Method categorization: keyMethods (keys/keySet/Keys) vs valueMethods
  (values/get/pop/iter/first/last) per container type
- Fallback for unknown containers: still uses method name heuristic,
  so MyCache<K,V>.keys() correctly returns first arg
- Exported getContainerDescriptor() for future heritage-chain lookups

Each language extractor now passes containerTypeName from scopeEnv to
methodToTypeArgPosition for descriptor-aware resolution.
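
A small sketch of descriptor-aware resolution, assuming a table keyed by container name. The table contents, field names, and function name below are illustrative only; the real CONTAINER_DESCRIPTORS table covers far more types.

```typescript
type TypeArgPositionSketch = "first" | "last";

// Hypothetical descriptor shape; field names are assumptions.
interface ContainerDescriptorSketch {
  arity: number;
  keyMethods: ReadonlySet<string>;
  valueMethods: ReadonlySet<string>;
}

const CONTAINER_DESCRIPTORS_SKETCH = new Map<string, ContainerDescriptorSketch>([
  ["HashMap", { arity: 2, keyMethods: new Set<string>(["keys", "keySet"]), valueMethods: new Set<string>(["values", "get", "iter"]) }],
  ["Vec", { arity: 1, keyMethods: new Set<string>(), valueMethods: new Set<string>(["iter", "first", "last"]) }],
]);

// Used only when the container is not in the table.
const FALLBACK_KEY_METHODS_SKETCH = new Set<string>(["keys", "keySet", "Keys"]);

function methodToTypeArgPositionSketch(
  methodName: string | undefined,
  containerTypeName?: string,
): TypeArgPositionSketch {
  const descriptor =
    containerTypeName !== undefined ? CONTAINER_DESCRIPTORS_SKETCH.get(containerTypeName) : undefined;
  const keyMethods = descriptor ? descriptor.keyMethods : FALLBACK_KEY_METHODS_SKETCH;
  // Key accessors select the first type argument; everything else selects the last.
  return methodName !== undefined && keyMethods.has(methodName) ? "first" : "last";
}
```

With this shape, an unknown container such as `MyCache<K,V>` still gets the method-name heuristic, while known containers can opt methods in or out per type.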

252 unit tests pass (4 new descriptor tests), 1 skip (Ruby).

* feat: method-aware for-loop extractors + integration tests for all languages

Upgrade 4 existing extractors + create 3 new ones for full cross-language
coverage of call_expression iterables and container descriptor resolution:

Upgraded (add call expr iterable + methodToTypeArgPosition):
- Java: method_invocation (data.keySet(), data.values())
- Kotlin: navigation_expression + call_expression (data.keys, data.values())
- C#: member_access_expression + invocation_expression (data.Keys, data.Values)
- Go: TypeArgPosition threading for Go 1.18+ generics

New for-loop extractors:
- C++: for_range_loop with auto& unwrapping, template_type + qualified_identifier
  (std::vector<User>) extraction, explicit vs auto type handling
- PHP: foreach_statement with simple/key-value/by-reference forms, PHPDoc
  @param priority over AST array type
- Ruby: for-in with YARD @param type resolution via comment parsing

Integration test fixtures + tests for all 6 languages:
- java-map-keys-values (Map.values() + List iteration)
- kotlin-map-keys-values (HashMap.values + List iteration)
- csharp-dictionary-keys-values (Dictionary.Values foreach)
- cpp-range-for (auto& + const auto& range-based for)
- php-foreach-loop (foreach with PHPDoc @param User[])
- ruby-for-in-loop (for-in with YARD @param Array<User>)

Bugs fixed during integration testing:
- C++: qualified_identifier (std::vector) not unwrapped to template_type
- PHP: extractParameter overwrote PHPDoc-derived types with bare 'array'

252 unit tests pass, 201 integration tests pass across 6 languages.

* fix: update extractElementTypeFromString tests for last-arg default

TypeArgPosition change (default 'last') broke 5 existing tests expecting
first arg from multi-arg generics. Updated expectations and added explicit
pos='first' tests for key type extraction.
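
To make the `'first'` versus `'last'` behaviour concrete, here is a simplified sketch that handles only angle-bracket generics. The function name is hypothetical and the real extractElementTypeFromString supports more notations (for example `User[]` and `[]User`); the depth tracking and the 2048-character guard follow the commit descriptions above.

```typescript
// Hypothetical, simplified stand-in: angle-bracket generics only.
function extractTypeArgSketch(
  typeText: string,
  pos: "first" | "last" = "last",
): string | undefined {
  if (typeText.length > 2048) return undefined; // input length guard
  const open = typeText.indexOf("<");
  if (open === -1 || !typeText.endsWith(">")) return undefined;

  const args: string[] = [];
  let depth = 0;
  let start = open + 1;
  // Walk the text between the outermost brackets, splitting on depth-0 commas.
  for (let i = open + 1; i < typeText.length - 1; i++) {
    const ch = typeText[i];
    if (ch === "<") {
      depth++;
    } else if (ch === ">") {
      depth--;
      if (depth < 0) return undefined; // mismatched brackets
    } else if (ch === "," && depth === 0) {
      args.push(typeText.slice(start, i).trim());
      start = i + 1;
    }
  }
  if (depth !== 0) return undefined; // unclosed nested bracket
  args.push(typeText.slice(start, typeText.length - 1).trim());

  const picked = pos === "first" ? args[0] : args[args.length - 1];
  return picked === "" ? undefined : picked;
}
```

Defaulting to the last argument makes `Map<string, User>` yield the value type, which is why tests that expected the key type need an explicit `'first'`.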

* fix: rename C++ fixture files to correct case for case-sensitive CI

On case-sensitive filesystems (Linux/macOS CI), git tracked both the old
lowercase files (app.cpp, user.h) and the new uppercase files (App.cpp,
User.h) as separate files. The pipeline processed both, causing the old
app.cpp (with explicit User& type) to interfere with the new auto& test.

Removes old lowercase entries and re-adds with uppercase casing to match
the #include directives in the fixture.

* feat: PR abhigyanpatwari#318 review findings — pattern bindings, member access iterables, structured bindings

Address all 7 genuine gaps identified in PR abhigyanpatwari#318 deep code review:

- Kotlin: add extractKotlinPatternBinding for when/is (type_test AST node)
  with allowPatternBindingOverwrite for smart-cast semantics
- Java: add type_pattern branch for Java 17+ switch pattern variables
- TypeScript: explicit object_pattern skip in for-of (no false bindings)
- Cross-language: member access iterables (self.users, this.users, repo.users)
  across all 10 language extractors
- C++: structured_binding_declarator handling in range-for (last-child heuristic)
- Rust: closure_parameter added to TYPED_PARAMETER_TYPES
- PHP: normalizePhpType handles angle-bracket generics (Collection<User>)

Code review fixes applied:
- Remove 4 debug console.log statements (c-cpp.ts, call-processor.ts)
- Hoist KNOWN_CONTAINER_PROPS to module scope (csharp.ts)
- Guard keysBefore allocation behind typeNode check (type-env.ts)
- Add depth limits (50) to 7 recursive type extraction functions
- Add 2048-char length cap to extractSimpleTypeName
- Fix PHP/Ruby missing typeArgPos parameter in resolveIterableElementType

Integration test fixtures: kotlin-when-pattern, java-switch-pattern,
cpp-structured-binding, typescript-member-access-for-loop,
python-member-access-for-loop

* fix: position-indexed when/is bindings, Kotlin param extraction, HashMap.values for-loop

Three root causes for failing Kotlin integration tests:

1. When/is multi-arm resolution: flat scopeEnv stored only the last arm's
   type (last-writer-wins). Added PatternOverrides with AST range indexing
   so each when arm resolves to its narrowed type independently.

2. HashMap.values for-loop: navigation_expression without call_suffix was
   classified as bare property access (iterableName='values' instead of
   'data'). Now tries object-as-iterable + property-as-method first, with
   fallback to property-as-iterable for this.users patterns.

3. Kotlin parameter extraction: tree-sitter-kotlin parameter nodes use
   positional children (simple_identifier, user_type) not named fields
   (name, type). Added fallback to findChildByType in both
   extractKotlinParameter and extractTypeBinding.

Integration tests added for .keys/.values/Set/MutableMap iteration,
3-arm when/is, multi-call within arms, and when+else branch.
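
The position-indexed idea from root cause 1 can be sketched like this. The class and field names are hypothetical, and character offsets stand in for AST node ranges; the real PatternOverrides structure may be organised differently.

```typescript
// Hypothetical sketch: narrowed types are stored with the source range they apply to.
interface PatternOverrideSketch {
  varName: string;
  typeName: string;
  start: number; // inclusive offset of the arm body
  end: number;   // exclusive offset of the arm body
}

class PatternOverridesSketch {
  private readonly entries: PatternOverrideSketch[] = [];

  add(entry: PatternOverrideSketch): void {
    this.entries.push(entry);
  }

  // Returns the narrowed type whose range contains the position,
  // preferring the innermost (smallest) range when ranges nest.
  lookup(varName: string, position: number): string | undefined {
    let best: PatternOverrideSketch | undefined;
    for (const entry of this.entries) {
      if (entry.varName !== varName) continue;
      if (position < entry.start || position >= entry.end) continue;
      if (best === undefined || entry.end - entry.start < best.end - best.start) {
        best = entry;
      }
    }
    return best?.typeName;
  }
}
```

A call site inside the first arm sees that arm's type and a call site in the second arm sees the other, so a flat last-writer-wins map is no longer needed for these bindings.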

* feat: enhance PHP type resolution for generics and member access in foreach loops

* feat: Phase 6.1 type resolution gap closure — container descriptors, recursive_pattern, class fields

Add 13 missing container type descriptors (Collection, MutableMap, Stream, SortedSet, etc.)
to CONTAINER_DESCRIPTORS for correct element type extraction across C#, Kotlin, and Java.

Extend C# pattern binding to handle recursive_pattern (obj is User { Name: "Alice" } u)
in both is-expression and switch expression contexts.

Add TypeScript class field declaration support (public_field_definition) so for-loop
iteration over this.fieldName resolves element types from class field type annotations.
Includes file-scope fallback in resolveIterableElementType and nested member_expression
handling for this.field.method() patterns.

* docs: add type resolution system documentation with roadmap

Covers the full architecture, resolution tiers (0-2), scope model,
language feature matrix, container descriptors, pipeline integration,
and the Phase 7-9 roadmap for cross-scope propagation, field-type
resolution, and return-type-aware binding.

* feat: Phase 6.2 review findings — C# nested member foreach, C++ deref range-for, Java field_access

Close two gaps found during fourth-pass review of PR abhigyanpatwari#318:

- C# foreach (var user in this.data.Values): nested member_access_expression
  now extracts intermediate property name for scopeEnv lookup
- C++ for (auto& user : *ptr): pointer_expression dereference now recognized
  as range-for iterable

Root causes fixed in shared infrastructure:
- extractSimpleTypeName: add template_type (C++) and generic_name (C#)
- extractGenericTypeArgs: add generic_name for consistency
- type-env.ts: unwrap variable_declaration wrapper in field_declaration
  for declarationTypeNodes capture (zero-allocation manual loop)

Additional review findings addressed:
- Java: add field_access handler for this.data.values() in method_invocation
- C++ pointer_expression: document limitation (*identifier only)
- TypeScript: fix stale comment about property_identifier

All 525 tests pass (278 unit + 247 integration).

* perf: optimize type resolution pipeline — worker threshold, skip graph phases, AST pruning

- Skip worker pool creation for small repos (<15 files or <512KB) — saves 100-400ms
- Add skipGraphPhases option to runPipelineFromRepo to skip MRO/community/process phases
- Add conservative SKIP_SUBTREE_TYPES for leaf-only AST nodes (string, comment, number)
- Pre-compute interestingNodeTypes set — single Set.has() replaces 3 checks per node
- Add fastStripNullable — skip full stripNullable for simple identifiers (90%+ case)
- Replace .children?.find() with manual for loops in extractFunctionName (no array alloc)
- Add hookTimeout: 120000 to vitest.config.ts for CI beforeAll hooks

* fix: review findings — remove template_string from SKIP_SUBTREE_TYPES, handle bare nullable keywords

- Remove template_string and concatenated_string from SKIP_SUBTREE_TYPES
  (template literals contain interpolated expressions with typed code)
- Add FAST_NULLABLE_KEYWORDS check to fastStripNullable for behavioral
  parity with stripNullable on bare null/undefined/void/None/nil
- Add explanatory comment on extractPendingAssignment scopeEnv guard
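
A sketch of the fast-path idea, under assumptions: the slow path below is a simplified stand-in for the real stripNullable, all names are hypothetical, and returning `undefined` for bare nullable keywords is assumed to be the behaviour the two paths must share.

```typescript
const FAST_NULLABLE_KEYWORDS_SKETCH = new Set<string>(["null", "undefined", "void", "None", "nil"]);
const SIMPLE_IDENTIFIER_RE_SKETCH = /^[A-Za-z_][A-Za-z0-9_]*$/;

// Simplified stand-in for the full stripNullable: handles "T?" and "T | null".
function stripNullableSlowSketch(typeText: string): string | undefined {
  let text = typeText.trim();
  if (text.endsWith("?")) text = text.slice(0, -1).trim();
  const parts = text
    .split("|")
    .map((part) => part.trim())
    .filter((part) => part !== "" && !FAST_NULLABLE_KEYWORDS_SKETCH.has(part));
  return parts.length === 1 ? parts[0] : undefined;
}

function fastStripNullableSketch(typeText: string): string | undefined {
  // Bare nullable keywords must behave exactly as they do on the slow path.
  if (FAST_NULLABLE_KEYWORDS_SKETCH.has(typeText)) return undefined;
  // Plain identifiers need no parsing at all.
  if (SIMPLE_IDENTIFIER_RE_SKETCH.test(typeText)) return typeText;
  return stripNullableSlowSketch(typeText);
}
```

The keyword check has to come before the identifier check, because `null` and `None` would otherwise match the identifier pattern and be returned as type names.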

* feat: add type resolution system and roadmap documentation
…aphPhases, AST pruning

- 6 new unit tests for fastStripNullable branches (simple id, nullable union, bare keyword)
- 4 new integration tests for skipGraphPhases pipeline option
- Tests for SKIP_SUBTREE_TYPES and interestingNodeTypes code paths
…shing (abhigyanpatwari#321) (abhigyanpatwari#345)

* fix(impact): return structured error + partial results instead of crashing (abhigyanpatwari#321)

- Wrap impact() in try-catch to return structured error JSON instead of
  process crash (SIGSEGV/exit 139)
- Extract core logic to _impactImpl() for clean error boundary
- Break out of depth traversal loop on query failure, return partial
  results collected so far (previously silently swallowed errors)
- Add 'partial' flag to response when traversal was interrupted
- Add try-catch in CLI impactCommand with structured error output
- Improve formatImpactResult to show suggestion text and partial warning
- Add 3 new unit tests for error/suggestion/partial scenarios

Fixes abhigyanpatwari#321

* fix: address review feedback — 4 bugs from @claude review

Per @claude's review (requested by @magyargergo):

- [BUG 1] Consistent target field shape: error responses now return
  {name: string} instead of raw string, matching success response schema
- [BUG 2] Remove misleading partial:true from total-failure responses
  (partial is only meaningful when some depth levels succeeded)
- [BUG 3] Move getBackend() inside try-catch in impactCommand so
  backend init failures return structured JSON instead of crashing
- [BUG 4] Safe error message extraction: use instanceof Error check
  to handle thrown strings correctly (err?.message is undefined for
  non-Error thrown values)
- [MINOR] Add radix argument to parseInt (10)
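
The review points above can be combined into one short sketch. The response shapes and function names are assumptions for illustration, and the per-depth query is synchronous here for brevity, whereas a real graph query would likely be asynchronous.

```typescript
// Hypothetical response shapes; field names are assumptions.
interface ImpactErrorResponseSketch {
  error: string;
  target: { name: string };
}

interface ImpactTraversalSketch {
  target: { name: string };
  byDepth: string[][];
  partial?: true;
  error?: string;
}

// Thrown values are not always Error instances; a thrown string has no .message.
function safeErrorMessageSketch(err: unknown): string {
  return err instanceof Error ? err.message : String(err);
}

function impactSketch(
  targetName: string,
  maxDepthArg: string,
  queryDepth: (depth: number) => string[],
): ImpactTraversalSketch | ImpactErrorResponseSketch {
  const target = { name: targetName }; // same shape on success and on error
  const maxDepth = parseInt(maxDepthArg, 10); // explicit radix
  const byDepth: string[][] = [];
  try {
    for (let depth = 1; depth <= maxDepth; depth++) {
      byDepth.push(queryDepth(depth));
    }
    return { target, byDepth };
  } catch (err) {
    const error = safeErrorMessageSketch(err);
    // partial is only meaningful when at least one depth level succeeded.
    if (byDepth.length === 0) return { error, target };
    return { target, byDepth, partial: true, error };
  }
}
```

Callers can then distinguish three outcomes from the JSON alone: full results, partial results with an error, or a structured error with no results.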

* test: add integration tests for impact error handling (abhigyanpatwari#321)

Per @claude's recommendation (requested by @magyargergo):

- impact: structured error for unknown symbol (no crash)
- impact: error response has consistent {name: string} target shape
- impact: partial:true only set when some results were collected

Tests use existing withTestLbugDB + seeded graph fixture.
…ss-property iterables (abhigyanpatwari#341)

* feat(type-resolution): Phase 7.1+7.2 foundation — ReturnTypeLookup, context object, pendingCallResults

- Move extractReturnTypeName + helpers from call-processor.ts to type-extractors/shared.ts
  (breaks circular import risk: call-processor → type-env → type-extractors → call-processor)
- Add SymbolTable.lookupFuzzyCallable(name) — lazy callable-only index, O(1) per call,
  invalidated on add(); avoids per-call .filter() on lookupFuzzy results
- Add ReturnTypeLookup interface (conservative: undefined when 0 or 2+ callables match)
- Add ForLoopExtractorContext interface — replaces 4 positional params with context object;
  update all 10 language extractor implementations (go, ts, py, jvm×2, cs, rs, rb, php, c-cpp)
- Add PendingAssignment discriminated union (kind: 'copy' | 'callResult');
  update PendingAssignmentExtractor in all 9 language extractors that implement it
- Wire buildTypeEnv: build ReturnTypeLookup from optional symbolTable; split pendingAssignments
  into pendingCopies + pendingCallResults; add Tier 2b call-result propagation loop
- Update call-processor.test.ts to import extractReturnTypeName from shared.ts
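
A minimal sketch of the lazy callable index and the conservative return-type lookup described above. The symbol shape, class name, and function name are hypothetical; only the behaviours named in the commit (build on first use, invalidate on `add()`, answer only for exactly one match) are modelled.

```typescript
// Hypothetical symbol shape for illustration.
interface SymbolSketch {
  name: string;
  kind: "Function" | "Method" | "Class";
  returnType?: string;
}

class SymbolTableSketch {
  private readonly symbols: SymbolSketch[] = [];
  private callableIndex: Map<string, SymbolSketch[]> | undefined;

  add(symbol: SymbolSketch): void {
    this.symbols.push(symbol);
    this.callableIndex = undefined; // invalidate; rebuilt on next lookup
  }

  lookupFuzzyCallable(name: string): readonly SymbolSketch[] {
    if (this.callableIndex === undefined) {
      const index = new Map<string, SymbolSketch[]>();
      for (const symbol of this.symbols) {
        if (symbol.kind === "Class") continue; // callables only
        const bucket = index.get(symbol.name);
        if (bucket) bucket.push(symbol);
        else index.set(symbol.name, [symbol]);
      }
      this.callableIndex = index;
    }
    return this.callableIndex.get(name) ?? [];
  }
}

// Conservative: answers only when exactly one callable has this name.
function lookupReturnTypeSketch(table: SymbolTableSketch, callee: string): string | undefined {
  const matches = table.lookupFuzzyCallable(callee);
  return matches.length === 1 ? matches[0]?.returnType : undefined;
}
```

Refusing on two or more matches trades recall for precision: an unresolved call is preferable to a CALLS edge that points at the wrong overload.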

* feat(type-resolution): Phase 7.3 — call_expression iterables in for-loop extractors (7 languages)

Extends for-loop type extraction in all 7 typed-iteration languages to
resolve element types when the iterable is a direct function call.

**New capability**: `for (var u : getUsers())` in Java, `for u in get_users()`
in Python, `for user in getUsers()` in TypeScript, etc. now resolve
`u`/`user` to the callee's return element type via lookupRawReturnType +
extractElementTypeFromString.

Changes per language:
- types.ts: extend ReturnTypeLookup with lookupRawReturnType (raw return
  string for container-type extraction); update ForLoopExtractorContext
  with returnTypeLookup field
- type-env.ts: implement lookupRawReturnType on the concrete ReturnTypeLookup
  built in buildTypeEnv (same guards as lookupReturnType, no extractReturnTypeName)
- go.ts: call_expression branch in range_clause — identifier func or
  selector_expression method; existing isChannelType guards updated
- typescript.ts: identifier fn branch inside call_expression handler
- python.ts: identifier fn branch inside call handler
- jvm.ts (Java): method_invocation without object field in enhanced_for_statement
- jvm.ts (Kotlin): simple_identifier callee branch in call_expression node
- csharp.ts: identifier fn branch in invocation_expression handler
- rust.ts: identifier func branch in call_expression handler (alongside
  existing field_expression/method-call path)

All branches follow the same conservative pattern:
  lookupRawReturnType(callee) → extractElementTypeFromString → bind loop var

* feat(type-resolution): Phase 7.4 — PHP $this->property iterable via @var class property scan

Adds Strategy C to PHP's extractForLoopBinding for the pattern:

  foreach ($this->property as $item)

when Strategy A (resolveIterableElementType) and Strategy B (scopeEnv lookup)
both fail to find the element type.

Strategy C: when the iterable is a member_access_expression with object '$this',
walk up the AST to the enclosing class_declaration, scan its declaration_list
for a property_declaration whose variable_name matches the property, and extract
the element type from:
  1. PHPDoc @var annotation on a preceding comment sibling (/** @var User[] */)
  2. PHP 7.4+ native type field (e.g. UserRepo $repo — skips generic 'array')

This eliminates the @param workaround that was previously required in the
php-foreach-member-access fixture (which used @param User[] $users on the method
to populate the method's scopeEnv with a $users binding).

New helpers in php.ts:
- PHPDOC_VAR_RE: regex for @var extraction
- extractClassPropertyElementType: reads @var or native type from a property_declaration
- findClassPropertyElementType: scans class body for a named property
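
As an illustration of the `@var` extraction step, the sketch below pulls an element type out of a PHPDoc comment. The regular expressions and function name are assumptions, not the project's PHPDOC_VAR_RE; only the `T[]` form and single-argument generics such as `Collection<User>` are handled.

```typescript
// Hypothetical pattern: captures the type token that follows @var.
const PHPDOC_VAR_RE_SKETCH = /@var\s+([^\s*]+)/;
// Hypothetical pattern for single-argument generics such as Collection<User>.
const SINGLE_GENERIC_RE_SKETCH = /^[\w\\]+<([\w\\]+)>$/;

function elementTypeFromVarDocSketch(comment: string): string | undefined {
  const declared = PHPDOC_VAR_RE_SKETCH.exec(comment)?.[1];
  if (declared === undefined) return undefined;

  // Array notation: User[] yields User.
  if (declared.endsWith("[]")) {
    const element = declared.slice(0, -2);
    return element === "" ? undefined : element;
  }
  // Single-argument generic notation: Collection<User> yields User.
  const generic = SINGLE_GENERIC_RE_SKETCH.exec(declared)?.[1];
  return generic; // undefined for bare types such as `array` or `User`
}
```

Bare types return `undefined` on purpose: a plain `array` or `User` annotation says nothing reliable about what a `foreach` over the property yields.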

Tests added (type-env.test.ts):
- PHP: resolves from @var User[] without @param workaround
- PHP: conservative — no binding for unknown property
- PHP: multi-class file — both classes resolve independently

Fixture updated (php-foreach-member-access/App.php):
- Removed the @param User[] $users workaround from processMembers()
- Test now validates the natural class-property-based resolution path

* docs: mark Phase 7 complete in type-resolution-roadmap.md

Records that 7A (call_expression iterables, 7 languages), 7B (PHP
$this->property via @var scan), and 7C (ReturnTypeLookup + context object)
are all shipped. Adds implementation notes and strikethroughs on resolved
language-specific gaps.

* fix(docs): update project references to feat-phase7-type-resolution in AGENTS.md and CLAUDE.md

* feat(type-resolution): Phase 7.5 — PHP call_expression foreach + integration tests for 7 languages

Add integration test coverage for Phase 7.3's call_expression iterable
resolution across all 7 languages (Go, TypeScript, Python, Java, Kotlin,
PHP, Rust). Each test creates a fixture with competing User/Repo classes
that both define save(), then verifies for-loop iteration over a function
call's return value resolves to the correct class.

PHP was missing function_call_expression support in its for-loop extractor.
Three changes fix this:
- php.ts extractForLoopBinding: handle function_call_expression and
  member_call_expression iterables via returnTypeLookup
- php.ts normalizePhpReturnType: preserve array notation (User[]) in
  SymbolTable so lookupRawReturnType returns useful container types
- parse-worker.ts + parsing-processor.ts: upgrade uninformative AST
  return types (array, iterable) with PHPDoc @return annotations

35 new integration tests (5 per language), 2525 total tests passing.

* fix(type-resolution): address PR abhigyanpatwari#341 review findings — PHP asymmetry + dormant infrastructure docs

- Replace normalizePhpType with extractElementTypeFromString in PHP call-expression
  foreach paths, aligning with all 6 other language extractors and preventing
  incorrect binding of bare non-container types like User
- Add NOTE comments clarifying pendingCallResults Tier 2b is infrastructure-ready
  but no extractor populates it yet
- Expand Go channel-type comments explaining why non-channel assumption is safe

* fix(type-resolution): address verification review — docs accuracy + PHP fallback guard

- Roadmap lines 86/100: correct pendingCallResults from "active" to "dormant infrastructure (Phase 9)"
- type-resolution-system.md line 363: update to reflect Phase 7.3 loop inference is delivered
- type-resolution-system.md line 409: clarify for-loop call-expression resolution (done) vs general assignment propagation (pending)
- php.ts:127: add declaration_list type guard on fallback to prevent silent wrong results
… (abhigyanpatwari#349)

* fix: MCP server crashes under parallel tool calls (abhigyanpatwari#326)

* fix: ensure full connection pool is pre-created to avoid race conditions during query execution

* fix: improve graceful shutdown handling with exit codes

* fix: resolve critical concurrency bugs in connection pool init

- Add initPromises dedup map to prevent double-init race when parallel
  tool calls trigger initLbug for the same repoId simultaneously
- Move pool.set() after FTS load so concurrent checkout can't grab a
  connection mid-async-init (FTS race on available[0])
- Replace lazy createConnection growth path with integrity error — pool
  is pre-warmed, lazy creation would silence stdout during active queries
- Add preWarmActive flag so watchdog timer skips stdout restore during
  the synchronous pre-warm loop
- Unify stdout capture: server.ts imports realStdoutWrite from
  lbug-adapter instead of capturing its own copy
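
The init dedup described above can be sketched roughly as follows. This is a minimal illustration, not the adapter's actual code: `Pool`, `createPool`, `initOnce`, and `initCount` are hypothetical names, and only `initPromises` is taken from the commit message.

```typescript
// Hypothetical stand-in for a pre-warmed pool; not the real adapter types.
type Pool = { repoId: string; connections: number };

const pools = new Map<string, Pool>();
const initPromises = new Map<string, Promise<Pool>>();
let initCount = 0; // counts how many real initializations were started

// Simulated async initialization (pre-warm connections, load FTS, and so on).
async function createPool(repoId: string): Promise<Pool> {
  initCount++;
  await Promise.resolve(); // yield, as a real async init would
  return { repoId, connections: 4 };
}

// Concurrent callers for the same repoId share one in-flight promise.
function initOnce(repoId: string): Promise<Pool> {
  const ready = pools.get(repoId);
  if (ready) return Promise.resolve(ready);

  const pending = initPromises.get(repoId);
  if (pending) return pending;

  const p = createPool(repoId).then(
    (pool) => {
      pools.set(repoId, pool); // publish only after init has fully finished
      initPromises.delete(repoId);
      return pool;
    },
    (err) => {
      initPromises.delete(repoId); // allow a retry after a failed init
      throw err;
    },
  );
  initPromises.set(repoId, p);
  return p;
}
```

Publishing to `pools` only inside the success handler mirrors the "move pool.set() after FTS load" point: a concurrent caller sees either the in-flight promise or a fully initialized pool.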

* test: add connection pool parallel stability tests

7 integration tests covering concurrent query safety, waiter queue
overflow, stdout.write restoration, connection leak detection, initLbug
deduplication, atomic pool visibility, and mixed query types.

* fix: run LadybugDB tests sequentially via vitest projects config

Vitest's projects feature splits test files into two groups: lbug-db
(fileParallelism: false) and default (parallel). This prevents native
mmap file-lock conflicts on Windows without requiring the CI shell loop
locally.

* test: add enrichment Promise.all regression test for abhigyanpatwari#292/abhigyanpatwari#316

Verifies that 3 concurrent queries via Promise.all (the exact pattern
from the impact command's enrichment phase at local-backend.ts:1415)
complete without SIGSEGV on a pre-warmed connection pool.
…rns (abhigyanpatwari#356)

- Handle `for i, k, v in enumerate(d.items())` — flat pattern
- Handle `for i, (k, v) in enumerate(d.items())` — nested tuple_pattern
- Handle `for (k, v) in enumerate(users)` — parenthesized tuple as top-level

Extract helper functions for cleaner code:
- `extractMethodCall()` — deduplicate method call parsing
- `collectPatternIdentifiers()` — recursively collect identifiers from patterns
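
A recursive identifier collector along the lines of `collectPatternIdentifiers()` might look like the sketch below. The `PatternNode` interface is a simplified stand-in for a tree-sitter node, and the function signature is an assumption; only the helper's name comes from the commit message.

```typescript
// Minimal stand-in for a tree-sitter node; the real node type has more fields.
interface PatternNode {
  type: string;
  text: string;
  namedChildren: PatternNode[];
}

// Recursively gather identifier names from flat or nested loop targets.
function collectPatternIdentifiers(node: PatternNode, out: string[] = []): string[] {
  if (node.type === "identifier") {
    out.push(node.text);
    return out;
  }
  for (const child of node.namedChildren) {
    collectPatternIdentifiers(child, out);
  }
  return out;
}

const id = (text: string): PatternNode => ({ type: "identifier", text, namedChildren: [] });

// Approximate shape of `i, (k, v)` in `for i, (k, v) in enumerate(d.items())`.
const nested: PatternNode = {
  type: "pattern_list",
  text: "i, (k, v)",
  namedChildren: [
    id("i"),
    { type: "tuple_pattern", text: "(k, v)", namedChildren: [id("k"), id("v")] },
  ],
};

const names = collectPatternIdentifiers(nested);
```

Because the walk does not care which wrapper node it passes through, the flat, nested, and parenthesized forms listed above all reduce to the same identifier list.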

Add unit tests for TypeEnv and integration tests verifying CALLS edges.

Made-with: Cursor

Co-authored-by: chirag-nighut <chiragnighut@gmail.com>
demirciberk and others added 7 commits March 18, 2026 14:52
…bhigyanpatwari#364)

* fix: mapping all supported languages to callRouters to fix undefined apply error (abhigyanpatwari#352)

* fix(cli): remove duplicate Swift/Kotlin keys in callRouters causing TS1117

* refactor: remove runtime callRouter fallbacks, rely on Record<SupportedLanguages> compile-time enforcement

* refactor: use 'satisfies' keyword for callRouters compile-time enforcement
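
As a rough illustration of the compile-time enforcement, the sketch below models `SupportedLanguages` as a string-literal union rather than the project's enum, and `CallRouter` is a hypothetical signature. Omitting any language from the object, or listing one twice, fails type checking instead of surfacing as an undefined router at runtime.

```typescript
// Simplified stand-in for the SupportedLanguages enum.
type SupportedLanguages = "typescript" | "swift" | "kotlin";

// Hypothetical router signature; the real one is not shown in the commit.
type CallRouter = (calledName: string) => string | null;

const noRouting: CallRouter = () => null;

// `satisfies` verifies that every language has an entry without widening
// the inferred type of the object literal.
const callRouters = {
  typescript: noRouting,
  swift: noRouting,
  kotlin: (calledName) => (calledName === "println" ? "kotlin.io" : null),
} satisfies Record<SupportedLanguages, CallRouter>;

function routeCall(language: SupportedLanguages, calledName: string): string | null {
  return callRouters[language](calledName);
}
```
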
* feat: Phase 8 field/property type resolution — resolve chained member access

Add field/property type extraction to the type resolution system so that
chained member access like `user.address.save()` resolves the intermediate
receiver type (`address → Address`) through Property symbols in SymbolTable.

Key changes:
- SymbolTable: add `declaredType` field, `fieldByOwner` O(1) index,
  `lookupFieldByOwner()` method, P0 conditional callableIndex invalidation,
  P2 exclude Properties from globalIndex to prevent namespace pollution
- tree-sitter queries: add `definition.property` for TypeScript, Java, Go
- parse-worker: extract declared types for Property nodes via
  `extractPropertyDeclaredType()`, capture field-access receiver info
- call-processor: add `resolveFieldAccessType()` helper and field-access
  branch in both sequential and worker receiver resolution paths
- Integration tests: new field-types test suite verifying end-to-end
  `user.address.save() → Address#save` resolution
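
The owner-keyed field index can be pictured with the following sketch. The key format, the use of a type name as `ownerId`, and `resolveReceiverType` are assumptions made for illustration; only `fieldByOwner`, `lookupFieldByOwner`, and `declaredType` appear in the commit message.

```typescript
interface FieldSymbol {
  ownerId: string; // class that declares the field (a type name in this sketch)
  name: string;
  declaredType: string; // e.g. "Address"
}

// One Map lookup per (owner, field) pair; the "::" key format is an assumption.
const fieldByOwner = new Map<string, FieldSymbol>();

function addField(field: FieldSymbol): void {
  fieldByOwner.set(`${field.ownerId}::${field.name}`, field);
}

function lookupFieldByOwner(ownerId: string, name: string): FieldSymbol | undefined {
  return fieldByOwner.get(`${ownerId}::${name}`);
}

// For `user.address.save()`: start from the type of `user`, follow each field.
function resolveReceiverType(startType: string, fields: string[]): string | undefined {
  let current: string | undefined = startType;
  for (const name of fields) {
    if (current === undefined) return undefined;
    current = lookupFieldByOwner(current, name)?.declaredType;
  }
  return current;
}

addField({ ownerId: "User", name: "address", declaredType: "Address" });
```

With the receiver type resolved to `Address`, the call `save()` can then be looked up as a method on that type.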

* fix: Go tree-sitter query captures field_declaration not field_declaration_list

Post-review fix: the Go struct field query incorrectly put @definition.property
on field_declaration_list (the list container) instead of field_declaration
(the individual field). Also removed unused `language` parameter from
extractPropertyDeclaredType.

* feat: expand field-type tests to 6 languages, fix Go ownerId and Kotlin navigation_expression

- Add integration test fixtures for Java, C#, Go, Kotlin, PHP (alongside existing TS)
- Fix Go: add type_declaration handling in findEnclosingClassId for struct fields
  (field_declaration → field_declaration_list → struct_type → type_spec → type_declaration)
- Fix Kotlin: add navigation_expression handling in field-access resolution
  (Kotlin uses navigation_expression + navigation_suffix, not member_expression)
- Add extractMemberAccessParts helper in call-processor for cross-language member access
- All 24 field-type tests pass across 6 languages, 181 Go+Kotlin tests pass with no regressions

* refactor: split HAS_METHOD into HAS_METHOD + HAS_PROPERTY edge types

Property nodes now use HAS_PROPERTY edges instead of HAS_METHOD, giving
the graph schema proper semantic separation between methods and fields.

- HAS_METHOD: Method, Constructor, Function (when inside a class)
- HAS_PROPERTY: Property nodes (class fields, struct fields, attributes)

MRO processor only reads HAS_METHOD — properties correctly excluded from
method resolution order. Impact analysis accepts both edge types.

Updated 12 files: graph types, schema, tools docs, parse-worker,
parsing-processor, call-processor, and 6 test files.

* fix(test): update security test to expect 7 VALID_RELATION_TYPES (added HAS_PROPERTY)

* test: add unit tests for Phase 8 SymbolTable features (39 tests, up from 19)

Cover all new branches: declaredType metadata, Property exclusion from
globalIndex, conditional callableIndex invalidation, lookupFieldByOwner
(happy path + edge cases), lookupFuzzyCallable filtering, and clear()
with fieldByOwner. Fixes branch coverage threshold (21.8% → 23%+).

* feat: Phase 8B mixed field+method chain resolution, C++/Rust chain fixes

Unify field and method chain resolution into a single `extractMixedChain`
walker that handles interleaved patterns like `svc.getUser().address.save()`.
Fix C++ chain calls (tree-sitter-cpp `field_expression` uses `argument` not
`object`), Rust unit struct instantiation (`let svc = TypeName;`), and add
stdlib passthrough for `unwrap()`/`clone()`/`expect()` in chain loops.

Key changes:
- Replace `receiverCallChain` + `receiverFieldAccess` with unified
  `receiverMixedChain: MixedChainStep[]` on ExtractedCall
- Add `extractMixedChain` in utils.ts (handles both call_expression and
  field_expression nodes, including C++ `argument` field)
- Add `TYPE_PRESERVING_METHODS` set for stdlib identity operations
- Add C++ inline method double-indexing guard in parsing-processor.ts
  and parse-worker.ts
- Add Rust unit struct recognition in type-extractors/rust.ts
- Split field-types.test.ts into per-language test files
- Add ts-mixed-chain fixture and integration tests
- Resolve rust.test.ts todo: Option<T>.unwrap().save() now works
- Update roadmap: Phases 7+8 complete, Phase 9 is next
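
A mixed chain walk could be sketched as below. The lookup tables and the `walkMixedChain` signature are hypothetical simplifications (real resolution goes through the symbol table), while `MixedChainStep` and `TYPE_PRESERVING_METHODS` are names from the commit message with assumed shapes.

```typescript
type MixedChainStep = { kind: "field" | "call"; name: string };

// Hypothetical lookup tables keyed by "Owner.member".
const fieldTypes = new Map<string, string>([["User.address", "Address"]]);
const returnTypes = new Map<string, string>([["UserService.getUser", "User"]]);

// Calls treated as identity on the receiver type (illustrative subset).
const TYPE_PRESERVING_METHODS = new Set(["unwrap", "clone", "expect"]);

function walkMixedChain(baseType: string, chain: MixedChainStep[]): string | undefined {
  let current = baseType;
  for (const step of chain) {
    if (step.kind === "call" && TYPE_PRESERVING_METHODS.has(step.name)) continue;
    const table = step.kind === "field" ? fieldTypes : returnTypes;
    const next = table.get(`${current}.${step.name}`);
    if (next === undefined) return undefined; // stop rather than guess
    current = next;
  }
  return current;
}

// Receiver of `save()` in `svc.getUser().address.save()`, with svc: UserService.
const receiver = walkMixedChain("UserService", [
  { kind: "call", name: "getUser" },
  { kind: "field", name: "address" },
]);
```
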

* fix: Python declaredType extraction and sequential-path property registration

- Move @definition.property capture from expression_statement to assignment
  node in Python queries so Strategy 1 childForFieldName('type') succeeds
- Pass item.declaredType through ctx.symbols.add in sequential call-processor
  path, matching worker path behavior (fixes Ruby YARD declaredType drop)
- Add Python chain resolution integration test (user.address.save → Address#save)
- Update Rust/Python status in roadmap and system docs to reflect actual coverage

* fix: Python/Ruby field type disambiguation and Rust chain test

Three fixes from PR abhigyanpatwari#354 third review:

1. Python typed_parameter name extraction: tree-sitter-python's
   typed_parameter uses positional children for the name, not a named
   field. TypeEnv and extractParameter now fall back to firstNamedChild.

2. Ruby/Python call-step field resolution: Ruby's AST uses `call` nodes
   for both property access and method calls. The chain walker now tries
   resolveFieldAccessType before resolveCallTarget for call steps, so
   attr_accessor properties resolve via declaredType.

3. Rust chain resolution test: added missing integration test asserting
   user.address.save() resolves to Address#save.

Also splits C/C++ and TS/JS columns in type-resolution-system.md
language matrix with footnotes for accuracy.

1062 resolver integration tests passing, 0 failures.

* refactor: Phase 8 code review cleanup — extract walkMixedChain, fix MCP agent gaps

- Extract duplicated chain resolution loop into shared walkMixedChain() helper,
  eliminating ~60 lines of copy-pasted code between sequential and worker paths
- Add returnType to ResolveResult, removing redundant lookupFuzzy+find per chain step
- Fix context() tool to include HAS_METHOD, HAS_PROPERTY, OVERRIDES in queries
  so agents can discover class members
- Fix p.declaredType Cypher example (column doesn't exist) → p.description
- Add HAS_METHOD, HAS_PROPERTY, OVERRIDES to schema resource
- Document HAS_METHOD/HAS_PROPERTY in impact tool description
- Delete dead code extractMemberAccessParts (superseded by extractMixedChain)
- Replace any with SyntaxNode on extractPropertyDeclaredType
- Add Rust deep-field-chain test (5 tests), Java mixed-chain (4), Go mixed-chain (4)
- All 1075 tests pass (13 new, 0 regressions)

* refactor: type SymbolDefinition.type as NodeLabel, add O(1) receiver index

- Change SymbolDefinition.type from string to NodeLabel union (35 members)
  across symbol-table.ts, parse-worker.ts, parsing-processor.ts — compiler
  now enforces correctness at all comparison/assignment sites
- Replace O(N*M) linear scan in lookupReceiverType with pre-built
  ReceiverTypeIndex (Map<funcName, Map<varName, Entry>>) for O(1) lookups
  with proper ambiguity handling and file-level fallback
- All 1075 tests pass, 0 regressions
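
The nested-map index can be illustrated with the sketch below. `Binding`, `Entry`, and `buildReceiverTypeIndex` are assumed shapes, and the file-level fallback mentioned above is left out for brevity.

```typescript
interface Entry { typeName: string; ambiguous: boolean }
type ReceiverTypeIndex = Map<string, Map<string, Entry>>;

// Hypothetical input record: variable `varName` in function `funcName` has type `typeName`.
interface Binding { funcName: string; varName: string; typeName: string }

function buildReceiverTypeIndex(bindings: Binding[]): ReceiverTypeIndex {
  const index: ReceiverTypeIndex = new Map();
  for (const b of bindings) {
    let byVar = index.get(b.funcName);
    if (!byVar) {
      byVar = new Map<string, Entry>();
      index.set(b.funcName, byVar);
    }
    const existing = byVar.get(b.varName);
    if (!existing) {
      byVar.set(b.varName, { typeName: b.typeName, ambiguous: false });
    } else if (existing.typeName !== b.typeName) {
      existing.ambiguous = true; // conflicting bindings: refuse to pick one
    }
  }
  return index;
}

// Two Map lookups replace a scan over every binding.
function lookupReceiverType(index: ReceiverTypeIndex, funcName: string, varName: string): string | undefined {
  const entry = index.get(funcName)?.get(varName);
  return entry && !entry.ambiguous ? entry.typeName : undefined;
}

const index = buildReceiverTypeIndex([
  { funcName: "handle", varName: "user", typeName: "User" },
  { funcName: "handle", varName: "repo", typeName: "UserRepo" },
  { funcName: "handle", varName: "repo", typeName: "OrderRepo" },
]);
```
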

* fix: capture C++ pointer/ref fields, Kotlin data class props, PHP constructor promotion

Add tree-sitter query patterns for three previously missed property declaration
forms: C++ pointer/reference member fields (Address* addr; Address& ref;),
Kotlin primary constructor val/var parameters (data class User(val name: String)),
and PHP 8.0+ constructor property promotion (public Address $address).

Fix "10 languages" off-by-one in docs (Ruby is single-level only, not deep chain).
Update Python feature matrix cell from No* to Yes* after 31b95f0 fix.

11 new integration tests with per-language fixtures verify property capture,
HAS_PROPERTY edge emission, and field-access chain resolution.
The SupportedLanguages enum includes Kotlin but the web project's
LANGUAGE_QUERIES and languageFileMap Records were missing it, breaking
the Vercel build with TS2741.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@zander-raycraft

Owner Author

@claude can you look at this

@zander-raycraft zander-raycraft marked this pull request as ready for review March 18, 2026 21:48
@zander-raycraft zander-raycraft merged commit 8b9e5d6 into main Mar 18, 2026
17 checks passed
@zander-raycraft

Owner Author

@claude can you review this

@zander-raycraft

Owner Author

@claude can you look at this pt3

@zander-raycraft

Owner Author

@claude v3 can you look at this

@claude

claude Bot commented Mar 18, 2026


Claude encountered an error.


I'll analyze this and get back to you.

@zander-raycraft zander-raycraft deleted the CICD/claude-actions-review branch March 18, 2026 22:39
zander-raycraft pushed a commit that referenced this pull request Apr 6, 2026
…gyanpatwari#498)

* feat: add COBOL language support with regex extraction pipeline

Standalone COBOL processor following the markdown-processor.ts pattern:
- No LanguageProvider modification — COBOL uses regex, not tree-sitter
- No SupportedLanguages enum change — standalone processor pattern

New files:
- cobol-processor.ts — orchestrator (processCobol, isCobolFile, isJclFile)
- cobol/cobol-preprocessor.ts — regex state machine extraction (~888 LOC)
- cobol/cobol-copy-expander.ts — COPY statement expansion with circular detection
- cobol/jcl-parser.ts — JCL job/step/DD extraction
- cobol/jcl-processor.ts — JCL graph node creation

Extraction produces:
- Module nodes (PROGRAM-ID)
- Function nodes (paragraphs)
- Namespace nodes (sections)
- Property nodes (data items)
- CALLS edges (PERFORM intra-file, CALL cross-program)
- IMPORTS edges (COPY statements)
- CONTAINS edges (section → paragraph hierarchy)

Pipeline integration: single processCobol() call in Phase 2.6

54 new tests (33 COBOL + 21 JCL), all 3889 tests pass.

* docs: document custom processor pattern in pipeline.ts

Add comment block at the custom processor integration point
documenting the pattern for future non-tree-sitter language additions.

* feat(cobol): enrich graph with EXEC SQL/CICS, ENTRY points, MOVE data flow, PERFORM THRU

Maps the remaining 60% of CobolRegexResults to the graph:
- EXEC SQL blocks → CodeElement nodes + ACCESSES edges to DB tables
- EXEC CICS LINK/XCTL → CodeElement nodes + cross-program CALLS edges
- ENTRY points → Constructor nodes (registered for cross-program resolution)
- MOVE statements → ACCESSES edges (read/write data flow tracking)
- PERFORM THRU → expanded CALLS edges for range targets
- File declarations → Record nodes with assignment metadata
- Cross-program CALL 2nd pass: resolves unresolved targets after all programs processed

* test(cobol): add 26 integration tests with exact assertions + fix CICS resolution bug

Integration tests (test/integration/resolvers/cobol.test.ts):
- 26 tests covering full COBOL system extraction
- ALL assertions use exact toBe(N) — zero fuzzy assertions
- Fixtures: CUSTUPDT.cbl, AUDITLOG.cbl, CUSTDAT.cpy, RPTGEN.cbl, RUNJOBS.jcl

Bug fix (cobol-processor.ts):
- CICS LINK/XCTL cross-program resolution was broken — edges were
  created with "resolved" reason but pointing to <unresolved> targets
- Fix: use cics-link-unresolved / cics-xctl-unresolved suffix pattern
  matching the existing cobol-call-unresolved pattern
- Second-pass resolver now patches both CALL and CICS unresolved edges

All 3915 tests pass, 0 failures.

* test(cobol): exhaustive 57-test suite with strict exact assertions

Complete rewrite of COBOL integration tests using ground-truth approach:
dump the full graph, then assert EVERY node and EVERY edge.

57 tests across 9 sections:
- Node completeness: Module(3), Function(13), Namespace(2), Property(21),
  Record(1), CodeElement(8), Constructor(1) — exact sorted arrays
- Edge completeness: 22 tests covering every type+reason combination
  with exact source→target pairs
- Cross-program resolution: 6 tests verifying CALL, CICS LINK/XCTL, JCL
- COPY expansion: copybook data items in RPTGEN
- Section hierarchy: exact paragraph membership per section
- Data item ownership: exact per-module breakdown
- MOVE data flow: exact read/write pairs
- JCL integration: job/step/dataset containment
- Grand totals: CALLS(22), CONTAINS(48), IMPORTS(1), ACCESSES(7)

Fixture enhancements:
- CUSTUPDT.cbl: added INIT-SECTION + PROCESSING-SECTION, PERFORM THRU
- AUDITLOG.cbl: added ENTRY "AUDITLOG-BATCH"
- RPTGEN.cbl: added EXEC CICS XCTL

Zero fuzzy assertions — every expect uses toBe(N) or toEqual([...sorted]).

* fix(cobol): add removeRelationship API + single-quote CALL/COPY/ENTRY, PERFORM keyword skip

Phase 0A: Add removeRelationship(id) to KnowledgeGraph interface and
implementation (trivial Map.delete wrapper). Required for orphan edge
cleanup in next commit.

Phase 1A (from PR abhigyanpatwari#500 review, modified):
- RE_CALL and RE_COPY_QUOTED now match both "double" and 'single' quotes
- parseSingleCopyStatement in copy-expander updated for single quotes
- PERFORM_KEYWORD_SKIP set prevents UNTIL/VARYING/WITH/TEST/FOREVER
  from being stored as false-positive perform targets
- Sequence number stripping uses /[^0-9 ]/ (preserves numeric seq numbers
  unlike PR abhigyanpatwari#500's /\S/ which stripped them)
- Normalized || to ?? for regex group extraction in copy-expander

5 new graph unit tests, all 57 COBOL integration tests pass.

* fix(cobol): RE_ENTRY single-quote + remove orphan unresolved CALLS edges

Phase 1B: RE_ENTRY regex now supports both "double" and 'single' quoted
ENTRY targets. Uses named intermediates (entryName, usingClause) with ??
operator. USING capture group shifted from [2] to [3].

Phase 1C: Second-pass resolution now collects resolved orphan edge IDs
during iteration and removes them after the loop completes, using the new
graph.removeRelationship() API. Graph no longer contains phantom
<unresolved>: edges alongside their resolved replacements. CALLS count
drops from 22 to 18 (4 orphan edges removed).
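
The collect-then-remove pattern can be sketched as follows. The `Edge` shape, the `:resolved` ID suffix, and the choice to keep edges whose target is still unknown are assumptions; `edges.delete` stands in for the `graph.removeRelationship()` call named above.

```typescript
interface Edge { id: string; type: string; sourceId: string; targetId: string }

const UNRESOLVED_PREFIX = "<unresolved>:";

// Second pass: replace unresolved edges whose target program is now known.
function resolveUnresolvedCalls(edges: Map<string, Edge>, moduleNodeIds: Map<string, string>): number {
  const orphanIds: string[] = [];
  const resolved: Edge[] = [];

  edges.forEach((edge) => {
    if (!edge.targetId.startsWith(UNRESOLVED_PREFIX)) return;
    const target = moduleNodeIds.get(edge.targetId.slice(UNRESOLVED_PREFIX.length));
    if (target === undefined) return; // still unknown: left in place in this sketch
    resolved.push({ ...edge, id: `${edge.id}:resolved`, targetId: target });
    orphanIds.push(edge.id);
  });

  // Mutate only after the iteration has finished.
  for (const id of orphanIds) edges.delete(id);
  for (const edge of resolved) edges.set(edge.id, edge);
  return orphanIds.length;
}
```
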

* fix(cobol): Property ID collisions + O(1) Map lookup for MOVE edges

Phase 1D+3C (atomic): Property node IDs now use composite key
filePath:section:level:name instead of filePath:name. This prevents
duplicate data item names in different sections (e.g., STATUS in both
WORKING-STORAGE and LINKAGE) from silently colliding.

New generatePropertyId() helper ensures both node creation and MOVE
edge lookup use the identical key formula. buildDataItemMap() replaces
the O(n) findDataItemNode linear scan with O(1) Map lookup, built once
per file before MOVE processing.
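
A minimal sketch of the composite key and the per-file lookup map follows. The `DataItem` shape and the first-declaration-wins rule for duplicate names are assumptions; the commit message only specifies the `filePath:section:level:name` key formula and the two helper names.

```typescript
interface DataItem { name: string; level: number; section: string }

// Single source of truth for Property node IDs.
function generatePropertyId(filePath: string, item: DataItem): string {
  return `${filePath}:${item.section}:${item.level}:${item.name}`;
}

// Built once per file; MOVE processing then resolves names with one Map lookup.
// Assumption: when a name occurs in several sections, the first declaration wins.
function buildDataItemMap(filePath: string, items: DataItem[]): Map<string, string> {
  const byName = new Map<string, string>();
  for (const item of items) {
    const key = item.name.toUpperCase();
    if (!byName.has(key)) byName.set(key, generatePropertyId(filePath, item));
  }
  return byName;
}

const items: DataItem[] = [
  { name: "STATUS", level: 5, section: "WORKING-STORAGE" },
  { name: "STATUS", level: 5, section: "LINKAGE" },
];
const ids = items.map((item) => generatePropertyId("CUSTUPDT.cbl", item));
const dataItemMap = buildDataItemMap("CUSTUPDT.cbl", items);
```
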

* feat(cobol): MOVE multi-target extraction with OF/IN qualifier filtering

MOVE X TO A B C now produces write edges for all targets, not just the
first. extractMoveTargets() helper handles OF/IN qualified names
(WS-NAME OF WS-RECORD -> target is WS-NAME), subscript stripping
(WS-TABLE(I) -> WS-TABLE), and MOVE_SKIP filtering on targets.

Data model: CobolRegexResults.moves.to:string -> targets:string[]
MOVE CORRESPONDING stays single-target per COBOL standard.
Processor MOVE loop now iterates move.targets.
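
Multi-target extraction might be approximated as below. The input is assumed to be the text following `TO`, the contents of `MOVE_SKIP` are illustrative, and nested parentheses in subscripts are not handled.

```typescript
// Illustrative subset; the real MOVE_SKIP contents are not shown in the commit.
const MOVE_SKIP = new Set(["SPACES", "SPACE", "ZEROS", "ZEROES", "ZERO"]);

// Input: the text after TO in `MOVE X TO ...`, e.g. "A B C."
function extractMoveTargets(afterTo: string): string[] {
  const tokens = afterTo
    .replace(/\([^)]*\)/g, "") // strip subscripts: WS-TABLE(I) becomes WS-TABLE
    .replace(/\.\s*$/, "")     // strip the terminating period
    .split(/\s+/)
    .filter((t) => t.length > 0);

  const targets: string[] = [];
  let skipNext = false;
  for (const raw of tokens) {
    const name = raw.toUpperCase();
    if (name === "OF" || name === "IN") {
      skipNext = true; // the following token is a qualifier, not a target
      continue;
    }
    if (skipNext) {
      skipNext = false;
      continue;
    }
    if (!MOVE_SKIP.has(name)) targets.push(name);
  }
  return targets;
}
```
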

* feat(cobol): COPY IN/OF library, pseudotext REPLACING, dynamic CALL, PERFORM TIMES, CICS MAP unquoted

Phase 2B: COPY ... IN/OF library-name now captured as metadata in
CopyResolution (IN and OF are synonyms per COBOL-85 standard).

Phase 2C: COPY REPLACING ==pseudotext== support. Tokenizer handles
==...== delimiters alongside "quoted" strings. Pseudotext forces EXACT
type. Two-pass applyReplacing: first pass handles space-containing/
non-identifier pseudotext via global string replace; second pass handles
identifier-level LEADING/TRAILING/EXACT. New test file
cobol-copy-expander.test.ts with 10 tests.

Phase 2E: PERFORM WS-COUNT TIMES no longer produces a false-positive
perform target (checks for TIMES keyword after captured identifier).

Phase 2F: Dynamic CALL via data item (CALL WS-PROG-NAME without quotes)
now emits a CodeElement annotation node with description 'dynamic-call'
instead of silently ignoring. Adds isQuoted:boolean to call results.

Phase 3A: CICS MAP(WS-MAP-NAME) unquoted identifiers now captured.
Phase 3B: Normalized || to ?? in copy-expander (done in Phase 1A).

* feat(cobol): nested program support — capture multiple PROGRAM-IDs per file

Phase 2D: The state machine now captures all PROGRAM-IDs, not just the
first. The primary program name stays in programName; additional nested
programs go into nestedPrograms[]. The processor creates separate Module
nodes for each nested program, contained by the outer module, and
registers them in moduleNodeIds for cross-program CALL resolution.

Paragraphs/data items are not yet scoped per-program (attributed to the
outer module) — full per-program scoping is a future enhancement that
requires END PROGRAM boundary tracking in the state machine.

* test(cobol): expand integration tests for all new language features

New fixtures:
- NESTED.cbl — two PROGRAM-IDs (OUTER-PROG, INNER-PROG) for nested
  program support testing
- COPYLIB.cpy — copybook for pseudotext REPLACING test target

Modified fixtures:
- CUSTUPDT.cbl — single-quoted ENTRY 'ALTENTRY', multi-target MOVE
  (WS-AMT TO FIELD-A FIELD-B), dynamic CALL WS-PROG-NAME, COPY COPYLIB
  with pseudotext REPLACING, LINKAGE SECTION with LS-PARAM
- RPTGEN.cbl — PERFORM WS-COUNT TIMES (false-positive guard), unquoted
  MAP(WS-MAP-NAME), additional data items WS-COUNT WS-MAP-NAME

Integration test rewritten with 62 exact assertions covering:
- 5 Module, 17 Function, 33 Property, 9 CodeElement, 2 Constructor nodes
- Nested program containment (OUTER-PROG -> INNER-PROG)
- Dynamic CALL annotation (CodeElement with cobol-dynamic-call)
- Multi-target MOVE (UPDATE-BALANCE: 2 reads, 3 writes)
- Single-quoted ENTRY (ALTENTRY under CUSTUPDT)
- PERFORM TIMES guard (WS-COUNT not in CALLS)
- Orphan unresolved edge removal (zero -unresolved edges)
- Grand totals: 21 CALLS, 68 CONTAINS, 2 IMPORTS, 10 ACCESSES

* fix(cobol): pseudotext REPLACING now applies correctly via isPseudotext flag

Root cause: ==PREFIX-== matched /^[A-Z][A-Z0-9-]*$/i (trailing hyphens
allowed), routing it to the second-pass EXACT identifier match where
PREFIX-RECORD !== PREFIX- failed silently.

Fix: Propagate isPseudotext from parseReplacingClause to CopyReplacing
interface, then use it in applyReplacing first-pass condition to force
global string replacement for all pseudotext entries regardless of
whether the content looks like an identifier.

Result: COPY COPYLIB REPLACING ==PREFIX-== BY ==WS-==. now correctly
transforms PREFIX-RECORD → WS-RECORD, PREFIX-CODE → WS-CODE, etc.
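
The effect of the `isPseudotext` flag can be illustrated with this two-pass sketch. The `CopyReplacing` fields other than `isPseudotext` are assumed, and LEADING/TRAILING handling is omitted.

```typescript
interface CopyReplacing { from: string; to: string; isPseudotext: boolean }

function applyReplacing(text: string, rules: CopyReplacing[]): string {
  let out = text;

  // First pass: pseudotext is replaced as a raw substring, even when it
  // happens to look like an identifier (e.g. "PREFIX-").
  for (const rule of rules) {
    if (rule.isPseudotext && rule.from.length > 0) {
      out = out.split(rule.from).join(rule.to);
    }
  }

  // Second pass: non-pseudotext rules only match whole identifiers (EXACT).
  return out.replace(/[A-Za-z0-9][A-Za-z0-9-]*/g, (word) => {
    for (const rule of rules) {
      if (!rule.isPseudotext && rule.from.toUpperCase() === word.toUpperCase()) {
        return rule.to;
      }
    }
    return word;
  });
}

const copybook = "01 PREFIX-RECORD.\n   05 PREFIX-CODE PIC X.";
const expanded = applyReplacing(copybook, [{ from: "PREFIX-", to: "WS-", isPseudotext: true }]);
```

Without the flag, `PREFIX-` would fall through to the second pass, where it never equals a whole word such as `PREFIX-RECORD`, which is the silent failure described above.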

* refactor(cobol): per-program scoping via boundary tracking + line-range grouping

State machine changes (minimal, ~30 lines):
- Add RE_END_PROGRAM regex for END PROGRAM program-name. detection
- Replace nestedPrograms[] with programs[] containing startLine/endLine/
  nestingDepth metadata for each PROGRAM-ID in the file
- Reset division/section/paragraph state on new PROGRAM-ID boundary
- EOF finalization flushes remaining stack entries (single-program files)
- Programs sorted by startLine (outer before inner)

Processor changes:
- Uses programs[] with line-range containment to find enclosing parent
  Module for nested programs (replaces hardcoded nestedParent logic)
- programModuleIds Map tracks Module node IDs per program name
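
Line-range containment could be implemented along these lines. `ProgramInfo` mirrors the metadata listed above, while `findEnclosingProgram` and its `exclude` parameter are hypothetical.

```typescript
interface ProgramInfo { name: string; startLine: number; endLine: number; nestingDepth: number }

// Return the innermost program whose line range contains `line`.
// `exclude` lets a nested program look up its parent instead of itself.
function findEnclosingProgram(programs: ProgramInfo[], line: number, exclude?: string): ProgramInfo | undefined {
  let best: ProgramInfo | undefined;
  for (const p of programs) {
    if (p.name === exclude) continue;
    if (line < p.startLine || line > p.endLine) continue;
    // The smallest containing range is the innermost program.
    if (best === undefined || p.endLine - p.startLine < best.endLine - best.startLine) {
      best = p;
    }
  }
  return best;
}

const programs: ProgramInfo[] = [
  { name: "OUTER-PROG", startLine: 1, endLine: 30, nestingDepth: 0 },
  { name: "INNER-PROG", startLine: 10, endLine: 20, nestingDepth: 1 },
];
```
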

Fixture: NESTED.cbl now includes END PROGRAM lines for both programs.

Integration test: PREFIX-* Property nodes now correctly appear as WS-*
after the pseudotext REPLACING fix from the previous commit.

* feat(cobol): free-format COBOL support (>>source free)

Auto-detects >>SOURCE FREE directive in the first 500 chars and switches
to free-format line processing:
- No column-position rules (cols 1-6 are program text, not sequence area)
- Comments use *> prefix instead of col 7 indicator
- No continuation line indicator
- Strip inline *> comments
- Skip >>SOURCE directive lines

preprocessCobolSource() skips col-1-6 stripping for free-format files.

Paragraph/section regexes relaxed from fixed 7-space prefix to flexible
whitespace with case-insensitivity (/^\s*([A-Z][A-Z0-9-]+)\.\s*$/i).
EXCLUDED_PARA_NAMES expanded with COBOL verbs (GOBACK, END-READ, etc.)
to prevent false-positive paragraph detection in free-format.
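
Free-format detection and line cleanup might look like the following sketch. The function names are hypothetical, the detection regex also accepts the longer `>>SOURCE FORMAT IS FREE` spelling as an assumption, and `*>` inside a string literal is not special-cased.

```typescript
// Look for the directive only in the first 500 characters of the file.
function isFreeFormat(source: string): boolean {
  return />>\s*SOURCE\s+(?:FORMAT\s+(?:IS\s+)?)?FREE/i.test(source.slice(0, 500));
}

// Returns the code portion of a free-format line, or null if nothing remains.
function preprocessFreeFormatLine(line: string): string | null {
  const trimmed = line.trim();
  if (trimmed.length === 0) return null;
  if (/^>>\s*SOURCE\b/i.test(trimmed)) return null; // directive line
  const commentAt = trimmed.indexOf("*>");
  const code = commentAt >= 0 ? trimmed.slice(0, commentAt).trim() : trimmed;
  return code.length > 0 ? code : null;
}

// Paragraph pattern quoted in the commit message: flexible leading whitespace.
const RE_PARAGRAPH = /^\s*([A-Z][A-Z0-9-]+)\.\s*$/i;
```
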

Also fixes: entry-point-scoring.ts crash when language is 'cobol'
(MERGED_ENTRY_POINT_PATTERNS[language] was undefined → optional chaining).

Benchmark on ACAS 3.01 (268 GnuCOBOL free-format programs, 10MB):
- Before: 407 nodes, 393 edges (near-empty, only file nodes)
- After:  4,297 nodes, 3,612 edges, 542 clusters, 11 flows

* fix(cobol): relax data item regexes for free-format (^\s+ to ^\s*)

RE_FD, RE_DATA_ITEM, RE_ANONYMOUS_REDEFINES, and RE_88_LEVEL all used
^\s+ which requires at least 1 leading space. In free-format mode, lines
are trimmed before processing, so data items like "01 WS-FIELD PIC X."
have no leading whitespace after trimming.

Changed to ^\s* (zero or more spaces) which works for both fixed-format
(indented lines still have spaces) and free-format (trimmed lines).

ACAS benchmark (268 GnuCOBOL programs):
- Before: 4,297 nodes, 3,612 edges (paragraphs only)
- After:  13,832 nodes, 8,615 edges (+ data items, FDs, 88-levels)

* feat(cobol): 100% structural feature coverage — GO TO, SCREEN, SD/RD, SORT, SEARCH, CANCEL, Level 66

New extractions: GO TO (CALLS edges), SCREEN SECTION data items,
SD/RD alongside FD (Record nodes), SORT/MERGE USING/GIVING (ACCESSES),
SEARCH (ACCESSES), CANCEL (CALLS), Level 66 RENAMES (Property),
IS EXTERNAL/IS GLOBAL (Property description enrichment).

ACAS: 13,951 nodes | 13,193 edges | 685 clusters | 150 flows
(+53% edges from new GO TO/SORT/SEARCH/CANCEL extractions)

* feat(cobol): enriched CICS extraction — file I/O, dynamic PROGRAM, queues, HANDLE ABEND

EXEC CICS blocks now extract:
- FILE/DATASET clause: captures VSAM file name (literal or data item ref)
  for READ/WRITE/REWRITE/DELETE/STARTBR/READNEXT/READPREV → ACCESSES edges
- PROGRAM clause: now handles unquoted variable references (dynamic CICS
  program transfer) → CodeElement annotation with cics-dynamic-program reason
- QUEUE clause: captures TS/TD queue names from WRITEQ/READQ → ACCESSES edges
- LABEL clause: captures HANDLE ABEND error handler targets → CALLS edges
- TRANSID: now handles unquoted variable references

CodeElement descriptions enriched with all captured fields (map, program,
transid, file, queue, label).

CardDemo benchmark: +49 nodes, +33 edges from enriched CICS extraction.

* feat(cobol): complete CICS command extraction — all 7 expert recommendations

From COBOL expert agent analysis:
1. ENDBR added to isRead file command list
2. LOAD added to PROGRAM edge commands (alongside LINK/XCTL)
3. Two-word commands expanded: WRITEQ/READQ/DELETEQ TS/TD, HANDLE
   ABEND/AID/CONDITION, START TRANSID
4. Queue reason differentiated: cics-queue-read/-write/-delete
5. RETURN/START TRANSID → CALLS edges to synthetic <transid> target
6. MAP → ACCESSES edges for screen traceability
7. INTO/FROM data fields extracted → ACCESSES edges to data items

Also: dataItemMap built before CICS block processing (was declared after),
CodeElement descriptions enriched with all captured CICS fields.

* test(cobol): strict exhaustive integration tests with exact edgeSet assertions

Every edge reason has exact sorted pair assertions via edgeSet(), not
just counts. Any change to extraction that adds, removes, or reorders
edges will produce a precise, descriptive failure.

Updated RPTGEN.cbl fixture with:
- GO TO EXIT-PARAGRAPH, SORT USING/GIVING, SEARCH table
- EXEC CICS READ FILE INTO, WRITEQ TS QUEUE FROM, SEND MAP FROM
- EXEC CICS HANDLE ABEND LABEL, RETURN TRANSID, XCTL PROGRAM(variable)
- ABEND-HANDLER and EXIT-PARAGRAPH paragraphs

46 tests covering 24 CALLS + 79 CONTAINS + 18 ACCESSES + 2 IMPORTS edges
across 15 distinct edge reason codes, all with exact sorted pair lists.

* fix(cobol): address 5 findings from second Claude review (compiler front-end perspective)

Finding #2: Numeric sequence numbers now stripped (changed /[^0-9 ]/ to
/\S/ in preprocessCobolSource). Lines like "000100 MAIN-PARAGRAPH." now
have cols 1-6 blanked so paragraph regex matches correctly.

Finding #11: JCL in-stream PROC ordering fixed — pre-register all PROCs
into moduleNames before step processing. Steps that EXEC a PROC defined
later in the same file now get CALLS edges.

Finding #A: PROCEDURE DIVISION USING no longer captures calling-convention
keywords (BY, VALUE, REFERENCE, CONTENT, ADDRESS, OF) as parameter names.

Finding #C: SORT/MERGE USING/GIVING now captures ALL file references
(multi-file), not just the first. Changed from single-match to section
extraction with split.

Finding #D: Section headers no longer set currentParagraph, preventing
PERFORM caller misattribution to Namespace instead of Function nodes.

* fix(cobol): address code review findings — ReDoS fix, perf, cleanup

P1 CRITICAL — ReDoS in SORT USING/GIVING:
Replaced nested-quantifier regex with safe indexOf+substring+split
approach. No backtracking possible on crafted input.
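
The indexOf+substring+split approach can be sketched as follows. `extractSortFiles` is a hypothetical name, and the sketch assumes USING precedes GIVING and that the statement has already been accumulated into a single string.

```typescript
function extractSortFiles(statement: string): { using: string[]; giving: string[] } {
  const text = statement
    .toUpperCase()
    .replace(/\s+/g, " ")   // collapse line breaks from multi-line statements
    .replace(/\. ?$/, "");  // drop the terminating period

  const usingAt = text.indexOf(" USING ");
  const givingAt = text.indexOf(" GIVING ");

  // Plain substring + split: no nested quantifiers, so no backtracking blowup.
  const names = (start: number, end: number): string[] =>
    text.substring(start, end).split(" ").filter((t) => t.length > 0);

  const usingEnd = givingAt > usingAt ? givingAt : text.length;
  return {
    using: usingAt < 0 ? [] : names(usingAt + " USING ".length, usingEnd),
    giving: givingAt < 0 ? [] : names(givingAt + " GIVING ".length, text.length),
  };
}

const sortFiles = extractSortFiles(
  "SORT SORT-FILE ON ASCENDING KEY SORT-KEY\n    USING IN-FILE-1 IN-FILE-2\n    GIVING OUT-FILE.",
);
```
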

P2 — readCopy O(M) linear scan:
Added copybookByPath reverse Map for O(1) path-to-content lookup.

P3 — Dead code removal:
Deleted unused RE_SORT_USING and RE_SORT_GIVING constants.

P3 — EXCLUDED_PARA_NAMES simplification:
Replaced 20 END-* entries with startsWith('END-') prefix check.
Auto-covers future END-* verbs.

P3 — Misplaced JSDoc on removeRelationship:
Fixed comment that described removeNodesByFile instead.
Added missing JSDoc to removeNodesByFile.

Review agents: architecture-strategist, performance-oracle,
security-sentinel, code-simplicity-reviewer.

* refactor: add Cobol to SupportedLanguages with parseStrategy: standalone

New languages/cobol.ts — standalone regex processor provider with no-op
tree-sitter fields. Declares parseStrategy: 'standalone' to distinguish
from tree-sitter-based languages.

Added parseStrategy: 'tree-sitter' | 'standalone' to LanguageProviderConfig
for languages that use their own processor instead of tree-sitter.

Removed all 11 'cobol' as any casts — now uses SupportedLanguages.Cobol.
Added empty Cobol entries to entry-point-scoring and framework-detection.

* fix(cobol): 5 fixes from third Claude review + 3 regression tests

Fixes:
- Line numbers now 1-indexed in fixed-format (was 0-indexed, off-by-one
  in jump-to-definition links)
- Copybook content preprocessed before COPY expansion (sequence numbers
  and patch markers in copybooks no longer survive into expanded source)
- ENTRY USING filters calling-convention keywords (BY, VALUE, REFERENCE,
  CONTENT, ADDRESS, OF) — same fix as PROCEDURE DIVISION USING
- SORT/MERGE trailing period stripped from USING/GIVING file tokens
- Paragraph exclusion uses exact match for SECTION/DIVISION (was substring
  match that excluded valid names like CROSS-SECTION-ANALYSIS)

USING_KEYWORDS moved to module scope for reuse by both PROCEDURE DIVISION
USING and ENTRY USING handlers.

New unit tests:
- ENTRY USING BY VALUE filtering
- Paragraph names containing SECTION not excluded
- Numeric sequence numbers stripped enabling paragraph detection

* fix(cobol): address 6 findings from fourth Claude review + tests

Fourth review findings fixed:
- New #IV: PERFORM TIMES guard uses perfMatch.index instead of
  line.indexOf (prevents wrong match when target appears earlier in line)
- New #V: 88-level condition values now handle single-quoted literals
  ('Y' no longer stored with embedded quotes)
- New #I: CANCEL edges use two-pass resolution like CALL (no longer
  silently dropped when target indexed after source)
- New #3: Multi-line SORT/MERGE accumulation — sortAccum state variable
  accumulates lines until period, then extracts USING/GIVING from full
  statement (95% of production SORT statements span multiple lines)
- New #II: PROCEDURE DIVISION USING on split lines — pendingProcUsing
  flag defers parameter capture to next line if USING not on same line
- New #6 (prior): EXCLUDED_PARA_NAMES exact match for SECTION/DIVISION

Updated fixture: RPTGEN.cbl SORT now uses multi-line format with GIVING
on separate line (period-terminated). New sort-giving integration test.
ACCESSES total: 18 → 19 (new sort-giving edge from multi-line capture).

* fix(cobol): address 4 findings from fifth Claude review

Finding #B (5 reviews old): Section/paragraph node IDs now include
enclosing program name to prevent collision when nested programs share
section/paragraph names. New findOwningProgramName() helper uses
programs[] line ranges to find the innermost enclosing program.

Finding #α: pendingProcUsing now reset in the if(procUsingMatch) branch
(was only set in else branch, could leak across nested programs).

Finding #β: RE_CALL_DYNAMIC uses negative lookbehind (?<![A-Z0-9-]) to
prevent false-positive on compound identifiers like WS-CALL OCCURS.
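
The lookbehind guard can be demonstrated with a small sketch. The exact pattern after the lookbehind is an assumption, since the commit message only quotes the `(?<![A-Z0-9-])` part, and `dynamicCallTarget` is a hypothetical helper.

```typescript
// Equivalent to the literal /(?<![A-Z0-9-])CALL\s+([A-Z][A-Z0-9-]*)/i.
// The lookbehind rejects CALL when it is the tail of a longer identifier.
const RE_CALL_DYNAMIC = new RegExp("(?<![A-Z0-9-])CALL\\s+([A-Z][A-Z0-9-]*)", "i");

// Returns the data item named in an unquoted CALL, or null.
function dynamicCallTarget(line: string): string | null {
  const match = RE_CALL_DYNAMIC.exec(line);
  return match ? match[1].toUpperCase() : null;
}
```

A quoted target such as `CALL "AUDITLOG"` does not match here because the capture group must start with a letter, so static calls stay with the quoted-call pattern.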

Finding #γ: sortAccum flushed at EOF (parallel to flushSelect and
pendingFdName EOF cleanup). Prevents silent loss of SORT USING/GIVING
relationships in truncated files.

* fix(cobol): address findings from reviews 5+6 with full test coverage

Review 5 fixes:
- #α: pendingProcUsing reset in if(procUsingMatch) branch
- #β: RE_CALL_DYNAMIC negative lookbehind prevents WS-CALL false positive
- #γ: sortAccum flushed at EOF for truncated files
- #B: Section/paragraph IDs include owning program name

Review 6 fixes:
- #P: sectionNodeIds/paraNodeIds maps use program-scoped keys
  (PROGNAME:NAME). New scopedParaLookup/scopedCallerLookup helpers.
  findContainingSection updated with programs parameter.
- #Q: RETURNING added to USING_KEYWORDS for COBOL 2002+
- #R: RE_PERFORM matches both THRU and THROUGH via alternation

New unit tests (6):
- PERFORM THROUGH captures thruTarget
- PROCEDURE DIVISION USING RETURNING filters keyword
- RE_CALL_DYNAMIC no false-match on WS-CALL compound identifier
- Multi-line SORT captures USING/GIVING from continuation lines
- PROCEDURE DIVISION USING on split line via pendingProcUsing
- Copybook preprocessing strips sequence numbers

* fix(cobol): address findings from seventh Claude review + 3 tests

Review 7 fixes:
- #i: findContainingSection only updates best when lookup succeeds
  (prevents undefined overwriting valid parent section)
- #ii: RE_PROC_SECTION handles segment numbers (SECTION 30.)
- #III: procedureUsing now stored per-program on boundary stack
  entries, propagated to programs[] output. Inner programs no longer
  overwrite outer program's parameters.
- #δ: Dynamic CANCEL (CANCEL variable) now creates CodeElement
  annotation node, matching dynamic CALL behavior. RE_CANCEL_DYNAMIC
  with negative lookbehind. cancels[] gains isQuoted field.
- #Q: RETURNING added to USING_KEYWORDS (already in prev commit)
- #R: PERFORM THROUGH already fixed (THRU|THROUGH alternation)

New unit tests:
- Nested programs carry per-program procedureUsing
- SECTION with segment number detected
- Dynamic CANCEL via data item captured with isQuoted=false

* feat(cobol): link PROCEDURE DIVISION USING to LINKAGE data items + close 4 findings

Finding #10 FIXED: procedureUsing parameters now create ACCESSES edges
with reason 'cobol-procedure-using' from Module to matching LINKAGE
SECTION Property nodes. This exposes the program's parameter contract
in the graph (e.g., AUDITLOG → LS-CUST-ID, AUDITLOG → LS-AMOUNT).

Findings closed by expert agent consensus:
- #6 COPY IN library: WONTFIX — captured metadata, no universal
  library-to-directory mapping exists. Field costs nothing and is useful
  for library queries.
- #14 SQL DELETE: WONTFIX — DB2 requires FROM; existing FROM pattern
  handles it. Bare DELETE would risk false positives.
- #E OCCURS DEPENDING ON: WONTFIX — runtime sizing concern, not
  structural. The static occurs count is sufficient for indexing.

All 39 findings from 7 Claude reviews now resolved or closed.

* fix(cobol): resolve 48 review findings across 9 review cycles

Ninth deep review resolved all remaining COBOL parser gaps identified
by 5 specialist agents (COBOL expert, architecture strategist,
TypeScript reviewer, security sentinel, code simplicity reviewer).

Fixes (P1 — critical):
- SELECT OPTIONAL now correctly skips OPTIONAL keyword (C1)
- RETURNING params excluded from PROCEDURE DIVISION USING list (C7)
- SORT GIVING no longer captures clause keywords as file names (C5)
- Extract flushSort() helper eliminating 40-line duplication (S2)
- Flush unclosed EXEC blocks at EOF matching SORT/SELECT pattern (S3)
- Guard undefined map key in jcl-processor moduleNames (S1)
- Add MAX_TOTAL_EXPANSIONS=500 to prevent exponential COPY breadth (S4)

Fixes (P2 — important):
- Quote-aware stripInlineComment for | and *> in string literals (C2+C3)
- Fixed-format literal continuation now handles quoted strings (C6)
- PROGRAM-ID detected regardless of division state for siblings (C9)
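
A rough sketch of the quote-aware comment stripping mentioned in C2+C3. It assumes `|` and `*>` start an inline comment only outside string literals; the body below is illustrative and is not the project's `stripInlineComment`.

```typescript
// Illustrative sketch only; the real helper may handle more cases.
function stripInlineComment(line: string): string {
  let quote: string | null = null; // the quote character we are currently inside, if any
  for (let i = 0; i < line.length; i++) {
    const ch = line.charAt(i);
    if (quote !== null) {
      if (ch === quote) quote = null; // closing quote ends the literal
      continue;
    }
    if (ch === '"' || ch === "'") {
      quote = ch;
    } else if (ch === "|" || (ch === "*" && line.charAt(i + 1) === ">")) {
      // Comment marker outside a literal: drop it and any trailing whitespace before it.
      return line.slice(0, i).replace(/\s+$/, "");
    }
  }
  return line;
}

const kept = stripInlineComment('DISPLAY "A|B" *> note');
const stripped = stripInlineComment("MOVE 1 TO X | legacy note");
```

A doubled quote inside a literal closes and immediately reopens the literal, so it needs no special case in this sketch.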

Fixes (P3 — cleanup):
- EXEC SQL INTO restricted to INSERT INTO to avoid FETCH false-pos (C8)
- Copy expander line numbers fixed from 0-based to 1-based (C11)
- Remove dead code: inInStreamProc, fileIsLiteral, expansionDepth (S7-S10)

Also fixes 8th-review findings: nested program CONTAINS attribution,
multi-PERFORM on same line, INPUT/OUTPUT PROCEDURE IS in SORT,
GO TO DEPENDING ON multi-target, MOVE CORR abbreviation, per-program
procedureUsing ACCESSES edges.

Tests: 145 COBOL tests passing (59 integration + 86 unit)
Benchmarks: CardDemo 12,323 nodes/8,893 edges (7.4s)
            ACAS 14,016 nodes/15,452 edges (9.3s, 9% faster)

* docs(cobol): update documentation for ninth review cycle fixes

Update all 4 COBOL documentation files to reflect the 16 fixes
from the ninth review cycle:

- regex-extraction.md: quote-aware comment stripping, SELECT OPTIONAL,
  RETURNING exclusion, SORT_CLAUSE_NOISE filter, flushSort() helper,
  GO TO multi-target, PROGRAM-ID division-independent detection
- copy-expansion.md: MAX_TOTAL_EXPANSIONS=500 breadth guard, 1-based
  line numbers, removed expansionDepth/warnedCircular param
- deep-indexing.md: GO TO DEPENDING ON, INPUT/OUTPUT PROCEDURE IS,
  MOVE CORR edge reasons, INSERT INTO restriction, literal continuation
- performance.md: updated benchmarks (CardDemo 12,323n/8,893e/7.4s,
  ACAS 14,016n/15,452e/9.3s), COPY breadth guard

* fix(cobol): resolve 10th review findings — nested program edge attribution

Fix 6 findings from the 10th review (PR abhigyanpatwari#498 comment #4132201110):

#A+#F: All CALL/CANCEL/CICS/ENTRY/SQL/SEARCH/file-declaration edges
now use owningModuleId() for nested program attribution instead of
the outer program's parentId. Added helper function owningModuleId()
to centralize the pattern.

#B: Added USING and GIVING to SORT_CLAUSE_NOISE set to prevent MERGE
USING + OUTPUT PROCEDURE from capturing clause keywords as file names.

#C: INPUT/OUTPUT PROCEDURE regex now captures optional THRU/THROUGH
range end paragraph, mirroring RE_PERFORM's THRU support.

#D: scopedCallerLookup fallback now uses programModuleIds.get(pgm)
instead of parentId, so PERFORM/MOVE/GOTO in nested programs with
unresolvable paragraphs fall back to the correct inner module.

#E: pendingProcUsing only set when PROCEDURE DIVISION line is NOT
period-terminated, preventing false USING expectation.

Tests: 145 passing | TypeScript clean

* fix(cobol): resolve 11th review findings — final nested program + multi-CALL gaps

#1: scopedCallerLookup(null) now uses owningModuleId(lineNum) instead
of parentId, fixing PERFORM/MOVE/GOTO before first paragraph in nested
programs.

#2+#3: CALL and CANCEL extraction now uses matchAll (global flag) to
capture multiple occurrences on the same line. Dynamic CALL/CANCEL
checked independently instead of in else branch.
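
As a sketch of the `matchAll` change in #2+#3, assuming a simplified quoted-CALL pattern (the parser's real regexes are not reproduced here):

```typescript
// Hypothetical pattern; note the global flag, which matchAll requires.
const RE_CALL_QUOTED = /(?<![A-Z0-9-])CALL\s+["']([^"']+)["']/gi;

const line = `IF WS-FLAG = 'Y' CALL 'PROGA' ELSE CALL "PROGB".`;

// matchAll yields every occurrence on the line, not just the first one.
const targets = Array.from(line.matchAll(RE_CALL_QUOTED), (m) => m[1]);
```

Calling `matchAll` with a non-global regex throws a `TypeError`, so the `g` flag is part of the fix rather than an optimization.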

#4: SORT/MERGE ACCESSES edge IDs now use owningModuleId(sort.line)
instead of parentId for nested program correctness.

#5: preprocessCobolSource free-format detection now uses first 10 lines
(consistent with extractCobolSymbolsWithRegex threshold).

#6: EXCLUDED_PARA_NAMES expanded with DISPLAY, ACCEPT, WRITE, READ,
REWRITE, DELETE, OPEN, CLOSE, RETURN, RELEASE, SORT, MERGE to prevent
false-positive paragraph detection on isolated verbs.

Also removed unused GraphNode import from cobol-processor.ts.

Tests: 145 passing | TypeScript clean

* docs(cobol): deepened full language coverage plan with research findings

3 research agents analyzed Phase 1-2 features and graph value ranking.

Key findings: cobol-call-using is #1 edge type (9.2/10); multi-line
accumulation is dominant challenge; DECLARATIVES is lowest-risk Phase 2
item; SET TO TRUE covers 80-90% of SET usage.

* feat(cobol): implement Phase 1 — high-value data flow edges

4 new extraction features that create new ACCESSES and IMPORTS edges:

1.1: EXEC SQL INCLUDE -> IMPORTS edges with reason 'sql-include'
     Handles unquoted (SQLCA), quoted ('DBRMLIB.MEMBER'), and
     underscored (CUST_TBL_DCL) member names.

1.2: CALL USING parameter extraction -> ACCESSES edges
     Extracts parameters from CALL USING clause, filtering BY/REFERENCE/
     CONTENT/VALUE/ADDRESS/OF/LENGTH/OMITTED keywords. Creates
     'cobol-call-using' ACCESSES edges (graph value: 9.2/10).
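
The keyword filtering in 1.2 could look roughly like this. The keyword list comes from the message above; the helper name and its regexes are assumptions, and this sketch handles a single-line statement only.

```typescript
// Keywords listed in the commit message; everything else here is hypothetical.
const CALL_USING_KEYWORDS = new Set([
  "BY", "REFERENCE", "CONTENT", "VALUE", "ADDRESS", "OF", "LENGTH", "OMITTED",
]);

function extractCallUsingParams(statement: string): string[] {
  const match = /\sUSING\s+(.+)$/i.exec(statement.trim());
  if (match === null) return [];
  return (match[1] ?? "")
    .replace(/\bEND-CALL\b.*$/i, "") // drop the scope terminator and anything after it
    .replace(/\.\s*$/, "")           // drop the terminating period
    .split(/[\s,]+/)
    .filter((tok) => tok.length > 0 && !CALL_USING_KEYWORDS.has(tok.toUpperCase()));
}

const params = extractCallUsingParams(
  "CALL 'AUDITLOG' USING BY REFERENCE LS-CUST-ID BY CONTENT LS-AMOUNT.",
);
```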

1.4: OCCURS DEPENDING ON -> ACCESSES edges with reason 'cobol-depends-on'
     Extended OCCURS regex captures DEPENDING ON field with subscript
     stripping. Creates dependency edge from table to controlling field.

1.5: VALUE clause for standard data items
     Extracts VALUE from data item clauses: quoted strings with type
     prefix (X/N/G/B), ALL literals, numerics (incl negative/decimal),
     and figurative constants. Populates Property node values.

Tests: 145 passing (+2 ACCESSES from CALL USING) | TypeScript clean

* feat(cobol): implement Phase 2 — DECLARATIVES, SET, INSPECT, EXEC DLI

4 new extraction features for error handling, data flow, and IMS/DB:

2.1: EXEC DLI (IMS/DB) -> CodeElement + ACCESSES edges
     Accumulates EXEC DLI blocks like EXEC SQL. Parses DLI verbs
     (GU, GN, ISRT, REPL, DLET, CHKP, SCHD, TERM). Extracts
     SEGMENT, PCB, INTO/FROM, PSB. Creates dli-{verb} ACCESSES
     edges to <ims>:segment Record nodes.

2.2: DECLARATIVES / USE AFTER EXCEPTION -> ACCESSES edges
     Tracks inDeclaratives state. Detects USE AFTER STANDARD
     EXCEPTION ON file-name. Creates cobol-error-handler ACCESSES
     edge from handler section to file Record.

2.3: SET statement -> ACCESSES edges
     Detects SET TO TRUE (80-90% of SET usage) and SET index
     TO/UP BY/DOWN BY. Creates cobol-set-condition / cobol-set-index
     write edges + cobol-set-read for identifier values.

2.4: INSPECT -> ACCESSES edges with multi-line accumulator
     Accumulates INSPECT until period (like SORT). Extracts inspected
     field + tally counters. Creates cobol-inspect-read/write/tally
     edges. Form detection: tallying/replacing/converting/combined.

Preprocessor: 1398 -> 1597 LOC (+199). Tests: 145 passing.

* feat(cobol): implement Phase 3 — completeness fixes

6 partial features fixed to first-class support:

3.1: CALL RETURNING -> ACCESSES write edge (cobol-call-returning)
3.2: SELECT OPTIONAL flag preserved in FileDeclaration + Record node
3.3: ALTERNATE RECORD KEY extraction (matchAll for multiple keys)
3.4: COMMON attribute on nested programs (RE_PROGRAM_ID extended)
3.5: IS EXTERNAL / IS GLOBAL as first-class boolean properties
     (removed usage string hack)
3.6: AUTHOR / DATE-WRITTEN mapped to Module node description

Tests: 145 passing | TypeScript clean

* feat(cobol): implement Phase 4 — INITIALIZE + metadata completeness

4.1: INITIALIZE statement -> ACCESSES write edge (cobol-initialize)
4.2: DATE-COMPILED and INSTALLATION paragraphs extracted and mapped
     to Module node description alongside existing AUTHOR/DATE-WRITTEN

All 4 plan phases complete. Coverage: ~95% (up from 71.9%).
Tests: 145 passing | TypeScript clean

* test(cobol): add 24 unit tests for Phase 1-4 features

Coverage for all new extraction features:

Phase 1 (8 tests):
- EXEC SQL INCLUDE (unquoted, quoted, underscored)
- CALL USING (simple, mixed modes, ADDRESS OF, OMITTED)
- CALL RETURNING
- OCCURS DEPENDING ON
- VALUE clause (string, numeric, figurative constant)

Phase 2 (10 tests):
- EXEC DLI GU/ISRT/SCHD (verb, segment, PCB, INTO, FROM, PSB)
- DECLARATIVES USE AFTER EXCEPTION (single + multiple sections)
- SET TO TRUE, SET index UP BY
- INSPECT TALLYING, INSPECT REPLACING

Phase 3-4 (6 tests):
- SELECT OPTIONAL flag
- ALTERNATE RECORD KEY
- PROGRAM-ID IS COMMON
- IS EXTERNAL / IS GLOBAL booleans
- INITIALIZE extraction
- Full programMetadata (AUTHOR, DATE-WRITTEN, DATE-COMPILED, INSTALLATION)

Total: 168 tests passing (145 + 24 - 1 removed duplicate)

* fix(cobol): use /\r?\n/ split for Windows CRLF compatibility

All 4 COBOL source files now split on /\r?\n/ instead of '\n' to
handle CRLF line endings on Windows. Previously, trailing \r in
lines caused RE_GOTO's $ anchor to fail on multi-line GO TO
DEPENDING ON statements, producing only 1 goto edge instead of 4.
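
A small reproduction of the failure mode, using a hypothetical `$`-anchored pattern in place of `RE_GOTO`:

```typescript
// Hypothetical pattern with a trailing $ anchor, standing in for RE_GOTO.
const RE_LINE_END_IDENT = /([A-Z][A-Z0-9-]*)$/;

const source = "GO TO PARA-A\r\n      PARA-B\r\n";

const naiveLines = source.split("\n");
const safeLines = source.split(/\r?\n/);

// With '\n' the first line keeps its trailing '\r', so $ never sits directly after the identifier.
const naiveMatch = RE_LINE_END_IDENT.exec(naiveLines[0] ?? "");
// With /\r?\n/ the '\r' is consumed by the split and the anchor works again.
const safeMatch = RE_LINE_END_IDENT.exec(safeLines[0] ?? "");
```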

Files fixed: cobol-preprocessor.ts (2 sites), cobol-processor.ts,
jcl-parser.ts, cobol-copy-expander.ts

Tests: 168 passing | TypeScript clean

* fix(cobol): resolve 12th review — dynamic CALL/CANCEL dedup + trailing anchors

#1+#2: Removed incorrect hasQuotedCall/hasQuotedCancel deduplication
guards. RE_CALL_DYNAMIC and RE_CANCEL_DYNAMIC require [A-Z] after
CALL/CANCEL, so they CANNOT match quoted targets — the guards were
both unnecessary and actively harmful, suppressing dynamic CALL/CANCEL
in ON EXCEPTION patterns.

#3+#5: Changed RE_CALL_DYNAMIC and RE_CANCEL_DYNAMIC trailing anchor
from (?:\s|\.) to (?=\s|\.|$) (lookahead). The consuming anchor
failed when the identifier was the last token on a physical line.
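
The difference between the consuming anchor and the lookahead can be shown with two simplified patterns (hypothetical; only the trailing anchor mirrors the change described in #3+#5):

```typescript
// Hypothetical simplified patterns; only the trailing anchor differs.
const consuming = /(?<![A-Z0-9-])CALL\s+([A-Z][A-Z0-9-]*)(?:\s|\.)/i;
const lookahead = /(?<![A-Z0-9-])CALL\s+([A-Z][A-Z0-9-]*)(?=\s|\.|$)/i;

const lastTokenOnLine = "    CALL WS-SUBPROG";

// The consuming form needs a real whitespace or period character after the identifier.
const consumingResult = consuming.exec(lastTokenOnLine);
// The lookahead form also accepts end of input, and consumes nothing either way.
const lookaheadResult = lookahead.exec(lastTokenOnLine);
```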

Tests: 168 passing | TypeScript clean

* feat(cobol): add CALL accumulator + fix SORT double-statement (#4, #6)

Finding #4: Multi-line CALL USING accumulator
Added callAccum state variable that accumulates CALL statements
spanning multiple physical lines until period or END-CALL is found.
Uses flushCallAccum() to re-extract CALL target + USING parameters
from the full accumulated statement. This fixes the silent loss of
ACCESSES parameter edges when USING appears on lines after CALL.

Finding #6: SORT double-statement on same line
After flushSort(), the code now falls through to re-check the
current line for a new SORT/MERGE start (was previously blocked
by the sortAccum === null check evaluating before flushSort ran).

Also fixed: used non-global regex for CALL detection test to avoid
the classic global regex .test() lastIndex bug.
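
For readers unfamiliar with the `lastIndex` pitfall mentioned above, a minimal reproduction:

```typescript
const globalRe = /CALL/g;
// First call matches at index 0 and leaves lastIndex at 4.
const first = globalRe.test("CALL 'A'");
// Second call resumes searching at index 4, finds nothing, and resets lastIndex to 0.
const second = globalRe.test("CALL 'B'");

// A non-global regex keeps no position between calls.
const plainRe = /CALL/;
const third = plainRe.test("CALL 'A'");
const fourth = plainRe.test("CALL 'B'");
```

Keeping a separate non-global pattern for detection and a global one for `matchAll` avoids the shared state.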

Tests: 168 passing (+1 ACCESSES from multi-line CALL USING)

* fix(cobol): resolve 13th review — CICS LOAD, USING extraction, file scoping

#1: CICS LOAD unresolved edge no longer silently deleted in second pass.
    Changed narrow cics-link/cics-xctl check to catch-all pattern:
    rel.reason?.startsWith('cics-') && rel.reason.endsWith('-unresolved')

#2: flushCallAccum USING extraction now stops before COBOL statement
    verbs (INSPECT, SEARCH, SORT, MERGE, DISPLAY, ACCEPT, MOVE, PERFORM,
    GO TO, CALL, IF, EVALUATE). Prevents absorbing adjacent statements
    as false USING parameters in legacy pre-COBOL-85 code without END-CALL.

#3: CICS FILE Record nodes now globally-scoped (<cics-file>:FILENAME)
    instead of per-file-scoped. Enables cross-program CICS file access
    analysis, consistent with SQL table scoping (<db>:TABLE).

#4: callAccum pre-check regex now has (?<![A-Z0-9-]) lookbehind to
    prevent false activation on compound identifiers like WS-CALL-FLAG.

Tests: 168 passing | TypeScript clean

* fix(cobol): resolve 14th review — callAccum false paragraph + Area A guard

#1: callAccum continuation lines now check for COBOL statement verb
    starts (GO TO, PERFORM, MOVE, etc.) and paragraph/section headers.
    If detected, the CALL is flushed as-is and the line processed
    normally — prevents false paragraph detection and currentParagraph
    corruption from lines like "WS-ADDR." being treated as paragraphs.

#4: callAccum pre-check now guarded by currentDivision === 'procedure'
    to prevent unnecessary activations in DATA DIVISION.

#5: Fixed-format paragraph detection now rejects lines with >7 leading
    spaces (Area B indentation) as paragraph candidates. Paragraph
    names in fixed-format must start in Area A (col 8-11, max 7 spaces).
    Free-format mode is unaffected.
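
A sketch of the Area A rule from #5, assuming the line has already had its sequence area blanked so that leading spaces reflect the column position. The function names and the header regex are hypothetical.

```typescript
// Counts leading space characters only (tabs are not treated as indentation here).
function leadingSpaces(line: string): number {
  let n = 0;
  while (n < line.length && line.charAt(n) === " ") n++;
  return n;
}

// Hypothetical check: a lone identifier followed by a period looks like a paragraph header,
// but in fixed format it only counts when it starts in Area A (at most 7 leading spaces).
function isParagraphCandidate(line: string, fixedFormat: boolean): boolean {
  const looksLikeHeader = /^\s*[A-Z0-9][A-Z0-9-]*\.\s*$/i.test(line);
  if (!looksLikeHeader) return false;
  return !fixedFormat || leadingSpaces(line) <= 7;
}

const areaA = isParagraphCandidate("       MAIN-PARA.", true);     // 7 leading spaces
const areaB = isParagraphCandidate("           WS-ADDR.", true);   // 11 leading spaces
const freeForm = isParagraphCandidate("           WS-ADDR.", false);
```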

Tests: 168 passing | TypeScript clean

* fix(cobol): resolve 15th review — callAccum Area A + verb boundary fixes

#A: Column-position-aware paragraph detection in callAccum flush.
#B: inspectAccum early-flush on paragraph/section/verb headers.
#C: Verb boundary \b → (?:\s|$) prevents MOVE-COUNT false flush.

* test(cobol): add 17 edge-case regression tests + fix USING verb boundary

17 new tests covering all recurring review patterns:

Multi-line CALL USING (7 tests):
- Parameters on separate continuation lines (IBM mainframe style)
- No absorption of INSPECT/GO TO/paragraphs following CALL
- END-CALL scope terminator
- Hyphenated identifiers (MOVE-COUNT) not triggering false flush
- Dual quoted+dynamic CALL on same line (ON EXCEPTION)

Nested program attribution (2 tests):
- CALL in inner program within inner line range
- PERFORM before first paragraph has null caller

CRLF compatibility (1 test):
- GO TO DEPENDING ON with \r\n line endings

Area A paragraph detection (2 tests):
- Area B (>7 spaces) rejected; Area A (7 spaces) accepted

SORT/MERGE (1 test): COLLATING SEQUENCE keywords not captured
PROCEDURE USING (2 tests): RETURNING excluded, period-terminated
Comment stripping (1 test): pipe in quoted string preserved
SELECT OPTIONAL (1 test): correct file name, not OPTIONAL keyword

Bug fix: USING extraction regex verb terminators changed from
\bVERB\b to \bVERB(?=\s|$) in flushCallAccum — prevents truncation
on hyphenated identifiers like MOVE-COUNT, PERFORM-LIMIT.
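
Why `\b` is not enough for COBOL identifiers, shown with a single verb (the patterns are simplified stand-ins for the verb alternation):

```typescript
const wordBoundary = /\bMOVE\b/;
const spaceOrEnd = /\bMOVE(?=\s|$)/;

const params = "USING MOVE-COUNT PERFORM-LIMIT";

// '-' is a non-word character, so \b matches between 'E' and '-' inside MOVE-COUNT.
const falseTerminator = wordBoundary.test(params);
// Requiring whitespace or end of input after the verb skips hyphenated identifiers.
const noTerminator = spaceOrEnd.test(params);
const realVerb = spaceOrEnd.test("MOVE A TO B");
```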

Total: 185 tests passing

* test(cobol): add 32 comprehensive edge-case regression tests

13 new describe blocks covering all extraction features:

- EXEC DLI: no-SEGMENT, multi-line accumulation (2 tests)
- SET: multiple targets, DOWN BY, TO numeric (3 tests)
- INSPECT: CONVERTING, multiple counters, tallying-replacing,
  paragraph flush during accumulation (4 tests)
- DECLARATIVES: no-STANDARD keyword, I-O mode, post-END paragraphs (3)
- COPY REPLACING: pseudotext deletion ==OLD== BY ==== (1 test)
- VALUE: hex literal, negative numeric, ALL literal (3 tests)
- OCCURS: TO range, fixed-size without DEPENDING ON (2 tests)
- Dynamic CALL/CANCEL: end-of-line, multiple CANCELs (3 tests)
- EXEC SQL: INCLUDE skips tables, SELECT INTO host vars, host
  variable extraction (3 tests)
- INITIALIZE: target and caller context (1 test)
- Nested programs: sibling scoping, PROGRAM-ID without ID DIV (2)
- EXEC EOF flush: unclosed EXEC SQL flushed (1 test)
- Multi-PERFORM: IF/ELSE dual PERFORM on single line (1 test)
- IS EXTERNAL: USAGE not polluted by external flag (1 test)

Total: 215 tests passing

* fix(cobol): resolve 16th review — CANCEL in CALL block + USING boundary

#1: flushCallAccum now extracts CANCEL statements from within CALL
    ON EXCEPTION blocks. Adds RE_CANCEL + RE_CANCEL_DYNAMIC matchAll
    passes alongside existing CALL extraction.

#2: Added \bCANCEL(?=\s|$) to USING lookahead regex to prevent CANCEL
    keyword being captured as false USING parameter.

#3: Multi-line CALL start now returns immediately to prevent the CALL
    start line from simultaneously feeding sortAccum/inspectAccum.

#6: Division transitions now flush all active accumulators (callAccum,
    sortAccum, inspectAccum) to prevent state leakage across programs.

Also added CANCEL to callAccum flush trigger verb list.

Tests: 215 passing | TypeScript clean

* refactor(cobol): extract shared verb constants + resolve 17th review

Extract COBOL_STATEMENT_VERBS, RE_STATEMENT_VERB_START, and
RE_USING_PARAMS as shared constants — eliminates 4 duplicated
25-verb regex patterns.

17th review: #1 flushCallAccum before EXEC entry, #2 inspectAccum
verb parity via shared constant.

Tests: 215 passing | TypeScript clean

* test(cobol): replace all fuzzy assertions with exact toBe checks

Replaced 7 toBeGreaterThan/toBeLessThan/toBeGreaterThanOrEqual
assertions with exact toBe values:

- dataItems.length: >= 3 → toBe(3)
- calls.length: >= 1 → toBe(1)
- calls[0].line: range check → toBe(10)
- programs[].startLine/endLine: comparison → exact values
- innerA.endLine/innerB.startLine: comparison → exact values

Also added 11 new edge-case tests (accumulator flush on EXEC/division
transitions, free-format, CANCEL in CALL block, SORT THRU, verb
flush, integration).

226 tests passing — zero fuzzy assertions remain.

* fix(cobol): resolve 19th review + 15 accumulator flush tests

Fixes:
#1: END PROGRAM flushes callAccum/sortAccum/inspectAccum
#2: PROGRAM-ID sibling path flushes all accumulators
#3: Added COMPUTE/ADD/SUBTRACT/MULTIPLY/DIVIDE/STRING/UNSTRING
    to COBOL_STATEMENT_VERBS (now 32 verbs)

Tests (15 new):
- END PROGRAM flush: single + nested programs (2)
- PROGRAM-ID sibling flush (1)
- Arithmetic verb flush: COMPUTE/ADD/SUBTRACT/MULTIPLY/DIVIDE (5)
- String verb flush: STRING/UNSTRING (2)
- Arithmetic not captured as false USING params (1)
- SORT flushed at END PROGRAM (1)
- INSPECT flushed at END PROGRAM (1)
- All with exact toBe assertions (2)

Total: 239 tests passing | Zero fuzzy assertions

* fix(cobol): resolve 20th review — INITIALIZE multi-target + 2 tests

Finding 1: INITIALIZE now captures multiple targets with REPLACING
clause keyword filtering. Regex changed to lazy match stopping at
REPLACING/WITH/period boundary. Targets split on whitespace and
filtered against INITIALIZE_CLAUSE_KEYWORDS set.
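
One way the lazy match plus keyword filter could be written. The regex, the helper name, and the contents of the keyword set are assumptions; the source names `INITIALIZE_CLAUSE_KEYWORDS` but does not list its members.

```typescript
// Hypothetical contents; the real set is not shown in this log.
const INITIALIZE_CLAUSE_KEYWORDS = new Set(["REPLACING", "WITH", "FILLER", "ALL", "TO", "VALUE"]);

// Lazy capture that stops before REPLACING/WITH, a period, or end of input.
const RE_INITIALIZE =
  /(?<![A-Z0-9-])INITIALIZE\s+(.+?)(?=\s+(?:REPLACING|WITH)\s|\s*\.|\s*$)/i;

function extractInitializeTargets(statement: string): string[] {
  const m = RE_INITIALIZE.exec(statement);
  if (m === null) return [];
  return (m[1] ?? "")
    .split(/\s+/)
    .filter((tok) => tok.length > 0 && !INITIALIZE_CLAUSE_KEYWORDS.has(tok.toUpperCase()));
}

const multi = extractInitializeTargets("INITIALIZE WS-CUSTOMER WS-ORDER WS-LINE-ITEM.");
const withReplacing = extractInitializeTargets(
  "INITIALIZE WS-RECORD REPLACING NUMERIC DATA BY ZERO.",
);
```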

Tests (2 new):
- INITIALIZE multi-target: WS-CUSTOMER WS-ORDER WS-LINE-ITEM → 3
- INITIALIZE with REPLACING: only WS-RECORD captured, not keywords

Total: 241 tests passing | TypeScript clean
zander-raycraft pushed a commit that referenced this pull request May 7, 2026
…atwari#756)

* Initial plan

* feat(SM-13): extract resolveFreeCall from resolveCallTarget

Extract the free-function call resolution path into a dedicated
`resolveFreeCall(calledName, filePath, ctx)` function that uses
`lookupExact` + import-scoped resolution via `ctx.resolve()`.

- Free function calls (foo()) now route through `resolveFreeCall`
- Swift/Kotlin implicit constructors (User()) delegate to
  `resolveStaticCall` within `resolveFreeCall`
- `resolveCallTarget` dispatches `callForm === 'free'` early,
  removing the inline freeFormHasClassTarget logic
- S0 block simplified to only handle `callForm === 'constructor'`
- Global (Tier 3) fallthrough preserved via ctx.resolve() until Phase 5
- 9 new unit tests for resolveFreeCall
- All 163 unit tests pass, all 1199 integration resolver tests pass

Agent-Logs-Url: https://github.com/abhigyanpatwari/GitNexus/sessions/c5f2e73a-259a-438c-b5c8-286b82e3c215

Co-authored-by: magyargergo <11230420+magyargergo@users.noreply.github.com>

* chore: revert unrelated package-lock.json change

Agent-Logs-Url: https://github.com/abhigyanpatwari/GitNexus/sessions/c5f2e73a-259a-438c-b5c8-286b82e3c215

Co-authored-by: magyargergo <11230420+magyargergo@users.noreply.github.com>

* fix(SM-13): address PR abhigyanpatwari#756 review findings on resolveFreeCall

Addresses all 7 findings from the PR abhigyanpatwari#756 review comment.

Code (R1, finding #1)
- Replace the literal `'Class' | 'Struct' | 'Record'` check in
  `hasClassTarget` with `INSTANTIABLE_CLASS_TYPES.has(c.type)`. Converts
  an invariant that was previously comment-enforced ("keep this list
  aligned with INSTANTIABLE_CLASS_TYPES") into one enforced structurally.
  Any future extension of the set propagates here automatically. The
  narrower Swift extension dedup block below still uses literal
  `'Class' | 'Struct'` by design — Swift extensions only produce Class
  duplicates in practice, Record is deliberately excluded there, and
  the inline comment now documents that asymmetry.

Tests (+12 regression scenarios)

Finding #2 — language coverage
- Go free function (doStuff())
- Python free function (def helper(): ... helper())
- Rust free function outside any impl block
- Java statically-imported function
- JavaScript module-level function
Each exercises `_resolveCallTargetForTesting` with `callForm='free'`
and the language-specific file extension. `resolveFreeCall` has no
file-extension branching, so these guard the dispatch chain per
language without assuming extractor-specific symbol shapes.

Finding #3 — argCount threading
- 2-arg overload selected when argCount=2
- 0-arg overload selected when argCount=0

Finding #5 — Tier 3 (global) resolution
- Function globally visible but not imported. Asserts exact
  `TIER_CONFIDENCE.global === 0.5` and `reason === 'global'` to catch
  silent drift if the tier table is ever refactored.

Finding #6 — preComputedArgTypes worker path
- String overload matched via preComputedArgTypes=['String']
- Int overload matched via preComputedArgTypes=['int'] (lowercase,
  mirroring the parse-worker's inferred-literal shape; stored 'Int' is
  normalized via normalizeJvmTypeName at comparison time)

Finding #7 — Enum null-route documentation
- Enum-only free call asserts `toBeNull()` with an explanatory comment
  linking to the INSTANTIABLE_CLASS_TYPES rationale. NOT marked skipped
  — current behavior is intentional, not broken.

Finding #4 — Swift extension dedup guard
- Two same-name Class entries at different path lengths; exercises the
  full dispatch chain:
    1. filterCallableCandidates with 'free' strips Class → length 0
    2. hasClassTarget triggers resolveStaticCall
    3. Homonym ambiguity null-routes per SM-12 round-1 contract
    4. Constructor-form retry repopulates with both Classes
    5. Dedup block sorts by filePath.length → shortest path wins

Verification
- `tsc --noEmit` clean
- 3064 unit tests pass (+12)
- 1766 integration tests pass
- Zero regressions

Plan: docs/plans/2026-04-09-003-fix-sm13-resolve-free-call-review-findings-plan.md
Review: abhigyanpatwari#756 (comment)

* refactor(SM-13): extract dedupSwiftExtensionCandidates shared helper

Follow-up to the PR abhigyanpatwari#756 review fix. SM-13 duplicated the Swift
extension same-name collision dedup block between `resolveCallTarget`
and `resolveFreeCall` — two copies of identical 15-line logic with the
same heuristic (`filePath.length` sort, Class/Struct-only, `length > 1`
guard). Extract a single shared helper so the two sites cannot drift.

Changes
- New `dedupSwiftExtensionCandidates(candidates, tier)` helper defined
  alongside `tryOverloadDisambiguation`, with JSDoc documenting:
  - The Swift extension scenario it addresses
  - Why it is intentionally narrower than INSTANTIABLE_CLASS_TYPES
    (Class/Struct only, not Record — C#/Kotlin records don't exhibit
    the multi-file definition pattern, widening risks accidental
    dedup of legitimately distinct record types)
  - The return-null-on-no-match contract so callers can fall through
- `resolveCallTarget` tail dedup (was lines 1593-1610): replaced with
  a single `dedupSwiftExtensionCandidates` call
- `resolveFreeCall` tail dedup (was lines 1994-2012): same replacement
- Net line count: -32 insertions, -9 deletions in the consumer sites,
  +36 for the shared helper + JSDoc
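
A simplified sketch of the heuristic the helper encapsulates (shortest `filePath` wins, Class/Struct only, more than one candidate). The `Candidate` shape and the function name are hypothetical, the `tier` parameter is omitted, and requiring every candidate to be a Class or Struct is an assumption about how the narrowing is applied.

```typescript
// Hypothetical candidate shape; the real symbol type has more fields.
interface Candidate {
  name: string;
  type: string;
  filePath: string;
}

// Returns the candidate with the shortest file path, or null so callers can fall through.
function dedupByShortestPathSketch(candidates: Candidate[]): Candidate | null {
  if (candidates.length <= 1) return null; // nothing to deduplicate
  const allClassLike = candidates.every((c) => c.type === "Class" || c.type === "Struct");
  if (!allClassLike) return null; // Record and other types are deliberately left alone
  const sorted = candidates.slice().sort((a, b) => a.filePath.length - b.filePath.length);
  return sorted[0] ?? null;
}

const winner = dedupByShortestPathSketch([
  { name: "User", type: "Class", filePath: "Sources/App/Extensions/User+Codable.swift" },
  { name: "User", type: "Class", filePath: "Sources/App/User.swift" },
]);
```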

Verification
- `tsc --noEmit` clean
- 3064 unit tests pass (including the R7 Swift dedup guard test added
  in the previous commit that exercises the full free-form retry
  chain through this helper)
- 1766 integration tests pass
- Zero regressions

Follows-up on: abhigyanpatwari#756

* docs(SM-13): address PR abhigyanpatwari#756 final review — comment cleanup only

Three documentation-only findings from the approval review. No
behavior change, no new tests, no code path modifications.

Finding #1 — stale line-number comment
- The comment inside `resolveFreeCall` at the `hasClassTarget` site
  referenced "lines ~1994-2008" for the Swift extension dedup block.
  Those lines were the inlined pre-SM-13 version; the block has since
  been extracted to `dedupSwiftExtensionCandidates`. Replaced the line
  reference with the helper name so future readers don't chase dead
  line numbers.

Finding #2 — fuzzy-widening asymmetry undocumented
- `resolveFreeCall` intentionally has no `widenCache` parameter and no
  D2 fuzzy-widening pass (unlike `resolveCallTarget`'s member-call
  path). Added an explicit "Asymmetry vs `resolveCallTarget`" paragraph
  to the JSDoc so a caller comparing the two signatures knows the
  skipped pass is deliberate and tied to Phase 5.

Finding #3 — constructor-form retry reasons undocumented
- `resolveStaticCall` can return null for three distinct reasons
  (empty instantiable pool, homonym ambiguity, ownerless Constructor
  nodes). The retry below it unconditionally re-filters with
  `'constructor'` form, which is correct for all three but not
  obvious. Added a structured three-case comment enumerating each
  reason and linking (a) to the SM-12 null-route contract, (b) to
  the R7 dedup test, and (c) to the currently-uncovered ownerless-
  Constructor path (noted as a future test candidate).

Verification
- `tsc --noEmit` clean
- 175 `resolveFreeCall` + `resolveStaticCall` + sibling tests pass
  (sanity check — no behavior change expected)
- No regressions

Follows-up on: abhigyanpatwari#756 (comment)

* test(SM-13): cover ownerless-Constructor retry + PHP free function

Two low-severity test gaps from PR abhigyanpatwari#756 review comment 4215739052 —
previously addressed doc-only, now have concrete test coverage.

Finding #3 low — ownerless-Constructor retry path (previously comment-only)
- The retry after resolveStaticCall returns null handles three distinct
  null-return reasons. Cases (a) and (b) were already tested (Interface/
  Trait null-route from SM-12, Swift shadowing dedup from R7). Case (c) —
  resolveStaticCall step-4 bailout when the tiered pool contains
  ownerless Constructor nodes — was only covered by a comment.
- New test: Class + ownerless Constructor in tiered pool, callForm='free'.
  Exercises the full chain:
    1. resolveStaticCall step 3 walks classCandidates via
       lookupMethodByOwner — ownerless Constructor not in methodByOwner,
       nothing found.
    2. Step 4 detects Constructor in tiered pool, bails with null.
    3. resolveFreeCall retry re-runs filterCallableCandidates with
       'constructor' form, which prefers Constructor over Class per
       CONSTRUCTOR_TARGET_TYPES ordering.
    4. Single survivor returned.
- Asserts the Constructor node (not the Class) is the resolved target.

Low — PHP free function coverage gap
- The language coverage table in the same review flagged PHP free
  functions (top-level `function helper()` outside any class) as
  uncovered. Added a test mirroring the existing Go/Python/Rust/Java/
  JS language tests — exercises the `.php` dispatch path for free
  calls. Ruby and C/C++ remain uncovered; deferred to a future round
  since those languages also have other gaps in the broader test file.

Verification
- `tsc --noEmit` clean
- 3066 unit tests pass (+2 new regression tests)
- 1766 integration tests pass
- Zero regressions

Follows-up on: abhigyanpatwari#756 (comment)

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: magyargergo <11230420+magyargergo@users.noreply.github.com>
Co-authored-by: Gergo Magyar <gergomagyar@icloud.com>
zander-raycraft pushed a commit that referenced this pull request May 7, 2026
…yanpatwari#770)

* Initial plan

* SM-19: Replace resolveCallTarget with thin dispatcher

Delete the monolithic resolveCallTarget function (~200 lines) and replace it
with a 15-line thin dispatcher that routes to resolveMemberCall,
resolveStaticCall, or resolveFreeCall. Extract module-alias resolution and
file-based member-call fallback into dedicated helper functions.

- resolveCallTarget body reduced from ~200 lines to ~15 lines
- Extract resolveModuleAliasedCall helper (Python/Ruby module imports)
- Extract resolveMemberCallByFile helper (trait dispatch, overload disambiguation)
- Extract singleCandidate helper (constructor alias fallback, name-based fallback)
- Update unit tests for new dispatcher semantics
- Update doc comments referencing deleted D0-D4 paths

Agent-Logs-Url: https://github.com/abhigyanpatwari/GitNexus/sessions/469eac38-b0c0-4a26-a2ff-3eb06299730b

Co-authored-by: magyargergo <11230420+magyargergo@users.noreply.github.com>

* SM-19: Add singleCandidate tail fallback for member calls with unresolvable receiver type

Agent-Logs-Url: https://github.com/abhigyanpatwari/GitNexus/sessions/469eac38-b0c0-4a26-a2ff-3eb06299730b

Co-authored-by: magyargergo <11230420+magyargergo@users.noreply.github.com>

* fix(SM-19): address all PR abhigyanpatwari#770 review findings + fix CI

Fixes all 5 test failures (2 unit + 3 integration) and addresses 10
review findings from comment 4225312416.

Critical fix — singleCandidate null-route guard
The SM-19 dispatcher chained singleCandidate as an unconditional tail
fallback for member calls with receiverTypeName. This bypassed the
SM-10 R3 null-route contract: when the receiver type IS in the index
but file/owner filtering produced zero matches, the old code returned
null (genuine miss), but the new code fell through to singleCandidate
(false-positive CALLS edge).

Root cause: resolveMemberCallByFile returns null for two semantically
different reasons — (1) type not found in the index at all, and
(2) type found but no candidate matched after narrowing. The dispatcher
treated both as "try the next fallback." The old resolveCallTarget
exited the entire function on case 2.

Fix: after the scoped resolvers both return null, check whether the
receiver type resolves in the index. If it does (case 2), null-route
— the scoped resolvers made the right decision. If it doesn't (case 1,
e.g. PHP 'mixed', dynamic types), singleCandidate is the correct last
resort. ctx.resolve is cached so the check is free.
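
The guard can be sketched as follows. This is a simplified illustration, not the code in call-processor.ts: `Target`, `MemberCallResolvers`, and the three closures are hypothetical stand-ins for the real resolvers and the `ctx.resolve` check.

```typescript
// Hypothetical, simplified shapes for illustration only.
type Target = { nodeId: string };

interface MemberCallResolvers {
  // Stands in for resolveMemberCall followed by resolveMemberCallByFile.
  scopedResolve: () => Target | null;
  // True when the receiver type resolves in the index.
  typeIsIndexed: () => boolean;
  // Last-resort name-based fallback.
  singleCandidate: () => Target | null;
}

function resolveTypedMemberCall(r: MemberCallResolvers): Target | null {
  const scoped = r.scopedResolve();
  if (scoped) return scoped;
  // Case 2: the type is known but narrowing left zero matches. Genuine miss.
  if (r.typeIsIndexed()) return null;
  // Case 1: the type is unknown to the index (e.g. PHP `mixed`). Try the fallback.
  return r.singleCandidate();
}
```

The point of the sketch is that the "type is indexed" check sits between the scoped resolvers and the fallback, so the fallback can only fire for case 1.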

This fixes:
- Unit: no heritageMap null-route test (was getting 1 edge, expects 0)
- Integration: Rust c.trait_only() negative test
- Integration: 3 PHP heritage + alias tests (singleCandidate correctly
  fires when the receiver type is not in the index)

Performance (findings #1, #2, #3)
- Thread pre-computed tiered result into resolveModuleAliasedCall via
  new tieredOverride parameter — eliminates the duplicate ctx.resolve
  call on every module-alias path.
- Add countCallableCandidates helper that short-circuits at threshold
  without allocating an intermediate array — replaces the
  filterCallableCandidates(...).length > 1 allocation in skipMember.
- resolveMemberCallByFile lookupCallableByName caching deferred to a
  follow-up (finding #2) — the fix requires threading widenCache
  through the file-scoped resolver which is a larger change.
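
A minimal sketch of the short-circuiting count, assuming a generic candidate type and a caller-supplied predicate. The signature is hypothetical; the real helper in call-processor.ts may take different arguments.

```typescript
// Counts candidates accepted by `isCallable`, stopping as soon as `threshold`
// is reached. No intermediate array is allocated.
function countCallableCandidates<T>(
  candidates: Iterable<T>,
  isCallable: (candidate: T) => boolean,
  threshold: number,
): number {
  if (threshold <= 0) return 0;
  let count = 0;
  for (const candidate of candidates) {
    if (isCallable(candidate)) {
      count++;
      if (count >= threshold) break; // short-circuit: the caller only needs "more than one?"
    }
  }
  return count;
}
```

Under these assumptions, a check such as `filter(...).length > 1` becomes `countCallableCandidates(pool, isCallable, 2) > 1`, which stops scanning after the second match.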

Code quality (findings #4, #5)
- Remove dead code: redundant conditional in resolveMemberCallByFile
  where both branches returned null.
- Move WidenCache type declaration from mid-file (between JSDoc blocks)
  to adjacent to CONSTRUCTOR_TARGET_TYPES with other type declarations.

Formatting
- Applied prettier to call-processor.ts (CI format check was failing).

Verification
- tsc --noEmit clean
- 3188 unit tests pass (0 skipped real tests)
- 1766 resolver integration tests pass
- Zero regressions — all PHP, Rust, and no-heritageMap tests green

Review: abhigyanpatwari#770 (comment)

* fix(SM-19): restore module-alias narrowing and constructor disambiguation

Codex adversarial review on PR abhigyanpatwari#770 surfaced two silent regressions in the
SM-19 thin dispatcher:

Finding 1 [high] — Typed member calls bypassed module-alias narrowing.
When two homonym receiver types are both imported by the caller, the
import-scoped tier no longer narrows and the owner/file resolvers see
genuine ambiguity. The dispatcher null-routed silently, dropping valid
CALLS edges. Fix: consult `resolveModuleAliasedCall` at the top of the
typed-member branch so an active alias on `call.receiverName` picks the
aliased file before the generic resolvers run.

Finding 2 [medium] — Constructor dispatch lost overload disambiguation.
When `resolveStaticCall` bails (ambiguous or ownerless Constructor pool)
and the caller supplied `overloadHints` / `preComputedArgTypes`, the
branch fell straight through to `singleCandidate` — which also bails on
multiple same-arity survivors. Fix: between `resolveStaticCall` and
`singleCandidate`, run constructor-filtered overload disambiguation on
the tiered pool. Only engages when a narrowing signal is present;
preserves SM-10 R3 null-route for genuinely ambiguous cases.

Tests:
- call-processor.test.ts: 3 new dispatcher-level regression tests
  covering real-homonym alias narrowing, constructor overload
  disambiguation with `argTypes`, and null-route control
- symbol-table.test.ts: update `module alias homonyms` test which
  previously codified the Finding 1 regression; now asserts resolution
  to the aliased file's method

Verification: 3191 unit + 2398 integration tests pass; tsc --noEmit
clean; prettier clean.

* refactor(SM-19): address code review findings with clean-code pass

Code review on commit f424685 surfaced one P1 correctness regression and
two P2 maintainability concerns. This commit closes all ten findings:

P1 — Alias helper placement regression
  - resolveModuleAliasedCall now runs as a FALLBACK in the typed-member
    branch, after resolveMemberCall/resolveMemberCallByFile return null.
    Previously it short-circuited BEFORE scoped resolvers, leaking unrelated
    homonyms from the aliased file when a local var coincidentally matched
    a module alias.
  - Added type-file verification guard: alias narrowing only fires when the
    alias target file is among the receiver type's defining files. Prevents
    cross-type false positives and hardens SM-10 R3.

P2 — Thin-dispatcher drift (roadmap Phase 3)
  - Extracted disambiguateByOverloadOrArgTypes shared helper. Centralizes
    the overloadHints → preComputedArgTypes precedence rule used by both
    member and constructor resolvers.
  - Folded constructor overload disambiguation into resolveStaticCall as
    step 4.5 (between the ambiguous-pool bail and the instantiable-class
    fallback). resolveStaticCall now accepts optional overloadHints /
    preComputedArgTypes symmetric with resolveMemberCallByFile.
  - Dispatcher's constructor branch returns to a 2-line delegation.
  - resolveMemberCallByFile now calls the shared helper instead of inlining
    the ternary.

P2 — Missing test coverage
  - owner-scoped wins over alias narrowing (alias with unrelated target
    class must not override unique owner-scoped answer)
  - alias narrowing rejects unrelated target type (type-file guard)
  - alias fallthrough: receiverName not in alias map
  - alias fallthrough: alias target file has no matching method
  (overloadHints-for-constructor variant transitively covered via the
   extracted helper's member-path tests; direct dispatcher test deferred
   as it requires real OverloadHints fixture parsing)

P3 — Clarity and durability
  - Stripped "Codex SM-19 Finding N" prefixes from comments. Replaced with
    durable explanations of WHY each guarded branch exists.
  - Added cross-reference comment at the tail-branch resolveModuleAliasedCall
    call site pointing to the typed-member branch usage.

Verification: 3195 unit + 1766 resolver integration + 2398 full integration
tests pass. tsc --noEmit clean. prettier clean.

Plan: docs/plans/2026-04-11-002-fix-sm19-code-review-findings-plan.md

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: magyargergo <11230420+magyargergo@users.noreply.github.com>
Co-authored-by: Gergo Magyar <gergomagyar@icloud.com>
zander-raycraft pushed a commit that referenced this pull request May 7, 2026
…npatwari#606 split) (abhigyanpatwari#796)

* feat(group): extractor expansion + manifest extractor

Part 2 of 4 in the split of abhigyanpatwari#606 (ticket: abhigyanpatwari#792). Follows abhigyanpatwari#795
(bridge.lbug storage foundation, already merged), but this PR has no
code-level dependency on abhigyanpatwari#795 — it only imports types and the
ContractExtractor interface that existed on upstream main before
either PR. It could have been reviewed in parallel with abhigyanpatwari#795.

## What changed

Expands the 3 existing contract extractors with substantially more
language/framework coverage, and adds a new `manifest-extractor`
that resolves `group.yaml`-declared cross-links against the per-repo
graph via exact-name lookups.

### New file (228 LOC)

- `gitnexus/src/core/group/extractors/manifest-extractor.ts` —
  exact graph lookup for `group.yaml`-declared cross-links. HTTP
  paths are canonicalized before Route.name matching; gRPC is
  resolved by service/method name (NO `.proto`-filename fallback);
  topic and lib use exact-name match. Falls back to a synthetic
  `manifest::<repo>::<contractId>` uid when the graph has no
  matching symbol, so cross-impact traversal still has a stable
  anchor for the contract.
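
A rough sketch of the fallback, assuming the graph lookup is an exact-name function that returns a uid or `null`. `ExactLookup`, `resolveManifestUid`, and the sample contract id are hypothetical; only the `manifest::<repo>::<contractId>` shape comes from the description above.

```typescript
// Hypothetical exact-name lookup: returns a graph uid, or null on a miss.
type ExactLookup = (name: string) => string | null;

function syntheticManifestUid(repo: string, contractId: string): string {
  return `manifest::${repo}::${contractId}`;
}

function resolveManifestUid(
  lookup: ExactLookup,
  repo: string,
  contractId: string,
  symbolName: string,
): string {
  // Prefer the real graph symbol; otherwise emit a stable synthetic anchor.
  return lookup(symbolName) ?? syntheticManifestUid(repo, contractId);
}
```

Because the synthetic uid depends only on the repo and contract id, repeated syncs produce the same anchor for an unresolved manifest entry.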

### Modified extractors (+958 LOC prod)

- `extractors/grpc-extractor.ts` (+522) — `.proto` parser with
  comment and string-literal sanitization (braces inside strings no
  longer truncate service bodies); package/service/method canonical
  IDs; server/client detection across Go (`grpc.NewServer`,
  `RegisterXxxServer`, `XxxGrpc.XxxImplBase`), Java (`@GrpcService`,
  `BlockingStub`), Python (`servicer_to_server`, `XxxStub`), and
  TypeScript/Node (`@GrpcMethod`, `ClientGrpc`, `loadPackageDefinition`).
- `extractors/http-route-extractor.ts` (+174) — Go gin/echo/stdlib
  `HandleFunc`, NestJS `@Controller`+`@Get`/etc, Python FastAPI
  decorators, Java Spring `@RequestMapping`/`@GetMapping`,
  restTemplate / WebClient / OkHttp consumers.
- `extractors/topic-extractor.ts` (+98) — sarama `ProducerMessage{}`
  struct literal detection (replaces a constructor-anchored regex
  that missed topics inside producer loops), kafka-go Writer/Reader,
  Python NATS (`await nc.subscribe`/`await nc.publish`), JetStream
  helpers.

### Modified and new tests (+1264 LOC)

- `grpc-extractor.test.ts` (+539) — full coverage of the new proto
  parser (strings-with-braces regression, comments-with-braces
  regression), per-language server/client detection
- `http-route-extractor.test.ts` (+240) — per-framework route
  extraction + normalization edge cases
- `topic-extractor.test.ts` (+177) — the sarama in-loop regression,
  JetStream, Python NATS, kafka-go Writer/Reader
- `manifest-extractor.test.ts` (+308 NEW) — HTTP path normalization,
  gRPC exact lookup with proto-fallback regression, lib and topic
  exact matching, synthetic-uid fallback behavior

### Self-review fixes folded in

Carried forward from the abhigyanpatwari#606 self-review (commit `d15b8cb`):

- **HIGH #1** — `manifest-extractor.resolveSymbol` was too fuzzy.
  Previously used `CONTAINS` on route/name fields plus an
  unconditional `filePath ENDS WITH '.proto'` fallback for gRPC.
  Consequences: `/orders` matched `/suborders`, and any repo with
  any `.proto` file returned a random proto symbol for a gRPC
  manifest entry. Replaced with exact equality + deterministic
  `ORDER BY` + synthetic-uid fallback for unresolved manifests.
  Regression tests included.
- **MED #3** — gRPC proto parser brace-depth counting now sanitizes
  strings and comments first (`stripProtoCommentsAndStrings`). A
  valid proto with `option deprecated_reason = "use NewService {
  instead"` used to have its service body closed early by the `"{"`
  inside the literal, silently dropping methods after the offending
  string. Regression tests for both string-with-brace and
  comment-with-brace cases.
- **MED #4** — sarama Kafka regex changed from
  `sarama.NewSyncProducer[\s\S]{0,300}?Topic:` (anchored on
  constructor, caught only first topic in a loop) to
  `sarama.ProducerMessage{...Topic:}` (matches every struct literal
  directly). Regression test with a for-loop that constructs
  multiple `ProducerMessage`s.
- **MED #7** — `manifest-extractor.resolveSymbol` no longer has a
  silent `catch { /* fall through */ }`. Errors from the graph
  executor are logged via `console.warn` with link type, contract
  name, repo key, and error message before falling through to the
  synthetic-uid path.
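
The sanitize-then-count approach from MED #3 can be illustrated with the sketch below. It is an independent re-implementation written for this note, not the `stripProtoCommentsAndStrings` function from `grpc-extractor.ts`; both function names are hypothetical, and the service name is assumed to be a plain identifier.

```typescript
// Blanks out comment bodies and string contents with spaces, preserving
// offsets and newlines, so brace counting only ever sees structural braces.
function blankCommentsAndStrings(src: string): string {
  let out = "";
  let i = 0;
  while (i < src.length) {
    const two = src.slice(i, i + 2);
    if (two === "//") {
      while (i < src.length && src[i] !== "\n") { out += " "; i++; }
    } else if (two === "/*") {
      const end = src.indexOf("*/", i + 2);
      const stop = end === -1 ? src.length : end + 2;
      while (i < stop) { out += src[i] === "\n" ? "\n" : " "; i++; }
    } else if (src[i] === '"' || src[i] === "'") {
      const quote = src[i];
      out += quote; i++;
      while (i < src.length && src[i] !== quote) {
        // An escape consumes two characters (backslash plus the escaped one).
        if (src[i] === "\\" && i + 1 < src.length) { out += " "; i++; }
        out += " "; i++;
      }
      if (i < src.length) { out += quote; i++; }
    } else {
      out += src[i]; i++;
    }
  }
  return out;
}

// Returns the original text between the braces of `service <name> { ... }`.
function extractServiceBody(src: string, service: string): string | null {
  const clean = blankCommentsAndStrings(src);
  const header = new RegExp(`\\bservice\\s+${service}\\s*\\{`).exec(clean);
  if (!header) return null;
  const start = header.index + header[0].length;
  let depth = 1;
  for (let i = start; i < clean.length; i++) {
    if (clean[i] === "{") depth++;
    else if (clean[i] === "}" && --depth === 0) return src.slice(start, i);
  }
  return null; // unbalanced braces
}
```

Since the sanitized copy keeps the same length as the input, offsets found while counting braces can be applied directly to the original source.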

## Why

Reviewer focus here is pure regex / parser correctness — no
storage, no Cypher queries, no algorithmic changes to the cross-link
algorithm. Separating this from the bridge foundation PR (abhigyanpatwari#795)
meant reviewers could stay in a single mental mode (parsing logic)
instead of context-switching between DDL, Cypher, and regex.

## How to verify

- `cd gitnexus && npx tsc --noEmit`
- `cd gitnexus && npx vitest run test/unit/group/grpc-extractor.test.ts --pool=forks`
- `cd gitnexus && npx vitest run test/unit/group/http-route-extractor.test.ts --pool=forks`
- `cd gitnexus && npx vitest run test/unit/group/topic-extractor.test.ts --pool=forks`
- `cd gitnexus && npx vitest run test/unit/group/manifest-extractor.test.ts --pool=forks`

Local pre-push: typecheck clean, all 99 extractor unit tests pass
(grpc 43, http 18, topic 30, manifest 8).

## Risk / rollback

**Low.** Extractors have no user-facing surface in this PR — they
produce `ExtractedContract[]` that is consumed by `sync.ts` in the
next split (abhigyanpatwari#793). No existing behavior changes for users who don't
run a `group sync`. Rollback = `git revert` of the merge commit;
the modifications to `grpc-extractor.ts` / `http-route-extractor.ts`
/ `topic-extractor.ts` revert to the pre-PR versions that still
work (they're subsets of the new functionality).

## Scope discipline (per GUARDRAILS.md)

- Only the 8 files above are touched; no drive-by refactors
- No CI/release/security config changes
- No secrets or machine-specific paths
- Content lifted from abhigyanpatwari#606 (CI 11/11 green on `d15b8cb`)

## Dependencies

- **Base:** `main` (upstream already includes abhigyanpatwari#795 as `1ff324c`)
- **Blocks:** sync pipeline (abhigyanpatwari#793) and the cross-impact feature (abhigyanpatwari#794)
- **Tracker issue:** abhigyanpatwari#792
- **Parent PR:** abhigyanpatwari#606

Co-authored-by: Claude <noreply@anthropic.com>

* refactor(group): migrate topic-extractor from regex to tree-sitter queries

Addresses @magyargergo's feedback on abhigyanpatwari#796 that regex-based lookups
should use tree-sitter nodes instead, and that the top-level
extractors must NOT carry language dependencies. This is phase 1 of
a multi-step migration — topic-extractor first because its patterns
are the most uniform (16 "call/annotation with first-arg string
literal" variants), which makes it a clean proof of the approach
before grpc-extractor and http-route-extractor get the same treatment.

## Architecture: language-agnostic orchestrator + per-language plugins

The top-level extractor is a thin orchestrator that never imports a
tree-sitter grammar or a query string. Per-language knowledge lives
in a new `topic-patterns/` folder with one file per language plus a
registry that maps file extensions to compiled plugins:

```
src/core/group/extractors/
├── tree-sitter-scanner.ts         # shared, language-agnostic scanning utilities
├── topic-extractor.ts              # thin orchestrator (no grammar imports)
└── topic-patterns/
    ├── types.ts                    # TopicMeta, Broker
    ├── index.ts                    # registry: extension → compiled provider
    ├── java.ts                     # tree-sitter-java + JAVA_TOPIC_PROVIDER
    ├── go.ts                       # tree-sitter-go + GO_TOPIC_PROVIDER
    ├── python.ts                   # tree-sitter-python + PYTHON_TOPIC_PROVIDER
    └── node.ts                     # tree-sitter-javascript + tree-sitter-typescript
                                    # → JAVASCRIPT_/TYPESCRIPT_/TSX_TOPIC_PROVIDER
```

**Shared scanner (`tree-sitter-scanner.ts`)** — defines
`PatternSpec<TMeta>`, `LanguagePatterns<TMeta>`, `CompiledPatterns<TMeta>`
and the `scanFile(parser, plugin, content)` helper. Plugins compile their
queries eagerly at module load via `compilePatterns()`, so a broken
pattern fails loudly at import time instead of silently at scan time.
`unquoteLiteral()` handles single/double/template quotes, Python
triple-quoted strings, and Go raw backtick strings.
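
For illustration, the quote handling described for `unquoteLiteral()` might look roughly like the sketch below. This is an assumption-based re-implementation named `unquoteLiteralSketch`; it does not handle string prefixes or escape sequences, and the real helper may differ.

```typescript
// Strips one layer of quoting from a literal's source text.
// Triple quotes are checked first so `"""x"""` is not treated as `"` + `""x""` + `"`.
function unquoteLiteralSketch(text: string): string {
  for (const q of ['"""', "'''"]) {
    if (text.length >= 6 && text.startsWith(q) && text.endsWith(q)) {
      return text.slice(3, -3);
    }
  }
  // Double, single, and backtick quotes (JS template strings, Go raw strings).
  for (const q of ['"', "'", "`"]) {
    if (text.length >= 2 && text.startsWith(q) && text.endsWith(q)) {
      return text.slice(1, -1);
    }
  }
  return text; // not a recognised literal form: return unchanged
}
```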

**Per-language plugins** own:
- the tree-sitter grammar import (this is the ONLY place in
  `src/core/group/` where tree-sitter grammars are imported),
- the query S-expressions,
- the `TopicMeta` payload (role, broker, confidence, symbolName) that
  the orchestrator receives back on every match.

Each plugin uses a `@value` capture name to bind the topic literal node.
The JavaScript and TypeScript grammars share AST node names for every
construct we query, so `node.ts` defines the pattern sources once and
compiles them against `JavaScript`, `TypeScript.typescript`, and
`TypeScript.tsx` — exporting three providers because `Parser.Query`
objects are NOT portable across grammar instances.

**Registry (`topic-patterns/index.ts`)** — maps `.java` → Java provider,
`.go` → Go, `.py` → Python, `.js`/`.jsx` → JS, `.ts` → TS, `.tsx` → TSX.
Also exports `TOPIC_SCAN_GLOB` so adding a new language is a single
file-level edit (drop `topic-patterns/<lang>.ts`, import + register it
here — zero edits required in `topic-extractor.ts`).
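
A minimal sketch of such a registry, with provider objects reduced to a language label. `TopicProviderSketch`, `providerForFile`, and `SCAN_GLOB_SKETCH` are hypothetical names, and the glob value is derived here for illustration rather than copied from `TOPIC_SCAN_GLOB`.

```typescript
// Hypothetical stand-in for a compiled per-language provider.
interface TopicProviderSketch {
  readonly language: string;
}

// Extension-to-provider table, following the mapping listed above.
const PROVIDERS_BY_EXT = new Map<string, TopicProviderSketch>([
  [".java", { language: "java" }],
  [".go", { language: "go" }],
  [".py", { language: "python" }],
  [".js", { language: "javascript" }],
  [".jsx", { language: "javascript" }],
  [".ts", { language: "typescript" }],
  [".tsx", { language: "tsx" }],
]);

function providerForFile(relPath: string): TopicProviderSketch | undefined {
  const dot = relPath.lastIndexOf(".");
  if (dot === -1) return undefined;
  return PROVIDERS_BY_EXT.get(relPath.slice(dot).toLowerCase());
}

// Deriving the glob from the table keeps "add a language" a one-place edit.
const SCAN_GLOB_SKETCH =
  "**/*.{" + Array.from(PROVIDERS_BY_EXT.keys()).map((ext) => ext.slice(1)).join(",") + "}";
```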

**Orchestrator (`topic-extractor.ts`)** — ~110 lines, no grammar or
query imports. Per file: `getProviderForFile(rel)` → `scanFile(parser,
provider, content)` → `unquoteLiteral(valueText)` → `makeContract(...)`.
Reuses one `Parser` instance across files; the scanner calls
`setLanguage` per plugin.

## Why this is better than regex

1. **Comments and strings are respected for free.** The old regex
   would match `// kafkaTemplate.send("fake.topic")` as a real
   producer; the tree-sitter queries match only call and annotation
   nodes, and text inside a comment or string literal never parses as
   one, so false positives from commented-out code are eliminated.
2. **Struct/object literal patterns are structural, not textual.**
   `sarama.ProducerMessage{Topic: "..."}` no longer needs a 300-char
   lookahead (which was a known cross-match bug partly mitigated by a
   loop regression test in the self-review). The new query matches a
   specific `composite_literal` with a specific `qualified_type` and
   `keyed_element` — exactly one struct literal per match.
3. **No order-of-operations fragility.** Regex for
   `channel.publish` vs `channel.consume` was independent and
   file-wide; the AST scopes matches to the specific `call_expression`.
4. **Language-agnostic extension.** Adding Ruby, Rust, or C# topic
   detection later means dropping one file in `topic-patterns/` — no
   changes to shared scanner or orchestrator, and no tree-sitter
   imports leak into top-level code.

## Per-file fault tolerance

- Malformed files that tree-sitter can't parse are silently skipped
  (`parser.parse` is wrapped by `scanFile`). The ingestion pipeline
  already logs unparseable files at index time.
- A syntactically invalid query is caught at `compilePatterns` time,
  not scan time — broken plugins fail loudly at import.
- Per-pattern `matches()` failures are swallowed so one broken query
  in a plugin doesn't block the rest.

## Tests

All 30 existing `topic-extractor.test.ts` tests pass **without any
changes to the test file** — they were written as input/output contract
tests (given this source file, expect these `ExtractedContract` objects)
and that contract is unchanged. Regression coverage includes:

- Kafka: Java `@KafkaListener` + `kafkaTemplate.send`; Node
  `producer.send` + `consumer.subscribe`; Go sarama producer/consumer
  (sync and async); kafka-go Writer/Reader; Python `KafkaConsumer` +
  `producer.send/produce`
- RabbitMQ: Java `@RabbitListener` + `rabbitTemplate.convertAndSend`;
  Node `channel.consume/publish/sendToQueue`; Python `basic_consume/
  basic_publish` with keyword args
- NATS: Go and Node `nc.Subscribe/Publish`; Go and Node JetStream
  `js.Subscribe/Publish`; Python `await nc.subscribe/publish`

Including the regression test for the sarama `ProducerMessage`
in-loop case — the AST-based query captures every literal in the
file independently, not just the first one after `NewSyncProducer`.

## Neighbor regression check

- `topic-extractor.test.ts` — 30/30 pass (rewritten extractor)
- `http-route-extractor.test.ts` — 18/18 pass (untouched)
- `grpc-extractor.test.ts` — 43/43 pass (untouched)
- `manifest-extractor.test.ts` — 8/8 pass (untouched)
- Full `npx tsc --noEmit` clean

## Scope discipline (per GUARDRAILS.md)

- Only files under `src/core/group/extractors/` are touched; no
  changes to other extractors, tests, MCP surface, or pipeline.ts.
- No CI/release/security config changes, no secrets.
- New tree-sitter imports all reference grammars that are already
  installed as dependencies (`tree-sitter`, `tree-sitter-javascript`,
  `tree-sitter-typescript`, `tree-sitter-python`, `tree-sitter-java`,
  `tree-sitter-go` — all in `package.json` for the existing pipeline).

## Phase 2 / phase 3 plan

- **Phase 2 (next commit):** rewrite `http-route-extractor.ts`
  Strategy B (regex fallback) on the same plugin pattern. Graph-assisted
  Strategy A stays as-is (already uses pipeline-built tree-sitter data
  via `HANDLES_ROUTE` Cypher queries).
- **Phase 3 (commit after):** rewrite `grpc-extractor.ts` for Java /
  Go / Python / TypeScript detection. `.proto` files are the one
  outstanding question: there is no `tree-sitter-proto` grammar
  installed, so the in-tree string-sanitizing parser stays as a
  pragmatic exception, documented with a comment. The alternative is
  to add `tree-sitter-proto` as a dep (open for the maintainer).

Co-authored-by: Claude <noreply@anthropic.com>

* refactor(group): migrate http-route-extractor Strategy B to tree-sitter plugins

Phase 2 of the extractor refactor requested by @magyargergo on abhigyanpatwari#796.
Same architecture as the phase 1 topic-extractor rewrite: a thin,
language-agnostic orchestrator plus per-language plugins that own
tree-sitter grammars and query sources. The top-level extractor file
no longer imports any tree-sitter grammar or query string.

## Architecture

```
src/core/group/extractors/
├── tree-sitter-scanner.ts          # shared, language-agnostic primitives
├── http-route-extractor.ts         # thin orchestrator (no grammar imports)
└── http-patterns/
    ├── types.ts                    # HttpDetection, HttpLanguagePlugin, HttpRole
    ├── index.ts                    # registry: ext → plugin + HTTP_SCAN_GLOB
    ├── java.ts                     # tree-sitter-java: Spring + RestTemplate/WebClient/OkHttp
    ├── go.ts                       # tree-sitter-go: gin/echo/HandleFunc + http/resty consumers
    ├── python.ts                   # tree-sitter-python: FastAPI + requests
    ├── php.ts                      # tree-sitter-php: Laravel Route::get/...
    └── node.ts                     # tree-sitter-javascript + tree-sitter-typescript:
                                    #   NestJS controllers, Express, fetch, axios
```

**Shared scanner (`tree-sitter-scanner.ts`)** — generalised from phase 1:
- `ScanMatch<TMeta>.captures` is now a full `CaptureMap` (every named
  capture the query binds, not just a single `@value`). Topic extractor
  updated to read `match.captures.value` accordingly.
- New `runCompiledPatterns(plugin, tree)` helper lets plugins run
  multiple query bundles against the same pre-parsed tree. This is
  needed for HTTP plugins that combine a class-prefix query with a
  method-route query (Spring, NestJS).
- `scanFile` becomes a thin wrapper over `parser.parse + runCompiledPatterns`.

**HTTP plugin shape** — unlike topic plugins, HTTP plugins expose a
`scan(tree)` function rather than a flat pattern list. This reflects
HTTP's more complex extraction: each detection needs method + path +
handler name, and framework patterns like Spring `@RequestMapping` /
NestJS `@Controller` require cross-referencing a class-level prefix
with method-level annotations. Plugins internally use
`compilePatterns` + `runCompiledPatterns` and walk the AST to resolve
the class/method relationships.

**Per-framework coverage:**

- **Java (`java.ts`)**
  - Spring: `@RequestMapping("/api/v2")` class prefix + `@(Get|Post|Put|
    Delete|Patch)Mapping("/sub")` method routes, joined via the
    enclosing `class_declaration` node id.
  - `RestTemplate.getForObject/postForEntity/put/delete/patchForObject` →
    method derived from API name.
  - `WebClient.method(HttpMethod.X, "/path")` → method from
    `HttpMethod.X` capture.
  - `new Request.Builder().url("/path")` → OkHttp consumer.

- **Go (`go.ts`)**
  - gin / echo / chi frameworks: `\w+.GET("/path", handler)` captures
    upper-case verb + handler identifier.
  - `net/http.HandleFunc("/path", handler)` → provider (default GET).
  - `http.Get/Post/Head` consumer, `http.NewRequest("METHOD", ...)`,
    resty `client.R().Get/Post/...`.

- **Python (`python.ts`)**
  - `@app.get("/path")` FastAPI decorators.
  - `requests.get/post/...` and `requests.request("METHOD", "url")`.

- **PHP (`php.ts`)**
  - Laravel `Route::get/post/.../patch('/path', ...)` via
    `scoped_call_expression`. Uses `PHP.php_only` to match the
    existing ingestion pipeline's grammar selection.

- **Node (`node.ts`) — JS + TS + TSX**
  - Pattern sources defined once, compiled against three grammar
    variants (`JavaScript`, `TypeScript.typescript`, `TypeScript.tsx`)
    because `Parser.Query` objects are not portable across grammars.
    Exports three plugins sharing the same `scan` logic.
  - NestJS: `@Controller('prefix')` decorators are siblings of the
    class in `export_statement` / `program`; `@Get(':id')` decorators
    are siblings of the method in `class_body`. The plugin walks
    decorator → next named sibling to find the decorated class /
    method, then combines the class prefix with the method path.
    Only emits NestJS detections when the enclosing class has a real
    `@Controller` decorator — prevents false positives from generic
    classes that happen to use `@Get` from another library.
  - Express: `(router|app).<verb>('/path', ...)`.
  - `fetch(url)` (default GET) + `fetch(url, { method: 'X' })`
    (uses two queries + a SyntaxNode-id dedupe set so URL literals
    aren't double-emitted by the options variant).
  - `axios.get/post/...`.
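
The NestJS prefix-joining and `@Controller` gating can be sketched on plain data, with the AST walk already done. `DecoratedClass`, `DecoratedMethod`, `joinRoutePath`, and `nestRoutes` are hypothetical names, and the slash normalization shown is an assumption rather than the plugin's exact rule.

```typescript
interface DecoratedMethod {
  verb: string; // e.g. "GET", taken from @Get / @Post / ...
  path: string; // decorator argument, e.g. ":id"
}

interface DecoratedClass {
  controllerPrefix: string | null; // null when the class has no @Controller
  methods: DecoratedMethod[];
}

// Joins a class prefix and a method path with exactly one slash between parts.
function joinRoutePath(prefix: string, sub: string): string {
  const parts = [prefix, sub]
    .map((p) => p.replace(/^\/+|\/+$/g, ""))
    .filter((p) => p.length > 0);
  return "/" + parts.join("/");
}

function nestRoutes(cls: DecoratedClass): { method: string; path: string }[] {
  // Gate: a bare @Get on a class without @Controller is not a NestJS route.
  if (cls.controllerPrefix === null) return [];
  const prefix = cls.controllerPrefix;
  return cls.methods.map((m) => ({ method: m.verb, path: joinRoutePath(prefix, m.path) }));
}
```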

## Orchestrator changes

`http-route-extractor.ts` drops every `scanXxxProviders` / `scanXxxConsumers`
regex method and replaces them with a single source-scan loop that
delegates to `getPluginForFile(rel).scan(tree)`. The orchestrator
still owns:

- **Path normalization** (`normalizeHttpPath`, `normalizeConsumerPath`)
  — language-agnostic string processing shared by both strategies.
- **Graph-assisted Strategy A** (`HANDLES_ROUTE` / `FETCHES` / `CONTAINS`
  Cypher queries) — unchanged in spirit. The only regex helpers it
  used (`inferMethodFromFileScan`, `pickJavaHandlerName`) are now
  replaced by a lookup against the plugin's detections for the same
  file: for each route row, find the detection whose normalized path
  matches, and pull the HTTP method + handler name from it.
- **Per-file parse cache** — the orchestrator parses each relevant
  file at most once per `extract()` call. Both the graph-assisted
  enrichment loop and the source-scan fallback share the same
  `cachedDetections` map, so we never run the plugin twice for the
  same file.
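
The per-file cache amounts to memoizing the plugin scan by file path for the duration of one `extract()` call. A minimal sketch, assuming a hypothetical `scan` callback that parses a file and returns its detections:

```typescript
// Wraps a scan function so each file is scanned at most once.
// The returned closure owns the cache; create a fresh one per extract() call.
function makeDetectionCache<T>(scan: (file: string) => T[]): (file: string) => T[] {
  const cachedDetections = new Map<string, T[]>();
  return (file: string): T[] => {
    let hit = cachedDetections.get(file);
    if (hit === undefined) {
      hit = scan(file);
      cachedDetections.set(file, hit);
    }
    return hit;
  };
}
```

Both the graph-assisted enrichment loop and the source-scan fallback would call the same closure, so whichever runs first pays the parse cost.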

## Why this is better than the regex version

1. **Comments and strings for free.** The old regex would match
   `// router.get('/fake')` as a real Express route; the tree-sitter
   queries match only code nodes, so text inside comments and string
   literals is never reported as a route.
2. **Structural controller-prefix.** Spring and NestJS class-prefix
   joining is now scoped to the enclosing class via `class_declaration`
   node ids, eliminating file-wide state that broke when a file had
   multiple controllers.
3. **Precise NestJS disambiguation.** The plugin only emits a NestJS
   detection when the enclosing class has a real `@Controller`
   decorator — the old regex would fire on any `@Get(...)` in the
   file regardless of surrounding context.
4. **Language-agnostic extension.** Adding Ruby / Rust / Kotlin HTTP
   detection later means dropping one file in `http-patterns/` — no
   changes to the shared scanner, the orchestrator, or the Strategy A
   Cypher queries.

## Tests

- `http-route-extractor.test.ts` — **18/18 pass** (tests unchanged;
  they're contract-style input/output tests and the contract shape is
  unchanged). Covers Spring class prefix, Express, gin/echo, stdlib
  HandleFunc, NestJS, Laravel, FastAPI for providers and
  fetch/axios/python-requests/rest-template/webClient/okhttp/go-stdlib/
  resty for consumers, plus graph-first Strategy A for both.
- `topic-extractor.test.ts` — **30/30 pass** after the `captures.value`
  API migration.
- `grpc-extractor.test.ts` — 43/43 pass (untouched; phase 3).
- `manifest-extractor.test.ts` — 8/8 pass (untouched).
- `service.test.ts`, `sync.test.ts`, `storage.test.ts` — 41/41 pass.
- `npx tsc -p tsconfig.json --noEmit` clean.

## Scope discipline (per GUARDRAILS.md)

- Only files under `src/core/group/extractors/` are touched.
- No changes to pipeline.ts, MCP surface, ingestion, or tests.
- No CI / release / security / secrets changes.
- Tree-sitter grammars imported by plugins (`tree-sitter-java`,
  `tree-sitter-go`, `tree-sitter-python`, `tree-sitter-php`,
  `tree-sitter-javascript`, `tree-sitter-typescript`) are all already
  in `package.json` for the existing ingestion pipeline.

## Phase 3 plan

- **grpc-extractor** gets the same treatment: plugin-per-language under
  `grpc-patterns/` for Java / Go / Python / TS detection. `.proto`
  files remain an open question — no `tree-sitter-proto` grammar is
  installed, so the in-tree string-sanitizing parser from PR abhigyanpatwari#796's
  self-review stays as a pragmatic exception unless the maintainer
  wants us to add `tree-sitter-proto` as a new dep.

Co-authored-by: Claude <noreply@anthropic.com>

* refactor(group): migrate grpc-extractor source scans to tree-sitter plugins

Phase 3 (final) of the extractor refactor requested by @magyargergo on
abhigyanpatwari#796. Same architecture as phase 1 (topic) and phase 2 (http): thin
language-agnostic orchestrator + per-language plugins that own
tree-sitter grammars and query sources. With this commit the top-level
extractors under `src/core/group/extractors/` import ZERO tree-sitter
grammars and ZERO query strings — every grammar import lives in a
`*-patterns/<lang>.ts` plugin file, and the orchestrators go through
the registry indirection.

## Architecture

```
src/core/group/extractors/
├── tree-sitter-scanner.ts         # shared primitives (unchanged)
├── grpc-extractor.ts               # orchestrator (only `.proto` parser left)
└── grpc-patterns/
    ├── types.ts                    # GrpcDetection, GrpcLanguagePlugin, GrpcRole
    ├── index.ts                    # registry: ext → plugin + GRPC_SCAN_GLOB
    ├── go.ts                       # tree-sitter-go: RegisterXxxServer, Unimplemented, NewXxxClient
    ├── java.ts                     # tree-sitter-java: @GrpcService + XxxImplBase + newBlockingStub
    ├── python.ts                   # tree-sitter-python: add_XxxServicer_to_server + XxxStub
    └── node.ts                     # tree-sitter-javascript + tree-sitter-typescript:
                                    #   @GrpcMethod, @GrpcClient field type,
                                    #   .getService<X>('Svc'), new XxxServiceClient,
                                    #   loadPackageDefinition dynamic constructors
```

## Per-language coverage

**Go (`go.ts`)**
- Provider: `\w+.RegisterXxxServer(...)` via `call_expression →
  selector_expression → field_identifier` + JS regex filter
  `^Register(\w+)Server$`.
- Provider: `pb.UnimplementedXxxServer` embedded in a struct via
  `struct_type → field_declaration_list → field_declaration →
  qualified_type → type_identifier` + JS filter.
- Consumer: `\w+.NewXxxClient(...)` via the same call_expression
  query + JS filter `^New(\w+)Client$`.
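
The JS-side filtering step can be sketched on its own, given the `field_identifier` text that the call-expression query captured. The two regexes are the ones quoted above; `classifyGoGrpcCall` and the result shape are hypothetical.

```typescript
type GrpcRoleSketch = "provider" | "consumer";

// Classifies a captured Go method name as a gRPC registration or client constructor.
function classifyGoGrpcCall(
  fieldName: string,
): { role: GrpcRoleSketch; service: string } | null {
  const registered = /^Register(\w+)Server$/.exec(fieldName);
  if (registered) return { role: "provider", service: registered[1] };
  const client = /^New(\w+)Client$/.exec(fieldName);
  if (client) return { role: "consumer", service: client[1] };
  return null; // any other call is not gRPC-related
}
```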

**Java (`java.ts`)**
- Provider: `class X extends YyyGrpc.YyyImplBase` — two queries
  handle the scoped and plain forms. `scoped_type_identifier`'s
  children are positional (no `scope:`/`name:` fields), so the
  query matches the two `type_identifier` children by position.
- `#match? @inner "ImplBase$"` restricts matches at query time.
- Whether the class has `@GrpcService` or not controls only the
  `source` metadata label — the plugin walks the class_declaration's
  `modifiers` child in JS to detect the marker_annotation.
- Consumer: `YyyGrpc.newStub(ch)` / `newBlockingStub(ch)` via a
  `method_invocation` query with `#match? @method
  "^new(Blocking)?Stub$"`, service name extracted via
  `^(\w+)Grpc$` on the object identifier.

**Python (`python.ts`)**
- Single call-expression query covers both bare identifier and
  `obj.method` attribute forms:
  `(call function: [(identifier) @fn (attribute attribute: (identifier) @fn)])`.
- Plugin filters `@fn.text` against two JS regexes:
  `^add_(\w+)Servicer_to_server$` (provider) and `^(\w+)Stub$`
  (consumer), with a reserved-names ignore list for the Stub case
  (Mock / Test / Fake / Stub).

**Node — JavaScript + TypeScript + TSX (`node.ts`)**
- Pattern sources defined once, compiled three times (one per grammar)
  because `Parser.Query` objects are not portable across grammars.
  Exports three `GrpcLanguagePlugin`s sharing the same `scan`.
- `@GrpcMethod('Service', 'Method')`: decorator query captures the
  two string literals. Confidence is hard-coded 0.8 regardless of
  proto map resolution (matches the original regex version's
  behaviour).
- `@GrpcClient(...) field: XxxServiceClient`: decorator query
  captures the decorator node, plugin walks up to find the enclosing
  `public_field_definition` (decorators on fields are CHILDREN of
  the field definition in tree-sitter-typescript, not siblings) and
  reads its first `type_annotation → type_identifier`, then runs the
  `^(\w+Service)Client$` JS filter.
- `client.getService<X>('AuthService')`: call-expression query on
  `member_expression.property = "getService"` + string literal arg.
- `new XxxServiceClient(...)`: `new_expression` with a bare
  identifier constructor, filtered by `^(\w+Service)Client$` so
  generic `new AuthClient(...)` (missing the `Service` infix) does
  NOT falsely register as a consumer. Preserves the regression test
  `test_extract_ts_non_service_client_constructor_is_ignored`.
- `loadPackageDefinition` dynamic loader: gated on
  `tree.rootNode.text.includes('loadPackageDefinition')`. When set,
  `new foo.bar.Xxx(...)` qualified constructors with a capitalised
  property name register as consumers.
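
The `AuthClient` vs `AuthServiceClient` discrimination comes down to the `^(\w+Service)Client$` filter. A minimal sketch, where the function name `serviceFromClientCtor` is hypothetical and only the regex is taken from the notes above:

```typescript
// Filter quoted in the Node coverage notes: the constructor name must
// carry the `Service` infix before the `Client` suffix.
const SERVICE_CLIENT = /^(\w+Service)Client$/;

// Returns the service name for a generated-client constructor,
// or null when the identifier does not look like one.
function serviceFromClientCtor(ctorName: string): string | null {
  return SERVICE_CLIENT.exec(ctorName)?.[1] ?? null;
}
```

With this filter, `new AuthServiceClient(...)` yields `AuthService`, while `new AuthClient(...)` is ignored.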

## Orchestrator changes

`grpc-extractor.ts` loses every `scanGoProviders` / `scanJavaProviders`
/ ... helper and replaces them with a single source-scan loop that:

1. Parses each file with the plugin's grammar (one shared `Parser`
   instance across all files, `setLanguage` called per plugin).
2. Calls `plugin.scan(tree)` to get `GrpcDetection[]`.
3. Converts each detection to an `ExtractedContract` via the private
   `detectionToContract` helper, which:
   - Looks the short service name up in the proto map (filled by
     the `.proto` parser).
   - Picks confidence = `confidenceWithProto` if resolved, else
     `confidenceWithoutProto`.
   - Builds a method-level contract id (`grpc::pkg.Svc/Method`) when
     the detection carries a `methodName` (TS `@GrpcMethod` only),
     otherwise a service-level id (`grpc::pkg.Svc/*`).
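
The conversion step can be sketched roughly as follows. The field layout of `DetectionSketch`, the shape of the proto map (short name to package-qualified name), and the fallback to the short name when the lookup misses are all assumptions made for illustration; only the confidence rule and the two id formats come from the description above.

```typescript
// Hypothetical shapes; the real GrpcDetection / ExtractedContract types
// are not shown in this commit message.
interface DetectionSketch {
  serviceName: string;            // short service name, e.g. "AuthService"
  methodName?: string;            // set only for method-level detections
  confidenceWithProto: number;
  confidenceWithoutProto: number;
}

interface ContractSketch {
  id: string;
  confidence: number;
}

function toContractSketch(
  d: DetectionSketch,
  protoMap: Map<string, string>,  // assumed: short name -> "pkg.Svc"
): ContractSketch {
  const qualified = protoMap.get(d.serviceName);
  // Assumption: fall back to the short name when the proto map has no entry.
  const service = qualified ?? d.serviceName;
  const confidence =
    qualified !== undefined ? d.confidenceWithProto : d.confidenceWithoutProto;
  // Method-level id when a method name is present, else service-level wildcard.
  const id = d.methodName
    ? `grpc::${service}/${d.methodName}`
    : `grpc::${service}/*`;
  return { id, confidence };
}
```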

Everything else — the `.proto` parser, `buildProtoContext`,
`buildProtoMap`, `resolveProtoConflict`, `serviceContractId`,
`stripProtoCommentsAndStrings`, `extractServiceBlocks`, the dedupe
function — stays exactly as before. The `.proto` parser is kept as a
pragmatic exception to the "no regex in extractors" rule because no
`tree-sitter-proto` grammar is installed in the repo; a comment at the
top of the file explains this and flags the maintainer option of
adding `tree-sitter-proto` as a dependency.

## Why this is better than the regex version

1. **Comments and strings are respected for free.** Matched node types
   are only code constructs, never text inside comments or string
   literals.
2. **No false positives on partial names.** The old `(\w+?)Grpc`-style
   regexes would cross-match unrelated identifiers; structural queries
   restrict matches to the exact AST shape (`scoped_type_identifier →
   type_identifier` pairs, `method_invocation → identifier` etc.).
3. **NestJS `@GrpcClient` is structural, not regex-based.** The old
   regex required a specific textual layout
   (`@GrpcClient(...) private readonly foo!: XxxServiceClient`); the
   plugin now walks the AST, so modifier order / optional modifiers /
   multi-line formatting don't break it.
4. **Language-agnostic extension.** Adding Kotlin / Rust / C# gRPC
   detection later is a one-file edit in `grpc-patterns/index.ts` —
   no touches to the shared scanner, the orchestrator, or the proto
   parser.

## Tests

- `grpc-extractor.test.ts` — **43/43 pass** (tests unchanged; the
  contract shape is identical). Covers .proto parsing (including the
  brace-inside-string regression), Go provider/consumer,
  Java @GrpcService / plain ImplBase provider + newBlockingStub
  consumer, Python servicer + stub, TS @GrpcMethod + @GrpcClient +
  .getService + new XxxServiceClient + loadPackageDefinition + the
  `AuthClient` vs `AuthServiceClient` discrimination, dedupe across
  multiple patterns in one file, proto-aware confidence, and the
  inherited-package resolution for split proto definitions.
- `topic-extractor.test.ts` — 30/30 pass.
- `http-route-extractor.test.ts` — 18/18 pass.
- `manifest-extractor.test.ts` — 8/8 pass.
- `service.test.ts`, `sync.test.ts`, `storage.test.ts` — 41/41 pass.
- `npx tsc -p tsconfig.json --noEmit` clean.

## Scope discipline (per GUARDRAILS.md)

- Only files under `src/core/group/extractors/` are touched.
- No pipeline.ts, MCP surface, ingestion, CI / release / security, or
  test changes.
- New tree-sitter grammar imports (`tree-sitter-go`, `tree-sitter-java`,
  `tree-sitter-python`, `tree-sitter-javascript`, `tree-sitter-typescript`)
  are all already installed for the ingestion pipeline.

## End of phase series

This commit completes the three-phase extractor refactor:
  - **Phase 1** (`ea06d11`): topic-extractor → `topic-patterns/`
  - **Phase 2** (`b6015f6`): http-route-extractor → `http-patterns/`
  - **Phase 3** (this commit): grpc-extractor → `grpc-patterns/`

Every remaining regex-based extractor helper under the
`src/core/group/extractors/` directory is either (a) language-agnostic
string processing (path normalization, dedupe keys) or (b) the `.proto`
parser, which is documented as an explicit exception.

Co-authored-by: Claude <noreply@anthropic.com>

* feat(group): add tree-sitter-proto for .proto file parsing

Addresses @magyargergo's suggestion on abhigyanpatwari#796 to replace the manual
string-sanitizing .proto parser with a tree-sitter grammar.

- **Vendored `tree-sitter-proto`** in `vendor/tree-sitter-proto/`.
  Grammar source from [coder3101/tree-sitter-proto](https://github.com/coder3101/tree-sitter-proto)
  (latest `grammar.js`), parser.c regenerated with `tree-sitter-cli
  0.24` to produce ABI version 14 — compatible with the project's
  `tree-sitter 0.25` runtime (which supports ABI ≤ 14). Added as
  `optionalDependency` with `file:./vendor/tree-sitter-proto`.

- **New `grpc-patterns/proto.ts` plugin** — uses the same
  `compilePatterns` + `runCompiledPatterns` infrastructure as every
  other plugin. Two queries:
  - `(package (full_ident) @pkg)` — package declaration
  - `(service (service_name) @service_name (rpc (rpc_name) @rpc_name))`
    — one match per (service, rpc) pair

- **Graceful fallback** — `tree-sitter-proto` is an optional
  dependency. If it fails to install (platform incompatibility) or
  fails the runtime smoke-test (`setLanguage` + `parse` on a trivial
  proto), `PROTO_GRPC_PLUGIN` stays `null` and the orchestrator
  uses the existing manual parser. The smoke-test catches the
  `SyntaxNode` TDZ error that occurs in vitest's fork-based test
  runner.

- **Orchestrator updated** — when `hasProtoPlugin` is true, `.proto`
  files are handled by the plugin loop (they're included in
  `GRPC_SCAN_GLOB`), and the manual `parseProtoFile` loop is
  skipped. `buildProtoContext` still runs to build the proto map
  for cross-referencing source-file detections.

Why tree-sitter instead of the manual parser:

1. **No manual comment/string stripping.** The old parser needed
   `stripProtoCommentsAndStrings` (110 lines) to avoid counting
   braces inside comments and string literals. tree-sitter handles
   this natively.
2. **No brace-depth tracking.** `extractServiceBlocks` used a manual
   depth counter to find service boundaries. tree-sitter's AST gives
   us `service` → `service_name` + `rpc` → `rpc_name` directly.
3. **Performance.** tree-sitter's C-based parser is faster than
   character-by-character JS scanning + regex on large proto files.

Tests:

- `grpc-extractor.test.ts` — **43/43 pass** (unchanged)
- All other extractor tests — 99/99 pass
- `npx tsc -p tsconfig.json --noEmit` clean

Co-authored-by: Claude <noreply@anthropic.com>

* chore: add .gitignore for vendored tree-sitter-proto build artifacts

https://claude.ai/code/session_01SFUCxgKMMQ8EgRHYw91xPU

* fix: correct .gitignore paths for vendored tree-sitter-proto

Patterns should be relative to the .gitignore file's directory.

https://claude.ai/code/session_01SFUCxgKMMQ8EgRHYw91xPU

* refactor(group): address Copilot review feedback on abhigyanpatwari#796

Six fixes suggested by the Copilot AI review:

1. **`normalizeHttpPath` root-path edge case** — stripping trailing
   slashes on the input `/` produced an empty string, yielding
   malformed contract ids like `http::GET::`. Now preserves `/` for
   the root handler/fetch case.

2. **Dedupe `scanFiles` call** — `extract()` was globbing the
   source-scan file list twice (once for the provider fallback, once
   for the consumer fallback). Moved to a single lazy call that
   memoizes the result for the rest of the method.

3. **HTTP `scanFiles` now ignores `**/vendor/**`** — every other
   extractor's glob already ignored vendored sources; the HTTP one
   didn't. Fixed for consistency.

4. **`loadPackageDefinition` check is now structural** — was calling
   `tree.rootNode.text.includes('loadPackageDefinition')` which forces
   materialization of the entire file text from the parse tree
   (expensive on large files). Replaced with a dedicated compiled
   query on `(call_expression function: [(identifier) | (member_expression)])`
   so the check stays in the AST domain.

5. **`grpc-extractor.ts` header docstring updated** — still claimed
   ".proto parsing is not tree-sitter-based because no grammar is
   installed". Now describes the actual behaviour: tree-sitter when
   `tree-sitter-proto` is available (optionalDependency), manual
   fallback otherwise.

6. **Eliminated the double proto file parse on the fallback path** —
   `buildProtoContext` already globs + parses every `.proto` file to
   build `servicesByName`. On the `!hasProtoPlugin` branch the
   extractor was globbing + parsing again via the now-removed
   `parseProtoFile` helper. The fallback branch now iterates the map
   that `buildProtoContext` already produced to emit provider
   contracts directly — single pass per proto file.
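
The root-path fix in item 1 can be illustrated with a small sketch. It covers only the trailing-slash step; the real `normalizeHttpPath` may apply further normalization that is not described here, and the function name below is deliberately different to mark it as hypothetical.

```typescript
// Strip trailing slashes, but keep "/" for the root route so the
// contract id never ends up with an empty path segment.
function normalizeHttpPathSketch(path: string): string {
  const stripped = path.replace(/\/+$/, "");
  return stripped === "" ? "/" : stripped;
}
```

Without the final guard, the input `/` would collapse to an empty string and produce an id of the form `http::GET::`.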

## Tests

- `topic-extractor.test.ts` — 30/30 pass
- `http-route-extractor.test.ts` — 18/18 pass
- `grpc-extractor.test.ts` — 43/43 pass
- `manifest-extractor.test.ts` — 8/8 pass
- `npx tsc -p tsconfig.json --noEmit` clean

Co-authored-by: Claude <noreply@anthropic.com>

* refactor(group): address Claude review feedback (bugs + dedup + hygiene) on abhigyanpatwari#796

Follows up `2f28bfc` with the remaining items from the Claude AI review:

## Bugs

**Bug 2 — Label-unaware Cypher queries in `resolveSymbol`.**
The manifest-extractor's lookup queries were `MATCH (n) WHERE n.name = $x`
with no label filter, so a topic/service/package name could silently match
any node type (File, Variable, Import, Folder, …). Added label filters:
- `topic` → `(n:Function|Method|Class|Interface)` (topics are best-effort
  symbol-name matches against listener/publisher symbols)
- `grpc` method → `(n:Function|Method)`
- `grpc` service → `(n:Class|Interface)`
- `lib` → `(n:Package|Module)`
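
A rough sketch of how the label-scoped lookups could be assembled. The kind names, the `buildLookupQuery` helper, and the `RETURN n` clause are assumptions for illustration; the label sets and the `n.name = $x` predicate are the ones listed above.

```typescript
// Hypothetical kind names; the label sets mirror the list above.
type LookupKind = "topic" | "grpcMethod" | "grpcService" | "lib";

const LABELS_BY_KIND: Record<LookupKind, string> = {
  topic: "Function|Method|Class|Interface",
  grpcMethod: "Function|Method",
  grpcService: "Class|Interface",
  lib: "Package|Module",
};

// Builds a label-scoped lookup query. The RETURN clause is an assumption;
// `$x` is a query parameter, not a template placeholder.
function buildLookupQuery(kind: LookupKind): string {
  return `MATCH (n:${LABELS_BY_KIND[kind]}) WHERE n.name = $x RETURN n`;
}
```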

All 8 manifest-extractor tests still pass (mock executor is
label-agnostic, but the production LadybugDB graph now gets correctly
scoped queries).

**Bug 8 — Tautological `!handlerName` condition.**
`http-route-extractor.ts:extractProvidersGraph` had
`let handlerName = null; if (!method || !handlerName) { ... }` — the
`!handlerName` clause was always true since there was no intervening
assignment. Simplified to always run the plugin-scan lookup (we need
the handler name even when `methodFromRouteReason` already resolved
the method).

## Clean code / dedup

**Design 7 — `readSafe` was copy-pasted in all three orchestrators.**
Extracted to `extractors/fs-utils.ts` as the single source of truth
for the path-traversal guard. Dropped the three local copies and the
now-unused `fs`/`path` imports from topic-extractor.

**Style 10 — Language-specific `_test.go` skip in the topic orchestrator.**
Was `if (rel.endsWith('_test.go')) continue;` inside the language-
agnostic extraction loop. Pushed into the glob's ignore list
(`'**/*_test.go'`) alongside the existing `node_modules`, `vendor`,
`dist`, `build` entries, with a comment explaining that other
languages' test file conventions either live in separate directories
(Python `tests/`, Java `src/test/`) or are already covered by the
existing ignores.

## Already addressed in `2f28bfc` (mentioned again in Claude review)

- Bug 3: `normalizeHttpPath('/')` returns `''` — fixed
- Bug 4: double glob + double parse of `.proto` — fixed
- Bug 5: `scanFiles` called twice in HTTP — fixed
- Bug 6: missing `**/vendor/**` in HTTP glob — fixed
- Design 9 partially: `tree.rootNode.text.includes('loadPackageDefinition')`
  replaced with a dedicated structural query

## Deferred

- Bug 1 (`http::*::path` vs `http::GET::path` matching) — out of scope;
  sync.ts matching logic lands in abhigyanpatwari#793, manifest extractor already
  emits correct synthetic uids for unresolved HTTP contracts.
- Design 9 full (change plugin `scan(tree)` → `scan(tree, source)`) —
  the only real use case (`loadPackageDefinition` gate) is already
  fixed via a structural query, so the interface change would be
  cosmetic churn without a concrete consumer.

## Tests

- `topic-extractor.test.ts` — 30/30 pass
- `http-route-extractor.test.ts` — 18/18 pass
- `grpc-extractor.test.ts` — 43/43 pass
- `manifest-extractor.test.ts` — 8/8 pass
- `npx tsc -p tsconfig.json --noEmit` clean

Co-authored-by: Claude <noreply@anthropic.com>

* docs+fix(group): address remaining Claude review items + add pipeline flow chart

## Fixes

**Remaining 🔴 — HTTP contract id wildcard format.** Documented the
`http::*::<path>` format as an intentional wildcard for manifest links
that omit the HTTP method, alongside the explicit-method form
(`GET::/path` → `http::GET::/path`). The docblock on `buildContractId`
now states both forms, notes that wildcard-aware matching is the
responsibility of the sync / cross-impact layer (abhigyanpatwari#793), and
recommends the explicit-method form whenever the author knows the
method (it round-trips through exact equality without needing
wildcard logic downstream). Tests unchanged — the wildcard format is
what they've always asserted.

**Minor 1 — stale comment at `manifest-extractor.ts:124-126`.** The
comment claimed "creates a contract with an empty symbolUid/ref" but
the code switched to `manifestSymbolUid(repo, contractId)` a few
commits back. Updated to describe the actual synthetic-uid fallback
semantics and the cross-impact path that relies on both sides of the
join deriving the same uid.

**Minor 2 — exhaustiveness guard on `buildContractId`.** The
`switch(type)` covered all five current `ContractType` variants but
silently returned `undefined` if a new variant was added. Added a
`default: const _exhaustive: never = type; throw new Error(...)`
clause so the build fails loudly on an unhandled variant.
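
The exhaustiveness guard follows the usual `never` pattern. The sketch below uses a three-variant union and made-up id formats purely for illustration; the real `ContractType` has five variants and `buildContractId` has a different signature.

```typescript
// Illustrative union; not the project's actual ContractType.
type ContractTypeSketch = "http" | "grpc" | "topic";

function buildContractIdSketch(type: ContractTypeSketch, name: string): string {
  switch (type) {
    case "http":
      return `http::${name}`;
    case "grpc":
      return `grpc::${name}`;
    case "topic":
      return `topic::${name}`;
    default: {
      // If a new variant is added to the union without a matching case,
      // `type` is no longer `never` here and this assignment stops compiling.
      const _exhaustive: never = type;
      throw new Error(`Unhandled contract type: ${String(_exhaustive)}`);
    }
  }
}
```

The assignment catches the omission at compile time, and the `throw` covers values that bypass the type system at runtime.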

**Minor 3 — `tree.rootNode.text` in `grpc-patterns/node.ts`.** Already
fixed in `2f28bfc` via a dedicated structural query
(`LOAD_PACKAGE_DEFINITION_SPEC`). No action needed.

## New: pipeline flow chart (per @magyargergo's request)

Added `src/core/group/PIPELINE.md` with five Mermaid diagrams:
1. **High-level overview** — `group.yaml` → extractors + manifest →
   contract matching → `bridge.lbug` → `runGroupImpact`.
2. **Per-repo extractor two-strategy shape** — graph-assisted
   Strategy A vs. source-scan Strategy B.
3. **Plugin architecture** — orchestrator → registry →
   per-language `*-patterns/<lang>.ts` → `tree-sitter-scanner.ts` →
   `ExtractedContract`.
4. **Manifest extraction** — label-scoped `resolveSymbol` with the
   synthetic-uid fallback.
5. **Cross-impact query (abhigyanpatwari#606)** — local impact → bridge join →
   cross-repo fan-out.

Each diagram is annotated with which PRs own which stage (this PR:
extractors + manifest; abhigyanpatwari#795: bridge storage; abhigyanpatwari#606: cross-impact
runtime) and points at the concrete files/functions involved.

## Tests

- 99/99 extractor tests pass
- `npx tsc -p tsconfig.json --noEmit` clean

Co-authored-by: Claude <noreply@anthropic.com>

---------

Co-authored-by: Claude <noreply@anthropic.com>
zander-raycraft pushed a commit that referenced this pull request May 7, 2026
… creation on zero-node path (abhigyanpatwari#831)

* Initial plan

* fix: stale vectors preserved on content edits and vector index missing after zero-node run

Issue 1: Add contentHash to EMBEDDING_SCHEMA and embedding pipeline.
- contentHash column persisted per CodeEmbedding row
- POST /api/embed queries nodeId+contentHash, compares per-node hash
- Stale rows (hash mismatch) are DELETE'd before re-embedding
- Legacy DBs without contentHash treated as stale (full re-embed)
- loadCachedEmbeddings and run-analyze cache restore include contentHash

Issue 2: createVectorIndex called unconditionally before zero-node early return.

Regression tests:
- contentHashForNode determinism and content-change detection
- EMBEDDING_SCHEMA includes contentHash STRING column
- Pipeline exports verified

Agent-Logs-Url: https://github.com/abhigyanpatwari/GitNexus/sessions/1581c0c0-f359-4376-b47e-62d24a28fd2d

Co-authored-by: magyargergo <11230420+magyargergo@users.noreply.github.com>

* fix: use parameterized query for stale embedding DELETE, revert package-lock.json

Agent-Logs-Url: https://github.com/abhigyanpatwari/GitNexus/sessions/1581c0c0-f359-4376-b47e-62d24a28fd2d

Co-authored-by: magyargergo <11230420+magyargergo@users.noreply.github.com>

* fix: address review feedback — config consistency, narrow catches, extract DB logic

Bug #1: Use finalConfig consistently in contentHashForNode (line 224 was
using raw `config` while line 307 used `finalConfig`). Cache precomputed
hashes in filter phase to avoid double computation (Perf #5).

Bug #2: Narrow catch in loadCachedEmbeddings to only fall back on
column/table-missing errors. Rethrow transient/connection errors.

Bug #3: Log non-trivial DELETE failures instead of silently swallowing.

Arch Violation #3: Extract fetchExistingEmbeddingHashes from api.ts into
lbug-adapter.ts. Server layer now calls a single adapter function instead
of re-implementing the DB query logic with nested try-catch.

Tests: Add config consistency test, note that fetchExistingEmbeddingHashes
tests require native module (run in CI).

Agent-Logs-Url: https://github.com/abhigyanpatwari/GitNexus/sessions/b8c4f6b0-4095-4507-a15d-d8469793efac

Co-authored-by: magyargergo <11230420+magyargergo@users.noreply.github.com>

* fix: narrow Column error match to 'contentHash' in lbug-adapter fallback checks

Agent-Logs-Url: https://github.com/abhigyanpatwari/GitNexus/sessions/b8c4f6b0-4095-4507-a15d-d8469793efac

Co-authored-by: magyargergo <11230420+magyargergo@users.noreply.github.com>

* fix: address production-readiness review — eliminate competing state, use schema constants, hard-fail on stale DELETE, add incremental filter tests

Gap A / Arch Violation 1: Remove duplicate vectorExtensionLoaded flag from
embedding-pipeline.ts — delegate to lbug-adapter's loadVectorExtension()
which owns the VECTOR extension lifecycle and resets on DB reconnect.

Arch Violation 2: Replace all hardcoded 'CodeEmbedding' and
'code_embedding_idx' strings in embedding-pipeline.ts and run-analyze.ts
with EMBEDDING_TABLE_NAME, EMBEDDING_INDEX_NAME, and CREATE_VECTOR_INDEX_QUERY
imported from schema.ts. Add EMBEDDING_INDEX_NAME export to schema.ts.

Gap B: Make DELETE failure for stale vectors a hard throw (not just a
warning). Continuing after failed DELETE risks Kuzu vector-index corruption
since the constraint requires DELETE-before-INSERT for vector-indexed
properties. "not found" / "does not exist" errors are still safe to ignore.

STALE_HASH_SENTINEL: Define a named constant in embedding types.ts for the
empty-string sentinel convention. Used consistently in lbug-adapter.ts and
run-analyze.ts so the invariant is self-documenting.

Tests: Add comprehensive unit tests for the incremental filter logic with
mocked embedder:
- New node → embedded
- Unchanged node (hash matches) → skipped
- Stale node (hash mismatch) → DELETE + re-embed
- STALE_HASH_SENTINEL → treated as stale
- Zero nodes after filter → createVectorIndex still called
- DELETE failure with non-trivial error → throws
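
The per-node decision these tests exercise can be sketched as a pure function. The names `EmbedAction` and `classifyNode` are hypothetical, and the sentinel constant is redeclared locally (as the empty string, per the convention described above) so the snippet stands alone.

```typescript
// Local stand-in for STALE_HASH_SENTINEL (the empty-string convention).
const STALE_HASH_SENTINEL_SKETCH = "";

type EmbedAction = "embed" | "skip" | "delete-then-embed";

// existingHashes maps nodeId -> stored contentHash for rows already embedded.
function classifyNode(
  nodeId: string,
  currentHash: string,
  existingHashes: Map<string, string>,
): EmbedAction {
  const stored = existingHashes.get(nodeId);
  if (stored === undefined) return "embed"; // new node
  if (stored === STALE_HASH_SENTINEL_SKETCH || stored !== currentHash) {
    return "delete-then-embed"; // stale row: DELETE before re-inserting
  }
  return "skip"; // hash matches, vector is still valid
}
```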

Agent-Logs-Url: https://github.com/abhigyanpatwari/GitNexus/sessions/b21edee7-c9c5-4742-947b-d0def4fb26aa

Co-authored-by: magyargergo <11230420+magyargergo@users.noreply.github.com>

* fix: tighten error classification — extract isMissingColumnOrTableError helper, remove broad pattern matching

- Extract isMissingColumnOrTableError() helper in lbug-adapter for
  consistent schema-error detection (replaces duplicate inline checks)
- Tighten 'contentHash' match: now requires 'property' AND 'contentHash'
  (Kuzu-specific pattern) instead of broad 'contentHash' substring
- Tighten DELETE error check: only ignore 'does not exist' (Kuzu's actual
  message), not broad 'not found' which could mask connection errors
- Fix test node ID/name/filePath consistency

Agent-Logs-Url: https://github.com/abhigyanpatwari/GitNexus/sessions/b21edee7-c9c5-4742-947b-d0def4fb26aa

Co-authored-by: magyargergo <11230420+magyargergo@users.noreply.github.com>

* fix: CI failures and final review — move STALE_HASH_SENTINEL to schema, tighten error matching, fix test mocking, format

- Move STALE_HASH_SENTINEL from embeddings/types.ts to lbug/schema.ts
  (fixes inverted layer dependency: lbug should not import from embeddings)
- Tighten isMissingColumnOrTableError: replace broad msg.includes('not found')
  with /(table|column|property).*not found/i regex to avoid matching transient errors
- Add vi.resetModules() in test beforeEach for explicit module isolation
  (fixes vi.doMock not intercepting loadVectorExtension in CI)
- Skip precomputedHashes.set() on unchanged (return false) path
- Run prettier on all 5 files flagged by CI format check
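
A minimal sketch of the tightened matcher, using only the regex quoted above. The real `isMissingColumnOrTableError` may carry additional clauses, and the error messages in the usage note are made up for illustration.

```typescript
// Only schema-shaped "not found" messages count as a missing column/table.
const MISSING_SCHEMA_RE = /(table|column|property).*not found/i;

function isMissingColumnOrTableErrorSketch(err: unknown): boolean {
  const msg = err instanceof Error ? err.message : String(err);
  return MISSING_SCHEMA_RE.test(msg);
}
```

Under this check a message such as "Property contentHash not found" triggers the legacy fallback, while "connection not found" is rethrown as a transient error.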

Agent-Logs-Url: https://github.com/abhigyanpatwari/GitNexus/sessions/e20311fd-4361-47b4-a137-9adc3e533b35

* fix: address remaining review nits — rename precomputedHashes, generalize error matcher, revert package-lock

- Rename precomputedHashes → computedStaleHashes (hashes are computed
  on-demand during filter, only cached for stale nodes being re-embedded)
- Remove contentHash-specific clause from isMissingColumnOrTableError —
  the regex /(table|column|property).*not found/i already covers it
- Revert package-lock.json ssh→https protocol change

Agent-Logs-Url: https://github.com/abhigyanpatwari/GitNexus/sessions/e20311fd-4361-47b4-a137-9adc3e533b35

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: magyargergo <11230420+magyargergo@users.noreply.github.com>
zander-raycraft pushed a commit that referenced this pull request May 7, 2026
…mo / useCallback / useMemo / observer) (abhigyanpatwari#1261)

* fix(typescript): name HOC-wrapped const declarations (forwardRef / memo / useCallback / useMemo / observer / debounce)

Follow-up to issue abhigyanpatwari#1166 / PR abhigyanpatwari#1175. After fixing HOF callbacks (Promise
fan-out, queryFn pair-arrows, multi-action Zustand stores) and JSX-as-call,
the dominant residual 0%-capture pattern in real React UI codebases was
the HOC-wrapped variable declaration:

  const Button = React.forwardRef((props, ref) => { ... })
  const Card = memo((props) => { ... })
  const handleClick = useCallback(() => { ... }, [])
  const computed = useMemo(() => { ... }, [])
  const debouncedSearch = debounce((q) => { ... }, 250)

All share the AST shape `lexical_declaration > variable_declarator >
call_expression > arguments > arrow_function`. Pre-fix, neither the
registry-primary `query.ts` nor the legacy `tree-sitter-queries.ts` had
a `@declaration.function` pattern matching this shape, and the legacy
DAG's `tsExtractFunctionName` only walked `variable_declarator` and
`pair` parents — `arguments` parents fell through with `funcName = null`.

Result: every shadcn/Radix component, every memoised React component,
and every `useCallback` / `useMemo` callback bound to a const registered
as anonymous; calls inside attributed to the file. Sourcerer-fe audit:
~296 declarations affected (~57 forwardRef + ~21 memo + ~161 useCallback
+ ~57 useMemo).

Fix:
  - 4 new tree-sitter patterns in `languages/typescript/query.ts`
    (registry-primary), anchored on the inner arrow_function /
    function_expression — same anchor discipline as the existing
    `lexical_declaration` and `pair` patterns from PR abhigyanpatwari#1175.
  - 8 mirrored patterns in `tree-sitter-queries.ts` (4 in
    TYPESCRIPT_QUERIES, 4 in JAVASCRIPT_QUERIES) for the legacy DAG
    and the CI parity gate.
  - New `arguments`-parent branch in `tsExtractFunctionName` that
    walks `arguments → call_expression → variable_declarator` and
    returns the const's name. Three guards keep it strictly scoped
    to HOC-wrapped declarations; bare statement-level HOC calls fall
    through anonymous.
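
The new `arguments`-parent walk can be sketched against a simplified node shape. `NodeSketch` is a stand-in for tree-sitter's `SyntaxNode` (which exposes the declarator name through a field lookup rather than a `name` property), and the type checks below are illustrative; they are not necessarily the three guards the commit refers to.

```typescript
// Simplified stand-in for a tree-sitter node.
interface NodeSketch {
  type: string;
  parent: NodeSketch | null;
  name?: string; // only meaningful on variable_declarator in this sketch
}

// Walk arguments -> call_expression -> variable_declarator and return the
// bound const's name, or null when the function is not HOC-wrapped.
function nameFromHocWrappedFunction(fn: NodeSketch): string | null {
  const args = fn.parent;
  if (!args || args.type !== "arguments") return null;

  const call = args.parent;
  if (!call || call.type !== "call_expression") return null;

  const declarator = call.parent;
  if (!declarator || declarator.type !== "variable_declarator") return null;

  return declarator.name ?? null;
}
```

A bare statement-level call such as `useCallback(() => {})` has no `variable_declarator` above the call, so the walk returns `null` and the function stays anonymous.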

Tests:
  - 11 integration tests + 9 minimal TS/TSX fixtures exercising
    forwardRef / memo / useCallback / useMemo / observer / debounce,
    with positive (named-Function + correct CALLS edge), negative
    (no phantom Functions for unbound HOCs, no phantom self-loops,
    no first-sibling-wins leakage), and cross-pollination assertions.
  - 8 new unit tests in `call-attribution-issue-1166.test.ts`
    pinning the legacy-DAG path: 6 attribution tests + 2
    @definition.function capture tests.

Trade-off documented inline: chained array-method declarations
(`const x = arr.find((y) => p(y))`) match the same shape and produce
a mostly-harmless phantom `Function:x` with one outgoing edge. The
false-positive cost is negligible vs. the React UI coverage gain.

Verification:
  - 11/11 typescript-hoc-wrapped (registry-primary)
  - 26/26 call-attribution-issue-1166 (8 new + 18 pre-existing)
  - 266/266 across all 4 typescript resolver test files (registry)
  - 236/236 typescript.test.ts on legacy DAG (CI parity gate)
  - 1693/1693 across all non-Kotlin/Swift resolver test files
  - tsc --noEmit clean; prettier clean; eslint clean (no new warnings)

Co-authored-by: Cursor <cursoragent@cursor.com>

* test(typescript): pin documented HOC trade-offs and close var-form parity gap

Addresses the four findings from the Claude bot review on PR abhigyanpatwari#1261.
All findings flagged missing assertion tests for behaviour already documented
in code comments — none reported a real bug. The verdict was
"production-ready with minor follow-ups"; these tests strengthen the
documentation-to-test contract.

[medium #1] Array-method false-positive
  Pin `const found = items.find((item) => predicate(item))` →
  `predicate.attributedTo === 'found'` as an accepted FP. The const is a
  value, never invoked, so no incoming CALLS edge ever points at it; the
  outgoing edge is a minor mis-attribution we accept rather than maintain
  a HOC allowlist.

[medium #2] Nested HOCs (`memo(forwardRef(...))`) — no phantom Function:Wrapped
  Two integration tests in `typescript-hoc-wrapped.test.ts`:
    1. `Wrapped` is NOT a Function node (the outer call's first arg is a
       call_expression, not an arrow — no @declaration.function pattern
       matches the outer shape).
    2. The deepest arrow's `helper()` call is NOT attributed to
       Function:Wrapped (the deepest arrow is anonymous because
       call_expression.parent is `arguments`, not `variable_declarator`),
       and no Function-sourced CALLS originate from `nested.tsx`.

[medium #3] Multi-arrow argument dedup
  Pin `const x = call(() => first(), () => second())` — both arrows share
  the same `arguments → call_expression → variable_declarator` ancestor
  chain on the legacy DAG, so both attribute to "x". Documents the
  registry-primary dedup story alongside.

[low #4] `var X = HOC(...)` parity gap
  Registry-primary `query.ts` had `(variable_declaration ...)` HOC patterns
  but legacy `tree-sitter-queries.ts` (TS + JS) did not. Closes the gap by
  mirroring two `(variable_declaration ...)` HOC patterns into both legacy
  sections so the parity gate stays tight even if a codebase mixes
  `var X = HOC(...)` with `const X = HOC(...)`.

Validation
  - Targeted: 41/41 (28 unit + 13 integration) on registry-primary.
  - Broader TS suite: 60/60 across 4 resolver test files.
  - CI parity gate (`typescript.test.ts`): 236/236 on legacy DAG and 236/236
    on registry-primary.
  - Prettier clean. ESLint clean (5 pre-existing non-null-assertion
    warnings in the test file, unrelated). tsc --noEmit clean.

Co-authored-by: Cursor <cursoragent@cursor.com>

---------

Co-authored-by: Cursor <cursoragent@cursor.com>