feat(java,kotlin): MethodExtractor abstraction with per-language configs #576

Merged
magyargergo merged 4 commits into main from feat/method-extractor-types-factory-jvm
Mar 28, 2026

@magyargergo

Collaborator

Summary

Closes #569 (PR 1 of RFC #568)

  • Add structured method metadata extraction mirroring the FieldExtractor pattern
  • Config + factory approach (no class hierarchy) — createMethodExtractor() driven by MethodExtractionConfig
  • Java and Kotlin configs with isAbstract/isFinal detection, structured ParameterInfo, annotations, visibility
  • Worker integration with overload-safe methodInfoCache (keyed by name:line, not name alone)
  • isAbstract, isFinal, annotations added to NodeProperties in gitnexus-shared

New Files (3)

| File | Purpose |
| --- | --- |
| method-types.ts | MethodInfo, ParameterInfo, MethodExtractionConfig, MethodExtractor, ExtractedMethods |
| method-extractors/generic.ts | createMethodExtractor() factory |
| method-extractors/configs/jvm.ts | Java + Kotlin configs |

Modified Files (5)

| File | Change |
| --- | --- |
| gitnexus-shared/src/graph/types.ts | isAbstract?, isFinal?, annotations? on NodeProperties |
| language-provider.ts | methodExtractor?: MethodExtractor on LanguageProviderConfig |
| workers/parse-worker.ts | methodInfoCache, getMethodInfo(), Method/Function enrichment with MethodExtractor-first fallback |
| languages/java.ts | Wire methodExtractor |
| languages/kotlin.ts | Wire methodExtractor |

Key Design Decisions

  1. Config + factory only — no BaseMethodExtractor class (11/12 field extractors use the factory, base class is dead weight)
  2. isAbstract(node, ownerNode) — two-param signature needed for interface method detection (Java bodyless methods, Kotlin interface keyword child)
  3. Overload-safe cache: Map<string, MethodInfo> keyed by name:line composite, not name alone, to support Java/Kotlin method overloading
  4. MethodExtractor-first enrichment — tries MethodExtractor before extractMethodSignature(), avoiding double AST walks
  5. Simpler context: MethodExtractorContext omits typeEnv/symbolTable (methods don't need type resolution)
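
To illustrate decision 3, here is a minimal sketch of why a name:line composite key preserves overloads where a name-only key does not. The MethodInfoSketch shape and the cacheKey/buildCache helpers are hypothetical stand-ins, not the PR's actual types or functions.

```typescript
// Hypothetical, simplified shape; the PR's real MethodInfo carries more fields.
interface MethodInfoSketch {
  name: string;
  line: number;
  parameterTypes: string[];
}

// Composite key: overloads share a name but not a declaration line.
function cacheKey(name: string, line: number): string {
  return `${name}:${line}`;
}

function buildCache(methods: MethodInfoSketch[]): Map<string, MethodInfoSketch> {
  const cache = new Map<string, MethodInfoSketch>();
  for (const m of methods) {
    cache.set(cacheKey(m.name, m.line), m);
  }
  return cache;
}

// Three overloads of the same method name, as in the PR's overload test.
const overloads: MethodInfoSketch[] = [
  { name: 'find', line: 10, parameterTypes: [] },
  { name: 'find', line: 14, parameterTypes: ['String'] },
  { name: 'find', line: 18, parameterTypes: ['String', 'int'] },
];

const byNameAndLine = buildCache(overloads);

// For contrast: a name-only key lets each overload overwrite the previous one.
const byNameOnly = new Map<string, MethodInfoSketch>();
for (const m of overloads) {
  byNameOnly.set(m.name, m);
}
```

In this sketch all three overloads survive in byNameAndLine, while byNameOnly keeps only the last one. Two overloads declared on the same source line would still collide under a name:line key, which is presumably rare enough to accept.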

Schema Note

isAbstract, isFinal, annotations are populated in the in-memory graph but not yet persisted to LadybugDB — the METHOD_SCHEMA and CSV generator need column additions in a follow-up PR.

Test Plan

  • 23 unit tests: classes, interfaces, abstract classes, enums, overloads, varargs, annotations, visibility, static/final
  • Java: public/private/package-private, abstract methods, interface default methods, enum methods
  • Kotlin: public (default), private, abstract methods, interface methods
  • Overload test: 3 methods with same name, different params — all survive cache
  • tsc --noEmit clean
  • Pre-commit hooks pass (eslint, prettier, typecheck)
  • Existing 46 field-extraction tests unaffected

Post-Deploy Monitoring & Validation

No additional operational monitoring required: this is a build-time ingestion enrichment with no runtime API surface change. Validate by running gitnexus analyze on a Java repo and checking Method node properties via mcp__gitnexus__cypher.

…igs (#569)

Add structured method metadata extraction mirroring the FieldExtractor
pattern. Enables isAbstract/isFinal detection, structured ParameterInfo,
annotations, and visibility for Java and Kotlin methods.

New files: method-types.ts, method-extractors/generic.ts, configs/jvm.ts
Modified: NodeProperties, LanguageProviderConfig, parse-worker, language providers
Tests: 23 cases covering classes, interfaces, abstract classes, enums, overloads
@vercel

vercel Bot commented Mar 28, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| gitnexus | Ready | Preview, Comment | Mar 28, 2026 9:20pm |

@magyargergo

Collaborator Author

@claude ultra-think while identifying Kotlin and Java language gaps. Perform this analysis on this pull request and verify those findings. Act as a senior compiler front-end engineer and expert in static analysis tools, review this change from that perspective, and review the architectural fit. Also check out previous comments and reason with their requests.

@github-actions

github-actions Bot commented Mar 28, 2026

Contributor

Claude finished @magyargergo's task in 4m 52s


Deep Analysis: Java/Kotlin Language Gaps & Compiler Front-End Review

Tasks

  • Post initial comment
  • Read all changed files in the PR
  • Analyze Java config for language coverage gaps
  • Analyze Kotlin config for language coverage gaps
  • Review generic factory architecture
  • Review parse-worker integration
  • Review test coverage adequacy
  • Cross-check previous comments/requests
  • Post comprehensive review

Overall Assessment

The config+factory approach is sound and the FieldExtractor parallel is well-executed. The core plumbing (cache keying, fallback chain, isAbstract two-param signature) reflects real understanding of the problem space. That said, as a compiler front-end engineer reviewing this against the Java Language Specification and Kotlin Language Reference, there are 5 correctness gaps that will silently produce wrong data and several semantic inaccuracies worth addressing before the schema hardens.


🔴 Critical Gaps — Silent Data Loss

1. Java: annotation_type_element_declaration not in methodNodeTypes

File: method-extractors/configs/jvm.ts:108

methodNodeTypes: ['method_declaration', 'constructor_declaration'],

Methods inside @interface declarations are annotation_type_element_declaration nodes in tree-sitter-java, not method_declaration. annotation_type_declaration is correctly listed in typeDeclarationNodes, but with no matching node type in methodNodeTypes, every annotation element is silently dropped:

@interface MyAnnotation {
    String value();          // annotation_type_element_declaration — NEVER EXTRACTED
    int count() default 0;   // annotation_type_element_declaration — NEVER EXTRACTED
}

These are the most common entry points for Spring/Jakarta annotation processors.

2. Java: compact_constructor_declaration not in methodNodeTypes

Java 16+ records use compact constructors which parse as compact_constructor_declaration, not constructor_declaration:

public record Point(int x, int y) {
    public Point {          // compact_constructor_declaration — NEVER EXTRACTED
        if (x < 0) throw new IllegalArgumentException();
    }
}

record_declaration is in typeDeclarationNodes. Its body is class_body. But the compact constructor node type isn't in methodNodeTypes, so it's dropped.

3. Kotlin: companion_object missing from typeDeclarationNodes

File: method-extractors/configs/jvm.ts:237

typeDeclarationNodes: ['class_declaration', 'object_declaration'],

In tree-sitter-kotlin, companion object { ... } parses as a companion_object node — distinct from object_declaration. Since companion_object is not in typeDeclarationNodes, all functions inside companion objects are never extracted. This is the Kotlin equivalent of Java static methods and likely the most common location for factory methods, utility functions, and @JvmStatic annotations in real Kotlin codebases:

class UserService {
    companion object {
        fun create(): UserService = UserService()  // NEVER EXTRACTED
        @JvmStatic fun getInstance() = instance    // NEVER EXTRACTED
    }
}

A note on bodyNodeTypes: the companion_object body is class_body, which is already covered, but companion_object itself still needs to be recognized as a type container.

4. Kotlin: vararg always produces isVariadic: false

File: method-extractors/configs/jvm.ts:200

The comment // Kotlin uses vararg keyword, handled separately at line 200 is dangling — there's no separate handling anywhere. vararg args: String will always emit isVariadic: false:

fun log(vararg messages: String) { ... }
// messages.isVariadic === false  ← incorrect

In tree-sitter-kotlin, vararg appears as a parameter_modifier within a parameter node's parameter_modifiers child.
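
A rough sketch of what the missing vararg check could look like. NodeSketch is a hypothetical minimal stand-in for a tree-sitter node, and because the position of parameter_modifiers (child of the parameter or preceding sibling) may depend on the grammar version, this sketch assumes neither and checks both.

```typescript
// Hypothetical minimal node shape standing in for a tree-sitter SyntaxNode.
interface NodeSketch {
  type: string;
  text: string;
  children: NodeSketch[];
}

// True when the given node is a parameter_modifiers list containing `vararg`.
function hasVarargModifier(modifiers: NodeSketch | undefined): boolean {
  if (!modifiers || modifiers.type !== 'parameter_modifiers') return false;
  return modifiers.children.some((c) => c.text === 'vararg');
}

// `siblings` is the parent's child list; `index` is the parameter's position in it.
function isKotlinVariadic(siblings: NodeSketch[], index: number): boolean {
  const param = siblings[index];
  if (!param) return false;
  const ownModifiers = param.children.find((c) => c.type === 'parameter_modifiers');
  const previous = index > 0 ? siblings[index - 1] : undefined;
  return hasVarargModifier(ownModifiers) || hasVarargModifier(previous);
}

// Small builder for the example trees below.
const mk = (type: string, text: string, children: NodeSketch[] = []): NodeSketch => ({
  type,
  text,
  children,
});
const varargMods = mk('parameter_modifiers', 'vararg', [mk('parameter_modifier', 'vararg')]);

// Layout A (assumed): the modifier list is a child of the parameter.
const childLayout = [mk('parameter', 'vararg messages: String', [varargMods])];
// Layout B (assumed): the modifier list precedes the parameter as a sibling.
const siblingLayout = [varargMods, mk('parameter', 'messages: String')];
// A plain parameter with no modifiers at all.
const plainLayout = [mk('parameter', 'count: Int')];
```

Under these assumptions, isKotlinVariadic returns true for both vararg layouts and false for the plain parameter.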

5. Java: Enum constant body methods not extracted

Java enum constants can override methods in their anonymous class body:

enum Planet {
    EARTH(9.8) {
        @Override
        public double surfaceWeight(double m) { return m * 9.8; }  // NEVER EXTRACTED
    };
    public abstract double surfaceWeight(double m);
}

extractMethodsFromBody at generic.ts:96 only iterates direct namedChildren of the body node. It doesn't recurse into enum_constant.class_body. For most use cases this is acceptable, but it's a silent gap for abstract-enum patterns common in Java.


🟡 Significant Semantic Gaps

6. Kotlin isFinal semantics are inverted

File: method-extractors/configs/jvm.ts:286-289

isFinal(node) {
  // Kotlin functions are final by default (unless `open`), but we only flag explicit `final`
  return hasModifier(node, 'modifiers', 'final');
},

In Kotlin, every function is closed (final) by default. A function becomes overridable only with explicit open. The current implementation returns isFinal: false for nearly all Kotlin methods (since explicit final is almost never written). Consumers of this data will believe Kotlin methods are overridable when they're not.

The correct semantic for Kotlin would be:

// A Kotlin function is open ONLY when explicitly marked `open`
// i.e., isFinal = !hasModifier(node, 'modifiers', 'open')

This is arguably a schema-level decision, but at minimum the current behavior is misleading without an isOpen flag.

7. ownerFqn is not an FQN

File: method-extractors/generic.ts:39-52

Despite the field name ownerFqn, the value is the simple class name only (UserService, not com.example.UserService). For a static analysis tool, this creates cross-package collision risk when unrelated classes share a name. The field should be renamed ownerSimpleName or the factory should receive package context. The current MethodExtractorContext omits file-path-to-package resolution which would be needed to produce real FQNs.

8. Missing Java throws clause

For API surface analysis, checked exceptions are part of the Java method signature. A method annotated throws IOException, SQLException carries contracts relevant to call-graph analysis. Not in MethodInfo. This is a schema gap to track for the next iteration.

9. Missing Kotlin suspend modifier

suspend fun fetch(): Response is fundamentally different from fun fetch(): Response in Kotlin — it can only be called from coroutine context. This is as significant as isStatic for call-graph accuracy. Not captured anywhere.


🔵 Architectural Observations

10. annotations not propagated to ParsedSymbol

File: workers/parse-worker.ts:1827-1841

isAbstract, isFinal, and annotations are written to result.nodes (line 1810-1812) but not to result.symbols. The ParsedSymbol interface doesn't include these fields. This means call resolution and overload dispatch operate without annotation context. Spring @Transactional/@Cacheable annotations on methods are in the graph nodes but invisible to the symbol table.

11. findBodies fallback path is brittle for future configs

File: method-extractors/generic.ts:80-83

if (result.length === 0 && bodyField) {
  result.push(bodyField);
  addNestedBodies(bodyField, bodyNodeSet, result);
}

This fallback silently pushes any bodyField if nothing matched bodyNodeSet. A config that has a typo in bodyNodeTypes (e.g., 'class_boy' instead of 'class_body') would silently use the fallback without error. For the current JVM configs this is fine since class_body is correct, but the pattern is fragile as more languages are added.

12. isFinal Java: redundant double-check

File: method-extractors/configs/jvm.ts:143-145

isFinal(node) {
  return hasKeyword(node, 'final') || hasModifier(node, 'modifiers', 'final');
},

In tree-sitter-java, final always appears inside a modifiers node when it modifies a method. hasKeyword does a flat text scan of node children which duplicates what hasModifier already does correctly. Not a bug but hasKeyword is unlikely to find cases hasModifier misses. Same pattern applies to isStatic and isAbstract.

13. Java primitive vararg type extraction may fail

File: method-extractors/configs/jvm.ts:72-84

} else if (
  c.type === 'type_identifier' ||
  c.type === 'generic_type' ||
  c.type === 'scoped_type_identifier'
) {

For int... nums or double... args, the type node is integral_type or floating_point_type in tree-sitter-java, not type_identifier. The result: primitive vararg types will be null instead of "int".


What's Well-Designed

  • Two-param isAbstract(node, ownerNode): Correctly handles interface method detection without requiring callers to thread extra context. The two-test pattern (explicit modifier first, then structural check) is the right order.
  • Overload-safe name:line cache key: Clean solution. Map<string, MethodInfo> with composite key avoids the name-collision problem correctly.
  • methodInfoCache keyed by startIndex + cleared per file: Correct lifecycle. No cross-file contamination possible.
  • MethodExtractor-first fallback chain: The enrichedByMethodExtractor flag at parse-worker.ts:1707 ensures no double AST walks and a clean escape hatch for unsupported languages.
  • Config-only approach (no class hierarchy): The right call. The field extractor base class is dead weight as noted.
  • extractAnnotations being optional in config: Good — forces intentional opt-in rather than empty-by-default at the config level.

Test Coverage Gaps

The 23 tests are solid for happy paths but are missing:


Summary Priority Table

| # | Gap | Language | Severity | Fix Complexity |
| --- | --- | --- | --- | --- |
| 1 | annotation_type_element_declaration missing | Java | 🔴 Silent data loss | Low |
| 2 | compact_constructor_declaration missing | Java | 🔴 Silent data loss | Low |
| 3 | companion_object not a type container | Kotlin | 🔴 Silent data loss | Medium |
| 4 | vararg always isVariadic: false | Kotlin | 🔴 Wrong data | Low |
| 5 | Enum constant body methods not extracted | Java | 🔴 Silent data loss | Medium |
| 6 | isFinal semantics inverted | Kotlin | 🟡 Misleading | Medium (schema) |
| 7 | ownerFqn is not FQN | Both | 🟡 Misleading name | Low (rename) |
| 8 | Missing throws clause | Java | 🟡 Schema gap | Medium |
| 9 | Missing suspend modifier | Kotlin | 🟡 Critical for coroutines | Medium |
| 10 | annotations missing from ParsedSymbol | Both | 🔵 Arch gap | Low |
| 11 | Primitive vararg type loss | Java | 🔵 Minor inaccuracy | Low |

Items 1-4 are straightforward fixes that could land in this PR. Items 3 and 5-9 may warrant a follow-up PR (in line with the RFC #568 series).

@github-actions

github-actions Bot commented Mar 28, 2026

Contributor

CI Report

All checks passed

Pipeline Status

| Stage | Status | Details |
| --- | --- | --- |
| ✅ Typecheck | success | tsc --noEmit |
| ✅ Tests | success | unit tests, 3 platforms |
| ✅ E2E | success | gitnexus-web changes only |

Test Results

| Tests | Passed | Failed | Skipped | Duration |
| --- | --- | --- | --- | --- |
| 4752 | 4706 | 0 | 46 | 181s |

✅ All 4706 tests passed

46 test(s) skipped — expand for details
  • buildTypeEnv > known limitations (documented skip tests) > Ruby block parameter: users.each { |user| } — closure param inference, different feature
  • Swift constructor-inferred type resolution > detects User and Repo classes, both with save methods
  • Swift constructor-inferred type resolution > resolves user.save() to Models/User.swift via constructor-inferred type
  • Swift constructor-inferred type resolution > resolves repo.save() to Models/Repo.swift via constructor-inferred type
  • Swift constructor-inferred type resolution > emits exactly 2 save() CALLS edges (one per receiver type)
  • Swift self resolution > detects User and Repo classes, each with a save function
  • Swift self resolution > resolves self.save() inside User.process to User.save, not Repo.save
  • Swift parent resolution > detects BaseModel and User classes plus Serializable protocol
  • Swift parent resolution > emits EXTENDS edge: User → BaseModel
  • Swift parent resolution > emits IMPLEMENTS edge: User → Serializable (protocol conformance)
  • Swift cross-file User.init() inference > resolves user.save() via User.init(name:) inference
  • Swift cross-file User.init() inference > resolves user.greet() via User.init(name:) inference
  • Swift return type inference > detects User class and getUser function
  • Swift return type inference > detects save function on User (Swift class methods are Function nodes)
  • Swift return type inference > resolves user.save() to User#save via return type of getUser() -> User
  • Swift return-type inference via function return type > resolves user.save() to User#save via return type of getUser()
  • Swift return-type inference via function return type > user.save() does NOT resolve to Repo#save
  • Swift return-type inference via function return type > resolves repo.save() to Repo#save via return type of getRepo()
  • Swift implicit imports (cross-file visibility) > detects UserService class in Models.swift
  • Swift implicit imports (cross-file visibility) > resolves UserService() constructor call across files (no explicit import)
  • Swift implicit imports (cross-file visibility) > resolves service.fetchUser() member call across files
  • Swift implicit imports (cross-file visibility) > creates IMPORTS edges between files in the same module
  • Swift extension deduplication > detects Product class
  • Swift extension deduplication > resolves Product() constructor despite extension creating duplicate class node
  • Swift extension deduplication > resolves product.save() to Product.swift (primary definition)
  • Swift constructor call fallback (no new keyword) > resolves OCRService() as constructor call across files
  • Swift constructor call fallback (no new keyword) > resolves ocr.recognize() member call via constructor-inferred type
  • Swift export visibility (internal vs private) > resolves PublicService() constructor across files
  • Swift export visibility (internal vs private) > resolves internalHelper() across files (internal = module-scoped)
  • Swift if let / guard let binding resolution > detects User and Repo classes
  • Swift if let / guard let binding resolution > resolves user.save() inside if-let to User#save
  • Swift if let / guard let binding resolution > resolves repo.save() inside guard-let to Repo#save
  • Swift if let / guard let binding resolution > user.save() in if-let does NOT resolve to Repo#save
  • Swift await / try expression unwrapping > resolves user.save() via await fetchUser() return type
  • Swift await / try expression unwrapping > resolves repo.save() via try parseRepo() return type
  • Swift await / try expression unwrapping > detects fetchUser and parseRepo as functions
  • Swift for-in loop element type inference > detects User and Repo classes
  • Swift for-in loop element type inference > creates implicit import edges between files
  • Swift field-type resolution > detects classes and their properties
  • Swift field-type resolution > emits HAS_PROPERTY edges from class to field
  • Swift field-type resolution > resolves field-chain call user.address.save() → Address#save
  • Swift field-type resolution > emits ACCESSES edges for field reads in chains
  • Swift field-type resolution > populates field metadata (visibility, declaredType) on Property nodes
  • Swift call-result binding > resolves call-result-bound method call user.save() → User#save
  • Swift call-result binding > getUser() is present as a defined function
  • Swift call-result binding > emits processUser -> getUser CALLS edge for let-assigned free function call

Code Coverage

Tests

| Metric | Coverage | Covered | Base | Delta | Status |
| --- | --- | --- | --- | --- | --- |
| Statements | 70.47% | 12838/18216 | 70.32% | 📈 +0.2 | 🟢 ██████████████░░░░░░ |
| Branches | 59.8% | 8445/14122 | 59.69% | 📈 +0.1 | 🟢 ███████████░░░░░░░░░ |
| Functions | 75.25% | 1128/1499 | 74.88% | 📈 +0.4 | 🟢 ███████████████░░░░░ |
| Lines | 72.65% | 11686/16085 | 72.51% | 📈 +0.1 | 🟢 ██████████████░░░░░░ |

📋 View full run · Generated by CI

…nstructors, varargs

- Add annotation_type_element_declaration to Java methodNodeTypes
- Add compact_constructor_declaration to Java methodNodeTypes
- Add primitive types (integral_type, floating_point_type, boolean_type) to
  Java vararg type extraction
- Implement Kotlin vararg detection via parameter_modifiers
- Add tests: annotation elements, compact constructors, primitive varargs,
  Kotlin vararg
@magyargergo

Collaborator Author

@claude ultra-think while identifying Kotlin and Java language gaps. Perform this analysis on this pull request and verify those findings. Act as a senior compiler front-end engineer and expert in static analysis tools, review this change from that perspective, and review the architectural fit. Also check out previous comments and reason with their requests.

@github-actions

github-actions Bot commented Mar 28, 2026

Contributor

Claude finished @magyargergo's task in 4m 39s


Deep Analysis: Java/Kotlin Language Gaps & Compiler Front-End Review

Tasks

  • Post initial comment
  • Read all changed files in the PR
  • Verify previous Claude findings against actual tree-sitter node types
  • Analyze Java config for language coverage gaps (deep)
  • Analyze Kotlin config for language coverage gaps (deep)
  • Review generic factory architecture
  • Review parse-worker integration
  • Review test coverage adequacy
  • Post comprehensive review

Correction of Previous Analysis

The prior review was generated against an earlier revision of the branch. All four of its 🔴 "critical silent data loss" items and one "minor inaccuracy" are already implemented and tested in the current code. This needs to be stated clearly before discussing what's actually missing.

| Previous Finding | Actual Status |
| --- | --- |
| annotation_type_element_declaration not in methodNodeTypes | False; present at jvm.ts:115, test at method-extraction.test.ts:293 |
| compact_constructor_declaration not in methodNodeTypes | False; present at jvm.ts:114, test at method-extraction.test.ts:311 |
| Kotlin vararg always isVariadic: false | False; handled at jvm.ts:172–183 via parameter_modifiers sibling detection, test at method-extraction.test.ts:411 |
| Java primitive vararg type loss | False; integral_type, floating_point_type, boolean_type all handled at jvm.ts:82–84, test at method-extraction.test.ts:329 |

The remaining findings from the prior comment (companion_object, isFinal semantics, ownerFqn naming, throws, suspend, annotations/ParsedSymbol, findBodies brittleness) are valid and confirmed below. Additional gaps not in the prior review are also identified.


Overall Assessment

The design is solid. The MethodExtractor pattern mirrors the FieldExtractor correctly, the overload-safe name:line cache key is the right call, and the MethodExtractor-first with fallback approach in the parse-worker avoids double AST walks cleanly. The two-param isAbstract(node, ownerNode) signature correctly models the domain without over-engineering.

What follows is the verified gap list from this review.


🔴 Confirmed Real Gaps

1. Kotlin companion_object absent from typeDeclarationNodes

jvm.ts:257

typeDeclarationNodes: ['class_declaration', 'object_declaration'],

companion_object is missing. In tree-sitter-kotlin a companion object body is class_body, already covered in bodyNodeTypes. The fix requires two coordinated changes (a and b) plus one schema decision (c):

a) Add 'companion_object' to typeDeclarationNodes.

b) Handle anonymous companions in ownerFqn extraction. The generic factory (generic.ts:39–52) looks for childForFieldName('name') then falls back to a type_identifier child — neither exists for unnamed companion objects (companion object { ... }). The result: ownerFqn is undefined and extract returns null (line 52). Named companions (companion object Factory { ... }) have a simple_identifier child, which the current fallback also misses (it looks for type_identifier). The ownerFqn resolution in the factory needs a Kotlin-aware path, or the config needs to supply a name-extraction override for the owner.

c) isStatic for companion object members. The Kotlin config hardcodes isStatic: false for all functions. Functions in a companion object are accessed as statics on the JVM. Whether to mark them isStatic: true is a schema decision, but it should be deliberate. Note that object_declaration (non-companion singletons) has the same issue — its functions are also effectively static on the JVM.

This is the only remaining 🔴 gap. All other companion object issues flow from this one.
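
One possible shape for the owner-name resolution described in (b), as a sketch only. OwnerNode is a hypothetical flattened stand-in for a tree-sitter node (nameField models childForFieldName('name')), and resolveOwnerName is an illustrative name, not the factory's actual function. The "Companion" fallback mirrors Kotlin's default name for an unnamed companion object.

```typescript
// Hypothetical flattened view of a type-declaration node.
interface OwnerNode {
  type: string;
  nameField?: string; // stands in for childForFieldName('name')?.text
  children: { type: string; text: string }[];
}

function resolveOwnerName(node: OwnerNode): string | undefined {
  // 1. Explicit name field, when the grammar exposes one.
  if (node.nameField) return node.nameField;
  // 2. type_identifier child (the existing fallback described above).
  const typeId = node.children.find((c) => c.type === 'type_identifier');
  if (typeId) return typeId.text;
  // 3. simple_identifier child, needed for named companions.
  const simpleId = node.children.find((c) => c.type === 'simple_identifier');
  if (simpleId) return simpleId.text;
  // 4. Unnamed companion objects fall back to Kotlin's default name.
  return node.type === 'companion_object' ? 'Companion' : undefined;
}

const unnamedCompanion: OwnerNode = { type: 'companion_object', children: [] };
const namedCompanion: OwnerNode = {
  type: 'companion_object',
  children: [{ type: 'simple_identifier', text: 'Factory' }],
};
const regularClass: OwnerNode = {
  type: 'class_declaration',
  children: [{ type: 'type_identifier', text: 'UserService' }],
};
```

With this ordering, the unnamed companion resolves to "Companion", the named one to "Factory", and the class to "UserService"; any other unnamed node still yields undefined, so the extractor can keep returning null for it.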

2. Java enum constant body methods not extracted

generic.ts:96–113

extractMethodsFromBody only iterates direct namedChildren of the body node. Enum constants that override abstract methods in their anonymous class body are silently dropped:

enum Operation {
  PLUS { @Override public double apply(double x, double y) { return x + y; } };
  public abstract double apply(double x, double y);
}

This is common in DDD value objects and FSM-style enums. The body traversal would need to recurse into enum_constant → class_body. Scoped to a follow-up is fine, but it should be tracked.
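
A minimal sketch of the recursion this would need, using a hypothetical TreeNode shape rather than real tree-sitter nodes. The node type names follow the ones mentioned in this review; the tree layout is simplified, and the dotted owner path (Operation.PLUS) is an illustrative choice, not the PR's naming.

```typescript
// Hypothetical minimal tree; real tree-sitter nodes expose far more than this.
interface TreeNode {
  type: string;
  name?: string;
  namedChildren: TreeNode[];
}

const METHOD_NODE_TYPES = new Set(['method_declaration', 'constructor_declaration']);

// Collects method names from a body, descending into the anonymous class
// body of each enum constant instead of stopping at direct children.
function collectMethodNames(body: TreeNode, owner: string, out: string[] = []): string[] {
  for (const child of body.namedChildren) {
    if (METHOD_NODE_TYPES.has(child.type)) {
      out.push(`${owner}.${child.name ?? '?'}`);
    } else if (child.type === 'enum_constant') {
      const constantOwner = `${owner}.${child.name ?? '?'}`;
      for (const inner of child.namedChildren) {
        if (inner.type === 'class_body') collectMethodNames(inner, constantOwner, out);
      }
    }
  }
  return out;
}

// Simplified tree for the Operation enum shown above.
const operation: TreeNode = {
  type: 'enum_body',
  namedChildren: [
    {
      type: 'enum_constant',
      name: 'PLUS',
      namedChildren: [
        {
          type: 'class_body',
          namedChildren: [{ type: 'method_declaration', name: 'apply', namedChildren: [] }],
        },
      ],
    },
    { type: 'method_declaration', name: 'apply', namedChildren: [] },
  ],
};

const found = collectMethodNames(operation, 'Operation');
```

Here found contains both Operation.PLUS.apply and Operation.apply, whereas a direct-children-only walk would return just the abstract declaration.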


🟡 Semantic Gaps

3. isFinal Kotlin semantics: deliberate but architecturally misleading

jvm.ts:306–308

isFinal(node) {
  // Kotlin functions are final by default (unless `open`), but we only flag explicit `final`
  return hasModifier(node, 'modifiers', 'final');
},

The comment documents the intent correctly. The real question is whether the isFinal field name in MethodInfo means "declared with final" or "is not overridable." In Kotlin, nearly every non-open function satisfies the latter but not the former. Consumers of this data (especially any future "can this method be overridden?" analysis) will misread isFinal: false as "overridable" for almost all Kotlin methods.

If the schema means "explicitly declared final/not-open", then isFinal should be renamed isExplicitlyFinal or supplemented with isOpen. If it means "cannot be overridden at runtime", the implementation should be !hasModifier(node, 'modifiers', 'open'). The current inconsistency between Java's convention (explicit final) and Kotlin's convention (closed by default) will silently mislead callers.
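
If the "cannot be overridden" reading is chosen, the rule could look roughly like the sketch below. It works on a plain set of modifier keywords rather than AST nodes, the function name is hypothetical, and it deliberately ignores interface members and top-level functions, which would need their own handling.

```typescript
type KotlinModifier = 'open' | 'final' | 'abstract' | 'override';

// Sketch of "closed by default": a member function in a class is final
// unless it is abstract, open, or an override not marked final.
function isClosedKotlinFunction(modifiers: ReadonlySet<KotlinModifier>): boolean {
  if (modifiers.has('abstract')) return false; // abstract members must be overridable
  if (modifiers.has('final')) return true; // explicit final, e.g. `final override`
  if (modifiers.has('open') || modifiers.has('override')) return false;
  return true; // no relevant modifier: closed by default
}

// Small helper to build modifier sets for the examples.
const mods = (...m: KotlinModifier[]): ReadonlySet<KotlinModifier> => new Set(m);

const plainFun = isClosedKotlinFunction(mods()); // fun f()
const openFun = isClosedKotlinFunction(mods('open')); // open fun f()
const overrideFun = isClosedKotlinFunction(mods('override')); // override fun f()
const finalOverride = isClosedKotlinFunction(mods('final', 'override')); // final override fun f()
const abstractFun = isClosedKotlinFunction(mods('abstract')); // abstract fun f()
```

The override case matters because an overriding member in Kotlin stays open to further overriding unless it is explicitly marked final.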

4. ownerFqn is a simple name, not an FQN

generic.ts:60 and method-types.ts:36

export interface ExtractedMethods {
  ownerFqn: string;   // always just "UserService", never "com.example.UserService"
  methods: MethodInfo[];
}

The field is correctly documented in the PR description as simple class name. The name ownerFqn carries a false contract. In cross-package analysis this will produce collisions when unrelated packages both contain a class named e.g. UserRepository. Rename to ownerSimpleName or add a note in the type that it is not a qualified name. The package context is available at the file level but not threaded into MethodExtractorContext.

5. Missing suspend modifier (Kotlin)

A suspend function can only be called from coroutine context. It is as semantically significant as isStatic for call-graph accuracy. Not a property of MethodInfo. A static analysis tool that resolves calls into Kotlin suspend fun targets and routes them as regular calls will produce incorrect results. Tracking for the RFC series.

6. Missing throws clause (Java)

Checked exceptions are part of the Java method API surface and affect call-graph correctness. Not in the schema. Tracking for follow-up.


🔵 Architectural Observations

7. annotations not propagated to ParsedSymbol

parse-worker.ts:1827–1841

isAbstract, isFinal, and annotations are written to result.nodes (lines 1810–1812) but not to result.symbols. The ParsedSymbol interface does not include these fields.

result.symbols.push({
  ...
  // isAbstract, isFinal, annotations — NOT HERE
});

The consequence: call resolution and overload dispatch have no annotation context. @Transactional, @Cacheable, @Override, @JvmStatic on methods are dark to the symbol table. Whether ParsedSymbol should carry these fields is a schema question, but the current asymmetry between nodes and symbols is worth a deliberate decision before the schema hardens.

8. findBodies fallback silently masks config typos

generic.ts:80–83

if (result.length === 0 && bodyField) {
  result.push(bodyField);
  addNestedBodies(bodyField, bodyNodeSet, result);
}

If a config has a typo in bodyNodeTypes (e.g. 'class_bod' instead of 'class_body'), the fallback silently uses whatever childForFieldName('body') returns. This provides correct results by accident for current JVM configs, but will produce confusing silent failures for future language configs that haven't been as carefully tested. Consider a dev-mode assertion: if (bodyField && !bodyNodeSet.has(bodyField.type)) warn(...).
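
The suggested assertion could be sketched as follows. checkBodyField and its injected warn callback are hypothetical names for illustration; how the real factory would surface the warning (logger, thrown error in tests, and so on) is left open.

```typescript
// Hypothetical minimal node shape; only the type matters for this check.
interface BodyNode {
  type: string;
}

// Returns true when the body field's type is one the config declared.
// Otherwise reports the mismatch through `warn` and returns false.
function checkBodyField(
  bodyField: BodyNode | null,
  bodyNodeTypes: readonly string[],
  warn: (message: string) => void,
): boolean {
  const bodyNodeSet = new Set(bodyNodeTypes);
  if (bodyField && !bodyNodeSet.has(bodyField.type)) {
    warn(
      `body field has type '${bodyField.type}', which is not listed in ` +
        `bodyNodeTypes [${bodyNodeTypes.join(', ')}]`,
    );
    return false;
  }
  return true;
}

// Usage: a config with the typo from the example above.
const warnings: string[] = [];
const typoResult = checkBodyField({ type: 'class_body' }, ['class_bod'], (m) => warnings.push(m));
const correctResult = checkBodyField({ type: 'class_body' }, ['class_body'], (m) => warnings.push(m));
```

In this usage only the misspelled config produces a warning, so the fallback can stay in place while config typos become visible.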

9. hasKeyword redundancy in Java isStatic / isFinal / isAbstract

jvm.ts:137–138, 141–142, 151–152:

isStatic(node) {
  return hasKeyword(node, 'static') || hasModifier(node, 'modifiers', 'static');
},

In tree-sitter-java, all modifier keywords (static, final, abstract) appear inside a modifiers named child node. hasKeyword scans direct children looking for one whose .text equals the keyword exactly. For a modifiers node wrapping multiple modifiers (e.g. public static), its .text is "public static" — not equal to "static". So hasKeyword returns false and hasModifier does the work.

The only case where hasKeyword fires first is a method with a single modifier (e.g. static void foo()), where the modifiers node text is exactly "static". But hasModifier would also return true in that case. The || short-circuit means hasKeyword is called before the more correct hasModifier on every invocation. No bugs, but remove the hasKeyword half to clarify intent.
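
The behaviour described above can be reproduced with a small sketch. hasKeywordSketch and hasModifierSketch are reconstructions from the description in this comment, not the repository's actual helpers, and Tok is a hypothetical stand-in for a tree-sitter node.

```typescript
// Hypothetical minimal node: a type, its source text, and direct children.
interface Tok {
  type: string;
  text: string;
  children: Tok[];
}

const leaf = (text: string): Tok => ({ type: text, text, children: [] });

// Builds a `modifiers` node whose text is the space-joined keyword list.
function modifiersNode(...keywords: string[]): Tok {
  return { type: 'modifiers', text: keywords.join(' '), children: keywords.map(leaf) };
}

function methodNode(modifiers: Tok): Tok {
  return {
    type: 'method_declaration',
    text: `${modifiers.text} void foo() {}`,
    children: [modifiers],
  };
}

// As described: compares the text of each DIRECT child against the keyword.
function hasKeywordSketch(node: Tok, keyword: string): boolean {
  return node.children.some((c) => c.text === keyword);
}

// As described: looks inside the modifiers child for the keyword.
function hasModifierSketch(node: Tok, modifiersType: string, keyword: string): boolean {
  const mods = node.children.find((c) => c.type === modifiersType);
  return mods !== undefined && mods.children.some((c) => c.text === keyword);
}

const multi = methodNode(modifiersNode('public', 'static')); // public static void foo() {}
const single = methodNode(modifiersNode('static')); // static void foo() {}
```

On the multi-modifier method the direct-child scan sees the text "public static" and misses, while the modifiers-aware check succeeds; on the single-modifier method both agree, so under these assumptions the first check never adds information.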

10. FUNCTION_NODE_TYPES in ast-helpers.ts doesn't cover Java additions

ast-helpers.ts:65–68

FUNCTION_NODE_TYPES, used for enclosing-function detection, contains method_declaration and constructor_declaration but not compact_constructor_declaration or annotation_type_element_declaration. Call sites inside compact constructors or annotation default methods won't be attributed to the correct enclosing function. Low impact today but a consistency gap.


What's Well-Designed

  • Two-param isAbstract(node, ownerNode): Correctly handles interface implicit abstraction without threading external context. The "explicit modifier first, then structural check" ordering is the right priority.
  • name:line cache key: Composite key cleanly solves the overload collision problem that a name-only key would have.
  • MethodExtractor-first fallback chain: enrichedByMethodExtractor flag at parse-worker.ts:1707 ensures clean escape hatch for unsupported languages without double AST walks.
  • Optional extractAnnotations: Forces intentional opt-in; no empty-by-default at config level.
  • Config-only (no base class): Correct. The field extractor base class is unused ballast. No reason to introduce it here.
  • annotation_type_element_declaration and compact_constructor_declaration: Already covered correctly with matching tests. The prior review was wrong about these being missing.
  • Kotlin vararg handling via parameter_modifiers sibling scan: Correct for the installed grammar version, confirmed by the test. The "handled separately" comment in the prior code version was resolved before this review.

Verified Priority Table

| # | Gap | Language | Severity | Already in previous review? |
| --- | --- | --- | --- | --- |
| 1 | companion_object not in typeDeclarationNodes + ownerFqn extraction fails for it | Kotlin | 🔴 Data loss | ✓ partially (missed ownerFqn sub-issue) |
| 2 | Enum constant body methods not extracted | Java | 🔴 Data loss (edge case) | |
| 3 | isFinal semantics (explicit vs default-closed) | Kotlin | 🟡 Misleading | |
| 4 | ownerFqn is simple name, not FQN | Both | 🟡 Misleading contract | |
| 5 | suspend modifier not captured | Kotlin | 🟡 Call-graph accuracy | |
| 6 | throws clause not captured | Java | 🟡 Schema gap | |
| 7 | annotations not in ParsedSymbol | Both | 🔵 Arch asymmetry | |
| 8 | findBodies fallback masks config typos | Both | 🔵 Future fragility | |
| 9 | hasKeyword redundant alongside hasModifier | Java | 🔵 Dead code / clarity | ✓ (called redundancy) |
| 10 | FUNCTION_NODE_TYPES missing compact ctor + annotation element | Java | 🔵 Enclosing-function gap | New |
| 11 | object_declaration functions never isStatic: true | Kotlin | 🟡 JVM semantic gap | New |
| 12 | companion_object ownerFqn extraction sub-issue (simple_identifier vs type_identifier) | Kotlin | 🔴 Would break fix for #1 | New |

Items 1 (with sub-issue 12) and 2 are the only outstanding 🔴 gaps. The rest are reasonable follow-ups in the RFC series. The code quality, test coverage for the happy paths, and architectural fit are all good. Ready to merge once the companion_object issue has a plan (even if the fix lands in a follow-up PR with a tracking issue).


- Add companion_object to Kotlin typeDeclarationNodes with unnamed
  companion fallback ("Companion")
- Recurse into enum constant anonymous class bodies for method extraction
- Fix Kotlin isFinal: closed by default (final unless open/abstract/override)
- Add domain invariant: isAbstract forces isFinal=false
- Mark companion_object/object_declaration members as isStatic=true
- Rename ownerFqn → ownerName (not an FQN, just simple class name)
- Add compact_constructor_declaration + annotation_type_element_declaration
  to FUNCTION_NODE_TYPES in ast-helpers.ts
- Remove hasKeyword redundancy in Java isStatic/isFinal/isAbstract
- Add simple_identifier fallback for named companion ownerName resolution
- Tests: companion objects, enum constant bodies, isFinal semantics,
  domain invariant (34 total, up from 27)
@magyargergo

Collaborator Author

@claude ultra-think while identifying Kotlin and Java language gaps. Perform this analysis on this pull request and verify those findings. Act as a senior compiler front-end engineer and expert in static analysis tools, review this change from that perspective, and review the architectural fit. Also check the previous comments and reason about their requests.

@github-actions

github-actions Bot commented Mar 28, 2026

Contributor

Claude finished @magyargergo's task in 5m 45s


Third-Pass Deep Analysis: Java/Kotlin Language Gaps (Compiler Front-End Perspective)

Tasks

  • Post initial comment
  • Read all changed files (jvm.ts, generic.ts, method-types.ts, parse-worker.ts, ast-helpers.ts, test file)
  • Verify all findings from prior two reviews against current code
  • Deep dive: Kotlin extension functions, property accessors, operator/inline/tailrec modifiers
  • Deep dive: Java native methods, static initializers, sealed classes, synchronized
  • Deep dive: Return type capture, generic type parameters
  • Deep dive: Kotlin expect/actual, object_declaration isStatic semantic
  • Review MethodInfo schema completeness against JVM bytecode model
  • Cross-check test file for coverage gaps
  • Post comprehensive final review

Correction of Both Prior Reviews

Both previous Claude comments were analyzing a stale version of the branch. The current code on feat/method-extractor-types-factory-jvm has already implemented or fixed every item the prior reviews labelled 🔴. Here is the verified ground truth from reading the actual files:

| Prior Finding | Review 1 Said | Review 2 Said | Actual Status in Current Code |
|---------------|---------------|---------------|-------------------------------|
| companion_object absent from typeDeclarationNodes | 🔴 critical | 🔴 confirmed | FIXED: jvm.ts:257 |
| Enum constant body methods not extracted | 🔴 critical | 🔴 confirmed | FIXED: generic.ts:119-126 recurses into enum_constant → class_body |
| vararg always isVariadic: false | 🔴 critical | corrected (false alarm) | IMPLEMENTED: jvm.ts:172-183 via parameter_modifiers sibling scan |
| Primitive vararg type loss (int...) | 🔴 critical | corrected (false alarm) | IMPLEMENTED: jvm.ts:82-84 handles integral_type, floating_point_type, boolean_type |
| annotation_type_element_declaration missing | 🔴 critical | corrected (false alarm) | IMPLEMENTED: jvm.ts:115, ast-helpers.ts:69 |
| compact_constructor_declaration missing | 🔴 critical | corrected (false alarm) | IMPLEMENTED: jvm.ts:114, ast-helpers.ts:68 |
| isFinal Kotlin semantics inverted | 🟡 semantic | 🟡 confirmed gap | FIXED: jvm.ts:306-311 returns true by default, false for open/abstract/override |
| ownerFqn misleading name | 🟡 semantic | 🟡 confirmed | FIXED: field is ownerName in method-types.ts:36 and generic.ts:66 |
| object_declaration never isStatic | 🟡 (new) | 🟡 confirmed | FIXED: generic.ts:20, 145 STATIC_OWNER_TYPES set |
| companion_object ownerFqn extraction sub-issue | 🔴 (new) | 🔴 confirmed | FIXED: generic.ts:41-57 handles unnamed→"Companion", named→simple_identifier |
| hasKeyword redundancy | 🔵 | 🔵 confirmed | FIXED: not present in current jvm.ts |
| FUNCTION_NODE_TYPES missing compact ctor + annotation element | 🔵 (new) | 🔵 confirmed | FIXED: ast-helpers.ts:68-69 |

Both prior reviews were incorrect about the current state of the branch. The code is substantially more complete than either review acknowledged.


What's Genuinely Still Missing (Verified Against Current Code)

🟡 Gap 1 — Kotlin extension functions: no receiverType in schema

This is the only gap both prior reviews missed entirely, and it's the most significant from a compiler front-end perspective.

In tree-sitter-kotlin, fun String.format(template: String): String parses as function_declaration with a receiver type node (user_type: "String") before the function name identifier. The current extractName scans for the first simple_identifier — which correctly skips the user_type and returns "format". Name extraction is correct.

However, MethodInfo has no receiverType field. The fact that this method is bound to String is completely lost. This matters architecturally because:

  1. Extension function dispatch is static (resolved at compile time), unlike virtual dispatch. str.format(...) in bytecode becomes ExtensionKt.format(str, ...) — not a method on String.
  2. Two methods with the same name are not the same: fun Foo.process() and fun Bar.process() are entirely different call targets.
  3. Any future "find callers of X" analysis will conflate extension functions on different receivers if receiverType is absent.

No test covers extension functions. This is a schema gap worth a tracking issue before the CSV schema hardens.

Current behavior: fun String.repeat(n: Int) → { name: "repeat", receiverType: undefined, ... }. The extension receiver is silently discarded.
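
Point 2 above (same name, different receivers) can be illustrated with a small sketch. The `ExtensionAwareMethod` shape and the `callTargetKey` helper are hypothetical; `receiverType` is the proposed field, not something the reviewed schema is confirmed to contain at this point in the thread.

```typescript
// Hypothetical subset of MethodInfo; receiverType is the proposed addition.
interface ExtensionAwareMethod {
  name: string;
  receiverType: string | null; // null for ordinary (non-extension) functions
}

// Builds a lookup key that keeps `Foo.process` and `Bar.process` apart.
function callTargetKey(m: ExtensionAwareMethod): string {
  return m.receiverType === null ? m.name : `${m.receiverType}.${m.name}`;
}

const fooProcess: ExtensionAwareMethod = { name: 'process', receiverType: 'Foo' };
const barProcess: ExtensionAwareMethod = { name: 'process', receiverType: 'Bar' };
const plainProcess: ExtensionAwareMethod = { name: 'process', receiverType: null };

const keys = [fooProcess, barProcess, plainProcess].map(callTargetKey);
```

Without `receiverType`, all three entries would collapse to the single key `process`, which is the conflation the review warns about.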

🟡 Gap 2 — Java compact_constructor_declaration yields parameters: []

jvm.ts:48-99: extractJavaParameters calls node.childForFieldName('parameters') at line 50. Compact constructors have no explicit parameter list in the AST; the parameters are the enclosing record's components. This call returns null, so parameters: [] is emitted.

public record Point(int x, int y) {
    public Point {          // compact_constructor_declaration
        Objects.requireNonNull(...);
        // implicitly has parameters x:int, y:int — but MethodInfo.parameters = []
    }
}

Fixing this properly requires accessing the parent record_declaration's formal_parameters, which means the factory would need a parent-node traversal path — a non-trivial change. The correct interim behaviour is to document this in a comment at the compact_constructor_declaration case. The current code is silently wrong for anyone who calls method.parameters.length on a compact constructor.
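
The parent-node traversal mentioned above could look roughly like the following. This is a sketch only: `SketchNode`, `makeNode`, and `resolveParameterList` are hypothetical stand-ins that mimic a syntax tree with plain objects, not the tree-sitter API or the PR's code.

```typescript
// Hypothetical minimal tree node; real tree-sitter nodes have a richer API.
interface SketchNode {
  type: string;
  text: string;
  parent: SketchNode | null;
  children: SketchNode[];
}

// Creates a node and wires up parent pointers for its children.
function makeNode(type: string, text = '', children: SketchNode[] = []): SketchNode {
  const n: SketchNode = { type, text, parent: null, children };
  for (const c of children) c.parent = n;
  return n;
}

function findChild(n: SketchNode, type: string): SketchNode | null {
  return n.children.find((c) => c.type === type) ?? null;
}

// Ordinary methods own their parameter list; compact constructors borrow
// the formal_parameters of the nearest enclosing record_declaration.
function resolveParameterList(method: SketchNode): SketchNode | null {
  if (method.type !== 'compact_constructor_declaration') {
    return findChild(method, 'formal_parameters');
  }
  for (let cur = method.parent; cur !== null; cur = cur.parent) {
    if (cur.type === 'record_declaration') return findChild(cur, 'formal_parameters');
  }
  return null;
}

// Models: record Point(int x, int y) { public Point { ... } }
const compactCtor = makeNode('compact_constructor_declaration', 'public Point { }');
const record = makeNode('record_declaration', '', [
  makeNode('formal_parameters', '(int x, int y)', [
    makeNode('formal_parameter', 'int x'),
    makeNode('formal_parameter', 'int y'),
  ]),
  makeNode('class_body', '', [compactCtor]),
]);

const params = resolveParameterList(compactCtor);
const expected = findChild(record, 'formal_parameters');
```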

🟡 Gap 3 — isAbstract/isFinal/annotations absent from ParsedSymbol

parse-worker.ts:111-125 and 1827-1841 — Verified. The ParsedSymbol interface does not include isAbstract, isFinal, or annotations. Lines 1827-1841 push to result.symbols but these three fields are intentionally omitted:

result.symbols.push({
  // isAbstract, isFinal, annotations — not here
  visibility, isStatic, isReadonly,  // ← field metadata IS included
  parameterCount, returnType, ...
});

These fields ARE on result.nodes (lines 1810-1812). The asymmetry is deliberate per the PR description ("not yet persisted to LadybugDB"), but it means call resolution in the symbol table has no visibility into method modifier context. @Transactional, @Cacheable, @Override, abstract — all invisible at the symbol level. Worth a deliberate decision before the symbol table is extended.

🔵 Gap 4 — findBodies fallback silently validates config typos

generic.ts:85-88 — If bodyNodeTypes has a typo, the fallback path at line 85 uses whatever childForFieldName('body') returns, giving correct results by accident. Any new language config author gets false confidence that their config is correct. A DEV assertion or warn log on the fallback path would catch this.
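
A warn-on-fallback path could be sketched as below. The function and node shapes are hypothetical, and the "type ends in body" rule is only a stand-in for the real childForFieldName('body') lookup; the warning callback is injected so the behaviour is easy to observe.

```typescript
// Hypothetical minimal node shape for this sketch.
interface BodySketchNode {
  type: string;
  children: BodySketchNode[];
}

function findBodiesSketch(
  owner: BodySketchNode,
  bodyNodeTypes: readonly string[],
  warn: (message: string) => void,
): BodySketchNode[] {
  // Primary path: children whose type is listed in the language config.
  const matched = owner.children.filter((c) => bodyNodeTypes.indexOf(c.type) !== -1);
  if (matched.length > 0) return matched;

  // Fallback path (stand-in for childForFieldName('body')): report it loudly
  // so a misspelled bodyNodeTypes entry does not pass unnoticed.
  const fallback = owner.children.filter((c) => c.type.endsWith('body'));
  const first = fallback[0];
  if (first !== undefined) {
    warn(
      `bodyNodeTypes [${bodyNodeTypes.join(', ')}] matched nothing in ` +
        `${owner.type}; fell back to ${first.type}`,
    );
  }
  return fallback;
}

const warnings: string[] = [];
const classDecl: BodySketchNode = {
  type: 'class_declaration',
  children: [{ type: 'class_body', children: [] }],
};

// 'clas_body' is misspelled on purpose to exercise the fallback.
const found = findBodiesSketch(classDecl, ['clas_body'], (m) => warnings.push(m));
```

The extraction result is still correct, but the recorded warning exposes the config typo that would otherwise be masked.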

🔵 Gap 5 — Java annotated_type parameter type accuracy

For void foo(@NonNull String name), tree-sitter-java wraps the type in annotated_type. childForFieldName('type') returns the annotated_type node. extractSimpleTypeName likely can't resolve this, falling back to annotated_type.text?.trim() which produces "@NonNull String" rather than "String". Low impact today but a consistency gap for annotation-heavy Spring/Jakarta codebases.


JVM Bytecode Schema Completeness Audit

Compared against JVM .class method flags and Kotlin @Metadata compiler data:

Modifier Java Kotlin In Schema Priority
public/private/protected ✅ captured ✅ captured (internal too)
static ✅ (via STATIC_OWNER_TYPES)
final ✅ (correct default semantics)
abstract ✅ (including interface inference)
synchronized ACC_SYNCHRONIZED n/a (@Synchronized annotation) Follow-up
native ACC_NATIVE external keyword Follow-up
throws clause checked exceptions n/a Follow-up
suspend n/a Kotlin coroutine marker Follow-up
operator n/a Kotlin operator overloading Follow-up
inline n/a Kotlin call-site inlining Follow-up
tailrec n/a Kotlin TCO Low priority
override @Override annotation captured modifier (→ isFinal:false) partial Follow-up
extension receiver n/a fun T.foo() receiver type Track now
type parameters <T> on method <T : Bound> Follow-up

Items marked "Track now" should have a tracking issue before the schema is written to disk.


What's Well-Designed (No Corrections Needed)

  • isFinal Kotlin semantics (jvm.ts:306-311): The current implementation is correct. Regular Kotlin functions return isFinal: true by default; open, abstract, and override all correctly return false. The if (isAbstract) isFinal = false domain invariant in generic.ts:142 closes the only remaining corner case. Prior reviews calling this "inverted" were wrong.
  • STATIC_OWNER_TYPES in generic.ts:20: Correctly marks both companion_object and object_declaration members as isStatic: true via the owner-node check at line 145, independent of the language config's isStatic callback.
  • Unnamed companion object name resolution (generic.ts:55-57): The "Companion" fallback matches Kotlin's JVM naming convention exactly.
  • Two-param isAbstract(node, ownerNode): Correct design for interface implicit abstraction. Avoids threading external context.
  • name:line cache key: Correct overload-safe design. No cross-file contamination possible.
  • extractAnnotations optional: Forces intentional opt-in. Java and Kotlin both have it; the factory handles undefined cleanly with ?? [].
  • enrichedByMethodExtractor flag: Clean fallback path in parse-worker. No double AST walks.

Test Coverage Gaps (New)

The existing 23 tests cover all happy paths well. Gaps that would be caught by new tests:

| Missing Test | What It Would Reveal |
|--------------|----------------------|
| Kotlin extension function (fun String.foo()) | No receiverType in schema (gap #1) |
| Java record compact constructor: parameter count | parameters: [] incorrect (gap #2) |
| object_declaration isStatic: true | Would confirm STATIC_OWNER_TYPES works correctly (positive confirmation) |
| Java @NonNull String parameter type | annotated_type text fallback behavior (gap #5) |
| Kotlin override fun → isFinal: false | Confirms override changes finality |

Summary

Both previous reviews were reviewing stale code and incorrectly identified multiple items as missing that are already correctly implemented. The current code on this branch is in substantially better shape than either prior review indicated.

Genuine remaining items:

| # | Gap | Severity | Action |
|---|-----|----------|--------|
| 1 | Kotlin extension function receiverType absent from schema | 🟡 | Open tracking issue before schema hardens |
| 2 | compact_constructor_declaration yields parameters: [] | 🟡 | Comment in code + tracking issue |
| 3 | isAbstract/isFinal/annotations absent from ParsedSymbol | 🟡 | Deliberate decision point |
| 4 | findBodies fallback masks config typos | 🔵 | Add dev-mode assertion |
| 5 | annotated_type parameter type accuracy | 🔵 | Low impact, track for follow-up |

This PR is ready to merge. Items 1-2 warrant tracking issues (not blocking). Items 3-5 are appropriate for the follow-up RFC series as noted in the PR description.

…dSymbol fields

- Add receiverType to MethodInfo for Kotlin extension functions
  (fun String.format() → receiverType: "String")
- Extract compact constructor parameters from parent record_declaration
- Add isAbstract/isFinal/annotations to ParsedSymbol for symbol table parity
- Add dev-mode warning in findBodies fallback to catch config typos
- Gap #5 (annotated_type) verified as non-issue: @NonNull String param
  correctly resolves to type_identifier "String"
- Tests: extension functions, compact ctor params, receiverType null check
  (36 total, up from 34)
@magyargergo

Collaborator Author

@claude ultra-think while identifying Kotlin and Java language gaps. Perform this analysis on this pull request and verify those findings. Act as a senior compiler front-end engineer and expert in static analysis tools, review this change from that perspective, and review the architectural fit. Also check the previous comments and reason about their requests.

@github-actions

github-actions Bot commented Mar 28, 2026

Contributor

Claude finished @magyargergo's task in 5m 43s


Fourth-Pass Deep Analysis: Java/Kotlin Language Gaps (Ground-Truth)

Tasks

  • Post initial comment
  • Read all changed files (jvm.ts, generic.ts, method-types.ts, parse-worker.ts, ast-helpers.ts, test file)
  • Verify all findings from three prior reviews against actual current code
  • Deep dive: Kotlin query integration, secondary constructors, final override
  • Verify test coverage against actual implementation
  • Post comprehensive final review

Correction of All Three Prior Reviews

All three prior analyses were reviewing partially or wholly stale code. This is a complete ground-truth pass after reading every changed file.

Prior Finding Review 1 Review 2 Review 3 Actual Status
annotation_type_element_declaration missing 🔴 critical corrected (false) corrected (false) Present: jvm.ts:122, test at line 314
compact_constructor_declaration missing 🔴 critical corrected (false) corrected (false) Present: jvm.ts:121, parent-param traversal jvm.ts:52-57, test at line 332
Kotlin vararg always isVariadic: false 🔴 critical corrected (false) corrected (false) Present: parameter_modifiers scan jvm.ts:183-188, test at line 436
Primitive vararg type loss (int...) 🔴 critical corrected (false) corrected (false) Present: integral_type/floating_point_type/boolean_type at jvm.ts:89-91, test at line 353
Enum constant body methods not extracted 🔴 critical 🔴 confirmed corrected (false) Present: generic.ts:125-133, test at line 292
companion_object absent from typeDeclarationNodes 🔴 critical 🔴 confirmed corrected (false) Present: jvm.ts:264
isFinal Kotlin semantics inverted 🟡 🟡 confirmed corrected (false) Correct: jvm.ts:313-318 returns true by default, false for open/abstract/override
ownerFqn misleading name 🟡 🟡 confirmed corrected (false) Fixed: field is ownerName in method-types.ts:37
Unnamed companion object ownerName 🔴 new Present: generic.ts:55-57 returns "Companion"
companion_object ownerName: simple_identifier lookup 🔴 new Present: generic.ts:46-50 scans for type_identifier|simple_identifier
hasKeyword redundancy 🔵 🔵 corrected (false) Eliminated: jvm.ts:144-160 uses hasModifier only
FUNCTION_NODE_TYPES missing new node types 🔵 new corrected (false) Present: ast-helpers.ts:68-69
annotations missing from ParsedSymbol 🔵 🔵 confirmed 🔵 confirmed FALSE: ParsedSymbol has isAbstract/isFinal/annotations at parse-worker.ts:125-127, propagated at lines 1844-1846
STATIC_OWNER_TYPES for companion/object 🟡 new corrected (false) Present: generic.ts:20, 152
receiverType for extension functions 🟡 new Present: method-types.ts:19, jvm.ts:339-351, test at line 453
findBodies fallback masks typos 🔵 🔵 🔵 confirmed Addressed: dev-mode console.warn at generic.ts:88-91

Genuine Remaining Gaps (New Findings From This Pass)

🔴 Gap 1 — final override isFinal logic bug

jvm.ts:313-318

isFinal(node) {
  if (hasModifier(node, 'modifiers', 'open')) return false;
  if (hasModifier(node, 'modifiers', 'abstract')) return false;
  if (hasModifier(node, 'modifiers', 'override')) return false;  // ← bug
  return true;
}

The override check runs before any final check. final override fun bar() is valid Kotlin: the method overrides and then seals the override chain. The current code returns isFinal: false for it because the override guard fires unconditionally. The correct priority order is: explicit final wins regardless of override.

isFinal(node) {
  if (hasModifier(node, 'modifiers', 'final')) return true;   // explicit final wins
  if (hasModifier(node, 'modifiers', 'open')) return false;
  if (hasModifier(node, 'modifiers', 'abstract')) return false;
  if (hasModifier(node, 'modifiers', 'override')) return false;
  return true;
}

final override is uncommon but architecturally important for sealing inheritance hierarchies (e.g. framework hooks). No test covers this case.
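
The missing test could be approximated with a self-contained check like the one below. It is a sketch under a simplifying assumption: the "node" is reduced to its set of modifier keywords, so `isFinalKotlin` is a hypothetical stand-in for the config callback rather than the code in jvm.ts.

```typescript
type KotlinModifier = 'final' | 'open' | 'abstract' | 'override';

// Stand-in for the config's isFinal(node): the node is just its modifier set.
function isFinalKotlin(modifiers: ReadonlySet<KotlinModifier>): boolean {
  if (modifiers.has('final')) return true; // explicit final wins
  if (modifiers.has('open')) return false;
  if (modifiers.has('abstract')) return false;
  if (modifiers.has('override')) return false; // overrides stay open by default
  return true; // Kotlin members are closed by default
}

const plainFun = isFinalKotlin(new Set<KotlinModifier>());
const openFun = isFinalKotlin(new Set<KotlinModifier>(['open']));
const overrideFun = isFinalKotlin(new Set<KotlinModifier>(['override']));
const finalOverrideFun = isFinalKotlin(new Set<KotlinModifier>(['final', 'override']));
```

Only the `final override` case distinguishes the corrected ordering from the original one; the other three cases behave the same under both.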

🟡 Gap 2 — Kotlin secondary_constructor not in methodNodeTypes

jvm.ts:265

methodNodeTypes: ['function_declaration'],

secondary_constructor is a distinct node type in tree-sitter-kotlin (fwcd). It is not function_declaration. Kotlin classes with explicit constructor overloads silently lose them:

class User(val name: String) {
    constructor(name: String, age: Int) : this(name) { ... }  // NEVER EXTRACTED
    constructor() : this("Guest") { }                          // NEVER EXTRACTED
}

The body is class_body, already in bodyNodeTypes. The fix is adding 'secondary_constructor' to methodNodeTypes and handling its name extraction (secondary constructors don't have a name field; the name should be the enclosing class name, same as Java constructor extraction). This is a silent data-loss gap for DI-heavy Kotlin codebases where secondary constructors are common.
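
The name-extraction rule suggested above (use the enclosing class name for a secondary constructor) might be sketched like this. `CtorSketchNode` and `extractCallableName` are hypothetical; the real extractor would read names from tree-sitter child nodes rather than a `name` property.

```typescript
// Hypothetical minimal node: `name` is null where the grammar has no name.
interface CtorSketchNode {
  type: string;
  name: string | null;
  parent: CtorSketchNode | null;
}

// Named callables keep their own name; secondary constructors borrow the
// name of the nearest enclosing class_declaration.
function extractCallableName(node: CtorSketchNode): string | null {
  if (node.type !== 'secondary_constructor') return node.name;
  for (let cur = node.parent; cur !== null; cur = cur.parent) {
    if (cur.type === 'class_declaration') return cur.name;
  }
  return null;
}

// Models: class User(...) { constructor(...) : this(...) {}  fun greet() {} }
const userClass: CtorSketchNode = { type: 'class_declaration', name: 'User', parent: null };
const classBody: CtorSketchNode = { type: 'class_body', name: null, parent: userClass };
const secondaryCtor: CtorSketchNode = { type: 'secondary_constructor', name: null, parent: classBody };
const greet: CtorSketchNode = { type: 'function_declaration', name: 'greet', parent: classBody };
```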

🔵 Gap 3 — Unnamed companion object methods get HAS_METHOD edges from outer class

Integration: ast-helpers.ts:264-277 × parse-worker.ts:1826-1827

findEnclosingClassId (for HAS_METHOD edges) walks up the AST looking for CLASS_CONTAINER_TYPES. For unnamed companion_object, it finds the node but then can't resolve a name (childForFieldName('name') → null, no type_identifier child). So it returns null and continues up to the outer class_declaration. Result: fun create(): Foo inside an unnamed companion gets a HAS_METHOD edge from the outer class (e.g., UserService), not from a companion node.

This is separate from MethodExtractor (which correctly extracts methods and sets isStatic: true). The isStatic: true flag plus the outer-class ownership is arguably the right JVM-semantic representation of unnamed companion methods — but it is inconsistent with named companions, which get their own node. Worth documenting as intentional or fixing. No test covers this behavior path.
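
The fall-through described above can be reproduced in a few lines. This is a simplified model, not the code in ast-helpers.ts: `ScopeNode`, `CONTAINER_TYPES`, and `findEnclosingOwnerName` are hypothetical names, and name resolution is reduced to a nullable `name` property.

```typescript
// Hypothetical minimal node; `name` is null when no name can be resolved.
interface ScopeNode {
  type: string;
  name: string | null;
  parent: ScopeNode | null;
}

const CONTAINER_TYPES: ReadonlySet<string> = new Set([
  'class_declaration',
  'companion_object',
  'object_declaration',
]);

// Walks up to the nearest container that has a resolvable name.
function findEnclosingOwnerName(node: ScopeNode): string | null {
  for (let cur = node.parent; cur !== null; cur = cur.parent) {
    // An unnamed container (e.g. `companion object { ... }`) is skipped,
    // so the walk continues to the outer class.
    if (CONTAINER_TYPES.has(cur.type) && cur.name !== null) return cur.name;
  }
  return null;
}

const outer: ScopeNode = { type: 'class_declaration', name: 'UserService', parent: null };
const unnamedCompanion: ScopeNode = { type: 'companion_object', name: null, parent: outer };
const namedCompanion: ScopeNode = { type: 'companion_object', name: 'Factory', parent: outer };
const createInUnnamed: ScopeNode = { type: 'function_declaration', name: 'create', parent: unnamedCompanion };
const createInNamed: ScopeNode = { type: 'function_declaration', name: 'create', parent: namedCompanion };
```

In this model the function inside the unnamed companion resolves to the outer class, while the one inside the named companion resolves to the companion itself, which is the inconsistency the gap describes.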

🔵 Gap 4 — annotated_type Java parameter type loss

jvm.ts:65-70

const typeNode = param.childForFieldName('type');
// typeNode for `@NonNull String name` is `annotated_type` in tree-sitter-java
// extractSimpleTypeName(annotated_type) → likely falls back to annotated_type.text = "@NonNull String"

For Spring/Jakarta codebases with pervasive @NonNull/@Valid annotations on parameters, this produces type strings like "@NonNull String" instead of "String". Impact is highest on type-based disambiguation (call resolution using parameterTypes). Third review noted this as gap #5. Fixable by unwrapping annotated_type before calling extractSimpleTypeName.


Verified Status of Previous Review Items

annotations in ParsedSymbol — All Three Reviews Were Wrong

Review 1 identified this as a gap. Reviews 2 and 3 confirmed it. All three were incorrect.

ParsedSymbol does include isAbstract?, isFinal?, annotations? at parse-worker.ts:125-127. They are propagated to result.symbols at parse-worker.ts:1844-1846. This was addressed before the first review ran.

Companion Object Integration Path — Works End-to-End

Previous reviews raised uncertainty about whether unnamed companion objects work through the parse-worker. The actual integration path for all companion objects (named and unnamed):

  1. function_declaration inside any companion object body is captured by the generic function query (tree-sitter-queries.ts:863-864)
  2. Parse-worker calls findEnclosingClassNode(definitionNode) which walks up to companion_object (in CLASS_CONTAINER_TYPES at ast-helpers.ts:127)
  3. getMethodInfo(companion_object, ...) calls provider.methodExtractor.extract(companion_object, ...)
  4. MethodExtractor finds companion_object in typeDeclarationNodes ✅, extracts functions, sets isStatic: true via STATIC_OWNER_TYPES

The named-only companion query (tree-sitter-queries.ts:858-860) controls whether the companion gets its own graph node — it doesn't affect method extraction, which happens via the function query path above.

isFinal Kotlin Semantics — Third Review Was Correct, Others Wrong

The third review correctly identified this as already fixed. The current implementation (jvm.ts:313-318) is correct for the documented behavior: returns true by default, false for open/abstract/override. The first review calling this "inverted" was analyzing stale code. The only remaining issue is the final override edge case (#1 above).


Architecture Fit — Compiler Front-End Perspective

Strong Design Choices

Config + factory (no class hierarchy): Exactly right. The JVM config would be identical whether structured as a base class or a config object. The factory removes the protected abstract boilerplate pattern that plagues base-class hierarchies.

Two-param isAbstract(node, ownerNode): The cleanest possible signature for the "implicit abstraction" pattern. Java interface methods, Kotlin interface methods, and annotation type elements all require owner context, and all get it without threading a stateful context object.

name:line cache key: The composite key is the only correct key for Java/Kotlin. Name-only keys would silently merge overloads. startIndex on the map-of-maps (keyed per class node) prevents cross-file contamination. The methodInfoCache.clear() at parse-worker.ts:330 correctly scopes it to per-file lifetime.

STATIC_OWNER_TYPES in generic.ts, not in language configs: JVM semantics (companion_object → static) shouldn't be each config's responsibility. Centralizing in the factory is the right layer.

extractReceiverType?: (node) => string | undefined: Optional in the config interface, meaning non-Kotlin languages never pay for it. The ?? null at generic.ts:156 correctly normalizes to null for the MethodInfo.receiverType field.

if (isAbstract) isFinal = false invariant at generic.ts:149: Domain invariant enforced at the factory level, not in each config. No config can accidentally produce isAbstract: true, isFinal: true regardless of implementation. This is exactly how a compiler IR would model it.
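
The two factory-level rules praised above (static owners and the abstract/final invariant) can be sketched together. `ModifierFlags` and `normalizeFlags` are hypothetical names for illustration; only the STATIC_OWNER_TYPES idea and the invariant itself come from the review.

```typescript
// Hypothetical flag bundle as a language config might report it.
interface ModifierFlags {
  isAbstract: boolean;
  isFinal: boolean;
  isStatic: boolean;
}

// Owner node types whose members are treated as static.
const STATIC_OWNER_TYPES: ReadonlySet<string> = new Set([
  'companion_object',
  'object_declaration',
]);

// Applies factory-level rules on top of whatever the config returned.
function normalizeFlags(raw: ModifierFlags, ownerType: string): ModifierFlags {
  return {
    isAbstract: raw.isAbstract,
    // Domain invariant: an abstract method can never be final.
    isFinal: raw.isAbstract ? false : raw.isFinal,
    // Members of companion/object declarations are static regardless of config.
    isStatic: raw.isStatic || STATIC_OWNER_TYPES.has(ownerType),
  };
}

// A config that wrongly reports abstract + final gets corrected.
const corrected = normalizeFlags(
  { isAbstract: true, isFinal: true, isStatic: false },
  'class_declaration',
);

// A companion-object member becomes static even if the config said otherwise.
const companionMember = normalizeFlags(
  { isAbstract: false, isFinal: true, isStatic: false },
  'companion_object',
);
```

Keeping both rules in one place means a new language config cannot produce the contradictory combination, which matches the reasoning given for putting them in the factory.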

One Design Note

The bodyNodeTypes fallback path (generic.ts:85-94) correctly adds a dev-mode console.warn. For a future production scenario, this should be a structured log rather than console.warn. Not a bug, but noted for when the logging infrastructure is standardized.


Verified Priority Table

| # | Gap | Language | Severity | In Prior Reviews? |
|---|-----|----------|----------|-------------------|
| 1 | final override fun → isFinal: false (should be true) | Kotlin | 🔴 Logic bug | New |
| 2 | secondary_constructor not in methodNodeTypes | Kotlin | 🟡 Silent data loss | New |
| 3 | Unnamed companion HAS_METHOD edges go to outer class | Kotlin | 🔵 Arch inconsistency | Partially (reviews discussed companion but missed this sub-issue) |
| 4 | annotated_type parameter type includes annotation text | Java | 🔵 Data quality | ✓ Review 3 gap #5 |
| 5 | suspend modifier not captured | Kotlin | 🟡 Schema gap | ✓ All reviews |
| 6 | throws clause not captured | Java | 🟡 Schema gap | ✓ All reviews |
| 7 | operator/inline/tailrec modifiers | Kotlin | 🔵 Schema gap | ✓ Review 3 |

Items 1 and 2 are the only new actionable findings. Item 1 is a correctness bug; item 2 is silent data loss. Items 3-7 are appropriate follow-ups in the RFC series as the PR description notes.

This PR is ready to merge after addressing gap #1 (trivial one-line fix) or tracking it as a known issue. Gap #2 (secondary constructors) is a clear follow-up scope for the RFC series.

@magyargergo

Collaborator Author

We'll come back to it! Tracking it here as a known gap.

@magyargergo magyargergo merged commit 313b13f into main Mar 28, 2026
13 checks passed
magyargergo added a commit that referenced this pull request Mar 29, 2026
Add C# method extraction config mirroring the JVM pattern from PR #576.
Wire csharpMethodConfig into the C# language provider and add 18 tests
covering classes, interfaces, abstract classes, structs, records,
constructors, params/out/ref/optional parameters, sealed methods,
attributes, and visibility modifiers.
magyargergo added a commit that referenced this pull request Mar 30, 2026
* feat(csharp): add C# MethodExtractor config (#573)

Add C# method extraction config mirroring the JVM pattern from PR #576.
Wire csharpMethodConfig into the C# language provider and add 18 tests
covering classes, interfaces, abstract classes, structs, records,
constructors, params/out/ref/optional parameters, sealed methods,
attributes, and visibility modifiers.

* fix(csharp): add destructor, operator, conversion operator, and in-param support

- Add destructor_declaration, operator_declaration, and
  conversion_operator_declaration to methodNodeTypes
- Custom extractName for operators (e.g., "operator +", "implicit operator double")
- Fix extractReturnType for operator declarations (use type field, not returns)
- Add in modifier to parameter extraction (alongside out/ref)
- Add 4 new tests: destructor, operator+, implicit conversion, in parameter

* fix(csharp): add ref param test and document compound visibility limitation

- Add test for ref parameter modifier (was only testing out)
- Document that protected internal / private protected resolve to first modifier

* feat(csharp): support compound visibilities (protected internal, private protected)

- Add 'protected internal' and 'private protected' to FieldVisibility union
- Detect compound modifiers in both C# method and field extractors via
  collectModifierTexts helper scanning adjacent modifier nodes
- Add 2 tests for compound visibility detection

* feat(csharp): primary constructors, virtual/override/async, primary fields

Address all known limitations from review:

- Primary constructor support (C# 12): add extractPrimaryConstructor to
  MethodExtractionConfig and extractPrimaryFields to FieldExtractionConfig.
  Record params become public readonly properties; class params become
  private captured fields.
- Add isVirtual, isOverride, isAsync optional fields to MethodInfo,
  MethodExtractionConfig, NodeProperties, and parse-worker propagation.
- Detect virtual/override/async modifiers in C# method config.
- Move collectModifierTexts to shared helpers.ts (deduplicate).
- Fix destructor name to ~ClassName (disambiguates from constructor).
- Add expression-bodied method test.
- 118 tests total across method + field extraction suites, all passing.

* fix(csharp): review round 2 — annotations, record_struct, grammar pin

- Fix primary constructor annotations: use [] instead of extracting
  class-level attributes (C# has no syntax for ctor-specific attributes)
- Add record_struct_declaration to typeDeclarationNodes in both method
  and field extractors, CLASS_CONTAINER_TYPES, and isRecord visibility check
- Pin tree-sitter-c-sharp version (^0.23.1) in params comment

* fix(csharp): complete record_struct query + label mapping, sealed override test

- Add record_struct_declaration capture patterns to tree-sitter-queries.ts
  (type definition + primary constructor)
- Add record_struct_declaration → 'Struct' in CONTAINER_TYPE_TO_LABEL
- Assert isOverride: true alongside isFinal in sealed override test

* fix(csharp): record_struct label mismatch, add record struct + documented limitation tests

- Fix record_struct_declaration query tag: @definition.struct (not @definition.record)
  to match CONTAINER_TYPE_TO_LABEL and prevent broken HAS_METHOD edges
- Add 3 record struct tests: isTypeDeclaration, method extraction, primary constructor
- Add documented limitation tests: partial method (isAbstract: false), generic type
  parameter stripping (name excludes <T>)

* fix(csharp): remove record_struct_declaration — not a real tree-sitter node type

tree-sitter-c-sharp 0.23.1 parses 'record struct' as record_declaration
(absorbs the 'struct' keyword as an unnamed child token). The non-existent
record_struct_declaration in queries caused TSQueryErrorNodeType, breaking
ALL C# file processing.

Remove from: tree-sitter-queries.ts, typeDeclarationNodes in both
extractors, CLASS_CONTAINER_TYPES, and CONTAINER_TYPE_TO_LABEL.
Record struct types are already handled via record_declaration.

* feat(csharp): add isPartial support, filter targeted attributes, static ctor test

- Add isPartial optional field to MethodInfo, MethodExtractionConfig,
  NodeProperties, and parse-worker propagation pipeline
- Detect partial modifier in C# config — marks both declaration-only
  and implemented partial methods
- Filter targeted attribute lists (e.g. [return: MarshalAs(...)]) in
  extractCSharpAnnotations — only untargeted attributes collected
- Add static constructor test (isStatic: true, same name as class)
- Add 3 partial method tests: declaration-only, with body, coexisting pair
- Document record_struct/record_class as defensive dead code in
  export-detection.ts (grammar absorbs keywords into record_declaration)

* fix(csharp): this param for extension methods, dedup visibility, test fixes

- Handle this modifier on extension method parameters (type prefixed
  as 'this string', consistent with out/ref/in handling)
- Deduplicate visibility logic in extractPrimaryConstructor — reuse
  csharpMethodConfig.extractVisibility instead of inline compound check
- Fix record struct test title to reflect actual grammar behavior
- Add conversion operator returnType assertion
- Add extension method this parameter test

* fix(csharp): primary constructor line points to param list, empty name guard

- Use paramList.startPosition instead of ownerNode.startPosition for
  primary constructor line number (avoids methodInfoCache key collision)
- Guard against empty param names from tree-sitter error recovery nodes
icodebuster pushed a commit to icodebuster/GitNexus that referenced this pull request Mar 31, 2026
* main: (114 commits)
  feat(csharp): C# MethodExtractor config (abhigyanpatwari#582)
  docs: add gitnexus-shared build step before gitnexus-web (abhigyanpatwari#585)
  chore: add enterprise offering section to README, ignore local_docs/ (abhigyanpatwari#579)
  fix(eval): exclude litellm 1.82.7 and 1.82.8 due to compatibility issues (abhigyanpatwari#580)
  feat(java,kotlin): MethodExtractor abstraction with per-language configs (abhigyanpatwari#576)
  feat: added skip-agents-md cli flag (abhigyanpatwari#517)
  feat(wiki): Azure OpenAI support for wiki command (abhigyanpatwari#562)
  refactor: reduce explicit any types (abhigyanpatwari#566)
  feat(java): method references, worker overload disambiguation, interface dispatch (abhigyanpatwari#540)
  feat: configure eslint with unused import removal (abhigyanpatwari#564)
  feat: configure prettier with pre-commit hook (abhigyanpatwari#563)
  feat: unify web and cli ingestion pipeline (abhigyanpatwari#536)
  fix/opencode mcp gitnexus timeout (abhigyanpatwari#363)
  chore: bump version to 1.4.10, update CHANGELOG
  fix: resolve tree-sitter peer dependency conflicts (abhigyanpatwari#538)
  chore: bump version to 1.4.9, add CHANGELOG.md
  refactor: Phase 8 & 9 — Field Types and Return-Type Binding (abhigyanpatwari#494)
  feat: add COBOL language support with regex extraction pipeline (abhigyanpatwari#498)
  fix: close remaining Dart language support gaps (abhigyanpatwari#524)
  refactor: split global BUILT_IN_NAMES into per-language provider fields (abhigyanpatwari#523)
  ...

# Conflicts:
#	gitnexus/src/core/wiki/llm-client.ts
@abhigyanpatwari abhigyanpatwari mentioned this pull request Apr 1, 2026
3 tasks
motolese pushed a commit to motolese/datamoto-gitnexus that referenced this pull request Apr 23, 2026
* feat(csharp): add C# MethodExtractor config (abhigyanpatwari#573)

Add C# method extraction config mirroring the JVM pattern from PR abhigyanpatwari#576.
Wire csharpMethodConfig into the C# language provider and add 18 tests
covering classes, interfaces, abstract classes, structs, records,
constructors, params/out/ref/optional parameters, sealed methods,
attributes, and visibility modifiers.

* fix(csharp): add destructor, operator, conversion operator, and in-param support

- Add destructor_declaration, operator_declaration, and
  conversion_operator_declaration to methodNodeTypes
- Custom extractName for operators (e.g., "operator +", "implicit operator double")
- Fix extractReturnType for operator declarations (use type field, not returns)
- Add in modifier to parameter extraction (alongside out/ref)
- Add 4 new tests: destructor, operator+, implicit conversion, in parameter
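
As an illustration of the operator naming above, the sketch below builds the two name shapes quoted in the commit message. `OperatorDecl` and `operatorName` are hypothetical; the real config reads these pieces from tree-sitter nodes, which are not modeled here.

```typescript
// Hypothetical simplified view of an operator declaration; not a tree-sitter node.
type OperatorDecl =
  | { kind: "operator"; symbol: string }
  | { kind: "conversion"; mode: "implicit" | "explicit"; targetType: string };

// Build a readable method name for operators, which have no identifier of their own.
function operatorName(decl: OperatorDecl): string {
  return decl.kind === "operator"
    ? `operator ${decl.symbol}`
    : `${decl.mode} operator ${decl.targetType}`;
}

const plus = operatorName({ kind: "operator", symbol: "+" });
const toDouble = operatorName({ kind: "conversion", mode: "implicit", targetType: "double" });
```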

* fix(csharp): add ref param test and document compound visibility limitation

- Add test for ref parameter modifier (was only testing out)
- Document that protected internal / private protected resolve to first modifier

* feat(csharp): support compound visibilities (protected internal, private protected)

- Add 'protected internal' and 'private protected' to FieldVisibility union
- Detect compound modifiers in both C# method and field extractors via
  collectModifierTexts helper scanning adjacent modifier nodes
- Add 2 tests for compound visibility detection
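
A minimal sketch of compound visibility resolution, assuming the modifier texts have already been collected into a string array (the job the commit assigns to `collectModifierTexts`). `VisibilitySketch` and `resolveVisibility` are hypothetical names, and returning `undefined` when no access modifier is present is an assumption of this sketch.

```typescript
// Hypothetical union mirroring the visibilities named in the commit message.
type VisibilitySketch =
  | "public"
  | "private"
  | "protected"
  | "internal"
  | "protected internal"
  | "private protected";

// Assumed input: the texts of a declaration's modifier nodes, in source order.
function resolveVisibility(modifierTexts: string[]): VisibilitySketch | undefined {
  const mods = new Set(modifierTexts);
  // Check the compound forms first so they are not reduced to a single modifier.
  if (mods.has("protected") && mods.has("internal")) return "protected internal";
  if (mods.has("private") && mods.has("protected")) return "private protected";
  for (const v of ["public", "private", "protected", "internal"] as const) {
    if (mods.has(v)) return v;
  }
  return undefined; // no explicit access modifier
}
```

Checking the pairs before the single modifiers is what prevents the earlier "resolve to first modifier" behavior.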

* feat(csharp): primary constructors, virtual/override/async, primary fields

Address all known limitations from review:

- Primary constructor support (C# 12): add extractPrimaryConstructor to
  MethodExtractionConfig and extractPrimaryFields to FieldExtractionConfig.
  Record params become public readonly properties; class params become
  private captured fields.
- Add isVirtual, isOverride, isAsync optional fields to MethodInfo,
  MethodExtractionConfig, NodeProperties, and parse-worker propagation.
- Detect virtual/override/async modifiers in C# method config.
- Move collectModifierTexts to shared helpers.ts (deduplicate).
- Fix destructor name to ~ClassName (disambiguates from constructor).
- Add expression-bodied method test.
- 118 tests total across method + field extraction suites, all passing.
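
The primary-constructor rule above (record params become public readonly properties, class params become private captured fields) might look roughly like this. `PrimaryFieldSketch` and `primaryFields` are hypothetical, and marking class-captured params as not readonly is an assumption of this sketch rather than something the commit states.

```typescript
type OwnerKind = "record" | "class";

// Hypothetical field shape for illustration only.
interface PrimaryFieldSketch {
  name: string;
  type: string;
  visibility: "public" | "private";
  isReadonly: boolean;
}

// Map primary constructor parameters to the fields they implicitly declare.
function primaryFields(
  owner: OwnerKind,
  params: Array<{ name: string; type: string }>,
): PrimaryFieldSketch[] {
  const isRecord = owner === "record";
  return params.map((p): PrimaryFieldSketch => ({
    name: p.name,
    type: p.type,
    visibility: isRecord ? "public" : "private",
    isReadonly: isRecord, // assumption: class-captured params are treated as mutable
  }));
}

const recordFields = primaryFields("record", [{ name: "X", type: "int" }]);
const classFields = primaryFields("class", [{ name: "logger", type: "ILogger" }]);
```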

* fix(csharp): review round 2 — annotations, record_struct, grammar pin

- Fix primary constructor annotations: use [] instead of extracting
  class-level attributes (C# has no syntax for ctor-specific attributes)
- Add record_struct_declaration to typeDeclarationNodes in both method
  and field extractors, CLASS_CONTAINER_TYPES, and isRecord visibility check
- Pin tree-sitter-c-sharp version (^0.23.1) in params comment

* fix(csharp): complete record_struct query + label mapping, sealed override test

- Add record_struct_declaration capture patterns to tree-sitter-queries.ts
  (type definition + primary constructor)
- Add record_struct_declaration → 'Struct' in CONTAINER_TYPE_TO_LABEL
- Assert isOverride: true alongside isFinal in sealed override test

* fix(csharp): record_struct label mismatch, add record struct + documented limitation tests

- Fix record_struct_declaration query tag: @definition.struct (not @definition.record)
  to match CONTAINER_TYPE_TO_LABEL and prevent broken HAS_METHOD edges
- Add 3 record struct tests: isTypeDeclaration, method extraction, primary constructor
- Add documented limitation tests: partial method (isAbstract: false), generic type
  parameter stripping (name excludes <T>)

* fix(csharp): remove record_struct_declaration — not a real tree-sitter node type

tree-sitter-c-sharp 0.23.1 parses 'record struct' as record_declaration
(absorbs the 'struct' keyword as an unnamed child token). The non-existent
record_struct_declaration in queries caused TSQueryErrorNodeType, breaking
ALL C# file processing.

Remove from: tree-sitter-queries.ts, typeDeclarationNodes in both
extractors, CLASS_CONTAINER_TYPES, and CONTAINER_TYPE_TO_LABEL.
Record struct types are already handled via record_declaration.
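
Since the grammar folds `record struct` into `record_declaration`, one way to tell the two apart would be to look for the unnamed `struct` token among the node's children. The sketch below uses a hypothetical `NodeSketch` interface instead of the real tree-sitter node API, and the PR does not say that the extractors perform this check.

```typescript
// Hypothetical minimal node shape; the real tree-sitter node API differs.
interface NodeSketch {
  type: string;
  isNamed: boolean;
  children: NodeSketch[];
}

// A record struct is a record_declaration with an unnamed `struct` keyword child.
function isRecordStruct(node: NodeSketch): boolean {
  return (
    node.type === "record_declaration" &&
    node.children.some((child) => !child.isNamed && child.type === "struct")
  );
}

const keyword = (text: string): NodeSketch => ({ type: text, isNamed: false, children: [] });
const identifier: NodeSketch = { type: "identifier", isNamed: true, children: [] };

const recordStruct: NodeSketch = {
  type: "record_declaration",
  isNamed: true,
  children: [keyword("record"), keyword("struct"), identifier],
};
const recordClass: NodeSketch = {
  type: "record_declaration",
  isNamed: true,
  children: [keyword("record"), identifier],
};
```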

* feat(csharp): add isPartial support, filter targeted attributes, static ctor test

- Add isPartial optional field to MethodInfo, MethodExtractionConfig,
  NodeProperties, and parse-worker propagation pipeline
- Detect partial modifier in C# config — marks both declaration-only
  and implemented partial methods
- Filter targeted attribute lists (e.g. [return: MarshalAs(...)]) in
  extractCSharpAnnotations — only untargeted attributes collected
- Add static constructor test (isStatic: true, same name as class)
- Add 3 partial method tests: declaration-only, with body, coexisting pair
- Document record_struct/record_class as defensive dead code in
  export-detection.ts (grammar absorbs keywords into record_declaration)
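
The targeted-attribute filter can be sketched as below, assuming each attribute list has already been reduced to an optional target plus its attribute names. `AttributeListSketch` and `collectUntargeted` are hypothetical; the actual `extractCSharpAnnotations` works on syntax nodes.

```typescript
// Hypothetical pre-parsed attribute list: `[return: MarshalAs(...)]` has target "return".
interface AttributeListSketch {
  target?: string;
  attributes: string[];
}

// Keep only attributes from lists without an explicit target specifier.
function collectUntargeted(lists: AttributeListSketch[]): string[] {
  const result: string[] = [];
  for (const list of lists) {
    if (list.target !== undefined) continue; // skip targeted lists such as `return:`
    result.push(...list.attributes);
  }
  return result;
}

const annotations = collectUntargeted([
  { attributes: ["Obsolete"] },
  { target: "return", attributes: ["MarshalAs"] },
  { attributes: ["DllImport", "SuppressUnmanagedCodeSecurity"] },
]);
```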

* fix(csharp): this param for extension methods, dedup visibility, test fixes

- Handle this modifier on extension method parameters (type prefixed
  as 'this string', consistent with out/ref/in handling)
- Deduplicate visibility logic in extractPrimaryConstructor — reuse
  csharpMethodConfig.extractVisibility instead of inline compound check
- Fix record struct test title to reflect actual grammar behavior
- Add conversion operator returnType assertion
- Add extension method this parameter test

* fix(csharp): primary constructor line points to param list, empty name guard

- Use paramList.startPosition instead of ownerNode.startPosition for
  primary constructor line number (avoids methodInfoCache key collision)
- Guard against empty param names from tree-sitter error recovery nodes

Development

Successfully merging this pull request may close these issues.

feat: MethodExtractor types, factory, JVM config + worker integration