Merged
47 changes: 44 additions & 3 deletions gitnexus/src/core/ingestion/languages/typescript.ts
@@ -66,11 +66,21 @@ import {
* - `{ addItem: (item) => ... }` (pair / property_assignment) → "addItem"
* Covers Zustand stores, TanStack Query factories, React Context
* providers, and most other HOF-heavy idioms (issue #1166).
* - `const X = HOC((args) => { ... })` (arguments → call_expression →
* variable_declarator) → "X". Covers `React.forwardRef`, `memo`,
* `useCallback`, `useMemo`, `observer`, `debounce`, and other HOC
* factories that wrap their behaviour-defining arrow. Without this
* branch, every shadcn/Radix UI component (`const Button =
* React.forwardRef(...)`) registered as an anonymous arrow with
* calls inside falling back to File-level attribution. The same
* applied to all `useCallback` / `useMemo` callbacks bound to a
* const — the sole way to give them a named caller anchor.
*
* Returns `null` for funcName when the arrow lives in a context that has
- * no static name — call arguments, computed keys, return-from-arrow
- * positions. The parent walk in findEnclosingFunctionId then continues
- * up to the next named ancestor (or to the file).
+ * no static name — bare call arguments (not bound to a const), computed
+ * keys, return-from-arrow positions. The parent walk in
+ * findEnclosingFunctionId then continues up to the next named ancestor
+ * (or to the file).
*/
const tsExtractFunctionName = (
node: SyntaxNode,
@@ -114,6 +124,37 @@ const tsExtractFunctionName = (
return { funcName: null, label: 'Function' };
}

// HOC-wrapped variable declarations: `const Button = forwardRef((p, r) => { ... })`,
// `const handleClick = useCallback(() => doStuff(), [deps])`,
// `const Card = React.memo((props) => { ... })`. The arrow's `parent` is
// `arguments`, grandparent is `call_expression`, great-grandparent is
// `variable_declarator`. Walk the chain up and take the variable's name
// — the meaningful identifier the developer wrote on the LHS. Mirrors
// the four registry-primary patterns in `typescript/query.ts`. The
// wrapping callee (`forwardRef`, `memo`, `React.memo`, `useCallback`,
// user-defined HOCs) is intentionally NOT constrained: any function
// call whose result is bound to a `const`/`let`/`var` and that has an
// arrow as a direct argument (in any position) gives that arrow the
// variable's name. Chained array-method calls
// (`const x = arr.find((y) => p(y))`) match too and produce a
// mostly-harmless `Function:x` (consumed as a value, never invoked),
// accepted as a small false-positive cost vs. the much larger gain of
// capturing the React UI-component idiom.
if (parent.type === 'arguments') {
const callExpr = parent.parent;
if (!callExpr || callExpr.type !== 'call_expression') {
return { funcName: null, label: 'Function' };
}
const declarator = callExpr.parent;
if (!declarator || declarator.type !== 'variable_declarator') {
return { funcName: null, label: 'Function' };
}
const nameNode = declarator.childForFieldName?.('name');
if (nameNode?.type === 'identifier') {
return { funcName: nameNode.text, label: 'Function' };
}
return { funcName: null, label: 'Function' };
}

return { funcName: null, label: 'Function' };
};
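The walk in the new `parent.type === 'arguments'` branch can be tried without a parser. The following is a minimal sketch, not the project's code: `MiniNode`, `node`, and `nameFromHocBinding` are hypothetical stand-ins that model only the `type`, `text`, `parent`, and named-field lookups the branch relies on, and the two trees are hand-built approximations of what tree-sitter would produce.

```typescript
// Hypothetical stand-in for the few SyntaxNode members the branch reads.
interface MiniNode {
  type: string;
  text: string;
  parent: MiniNode | null;
  fields: Record<string, MiniNode>;
}

// Build a node and point each child's `parent` back at it.
const node = (type: string, text = '', fields: Record<string, MiniNode> = {}): MiniNode => {
  const n: MiniNode = { type, text, parent: null, fields };
  for (const child of Object.values(fields)) child.parent = n;
  return n;
};

// Same chain as the diff: arguments -> call_expression -> variable_declarator.
const nameFromHocBinding = (arrow: MiniNode): string | null => {
  const args = arrow.parent;
  if (args?.type !== 'arguments') return null;
  const call = args.parent;
  if (call?.type !== 'call_expression') return null;
  const declarator = call.parent;
  if (declarator?.type !== 'variable_declarator') return null;
  const name = declarator.fields['name'];
  return name?.type === 'identifier' ? name.text : null;
};

// Approximates `const Button = forwardRef((p, r) => { ... })`.
const boundArrow = node('arrow_function');
node('variable_declarator', '', {
  name: node('identifier', 'Button'),
  value: node('call_expression', '', {
    arguments: node('arguments', '', { arg0: boundArrow }),
  }),
});

// Approximates a bare statement `useCallback(() => { ... }, [])`.
const bareArrow = node('arrow_function');
node('expression_statement', '', {
  expression: node('call_expression', '', {
    arguments: node('arguments', '', { arg0: bareArrow }),
  }),
});

const boundName = nameFromHocBinding(boundArrow); // name taken from the declarator
const bareName = nameFromHocBinding(bareArrow); // no declarator above the call
```

Under these assumptions the bound arrow resolves to the variable's name and the bare call stays anonymous, which is the behaviour the negative-control fixture in this PR is meant to pin down.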

89 changes: 89 additions & 0 deletions gitnexus/src/core/ingestion/languages/typescript/query.ts
@@ -214,6 +214,95 @@ const TYPESCRIPT_SCOPE_QUERY = `
key: (string (string_fragment) @declaration.name)
value: (function_expression) @declaration.function)

;; HOC-wrapped variable declarations: \`const X = HOC((args) => { ... })\`.
;;
;; Covers the dominant React UI idiom (\`React.forwardRef\`, \`React.memo\`,
;; bare \`forwardRef\` / \`memo\` / \`observer\`), Hook callbacks
;; (\`useCallback\`, \`useMemo\`), and library-wrapper factories (\`debounce\`,
;; \`throttle\`, user-defined \`withErrorBoundary\` / \`createHook\`, etc.).
;; All produce the same AST shape:
;;
;; lexical_declaration
;; variable_declarator
;; name: identifier "X" ← we want this name
;; value: call_expression
;; function: identifier | member_expression ← any callee
;; arguments: arguments
;; arrow_function | function_expression ← the actual code
;;
;; The pre-fix \`tsExtractFunctionName\` only handled \`variable_declarator\`
;; and \`pair\` parents, so HOC-wrapped arrows fell through anonymous. The
;; registry-primary \`query.ts\` had no pattern for this shape either —
;; \`const Button = forwardRef((p, r) => { ... })\` registered as a
;; \`Variable\` with no \`Function\` def, and every call inside the arrow
;; body lost caller attribution: \`resolveCallerGraphId\` walked up past
;; the empty arrow scope to the module's File fallback. Sourcerer-fe alone
;; has ~296 such declarations (57 forwardRef + 21 memo + 161 useCallback
;; + 57 useMemo) — all invisible to \`gitnexus_context\` /
;; \`gitnexus_impact\` for outgoing edges before this fix.
;;
;; Anchor discipline: same as the \`lexical_declaration\` / \`pair\` blocks
;; above — on the INNER \`arrow_function\` / \`function_expression\`, NOT
;; the outer \`call_expression\`. The arrow's range matches its own
;; \`@scope.function\` range, so \`pass2AttachDeclarations.atPosition\`
;; resolves \`innermost\` to the arrow's own scope and
;; \`rangesEqual(anchor.range, innermost.range)\` triggers the auto-hoist
;; that promotes the binding to the parent scope (where \`const X\`
;; lives).
;;
;; Trade-off — chained array-method form: \`const x = arr.find((y) => p(y))\`
;; has the same syntactic shape and would also match, naming the
;; \`.find\` callback as \`x\`. The resulting \`Function:x\` is mostly
;; harmless: \`x\` is consumed as a value (\`if (x) { ... }\`), never
;; invoked as a function, so it gets zero incoming \`CALLS\` edges. The
;; one outgoing edge \`Function:x → p\` is a minor mis-attribution that
;; could in principle be fixed by adding a \`function: [(identifier)
;; (member_expression)]\` predicate that excludes property-identifiers
;; matching a known array-method blocklist (\`map\` / \`filter\` / \`find\`
;; / \`reduce\` / \`forEach\` / \`some\` / \`every\`). We don't do that here
;; because (a) the false-positive cost is negligible, (b) the blocklist
;; would need maintenance, and (c) any user-defined fluent-API method
;; with a callback argument would still false-positive — there's no
;; clean syntactic line.
;;
;; Trade-off — multi-arrow arguments: \`const x = call(arrow1, arrow2)\`
;; would emit TWO matches with the same name \`x\`. tree-sitter-query
;; iterates all arrow_function direct children of \`arguments\`, so each
;; emits its own \`(name=x, function=...)\` pair. \`pass2AttachDeclarations\`
;; pushes one \`Function:x\` def into each arrow's scope (that arrow's
;; own \`ownedDefs\`) and hoists both bindings to the parent.
;; The downstream registry's qualified-name dedup then collapses them
;; via \`(filePath, type, qualifiedName)\` — second wins. Acceptable;
;; calls that pass several arrow callbacks are rare (even
;; \`new Promise(executor)\` takes a single executor).
(lexical_declaration
(variable_declarator
name: (identifier) @declaration.name
value: (call_expression
arguments: (arguments
(arrow_function) @declaration.function))))

(lexical_declaration
(variable_declarator
name: (identifier) @declaration.name
value: (call_expression
arguments: (arguments
(function_expression) @declaration.function))))

(variable_declaration
(variable_declarator
name: (identifier) @declaration.name
value: (call_expression
arguments: (arguments
(arrow_function) @declaration.function))))

(variable_declaration
(variable_declarator
name: (identifier) @declaration.name
value: (call_expression
arguments: (arguments
(function_expression) @declaration.function))))

;; Method definitions — regular + private (#field) methods.
(method_definition
name: (property_identifier) @declaration.name) @declaration.method
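The multi-arrow trade-off described in the comment above relies on a "second wins" collapse keyed by `(filePath, type, qualifiedName)`. The registry code itself is not part of this diff, so the following is only a hedged sketch of that idea: the `Def` shape, the `startLine` field, and `dedupeDefs` are hypothetical names chosen for illustration.

```typescript
// Hypothetical def record; only the three key fields come from the comment above.
interface Def {
  filePath: string;
  type: string;
  qualifiedName: string;
  startLine: number; // illustrative field used to tell the two defs apart
}

// Collapse defs that share (filePath, type, qualifiedName); a later def
// overwrites an earlier one, i.e. "second wins".
const dedupeDefs = (defs: Def[]): Def[] => {
  const byKey = new Map<string, Def>();
  for (const def of defs) {
    const key = JSON.stringify([def.filePath, def.type, def.qualifiedName]);
    byKey.set(key, def);
  }
  return Array.from(byKey.values());
};

// `const x = call(arrow1, arrow2)` would yield two `Function:x` defs.
const collapsed = dedupeDefs([
  { filePath: 'a.ts', type: 'Function', qualifiedName: 'x', startLine: 1 },
  { filePath: 'a.ts', type: 'Function', qualifiedName: 'x', startLine: 2 },
  { filePath: 'a.ts', type: 'Function', qualifiedName: 'y', startLine: 5 },
]);
```

In this sketch the surviving `x` is the one from the second arrow, so calls inside the first arrow would end up attributed to a def whose range is the second arrow's. That is the minor imprecision the comment accepts.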
108 changes: 108 additions & 0 deletions gitnexus/src/core/ingestion/tree-sitter-queries.ts
@@ -84,6 +84,63 @@ export const TYPESCRIPT_QUERIES = `
key: (string (string_fragment) @name)
value: (function_expression)) @definition.function

; HOC-wrapped variable declarations: \`const X = HOC((args) => { ... })\`.
; Mirrors the registry-primary patterns in \`languages/typescript/query.ts\`
; so the legacy Call-Resolution DAG and the registry-primary pipeline
; produce the same set of \`Function\` nodes — required for the CI parity
; gate. Covers React.forwardRef / memo / useCallback / useMemo / observer
; / debounce / user-defined HOC factories. The \`var X = HOC(...)\` form is
; mirrored too (registry-primary has it) so that codebases mixing \`var\` and
; \`const\` see identical attribution on both pipelines. See
; \`tsExtractFunctionName\` for the resolution logic and the \`query.ts\`
; comment for the full anchor-discipline rationale and the chained-
; array-method trade-off.
(lexical_declaration
(variable_declarator
name: (identifier) @name
value: (call_expression
arguments: (arguments
(arrow_function))))) @definition.function

(lexical_declaration
(variable_declarator
name: (identifier) @name
value: (call_expression
arguments: (arguments
(function_expression))))) @definition.function

(export_statement
declaration: (lexical_declaration
(variable_declarator
name: (identifier) @name
value: (call_expression
arguments: (arguments
(arrow_function)))))) @definition.function

(export_statement
declaration: (lexical_declaration
(variable_declarator
name: (identifier) @name
value: (call_expression
arguments: (arguments
(function_expression)))))) @definition.function

; \`var X = HOC(...)\` parity with registry-primary. Legacy code (and any
; transpiler output that downlevels \`const\` to \`var\`) hits this shape.
(variable_declaration
(variable_declarator
name: (identifier) @name
value: (call_expression
arguments: (arguments
(arrow_function))))) @definition.function

(variable_declaration
(variable_declarator
name: (identifier) @name
value: (call_expression
arguments: (arguments
(function_expression))))) @definition.function

; Variable/constant declarations (non-function values).
; Overlap with @definition.function patterns is handled by parse-worker dedup.
(lexical_declaration
@@ -260,6 +317,57 @@ export const JAVASCRIPT_QUERIES = `
key: (string (string_fragment) @name)
value: (function_expression)) @definition.function

; HOC-wrapped variable declarations: \`const X = HOC((args) => { ... })\`.
; See TYPESCRIPT_QUERIES section above for the full rationale (issue #1166
; follow-up — covers forwardRef / memo / useCallback / useMemo / observer
; / debounce / user-defined HOC factories). Both \`const\` and \`var\` forms
; are mirrored so JS code that uses \`var\` (or transpiler output) gets the
; same attribution as the registry-primary path.
(lexical_declaration
(variable_declarator
name: (identifier) @name
value: (call_expression
arguments: (arguments
(arrow_function))))) @definition.function

(lexical_declaration
(variable_declarator
name: (identifier) @name
value: (call_expression
arguments: (arguments
(function_expression))))) @definition.function

(export_statement
declaration: (lexical_declaration
(variable_declarator
name: (identifier) @name
value: (call_expression
arguments: (arguments
(arrow_function)))))) @definition.function

(export_statement
declaration: (lexical_declaration
(variable_declarator
name: (identifier) @name
value: (call_expression
arguments: (arguments
(function_expression)))))) @definition.function

; \`var X = HOC(...)\` parity with registry-primary.
(variable_declaration
(variable_declarator
name: (identifier) @name
value: (call_expression
arguments: (arguments
(arrow_function))))) @definition.function

(variable_declaration
(variable_declarator
name: (identifier) @name
value: (call_expression
arguments: (arguments
(function_expression))))) @definition.function

; Variable/constant declarations (non-function values).
; Overlap with @definition.function patterns is handled by parse-worker dedup.
(lexical_declaration
@@ -0,0 +1,11 @@
// Library-wrapper / utility-HOC form: `debounce`, `throttle`, `once`,
// `memoize` — all share the same shape `const X = wrap(arrow)` and should
// produce a `Function:X` def named after the const.

import { doStuff } from './helpers';

const debounce = <F extends (...args: unknown[]) => unknown>(fn: F, _ms: number): F => fn;

export const debouncedSearch = debounce((query: string) => {
doStuff(query.length);
}, 250);
@@ -0,0 +1,29 @@
// shadcn/Radix UI canonical pattern: every primitive component is wrapped
// in `React.forwardRef` so callers can attach a ref. The arrow inside is
// where the actual rendering logic lives — every call inside its body
// (cn(), helper(), JSX renders) should attribute to `Button`, not File.
//
// Pre-fix: `Button` was a Variable; calls inside attributed to File.
// Post-fix: `Button` is a Function; calls attribute to `Button`.

import { helper, cn } from './helpers';

// Stand-in for React.forwardRef — defined locally so the outer call_expression
// is in-fixture and we don't need to mock the React types. Same shape as
// the real React.forwardRef<T, P>.
const React = {
forwardRef: <T, P>(render: (props: P, ref: T | null) => unknown) => render,
};

interface ButtonProps {
className?: string;
variant?: 'default' | 'ghost';
}

export const Button = React.forwardRef<HTMLButtonElement, ButtonProps>(
({ className, variant }, _ref) => {
const cls = cn('btn', variant ?? 'default', className ?? '');
helper(cls);
return null;
},
);
@@ -0,0 +1,11 @@
// Shared helpers used as call targets in HOC-wrapped fixture files. Each
// helper is a plain named arrow so we can assert exact `Caller → helper`
// edges without confounding cross-file resolution.

export const helper = (label: string): string => label.toUpperCase();

export const doStuff = (n: number): number => n + 1;

export const cn = (...classes: string[]): string => classes.filter(Boolean).join(' ');

export const fmt = (value: number): string => `[${value}]`;
@@ -0,0 +1,19 @@
// Bare-identifier HOC form: `const Card = memo((props) => { ... })`.
// Common when the HOC is named-imported (`import { memo } from 'react'`)
// rather than accessed via a namespace (`React.memo`). Both should work.

import { helper, cn } from './helpers';

const memo = <P,>(render: (props: P) => unknown) => render;

interface CardProps {
title: string;
className?: string;
}

export const Card = memo<CardProps>(({ title, className }) => {
const cls = cn('card', className ?? '');
helper(title);
helper(cls);
return null;
});
@@ -0,0 +1,30 @@
// Negative-control: bare statement-level HOC calls (NOT bound to a
// `const`/`let`/`var`) must NOT produce phantom Function nodes.
//
// This exercises the `parent.type === 'arguments'` branch in
// `tsExtractFunctionName`: the walk-up `arguments → call_expression →
// (program | expression_statement)` short-circuits because the parent
// of `call_expression` is NOT `variable_declarator`. The arrow stays
// anonymous and calls inside fall back to the enclosing module scope.

import { doStuff } from './helpers';

const useCallback = <F extends (...args: unknown[]) => unknown>(fn: F, _deps: unknown[]): F => fn;
const memo = <P,>(render: (props: P) => unknown) => render;

// Statement-level: result is discarded.
useCallback(() => {
doStuff(1);
}, []);

memo<{ x: number }>(({ x }) => {
doStuff(x);
});

// Function-arg position (passed to another call): also unbound.
const wrap = <T>(value: T): T => value;
wrap(
memo<{ y: number }>(({ y }) => {
doStuff(y);
}),
);
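The fixtures contrast two attribution outcomes: calls inside a named arrow attribute to that function, while calls inside an anonymous arrow fall back to the enclosing named ancestor or the file. `findEnclosingFunctionId` is not shown in this diff, so the sketch below only illustrates the fallback rule as the comments describe it. `ScopeFrame`, `resolveCaller`, the `Function:` / `File:` id strings, and the fixture file names are all hypothetical.

```typescript
// Hypothetical frame for one enclosing function; `funcName` is null when
// the name-extraction step found no static name for it.
interface ScopeFrame {
  funcName: string | null;
}

// Walk from the innermost enclosing function outward and return the first
// named one; if none is named, attribute the call to the file.
const resolveCaller = (enclosing: ScopeFrame[], filePath: string): string => {
  for (const frame of enclosing) {
    if (frame.funcName !== null) return `Function:${frame.funcName}`;
  }
  return `File:${filePath}`;
};

// `doStuff(1)` inside the bare `useCallback(() => { ... }, [])` statement.
const bareCaller = resolveCaller([{ funcName: null }], 'negative-control.ts');

// `helper(cls)` inside `export const Button = React.forwardRef(...)`.
const boundCaller = resolveCaller([{ funcName: 'Button' }], 'forward-ref.ts');

// An anonymous callback nested inside a named function resolves to the
// named ancestor rather than to the file.
const nestedCaller = resolveCaller(
  [{ funcName: null }, { funcName: 'Card' }],
  'memo.ts',
);
```

The negative-control fixture corresponds to the first case: no `variable_declarator` sits above the call, so no phantom `Function` node should appear and the calls stay at module level.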