
fix(group): detect httpx async consumers #1408

Merged
magyargergo merged 10 commits into abhigyanpatwari:main from juyua9:devb-gitnexus-httpx
May 11, 2026
juyua9 commented May 7, 2026

Contributor

What problem does this PR solve?

Partially fixes #1184 for the Python httpx.AsyncClient slice. The NestJS HttpService and project-specific axios wrapper slices remain follow-up work.

The source-scan HTTP extractor currently detects Python requests.* consumers, but misses async service clients built with httpx.AsyncClient. That means group contract extraction can omit real Python consumer call sites such as attribute-held clients (self._client.get(...)) and async with aliases (client.post(...)).

Type of change

  • Bug Fix (non-breaking change which fixes an issue)
  • New Feature (non-breaking change which adds functionality)
  • Documentation Update
  • Refactoring
  • Performance Improvement
  • Other (please describe):

Changes

  • Tracks httpx.AsyncClient(...) clients assigned to variables/attributes and async with ... as client aliases.
  • Emits HTTP consumer detections for .get/.post/.put/.delete/.patch(...) calls on those clients.
  • Emits consumer detections for .request("METHOD", "url") on those clients.
  • Adds focused regression coverage for attribute-held clients, generic request calls, async-with aliases, and same-name clients reused in unrelated Python scopes.

Validation

  • cd gitnexus-shared && npx tsc
  • cd gitnexus && npx tsc
  • cd gitnexus && npx vitest run test/unit/group/http-route-extractor.test.ts — 28 passed

github-actions Bot commented May 7, 2026

Contributor

Claude finished @magyargergo's task in 4m 34s


PR #1408 Review — fix(group): detect httpx async consumers


Review bar for this PR

  • httpx.AsyncClient(...) assignments (local vars and attributes) are tracked before any call detection runs.
  • async with httpx.AsyncClient() as client aliases are tracked before any call detection runs.
  • Calls on tracked clients emit consumer contracts; calls on untracked objects do not.
  • Existing FastAPI provider and requests.* consumer extraction is not modified.
  • Contract IDs are normalized via the existing normalizeConsumerPath → contractIdFor pipeline.
  • Tests cover the three target patterns (attribute-held, generic .request(), async-with alias).
  • No bidi/invisible Unicode in query strings, executable code, or test fixtures.

Current PR state

| Property | Value |
| --- | --- |
| Head SHA | 68bd835 (matches PR body) |
| Status | Open, not draft |
| Commits | 1 |
| Changed files | 2 (python.ts +133, http-route-extractor.test.ts +35) |
| CI (main suite) | In progress; no failures visible yet |
| Docker Build | In progress |
| CodeQL | ✅ Passed |
| Dependency Review | ✅ Passed |
| Gitleaks | ✅ Passed |
| PR Description Check | ✅ Passed |
| Typecheck (per PR body) | Claimed green (gitnexus-shared && gitnexus); cannot independently verify until CI completes |
| Focused test (per PR body) | Claimed 28 passed; cannot independently verify until CI completes |
| Issue #1184 scope | Partial: Python httpx slice only; NestJS HttpService and axios wrappers remain open |
| Hidden Unicode | Non-ASCII chars confirmed: U+2500 (box-drawing horizontal) and U+2014 (em dash), in section comments and describe strings only. No bidi controls; none in query strings, executable code, or test fixtures. Pre-existing style pattern. No block. |
| Vercel | Irrelevant to this extractor review. |

Understanding of the change

Before this PR, the Python HTTP plugin scanned for:

  • FastAPI @app.get("/path") → provider contracts
  • requests.get/post/...("url") → consumer contracts (direct module access)
  • requests.request("METHOD", "url") → generic consumer contracts

httpx.AsyncClient was never tracked because httpx is object-oriented — the client is constructed once and reused, so a simple object: (identifier) @module (#eq? @module "httpx") guard at the call site is insufficient. You must first identify which variables hold an httpx.AsyncClient and then filter call detections by that set.

This PR adds a two-phase approach:

  1. collectHttpxAsyncClients(tree) runs two compiled patterns against the full AST to build a Set<string> of client identities (by node text):
    • Assignment pattern: self._client = httpx.AsyncClient(...) → adds "self._client"
    • async with alias pattern: async with httpx.AsyncClient() as client: → adds "client"
  2. Two consumer scan patterns then match ALL method calls of the right shape, but skip any match where clientNode.text is not in the collected set — preventing false positives from arbitrary .get() / .request() calls.
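
The two phases can be sketched in a few lines. This is a minimal illustration only: it uses Python's standard-library ast module instead of the tree-sitter queries in python.ts, and the function names (collect_clients, find_consumers) and the sample source are hypothetical, not taken from the PR. It assumes Python 3.9+ for ast.unparse.

```python
import ast

HTTP_VERBS = {"get", "post", "put", "delete", "patch"}


def is_httpx_async_client_call(node):
    """True for a call written exactly as httpx.AsyncClient(...)."""
    return (
        isinstance(node, ast.Call)
        and isinstance(node.func, ast.Attribute)
        and node.func.attr == "AsyncClient"
        and isinstance(node.func.value, ast.Name)
        and node.func.value.id == "httpx"
    )


def collect_clients(tree):
    """Phase 1: collect the source text of every target bound to httpx.AsyncClient(...)."""
    clients = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Assign) and is_httpx_async_client_call(node.value):
            clients.update(ast.unparse(target) for target in node.targets)
        elif isinstance(node, ast.AsyncWith):
            for item in node.items:
                if is_httpx_async_client_call(item.context_expr) and item.optional_vars is not None:
                    clients.add(ast.unparse(item.optional_vars))
    return clients


def find_consumers(source):
    """Phase 2: emit (METHOD, path) only for verb calls whose receiver text is tracked."""
    tree = ast.parse(source)
    clients = collect_clients(tree)
    found = []
    for node in ast.walk(tree):
        if not (isinstance(node, ast.Call) and isinstance(node.func, ast.Attribute)):
            continue
        if node.func.attr not in HTTP_VERBS or not node.args:
            continue
        first = node.args[0]
        if not (isinstance(first, ast.Constant) and isinstance(first.value, str)):
            continue
        if ast.unparse(node.func.value) in clients:  # the tracked-client guard
            found.append((node.func.attr.upper(), first.value))
    return found


# Hypothetical fixture: two tracked clients plus one unrelated .get() call.
SAMPLE = '''
import httpx

class TopicClient:
    def __init__(self):
        self._client = httpx.AsyncClient()

    async def topic(self):
        return await self._client.get("/topic")

async def dup_check(data):
    async with httpx.AsyncClient() as client:
        await client.post("/questions/dup-check")
    return data.get("/nope")
'''

if __name__ == "__main__":
    print(sorted(find_consumers(SAMPLE)))
```

In this sketch data.get("/nope") is dropped because "data" never enters the tracked set, which is the same role the httpxAsyncClients.has(clientNode.text) check plays in the review's description.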

Out of scope from #1184: NestJS HttpService and project-specific axios wrappers remain unaddressed and issue #1184 should stay open.


Findings

[MEDIUM] Test framework assertion is semantically over-broad

  • Category: Test correctness
  • File: gitnexus/test/unit/group/http-route-extractor.test.ts:463
  • Issue: expect(consumers.every((c) => c.meta.framework === 'python-httpx')).toBe(true) asserts that every consumer in the entire fixture output has framework python-httpx. This is currently correct only because the fixture file contains no requests.* calls or other consumer patterns. If the fixture evolves or a new pattern in the Python plugin starts emitting on this code, the assertion becomes misleading rather than failing precisely.
  • Why it matters: A future change that accidentally emits a python-requests consumer for code in this fixture would produce a confusing test failure message, and the assertion provides no signal about which of the three target contracts is failing.
  • Recommended fix: Assert framework only on the specific expected contract IDs:
    const get = consumers.find((c) => c.contractId === 'http::GET::/topic');
    expect(get?.meta.framework).toBe('python-httpx');
  • Blocks merge: No — currently correct; semantic weakness only.

[LOW] No negative test coverage for false-positive prevention

  • Category: Test completeness
  • File: gitnexus/test/unit/group/http-route-extractor.test.ts
  • Issue: The implementation correctly guards call detections with httpxAsyncClients.has(clientNode.text), but no test proves that untracked objects (dict.get("/x"), some_service.request("POST", "/x")) do not emit httpx consumer contracts. The existing negative control test (does not emit consumers for unrelated object-literal calls) only covers JS/jQuery.
  • Why it matters: Without a negative test, a future refactor that removes or widens the guard would not be caught by the test suite.
  • Recommended fix: Add a Python fixture that contains data = {}; data.get("/nope") and service.request("POST", "/nope") alongside a valid httpx client, and assert no /nope consumer is emitted.
  • Blocks merge: No.

[LOW] .put, .delete, .patch verb coverage not tested

  • Category: Test completeness
  • File: gitnexus/test/unit/group/http-route-extractor.test.ts
  • Issue: The regex ^(get|post|put|delete|patch)$ claims to cover five verbs but the fixture only exercises get and post. If someone accidentally narrows the regex to ^(get|post)$, the test suite still passes.
  • Why it matters: Low risk, but the PR description claims all five verbs are detected.
  • Recommended fix: Add one await self._client.delete("/topic") or await client.patch("/x") case to the fixture.
  • Blocks merge: No.

[LOW] Direct-import and aliased-import httpx are undocumented false negatives

  • Category: Scope / precision
  • File: gitnexus/src/core/group/extractors/http-patterns/python.ts
  • Issue: The following common httpx patterns are silently missed:
    from httpx import AsyncClient
    client = AsyncClient()          # assignment RHS is not an attribute call
    
    import httpx as hx
    client = hx.AsyncClient()       # module identifier is "hx", not "httpx"
    Neither is mentioned in the PR or tracked as a follow-up issue.
  • Why it matters: Both are common in production code. The scoped PR is acceptable, but the limitation should be documented so it doesn't quietly mislead users of the group feature.
  • Recommended fix: Add a // NOTE: comment in python.ts near HTTPX_ASYNC_CLIENT_ASSIGN_PATTERNS noting these two gaps, and open a follow-up issue.
  • Blocks merge: No.

[INFO] CI main suite still in progress

  • Category: CI
  • Issue: The CI and Docker Build & Push workflows are still running at review time. Typecheck and focused test results cannot be independently confirmed — only the PR author's claim of "28 passed" is available.
  • Blocks merge: No — pending completion; not a code defect.

Python AST / Tree-sitter assessment

Assignment pattern (HTTPX_ASYNC_CLIENT_ASSIGN_PATTERNS, lines 84–100):

(assignment
  left: (_) @client
  right: (call
    function: (attribute
      object: (identifier) @module (#eq? @module "httpx")
      attribute: (identifier) @client_class (#eq? @client_class "AsyncClient"))))
  • (_) on the left correctly captures both plain identifiers (client) and attribute expressions (self._client, cls._http) — the node text is the stable source span.
  • The #eq? @module "httpx" predicate requires the explicit httpx.AsyncClient(...) form; something_else.AsyncClient() and httpx.Client() are both rejected. ✓
  • Typed annotations (client: httpx.AsyncClient = httpx.AsyncClient()) use a typed_assignment node and are not detected — a known, acceptable false negative for this scope.

async with alias pattern (HTTPX_ASYNC_CLIENT_WITH_ALIAS_PATTERNS, lines 103–119):

(as_pattern
  (call function: (attribute object: (identifier) @module ...))
  (as_pattern_target (identifier) @client))
  • In the Python tree-sitter grammar, async with expr as alias: parses as with_item → as_pattern → (value, as_pattern_target). The query structure matches this correctly.
  • Captures only simple identifiers as aliases (not tuple unpacking like as (a, b)) — correct for the common case.

Text identity between assign and call captures: For self._client = httpx.AsyncClient(...) the @client capture is the attribute node with .text == "self._client". For await self._client.get("/topic") the call's object: (_) @client is also the attribute node self._client with the same .text. The Set membership test is correct. ✓

Non-detected patterns (false negatives):

| Pattern | Reason |
| --- | --- |
| from httpx import AsyncClient; client = AsyncClient() | RHS is call(identifier), not call(attribute) |
| import httpx as hx; client = hx.AsyncClient() | Module text is "hx", fails #eq? "httpx" |
| c = self._client; await c.get("/x") | Alias chain: c not in tracked set |
| return httpx.AsyncClient().get("/x") | Inline call, no assignment captured |
| client = make_httpx_client() | Factory function, RHS not httpx.AsyncClient(...) |

All are acceptable false negatives for a targeted PR; the more common patterns (attribute-held and async-with alias) are covered.


Consumer detection assessment

Verb pattern (HTTPX_ASYNC_CLIENT_VERB_PATTERNS, lines 122–137):

(call
  function: (attribute object: (_) @client attribute: (identifier) @method (#match? ...))
  arguments: (argument_list . (string) @path))
  • The argument_list . anchor (first-child positional anchor in tree-sitter) requires the path string to be the first positional argument. This works for client.get("/x") and client.post("/x", json=data) (keyword args follow the positional first arg). It does NOT work for pure keyword-arg form client.get(url="/x") — acceptable out-of-scope limitation.
  • method.text.toUpperCase() produces the canonical HTTP method. ✓
  • The httpxAsyncClients.has(clientNode.text) guard prevents any untracked object from producing a detection — dict.get("/x"), response.get("/foo"), service.request("POST", "/x") are all rejected. ✓
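
The first-positional-argument rule can be illustrated in isolation. The sketch below is an approximation using Python's ast module rather than the tree-sitter anchor itself; first_positional_path is a hypothetical helper, and it deliberately ignores the tracked-client guard so that only the argument-shape rule is visible.

```python
import ast


def first_positional_path(call_source):
    """Return the first positional argument if it is a plain string literal, else None."""
    call = ast.parse(call_source, mode="eval").body
    if not isinstance(call, ast.Call) or not call.args:
        # Keyword-only calls have an empty positional list and are skipped.
        return None
    first = call.args[0]
    if isinstance(first, ast.Constant) and isinstance(first.value, str):
        return first.value
    return None


if __name__ == "__main__":
    for src in (
        'client.get("/x")',
        'client.post("/x", json=data)',
        'client.get(url="/x")',
        'client.get(path)',
    ):
        print(src, "->", first_positional_path(src))
```

The first two calls yield "/x"; the keyword-only and non-literal forms yield None, matching the out-of-scope cases noted above.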

Generic pattern (HTTPX_ASYNC_CLIENT_GENERIC_PATTERNS, lines 140–155):

(call
  function: (attribute object: (_) @client attribute: ... (#eq? "request"))
  arguments: (argument_list . (string) @http_method (string) @path))
  • Requires both method and path to be positional string literals. unquoteLiteral strips quotes; methodRaw.toUpperCase() normalizes. Non-literal method/path → both unquoteLiteral results are null → skipped. ✓
  • Keyword form client.request(method="POST", url="/x") is not supported — acceptable.
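
As a rough illustration of the generic form, the following sketch accepts a .request(...) call only when its first two positional arguments are both string literals. It again uses Python's ast module as a stand-in for the tree-sitter query; generic_request is a hypothetical name, and the tracked-client guard is omitted to keep the example short.

```python
import ast


def _string_literal(node):
    """Return the value of a plain string literal node, else None."""
    if isinstance(node, ast.Constant) and isinstance(node.value, str):
        return node.value
    return None


def generic_request(call_source):
    """Return (METHOD, path) for x.request("METHOD", "path"), else None."""
    call = ast.parse(call_source, mode="eval").body
    if not (isinstance(call, ast.Call) and isinstance(call.func, ast.Attribute)):
        return None
    if call.func.attr != "request" or len(call.args) < 2:
        return None
    method = _string_literal(call.args[0])
    path = _string_literal(call.args[1])
    if method is None or path is None:
        # A non-literal method or path cannot be resolved statically, so skip it.
        return None
    return (method.upper(), path)


if __name__ == "__main__":
    print(generic_request('client.request("post", "/questions/import")'))
    print(generic_request('client.request(method="POST", url="/x")'))
    print(generic_request('client.request(verb, "/x")'))
```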

Static-analysis precision / recall assessment

Expected true positives — all correctly detected:

self._client = httpx.AsyncClient()
await self._client.get("/topic")        # ✓

async with httpx.AsyncClient() as client:
    await client.post("/questions/dup-check")  # ✓

await self._client.request("POST", "/questions/import")  # ✓

False positives — correctly prevented:

data = {}
data.get("/x")                          # "data" not in tracked set → skip ✓

service.request("POST", "/x")          # "service" not in tracked set → skip ✓

client = something_else.AsyncClient()  # module != "httpx" → not tracked ✓
await client.get("/x")                 # "client" not in set → skip ✓

Known false negatives (acceptable scope):

  • Direct import: from httpx import AsyncClient + bare AsyncClient()
  • Aliased import: import httpx as hx + hx.AsyncClient()
  • Reassigned alias: c = self._client; await c.get("/x")
  • Keyword-only args: client.get(url="/x")
  • Typed annotation: client: httpx.AsyncClient = httpx.AsyncClient()

All are edge cases relative to the issue's target patterns. A follow-up issue is recommended.


HTTP contract / group matching assessment

Emitted HttpDetection objects:

  • role: 'consumer'
  • framework: 'python-httpx' ✓ (distinct from 'python-requests', 'fastapi')
  • method: uppercased from capture ✓
  • path: raw literal → fed to orchestrator's normalizeConsumerPath
    • Absolute URL https://svc.local/questions/duplicate-check → new URL(...).pathname → /questions/duplicate-check → http::POST::/questions/duplicate-check
    • Relative path /topic → http::GET::/topic
  • name: null — consistent with all other consumer plugins ✓
  • confidence: 0.7 — consistent with all other source-scan consumer plugins ✓

Group matching interaction: The orchestrator uses method + normalized path for contract ID generation, not framework. python-httpx consumers will correctly match providers from FastAPI, Spring, NestJS, etc. when method + path align. Framework field is stored in meta only. Confidence 0.7 does not suppress matching. dedupeContracts uses contractId|filePath|name as key — duplicate calls in the same file are de-duplicated. ✓
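
The contract-ID and de-duplication behavior described above can be approximated as follows. This is a sketch under stated assumptions: the real normalizeConsumerPath, contractIdFor, and dedupeContracts are TypeScript functions in the orchestrator and may do more than shown here; the Python names, the dictionary shape, and the file path consumer.py are stand-ins chosen for illustration.

```python
from urllib.parse import urlsplit


def normalize_consumer_path(raw):
    """Reduce an absolute URL to its path component; leave a relative path unchanged."""
    parts = urlsplit(raw)
    if parts.scheme and parts.netloc:
        return parts.path or "/"
    return raw


def contract_id_for(method, raw_path):
    """Build an ID of the form http::METHOD::/path. The framework is not part of it."""
    return f"http::{method.upper()}::{normalize_consumer_path(raw_path)}"


def dedupe_contracts(contracts):
    """Keep the first contract seen for each contractId|filePath|name key."""
    seen = {}
    for contract in contracts:
        key = f"{contract['contractId']}|{contract['filePath']}|{contract['name']}"
        seen.setdefault(key, contract)
    return list(seen.values())


if __name__ == "__main__":
    cid = contract_id_for("post", "https://svc.local/questions/duplicate-check")
    # Two identical detections from the same (hypothetical) file collapse to one.
    detections = [
        {"contractId": cid, "filePath": "consumer.py", "name": None},
        {"contractId": cid, "filePath": "consumer.py", "name": None},
    ]
    print(cid, len(dedupe_contracts(detections)))
```

Because the ID is built from method and normalized path only, a consumer detected through httpx and a provider detected through any framework share an ID whenever those two parts agree.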

No schema changes, no new contract types, no provider contracts emitted. ✓


Test assessment

The new test (http-route-extractor.test.ts:431–464) covers:

  • ✅ Attribute-held client: self._client = httpx.AsyncClient() + await self._client.get("/topic")
  • ✅ Generic .request(): await self._client.request("POST", "/questions/import")
  • ✅ async with alias: async with httpx.AsyncClient() as client: + await client.post("https://svc.local/questions/duplicate-check")
  • ✅ Absolute URL normalization: https://svc.local/questions/duplicate-check → http::POST::/questions/duplicate-check
  • ✅ Framework metadata assertion (currently correct, though over-broad — see Finding 1)
  • ✅ Three expected contract IDs verified

Would tests fail before this PR? Yes — the three expect(...).toBeDefined() assertions would all fail on the unpatched code since no httpx consumer patterns existed.

Missing test coverage:

  • ❌ Local variable client (only attribute self._client is tested; client = httpx.AsyncClient() is not a separate case in the test)
  • ❌ Negative: arbitrary .get() / .request() on untracked objects
  • ❌ .put, .delete, .patch verb coverage
  • ❌ Non-literal method/path → skipped (document this guarantee)
  • ❌ Additional keyword args after the path: client.post("/x", json=payload) still matches (good behavior, untested)
  • ❌ from httpx import AsyncClient → correctly undetected (a negative case worth documenting)

Scope / issue hygiene assessment


Hidden Unicode / hygiene assessment

grep -Pn "[\x{202A}-\x{202E}\x{2066}-\x{2069}]" python.ts test.ts
→ No bidi chars found ✓

grep -Pn "[^\x00-\x7F]" python.ts test.ts
→ Matches: section comment separator lines only (─ U+2500, — U+2014)

All non-ASCII matches are:

  • U+2500 (BOX DRAWINGS LIGHT HORIZONTAL) in section comment separators on lines 27, 46, 64, 83, 102, 121, 139 of python.ts
  • U+2014 (EM DASH) in describe() block strings in the test file

These are consistent with the existing codebase style (pre-existing sections in python.ts use the same separator), are in comments and string labels only, and are entirely absent from tree-sitter query strings, regex patterns, and test fixture Python source. No bidi control characters. No invisible whitespace. No block.

The GitHub "hidden Unicode" warning triggered by these decorative separators is a false alarm relative to the security concern.
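
For readers without a PCRE-enabled grep, the same two checks can be approximated in Python. This is a sketch, not the reviewer's tooling: scan_text is a hypothetical helper, and the code-point ranges are the ones from the grep patterns above (U+202A to U+202E and U+2066 to U+2069).

```python
# Bidirectional control characters covered by the first grep pattern.
BIDI_CONTROLS = {chr(cp) for cp in (*range(0x202A, 0x202F), *range(0x2066, 0x206A))}


def scan_text(text):
    """Return (bidi_hits, non_ascii_hits), each a list of (line_number, 'U+XXXX')."""
    bidi, non_ascii = [], []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for ch in line:
            label = f"U+{ord(ch):04X}"
            if ch in BIDI_CONTROLS:
                bidi.append((lineno, label))
            if ord(ch) > 0x7F:
                # Like the second grep, this also reports bidi controls.
                non_ascii.append((lineno, label))
    return bidi, non_ascii


if __name__ == "__main__":
    # Escapes keep this sample itself pure ASCII.
    sample = "# \u2500\u2500 section\nx = 1  # \u2014 note\n"
    bidi_hits, other_hits = scan_text(sample)
    print("bidi:", bidi_hits)
    print("non-ASCII:", other_hits)
```

A file passes the security-relevant check when the first list is empty, regardless of how many decorative characters appear in the second.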


Performance / maintainability assessment

  • collectHttpxAsyncClients(tree) runs 2 compiled pattern bundles in a single pass — O(AST match count). No full-file string search.
  • The 4 new compiled pattern bundles (ASSIGN, WITH_ALIAS, VERB, GENERIC) are compiled once at module load time via compilePatterns — zero per-file compilation overhead. ✓
  • Consumer scan is O(call nodes) per file — well within normal tree-sitter performance bounds.
  • Pattern names (HTTPX_ASYNC_CLIENT_ASSIGN_PATTERNS, etc.) are clear and follow the existing naming convention. ✓
  • Section comments match the style of existing sections in the same file. ✓
  • Future extensions (sync httpx.Client, direct imports, aliased imports) can be added by appending patterns to collectHttpxAsyncClients without changing the consumer scan loops. Architecture supports clean extension. ✓

Final verdict

production-ready with minor follow-ups

The core implementation is architecturally sound. The two-phase tracked-client approach correctly prevents false positives from arbitrary .get() / .request() calls — the httpxAsyncClients.has(clientNode.text) guard is the right mechanism and it works correctly for both attribute-held clients and async with aliases. Path and method normalization reuse the existing normalizeConsumerPath pipeline unchanged. Contract IDs, confidence, and framework metadata are consistent with the rest of the consumer plugin surface.

The outstanding items are test coverage gaps (negative cases, .put/.delete/.patch verbs, framework assertion scope) and undocumented false negatives for direct and aliased imports. None of these constitute a correctness defect in the stated scope — the patterns the PR claims to detect are detected correctly, and the patterns it should not detect are correctly rejected. Before merge, it is worth confirming CI completes green, and the maintainers should ensure issue #1184 is not auto-closed since the NestJS and axios slices remain open.


Comment thread gitnexus/test/unit/group/http-route-extractor.test.ts Fixed
@magyargergo

Collaborator

@juyua9 please look into these findings 🙏

juyua9 commented May 7, 2026

Contributor Author

Thanks @magyargergo — I looked into the review findings and pushed 522a6b2.

Follow-up changes:

  • narrowed the framework assertion to the specific expected httpx contract IDs instead of checking every emitted consumer globally;
  • added negative coverage proving untracked .get() / .request() calls do not emit /nope consumers;
  • added .delete() verb coverage so the five-verb pattern is not only tested through GET/POST;
  • documented the intentionally scoped false negatives for direct-import and aliased-import httpx.AsyncClient construction as follow-up territory.

Validation:

  • npx prettier --check gitnexus/src/core/group/extractors/http-patterns/python.ts gitnexus/test/unit/group/http-route-extractor.test.ts
  • cd gitnexus-shared && npx tsc
  • cd gitnexus && npx tsc
  • cd gitnexus && npx vitest run test/unit/group/http-route-extractor.test.ts — 28 passed

Comment thread gitnexus/test/unit/group/http-route-extractor.test.ts Fixed
juyua9 commented May 8, 2026

Contributor Author

Thanks — I pushed feac264 to address the CodeQL temporary-file finding.

Change:

  • Replaced the predictable path.join(os.tmpdir(), ...) test directory with fs.mkdtempSync(path.join(os.tmpdir(), 'gitnexus-http-extract-')), so each test run gets a securely created unique temp directory.

Validation:

  • cd gitnexus && ./node_modules/.bin/vitest run test/unit/group/http-route-extractor.test.ts — 28 passed
  • cd gitnexus && ./node_modules/.bin/tsc
  • cd gitnexus-shared && ./node_modules/.bin/tsc
  • conflict-marker scan clean

github-actions Bot commented May 8, 2026

Contributor

Claude finished @magyargergo's task in 5m 37s


PR #1408 Review — fix(group): detect httpx async consumers


Review bar for this PR

  • httpx.AsyncClient(...) assignments (local vars and attributes) are tracked into a per-file set before any call detection runs.
  • async with httpx.AsyncClient() as client aliases are tracked into the same set.
  • Calls on tracked clients emit consumer contracts; untracked objects that happen to call .get() or .request() do not emit.
  • Existing FastAPI provider and requests.* consumer extraction is unchanged.
  • Contract IDs are produced via the existing normalizeConsumerPath → contractIdFor pipeline.
  • Tests cover positive cases, negative false-positive control, absolute URL normalization, and existing regressions.
  • No bidi/invisible Unicode in query strings, executable code, or test fixtures.

Current PR state

| Property | Value |
| --- | --- |
| Latest head SHA | 532cde6 (merge-from-main commit); actual PR commits are 68bd835, 522a6b2, feac264 |
| Status | Open, not draft |
| Commits | 4 (3 PR + 1 merge-from-main) |
| Changed files (PR only) | 2: python.ts +135/−0, http-route-extractor.test.ts +51/−2 |
| Net diff | +186/−2 |
| CI: quality/format | ✅ passed |
| CI: quality/lint | ✅ passed |
| CI: quality/typecheck | ✅ passed |
| CI: scope-parity (python, go, typescript, csharp) | ✅ passed |
| CI: tests/ubuntu/coverage | ⏳ in progress |
| CI: tests/windows-latest | ⏳ in progress |
| CI: tests/macos-latest | ⏳ in progress |
| CodeQL | ✅ passed |
| Gitleaks | ✅ passed |
| Dependency Review | ✅ passed |
| tree-sitter-python version | 0.23.4 (pinned in package.json) |
| Hidden Unicode (bidi controls) | None found in changed files |
| Non-ASCII chars | U+2500 in section comments, U+2014 in test describe() strings; consistent with pre-existing codebase style; absent from query strings, regexes, fixtures |
| Vercel | Irrelevant to this extractor change |
| PR body | Fixes #1184 ⚠️ Will auto-close issue #1184; the NestJS HttpService and axios wrapper slices remain open |

Branch hygiene assessment

Verdict: merge-from-main commit present but harmless and merge-safe.

The merge commit 532cde6 (authored by the maintainer) brings 116 files and 5099 insertions from main into the PR branch. These files are already in main's history — they would not appear as new changes when this PR merges. The actual PR diff is exactly the two focused TypeScript files. No unrelated package version, lockfile, workflow, or generated file changes originate from the PR commits themselves. The merge commit is standard branch-synchronization practice and does not pollute the PR's semantic content.


Understanding of the change

Before this PR, the Python HTTP plugin scanned for requests.get(...), requests.post(...), etc. by requiring the receiver to be the literal identifier requests. This works for direct-module-access patterns but fails for object-oriented HTTP clients like httpx.AsyncClient, where the client is constructed once and stored in a variable or attribute, then reused for multiple requests.

Group matching needs consumer contracts in contracts.json before any method/path matching logic can link providers to consumers. If a Python service uses httpx.AsyncClient and no consumer contracts are emitted, the group feature will report zero cross-service matches for those endpoints — regardless of how good the matching algorithm is.

Why attribute-held clients matter: Production service wrappers typically look like self._http = httpx.AsyncClient(base_url=...) in __init__, with methods like async def get_user(self): return await self._http.get("/users"). Without tracking the assignment, there's no way to know that self._http.get(...) is an HTTP consumer call.

Why async-with aliases matter: One-off request patterns (async with httpx.AsyncClient() as client: await client.post(...)) are common in integration code and tests. The alias variable (client) only exists for the scope of the with block but must still be tracked to emit the consumer contract.

Why .request("METHOD", "url") is separate: The verb pattern requires knowing the HTTP method from the attribute name (.get, .post, etc.). The generic .request(method, url) form passes method as a string argument, requiring a different query that captures two positional string arguments and reads the method from the first one.

Why false positives from arbitrary .get() are dangerous: Python has many objects with .get() methods: dict.get(), configparser.get(), os.environ.get(), response.headers.get(). Without the tracked-client guard, every .get("/some/path") call in a Python file would emit a spurious HTTP consumer contract, polluting the group graph with phantom cross-service dependencies.

Deliberately out of scope: from httpx import AsyncClient + bare AsyncClient(), import httpx as hx + hx.AsyncClient(), aiohttp.ClientSession, NestJS HttpService, project-specific axios wrappers.


Findings

[MEDIUM] File-global client tracking — no test for same-name reuse across scopes

  • Category: Static-analysis precision / Test completeness
  • Files: python.ts:159–173, http-route-extractor.test.ts
  • Issue: collectHttpxAsyncClients(tree) aggregates tracked names into a single file-level Set<string>. If the same identifier (e.g., client) is used in one class as an httpx.AsyncClient and independently in another class or function for something unrelated, calls on the second client object will be falsely emitted as HTTP consumers. The test fixture does not prove this doesn't happen — it only covers data and service as non-tracked objects (which differ in name from any tracked client).
  • Concrete example not tested:
    class HttpService:
        def __init__(self):
            self.client = httpx.AsyncClient()        # "self.client" tracked
    
    class ConfigStore:
        def lookup(self, key: str) -> str:
            client = self._get_cache_client()        # "client" NOT tracked
            return client.get(key)                   # false positive if "client" was tracked elsewhere
    However, note: self.client ≠ client as raw text, so this specific example is fine. The real risk is when the exact same text string is tracked and reused, e.g., two separate functions both use a local variable named client, one as httpx, one as something else.
  • Why it matters: The 0.7 confidence signal and the NOTE comment document the scope boundaries, but the absence of a same-name scope-leakage test means a future refactor that accidentally widens matching would not be caught.
  • Recommended fix: Add one negative fixture where client = httpx.AsyncClient() appears in function A, and client = some_cache() + client.get("key") appears in function B — assert that no consumer is emitted for "key". Alternatively, add a // PRECISION NOTE: to collectHttpxAsyncClients stating that tracking is file-global and same-name collisions across scopes are accepted false positives at this confidence level.
  • Blocks merge: No — the existing precision level (0.7 confidence) explicitly accommodates approximations, and the NOTE comment at lines 83–85 already acknowledges the scoped limitations. This is a documentation/test gap rather than a correctness defect for the stated use cases.
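
The same-name collision can be reproduced with a small experiment. The sketch below contrasts a file-global tracked set with a deliberately naive per-function set, using Python's ast module rather than the tree-sitter queries in python.ts. All names and the sample source are hypothetical, only .get() calls are considered, and the per-function variant is a simplification that would miss a client assigned in __init__ and used in another method.

```python
import ast


def _is_httpx_ctor(node):
    """True for a call written exactly as httpx.AsyncClient(...)."""
    return (
        isinstance(node, ast.Call)
        and isinstance(node.func, ast.Attribute)
        and node.func.attr == "AsyncClient"
        and isinstance(node.func.value, ast.Name)
        and node.func.value.id == "httpx"
    )


def tracked_names(nodes):
    """Source text of assignment targets bound to httpx.AsyncClient(...)."""
    names = set()
    for node in nodes:
        if isinstance(node, ast.Assign) and _is_httpx_ctor(node.value):
            names.update(ast.unparse(target) for target in node.targets)
    return names


def get_paths(nodes, clients):
    """First string argument of every .get(...) call on a tracked receiver."""
    paths = []
    for node in nodes:
        if not (isinstance(node, ast.Call) and isinstance(node.func, ast.Attribute)):
            continue
        if node.func.attr != "get" or not node.args:
            continue
        first = node.args[0]
        if not (isinstance(first, ast.Constant) and isinstance(first.value, str)):
            continue
        if ast.unparse(node.func.value) in clients:
            paths.append(first.value)
    return paths


def file_global_paths(source):
    """One tracked set for the whole file: same-name variables collide."""
    nodes = list(ast.walk(ast.parse(source)))
    return get_paths(nodes, tracked_names(nodes))


def per_function_paths(source):
    """One tracked set per function body: a naive form of scope awareness."""
    found = []
    for fn in ast.walk(ast.parse(source)):
        if isinstance(fn, (ast.FunctionDef, ast.AsyncFunctionDef)):
            nodes = list(ast.walk(fn))
            found.extend(get_paths(nodes, tracked_names(nodes)))
    return found


SOURCE = '''
import httpx

def fetch_topic():
    client = httpx.AsyncClient()
    return client.get("/topic")

def read_cache():
    client = make_cache()
    return client.get("key")
'''

if __name__ == "__main__":
    print("file-global:", sorted(file_global_paths(SOURCE)))
    print("per-function:", per_function_paths(SOURCE))
```

With the file-global set, the cache lookup "key" is reported alongside "/topic"; with per-function sets only "/topic" survives, which is the outcome the recommended negative fixture would assert.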

[LOW] PR body Fixes #1184 will auto-close issue with unresolved slices

  • Category: Issue / scope hygiene
  • Files: PR description
  • Issue: The PR body uses Fixes https://github.com/abhigyanpatwari/GitNexus/issues/1184, which GitHub will auto-close on merge. Issue #1184 ("[bug][group] HTTP consumer extractor misses httpx.AsyncClient, NestJS HttpService, and project axios wrappers") covers three slices: Python httpx.AsyncClient (this PR), NestJS HttpService, and project-specific axios wrappers. The latter two remain unaddressed.
  • Why it matters: Auto-closing the issue loses tracking of the remaining work. Future contributors won't discover the gaps unless they search the git history.
  • Recommended fix: Change to Partially fixes #1184 or Closes #1184 (httpx slice only; see NestJS HttpService and axios wrapper slices for follow-up) and explicitly open sub-issues or a follow-up issue for the remaining slices.
  • Blocks merge: No — this is a process concern, not a code correctness concern.

[LOW] CI tests still in progress at review time

  • Category: CI / release readiness
  • Issue: tests/ubuntu/coverage, tests/windows-latest, and tests/macos-latest are in-progress. Typecheck, format, lint, and scope-parity have all passed. The PR author claims 28 tests pass locally.
  • Blocks merge: No — pending completion; not a code defect. Do not merge until the three platform test runs complete green.

Python tree-sitter query assessment

Assignment query (HTTPX_ASYNC_CLIENT_ASSIGN_PATTERNS):

(assignment
  left: (_) @client
  right: (call
    function: (attribute
      object: (identifier) @module (#eq? @module "httpx")
      attribute: (identifier) @client_class (#eq? @client_class "AsyncClient"))))

The (_) wildcard on left correctly captures both plain identifiers (client) and attribute expressions (self._client, cls._http) — the node text is the stable source span. The #eq? @module "httpx" predicate correctly rejects something_else.AsyncClient() and httpx.Client(). Typed annotations (client: httpx.AsyncClient = httpx.AsyncClient()) use typed_assignment and are silently missed — acceptable out-of-scope limitation. ✓

async-with alias query (HTTPX_ASYNC_CLIENT_WITH_ALIAS_PATTERNS):

(as_pattern
  (call function: (attribute object: (identifier) @module ...))
  (as_pattern_target (identifier) @client))

In tree-sitter-python 0.23.x, async with expr as alias: parses as with_item → as_pattern → (value, as_pattern_target). The query structure matches this correctly. Only simple identifier aliases are captured (not tuple unpacking as (a, b)) — correct for the common case. The functional test (claiming 28 passes) confirms this query matches at runtime against the pinned grammar. ✓

Verb query (HTTPX_ASYNC_CLIENT_VERB_PATTERNS):
The argument_list . positional anchor correctly requires the path to be the first positional argument. This handles client.get("/x", params={...}) (keyword args follow the positional first arg). Pure keyword-only form client.get(url="/x") is intentionally not covered. The regex ^(get|post|put|delete|patch)$ is correctly scoped to the five standard verbs. ✓

Generic query (HTTPX_ASYNC_CLIENT_GENERIC_PATTERNS):
Requires both method and path to be positional string literals. unquoteLiteral strips quotes; non-literal method/path → both return null → skipped safely. ✓

Awaited calls: Under await self._client.get(...), tree-sitter wraps the call in an await expression, but the inner call node structure is unchanged. Both HTTPX_ASYNC_CLIENT_VERB_PATTERNS and HTTPX_ASYNC_CLIENT_GENERIC_PATTERNS match the call node regardless of wrapping. ✓

Non-regression: FASTAPI_PATTERNS, REQUESTS_VERB_PATTERNS, and REQUESTS_GENERIC_PATTERNS are untouched. The new patterns are additive. requests.get("/path") could theoretically match HTTPX_ASYNC_CLIENT_VERB_PATTERNS (since requests has the right call shape), but httpxAsyncClients.has("requests") is false unless the file contains requests = httpx.AsyncClient() — pathological and irrelevant. ✓

Query compilation: All four new pattern bundles are compiled via compilePatterns at module load time. A malformed query would throw immediately on import — the existing test suite importing the plugin is sufficient to detect this. Since typecheck passes and the tests are reported to pass, compilation succeeds against tree-sitter-python 0.23.4. ✓


Client tracking / static-analysis precision assessment

collectHttpxAsyncClients(tree) runs two pattern passes over the full AST and accumulates client identities by raw node text. The file-global set design is a deliberate approximation:

Why it's acceptable for this use case:

Known approximations (all acceptable):

Scenario | Behavior
client in function A (httpx) / client.get() in function B (dict) | False positive if variable text matches exactly
self._client in class A (httpx) / self._client.get() in class B | Controlled false positive: self._client text is identical; same-class reuse is rare
c = self._client; await c.get("/x") | False negative: alias chain not tracked
from httpx import AsyncClient; client = AsyncClient() | False negative: documented in NOTE comment
import httpx as hx; client = hx.AsyncClient() | False negative: documented in NOTE comment

False-positive control proven in tests:

  • data.get("/nope") → data not in tracked set → no emission ✓
  • service.request("POST", "/nope") → service not in tracked set → no emission ✓

Gap: No test proves that same-name reuse across scopes within the same file is controlled. This is the main open precision question — acceptable for the current DoD bar but worth tracking.
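
The file-global approximation and the gap noted above can be reproduced with a small Python sketch. It uses the standard ast module instead of tree-sitter, and the fixture and helper names (file_global_consumers, is_async_client_ctor) are illustrative assumptions, not code from the PR.

```python
import ast

VERBS = {"get", "post", "put", "delete", "patch"}

SRC = '''
import httpx

async def check_duplicate():
    client = httpx.AsyncClient()
    await client.post("/questions/duplicate-check")

def unrelated(cache):
    client = cache  # dict-like object, not an httpx client
    return client.get("/ignored-same-name")
'''

def is_async_client_ctor(node):
    """True for a literal `httpx.AsyncClient(...)` call expression."""
    return (
        isinstance(node, ast.Call)
        and isinstance(node.func, ast.Attribute)
        and node.func.attr == "AsyncClient"
        and isinstance(node.func.value, ast.Name)
        and node.func.value.id == "httpx"
    )

def file_global_consumers(src):
    tree = ast.parse(src)
    # Phase 1: collect client names by raw text, ignoring scope entirely.
    tracked = {
        ast.unparse(target)
        for node in ast.walk(tree)
        if isinstance(node, ast.Assign) and is_async_client_ctor(node.value)
        for target in node.targets
    }
    # Phase 2: keep verb calls whose receiver text is in the tracked set.
    hits = []
    for node in ast.walk(tree):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Attribute)
            and node.func.attr in VERBS
            and ast.unparse(node.func.value) in tracked
            and node.args
            and isinstance(node.args[0], ast.Constant)
            and isinstance(node.args[0].value, str)
        ):
            hits.append((node.func.attr.upper(), node.args[0].value))
    return sorted(hits)

print(file_global_consumers(SRC))
```

With a text-only set, the unrelated function's client.get(...) is reported alongside the real POST, which is exactly the same-name leakage a scope-collision test would pin down.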


HTTP contract semantics assessment

Emitted HttpDetection objects pass through extractConsumersSourceScan:

  • role: 'consumer'
  • framework: 'python-httpx' ✓ (distinct from 'python-requests', 'fastapi'; stored in meta only)
  • method: uppercased from capture ✓
  • path: raw literal → normalizeConsumerPath in orchestrator:
    • Absolute https://svc.local/questions/duplicate-check → new URL(...).pathname → /questions/duplicate-check → http::POST::/questions/duplicate-check
    • Relative /topic → http::GET::/topic
  • name: null — consistent with all consumer plugins ✓
  • confidence: 0.7 — consistent with source-scan consumer plugins ✓
  • dedupeContracts uses contractId|filePath|name key — duplicate calls in the same file are de-duplicated ✓
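
The path trace above can be approximated in a few lines of Python. This is a sketch, not the orchestrator's TypeScript: normalize_consumer_path and contract_id are hypothetical stand-ins for normalizeConsumerPath and the contract-ID step, and the real pipeline may apply further normalization that is not shown in this thread.

```python
from urllib.parse import urlsplit

def normalize_consumer_path(raw):
    """Reduce an absolute URL to its path; leave relative paths untouched."""
    parts = urlsplit(raw)
    if parts.scheme and parts.netloc:
        return parts.path or "/"
    return raw

def contract_id(method, raw_path):
    """Build an ID in the http::METHOD::/path shape quoted in the review."""
    return f"http::{method.upper()}::{normalize_consumer_path(raw_path)}"

print(contract_id("post", "https://svc.local/questions/duplicate-check"))
print(contract_id("get", "/topic"))
```

Because the host is discarded, an absolute and a relative call to the same path collapse to the same ID, which is what lets consumers match providers on method plus path.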

Group matching compatibility: The orchestrator uses method + normalized path for contract ID generation; framework is meta only. python-httpx consumers will correctly match FastAPI, Spring, NestJS, Express providers when method + path align. No schema changes, no new contract types. ✓

base_url from httpx.AsyncClient(base_url=...) is not merged into consumer paths. Relative paths like "/topic" remain relative. This is consistent with existing extractor policy — base URL merging is explicitly not in scope and is the correct behavior since the group matching normalizes on path segments, not full URLs.


Scope / issue alignment assessment

The PR title, body, and code correctly scope to the Python httpx.AsyncClient slice only. The NOTE comment at lines 83–85 explicitly documents the unsupported forms:

// NOTE: This targeted detector only tracks explicit `httpx.AsyncClient(...)`
// construction. Direct imports (`from httpx import AsyncClient`) and module
// aliases (`import httpx as hx`) are intentionally left for a follow-up.

The PR body correctly states "Fixes... for the Python httpx.AsyncClient slice." However, the Fixes #1184 keyword will trigger GitHub's auto-close behavior regardless of the qualifying text. The maintainers should:

  1. Edit the PR body to use Partially fixes #1184, or
  2. Reopen [bug][group] HTTP consumer extractor misses httpx.AsyncClient, NestJS HttpService, and project axios wrappers #1184 after merge, or
  3. Create sub-issues for NestJS HttpService and axios wrapper slices before merging.

No claim of broader httpx support (sync httpx.Client, direct imports, aliased imports) appears in the code or docs. ✓


Test assessment

Positive cases covered:

  • ✅ Attribute-held client: self._client = httpx.AsyncClient() + await self._client.get("/topic")
  • ✅ Generic .request(): await self._client.request("POST", "/questions/import")
  • ✅ .delete() verb: await self._client.delete("/topic")
  • ✅ async with alias: async with httpx.AsyncClient() as client: + await client.post(...)
  • ✅ Absolute URL normalization: https://svc.local/questions/duplicate-check → http::POST::/questions/duplicate-check
  • ✅ Framework metadata assertion: per-contractId (not over-broad global assertion)

Negative cases covered:

  • ✅ data.get("/nope"): untracked object, no emission
  • ✅ service.request("POST", "/nope"): untracked object, no emission

Regression coverage:

  • ✅ FastAPI @app.get("/users") provider extraction tested (lines 649–675)
  • ✅ requests.post(...) consumer extraction tested (lines 411–430)
  • ✅ Existing 27 tests (27 + 1 new = 28 total) cover the full range of existing providers/consumers

Gaps (non-blocking):

  • ❌ No test for same-name local variable reused in a different function (file-global scope leakage)
  • ❌ No test explicitly proving from httpx import AsyncClient; client = AsyncClient() does NOT emit (absence is implicitly correct; negative test would document the boundary)
  • .put and .patch verb coverage absent from the httpx test (only .get, .post, .delete tested); the .put/.patch path through HTTPX_ASYNC_CLIENT_VERB_PATTERNS is implicitly validated by the REQUESTS_VERB_PATTERNS tests but not separately exercised for httpx

Temp directory safety:
The change from a shared Date.now() path to per-test fs.mkdtempSync(...) is correct and fixes the CodeQL insecure-tempfile finding. tmpDir is declared as let, assigned in beforeEach, used in afterEach — all scoped correctly. Each test gets a unique private directory; parallel test runs cannot collide. ✓
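
For readers less familiar with the distinction, the same contrast can be shown in Python, where tempfile.mkdtemp plays the role of Node's fs.mkdtempSync. This is an analogy only; the gitnexus-test prefix and both helper names are made up for the example.

```python
import os
import shutil
import tempfile
import time

def predictable_dir(prefix="gitnexus-test"):
    """Old style: a timestamp-derived name that is guessable and can collide."""
    return os.path.join(tempfile.gettempdir(), f"{prefix}-{int(time.time() * 1000)}")

def private_dir(prefix="gitnexus-test-"):
    """New style: the OS creates a uniquely named directory atomically."""
    return tempfile.mkdtemp(prefix=prefix)

first, second = private_dir(), private_dir()
try:
    distinct = first != second and os.path.isdir(first) and os.path.isdir(second)
finally:
    # Mirrors the afterEach cleanup: remove whatever this run created.
    shutil.rmtree(first, ignore_errors=True)
    shutil.rmtree(second, ignore_errors=True)

print(distinct)
```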

Would tests have failed before this PR? Yes — all four expect(consumer).toBeDefined() assertions in the new httpx test block would have failed on the unpatched code since no httpx patterns existed.


CI / release-readiness assessment

Check Status
quality/format ✅ passed
quality/lint ✅ passed
quality/typecheck ✅ passed
quality/typecheck-web ✅ passed
scope-parity (4 languages) ✅ passed
e2e ✅ passed
tests/ubuntu/coverage ⏳ in progress
tests/windows-latest ⏳ in progress
tests/macos-latest ⏳ in progress
CodeQL ✅ passed
Gitleaks ✅ passed
Dependency Review ✅ passed
Docker Build & Push ⏳ in progress

Typecheck passes on both gitnexus and gitnexus-shared packages. The three platform test runs are still in progress — do not merge until they complete green.


Hidden Unicode / hygiene assessment

Bidi control characters (U+202A–U+202E, U+2066–U+2069, U+200B–U+200F, U+FEFF): None found in either changed file. ✓
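
A scan of this kind is easy to reproduce. The sketch below checks text against the four code-point ranges listed above; find_hidden is a hypothetical helper, and this is not the tool that produced the result reported in this review.

```python
# Code-point ranges named in the hygiene check above.
HIDDEN_RANGES = [
    (0x202A, 0x202E),  # bidi embeddings and overrides
    (0x2066, 0x2069),  # bidi isolates
    (0x200B, 0x200F),  # zero-width characters and directional marks
    (0xFEFF, 0xFEFF),  # byte order mark / zero-width no-break space
]

def find_hidden(text):
    """Return (line, column, code point) for every hidden character found."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            cp = ord(ch)
            if any(lo <= cp <= hi for lo, hi in HIDDEN_RANGES):
                hits.append((lineno, col, f"U+{cp:04X}"))
    return hits

print(find_hidden("# \u2500\u2500 section \u2500\u2500"))  # decorative only
print(find_hidden("admin\u202e = True"))                   # bidi override
```

Box-drawing separators pass cleanly, while a right-to-left override is reported with its position.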

Non-ASCII characters:

  • python.ts lines 27, 46, 64, 82, 104, 123, 141: ─ (U+2500 BOX DRAWINGS LIGHT HORIZONTAL) in section comment separators, consistent with the pre-existing style used in python.ts before this PR, and present throughout the codebase. Not in query strings, regexes, or executable code. ✓
  • http-route-extractor.test.ts: the em dash (U+2014 EM DASH) in describe() block label strings, a pre-existing style throughout the test file. Not in test fixture Python source code, query strings, or assertions. ✓

No block. The GitHub "hidden Unicode" warning is a false alarm triggered by decorative comment separators.


Elegance / maintainability assessment

The implementation is focused, local, and follows the existing plugin pattern exactly. The four pattern bundles follow the naming convention of existing bundles (FASTAPI_PATTERNS, REQUESTS_VERB_PATTERNS, REQUESTS_GENERIC_PATTERNS). All four are compiled at module load time via compilePatterns. collectHttpxAsyncClients is a clear, 14-line function with no external dependencies. The two consumer scan loops follow the identical structure of the existing REQUESTS_* loops.

Section comments use the same ─── Name ────── style as the pre-existing sections. The NOTE comment accurately states the scoping decision without overpromising.

Future extensions (sync httpx.Client, direct imports, aliased imports, aiohttp) can be added by appending patterns to collectHttpxAsyncClients without modifying the consumer scan loops. The architecture supports clean extension. ✓

No broad extractor refactor, no new types, no schema changes, no unrelated file touches.


Final verdict

production-ready with minor follow-ups

The core implementation is architecturally sound and achieves its stated goal. The two-phase approach (collect tracked clients, then filter call detections) correctly prevents false positives from arbitrary .get() / .request() calls — the httpxAsyncClients.has(clientNode.text) guard works for both attribute-held clients and async with aliases, and the test suite confirms this with both positive detections and negative data.get("/nope") / service.request("POST", "/nope") assertions. The tree-sitter queries are correct for the pinned tree-sitter-python 0.23.4, path normalization reuses the existing normalizeConsumerPath pipeline, and the emitted contracts are fully compatible with group exact matching.

The two open items are: (1) the Fixes #1184 PR body keyword will auto-close an issue that has unresolved slices — the maintainers should handle this explicitly before or immediately after merge; and (2) the file-global client-name tracking has no regression test for same-name reuse across scopes, which is an accepted approximation documented in the NOTE comment but untested. Neither constitutes a correctness defect for the declared scope of this PR.

Do not merge until the tests/ubuntu/coverage, tests/windows-latest, and tests/macos-latest runs complete green. All quality, typecheck, scope-parity, CodeQL, Gitleaks, and dependency checks are already green.


@github-actions

github-actions Bot commented May 8, 2026

Contributor

CI Report

All checks passed

Pipeline Status

Stage Status Details
✅ Typecheck success tsc --noEmit
✅ Tests success unit tests, 3 platforms
✅ E2E success gitnexus-web changes only

Test Results

Tests Passed Failed Skipped Duration
8673 8672 0 1 369s

✅ All 8672 tests passed

1 test skipped:
  • buildTypeEnv > known limitations (documented skip tests) > Ruby block parameter: users.each { |user| } — closure param inference, different feature

Code Coverage

Tests

Metric Coverage Covered Base Delta Status
Statements 77.98% 25792/33071 N/A% 🟢 ███████████████░░░░░
Branches 66.49% 16293/24501 N/A% 🟢 █████████████░░░░░░░
Functions 83.18% 2597/3122 N/A% 🟢 ████████████████░░░░
Lines 81.05% 23317/28768 N/A% 🟢 ████████████████░░░░


@juyua9

juyua9 commented May 10, 2026

Contributor Author

Thanks — I pushed 4b4bee8 with the follow-up for the same-name scope concern.

Changes:

  • Scoped tracked httpx.AsyncClient identities to their enclosing Python function/class/module instead of a file-global client-name set.
  • Kept self._client-style attributes class-scoped so existing attribute-held client extraction still works across methods.
  • Added regression coverage for an unrelated function reusing client.get(...), ensuring it does not emit a false HTTP consumer.
  • Updated the PR body from Fixes #1184 to Partially fixes #1184 so the remaining NestJS HttpService and axios wrapper slices stay open.

Validation:

  • cd gitnexus && npx vitest run test/unit/group/http-route-extractor.test.ts — 28 passed
  • cd gitnexus && npx tsc --noEmit

@github-actions

github-actions Bot commented May 10, 2026

Contributor

Claude finished @magyargergo's task in 6m 29s


PR #1408 Review — fix(group): detect httpx async consumers

Tasks:

  • Post initial comment
  • Read repo docs (DoD.md, AGENTS.md, GUARDRAILS.md, CONTRIBUTING.md)
  • Inspect changed files (python.ts, http-route-extractor.test.ts)
  • Inspect related files (http-route-extractor.ts, types.ts, tree-sitter-scanner.ts)
  • Check CI status and branch hygiene
  • Unicode/hygiene scan
  • Scope analysis (collectHttpxAsyncClients, getScopeKey, hasTrackedHttpxAsyncClient)
  • Tree-sitter query correctness assessment
  • Contract normalization trace
  • Test quality assessment
  • Write final verdict

Review bar for this PR

  • httpx.AsyncClient(...) assignments (local vars and attributes) are tracked into a per-scope set before call detection.
  • async with httpx.AsyncClient() as client aliases are tracked into the same set.
  • Calls on tracked clients emit consumer contracts; calls on untracked objects with the same method names do not.
  • Existing FastAPI provider and requests.* consumer extraction is unchanged.
  • Contract IDs are produced via the existing normalizeConsumerPath → contractIdFor pipeline.
  • Tests cover positive cases, negative false-positive controls, scope-collision prevention, and absolute URL normalization.
  • No bidi/invisible Unicode in query strings, executable code, or test fixtures.

Current PR state

Property | Value
Head SHA | e4916f7 (merge-from-main commit); PR commits are 68bd835, 522a6b2, feac264, 4b4bee8
Status | Open, not draft
PR-originating commits | 4
Changed files (PR only) | 2: python.ts +198/−0, http-route-extractor.test.ts +58/−2
CI: quality/format, lint, typecheck | ✅ per prior CI run (May 8)
CI: scope-parity, e2e | ✅ per prior CI run (May 8)
CI: tests (all platforms) | ⚠️ Status unknown: the last CI run (May 8) showed all 8441 tests passed but CI Gate failed; the latest merge-from-main (e4916f7, May 10) should trigger a new run. Cannot confirm current gate state.
CodeQL | ✅ passed
Gitleaks, Dependency Review | ✅ passed
Issue #1184 scope | PR body now correctly says Partially fixes #1184
Hidden Unicode (bidi controls) | None
Non-ASCII chars | ─ (U+2500) in section comments, em dash (U+2014) in describe strings; pre-existing codebase style, absent from queries/fixtures ✓

Branch hygiene assessment

Verdict: merge-from-main commits present but harmless and merge-safe.

The branch has three merge-from-main commits (532cde6, bc4e89e, e4916f7). All were authored by the maintainer, follow standard synchronization practice, and bring in unrelated changes (cursor hooks, dep bumps, de-flaking commits). The final diff is limited to exactly two files; git diff 4b4bee8 HEAD confirms neither PR file was touched after the last PR commit. The merge commits do not introduce unrelated final-diff churn and do not obscure the review. No rebase is required.


Understanding of the change

The PR adds a two-phase approach to Python HTTP consumer extraction:

  1. collectHttpxAsyncClients(tree) runs two compiled patterns against the full AST and returns Map<clientText, Set<scopeKey>>:
    • Assignment: self._client = httpx.AsyncClient(...) → adds "self._client" under class scope
    • async with httpx.AsyncClient() as client: → adds "client" under function scope
  2. Scope keying (getScopeKey, trackedClientScopeKey, callScopeKeys): ties each client identity to its enclosing class (for dotted names like self._client) or function (for simple names like client), preventing same-name collisions across unrelated scopes.
  3. Two consumer scan loops match all call shapes, then filter via hasTrackedHttpxAsyncClient(..., clientNode), which checks whether the call site's scope intersects the tracked assignment scope.

This correctly handles: attribute-held clients in class methods, async with aliases scoped to functions, and generic .request("METHOD", "url") calls. The guard httpxAsyncClients.has(clientNode.text) + callScopeKeys(...) prevents false positives from unrelated .get() / .request() callers.
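
The collect-then-filter flow described above can be sketched in Python with the standard ast module. This is an approximation under stated assumptions: the PR's implementation is TypeScript over tree-sitter nodes, the key format only mimics the class:<start>:<end> / function:<start>:<end> shape quoted in this review, and scope_key and scoped_consumers are hypothetical names.

```python
import ast

VERBS = {"get", "post", "put", "delete", "patch"}

SRC = '''
import httpx

class TopicService:
    def __init__(self):
        self._client = httpx.AsyncClient()

    async def get_topic(self):
        return await self._client.get("/topic")

async def check_duplicate():
    async with httpx.AsyncClient() as client:
        await client.post("https://svc.local/questions/duplicate-check")

def unrelated_scope_collision(client):
    return client.get("/ignored-same-name")
'''

def is_ctor(node):
    """True for a literal `httpx.AsyncClient(...)` call expression."""
    return (isinstance(node, ast.Call) and isinstance(node.func, ast.Attribute)
            and node.func.attr == "AsyncClient"
            and isinstance(node.func.value, ast.Name) and node.func.value.id == "httpx")

def scope_key(node, parents, prefer_class):
    """Walk upward; dotted clients skip functions and bind to the enclosing class."""
    cur = parents.get(node)
    while cur is not None:
        if isinstance(cur, ast.ClassDef):
            return f"class:{cur.lineno}:{cur.end_lineno}"
        if not prefer_class and isinstance(cur, (ast.FunctionDef, ast.AsyncFunctionDef)):
            return f"function:{cur.lineno}:{cur.end_lineno}"
        cur = parents.get(cur)
    return "module"

def scoped_consumers(src):
    tree = ast.parse(src)
    parents = {c: p for p in ast.walk(tree) for c in ast.iter_child_nodes(p)}
    tracked = {}  # client text -> set of scope keys
    for node in ast.walk(tree):
        targets = []
        if isinstance(node, ast.Assign) and is_ctor(node.value):
            targets = node.targets
        elif (isinstance(node, ast.withitem) and is_ctor(node.context_expr)
              and isinstance(node.optional_vars, ast.Name)):
            targets = [node.optional_vars]
        for target in targets:
            text = ast.unparse(target)
            tracked.setdefault(text, set()).add(scope_key(target, parents, "." in text))
    hits = []
    for node in ast.walk(tree):
        if not (isinstance(node, ast.Call) and isinstance(node.func, ast.Attribute)
                and node.func.attr in VERBS and node.args
                and isinstance(node.args[0], ast.Constant)
                and isinstance(node.args[0].value, str)):
            continue
        text = ast.unparse(node.func.value)
        # Emit only when the call site's scope matches a scope the client was tracked in.
        if scope_key(node, parents, "." in text) in tracked.get(text, set()):
            hits.append((node.func.attr.upper(), node.args[0].value))
    return sorted(hits)

print(scoped_consumers(SRC))
```

The attribute-held client matches across methods of the same class, the async-with alias matches inside its own function, and the same-named parameter in unrelated_scope_collision is rejected.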


Findings

[MEDIUM] Module-scope client leaks false positives into functions

  • Category: Scope / false-positive safety

  • File: gitnexus/src/core/group/extractors/http-patterns/python.ts:149–167

  • Issue: callScopeKeys() unconditionally appends 'module' to every call site's candidate scope list. If client = httpx.AsyncClient() exists at module scope (tracked to scope key 'module'), then ANY function in the same file that uses a local variable named client and calls .get(...) on it will match and emit a false consumer contract.

    Concrete false-positive path (verified by tracing):

    # module level — tracked to scope 'module'
    client = httpx.AsyncClient()
    
    class Cache:
        def lookup(self, key):
            client = self._get_db_client()   # NOT an httpx.AsyncClient
            return client.get(key)           # callScopeKeys → [..., 'module'] → MATCH → false consumer

    callScopeKeys for client.get(key) yields [function:<lookup_start>:<lookup_end>, 'module']. Since the tracked scope is 'module', scopes.has('module') is true → detection emitted.

  • Why it matters: Module-level httpx.AsyncClient() is a valid production pattern (shared singletons). The PR's stated DoD requirement is "Does not emit consumers for untracked objects that coincidentally expose .get". This scenario violates that. The 0.7 confidence mitigates the severity but the false edge still corrupts the group graph.

  • Evidence: Trace through callScopeKeys (lines 149–167): keys.add('module') is unconditional. The NOTE comment (lines 83–85) documents direct-import and alias gaps but does not document this module-scope shadowing false positive.

  • The test covers this partially but not completely: The fixture has client tracked in function scope (check_duplicate), not module scope. unrelated_scope_collision()'s client.get(...) correctly misses because check_duplicate's function scope ≠ unrelated_scope_collision's function scope. But the module-scope → any-function scenario is untested.

  • Recommended fix: Either (a) add a NOTE comment explicitly acknowledging this limitation, or (b) add a fixture test covering client = httpx.AsyncClient() at module level + client.get("/nope") inside a function where client is a different object — and assert no /nope consumer is emitted.

  • Blocks merge: No. The 0.7 confidence explicitly accommodates approximations, and attribute patterns (self._client), which are the primary production target, are correctly class-scoped and immune to this. However, the NOTE comment must document this boundary before merge.
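
Reduced to set operations, the leak looks like the sketch below. The leaky variant follows the behaviour traced in this finding; the tightened variant is only one hypothetical rule (fall back to 'module' unless the function rebinds the name) and is not taken from the PR. All function names and the function:6:8 key are illustrative.

```python
def call_scope_keys_leaky(nearest_scope):
    """Behaviour traced in the finding: 'module' is always a candidate scope."""
    return {nearest_scope, "module"}

def call_scope_keys_tight(nearest_scope, name_rebound_locally):
    """Hypothetical tightening: skip 'module' when the function rebinds the name."""
    keys = {nearest_scope}
    if nearest_scope == "module" or not name_rebound_locally:
        keys.add("module")
    return keys

# A client assigned at module level, as in the false-positive fixture above.
tracked = {"client": {"module"}}

def matches(name, call_keys):
    """A call is emitted when its candidate scopes intersect the tracked scopes."""
    return bool(tracked.get(name, set()) & call_keys)

lookup_scope = "function:6:8"  # stand-in for Cache.lookup, which rebinds `client`
print(matches("client", call_scope_keys_leaky(lookup_scope)))        # false positive
print(matches("client", call_scope_keys_tight(lookup_scope, True)))  # suppressed
print(matches("client", call_scope_keys_tight("module", False)))     # still detected
```

Under this rule a function that reads the module-level client without rebinding it would still match, so legitimate shared-singleton usage is not lost.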

[LOW] Annotated assignment false negative not documented

  • Category: Scope / documentation
  • File: gitnexus/src/core/group/extractors/http-patterns/python.ts:82–100
  • Issue: self._client: httpx.AsyncClient = httpx.AsyncClient(...) parses as typed_assignment in tree-sitter-python, not assignment. The query only matches assignment, so annotated forms are silently missed.
  • Why it matters: Annotated class attributes are common in production Python (client: httpx.AsyncClient = httpx.AsyncClient(...)). The NOTE comment mentions direct-import and module-alias gaps but not this one.
  • Recommended fix: Add to the NOTE comment: // Annotated assignments ('client: httpx.AsyncClient = httpx.AsyncClient()' → typed_assignment node) are also not tracked.
  • Blocks merge: No.

[LOW] callScopeKeys contains redundant computation

  • Category: Code quality
  • File: gitnexus/src/core/group/extractors/http-patterns/python.ts:149–167
  • Issue: The second block inside callScopeKeys (if (!preferClass) { const functionScope = getScopeKey(clientNode.parent, false); ... }) recomputes exactly what nearestScope already computed when preferClass=false. Both calls are getScopeKey(clientNode.parent, false) — they return the same value. The keys.add(functionScope) adds a duplicate that the Set silently absorbs.
  • Why it matters: Dead code, but harmless. It misleadingly suggests a distinction between nearestScope and functionScope that doesn't exist.
  • Recommended fix: Remove the redundant block. The code works identically without it.
  • Blocks merge: No.

[LOW] client.request("NOT_HTTP_METHOD", "/x") emits arbitrary method strings

  • Category: Contract precision
  • File: gitnexus/src/core/group/extractors/http-patterns/python.ts:319–335
  • Issue: HTTPX_ASYNC_CLIENT_GENERIC_PATTERNS captures any string literal as @http_method and emits it uppercased. client.request("CONNECT", "/x") → http::CONNECT::/x. client.request("not-a-method", "/x") → http::NOT-A-METHOD::/x. Neither is filtered.
  • Why it matters: This mirrors existing REQUESTS_GENERIC_PATTERNS behavior (no method whitelist there either), so this is not a regression. But it can produce contract IDs that never match any provider.
  • Recommended fix: Consistent with existing behavior — acceptable as-is. Optionally, add a comment noting this is intentional parity with python-requests.
  • Blocks merge: No.
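
If a filter were ever wanted, it could be as small as the sketch below. This is not part of the PR and would diverge from the python-requests behaviour noted above; the allow-list (the five verbs the verb query already accepts) and the name normalize_method are assumptions made for illustration.

```python
# Assumed allow-list: the same five verbs the verb query accepts.
ALLOWED_METHODS = frozenset({"GET", "POST", "PUT", "DELETE", "PATCH"})

def normalize_method(raw, allowed=ALLOWED_METHODS):
    """Uppercase the captured method; return None when it is not allow-listed."""
    method = raw.strip().upper()
    return method if method in allowed else None

print(normalize_method("post"))
print(normalize_method("CONNECT"))
print(normalize_method("not-a-method"))
```

Returning None would let the caller skip the detection in the same way non-literal arguments are already skipped.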

[INFO] CI Gate status unconfirmed post latest merge

  • Category: CI / release readiness
  • Issue: The last known CI run (May 8) showed all 8441 tests passing but the CI Gate failed (likely cause: macOS u8 timing flake and possibly lint annotations in unrelated gitnexus-shared/src/scope-resolution/ files). The latest merge-from-main e4916f7 (May 10) includes 4848dce test(u8): de-flake regex linearity assertions from main, which should address the timing flake. A new CI run should have triggered, but its current state is unverifiable from this review.
  • Blocks merge: Yes — do not merge until CI Gate is confirmed green. This is a process gate, not a code defect.

Python Tree-sitter query correctness

Assignment query (HTTPX_ASYNC_CLIENT_ASSIGN_PATTERNS):
Correctly captures both plain identifiers (client) and attribute expressions (self._client, cls._http) via (_) wildcard on left. The #eq? @module "httpx" predicate correctly rejects something_else.AsyncClient() and httpx.Client(). typed_assignment (annotated form) is silently missed — acceptable false negative, should be documented. ✓

async-with alias query (HTTPX_ASYNC_CLIENT_WITH_ALIAS_PATTERNS):
In tree-sitter-python 0.23.x, async with expr as alias: parses as with_item → as_pattern → (value, as_pattern_target). The query structure using (as_pattern ...) is correct. The (as_pattern_target (identifier) @client) captures only simple identifier aliases. The functional test (28 passes) confirms this matches at runtime. ✓

Verb query (HTTPX_ASYNC_CLIENT_VERB_PATTERNS):
The argument_list . positional anchor correctly requires the path to be the first positional argument. Works for client.get("/x", params={...}). Pure keyword client.get(url="/x") not supported — acceptable, consistent with requests patterns. Regex ^(get|post|put|delete|patch)$ correctly scoped. ✓

Generic query (HTTPX_ASYNC_CLIENT_GENERIC_PATTERNS):
Requires both method and path to be positional string literals. Non-literal → unquoteLiteral returns null → skipped safely. ✓

Awaited calls: Under await self._client.get(...), tree-sitter wraps the call in an await_expression. The inner call node structure is unchanged and still matched. ✓


Scope and false-positive safety

Class-scoped attributes (self._client): trackedClientScopeKey calls getScopeKey(parent, preferClass=true), which walks to the enclosing class_definition. callScopeKeys also uses preferClass=true and produces [class:<start>:<end>, 'module']. A self._client.get(...) call in a different class fails because its class scope differs. ✓

Function-scoped locals (client in async-with or assignment): tracked to function:<start>:<end>. A same-named client in a different function has a different function scope key → rejected. ✓ (Covered by test.)

Module-scope locals (client = httpx.AsyncClient() at module level): tracked to 'module'. callScopeKeys always adds 'module', so any call site matching by name will pass regardless of function context. False positive risk — not tested, not documented. ⚠️


Consumer extraction and contract normalization

  • role: 'consumer'
  • framework: 'python-httpx' ✓ (distinct from 'python-requests', 'fastapi')
  • method: uppercased from capture ✓; .toUpperCase() on string argument for .request()
  • path: raw literal → normalizeConsumerPath in orchestrator:
    • Absolute https://svc.local/questions/duplicate-check → pathname → /questions/duplicate-check → http::POST::/questions/duplicate-check
    • Relative /topic → http::GET::/topic
  • name: null — consistent with all consumer plugins ✓
  • confidence: 0.7 — consistent with source-scan consumer plugins ✓
  • dedupeContracts uses contractId|filePath|name key ✓
  • No schema changes, no new contract types, no provider contracts emitted ✓
  • FASTAPI_PATTERNS, REQUESTS_VERB_PATTERNS, REQUESTS_GENERIC_PATTERNS untouched ✓
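
The dedupe key mentioned above can be illustrated with a short Python sketch. The dictionary field names mirror the contractId|filePath|name key from the review, but the record shape, the sample file paths, and the dedupe_contracts helper itself are assumptions for the example.

```python
def dedupe_contracts(contracts):
    """Keep the first contract seen for each contractId|filePath|name key."""
    seen = {}
    for contract in contracts:
        key = f"{contract['contractId']}|{contract['filePath']}|{contract['name']}"
        seen.setdefault(key, contract)
    return list(seen.values())

# Hypothetical detections: two identical calls in one file, one in another file.
detections = [
    {"contractId": "http::GET::/topic", "filePath": "svc/topics.py", "name": None},
    {"contractId": "http::GET::/topic", "filePath": "svc/topics.py", "name": None},
    {"contractId": "http::GET::/topic", "filePath": "svc/other.py", "name": None},
]

unique = dedupe_contracts(detections)
print(len(unique))
```

Repeated calls within one file collapse to a single contract, while the same endpoint consumed from a different file is kept as a separate entry.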

Test coverage and regression safety

Positive cases:

  • ✅ Attribute-held client: self._client = httpx.AsyncClient() + await self._client.get("/topic")
  • ✅ Generic .request(): await self._client.request("POST", "/questions/import")
  • ✅ .delete() verb (non-GET/POST): await self._client.delete("/topic")
  • ✅ async with alias + absolute URL normalization: await client.post("https://svc.local/questions/duplicate-check")
  • ✅ Per-contractId framework assertions (not over-broad every(...))

Negative (false-positive prevention):

  • ✅ data.get("/nope"): untracked object, data not in set
  • ✅ service.request("POST", "/nope"): untracked object
  • ✅ client.get("/ignored-same-name") in unrelated_scope_collision() where client was tracked to check_duplicate's function scope

Gap:

  • ❌ No test for module-scope client = httpx.AsyncClient() + client.get(...) in a function where client is a different object → would reveal the module-scope false-positive path

Regression:

  • ✅ FastAPI provider, requests.post(...) consumer, Java, Go all tested and pass (27 pre-existing tests)
  • ✅ fs.mkdtempSync replaces predictable shared temp path; CodeQL insecure-tempfile finding resolved ✓

Would tests fail before this PR? Yes — all four expect(consumer).toBeDefined() assertions would fail on unpatched code. ✓


CI/security/hygiene

Unicode: No bidi controls or invisible chars in any changed file. ─ (U+2500) in section comments and the em dash (U+2014) in describe() strings are consistent with pre-existing codebase style. Not in queries, regexes, or test fixture code. No block. ✓

CI Gate: Failed on May 8 run; all 8441 tests passed in that run. Root cause was likely macOS u8-redos-resource-exhaustion.test.ts timing flake (since fixed in main via 4848dce) or a secondary lint job. New run should have triggered after e4916f7 merge — must be confirmed green before merge.

Scope diff: Final diff is strictly limited to python.ts and http-route-extractor.test.ts. Merge commits are from main changes in unrelated packages. ✓


Out-of-scope issue #1184 remainder

PR body now correctly says Partially fixes #1184. Issue #1184 must remain open for:

  • NestJS HttpService consumer detection
  • Project-specific axios wrapper detection

The PR correctly makes no claim about these. The code adds no NestJS or axios patterns. ✓


Final verdict

not production-ready

Primary gate (process): The CI Gate was failing on the last known run (May 8), and the current state after the May 10 merge-from-main is unconfirmed. Do not merge until CI Gate completes green on the current head. The tests themselves all passed on May 8, and the de-flaking fix (4848dce) has been merged into the branch, so CI is expected to pass — but it must be verified.

Secondary issue (medium, should be addressed before merge): callScopeKeys unconditionally adds 'module' to every call-site's candidate scope list. This means client = httpx.AsyncClient() at module scope causes any function-local client.get(...) call in the same file to emit a false consumer contract. This is a reproducible false-positive path against the DoD requirement "Does not emit consumers for untracked objects." It can be remediated by adding a NOTE comment documenting the limitation, or by adding a one-case fixture test that covers module-scope tracking + function-local same-name shadowing.

Once CI is confirmed green and the module-scope limitation is either tested or documented in the NOTE comment, this PR is production-ready with minor follow-ups — the core algorithm is architecturally sound, queries are correct for the stated scope, path normalization is correct, and the existing requests.* / FastAPI extraction is untouched.


@magyargergo

Collaborator

@juyua9 Can you please look into claude's findings? 🙏

Prevent module-scope httpx.AsyncClient tracking from matching same-name local variables inside functions.

Also documents the intentionally unsupported direct-import, alias, and typed-assignment forms, and extends the httpx extractor regression fixture to cover module-scope shadowing while keeping module-scope calls detected.
@juyua9

juyua9 commented May 11, 2026

Contributor Author

Thanks — I looked into Claude's latest findings and pushed 7b28c93.

What changed:

  • Tightened callScopeKeys() so module-scope httpx.AsyncClient tracking no longer matches same-name local variables inside functions.
  • Kept module-scope calls themselves detected (module_client.get(...)).
  • Extended the regression fixture with a module-scope tracked client plus a shadowed function-local client.get("/ignored-module-same-name") negative case.
  • Documented the remaining intentionally unsupported forms in the NOTE: direct imports, module aliases, and typed assignments.

Validation run locally:

  • npx vitest run test/unit/group/http-route-extractor.test.ts -t 'extracts Python httpx.AsyncClient calls assigned to attributes or aliases' — passed
  • npx tsc --noEmit — passed
  • git diff --check -- gitnexus/src/core/group/extractors/http-patterns/python.ts gitnexus/test/unit/group/http-route-extractor.test.ts — passed

I did not expand scope to the remaining #1184 slices (NestJS HttpService / axios wrappers); this PR remains the Python httpx.AsyncClient slice.

@magyargergo magyargergo merged commit 55b7a79 into abhigyanpatwari:main May 11, 2026
28 of 29 checks passed
