feat(group): Support Django route extraction for multi-repo #1836

Open
HuyNguyenDinh wants to merge 22 commits into abhigyanpatwari:main from HuyNguyenDinh:main
Conversation

@HuyNguyenDinh

Summary

Add end-to-end Django URL route extraction and cross-repo HTTP contract matching, covering static route discovery with cross-file include() resolution, worker-pool and sequential-fallback integration, and manifest-based cross-linking via group.yaml.

Motivation / context

Cross-repo HTTP contract matching requires routes to be extracted from source code and resolved to their full URL prefixes. Django repos use deeply nested urlpatterns with include() that can span multiple files across 2-3+ levels, making static analysis non-trivial. Additionally, HTTP client calls often use variable-based URLs from settings.py (f-strings) that cannot be statically analyzed — those consumer contracts must come from explicit links in group.yaml.

Areas touched

  • gitnexus/ (CLI / core / MCP server)
  • gitnexus-web/ (Vite / React UI)
  • .github/ (workflows, actions)
  • eval/ or other tooling
  • Docs / agent config only (AGENTS.md, CLAUDE.md, .cursor/, llms.txt, etc.)

Scope & constraints

In scope

  • Django route extractor (django.ts): static analysis of urlpatterns via tree-sitter, supporting path(), re_path(), and url(); cross-file include() resolution with a 4-strategy path lookup; urls/__init__.py directory modules; traversal to depth 8 with cycle detection; augmented assignment; string concatenation; and HTTP method inference from view name suffixes
  • Django root URL discovery (django-root-discovery.ts): follows manage.py → DJANGO_SETTINGS_MODULE → settings.py → ROOT_URLCONF → root urls.py
  • isDjangoRouteFile hook on Python language provider (python.ts)
  • Django path()/re_path()/url() tree-sitter query patterns in group HTTP extractor Strategy B (http-patterns/python.ts)
  • Worker-pool path integration (parse-worker.ts): processFileGroup calls Django extractor for Python route files, routes flow through ParseWorkerResult.routes → mergeChunkResults → WorkerExtractedData.routes
  • Sequential fallback path integration (parsing-processor.ts): processParsingSequential extracts routes and pushes to outRoutes output parameter, with fileContentMap pre-computation and Django root discovery at chunk level
  • Route collection plumbing (parse-impl.ts): allExtractedRoutes collects routes from both worker (chunkWorkerData.routes) and sequential (outRoutes) paths, passes to processRoutesFromExtracted in deferred resolution band
  • 105 explicit links in group.yaml for consumer→provider HTTP contracts (1669 contracts, 105 cross-links at matchType: manifest, confidence: 1)
  • Fixed duplicate ExtractedRoute type import in parsing-processor.ts (TS2300)
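
The include() resolution with a depth limit and cycle detection described in the list above can be illustrated with a minimal sketch. It assumes a hypothetical in-memory model of already-parsed URL modules (`UrlEntry`, `UrlModules`, `joinPrefix`, and `resolveRoutes` are illustrative names, not identifiers from django.ts); the extractor in this PR works on tree-sitter syntax trees and resolves module paths across files, which this sketch does not attempt.

```typescript
// Hypothetical in-memory model of parsed URL modules. The extractor in the
// PR reads tree-sitter syntax trees; this only illustrates the traversal.
type UrlEntry =
  | { kind: 'route'; pattern: string; view: string }
  | { kind: 'include'; pattern: string; module: string };

type UrlModules = Record<string, UrlEntry[]>;

const MAX_DEPTH = 8; // depth limit named in the PR description

// Join two Django-style path fragments without doubling slashes.
function joinPrefix(prefix: string, pattern: string): string {
  const joined = `${prefix}/${pattern}`.replace(/\/+/g, '/');
  return joined.startsWith('/') ? joined : `/${joined}`;
}

function resolveRoutes(
  modules: UrlModules,
  moduleName: string,
  prefix = '',
  depth = 0,
  active: Set<string> = new Set(),
): string[] {
  // Stop on excessive nesting, or when a module includes itself transitively.
  if (depth > MAX_DEPTH || active.has(moduleName)) return [];
  const entries = modules[moduleName];
  if (!entries) return [];

  active.add(moduleName);
  const out: string[] = [];
  for (const entry of entries) {
    const full = joinPrefix(prefix, entry.pattern);
    if (entry.kind === 'route') {
      out.push(full);
    } else {
      // Recurse into the included module, carrying the accumulated prefix.
      out.push(...resolveRoutes(modules, entry.module, full, depth + 1, active));
    }
  }
  active.delete(moduleName);
  return out;
}
```

Tracking only the modules on the current include chain (`active`) rather than every module ever visited lets the same module be mounted under several prefixes while still breaking true cycles.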

Explicitly out of scope / not done here

  • Route extraction for web frameworks beyond Django/Laravel
  • Route prefix deduplication (currently shows 3 variants per route: unprefixed, /v2, /api/v2)
  • Worker pool gap-fill for quarantined files (sequential fallback enters via the existing processImports path only)

Implementation notes

Route flow after this change:

  • Worker path: routes returned in chunkWorkerData.routes, pushed to allExtractedRoutes at line 667
  • Sequential path: processParsingSequential pushes directly to outRoutes (allExtractedRoutes) without intermediate data structure
  • Both converge at processRoutesFromExtracted (line 839) for graph node/edge creation with resolved URL prefixes

Consumer contracts for variable-based URLs are declared via explicit links in group.yaml, which bypasses static analysis entirely. Manifest links produce a CrossLink with matchType: manifest and confidence: 1 directly during group sync.
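
As a rough illustration of the route flow above, the sketch below shows worker results and the sequential fallback feeding one shared array. The type shapes and the `parseSequential` and `collectRoutes` helpers are simplified assumptions for illustration; only the names `allExtractedRoutes`, `outRoutes`, and the `routes` field come from the PR description, and the real signatures in parse-impl.ts and parsing-processor.ts may differ.

```typescript
// Hypothetical, simplified shapes; the real types live in the gitnexus source.
interface ExtractedRoute {
  filePath: string;
  method: string;
  path: string;
}

interface ChunkWorkerData {
  routes?: ExtractedRoute[]; // optional, so results without routes stay valid
}

// Stand-in for the sequential fallback: it writes into the caller's array
// only when an output parameter is supplied (an additive, optional argument).
function parseSequential(files: string[], outRoutes?: ExtractedRoute[]): void {
  for (const filePath of files) {
    if (filePath.endsWith('urls.py')) {
      // Placeholder route; a real extractor would parse the file contents.
      outRoutes?.push({ filePath, method: 'GET', path: '/placeholder/' });
    }
  }
}

function collectRoutes(
  workerChunks: ChunkWorkerData[],
  sequentialFiles: string[],
): ExtractedRoute[] {
  const allExtractedRoutes: ExtractedRoute[] = [];
  // Worker path: each chunk may or may not carry routes.
  for (const chunk of workerChunks) {
    if (chunk.routes) allExtractedRoutes.push(...chunk.routes);
  }
  // Sequential path: pushes directly into the same array.
  parseSequential(sequentialFiles, allExtractedRoutes);
  return allExtractedRoutes; // single hand-off point for deferred resolution
}
```

Keeping both the `routes` field and the `outRoutes` parameter optional is what makes the change additive: callers that ignore routes keep working unchanged.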

Testing & verification

  • cd gitnexus && npx tsc --noEmit (clean)
  • cd gitnexus && npm run test:unit — 12/12 Django, 4/4 Laravel tests pass
  • cd gitnexus && npm test
  • cd gitnexus && npm run test:integration
  • cd gitnexus-web && npm test
  • cd gitnexus-web && npx tsc -b --noEmit
  • Manual / Playwright E2E

Risk & rollout

Low risk — additive changes (new optional parameters, new route data that was previously silently dropped). No index migration required; a fresh gitnexus analyze picks up the new routes. Manifest-based cross-links from group.yaml are independent of the graph and don't change with re-indexing.

Checklist

  • PR body meets repo minimum length
  • If AGENTS.md / overlays changed: headers, scope block, and changelog updated per project conventions
  • No secrets, tokens, or machine-specific paths committed

@vercel

vercel Bot commented May 26, 2026

@HuyNguyenDinh is attempting to deploy a commit to the NexusCore Team on Vercel.

A member of the Team first needs to authorize it.

@magyargergo changed the title from "Support Django route extraction for multi-repo" to "feat(group): Support Django route extraction for multi-repo" on May 26, 2026
Comment thread gitnexus/src/core/ingestion/route-extractors/django.ts Fixed

@github-actions github-actions Bot left a comment

Contributor

Code review

Found 4 issues: 2 AGENTS.md compliance violations and 2 bugs.

Comment thread gitnexus/src/core/ingestion/parsing-processor.ts Outdated
Comment thread gitnexus/src/core/ingestion/workers/parse-worker.ts Outdated
Comment thread gitnexus/src/core/ingestion/route-extractors/django-root-discovery.ts Outdated
Comment thread gitnexus/src/core/ingestion/route-extractors/django.ts
@github-actions

github-actions Bot commented May 26, 2026

Contributor

CI Report

All checks passed

Pipeline Status

| Stage | Status | Details |
|---|---|---|
| Typecheck | ✅ success | tsc --noEmit |
| Tests | ✅ success | unit tests, 3 platforms |
| E2E | ✅ success | gitnexus-web changes only |

Test Results

| Tests | Passed | Failed | Skipped | Duration |
|---|---|---|---|---|
| 10256 | 10249 | 0 | 7 | 623s |

✅ All 10249 tests passed

7 test(s) skipped:
  • COBOL pipeline benchmark > scales with file count
  • C# pipeline benchmark > scales with file count — namespaces spread across the solution
  • C# pipeline benchmark > scales with file count — all types in one (global) namespace bucket
  • PHP pipeline benchmark > scales with file count (workers enabled)
  • Ruby pipeline benchmark > scales with file count (workers enabled)
  • Rust pipeline benchmark > scales with file count (workers enabled)
  • buildTypeEnv > known limitations (documented skip tests) > Ruby block parameter: users.each { |user| } — closure param inference, different feature

Code Coverage

Tests

| Metric | Coverage | Covered | Base | Delta | Status |
|---|---|---|---|---|---|
| Statements | 79.84% | 36166/45293 | 79.84% | = 0.0 | 🟢 |
| Branches | 68.5% | 23097/33717 | 68.5% | = 0.0 | 🟢 |
| Functions | 84.94% | 3725/4385 | 84.94% | = 0.0 | 🟢 |
| Lines | 83.36% | 32565/39061 | 83.36% | = 0.0 | 🟢 |

📋 View full run · Generated by CI

@magyargergo

Collaborator

@HuyNguyenDinh Can you please address github-actions comments? 🙏

@HuyNguyenDinh

Author

@HuyNguyenDinh Can you please address github-actions comments? 🙏
I've resolved the issues and fixed the conflict.

@HuyNguyenDinh

Author

@magyargergo I've looked into the issues and pushed changes to resolve them. Please take a look and re-trigger the review.

magyargergo and others added 6 commits May 30, 2026 22:19
…umer detection

- Add REQUESTS_KEYWORD_URL_PATTERNS for requests.get(url='...') keyword args
- Add WRAPPER_URI_PATTERNS for generic wrapper.fetch(uri='...') calls
- Add WRAPPER_URI_VAR_PATTERNS + buildLocalStringMap for uri=variable propagation
- Add LOCAL_STRING_ASSIGNMENTS to track uri='...' assignments
- Wire both direct-string and variable-propagation loops in scan()
- Add normalizeConsumerPath() helper

Note: Automatic cross-link detection remains limited for runtime-computed URLs
(URLs built via .format(), string concat, or module constants). Manual
manifest links needed for known cross-repo contracts.
…tterns

Re-add LOCAL_STRING_ASSIGNMENTS, WRAPPER_URI_VAR_PATTERNS,
buildLocalStringMap(), and normalizeConsumerPath() lost during
cherry-pick merge of upstream keyword-URL commit.

Together with the upstream WRAPPER_URI_PATTERNS and
REQUESTS_KEYWORD_URL_PATTERNS, we now detect:
- requests.get(url='literal') keyword args
- wrapper.fetch(uri='literal') keyword args
- wrapper.fetch(uri=variable) where variable was assigned a string literal
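
The three detection cases listed above can be sketched with plain regular expressions. This is a simplified stand-in under stated assumptions: `buildLocalStrings` and `findConsumerPaths` are hypothetical names, the two regexes are illustrative and are not the PR's REQUESTS_KEYWORD_URL_PATTERNS, WRAPPER_URI_PATTERNS, or LOCAL_STRING_ASSIGNMENTS, and no path normalization is applied.

```typescript
// Illustrative regex sources; the patterns in the PR are assumed to be richer.
// Matches a whole-line assignment of a string literal: name = '...'
const LOCAL_STRING_SRC = "^[ \\t]*([A-Za-z_]\\w*)[ \\t]*=[ \\t]*(['\"])([^'\"\\n]*)\\2[ \\t]*$";
// Matches url=/uri= keyword arguments (after "(" or ","), literal or variable.
const KEYWORD_URL_SRC = "[(,]\\s*(?:url|uri)\\s*=\\s*(?:(['\"])([^'\"\\n]*)\\1|([A-Za-z_]\\w*))";

// Map each variable that is assigned a string literal to that literal.
function buildLocalStrings(source: string): Map<string, string> {
  const locals = new Map<string, string>();
  const re = new RegExp(LOCAL_STRING_SRC, 'gm');
  let m: RegExpExecArray | null;
  while ((m = re.exec(source)) !== null) {
    locals.set(m[1], m[3]);
  }
  return locals;
}

// Collect URL strings passed as url=/uri= keyword arguments, resolving
// variables through the local string map when possible.
function findConsumerPaths(source: string): string[] {
  const locals = buildLocalStrings(source);
  const paths: string[] = [];
  const re = new RegExp(KEYWORD_URL_SRC, 'g');
  let m: RegExpExecArray | null;
  while ((m = re.exec(source)) !== null) {
    const literal: string | undefined = m[2];
    const variable: string | undefined = m[3];
    const value =
      literal !== undefined
        ? literal
        : variable !== undefined
          ? locals.get(variable)
          : undefined;
    // Variables without a known literal value are skipped, mirroring the
    // limitation noted for runtime-computed URLs.
    if (value !== undefined) paths.push(value);
  }
  return paths;
}
```

URLs built at runtime (for example via .format() or concatenation) never enter the local string map in this sketch, which is why such contracts still need manual manifest links.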
@magyargergo

Collaborator

@HuyNguyenDinh Can you please resolve the merge conflicts? 🙏


3 participants