fix(ingestion): reduce parse-phase memory for huge repos (#1983) #1
Closed
magyargergo wants to merge 1 commit into
…wari#1983)

Stop retaining full parse-cache chunks in RAM alongside the merged graph, slim on-disk shards, defer worker ParsedFile emission for scope-resolver languages, and add GITNEXUS_DEBUG_HEAP probes for OOM diagnosis.

Co-authored-by: Cursor <cursoragent@cursor.com>
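The first change in the commit message (not keeping full parse-cache chunks in RAM next to the merged graph) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: `ParsedChunk`, `ChunkSummary`, `mergeChunksStreaming`, and the flat `string[]` graph are hypothetical and are not GitNexus types or functions.

```typescript
// Hypothetical types: a chunk as loaded from the parse cache, and the slim
// record that is kept once the chunk has been merged.
interface ParsedChunk {
  files: string[];
  symbols: string[];
}

interface ChunkSummary {
  fileCount: number;
  symbolCount: number;
}

// Merge chunks one at a time. Only the slim summary is retained, so no
// reference to a full chunk outlives its loop iteration.
function mergeChunksStreaming(
  chunkCount: number,
  loadChunk: (index: number) => ParsedChunk,
  graph: string[],
): ChunkSummary[] {
  const summaries: ChunkSummary[] = [];
  for (let i = 0; i < chunkCount; i++) {
    const chunk = loadChunk(i); // the only live reference to this chunk
    for (const symbol of chunk.symbols) {
      graph.push(symbol); // fold the chunk's contents into the merged graph
    }
    summaries.push({
      fileCount: chunk.files.length,
      symbolCount: chunk.symbols.length,
    });
    // `chunk` is unreachable after this iteration and can be collected.
  }
  return summaries;
}
```

The design point is that peak memory is bounded by the merged graph plus one chunk, rather than the graph plus every chunk loaded so far.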
magyargergo added a commit that referenced this pull request on Jun 10, 2026
…abhigyanpatwari#2078)

* feat(ingestion): add Java Spring route annotation → Route node extraction

Previously, GitNexus only supported Route node generation for JS/TS ecosystems (Express, Next.js, Fastify, etc.) and Python (FastAPI, Flask). Java Spring's annotation-based routing (@RequestMapping, @GetMapping, @PostMapping, etc.) was only supported at the group contract layer (http-patterns/java.ts) for cross-repo matching, but NOT at the ingestion layer for generating graph Route nodes.

This commit adds ingestion-layer support:

1. JAVA_QUERIES (tree-sitter-queries.ts):
   - Added method-level annotation captures (@GetMapping, @PostMapping, @PutMapping, @DeleteMapping, @PatchMapping) → @decorator captures
   - Added class-level @RequestMapping → @decorator capture (prefix)
   - Supports both positional ("/path") and named (path="/path", value="/path") annotation argument forms
2. parse-worker.ts:
   - Java class-level @RequestMapping is detected and stored as a prefix (not pushed as a standalone Route)
   - After per-file capture processing, the prefix is applied to all method-level routes in the same file via the existing ExtractedDecoratorRoute.prefix field
   - The routes phase (normalizeExtractedRoutePath) handles the prefix joining, producing final URLs like /api/users/list
3. Tests:
   - Unit test (worker-backed): 4 cases covering prefix joining, bare routes, class-level exclusion, multi-file isolation
   - Integration test (full pipeline): 6 cases covering end-to-end Route node + HANDLES_ROUTE edge generation

Closes the feature gap where `route_map`, `shape_check`, and `api_impact` MCP tools returned empty results for Java Spring projects.

* chore(autofix): apply prettier + eslint fixes via /autofix command

* fix: address review findings - extract spring.ts module, fix PatchMapping, multi-class support

Addresses all P2 findings from tri-review:

1. **Architecture**: Extracted Spring route logic from parse-worker.ts into a dedicated `route-extractors/spring.ts` module (matching the pattern of `laravel.ts` and `fastapi-router-bindings.ts`). parse-worker now has a single dispatch line, with no language-specific logic inline.
2. **PatchMapping bug**: Added `'PatchMapping'` to `ROUTE_DECORATOR_NAMES` (was silently dropped before).
3. **Multi-class bug**: The new `extractSpringRoutes` walks each class declaration independently with its own prefix, which removes the single-scalar `javaClassPrefix` last-wins issue.
4. **Test hygiene**: Unit tests now import `extractSpringRoutes` directly (no dist build / worker pool dependency). Tests run in all tiers.
5. **Removed JAVA_QUERIES decorator patterns**: The Spring extractor does its own AST walk, so the tree-sitter query captures for Java annotations are no longer needed (avoids duplicate route emission).

Additional test coverage:

- Multi-class in one file with independent prefixes
- @PatchMapping support
- Named annotation args (path= and value=) on class-level @RequestMapping

* refactor: move Spring route extraction to LanguageProvider hook

Addresses the second review comment: instead of an inline `if (language === SupportedLanguages.Java)` dispatch in parse-worker, the Spring route extraction is now wired through a new optional `extractDecoratorRoutes` hook on LanguageProviderConfig.

- Added `extractDecoratorRoutes` to LanguageProviderConfig interface
- Java provider registers `extractSpringRoutes` as its implementation
- parse-worker calls `provider.extractDecoratorRoutes?.()` generically
- Removed direct import of spring.ts from parse-worker

This keeps parse-worker fully language-agnostic: no language names appear in the dispatch path for route extraction.

* refactor: rewrite spring.ts with tree-sitter captures, fix inline imports

Addresses all 4 inline review comments:

- 1. Rewrote spring.ts to use a single predicate-free Parser.Query (same pattern as group-layer JAVA_ROUTE_ANNOTATION_PATTERNS). Two-phase loop: first pass collects class prefixes by node.id, second pass resolves method routes via findEnclosingClass. No more manual DFS / recursion.
- 2-3. Moved inline import(...) type references in language-provider.ts to proper top-level imports (Parser, ExtractedDecoratorRoute).
- 4. Covered by #1: recursive helpers removed entirely.

Added 3 extra test cases: non-route named args filtering, prefix isolation across mixed classes, line number accuracy.

* refactor: extract shared Spring route primitives + add parity test

Addresses review follow-up on abhigyanpatwari#2078:

- Extract the primitives shared by the ingestion (route-extractors/spring.ts) and group (http-patterns/java.ts) Spring extractors into a new route-extractors/spring-shared.ts: METHOD_ANNOTATION_TO_HTTP, findEnclosingClass, isRouteMemberKey, and a safe unquoteSpringLiteral. Both extractors now import from it (group -> ingestion, the layer-correct direction) so the shared semantics can't drift apart.
- Replace spring.ts's local unquote() with the safer unquoteSpringLiteral (returns null for non-string nodes instead of assuming a quoted string).
- Add test/unit/spring-route-extractor-parity.test.ts: runs one shared Spring fixture through both extractors and asserts they surface the same provider method/path combinations.

The broader HttpRouteExtractor source-scan optimization is tracked in abhigyanpatwari#2138.

---------

Co-authored-by: henry <zhangwei2017@unipus.cn>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Gergő Magyar <gergomagyar@icloud.com>
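Two ideas from the referenced commit, joining a class-level prefix with a method-level path and unquoting an annotation argument safely, can be sketched roughly as follows. This is not the GitNexus implementation: `joinRoutePath` and `unquoteLiteral` are hypothetical stand-ins for the behaviour the commit attributes to `normalizeExtractedRoutePath` and `unquoteSpringLiteral`, and they work on plain strings rather than tree-sitter nodes.

```typescript
// Hypothetical stand-in for prefix joining: combine an optional class-level
// prefix with a method-level path, collapsing empty and duplicate slashes.
function joinRoutePath(prefix: string | null, path: string): string {
  const segments = [prefix ?? "", path]
    .join("/")
    .split("/")
    .filter((segment) => segment.length > 0);
  return "/" + segments.join("/");
}

// Hypothetical "safe unquote": return the contents of a double-quoted
// literal, or null when the text is not a quoted string (for example a
// constant reference), instead of assuming the quotes are there.
function unquoteLiteral(text: string): string | null {
  const trimmed = text.trim();
  const last = trimmed.length - 1;
  if (last < 1 || trimmed.charAt(0) !== '"' || trimmed.charAt(last) !== '"') {
    return null;
  }
  return trimmed.slice(1, last);
}

console.log(joinRoutePath("/api/users", "/list")); // prints "/api/users/list"
console.log(unquoteLiteral("BASE_PATH")); // prints null
```

Returning `null` for a non-literal argument lets the caller skip the route instead of emitting a path built from an identifier name.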
Summary
Fixes JavaScript heap OOM when analyzing very large repos (e.g. the Linux kernel) during the parse phase. Stacked on abhigyanpatwari/GitNexus#2033: merge abhigyanpatwari#2033 first, then open/retarget this against `main`.

- Defer worker `ParsedFile` emission for scope-resolver languages (scope-resolution re-extracts on the main thread)
- … `exportedTypeMap` during chunk merge
- `GITNEXUS_DEBUG_HEAP=1`: `[gitnexus-heap]` probes for OOM diagnosis

Fixes abhigyanpatwari#1983
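For readers unfamiliar with this kind of probe, here is a minimal sketch of an environment-gated heap probe. Only the `GITNEXUS_DEBUG_HEAP=1` switch and the `[gitnexus-heap]` tag come from this PR; the function name `heapProbe`, its return type, and the exact log format are assumptions, and the real probes may report different fields.

```typescript
// Hypothetical snapshot shape; the PR does not specify what the probes report.
type HeapSnapshot = { label: string; heapUsedMb: number; rssMb: number };

const toMb = (bytes: number): number => Math.round(bytes / (1024 * 1024));

// Log current heap usage under a label, but only when the debug switch is on.
// `enabled` defaults to the env check so call sites stay one-liners.
function heapProbe(
  label: string,
  enabled: boolean = process.env.GITNEXUS_DEBUG_HEAP === "1",
): HeapSnapshot | null {
  if (!enabled) return null; // no measurement or logging when disabled
  const { heapUsed, rss } = process.memoryUsage();
  const snapshot: HeapSnapshot = {
    label,
    heapUsedMb: toMb(heapUsed),
    rssMb: toMb(rss),
  };
  // stderr keeps the probe output out of any machine-readable stdout.
  console.error(
    `[gitnexus-heap] ${label} heapUsed=${snapshot.heapUsedMb}MB rss=${snapshot.rssMb}MB`,
  );
  return snapshot;
}
```

Calling such a probe before and after each pipeline phase (for example `heapProbe("parse:start")` and `heapProbe("parse:end")`, both hypothetical labels) narrows down which phase is responsible for heap growth.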
Test plan
- `npx tsc --noEmit`
- `vitest run test/unit/incremental-parse-cache.test.ts`
- `vitest run test/unit/parse-impl-worker-lazy-cache.test.ts`
- `NODE_OPTIONS=--max-old-space-size=20480 GITNEXUS_DEBUG_HEAP=1 gitnexus analyze --verbose`

Merge order
1. Merge abhigyanpatwari#2033 into `main`.
2. Open or retarget this PR against `abhigyanpatwari/GitNexus` `main` (or cherry-pick the single commit `665c60b8`).

Made with Cursor