feat: add LOD graph rendering and performance optimizations #362
naicud wants to merge 65 commits into …
…ate documentation
…n and adding missing elements
The sequential parsing path (used when the worker pool times out on large COBOL codebases) was missing the regex-based paragraph/section supplement. This ensures the same hybrid tree-sitter + regex extraction runs regardless of which parsing path is used.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Updated package.json and package-lock.json to include aws4fetch dependency.
- Enhanced SettingsPanel to allow configuration for custom and AWS Bedrock providers.
- Implemented ChatBedrockBrowser for AWS Bedrock integration, supporting streaming and tool calls.
- Modified agent and settings-service to handle new provider configurations.
- Added types for custom and AWS Bedrock configurations.
- Created a browser stub for Node.js 'os' module to satisfy AWS SDK imports in the browser context.
- Updated README with setup instructions and API details.
…nd remove large file skip logic in filesystem walker
…ion and improve performance
Feat/dnn cobol
…ck and converse endpoints
Feat/dnn chat on aws andcypher
Feat/dnn cobol increase trhe beast
Integrates upstream symbol resolution engine, Ruby support, MRO, call routing, type environment, skill generation, and 282+ new tests while preserving COBOL support, AWS Bedrock proxy, and Neptune backend.
Conflict resolution:
- Config files: upstream v1.4.0 base with local changelog entries preserved
- export-detection.ts: upstream dispatch table + COBOL checker added
- import-processor.ts: upstream refactored architecture + COBOL resolver ported
- parse-worker.ts: local lazy loading + COBOL regex dispatch + upstream Ruby/MRO/TypeEnv
- parsing-processor.ts, pipeline.ts, utils.ts: both sides merged (non-overlapping)
- call-routing.ts, type-extractors/index.ts: COBOL entries added to satisfies Records
- schema.ts: duplicate Record→Property edge removed
47 test files, 1151 tests passing.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add DbConfig types and IDbQueryAdapter interface (interfaces.ts)
- Add NeptuneAdapter for read queries via @aws-sdk/client-neptunedata
- Add Neptune ingestion with UNWIND+MERGE batches (batch size 500)
- Add @aws-sdk/client-neptunedata dependency
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
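The "UNWIND+MERGE batches (batch size 500)" step in the commit above could look roughly like the sketch below. Only the batch size comes from the commit message; the helper name `toBatches`, the `Symbol` label, and the query text are assumptions for illustration and are not taken from the PR.

```typescript
// Batch size quoted in the commit message.
const NEPTUNE_BATCH_SIZE = 500;

// Hypothetical helper: split rows into fixed-size batches so that each
// batch can be sent as the $rows parameter of a single query.
function toBatches<T>(rows: T[], size: number = NEPTUNE_BATCH_SIZE): T[][] {
  if (size <= 0) throw new Error("batch size must be positive");
  const batches: T[][] = [];
  for (let i = 0; i < rows.length; i += size) {
    batches.push(rows.slice(i, i + size));
  }
  return batches;
}

// Illustrative openCypher text (an assumption, not the PR's query):
// UNWIND expands the batch into rows, MERGE upserts one node per row.
const MERGE_NODES_QUERY =
  "UNWIND $rows AS row MERGE (n:Symbol {id: row.id}) SET n.name = row.name";
```

Sending one parameterized query per batch keeps the number of round trips proportional to `rows.length / 500` rather than to the number of nodes.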
Backend:
- Extend RegistryEntry with db?: DbConfig for per-repo DB config
- Add getDbConfig() helper with KuzuDB backward compatibility
- CLI: --db neptune, --neptune-endpoint, --neptune-region, --neptune-port
- analyze.ts: Neptune ingestion path (skip FTS/embeddings with warnings)
- Env vars: GITNEXUS_DB_TYPE, GITNEXUS_NEPTUNE_ENDPOINT, etc.
API Server:
- POST /api/db/test: Neptune connectivity check with latency
- GET /api/graph: Neptune dispatch for full graph loading
- POST /api/query: Neptune dispatch for Cypher queries
- POST /api/search: CONTAINS-based fallback for Neptune (no FTS)
MCP LocalBackend:
- Per-repo NeptuneAdapter map (HTTP stateless, no pool needed)
- runQuery()/runParameterized() dispatch methods
- All 32 direct KuzuDB calls replaced with dispatch methods
- Neptune FTS fallback with CONTAINS predicate search
Web UI:
- Database Backend section in Settings (KuzuDB/Neptune toggle)
- Neptune config: endpoint, region, port with Test Connection button
- testDbConnection() in backend service
- CypherConsole modal (Table/JSON views, Ctrl+Enter to run)
- database? field in LLMSettings for localStorage persistence
- AWS Graph Explorer info box
Documentation:
- docs/neptune-setup.md: Complete step-by-step guide (VPC, security groups, subnet group, cluster, instance, IAM, CLI config, UI config, Graph Explorer, troubleshooting, costs)
Tests:
- neptune-adapter.test.ts: 17 tests (constructor, query, params, close, test)
- neptune-ingest.test.ts: 17 tests (clear, batch, indexes, progress, stats)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
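A minimal sketch of the "`getDbConfig()` helper with KuzuDB backward compatibility" idea from the commit above, assuming that registry entries written before this change simply lack the `db` field. The type shapes and the `Sketch` suffixed names are hypothetical; the real `DbConfig` and `RegistryEntry` definitions are not shown in this thread.

```typescript
// Hypothetical shapes for illustration only.
type DbConfigSketch =
  | { type: "kuzu" }
  | { type: "neptune"; endpoint: string; region: string; port: number };

interface RegistryEntrySketch {
  name: string;
  db?: DbConfigSketch; // optional, so older entries stay valid
}

// Entries without a db field fall back to KuzuDB, so existing
// registries keep working without migration.
function getDbConfigSketch(entry: RegistryEntrySketch): DbConfigSketch {
  return entry.db !== undefined ? entry.db : { type: "kuzu" };
}
```

Making the field optional and defaulting at read time is one way to get backward compatibility; the PR may resolve the default differently (for example from the `GITNEXUS_DB_TYPE` environment variable).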
…tor search
- Introduced TypeScript indexing documentation detailing extraction of definitions, imports, calls, and inheritance.
- Implemented Neptune vector search functionality, including caching of embeddings and cosine similarity computation.
- Added Cohere, Ollama, OpenAI, and Transformers embedding providers with error handling and API integration.
- Created a factory for embedding provider instantiation based on configuration.
- Developed unit tests for embedding providers and Neptune vector search to ensure functionality and correctness.
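The "caching of embeddings and cosine similarity computation" mentioned above can be illustrated with a small in-memory sketch. The function names and the `Map`-based cache are assumptions; the PR's actual vector search code is not shown here.

```typescript
// Cosine similarity between two equal-length vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // A zero vector has no direction; treat it as dissimilar to everything.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Hypothetical brute-force search over cached embeddings (node id -> vector).
function topKSimilar(
  query: number[],
  cache: Map<string, number[]>,
  k: number,
): Array<{ id: string; score: number }> {
  const scored: Array<{ id: string; score: number }> = [];
  cache.forEach((vec, id) => {
    scored.push({ id, score: cosineSimilarity(query, vec) });
  });
  scored.sort((x, y) => y.score - x.score); // highest similarity first
  return scored.slice(0, k);
}
```

A brute-force scan like this is linear in the number of cached vectors, which is why caching the embeddings client-side matters when the database itself offers no vector index.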
- Introduced `pipeline-architecture.md` detailing the GitNexus indexing pipeline, including an overview, end-to-end sequence, and detailed phases of processing from file scanning to database loading.
- Added `worker-architecture.md` outlining the parallel code parsing mechanism using worker threads, covering worker lifecycle, message protocol, sub-batching, memory management, timeout handling, and error recovery.
…overview and language comparison
…, add yes option to skip prompts
- Implemented `runAnalyzeWizard` for guided analysis configuration.
- Created `runSetupWizard` to configure supported editors.
- Added `runWikiWizard` for generating wikis with LLM integration.
- Introduced new interactive TUI components for context, cypher, impact, and query analysis.
- Developed `generateGraphSummary` for precomputing graph summaries for visualization.
- Enhanced user experience with prompts and structured outputs across wizards and analysis tools.
@claude ultra-think while reviewing this pull request and check if performance improvements have been achieved
Claude finished @magyargergo's task in 3m 35s

PR Performance Review - feat: add LOD graph rendering and performance optimizations

Summary
The core LOD (Level-of-Detail) architecture is genuinely effective for its stated goal of preventing browser overload on large repos. Most claimed optimizations check out, though several real issues deserve attention.

✅ Verified Performance Improvements

1. NODE_HARD_CAP + Graph Truncation: Real, Effective
const NODE_HARD_CAP = 20000;
const buildGraph = async (limit?: number) => { ... }
The hard cap prevents sending more than 20K nodes to the browser in one shot. The multi-table loop early-exits when the cap is hit, and the …
// Push edge filtering to Cypher (avoid fetching all edges for large repos)
WHERE a.id IN $ids AND b.id IN $ids
This is the most impactful change in the PR. ✅

2. Graph Summary Precomputed and Cached: Real
During pipeline, …

3. Batch Graph Import: Real
graph.import({ nodes: batchNodes, edges: batchEdges });
Both adapters use batch import, eliminating per-node event emissions that would trigger Sigma.js re-renders during the add loop. ✅

4. …
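The cap-and-filter behaviour described in item 1 of the review can be sketched in memory as follows. Only `NODE_HARD_CAP` and the `truncated`/`totalAvailable` field names come from this PR; the node and edge shapes and the function `buildCappedGraph` are hypothetical, and the in-memory `filter` stands in for the `WHERE a.id IN $ids AND b.id IN $ids` clause that the PR pushes into Cypher.

```typescript
// Hypothetical shapes; the real GitNexus types are not shown in this thread.
interface CapNode { id: string; label: string }
interface CapEdge { source: string; target: string; type: string }
interface CappedGraph {
  nodes: CapNode[];
  edges: CapEdge[];
  truncated: boolean;
  totalAvailable: number;
}

const NODE_HARD_CAP = 20000;

function buildCappedGraph(
  tables: CapNode[][],
  allEdges: CapEdge[],
  limit: number = NODE_HARD_CAP,
): CappedGraph {
  const cap = Math.min(limit, NODE_HARD_CAP);
  let totalAvailable = 0;
  for (const table of tables) totalAvailable += table.length;

  const nodes: CapNode[] = [];
  for (const table of tables) {
    if (nodes.length >= cap) break; // early exit once the cap is hit
    for (const n of table.slice(0, cap - nodes.length)) nodes.push(n);
  }

  // Keep only edges whose endpoints were both loaded.
  const ids = new Set<string>();
  for (const n of nodes) ids.add(n.id);
  const edges = allEdges.filter((e) => ids.has(e.source) && ids.has(e.target));

  return { nodes, edges, truncated: nodes.length < totalAvailable, totalAvailable };
}
```

Returning `truncated` and `totalAvailable` alongside the data lets the client tell the user that the view is partial instead of silently dropping nodes.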
@naicud Can you review the comment above and address it in another commit to this branch? Good work so far!

Please resolve the merge conflicts too 🙏
- Level-of-detail graph rendering (summary, hierarchy, full)
- Graph summary builder (structural aggregation by cluster)
- New API endpoints: /api/graph/{info,summary,expand,neighbors,hierarchy,ancestors,schema}
- Node hard cap with truncation for large graphs
- Refactored useAppState into per-concern hooks
- New components: DataExplorer, ContextMenu, NeighborPanel, SchemaGraph, SearchPanel, StylingPanel
- Summary and hierarchy graph adapters
- ForceAtlas2 tuning for LOD rendering
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…npatwari#397)
Merge origin/main into pr/performance-lod-rendering, resolving 22 conflict files across backend (8), frontend (13), and tests (1).
Key resolution decisions:
- LOD endpoints, hooks, and components: kept our branch additions
- Language dispatch unification (abhigyanpatwari#409): kept main's exhaustive Record types, added COBOL entries to all dispatch tables (import-resolution, framework-detection, call-routing, entry-point-scoring, tree-sitter-queries, parser-loader)
- Cross-file binding propagation (abhigyanpatwari#397): kept our branch's Phase 14 code
- Neptune DB backend: kept main's Neptune dispatch paths in api.ts, local-backend
- LLM providers: merged minimax (ours) + custom/bedrock (main) into all provider lists, types, and settings
- arm64 Mac safety: preserved sequential query execution in local-backend impact analysis, updated to use this.runQuery for Neptune compatibility
- Schema test: updated SCHEMA_QUERIES count to 29 (28 node + 1 rel, Section added)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ies, fix LOD thresholds, optimize buildGraph
- Replace all string-interpolated Cypher with executeParameterizedQuery (injection fix)
- Align LOD mode thresholds with NODE_HARD_CAP (20K) to prevent silent truncation
- Skip count queries for non-truncated tables in buildGraph
- Remove n.content from graph visualization queries (bandwidth optimization)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
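To illustrate why the commit above replaces string-interpolated Cypher, here is a hedged sketch contrasting the two styles. The signature of `executeParameterizedQuery` is not shown in this thread, so the sketch only builds the query text and parameter map; `unsafeNeighborQuery`, `neighborQuery`, and the query itself are hypothetical.

```typescript
interface ParameterizedQuery {
  cypher: string;
  params: Record<string, unknown>;
}

// Unsafe pattern: user input becomes part of the query text, so a crafted
// id can close the string literal and append its own clauses.
function unsafeNeighborQuery(nodeId: string): string {
  return `MATCH (a)-[r]-(b) WHERE a.id = '${nodeId}' RETURN b.id`;
}

// Safer pattern: the query text is constant and the value travels
// separately as a parameter, so it is never parsed as Cypher.
function neighborQuery(nodeId: string): ParameterizedQuery {
  return {
    cypher: "MATCH (a)-[r]-(b) WHERE a.id = $id RETURN b.id",
    params: { id: nodeId },
  };
}
```

With the parameterized form the query text is identical for every request, which also gives the database a chance to reuse a cached plan.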
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Extracted duplicate edge aggregation logic from generateGraphSummary and generateStructuralSummary into a shared aggregateInterGroupEdges helper function. Both functions had nearly identical code for counting inter-group edges and tracking edge types. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
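A shared helper for "counting inter-group edges and tracking edge types" might look like the sketch below. The name `aggregateInterGroupEdges` appears in the commit, but its real signature is not shown in this thread; the `Sketch` suffixed function, its parameters, and the return shape are assumptions.

```typescript
// Hypothetical shapes for illustration.
interface RawEdge { source: string; target: string; type: string }
interface GroupEdge {
  from: string;
  to: string;
  count: number;
  types: Record<string, number>; // edge type -> occurrences
}

function aggregateInterGroupEdgesSketch(
  edges: RawEdge[],
  groupOf: Map<string, string>, // node id -> group (cluster) id
): GroupEdge[] {
  const byPair = new Map<string, GroupEdge>();
  for (const e of edges) {
    const from = groupOf.get(e.source);
    const to = groupOf.get(e.target);
    // Skip edges with unknown endpoints and edges inside a single group.
    if (from === undefined || to === undefined || from === to) continue;
    const key = from + "\u0000" + to;
    let agg = byPair.get(key);
    if (!agg) {
      agg = { from, to, count: 0, types: {} };
      byPair.set(key, agg);
    }
    agg.count += 1;
    agg.types[e.type] = (agg.types[e.type] || 0) + 1;
  }
  const out: GroupEdge[] = [];
  byPair.forEach((v) => { out.push(v); });
  return out;
}
```

Because both summary generators only differ in how nodes are assigned to groups, passing the grouping in as a map lets one helper serve both.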
…r-mode hydration
Symbol nodes (Module, Class, Function, etc.) still omit content for bandwidth optimization. File nodes retain content because server-connection.ts extractFileContents() depends on it for BM25 indexing, agent tools, and embeddings.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
… plan
- AGENTS.md/CLAUDE.md: 2273→3816 symbols, 5419→8368 relationships, 174→300 flows
- Add implementation plan for PR abhigyanpatwari#362 review findings
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Force-pushed from 53c5c20 to 35868e6
CI Report
✅ All checks passed
Pipeline
Tests
✅ All 3635 tests passed across 1046 files; 20 test(s) skipped
Coverage
📋 Full run · Coverage from Ubuntu · Generated by CI
Working on it :)
…output)
- parse-worker: add `{ type: 'ready' }` postMessage after module init so
worker-pool readiness gate resolves (was timing out after 10s)
- status-formatter: write informational messages to stdout instead of
stderr, consistent with the success path (p.note → stdout)
- Remove obsolete PR abhigyanpatwari#362 review fixes plan document
Fixes: 5 worker-pool.test.ts failures, 2 cli-e2e.test.ts failures
Result: 3635/3635 tests passing, 0 failures
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@zander-raycraft Check now plz and tell me if you need other changes, sorry for the delay :)

@magyargergo This looks good to me, how does it look to you?

@naicud Thank you!
I think there are a lot of unrelated changes. I'll take a closer look.

@naicud may I ask you to slim this PR down strictly to the level-of-detail performance improvement changes? 🙏

@claude can you identify the groups of changes in this PR? I feel like there's more happening here than just the LOD. What other categories are there? Can they be split up?
Claude finished @magyargergo's task in 3m 7s

PR Change Category Analysis

Categories of Changes
You're right: this PR contains at least 6 distinct feature groups bundled together. Here's the breakdown across the 100 changed files and 65 commits:

1. LOD Graph Rendering (the intended scope): ~36 files
The core feature. Entirely within the squashed …

2. useAppState Refactor: ~8 files

3. AWS Bedrock Integration: ~4 files
Commits like …

4. COBOL Language Support: ~8 files
Commits …

5. TUI Wizards + Embed CLI: ~12 files
Commits …

6. Documentation: ~22 files
All …

7. Skills / Config Files: ~8 files

Why the Extra Features Appear in This PR
PRs #358–#361 were authored on a shared working branch that eventually became this PR (#362). Those PRs were reviewed and approved but haven't been merged into …

Can They Be Split?

Recommended Path
The cleanest approach for @naicud:
The …
@naicud can you please have a look at this? We need to split this change to reduce risk in the future, and it will make your changes easier for us to review.
Please submit a new PR if this is still relevant |
Summary
Adds Level-of-Detail (LOD) graph rendering to handle large codebases (10K+ nodes) that would overwhelm the browser, plus performance optimizations and a major UI restructuring.
LOD Rendering
Three rendering levels (summary, hierarchy, full) progressively reveal detail.
Users click to expand clusters, drill into neighborhoods, and navigate large graphs fluidly.
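As a rough illustration of how a client might pick between the summary, hierarchy, and full levels, consider the sketch below. The 20K hard cap comes from the review thread and the commit that aligns LOD thresholds with `NODE_HARD_CAP`; the 5,000-node hierarchy threshold and the function `chooseLodMode` are assumptions, not values from the PR.

```typescript
type LodMode = "summary" | "hierarchy" | "full";

const LOD_HARD_CAP = 20000;             // NODE_HARD_CAP quoted in the review
const LOD_HIERARCHY_THRESHOLD = 5000;   // hypothetical value for illustration

// Pick the cheapest level that can still show the whole graph.
function chooseLodMode(nodeCount: number): LodMode {
  // Above the hard cap a full load would be truncated, so start from
  // the aggregated summary and let the user expand clusters on demand.
  if (nodeCount > LOD_HARD_CAP) return "summary";
  if (nodeCount > LOD_HIERARCHY_THRESHOLD) return "hierarchy";
  return "full";
}
```

Keeping the upper threshold equal to the hard cap means the full level is only offered when every node fits, which is the "prevent silent truncation" point made in the commit log.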
Performance
- `buildGraph()` with limit/truncation and `truncated`/`totalAvailable` response metadata
- `buildStructuralSummaryFromDb()`: server-side summary generation from graph DB
- `fastStripNullable()`: type resolution optimization
- `skipGraphPhases`: pipeline option to skip expensive graph phases
New API Endpoints
- `/api/graph/{info, summary, expand, neighbors, hierarchy, ancestors, neighbor-counts, schema}`
UI Restructuring
- `useAppState` refactored into 6 per-concern hooks: `useGraphState`, `useFilterState`, `useChatState`, `useWorkerState`, `useKeyboardShortcuts`, `useStyleConfig`
- New components: `DataExplorer`, `ContextMenu`, `NeighborPanel`, `SchemaGraph`, `SearchPanel`, `StylingPanel`
- `summary-graph-adapter.ts`, `hierarchy-graph-adapter.ts` for LOD data transformation
Changes (37 files)
- `api.ts`, `graph-summary.ts`, `pipeline.ts`
- `graph-lod.ts`, `summary-graph-adapter.ts`, `hierarchy-graph-adapter.ts`
- `GraphCanvas.tsx`, `App.tsx`
- `useAppState.tsx`, `useSigma.ts`
- `graph-adapter.ts`, `constants.ts`
Test plan
- `cd gitnexus && npm run build && cd ../gitnexus-web && npm run build`
🤖 Generated with Claude Code