feat: AWS Neptune backend, COBOL support, Bedrock AI, TUI wizards & perf #343

Merged
github-actions[bot] merged 0 commit into
abhigyanpatwari:main from
naicud:main
Mar 18, 2026
Conversation

@naicud

@naicud naicud commented Mar 18, 2026

Contributor

Summary

This PR adds several major features developed in parallel on top of upstream:

🗄️ AWS Neptune Graph Backend

  • Full Neptune adapter as alternative DB backend to KuzuDB/LadybugDB
  • CLI flags (--db neptune, --neptune-endpoint, --neptune-region)
  • IAM/SigV4 auth via AWS SDK credential chain
  • Fault-tolerant ingestion with idempotent upserts (batch 500)
  • Vector search via Neptune embeddings (float[] serialized as JSON)
  • API endpoint /api/db/test + MCP dispatch for Neptune queries
  • Web UI: "Database Backend" settings panel + Cypher console modal
  • WAL corruption recovery + compatibility shim for KuzuDB→LadybugDB rename
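
To make the ingestion bullets above concrete, here is a minimal sketch of batched, idempotent upserts with embeddings serialized as JSON. It is an illustration only, not the PR's code: `NodeRow`, `UpsertRow`, `chunk`, `buildUpsert`, and the `Symbol` node label are hypothetical names, and the query is a generic openCypher-style `UNWIND`/`MERGE` statement assumed for the example.

```typescript
// Hypothetical sketch: all names below are illustrative, not taken from the PR.
const BATCH_SIZE = 500; // batch size stated in the PR summary

interface NodeRow {
  id: string;
  label: string;
  embedding?: number[];
}

interface UpsertRow {
  id: string;
  label: string;
  embedding: string | null;
}

// Split rows into fixed-size batches so a failed batch can be retried alone.
function chunk<T>(rows: T[], size: number): T[][] {
  if (size <= 0) throw new Error("size must be positive");
  const batches: T[][] = [];
  for (let i = 0; i < rows.length; i += size) {
    batches.push(rows.slice(i, i + size));
  }
  return batches;
}

// Build the query and parameters for one batch. MERGE on a stable id means
// replaying the same batch updates existing nodes instead of duplicating them.
function buildUpsert(batch: NodeRow[]): { query: string; rows: UpsertRow[] } {
  const query = [
    "UNWIND $rows AS row",
    "MERGE (n:Symbol {id: row.id})", // 'Symbol' label is an assumption
    "SET n.label = row.label, n.embedding = row.embedding",
  ].join(" ");
  const rows = batch.map(
    (r): UpsertRow => ({
      id: r.id,
      label: r.label,
      // float[] serialized as a JSON string, as described in the summary
      embedding: r.embedding ? JSON.stringify(r.embedding) : null,
    }),
  );
  return { query, rows };
}
```

Because each batch is keyed on `id`, a retry after a partial failure can resend the whole batch without first checking which rows already landed.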

🏢 COBOL Language Support

  • Regex-only extraction (bypasses tree-sitter-cobol which hangs on ~5% of files)
  • Extracts: PROGRAM-ID → Module, paragraphs → Function, sections → Namespace
  • CALL / PERFORM / COPY edge resolution (cross-file + intra-file)
  • GITNEXUS_COBOL_DIRS env var for extensionless file detection
  • Worker pool sub-batch size tuned for large COBOL repos (200 vs 1500 default)
  • JCL analysis with modernization scoring and reporting
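
As a rough illustration of the regex-only approach listed above, the sketch below maps PROGRAM-ID to Module, sections to Namespace, and paragraphs to Function. It assumes fixed-format COBOL (indicator in column 7, Area A in columns 8-11) and is not the PR's implementation; `extractCobolSymbols` and the patterns are hypothetical.

```typescript
// Hypothetical sketch of regex-based COBOL symbol extraction (fixed format assumed).
type SymbolKind = "Module" | "Namespace" | "Function";

interface CobolSymbol {
  kind: SymbolKind;
  name: string;
  line: number; // 1-based source line
}

const PROGRAM_ID = /^\s*PROGRAM-ID\.\s*([A-Z0-9][A-Z0-9-]*)/i;
// Section and paragraph headers must start in Area A (at most 3 leading spaces).
const SECTION = /^ {0,3}([A-Z0-9][A-Z0-9-]*)\s+SECTION\s*\./i;
const PARAGRAPH = /^ {0,3}([A-Z0-9][A-Z0-9-]*)\s*\.\s*$/i;

const nameOf = (m: RegExpExecArray): string => (m[1] ?? "").toUpperCase();

function extractCobolSymbols(source: string): CobolSymbol[] {
  const symbols: CobolSymbol[] = [];
  let inProcedure = false;
  source.split(/\r?\n/).forEach((raw, index) => {
    // Column 7 is the indicator area; '*' or '/' marks a comment line.
    const indicator = raw.charAt(6);
    if (indicator === "*" || indicator === "/") return;
    const code = raw.slice(7, 72); // columns 8-72 hold code
    const line = index + 1;

    const program = PROGRAM_ID.exec(code);
    if (program) {
      symbols.push({ kind: "Module", name: nameOf(program), line });
      return;
    }
    if (/^\s*PROCEDURE\s+DIVISION/i.test(code)) {
      inProcedure = true;
      return;
    }
    // Ignore DATA DIVISION sections such as WORKING-STORAGE.
    if (!inProcedure) return;

    const section = SECTION.exec(code);
    if (section) {
      symbols.push({ kind: "Namespace", name: nameOf(section), line });
      return;
    }
    const paragraph = PARAGRAPH.exec(code);
    if (paragraph) {
      symbols.push({ kind: "Function", name: nameOf(paragraph), line });
    }
  });
  return symbols;
}
```

A line-oriented pass like this does bounded work per line, which is one plausible reason to prefer it over a parser that hangs on some inputs; free-format sources and continuation lines would need extra handling.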

🤖 AWS Bedrock Integration

  • Bedrock proxy endpoints (/api/bedrock/health, /api/bedrock/converse)
  • Chat with Claude via Bedrock from the GitNexus web UI
  • Support for custom OpenAI-compatible providers

🖥️ TUI Improvements

  • Interactive wizards for analyze and wiki commands
  • --yes flag to skip prompts in CI environments
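
One way the wizard gating could work is sketched below. This is an assumption-laden example, not the PR's logic: `PromptContext` and `shouldPrompt` are hypothetical names, and treating a set `CI` environment variable as "non-interactive" is a common convention rather than something the PR description confirms.

```typescript
// Hypothetical sketch: decide whether to launch an interactive wizard.
interface PromptContext {
  argv: string[]; // command-line arguments
  env: Record<string, string | undefined>; // environment variables
  isTTY: boolean; // whether stdin is an interactive terminal
}

function shouldPrompt(ctx: PromptContext): boolean {
  // An explicit --yes always skips prompts.
  if (ctx.argv.includes("--yes")) return false;
  // Many CI systems set CI=true; treat any truthy-looking value as CI (assumption).
  const ci = ctx.env["CI"];
  if (ci !== undefined && ci !== "" && ci !== "false" && ci !== "0") return false;
  // Without a TTY there is nobody to answer the prompts.
  return ctx.isTTY;
}
```

Passing `argv`, `env`, and `isTTY` in as data keeps the decision easy to unit test without touching the real process state.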

⚡ Performance & Rendering

  • Phase 1A/1B/1C graph rendering foundation with LOD (Level of Detail) support
  • fastStripNullable, skipGraphPhases, AST pruning optimizations
  • Improved embedding cache build process in Neptune adapter

📦 New CLI Command

  • gitnexus embed — standalone command to generate/update embeddings

Test plan

  • gitnexus analyze works unchanged on TS/JS repos (LadybugDB path untouched)
  • gitnexus analyze --db neptune --neptune-endpoint <host> ingests to Neptune
  • COBOL indexing: set GITNEXUS_COBOL_DIRS=s,c and run analyze on a COBOL repo
  • Bedrock proxy responds to /api/bedrock/health
  • TUI wizards launch on gitnexus analyze (interactive mode)
  • gitnexus embed generates embeddings without re-analyzing

@vercel

vercel Bot commented Mar 18, 2026

@naicud is attempting to deploy a commit to the NexusCore Team on Vercel.

A member of the Team first needs to authorize it.

@magyargergo

Collaborator

This is very impressive work! Could you please split it up, as the scope is huge?

@naicud

naicud commented Mar 18, 2026

Contributor Author

Hi @magyargergo,

thanks for raising this. I totally understand the concern around the size of the PR.

I did evaluate splitting it, but given how tightly the components are coupled, breaking it up would actually introduce more risk than it removes. Each part depends on the others to function correctly, and reviewing them in isolation could create a false sense of safety: changes that look fine individually might only surface issues when integrated.

Keeping it as a single PR gives us full visibility over the impact of each change, and ensures the review reflects the actual behaviour of the system.

I'm happy to make the review easier in other ways. For example, I can walk you through the code structure in a short call, add more inline documentation, or break down the PR description by logical area so it's easier to navigate.

Let me know what works best for you.

Thanks,

@magyargergo

Collaborator

@claude can you recommend a splitting strategy for this PR to reduce its scope? For example, the language support could go into a separate PR. I want to see how good their coverage is for the COBOL language support, and I'd be interested in how they tackled the language-specific challenges.

@github-actions github-actions Bot merged commit 7b00e7f into abhigyanpatwari:main Mar 18, 2026
1 check failed
@claude

claude Bot commented Mar 18, 2026

Claude finished @magyargergo's task in 2m 49s


PR Splitting Strategy

  • Understand the request
  • Explore codebase to identify component boundaries
  • Analyze coupling between features
  • Recommend splitting strategy

Based on exploring the actual codebase, here is a recommended split into 5 focused PRs, ordered from least to most coupled:


PR 1 — COBOL Language Support (most self-contained — start here)

Files involved:

  • gitnexus/src/core/ingestion/cobol-preprocessor.ts — regex-based symbol extraction
  • gitnexus/src/core/ingestion/cobol-copy-expander.ts — COPY statement expansion with cycle detection
  • scripts/patch-tree-sitter-cobol.cjs — tree-sitter grammar patch
  • Integration touches in ingestion/parsing-processor.ts (dispatch) and worker pool (sub-batch size 200 for GITNEXUS_COBOL_DIRS)

Why it can stand alone: COBOL support has zero dependencies on Neptune, Bedrock, or the embed command. It plugs in as a new language parser behind a feature-flagged code path (GITNEXUS_COBOL_DIRS env var). You can review it in complete isolation, and this is exactly the PR you mentioned wanting to scrutinize for test coverage and how language-specific challenges (COPY expansion, JCL, regex vs tree-sitter decision) were addressed.

Questions worth asking in review:

  • What's the test coverage on cobol-copy-expander.ts — specifically cycle detection and REPLACING transformations?
  • How is the "tree-sitter-cobol hangs on ~5% of files" problem bounded? Is there a timeout/fallback?
  • How does extensionless file detection work when GITNEXUS_COBOL_DIRS is set?
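
For reviewers thinking about the cycle-detection question above, here is a minimal sketch of recursive COPY expansion that tracks the chain of copybooks currently being expanded. It is not the code in cobol-copy-expander.ts: `expandCopies`, `CopybookResolver`, and the single-line `COPY name.` pattern are hypothetical simplifications, and REPLACING is deliberately left out.

```typescript
// Hypothetical resolver: returns a copybook's text, or undefined if it is missing.
type CopybookResolver = (name: string) => string | undefined;

// Simplification: only matches a whole-line "COPY NAME." with no REPLACING clause.
const COPY_STATEMENT = /^\s*COPY\s+([A-Z0-9][A-Z0-9-]*)\s*\.\s*$/i;

function expandCopies(
  source: string,
  resolve: CopybookResolver,
  stack: string[] = [], // copybooks on the current expansion path
): string {
  return source
    .split(/\r?\n/)
    .map((line) => {
      const match = COPY_STATEMENT.exec(line);
      if (!match) return line;
      const name = (match[1] ?? "").toUpperCase();
      // A name already on the path means the copybooks include each other.
      if (stack.includes(name)) {
        throw new Error(`COPY cycle: ${[...stack, name].join(" -> ")}`);
      }
      const body = resolve(name);
      if (body === undefined) return line; // leave unresolved COPY untouched
      return expandCopies(body, resolve, [...stack, name]);
    })
    .join("\n");
}
```

Tracking the current path rather than a global "seen" set means a copybook included twice from different places still expands both times, while a true cycle such as A -> B -> A is rejected.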

PR 2 — gitnexus embed CLI + TUI Wizards

Files involved:

  • gitnexus/src/cli/embed.ts, embed-config.ts
  • gitnexus/src/cli/tui/wizards/ (analyze-wizard, embed-wizard, wiki-wizard, setup-wizard)
  • gitnexus/src/cli/tui/shared.ts — TTY/CI detection, --yes flag, env var serialization

Why to group these: The embed wizard and analyze wizard share the same @clack/prompts-based infrastructure. The --yes flag spans both. They don't depend on Neptune or Bedrock being merged first since the wizard conditionally shows options based on what's available.


PR 3 — AWS Neptune Backend

Files involved:

  • gitnexus/src/core/db/neptune/ (adapter, ingest, vector-search)
  • gitnexus/src/core/db/interfaces.ts — adds DbType = 'lbug' | 'neptune' and NeptuneDbConfig
  • CLI flags --db neptune, --neptune-endpoint, --neptune-region
  • API endpoint /api/db/test
  • Web UI "Database Backend" settings panel

Key coupling to manage: neptune-vector-search.ts consumes the embedding pipeline. Stub this out with a throw new Error("embeddings required") guard if embeddings aren't ready, or merge it after the PR that adds embeddings.
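
A minimal sketch of how the backend type and the suggested guard might look is shown here. Only `DbType = 'lbug' | 'neptune'`, the `NeptuneDbConfig` name, and the error message come from this comment; the config fields (inferred from the CLI flags), `LbugDbConfig`, `describeBackend`, and `assertEmbeddingsReady` are assumptions for illustration.

```typescript
type DbType = "lbug" | "neptune"; // union named in the file list above

// Field names are assumptions based on --neptune-endpoint and --neptune-region.
interface NeptuneDbConfig {
  type: "neptune";
  endpoint: string;
  region: string;
}

// Hypothetical counterpart for the embedded backend.
interface LbugDbConfig {
  type: "lbug";
  path: string;
}

type DbConfig = LbugDbConfig | NeptuneDbConfig;

// The discriminant lets the compiler check that every backend is handled.
function describeBackend(config: DbConfig): string {
  const kind: DbType = config.type;
  switch (config.type) {
    case "lbug":
      return `${kind} at ${config.path}`;
    case "neptune":
      return `${kind} at ${config.endpoint} (${config.region})`;
  }
}

// Guard suggested in the review: fail loudly until embeddings are available.
function assertEmbeddingsReady(ready: boolean): void {
  if (!ready) throw new Error("embeddings required");
}
```

With a guard like this, vector search can ship in the Neptune PR and fail with a clear message, rather than blocking the merge on the embeddings work.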


PR 4 — AWS Bedrock Integration

Files involved:

  • gitnexus-web/src/core/llm/bedrock-browser.ts — 507-line ChatBedrockBrowser class
  • Bedrock proxy endpoints in gitnexus/src/server/api.ts (/api/bedrock/health, /api/bedrock/converse, /api/bedrock/converse-stream)
  • Web UI chat panel changes

Why it stands alone: Bedrock is purely additive. It doesn't change any existing graph ingestion or query logic. The proxy is a separate API surface using aws4fetch for SigV4 signing. No other features depend on it being present.


PR 5 — Performance & LOD Graph Rendering

Files involved:

  • gitnexus-web/src/services/graph-lod.ts: GraphInfo, GraphSummary, GroupExpansion types + fetch functions
  • gitnexus-web/src/lib/summary-graph-adapter.ts, hierarchy-graph-adapter.ts
  • LOD API endpoints in api.ts (/api/graph/info, /api/graph/summary, /api/graph/expand, etc.)
  • fastStripNullable, skipGraphPhases, AST pruning optimizations (touches core pipeline)

Note: This is the riskiest split — the performance changes touching the core analysis pipeline (fastStripNullable, AST pruning) should be separated from pure rendering/LOD changes to make regression testing tractable.


Coupling Heat Map

| Feature | Depends on |
| --- | --- |
| COBOL | (nothing new); pure addition to existing ingestion pipeline |
| TUI Wizards | Existing commands; conditionally references Neptune config types |
| Neptune | IDbQueryAdapter interface (additive); optionally uses embeddings |
| Bedrock | Server API (additive proxy routes only) |
| LOD Rendering | Server API (additive endpoints); community detection (existing) |

Suggested Review Order

PR 1: COBOL             ← review independently right now
PR 2: TUI + embed CLI   ← can be reviewed in parallel with COBOL  
PR 3: Neptune           ← after interfaces.ts changes are agreed on
PR 4: Bedrock           ← anytime, fully independent
PR 5: Perf + LOD        ← last, most risk to existing behavior

The key counterargument from the author (that the components are "tightly coupled") doesn't hold up against the actual code structure — the Neptune and COBOL changes integrate via clean adapter interfaces and conditional code paths, not intertwined logic.

@magyargergo

magyargergo commented Mar 18, 2026

Collaborator

Hey @zander-raycraft, thanks for all the work on this PR. Unfortunately, due to a CI workflow issue, this PR was accidentally merged into main and we had to force-push to restore the correct history. The PR is now in a broken state (marked as merged but the commits aren't on main).

Could you re-submit your changes as separate PRs following the splitting strategy Claude suggested earlier? Here's the recommended breakdown:

  1. PR 1: COBOL Language Support (most self-contained, start here)
  2. PR 2: gitnexus embed CLI + TUI Wizards
  3. PR 3: AWS Neptune Backend
  4. PR 4: AWS Bedrock Integration
  5. PR 5: Performance & LOD Graph Rendering

Suggested review order:

PR 1: COBOL             ← review independently right now
PR 2: TUI + embed CLI   ← can be reviewed in parallel with COBOL
PR 3: Neptune           ← after interfaces.ts changes are agreed on
PR 4: Bedrock           ← anytime, fully independent
PR 5: Perf + LOD        ← last, most risk to existing behavior

See the full splitting analysis for detailed file lists, coupling notes, and review questions for each PR.

Splitting these up will make review much more manageable and reduce the risk of regressions. Let us know if you have any questions!

@naicud

naicud commented Mar 18, 2026

Contributor Author

@magyargergo do you want some help?

@magyargergo

Collaborator

@magyargergo do you want some help?

Yes please 🙏 I'm quite busy at the moment, so if you could split up your changes into 5 separate PRs, that would be greatly appreciated and would be a great help!

@naicud

naicud commented Mar 18, 2026

Contributor Author

@magyargergo I'm trying to split it into 5 PRs, but we will have some conflicts

@naicud

naicud commented Mar 18, 2026

Contributor Author

┌───┬─────────────────────────┬──────┬────────┐
│ # │ PR                      │ URL  │ Status │
├───┼─────────────────────────┼──────┼────────┤
│ 1 │ AWS Bedrock Integration │ #358 │ ✅     │
├───┼─────────────────────────┼──────┼────────┤
│ 2 │ COBOL Language Support  │ #359 │ ✅     │
├───┼─────────────────────────┼──────┼────────┤
│ 3 │ AWS Neptune Backend     │ #360 │ ✅     │
├───┼─────────────────────────┼──────┼────────┤
│ 4 │ TUI Wizards + Embed CLI │ #361 │ ✅     │
├───┼─────────────────────────┼──────┼────────┤
│ 5 │ LOD Graph Rendering     │ #362 │ ✅     │
└───┴─────────────────────────┴──────┴────────┘

@magyargergo

@zander-raycraft

Collaborator

Hey @magyargergo, did you mean to tag me in this PR? If you need me to work on this I am happy to. I think this may be @naicud's PR, and it looks like he is doing some stuff with it! If not, then lmk what you do need help with!

3 participants