Merged
239 changes: 239 additions & 0 deletions .cursor/plans/enhance_523ca41c.plan.md
@@ -0,0 +1,239 @@
---
name: Enhance
overview: Restructure GitNexus LLM tools to leverage clusters and processes for better code understanding. Remove unused highlight tool, add new tools (explore, overview), enhance existing tools with cluster/process context, and improve impact analysis reliability.
todos: []
---

# Enhanced LLM Tools with Cluster and Process Integration

## Summary

Restructure GitNexus from 6 tools into **7 focused tools** that leverage the pre-computed clusters (Communities) and processes for richer context. Remove the `highlight` tool, add `explore` and `overview` tools, and enhance `search` and `blastRadius` (renamed to `impact`) with cluster/process awareness.

## Final Tool Set

| Tool | Status | Purpose |
|------|--------|---------|
| `search` | Enhance | Hybrid search + group results by process/cluster |
| `grep` | Keep | Regex pattern search |
| `read` | Keep | Read file content |
| `explore` | **New** | Deep dive on one symbol, cluster, or process |
| `overview` | **New** | Codebase map (all clusters + all processes) |
| `impact` | Enhance | Rename from blastRadius, add process/cluster context, increase limits |
| `cypher` | Keep | Raw graph queries |
| `highlight` | **Remove** | No longer needed |

## Architecture

```mermaid
flowchart TD
subgraph tools [LLM Tools Layer]
search[search]
grep[grep]
read[read]
explore[explore]
overview[overview]
impact[impact]
cypher[cypher]
end

subgraph kg [Knowledge Graph]
nodes[Nodes: File, Function, Class...]
communities[Community Nodes]
processes[Process Nodes]
edges[CodeRelation Edges]
memberOf[MEMBER_OF Edges]
stepIn[STEP_IN_PROCESS Edges]
end

search --> edges
search --> communities
search --> processes
explore --> communities
explore --> processes
explore --> memberOf
explore --> stepIn
overview --> communities
overview --> processes
impact --> edges
impact --> communities
impact --> processes
cypher --> kg
```



## File Changes

### 1. Remove Highlight Tool

**File:** [gitnexus/src/core/llm/tools.ts](gitnexus/src/core/llm/tools.ts)

- Delete the `highlightTool` definition (lines ~395-414)
- Remove `highlightTool` from the returned array (line ~862)
- Remove highlight marker logic from `blastRadius` output (lines ~814-816)

**File:** [gitnexus/src/core/llm/agent.ts](gitnexus/src/core/llm/agent.ts)

- Remove highlight references from system prompt (lines 70, 77)
- Update tool list in prompt to reflect new tools

**File:** [gitnexus/src/core/llm/types.ts](gitnexus/src/core/llm/types.ts)

- Remove `'highlight'` from `AgentStreamChunk.type` union (line 180)
- Remove `highlightNodeIds` property (lines 187-188)

### 2. Add `explore` Tool

**File:** [gitnexus/src/core/llm/tools.ts](gitnexus/src/core/llm/tools.ts)

New tool that auto-detects target type and returns comprehensive context:

```typescript
explore({
target: string, // Name of symbol, cluster, or process
type?: 'symbol' | 'cluster' | 'process' // Optional, auto-detected
})
```

**Functionality:**

- For symbols: Query node, get MEMBER_OF cluster, get STEP_IN_PROCESS processes, get 1-hop connections
- For clusters: Query Community node, get members via MEMBER_OF, get processes that touch this cluster
- For processes: Query Process node, get steps via STEP_IN_PROCESS with step order, get clusters touched

**Cypher queries needed:**

```cypher
// Symbol cluster membership
MATCH (s {name: $name})-[:CodeRelation {type: 'MEMBER_OF'}]->(c:Community)
RETURN c.label, c.description

// Symbol process participation
MATCH (s {name: $name})-[r:CodeRelation {type: 'STEP_IN_PROCESS'}]->(p:Process)
RETURN p.label, r.step, p.stepCount

// Process steps in order
MATCH (s)-[r:CodeRelation {type: 'STEP_IN_PROCESS'}]->(p:Process {id: $processId})
RETURN s.name, s.filePath, r.step
ORDER BY r.step
```
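
The auto-detection step for `explore` could look roughly like the sketch below. Everything in it is an assumption for illustration: the `GraphIndex` shape, the `detectTargetType` name, and the probing order (process, then cluster, then symbol) are hypothetical and not taken from the GitNexus codebase. A real implementation would presumably look names up in the graph rather than in in-memory sets.

```typescript
// Hypothetical sketch: these names are illustrative, not GitNexus APIs.
type TargetType = 'symbol' | 'cluster' | 'process';

// Assumed in-memory stand-in for lookups against the knowledge graph.
interface GraphIndex {
  symbols: Set<string>;
  clusters: Set<string>;
  processes: Set<string>;
}

// An explicit `type` always wins; otherwise probe each kind in a fixed order.
// The order (process, cluster, symbol) is an assumption, not a documented rule.
function detectTargetType(
  target: string,
  index: GraphIndex,
  explicit?: TargetType,
): TargetType | undefined {
  if (explicit) return explicit;
  if (index.processes.has(target)) return 'process';
  if (index.clusters.has(target)) return 'cluster';
  if (index.symbols.has(target)) return 'symbol';
  return undefined; // Unknown target: the caller decides how to report it.
}
```

Returning `undefined` for an unknown name keeps the "not found" decision in the tool wrapper, which can then suggest `search` to the agent instead.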



### 3. Add `overview` Tool

**File:** [gitnexus/src/core/llm/tools.ts](gitnexus/src/core/llm/tools.ts)

New tool that returns codebase structure:

```typescript
overview() // No parameters
```

**Functionality:**

- Query all Community nodes with member counts
- Query all Process nodes with step counts and types
- Calculate cluster dependencies (cross-cluster CALLS)
- Identify critical paths (most connected processes)

**Output format:**

```text
CLUSTERS (N total):
| Cluster | Symbols | Cohesion | Description |
...

PROCESSES (N total):
| Process | Steps | Type | Clusters |
...

CRITICAL PATHS:
- LoginFlow (45 edges)
...
```
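
As a rough illustration of the "cluster dependencies (cross-cluster CALLS)" step, the sketch below counts call edges whose endpoints belong to different clusters. The `CallEdge` type, the `clusterDependencies` name, and the `"A -> B"` key format are hypothetical; it also assumes each symbol belongs to at most one cluster, which the plan does not state.

```typescript
// Hypothetical sketch: types and names are illustrative, not GitNexus APIs.
interface CallEdge {
  from: string; // Caller symbol name
  to: string;   // Callee symbol name
}

// Count CALLS edges that cross a cluster boundary.
// `memberOf` maps symbol name -> cluster label (assumes one cluster per symbol).
function clusterDependencies(
  memberOf: Map<string, string>,
  calls: CallEdge[],
): Map<string, number> {
  const counts = new Map<string, number>();
  for (const { from, to } of calls) {
    const src = memberOf.get(from);
    const dst = memberOf.get(to);
    // Skip edges with unclustered endpoints and edges inside a single cluster.
    if (!src || !dst || src === dst) continue;
    const key = `${src} -> ${dst}`;
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  return counts;
}
```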



### 4. Enhance `search` Tool

**File:** [gitnexus/src/core/llm/tools.ts](gitnexus/src/core/llm/tools.ts)

Modify existing search to group results by process:

**Current:** Returns flat list with 1-hop connections

**Enhanced:** Groups results by process, adds cluster context

**Changes:**

- After hybrid search, query STEP_IN_PROCESS for each result
- Group results by process ID
- Sort processes by number of matching results (relevance)
- Add cluster label for each result via MEMBER_OF query
- Keep 1-hop connections as optional detail

**New parameter:**

```typescript
search({
query: string,
groupByProcess?: boolean, // Default: true
limit?: number
})
```
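
The grouping and relevance sort described above might be sketched as follows. The `SearchHit` and `ProcessGroup` shapes, the `groupHitsByProcess` name, and the `'(no process)'` bucket for hits outside any process are assumptions made for this example, not part of the existing search tool.

```typescript
// Hypothetical sketch: types and names are illustrative, not GitNexus APIs.
interface SearchHit {
  name: string;
  processIds: string[]; // Processes this symbol is a step in (may be empty)
}

interface ProcessGroup {
  processId: string;
  hits: SearchHit[];
}

// Group hits by process ID, then sort groups by number of matching hits.
function groupHitsByProcess(hits: SearchHit[]): ProcessGroup[] {
  const groups = new Map<string, SearchHit[]>();
  for (const hit of hits) {
    // Assumption: hits outside any process land in a catch-all bucket.
    const ids = hit.processIds.length > 0 ? hit.processIds : ['(no process)'];
    for (const id of ids) {
      const bucket = groups.get(id);
      if (bucket) bucket.push(hit);
      else groups.set(id, [hit]);
    }
  }
  return Array.from(groups, ([processId, grouped]) => ({ processId, hits: grouped }))
    .sort((a, b) => b.hits.length - a.hits.length);
}
```

A hit that participates in several processes appears once per process here; deduplicating it instead is an equally plausible design and would change the relevance counts.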



### 5. Enhance `impact` Tool (rename from blastRadius)

**File:** [gitnexus/src/core/llm/tools.ts](gitnexus/src/core/llm/tools.ts)

**Rename:** `blastRadiusTool` to `impactTool`

**Enhancements:**

1. Increase LIMIT clauses: 100 to 300 (depth 1), 100 to 200 (depth 2), 50 to 100 (depth 3)
2. Add affected processes section (query STEP_IN_PROCESS for all affected symbols)
3. Add affected clusters section (query MEMBER_OF for all affected symbols)
4. Add risk assessment summary
5. Surface confidence scores more prominently (group by confidence level)

**New output sections:**

```text
AFFECTED PROCESSES:
- LoginFlow - BROKEN at step 2
- SignupFlow - BROKEN at step 1

AFFECTED CLUSTERS:
- Authentication (direct)
- API Routes (indirect)

RISK: CRITICAL
- N direct callers
- N processes affected
- N clusters affected
```
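
One way the "AFFECTED PROCESSES" section could be derived is sketched below, assuming that "BROKEN at step N" means the earliest step of a process that lands on an affected symbol. That reading, the `StepRow` shape, and the `affectedProcesses` name are all assumptions. The rows stand in for the result of a STEP_IN_PROCESS query over the affected symbols.

```typescript
// Hypothetical sketch: types and names are illustrative, not GitNexus APIs.
interface StepRow {
  symbol: string;  // Symbol name
  process: string; // Process label
  step: number;    // Step index of the symbol within the process
}

// For each process touching an affected symbol, record the earliest such step.
function affectedProcesses(
  affected: Set<string>,
  rows: StepRow[],
): Map<string, number> {
  const firstBroken = new Map<string, number>();
  for (const row of rows) {
    if (!affected.has(row.symbol)) continue;
    const prev = firstBroken.get(row.process);
    if (prev === undefined || row.step < prev) {
      firstBroken.set(row.process, row.step);
    }
  }
  return firstBroken;
}
```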



### 6. Increase Process Detection Limits

**File:** [gitnexus/src/core/ingestion/process-processor.ts](gitnexus/src/core/ingestion/process-processor.ts)

Change default config (lines 27-32):

```typescript
const DEFAULT_CONFIG: ProcessDetectionConfig = {
maxTraceDepth: 10, // Keep
maxBranching: 4, // Was 3
maxProcesses: 75, // Was 50
minSteps: 2, // Keep
};
```



### 7. Update System Prompt

**File:** [gitnexus/src/core/llm/agent.ts](gitnexus/src/core/llm/agent.ts)

Update BASE_SYSTEM_PROMPT to reflect new tools:

```markdown
## TOOLS
- **search** - Hybrid search. Results grouped by process with cluster context.
- **grep** - Regex pattern search for exact strings.
- **read** - Read file content.
- **explore** - Deep dive on a symbol, cluster, or process. Shows membership, participation, connections.
- **overview** - Codebase map showing all clusters and processes.
- **impact** - Impact analysis. Shows affected processes, clusters, and risk level.
- **cypher** - Raw Cypher queries against the graph.

## GRAPH SCHEMA
Nodes: File, Folder, Function, Class, Interface, Method, Community, Process
Relations: CodeRelation with type: CONTAINS, DEFINES, IMPORTS, CALLS, EXTENDS, IMPLEMENTS, MEMBER_OF, STEP_IN_PROCESS
```



## Implementation Order

1. Remove highlight tool (cleanup)
2. Increase process detection limits
3. Add overview tool (simplest new tool)
4. Add explore tool
5. Enhance impact tool
18 changes: 18 additions & 0 deletions .sisyphus/drafts/gitnexus-brainstorming.md
@@ -0,0 +1,18 @@
# Draft: Gitnexus Brainstorming - Clustering & Process Maps

## Initial Context
- Project: **GitnexusV2**
- Structure:
  - `gitnexus/` (Likely the core application)
  - `gitnexus-mcp/` (Likely a Model Context Protocol server)
- Goal: Make it accurate and usable for smaller/dumber models.
- Current Focus: Implementing **Clustering** and **Process Maps**.

## Findings
- **Clustering**: Found `gitnexus/src/core/ingestion/cluster-enricher.ts`.
- **Process Maps**: No files matched `*process*map*` yet. Searching content next.

## Open Questions
- How is "process map" defined in this context? (Graph, mermaid diagram, flowchart?)
- What is the input for clustering? (Code chunks, files, commits?)
- What is the intended output for "smaller models"? (Simplified context, summaries?)
34 changes: 34 additions & 0 deletions .sisyphus/drafts/noodlbox-comparison.md
@@ -0,0 +1,34 @@
# Draft: Gitnexus vs Noodlbox Strategy

## Objectives
- Understand GitnexusV2 current state and goals.
- Analyze Noodlbox capabilities from provided URL.
- Compare features, architecture, and value proposition.
- Provide strategic views and recommendations.

## Research Findings
- [GitnexusV2]: Zero-server, browser-native (WASM), KuzuDB-based. Graph + Vector hybrid search.
- [Noodlbox]: CLI-first, heavy install. Has "Session Hooks" and "Search Hooks" via plugins/CLI.

## Comparison Points
- **Core Philosophy**: Both bet on "Knowledge Graph + MCP" as the future. Noodlbox validates Gitnexus's direction.
- **Architecture**:
  - *Noodlbox*: CLI/Binary based. Likely local server management.
  - *Gitnexus*: Zero-server, Browser-native (WASM). Lower friction, higher privacy.
- **Features**:
  - *Communities/Processes*: Both have them. Noodlbox uses them for "context injection". Gitnexus uses them for "visual exploration + query".
  - *Impact Analysis*: Noodlbox has polished workflows (e.g., `detect_impact staged`). Gitnexus has the engine (`blastRadius`) but maybe not the specific workflow wrappers yet.
- **UX/Integration**:
  - *Noodlbox*: "Hooks" (Session/Search) are a killer feature. Proactively injecting context into the agent's session.
  - *Gitnexus*: Powerful tools, but relies on agent *pulling* data?

## Strategic Views
1. **Validation**: The market direction is confirmed. You are building the right thing.
2. **Differentiation**: Lean into "Zero-Setup / Browser-Native". Noodlbox requires `noodl init` and CLI handling. Gitnexus could just *be*.
3. **Opportunity**: Steal the "Session/Search Hooks" pattern. Make the agent smarter *automatically* without the user asking "check impact".
4. **Workflow Polish**: Noodlbox's `/detect_impact staged` is a great specific use case. Gitnexus should wrap `blastRadius` into similar concrete workflows.

## Technical Feasibility (Interception)
- **Cursor**: Use `.cursorrules` to "shadow" default tools. Instruct agent to ALWAYS use `gitnexus_search` instead of `grep`.
- **Claude Code**: Likely uses a private plugin API for `PreToolUse`. We can't match this exactly without an official plugin, but we can approximate it with strong prompt instructions in `AGENTS.md`.
- **MCP Shadowing**: Define tools with names that conflict (e.g., `grep`)? No, unsafe. Better to use "Virtual Hooks" via system prompt instructions.
4 changes: 2 additions & 2 deletions gitnexus-mcp/package.json
@@ -1,6 +1,6 @@
 {
 "name": "gitnexus-mcp",
-"version": "0.1.1",
+"version": "0.2.0",
 "description": "MCP server for GitNexus code intelligence - connect Cursor, Claude, and other AI agents to your codebase",
 "author": "Abhigyan Patwari",
 "license": "MIT",
@@ -44,4 +44,4 @@
 "engines": {
 "node": ">=18.0.0"
 }
-}
+}
18 changes: 12 additions & 6 deletions gitnexus-mcp/src/mcp/server.ts
@@ -74,22 +74,28 @@ function formatContextAsMarkdown(context: CodebaseContext): string {
 // Usage hints
 lines.push('## 🛠️ Available Tools');
 lines.push('');
-lines.push('- **search**: Semantic search across the codebase');
-lines.push('- **cypher**: Execute Cypher queries on the knowledge graph');
-lines.push('- **blastRadius**: Analyze impact of changes to a node');
-lines.push('- **highlight**: Visualize nodes in the graph');
+lines.push('- **search**: Semantic + keyword search across codebase');
+lines.push('- **cypher**: Execute Cypher queries on knowledge graph');
+lines.push('- **grep**: Regex pattern search in files');
+lines.push('- **read**: Read file contents');
+lines.push('- **explore**: Deep dive on symbol, cluster, or process');
+lines.push('- **overview**: Codebase map (all clusters + processes)');
+lines.push('- **impact**: Analyze change impact (upstream/downstream)');
+lines.push('- **highlight**: Visualize nodes in graph');
 lines.push('');
 lines.push('## 📝 Graph Schema');
 lines.push('');
-lines.push('**Node Types**: File, Folder, Function, Class, Interface, Method');
+lines.push('**Node Types**: File, Folder, Function, Class, Interface, Method, Community, Process');
 lines.push('');
 lines.push('**Relation**: `CodeRelation` with `type` property:');
-lines.push('- CONTAINS, DEFINES, IMPORTS, CALLS, EXTENDS, IMPLEMENTS');
+lines.push('- CALLS, IMPORTS, EXTENDS, IMPLEMENTS, CONTAINS, DEFINES');
+lines.push('- MEMBER_OF (symbol → community), STEP_IN_PROCESS (symbol → process)');
 lines.push('');
 lines.push('**Example Cypher Queries**:');
 lines.push('```cypher');
 lines.push('MATCH (f:Function) RETURN f.name LIMIT 10');
 lines.push("MATCH (f:File)-[:CodeRelation {type: 'IMPORTS'}]->(g:File) RETURN f.name, g.name");
+lines.push("MATCH (s)-[:CodeRelation {type: 'MEMBER_OF'}]->(c:Community) RETURN c.label, count(s)");
 lines.push('```');

 return lines.join('\n');