
feat(ruby): Add Ruby language support for CLI and web #111

Merged
magyargergo merged 18 commits into abhigyanpatwari:main from candidosales:add-support-ruby-rails on Mar 13, 2026

Conversation

@candidosales

@candidosales candidosales commented Feb 28, 2026

Contributor

Summary

  • Adds Ruby as the 11th supported language across both CLI (worker + sequential fallback) and web ingestion pipelines
  • Full tree-sitter parsing with Ruby-specific query patterns for classes, modules, methods, calls, and heritage
  • Ruby-aware import resolution (require / require_relative) with proper path handling
  • Mixin heritage extraction (include / extend / prepend) routed as IMPLEMENTS edges
  • Property extraction from attr_accessor / attr_reader / attr_writer
  • Entry point scoring for Ruby conventions (service objects, background jobs, CLI executables)
  • Framework detection for Rake tasks, bin/exe scripts, and extensionless Ruby files
  • Ruby test file filtering (_spec.rb, _test.rb, /spec/) in MCP impact tool

What changed

Core parsing (both CLI + web):

  • tree-sitter-queries.ts — new Ruby query capturing classes, modules, methods, singleton methods, calls, and heritage
  • parsing-processor.ts — Ruby export detection (all symbols public by default)
  • utils.ts — file extension + extensionless filename mapping (Rakefile, Gemfile, etc.)
  • parser-loader.ts — Ruby WASM grammar registration
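
The extension plus extensionless-filename mapping mentioned for utils.ts could look roughly like the sketch below. The function name `isRubyFile` and the `.rake` / `.gemspec` extensions are assumptions for illustration; only `.rb`, Rakefile, and Gemfile are named in this PR.

```typescript
// Hypothetical sketch, not the PR's actual utils.ts code.
const RUBY_EXTENSIONS = new Set([".rb", ".rake", ".gemspec"]); // .rake/.gemspec are assumptions
const RUBY_EXTENSIONLESS = new Set(["Rakefile", "Gemfile"]);

function isRubyFile(filePath: string): boolean {
  // Work on the last path segment only.
  const base = filePath.split("/").pop() ?? filePath;
  if (RUBY_EXTENSIONLESS.has(base)) return true;

  const dot = base.lastIndexOf(".");
  // dot <= 0 means "no extension" or a dotfile; neither is mapped here.
  if (dot <= 0) return false;
  return RUBY_EXTENSIONS.has(base.slice(dot));
}
```

Checking the extensionless set first keeps files such as Rakefile from falling through the "no extension" branch.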

Call resolution (both CLI + web):

  • call-processor.ts — Ruby-specific routing: require/require_relative skipped (handled by import-processor), include/extend/prepend → IMPLEMENTS edges, attr_* → Property nodes
  • import-processor.ts — Ruby require/require_relative captured from @call nodes (Ruby grammar doesn't produce @import captures)

CLI worker thread:

  • parse-worker.ts — Full Ruby call routing in the worker path (heritage, properties, import-like calls)

Scoring & detection (both CLI + web):

  • entry-point-scoring.ts — Ruby entry point patterns: call, perform, execute
  • framework-detection.ts — Ruby bin/exe detection, Rake tasks

MCP layer:

  • local-backend.ts: isTestFilePath now filters _spec.rb, _test.rb, /spec/
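
A minimal sketch of the Ruby part of that filter, assuming plain suffix and substring checks. The name `isRubyTestPath` is hypothetical; the real isTestFilePath covers other languages too and may be implemented differently.

```typescript
// Hypothetical sketch of the Ruby test-path rules listed above.
function isRubyTestPath(filePath: string): boolean {
  // Normalize Windows separators so the "/spec/" check works everywhere.
  const p = filePath.replace(/\\/g, "/");
  return p.endsWith("_spec.rb") || p.endsWith("_test.rb") || p.includes("/spec/");
}
```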

Config:

  • supported-languages.ts: Ruby = 'ruby' added to enum
  • tree-sitter-ruby.wasm — WASM grammar binary for web

Design decisions

  1. Ruby calls routed in call-processor, not heritage-processor — Ruby's tree-sitter grammar captures include/extend/prepend as generic @call nodes, not @heritage. The routing lives in call-processor since it already iterates @call captures.

  2. All mixin types use trait-impl → IMPLEMENTS: include, extend, and prepend all add module methods to a class/module. None represent true class inheritance (EXTENDS), so all three map to IMPLEMENTS.

  3. Ruby imports in import-processor: require/require_relative are routed to import-processor (not just filtered as built-ins) so importMap gets populated before call resolution, enabling high-confidence (0.9) import-resolved call edges.

  4. Removed lib/ framework boost — Every .rb file under lib/ was getting a 1.5x entry point multiplier, which conflicts with isUtilityFile(). Removed to avoid scoring contradictions.

  5. Removed initialize and run from entry points: initialize is a constructor (called implicitly), not an entry point. run is too generic and creates false positives.
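
To illustrate decisions 4 and 5, here is a rough sketch of name-based entry point scoring. The function name, the `ENTRY_NAME_BOOST` value, and the multiplicative scheme are assumptions; only the method names call, perform, and execute come from this PR.

```typescript
// Hypothetical sketch, not the actual entry-point-scoring.ts logic.
const RUBY_ENTRY_POINT_NAMES = new Set(["call", "perform", "execute"]);
const ENTRY_NAME_BOOST = 1.5; // assumed value, for illustration only

function scoreRubyEntryPoint(methodName: string, baseScore: number): number {
  // initialize and run are deliberately absent from the set:
  // they get no boost and keep their base score.
  return RUBY_ENTRY_POINT_NAMES.has(methodName)
    ? baseScore * ENTRY_NAME_BOOST
    : baseScore;
}
```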

Test plan

  • Run npx gitnexus analyze on a Ruby/Rails codebase and verify:
    • Folders and files
    • Classes, modules, and methods appear as nodes
    • require/require_relative create IMPORTS edges
    • include/extend/prepend create IMPLEMENTS edges
    • attr_* create Property nodes
    • Method calls create CALLS edges with appropriate confidence
  • Verify web version produces identical graph structure
  • Verify impact({ includeTests: false }) filters Ruby spec files
  • Verify entry point scoring ranks call/perform/execute methods higher

Demo

  • 6,724 edges
  • 4,331 nodes
  • 709 clusters

Tested with Fizzy

Screenshot 2026-03-13 at 12 07 06 PM

How to test it?

# Build
cd gitnexus && npm install && npm run build

# Analyze the codebase
node gitnexus/dist/cli/index.js analyze --force ../fizzy

# Start the server
npx gitnexus serve

# Open the Web UI
cd gitnexus-web && npm install && npm run dev

Syntax Highlight at Code inspector

Screenshot 2026-03-13 at 12 05 34 PM

candidosales and others added 2 commits February 27, 2026 17:53
Add Ruby as the 11th supported language in GitNexus, enabling code
intelligence for Ruby codebases. This includes:

- Tree-sitter parsing for classes, modules, methods, and singleton methods
- Import extraction for require/require_relative via call post-processing
- Mixin heritage detection for include/extend/prepend
- attr_accessor/attr_reader/attr_writer property extraction
- Ruby-specific built-in filtering (Kernel methods + enumerables)
- Framework detection for lib/, bin/, exe/, and Rake patterns
- Entry point scoring for service objects, jobs, and CLI commands
- Test file detection for _spec.rb, _test.rb, and spec/ directories
- WASM grammar for web version

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@vercel

vercel Bot commented Feb 28, 2026


@candidosales is attempting to deploy a commit to the NexusCore Team on Vercel.

A member of the Team first needs to authorize it.

@candidosales candidosales changed the title from "Add support ruby rails" to "feat(ruby): Add Ruby language support for CLI and web" Feb 28, 2026
@candidosales candidosales marked this pull request as ready for review February 28, 2026 02:35
@candidosales

Contributor Author

@abhigyanpatwari when you get a chance, could you review it? 🙇

@abhigyanpatwari

Owner

Sure, will look into it. Would you join the Discord? https://discord.gg/f3DAmGZb It's getting difficult to manage stuff, so I created a Discord.

…ed modules

Conflicts resolved by taking upstream refactored shared module approach
(extractFunctionName, isBuiltInOrNoise, isNodeExported from utils.js and
export-detection.js) while preserving Ruby-specific call handling (require,
include/extend/prepend, attr_accessor) and adding Ruby to the language matrix.
@magyargergo

Collaborator

@claude review this PR and recommend to him the best way to migrate his Ruby implementation into our current architecture.

@claude

claude Bot commented Mar 13, 2026


Claude finished @magyargergo's task in 3m 2s


Ruby Implementation Migration Analysis

Tasks:

  • Gather context on the PR
  • Read current architecture files
  • Analyze PR changes vs current codebase patterns
  • Provide migration recommendations

Review Summary

This is a solid Ruby implementation that works end-to-end, but it has 4 architectural friction points that need to be resolved before merging. The core issue is that Ruby's unconventional grammar (everything-is-a-call) is handled with ad-hoc conditionals scattered across 3 files rather than following the patterns other languages use.


Issue 1: Critical — Triplicated Ruby Routing Logic

The require/include/attr_* routing logic is duplicated in three separate files with slightly different implementations:

| File | Path |
| --- | --- |
| CLI sequential fallback | gitnexus/src/core/ingestion/call-processor.ts |
| CLI worker thread | gitnexus/src/core/ingestion/workers/parse-worker.ts |
| Web | gitnexus-web/src/core/ingestion/call-processor.ts |

Each copy is subtly different, and both the sequential call-processor.ts and the worker path omit callForm/argCount for regular Ruby calls (they should carry the full metadata other languages extract). Any future bug fix or Ruby grammar change requires three separate patches.

Recommended fix: Extract the routing logic into a shared helper in utils.ts:

// gitnexus/src/core/ingestion/utils.ts
export type RubyCallRouting =
  | { kind: 'import'; importPath: string }
  | { kind: 'heritage'; enclosingClass: string; mixinName: string }
  | { kind: 'property'; propName: string; startLine: number; endLine: number }
  | { kind: 'call' }
  | { kind: 'skip' };

export function routeRubyCall(calledName: string, callNode: any): RubyCallRouting { ... }

Then import and call this helper from all three locations.
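
For illustration only, the body of such a helper might look like the sketch below. It deviates from the signature above in two labeled ways: it takes a simplified `CallNodeLike` stand-in (argument texts and enclosing class assumed to be pre-extracted) instead of a tree-sitter node, and it returns a list, because one attr_* call can name several symbols. All names other than `RubyCallRouting` are hypothetical.

```typescript
// Hypothetical stand-in for the tree-sitter call node; fields are assumed pre-extracted.
interface CallNodeLike {
  args: string[];                // raw argument texts, e.g. ["'json'"] or [":name", ":age"]
  enclosingClass: string | null; // nearest class/module name, or null at top level
  startLine: number;
  endLine: number;
}

type RubyCallRouting =
  | { kind: "import"; importPath: string }
  | { kind: "heritage"; enclosingClass: string; mixinName: string }
  | { kind: "property"; propName: string; startLine: number; endLine: number }
  | { kind: "call" }
  | { kind: "skip" };

const IMPORT_CALLS = new Set(["require", "require_relative"]);
const MIXIN_CALLS = new Set(["include", "extend", "prepend"]);
const ATTR_CALLS = new Set(["attr_accessor", "attr_reader", "attr_writer"]);

function routeRubyCallSketch(calledName: string, node: CallNodeLike): RubyCallRouting[] {
  if (IMPORT_CALLS.has(calledName)) {
    const first = node.args[0];
    // Only string-literal arguments can be resolved statically.
    if (first === undefined || !/^['"].*['"]$/.test(first)) return [{ kind: "skip" }];
    return [{ kind: "import", importPath: first.slice(1, -1) }];
  }

  if (MIXIN_CALLS.has(calledName)) {
    const cls = node.enclosingClass;
    // A mixin call outside any class/module has nothing to attach to.
    if (cls === null || node.args.length === 0) return [{ kind: "skip" }];
    return node.args.map(
      (mixinName): RubyCallRouting => ({ kind: "heritage", enclosingClass: cls, mixinName }),
    );
  }

  if (ATTR_CALLS.has(calledName)) {
    // One routing entry per symbol argument (":name" -> "name").
    return node.args
      .filter((a) => a.startsWith(":"))
      .map(
        (a): RubyCallRouting => ({
          kind: "property",
          propName: a.slice(1),
          startLine: node.startLine,
          endLine: node.endLine,
        }),
      );
  }

  // Everything else is an ordinary call for the regular resolution path.
  return [{ kind: "call" }];
}
```

Each of the three call sites would then switch on `kind` and keep only its own side effects (pushing to imports, heritage, properties, or calls).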



Issue 2: Critical — Sequential Path Bypasses heritage-processor.ts

In the sequential fallback, call-processor.ts adds IMPLEMENTS edges directly to the graph for include/extend/prepend:

// gitnexus/src/core/ingestion/call-processor.ts:159-182
graph.addRelationship({
  id: relId, sourceId: childId, targetId: parentId,
  type: 'IMPLEMENTS', confidence: 1.0, reason: 'trait-impl',
});

But the worker path correctly serializes to result.heritage and lets processHeritageFromExtracted run later. The two paths produce different symbol resolution quality because heritage-processor.ts uses resolveExtendsType() (which checks symbol table types, understands import scope, and has language-aware fallbacks), while call-processor.ts uses a bare symbolTable.lookupFuzzy(mixinName)[0]?.nodeId with a hard-coded generateId('Module', ...) fallback.

Recommended fix: In the sequential processCalls function, instead of writing to the graph directly, push to a local array and call processHeritageFromExtracted at the end (or have call-processor.ts return extracted heritage items):

// Instead of graph.addRelationship() directly:
extractedHeritage.push({
  filePath: file.path,
  className: enclosingClass,
  parentName: mixinName,
  kind: 'trait-impl',
});
// ...after the main loop:
await processHeritageFromExtracted(graph, extractedHeritage, symbolTable, importMap);



Issue 3: High — Sequential Path Creates Property Nodes Inside processCalls

In call-processor.ts:189-210, attr_accessor/attr_reader/attr_writer creates Property nodes by calling graph.addNode() directly inside the call processor. This is wrong for two reasons:

  1. Wrong phase — Node creation should happen during parsing, not call resolution. The symbol table gets updated during call processing which means those symbols are invisible to prior resolution steps.
  2. Missing description field — The worker path includes description: calledName (which accessor type generated it), but the sequential path omits it.

Recommended fix: Either move attr_* property extraction into parsing-processor.ts (run it as a post-parse pass over Ruby files), or at the very least, make the sequential path match the worker path by including the description field.


Issue 4: Medium — Regular Ruby Calls Missing Metadata

In parse-worker.ts:959-963, regular Ruby calls omit metadata that all other languages include:

// Current (Ruby):
result.calls.push({ filePath: file.path, calledName, sourceId });

// Should match other languages:
result.calls.push({
  filePath: file.path, calledName, sourceId,
  argCount: countCallArguments(callNode),
  callForm, receiverName, receiverTypeName,
});

Without callForm and argCount, Ruby call resolution uses weaker heuristics and produces more false positives (no arity filtering, no member vs. free-call disambiguation).
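
As a rough illustration of why argCount matters, the sketch below filters candidate definitions by arity. The `Candidate` shape, the function name, and the "fall back to all candidates when nothing matches" rule are assumptions, not the resolver's actual behavior.

```typescript
// Hypothetical candidate shape; the real symbol table entries may differ.
interface Candidate {
  filePath: string;
  name: string;
  paramCount: number;
}

function filterByArity(candidates: Candidate[], argCount: number | undefined): Candidate[] {
  // Without argCount (the current Ruby situation) no filtering is possible.
  if (argCount === undefined) return candidates;
  const matching = candidates.filter((c) => c.paramCount === argCount);
  // Assumed fallback: keep everything rather than drop the call
  // (optional and splat parameters make exact matching unreliable).
  return matching.length > 0 ? matching : candidates;
}
```

With two `write_audit` definitions taking one and zero parameters, a call with one argument narrows to a single candidate, while a call record without argCount stays ambiguous.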



Minor Issues

Web uses string literal instead of enum (gitnexus-web/src/core/ingestion/call-processor.ts:203): Uses language === 'ruby' rather than language === SupportedLanguages.Ruby. This compiles fine today but will break silently if the enum value ever changes.

type-extractors/index.ts:33-37 — The Ruby stub is a valid placeholder, but since Ruby uses duck typing it could eventually benefit from simple instance variable type hints (e.g., @user: User). Worth a TODO comment.

The @heritage capture in RUBY_QUERIES is only processed in parse-worker when captureMap['heritage.class'] exists, which it will for class < SuperClass, so this path is fine and is correctly handled by the existing heritage processor.


Checklist for Migration

    • Extract routeRubyCall() helper into utils.ts (fixes Issue 1)
    • Fix sequential processCalls to route mixin heritage through processHeritageFromExtracted (fixes Issue 2)
    • Move attr_* property extraction to parsing phase or ensure consistent fields with worker path (fixes Issue 3)
    • Add callForm/argCount/receiverName/receiverTypeName to regular Ruby calls in parse-worker (fixes Issue 4)
    • Replace 'ruby' string literal with SupportedLanguages.Ruby in web call-processor

The grammar queries, import resolution, framework detection, entry-point scoring, and MCP test file filtering are all well-implemented and follow existing patterns closely. The foundation is solid — these are structural consistency issues rather than correctness bugs.

@github-actions

github-actions Bot commented Mar 13, 2026

Contributor

CI Report

All checks passed

Pipeline Status

| Stage | Status | Details |
| --- | --- | --- |
| ✅ Typecheck | success | tsc --noEmit |
| ✅ Unit Tests | success | 3 platforms |
| ✅ Integration | success | 3 OS x 4 groups = 12 jobs |

Test Results

1609 passed · 508 suites · 1641 total · ⏱️ 35s · 📊 1151 unit + 490 integration

Code Coverage

Combined (Unit + Integration)

| Metric | Coverage | Covered | Threshold | Status |
| --- | --- | --- | --- | --- |
| Statements | 47.58% | 3728/7834 | 26% | 🟢 █████████░░░░░░░░░░░ |
| Branches | 39.25% | 2141/5454 | 23% | 🟢 ███████░░░░░░░░░░░░░ |
| Functions | 50.24% | 414/824 | 28% | 🟢 ██████████░░░░░░░░░░ |
| Lines | 48.98% | 3456/7055 | 27% | 🟢 █████████░░░░░░░░░░░ |

Coverage breakdown by test suite

Unit Tests

| Metric | Coverage | Covered | Threshold | Status |
| --- | --- | --- | --- | --- |
| Statements | 36.52% | 2861/7834 | 26% | 🟢 ███████░░░░░░░░░░░░░ |
| Branches | 30.8% | 1680/5454 | 23% | 🟢 ██████░░░░░░░░░░░░░░ |
| Functions | 38.47% | 317/824 | 28% | 🟢 ███████░░░░░░░░░░░░░ |
| Lines | 37.87% | 2672/7055 | 27% | 🟢 ███████░░░░░░░░░░░░░ |

Integration Tests

| Metric | Coverage | Covered | Threshold | Status |
| --- | --- | --- | --- | --- |
| Statements | 24.45% | 1916/7834 | 26% | 🔴 ████░░░░░░░░░░░░░░░░ |
| Branches | 19.16% | 1045/5454 | 23% | 🔴 ███░░░░░░░░░░░░░░░░░ |
| Functions | 26.57% | 219/824 | 28% | 🔴 █████░░░░░░░░░░░░░░░ |
| Lines | 25.11% | 1772/7055 | 27% | 🔴 █████░░░░░░░░░░░░░░░ |

Coverage thresholds are auto-ratcheted — they only go up

Vitest thresholds.autoUpdate bumps the floor whenever local coverage exceeds it.
CI enforces the current thresholds; developers commit the ratcheted values.
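
For reference, a Vitest configuration using this ratchet could look like the fragment below. The threshold numbers are the ones shown in the tables above; the file layout and the absence of other options are assumptions, since the repository's actual config is not shown here.

```typescript
// vitest.config.ts (illustrative fragment, not the repository's actual file)
import { defineConfig } from "vitest/config";

export default defineConfig({
  test: {
    coverage: {
      thresholds: {
        statements: 26,
        branches: 23,
        functions: 28,
        lines: 27,
        // Rewrites the numbers above in this file when a local
        // run exceeds them, so the floor only moves up.
        autoUpdate: true,
      },
    },
  },
});
```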



Repository owner deleted a comment from claude Bot Mar 13, 2026
@magyargergo

Collaborator

@claude can you please design the ingestion integration tests similarly to other languages like C/C++? We need as much coverage as we can get for Ruby.

@claude

claude Bot commented Mar 13, 2026

Copy link
Copy Markdown

Claude finished @magyargergo's task in 8m 6s


Ruby Integration Tests Design

Tasks:

  • Gather context on the request
  • Read existing C/C++ integration tests to understand patterns
  • Read existing Ruby implementation files
  • Read test infrastructure and fixtures
  • Create Ruby fixture files (8 scenarios, 23 files)
  • Implement gitnexus/test/integration/resolvers/ruby.test.ts
  • Commit and push changes

What was added

gitnexus/test/integration/resolvers/ruby.test.ts — 8 describe suites with 27 it assertions, modelled exactly after the C/C++ and Python resolver tests.

| Scenario | Fixture dir | What it tests |
| --- | --- | --- |
| ruby-pkg | 5 files across lib/, services/, utils/ | Class detection, Module detection, Method detection, EXTENDS (class < Super), IMPLEMENTS (include), IMPORTS (4 require_relative edges) |
| ruby-ambiguous | 2 Handler classes in a/ and b/ | Import disambiguates EXTENDS: resolves to handlers/a/handler.rb, not handlers/b/ |
| ruby-calls | one.rb (1 param), zero.rb (0 params), service.rb | Arity-filtered call resolution: write_audit("hello") → one.rb with reason: 'import-resolved' |
| ruby-member-calls | user.rb + app.rb | user.save member call resolves via import, HAS_METHOD edge verified |
| ruby-mixin-heritage | auditable.rb, cacheable.rb, hookable.rb, record.rb | All three mixin forms (include/extend/prepend) → 3 IMPLEMENTS edges; zero EXTENDS edges |
| ruby-attr-properties | models/person.rb | attr_accessor :name, :age + attr_reader :id + attr_writer :email → 4 Property nodes with correct description (accessorType) |
| ruby-receiver-resolution | user.rb, repo.rb, app.rb | Both User/Repo classes with save; HAS_METHOD for each; at least 1 resolved CALLS edge |
| ruby-local-shadow | src/main.rb, src/utils.rb | run → save resolves to same-file main.rb, not the require'd utils.rb |

Design notes mirroring C/C++

  • Same fixture structure as cpp-diamond, cpp-ambiguous, cpp-calls, cpp-member-calls, etc.
  • Ambiguity via suffix matching: require_relative '../handlers/a/handler' resolves because the suffix resolver tries progressively shorter path segments, reaching handlers/a/handler.rb which uniquely identifies the correct file.
  • Conservative CALLS assertions for receiver resolution — Ruby has no type annotations, so the receiver-resolution suite verifies HAS_METHOD correctness and "at least 1 resolved call" rather than pinning both to specific files (unlike Python/C++ which use explicit type hints).
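
The suffix matching described in the notes above can be sketched as follows. This is a simplified illustration under stated assumptions: the function name, the `.rb` default, and the "exactly one match wins" rule are hypothetical and not taken from the actual import resolver.

```typescript
// Hypothetical sketch of suffix-based import resolution.
function resolveBySuffix(importPath: string, knownFiles: string[], ext = ".rb"): string | null {
  // "." and ".." carry no information once matching is done by suffix.
  const segments = importPath.split("/").filter((s) => s !== "" && s !== "." && s !== "..");

  // Try the longest suffix first, then progressively shorter ones.
  for (let start = 0; start < segments.length; start++) {
    const suffix = segments.slice(start).join("/") + ext;
    const matches = knownFiles.filter((f) => f === suffix || f.endsWith("/" + suffix));
    // Accept only an unambiguous match.
    if (matches.length === 1) return matches[0] ?? null;
  }
  return null;
}
```

With both handlers/a/handler.rb and handlers/b/handler.rb indexed, the three-segment suffix matches exactly one file, whereas a bare `handler` stays ambiguous and resolves to nothing.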

@candidosales

Contributor Author

@magyargergo it's ready 👍

@magyargergo magyargergo merged commit 0999595 into abhigyanpatwari:main Mar 13, 2026
20 of 21 checks passed
@magyargergo

magyargergo commented Mar 14, 2026

Collaborator

Hi @candidosales,

While working on my PR #274 I merged #111 into master and then into my branch, which gave me a chance to test the Ruby graph building end-to-end. I found a few issues worth flagging:

1. Ruby methods not recognised as function scopes (fixed in #274)

FUNCTION_NODE_TYPES in utils.ts was missing Ruby's method and singleton_method node types, and extractFunctionName had no handler for them. This meant all Ruby CALLS edges were being attributed to the file rather than the method they belong to. I've added both and it's working now.
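
A simplified sketch of the effect, assuming an enclosing-scope lookup that walks up the syntax tree. `NodeLike` and `findEnclosingFunction` are hypothetical stand-ins; only the set name and Ruby's `method` / `singleton_method` node types come from the description above.

```typescript
// Hypothetical minimal node shape; the real code works on tree-sitter nodes.
interface NodeLike {
  type: string;
  name?: string;
  parent: NodeLike | null;
}

// Only Ruby's entries are shown; the real set covers every supported language.
const FUNCTION_NODE_TYPES = new Set(["method", "singleton_method"]);

function findEnclosingFunction(node: NodeLike): string | null {
  for (let cur = node.parent; cur !== null; cur = cur.parent) {
    if (FUNCTION_NODE_TYPES.has(cur.type)) return cur.name ?? null;
  }
  // No function scope found: the caller attributes the call to the file.
  return null;
}
```

If `method` is missing from the set, the loop never matches inside a Ruby method body and every call falls through to the file-level case.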

2. Bare Ruby method calls are not captured (tree-sitter limitation)

This is the main finding. In Ruby it's idiomatic to call methods without a receiver or parentheses, e.g. persist instead of self.persist(). Tree-sitter-ruby parses these as plain identifier nodes, not call nodes. The current query (call method: (identifier) @call.name) only matches call nodes, so bare calls are completely invisible to the ingestion pipeline. Calls with an explicit receiver like user.persist work fine. I've documented this with 3 it.todo tests explaining the limitation.

3. Ruby receiver extraction not wired up (minor)

MEMBER_ACCESS_NODE_TYPES doesn't include Ruby's call node, so even explicit-receiver calls like user.persist get classified as "free" calls rather than "member" calls. The tests still pass because resolution falls back to name-based matching, but type-based receiver resolution won't work for Ruby until this is added.
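
A rough sketch of the classification involved, under the assumption that call form is decided by node type plus the presence of a receiver. `CallNodeLike` and `classifyCallForm` are hypothetical; the actual logic around MEMBER_ACCESS_NODE_TYPES may differ.

```typescript
type CallForm = "free" | "member";

// Hypothetical minimal shape; receiver is null for calls like persist().
interface CallNodeLike {
  type: string;
  receiver: string | null;
}

function classifyCallForm(node: CallNodeLike, memberAccessTypes: Set<string>): CallForm {
  // A call is a member call only if its node type is registered
  // as a member-access type and it actually has a receiver.
  return memberAccessTypes.has(node.type) && node.receiver !== null ? "member" : "free";
}
```

Under this model, user.persist is classified as "free" until Ruby's `call` node type is added to the set.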

Current state: 10 Ruby resolver integration tests passing, 3 marked as todo due to the bare call limitation. I also moved all Ruby-specific call routing logic into ruby-call-routing.ts to keep the processors language-agnostic.

Happy to discuss any of this further!
