feat(review): add PR reviewer swarm agents #1851

Merged
magyargergo merged 6 commits into main from feat/gitnexus-reviewer-swarm
May 29, 2026
Conversation

@magyargergo

Collaborator

Summary

Adds a coordinated, read-only Claude Code reviewer swarm for GitNexus pull requests. Seven specialized subagents — each focused on a single review domain — are orchestrated by a new /gitnexus-pr-swarm-review skill that produces structured, evidence-grounded production-readiness reviews.

The swarm coexists with the existing single-agent /gitnexus-pr-review skill as a deeper alternative.

What's included

7 project-level subagents (.claude/agents/):

| Agent | Focus |
| --- | --- |
| gitnexus-pr-facts-historian | PR identity, GitHub state, changed files, commits, linked issues, repo history, visibility gaps |
| gitnexus-branch-hygiene-reviewer | Merge state and branch hygiene classification using exact enumerated values |
| gitnexus-risk-architect | Production failure modes via risk-model-first reasoning |
| gitnexus-test-ci-verifier | Test coverage, CI wiring, validation gaps |
| gitnexus-security-boundary-reviewer | Auth, secrets, injection, hidden Unicode/bidi controls, trust boundaries |
| gitnexus-docs-dod-reviewer | PR-specific Definition of Done from repo guidance docs |
| gitnexus-synthesis-critic | Final review validation for evidence grounding and verdict-rule compliance |

1 orchestration skill (.claude/skills/gitnexus-pr-swarm-review/SKILL.md):

  • Invoked as /gitnexus-pr-swarm-review <PR number or URL>
  • Dispatches agents in dependency order (facts first, then parallel domain review, synthesis last)
  • Enforces exact enum classifications for merge state, branch hygiene, and final verdict
  • Requires evidence citations on all findings

1 documentation file (.claude/README-gitnexus-reviewer-swarm.md)

Design decisions

  • All agents are read-only — tools limited to Read, Grep, Glob, Bash. No agent can edit files.
  • Missing visibility → verification work — when GitHub state can't be determined, agents must convert gaps into mandatory verification points rather than inventing facts.
  • Exact enum verdicts — merge state (9 values), branch hygiene (4 values), and final verdict (4 values) use closed sets to prevent ambiguous or creative classifications.
  • No hooks — manually invoked only. Automatic triggering is deferred to a follow-up PR.
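
The closed-set idea can be illustrated with a small Python sketch. The values below are only the ones quoted elsewhere in this thread; the full sets (9, 4, and 4 values) are defined in the skill file and are not reproduced here, so each set is deliberately incomplete. `classify` is a hypothetical helper name, not something the PR ships.

```python
# Illustrative sketch only: the real enum sets are defined in SKILL.md.
# These sets hold just the values quoted in this PR thread, so they are incomplete.
MERGE_STATE = frozenset({"visibility incomplete"})
BRANCH_HYGIENE = frozenset({
    "merge-from-main commit present but harmless and merge-safe",
})
FINAL_VERDICT = frozenset({
    "not production-ready",
    "production-ready with minor follow-ups",
})


def classify(value: str, allowed: frozenset[str], field: str) -> str:
    """Return value unchanged if it is in the closed set, otherwise raise."""
    if value not in allowed:
        raise ValueError(f"{field}: {value!r} is not one of {sorted(allowed)}")
    return value
```

Rejecting anything outside the set, rather than mapping it to the nearest value, is what keeps a creative classification from slipping through.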

.gitignore changes

Changed .claude/agents/ and .claude/skills/ from directory-level ignores to content-level ignores (* instead of /) so negation patterns can un-ignore specific files. Added negation patterns for gitnexus-*.md agents and the gitnexus-pr-swarm-review/ skill.

Test plan

  • All 7 agent files have valid YAML frontmatter (--- delimiters, name, description, tools, model, maxTurns)
  • No agent contains Edit, Write, or NotebookEdit tools
  • Skill file has valid frontmatter
  • All files are visible to git (not gitignored)
  • Existing .claude/skills/gitnexus/ skills are not affected by .gitignore changes
  • Invoke /gitnexus-pr-swarm-review on a test PR to validate end-to-end flow
  • Restart Claude Code and verify agents appear in the available agent list
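
The first two checks in the test plan could be automated along these lines. This is a minimal stdlib-only sketch that assumes flat `key: value` frontmatter with a comma-separated `tools` line; a real check would more likely use a YAML parser. The sample agent content and the function names are hypothetical.

```python
REQUIRED_KEYS = {"name", "description", "tools", "model", "maxTurns"}
MUTATING_TOOLS = {"Edit", "Write", "NotebookEdit"}


def parse_frontmatter(text: str) -> dict[str, str]:
    """Parse flat `key: value` frontmatter between two `---` lines."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("missing opening --- delimiter")
    try:
        end = next(i for i, line in enumerate(lines[1:], start=1) if line.strip() == "---")
    except StopIteration:
        raise ValueError("missing closing --- delimiter") from None
    fields = {}
    for line in lines[1:end]:
        if ":" in line:
            key, _, value = line.partition(":")
            fields[key.strip()] = value.strip()
    return fields


def check_agent(text: str) -> list[str]:
    """Return a list of problems; an empty list means the checks passed."""
    fields = parse_frontmatter(text)
    problems = [f"missing key: {k}" for k in sorted(REQUIRED_KEYS - fields.keys())]
    tools = {t.strip() for t in fields.get("tools", "").split(",") if t.strip()}
    problems += [f"mutating tool declared: {t}" for t in sorted(tools & MUTATING_TOOLS)]
    return problems


# Hypothetical agent file content, for illustration only.
SAMPLE = """---
name: example-reviewer
description: Example read-only reviewer
tools: Read, Grep, Glob, Bash
model: claude-sonnet-4-6
maxTurns: 30
---
Body text.
"""
print(check_agent(SAMPLE))  # prints []
```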

🤖 Generated with Claude Code

Seven read-only subagents coordinated by an orchestration skill for
structured, evidence-grounded production-readiness PR reviews.

Agents: facts-historian, branch-hygiene, risk-architect, test-ci-verifier,
security-boundary, docs-dod, synthesis-critic. All use Read/Grep/Glob/Bash
only — no edit tools.

Skill invoked as /gitnexus-pr-swarm-review <PR>.
@vercel

vercel Bot commented May 27, 2026


The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| gitnexus | Ready | Preview, Comment | May 29, 2026 2:49pm |


@github-actions

github-actions Bot commented May 27, 2026

Contributor

CI Report

All checks passed

Pipeline Status

| Stage | Status | Details |
| --- | --- | --- |
| ✅ Typecheck | success | tsc --noEmit |
| ✅ Tests | success | unit tests, 3 platforms |
| ✅ E2E | success | gitnexus-web changes only |

Test Results

| Tests | Passed | Failed | Skipped | Duration |
| --- | --- | --- | --- | --- |
| 10080 | 10075 | 0 | 5 | 637s |

✅ All 10075 tests passed

5 test(s) skipped:
  • COBOL pipeline benchmark > scales with file count
  • PHP pipeline benchmark > scales with file count (workers enabled)
  • Ruby pipeline benchmark > scales with file count (workers enabled)
  • Rust pipeline benchmark > scales with file count (workers enabled)
  • buildTypeEnv > known limitations (documented skip tests) > Ruby block parameter: users.each { |user| } — closure param inference, different feature

Code Coverage

Tests

| Metric | Coverage | Covered | Base | Delta | Status |
| --- | --- | --- | --- | --- | --- |
| Statements | 79.86% | 35479/44425 | 79.48% | 📈 +0.4 | 🟢 ███████████████░░░░░ |
| Branches | 68.44% | 22629/33063 | 68% | 📈 +0.4 | 🟢 █████████████░░░░░░░ |
| Functions | 84.87% | 3659/4311 | 84.56% | 📈 +0.3 | 🟢 ████████████████░░░░ |
| Lines | 83.42% | 31966/38319 | 82.97% | 📈 +0.5 | 🟢 ████████████████░░░░ |

📋 View full run · Generated by CI

@github-actions

github-actions Bot commented May 27, 2026

Contributor

Claude finished @magyargergo's task in 8m 24s


Production-Readiness Review: PR #1851, feat(review): add PR reviewer swarm agents

Reviewer personas: Senior GitNexus maintainer · Context engineer · LLM expert

Progress Checklist

  • Post initial comment
  • Verify PR facts (git log, branch hygiene, merge commit analysis)
  • Read required repo docs (DoD.md, AGENTS.md, GUARDRAILS.md — DoD.md read in full)
  • Inspect all 10 changed files completely
  • Inspect related context surfaces (existing skills, ai-context.ts write paths)
  • Run .gitignore validation (git ls-files — all new files confirmed tracked)
  • Parse YAML frontmatter from all agent/skill files
  • Run hidden Unicode/bidi scan
  • Compare with existing /gitnexus-pr-review skill
  • Search generate/overwrite paths (installSkills, generateAIContextFiles)
  • Build risk model and write findings
  • Write final verdict

Review bar for this PR

Applied DoD.md §2 (Correctness, Security, Tests, Scope, Agent-Assisted Guardrails §3), AGENTS.md Claude Code conventions, and the PR's own stated DoD. This PR touches only .claude/agents/, .claude/skills/gitnexus-pr-swarm-review/, .claude/README-gitnexus-reviewer-swarm.md, and .gitignore. No TypeScript/runtime/UI code is touched; the DoD's validation baseline items for gitnexus/, gitnexus-web/, and gitnexus-shared/ do not apply. The relevant bar is: valid discoverable agent/skill files, correct .gitignore tracking, no safety policy gaps, no hidden Unicode, CI on changed files, and correct Claude Code frontmatter.


Problem being solved

Adds a coordinated, read-only Claude Code reviewer swarm as a deeper alternative to the existing /gitnexus-pr-review single-agent skill. Seven specialized subagents cover discrete review domains (facts, branch hygiene, risk, test/CI, security, docs/DoD, synthesis critique), orchestrated by a new /gitnexus-pr-swarm-review slash skill.


Current PR state

Open, not draft. 2 visible commits (1 PR commit + 1 merge-from-main). CI bot reports all 9984 tests passed with coverage above base. Vercel preview ready. No reviewers or labels assigned.


Merge status and mergeability

Classification: visibility incomplete

gh pr view with --json mergeable,mergeStateStatus could not be run (requires approval). From visible evidence: CI bot reported all checks passed, Vercel preview deployed, no conflicts visible. The authoritative mergeability status was not verified via API.


Repository history considered


Branch hygiene assessment

Classification: merge-from-main commit present but harmless and merge-safe

Evidence from git show --stat:

  • Commit 4067f8b: adds 10 files — all 7 agent .md files, 1 skill .md, 1 README, 1 .gitignore mod. Scope exactly matches PR description.
  • Commit 186ed18 (HEAD, merge from main): touches only gitnexus/src/core/group/extractors/http-patterns/ (Kotlin extractor from PR feat(group): add Kotlin Spring HTTP route extraction (named + positional) #1849). This is a routine sync of main into the feature branch — not pollution. The merge commit only modifies the Kotlin HTTP extractor (unrelated to the swarm), leaving all .claude/ files and .gitignore from commit 4067f8b intact.
  • No unrelated churn, no reformatting, no workflow files, no release/Docker changes.

Understanding of the change

The PR adds 7 read-only Claude Code project-level subagents under .claude/agents/ and one orchestration skill under .claude/skills/gitnexus-pr-swarm-review/SKILL.md. The agents delegate distinct review domains; the coordinator dispatches them in dependency order (facts → hygiene → parallel domain lanes → synthesis critique). The .gitignore is updated from directory-level ignores (.claude/agents/, .claude/skills/) to content-level ignores (.claude/agents/*, .claude/skills/*) so that negation patterns can un-ignore specific committed files.


Findings

Finding 1 — model: sonnet is not a valid Claude Code model ID

Confirmed.

  • Risk: All 7 agent files declare model: sonnet (line 9 in each agent). sonnet is not a recognized Claude Code model ID. Valid IDs are claude-sonnet-4-6, claude-opus-4-6, etc. (per CLAUDE.md and the Anthropic model table). If Claude Code rejects unrecognized values, it may silently fall back to the default model, in which case all 7 agents would run on the coordinator's model instead of the declared one. Worse, future Claude Code versions may raise an error on invalid model values instead of ignoring them.
  • Evidence: .claude/agents/*.md line 9 in every file: model: sonnet. CLAUDE.md documents claude-sonnet-4-6 as the correct model ID.
  • Recommended fix: Change model: sonnet to model: claude-sonnet-4-6 in all 7 agent files.
  • Blocks merge: yes — semantically invalid frontmatter field value; risk of silent model downgrade or future load failure.

Finding 2 — Bash read-only enforcement is prose-only, no explicit prohibited-command list

Confirmed.

  • Risk: Every agent lists Bash in its tools array and says "Do not edit files. You are read-only." in its Rules section. But Bash is not inherently read-only — an LLM running these agents could invoke gh pr comment (post to GitHub), git commit, npm install, write files via shell redirection, or run git checkout -- <file>. No agent contains an explicit prohibited-command list. The security-boundary agent's own instructions say to run git grep and git diff commands, but none of the 7 agents say "do NOT run gh pr comment, sed -i, git commit, or any command that writes files or posts to GitHub."
  • Evidence: .claude/agents/gitnexus-pr-facts-historian.md lines 43–51: gh pr view, gh pr diff, gh issue view — these are read-only. But no agent says gh pr comment, git commit, and file writes are prohibited. .claude/agents/gitnexus-security-boundary-reviewer.md lines 42–53 instructs running three specific git commands but adds no prohibition.
  • Recommended fix: Add a Bash policy block to every agent's Rules section:
    **Bash is read-only.** Permitted: `git log`, `git diff`, `git show`, `git grep`, `git ls-files`, `gh pr view`, `gh pr diff`, `gh pr checks`, `grep`, `cat`, `find`. Prohibited: any command that writes files, posts to GitHub (`gh pr comment`, `gh pr review`, `gh issue comment`), installs packages (`npm install`, `pip install`), modifies git state (`git commit`, `git checkout -- <file>`, `git add`), or executes arbitrary shell scripts.
    
  • Blocks merge: maybe — no agent explicitly authorizes mutation and current Claude models follow "read-only" intent well, but the gap leaves the policy undefended against prompt injection or instruction override in PR content being reviewed.
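
To make the proposed policy concrete, here is a rough Python sketch of an allowlist check built from the permitted list above. It is not how Claude Code enforces tool permissions; `is_permitted` is a hypothetical helper, and the handling of shell metacharacters and `find` flags is an assumption added for illustration. Prefix matching is coarse, so a real gate would need per-command flag checks as well.

```python
import shlex

# Command prefixes taken from the permitted list in the proposed policy block.
PERMITTED_PREFIXES = [
    ("git", "log"), ("git", "diff"), ("git", "show"), ("git", "grep"),
    ("git", "ls-files"), ("gh", "pr", "view"), ("gh", "pr", "diff"),
    ("gh", "pr", "checks"), ("grep",), ("cat",), ("find",),
]
# Assumption: any redirection, chaining, or substitution is treated as unsafe,
# even inside quotes, which makes the check conservative.
UNSAFE_FRAGMENTS = (">", "<", "|", ";", "&", "`", "$(")
# Assumption: find flags that write or execute are rejected as well.
UNSAFE_FIND_FLAGS = {"-delete", "-exec", "-execdir", "-ok", "-okdir",
                     "-fprint", "-fprintf", "-fls"}


def is_permitted(command: str) -> bool:
    """Return True only if the command matches the read-only allowlist."""
    if any(fragment in command for fragment in UNSAFE_FRAGMENTS):
        return False
    try:
        tokens = tuple(shlex.split(command))
    except ValueError:  # unbalanced quotes and similar parse errors
        return False
    if not tokens:
        return False
    if tokens[0] == "find" and UNSAFE_FIND_FLAGS.intersection(tokens):
        return False
    return any(tokens[: len(prefix)] == prefix for prefix in PERMITTED_PREFIXES)
```

With this sketch, `is_permitted("gh pr view 1851")` is true while `is_permitted("gh pr comment 1851 --body hi")` and `is_permitted("cat a.txt > b.txt")` are false.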

Finding 3 — Two test plan items explicitly unchecked (discovery and end-to-end invocation unverified)

Confirmed.

  • Risk: The PR's own test plan marks these as incomplete:
    • [ ] Invoke /gitnexus-pr-swarm-review on a test PR to validate end-to-end flow
    • [ ] Restart Claude Code and verify agents appear in the available agent list
      Agent discovery depends on model/maxTurns validity (see Findings 1 and 4) and on Claude Code's parsing of the frontmatter at startup. If agents are undiscoverable, the entire PR's value is zero.
  • Evidence: PR description test plan, last two items unchecked. No CI workflow runs discovery tests for .claude/agents/ files.
  • Recommended fix: Before merging: (1) restart Claude Code on this branch; (2) verify all 7 agents appear in /agents; (3) invoke /gitnexus-pr-swarm-review <test-PR> and confirm lane dispatch. Document the observed behavior in the PR description (checking those boxes). If model: sonnet is rejected during this test, that confirms Finding 1 is a hard blocker.
  • Blocks merge: yes — per the PR's own DoD and DoD.md §3 "No false-done": validation is not claimed until the baseline has been run.

Finding 4 — maxTurns field: casing may be incorrect

Unverified suspicion (not confirmed blocker).

  • Risk: Claude Code subagent docs reference max_turns (snake_case) in some contexts. The agents use maxTurns (camelCase). If the field name is unrecognized, it is silently ignored — agents would run until Claude Code's default turn limit. This is misleading to future maintainers who may think turn limits are enforced.
  • Evidence: All 7 agents, line 10: maxTurns: 30|35|40|25. Claude Code's supported field names are not confirmed in this review.
  • Recommended fix: Verify accepted field name in Claude Code docs. If maxTurns is rejected, rename to max_turns. If both are accepted, document which is canonical.
  • Blocks merge: no if silently ignored; maybe if causes parse error.

Finding 5 — Synthesis critic loop has no hard enforcement checkpoint

Confirmed design gap (non-blocking if treated as advisory).

  • Risk: The SKILL.md coordinator says "Pass your draft final review to this agent for validation before outputting it." (line 84). The synthesis-critic agent produces a "Required corrections before posting" section (line 95). But neither the skill nor the critic defines a hard protocol: there is no "DO NOT POST if required-corrections is non-empty" instruction in the coordinator, and there is no pass/fail return value. A coordinator run could skip lane 7, present it as cosmetic, or ignore its corrections list.
  • Evidence: .claude/skills/gitnexus-pr-swarm-review/SKILL.md lines 79–84 (lane 7 description). .claude/agents/gitnexus-synthesis-critic.md lines 88–97 (output sections). Neither contains a hard block gate.
  • Recommended fix: Add to coordinator SKILL.md lane 7 description: "Do not produce or post the final review until the synthesis critic has returned with an empty 'Required corrections before posting' section. If corrections are required, revise and re-run the critic."
  • Blocks merge: maybe — the swarm is still functional without this; the gap reduces synthesis critique reliability but doesn't break discovery or safety.
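
The hard gate recommended in Finding 5 amounts to a loop like the following. This is a sketch of the control flow only: `critic` and `revise` stand in for the synthesis-critic agent and the coordinator's revision step, and `MAX_ROUNDS` is an assumed cap that the review does not specify.

```python
from typing import Callable

MAX_ROUNDS = 3  # assumption: a cap so the loop cannot run forever


def gated_review(
    draft: str,
    critic: Callable[[str], list[str]],
    revise: Callable[[str, list[str]], str],
) -> str:
    """Return a draft only after the critic reports no required corrections."""
    for _ in range(MAX_ROUNDS):
        corrections = critic(draft)
        if not corrections:
            return draft  # empty corrections list: safe to post
        draft = revise(draft, corrections)
    # Never fall through to posting an unvalidated draft.
    raise RuntimeError("critic still requires corrections; do not post")
```

Raising when the cap is reached, instead of returning the last draft, is the part that turns the critic from advisory into a gate.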

PR-specific assessment sections

A. Claude Code skill contract

.claude/skills/gitnexus-pr-swarm-review/SKILL.md frontmatter has name and description only — valid per skills contract (no tools needed in the skill itself; the coordinator dispatches subagents). YAML is structurally valid. Invocation name: gitnexus-pr-swarm-review is stable and routing-friendly. ✅

B. Claude Code subagent contract

All 7 agents have structurally valid YAML. name and description present in every file. tools list is correct (Read, Grep, Glob, Bash; no Edit, Write, MultiEdit, NotebookEdit). The model: sonnet issue (Finding 1) and maxTurns casing issue (Finding 4) are the only frontmatter gaps.

C. Context-engineering architecture

Lane ordering is internally consistent: facts-historian first (output feeds all others), hygiene second (changed file list required by later lanes), domain lanes 3–6 parallel after prerequisites, synthesis last. Each lane has a narrow responsibility and structured output contract. No lane relies on facts from a lane that hasn't run yet. The "Mandatory verification points" section in the facts-historian's output schema correctly forwards unverified items to downstream agents. ✅ The parallelism is stated as "can run in parallel" (permissive), not asserted as guaranteed. ✅
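
The lane ordering described here can be modelled as a dependency graph. The sketch below uses Python's standard `graphlib` to group lanes into batches whose members may run in parallel; the short lane names follow the commit message, while the exact prerequisite sets are an interpretation of the prose, not a copy of the skill file.

```python
from graphlib import TopologicalSorter

# Lane -> prerequisites, following the ordering described in this review.
PREREQS = {"facts-historian", "branch-hygiene"}
LANES = {
    "facts-historian": set(),
    "branch-hygiene": {"facts-historian"},
    "risk-architect": PREREQS,
    "test-ci-verifier": PREREQS,
    "security-boundary": PREREQS,
    "docs-dod": PREREQS,
    "synthesis-critic": {"risk-architect", "test-ci-verifier",
                         "security-boundary", "docs-dod"},
}


def dispatch_batches(lanes: dict[str, set[str]]) -> list[list[str]]:
    """Group lanes into batches; lanes within one batch may run in parallel."""
    sorter = TopologicalSorter(lanes)
    sorter.prepare()  # raises CycleError if a lane depends on itself indirectly
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        batches.append(ready)
        sorter.done(*ready)
    return batches


for batch in dispatch_batches(LANES):
    print(batch)
```

This yields four batches: facts-historian alone, branch-hygiene alone, the four domain lanes together, and synthesis-critic last.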

D. LLM orchestration and prompt-risk

Descriptions are routing-friendly (each description names the use case clearly). No agent prompt creates recursive delegation patterns. No agent references itself or calls the coordinator. Agent names are unique and stable. The synthesis loop advisory gap (Finding 5) is the main orchestration risk. No agent prompt encourages destructive shell commands explicitly — but absence of prohibition leaves the gap open (Finding 2).

E. Read-only and security-boundary lane

Declared tools: Read, Grep, Glob, Bash — no mutating Claude Code tools. ✅ Bash policy: prose-only enforcement (Finding 2 — confirmed gap). No agent prompts authorize mutating commands. The gitnexus-security-boundary-reviewer.md explicitly instructs running git grep -nP and git diff --check which are read-only. But no agent prohibits gh pr comment or other mutating Bash operations.

F. .gitignore and repository hygiene lane

Confirmed correct. git ls-files verified all 8 new committed files are tracked. The pattern change from .claude/agents/ to .claude/agents/* + !.claude/agents/gitnexus-*.md, and from .claude/skills/ to .claude/skills/* + !.claude/skills/gitnexus-pr-swarm-review/, is semantically correct: content-level wildcard ignores allow negation to work, whereas a directory-level ignore excludes the directory itself, so git never descends into it and a negation for a file inside has no effect. Verified that .claude/settings.local.json remains ignored (lines 20 and 49), .claude/worktrees/ (line 52), .claude/commands/ (line 96), .claude/helpers (line 97), .claude/skills/generated/ (line 55 + matched by .claude/skills/* without negation), and non-gitnexus local agent files (not matching !gitnexus-*.md pattern) all remain correctly ignored. ✅

G. Hidden Unicode and markdown hygiene lane

Clean. grep -P '[^\x00-\x7F]' across all 8 new files found only em dashes (U+2014) used in output section labels throughout prose text (e.g., 1. **PR identity** — title, number, author, base/head branches). These are:

  • Classification: Benign — visible punctuation in human-readable prose output labels, not in YAML keys, agent names, identifiers, commands, regexes, shell snippets, or security-critical text.
  • No bidi controls (U+202A–U+202E, U+2066–U+2069). No zero-width characters. git diff --check ran clean (no whitespace errors). ✅
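
A scan of this kind can be approximated in Python as follows. The bidi ranges are the ones named above; the zero-width set is an assumption, since the review does not list which characters it checked, and `scan` is a hypothetical helper rather than the command the reviewer ran.

```python
import unicodedata

# Bidi control ranges named in this review: U+202A..U+202E and U+2066..U+2069.
BIDI_CONTROLS = ({chr(c) for c in range(0x202A, 0x202F)}
                 | {chr(c) for c in range(0x2066, 0x206A)})
# Assumption: a common set of zero-width characters; the review does not list them.
ZERO_WIDTH = {"\u200b", "\u200c", "\u200d", "\u2060", "\ufeff"}


def scan(text: str) -> list[tuple[int, int, str, str]]:
    """Return (line, column, codepoint, class) for every non-ASCII character."""
    findings = []
    # Split on "\n" only, so Unicode line separators are reported, not swallowed.
    for line_no, line in enumerate(text.split("\n"), start=1):
        for col, ch in enumerate(line, start=1):
            if ord(ch) < 0x80:
                continue
            if ch in BIDI_CONTROLS:
                kind = "bidi-control"
            elif ch in ZERO_WIDTH:
                kind = "zero-width"
            else:
                kind = unicodedata.name(ch, "unknown").lower()
            findings.append((line_no, col, f"U+{ord(ch):04X}", kind))
    return findings
```

Reporting every non-ASCII character with its Unicode name, instead of only the dangerous ones, lets a reviewer classify visible punctuation as benign explicitly.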

H. CI and validation lane

The GitHub Actions CI bot comment (2026-05-27T07:44:01Z) reports all 9984 tests passed, typecheck success, E2E success (gitnexus-web changes only), and coverage above base. The PR's changed files are all .md and .gitignore; no TypeScript changed, so typecheck and unit tests are not exercised by this PR's code. No markdown linting CI step is visible in the workflow list. Authoritative head SHA vs. CI run SHA could not be independently verified (gh auth required).

I. Docs and user-experience lane

.claude/README-gitnexus-reviewer-swarm.md covers: invocation syntax with examples, agent table with purposes, key properties (read-only, evidence-grounded, missing-visibility policy, manually-invoked), relationship to /gitnexus-pr-review, and restart guidance. The restart warning is explicit: "Claude Code loads agent files at startup. If you directly add or edit files in .claude/agents/, restart Claude Code for the changes to take effect." ✅ The README does not promise hooks or automatic GitHub review posting. ✅

J. Repository-history alignment lane

  • installSkills() writes only to .claude/skills/gitnexus/<6 named skills>/ — confirmed by reading gitnexus/src/cli/ai-context.ts lines 299–375. The gitnexus-pr-swarm-review/ directory is a sibling, not covered. ✅
  • generateAIContextFiles() writes to AGENTS.md, CLAUDE.md, and .claude/skills/generated/ — not to .claude/agents/ or .claude/skills/gitnexus-pr-swarm-review/. ✅
  • New swarm files are hand-authored project assets that will NOT be overwritten by gitnexus analyze. However, there is no comment in README or agents noting "hand-authored, not generated by analyze" — this is a minor clarity gap, not a blocker.

Back-and-forth avoided by verifying

| Claim | Verified directly |
| --- | --- |
| New swarm files are tracked by git | git ls-files confirms all 8 new files ✅ |
| .gitignore negation patterns work | Content-level * not directory-level; pattern change is necessary and correct ✅ |
| installSkills doesn't overwrite swarm | ai-context.ts:299-375 confirmed write path is only gitnexus/<named-skills> |
| No bidi/hidden Unicode | grep -P '[^\x00-\x7F]' found only em dashes in prose ✅ |
| Merge commit only touches Kotlin extractor | git show --stat 186ed18 confirmed; unrelated to swarm ✅ |
| Existing /gitnexus-pr-review is different scope | Read SKILL.md: MCP-tool-based, different verdict format, no conflicts ✅ |
| No mutating tools declared | All 7 agents have only Read/Grep/Glob/Bash; no Edit/Write/NotebookEdit ✅ |

Open questions that remain only if unavoidable

  1. Does Claude Code accept model: sonnet as a shorthand alias, or is it silently ignored? This requires restarting Claude Code with this branch and checking the model actually used by an agent invocation. Until tested, treat as requiring the fix in Finding 1.
  2. Did CI run on the merge commit HEAD (186ed18) or only on 4067f8b? The merge commit only touches Kotlin .ts files with existing tests, so CI should pass regardless — but the head SHA cannot be confirmed against the CI run without authenticated API access.

Final verdict

not production-ready

Three issues block merge. First, model: sonnet (all 7 agents, line 9) is not a valid Claude Code model ID — it should be claude-sonnet-4-6, and its current form risks either silent model downgrade or future load failure. Second, the PR's own test plan explicitly marks agent discovery and end-to-end invocation as unchecked — per DoD.md §3 ("No false-done"), a PR cannot claim Done until the validation baseline has been run, and neither manual validation item is complete. Third, the Bash read-only policy exists only as prose intent without explicit prohibited-command lists in any of the 7 agents, which leaves the read-only claim undefended against adversarial PR content during review.

With these three fixes — model: claude-sonnet-4-6 in all agents, explicit Bash prohibition lists, and confirmed agent discovery + end-to-end invocation on a test PR — the PR would be production-ready with minor follow-ups (the synthesis-critic enforcement gap and maxTurns casing uncertainty are non-blocking).

The .gitignore design, file tracking, Unicode hygiene, overwrite-safety analysis, branch structure, and context-engineering architecture are all sound and would not require further changes.



Comment thread .claude/agents/gitnexus-test-ci-verifier.md Outdated
Comment thread .claude/agents/gitnexus-branch-hygiene-reviewer.md Outdated
- Pin explicit model IDs in all 7 reviewer-swarm agents per CLAUDE.md
  (no unversioned aliases). Set the two mechanical agents
  (test-ci-verifier, branch-hygiene-reviewer) to claude-haiku-4-5-20251001
  per @Cenrax's "this could be haiku"; the five analytical agents use
  claude-sonnet-4-6.
- Add an explicit read-only Bash policy (permitted/prohibited command
  lists) to every agent's Rules section, so the read-only guarantee is
  defended against injected/adversarial PR content rather than prose-only.
- Add a hard synthesis-critic gate to the swarm skill: do not post the
  final review until the critic's "Required corrections before posting"
  section is empty (was advisory only).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Restructure the reviewer swarm around a single CLI-neutral source of truth so it
runs from any AI CLI, not just Claude Code.

- pr-swarm-review/: canonical orchestration.md (Swarm + Solo execution modes with
  an identical output contract) and personas/0N-*.md (the 7 review personas,
  relocated verbatim from the Claude agents, each tagged with a model tier and the
  read-only Bash policy). Single source of truth — edit here, not in the wrappers.
- Thin per-CLI adapters that read the canonical spec at runtime (no duplication):
  - Claude Code: coordinator skill (Swarm mode) + the 7 agents are now thin
    wrappers that read their persona file (frontmatter/model preserved; mechanical
    lanes Haiku, analytical lanes Sonnet).
  - Gemini CLI: .gemini/commands/gitnexus-pr-swarm-review.toml
  - GitHub Copilot: .github/prompts/gitnexus-pr-swarm-review.prompt.md
  - Cursor: .cursor/commands/gitnexus-pr-swarm-review.md
- AGENTS.md: canonical "PR Swarm Review" section -> orchestration.md, the universal
  entrypoint honored by Codex, Cursor, Gemini, Copilot, and any AGENTS.md-aware
  agent (Codex user-level prompt install noted in the README).

Graceful degradation: only Claude Code has parallel subagents (Swarm mode); every
other CLI runs the 7 lanes sequentially in one agent (Solo mode) with the same
output contract. prettier --check clean (root config).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@magyargergo magyargergo merged commit 85727ca into main May 29, 2026
30 checks passed
@magyargergo magyargergo deleted the feat/gitnexus-reviewer-swarm branch May 29, 2026 17:24