Merged
Commits
15 commits
- `3214437` feat: t1328 matterbridge agent + t1329 cross-review judge pipeline (marcusquinn, Feb 25, 2026)
- `02dcb75` fix: address CodeRabbit review — arg parsing, binary validation, arch… (marcusquinn, Feb 26, 2026)
- `0582db1` fix: address remaining review feedback — error visibility and curl er… (marcusquinn, Feb 26, 2026)
- `d239c55` fix: address CodeRabbit review — tail arg safety, config validation, … (marcusquinn, Feb 26, 2026)
- `a841d19` fix: bound judge payload size and replace sed with parameter expansio… (marcusquinn, Feb 26, 2026)
- `2760c2c` fix: redirect runner errors to log files, sanitize reasoning field (marcusquinn, Feb 26, 2026)
- `abfe08d` fix: escape all SQL string values in cmd_score and fix darwin-64bit a… (marcusquinn, Feb 26, 2026)
- `5c7d2b5` fix: correct macOS binary asset names in matterbridge docs (darwin-64… (marcusquinn, Feb 27, 2026)
- `03f3045` fix: improve error visibility — redirect stderr to log files instead … (marcusquinn, Feb 27, 2026)
- `26afa19` fix: resolve SC2119/SC2120 — pass explicit empty arg to get_all_tier_… (alex-solovyev, Feb 27, 2026)
- `a656c53` fix: address remaining review feedback — input validation, error prop… (alex-solovyev, Feb 27, 2026)
- `b757969` refactor: move _cross_review_judge_score before its caller for readab… (alex-solovyev, Feb 27, 2026)
- `d1e8a49` fix: address final review feedback — decimal score parsing, judge mod… (alex-solovyev, Feb 27, 2026)
- `465cee9` fix: sanitize judge_model, fix decimal score clamping, ensure DATA_DI… (marcusquinn, Feb 27, 2026)
- `d05c08f` fix: remove duplicate ensure_dirs, escape SQL inputs in cmd_results (alex-solovyev, Feb 27, 2026)
52 changes: 14 additions & 38 deletions .agents/scripts/commands/cross-review.md
@@ -1,30 +1,21 @@
---
- description: Dispatch a prompt to multiple AI models, diff results, and optionally score via a judge model
+ description: Dispatch the same prompt to multiple AI models, diff results, and optionally auto-score via a judge model
agent: Build+
mode: subagent
---

- Run a multi-model adversarial review: dispatch the same prompt to N models in parallel, collect outputs, diff results, and optionally score via a judge model (Ouroboros-style pipeline).
+ Dispatch a prompt to multiple AI models in parallel, collect and diff their responses, and optionally score them via a judge model.

Target: $ARGUMENTS

## Instructions

- Parse the arguments to extract:
- - `--prompt`: the review prompt (required)
- - `--models`: comma-separated model tiers (default: `sonnet,opus`)
- - `--score`: enable judge scoring pipeline (optional flag)
- - `--judge`: judge model tier (default: `opus`)
- - `--task-type`: scoring category — `code`, `review`, `analysis`, `text`, `general` (default: `general`)
- - `--timeout`: per-model timeout in seconds (default: 600)
- - `--output`: output directory (default: auto-generated tmp dir)
+ 1. Parse the user's arguments. Common forms:

```bash
/cross-review "review this PR diff" --models sonnet,opus
/cross-review "audit this code" --models sonnet,gemini-pro,gpt-4.1 --score
- /cross-review "design this API" --score --judge opus --task-type analysis
+ /cross-review "design this API" --score --judge opus
```

2. Run the cross-review:
@@ -41,11 +32,11 @@ Parse the arguments to extract:
--models "sonnet,gemini-pro,gpt-4.1" \
--score

- # With custom judge model and task type
+ # With custom judge model
~/.aidevops/agents/scripts/compare-models-helper.sh cross-review \
--prompt "your prompt here" \
--models "sonnet,opus" \
- --score --judge sonnet --task-type review
+ --score --judge sonnet
```

3. Present the results:
@@ -55,18 +46,16 @@ Parse the arguments to extract:
- Note any models that failed to respond

4. If `--score` was used, scores are automatically:
- - Recorded in the model-comparisons SQLite DB (`~/.aidevops/.agent-workspace/memory/model-comparisons.db`)
- - Fed into the pattern tracker for data-driven model routing (`/route`, `/patterns`)
+ - Recorded in the model-comparisons SQLite DB
+ - Fed into the pattern tracker for model routing (`/route`, `/patterns`)
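The recorded scores land in a local SQLite database (the path appears in the pre-change line above). A minimal sketch for inspecting them directly — the `comparisons` table and the `model`/`overall` column names are assumptions for illustration, not confirmed by this PR:

```shell
# Hypothetical query; table and column names are assumed, not confirmed.
DB="$HOME/.aidevops/.agent-workspace/memory/model-comparisons.db"
if [ -f "$DB" ]; then
  # Highest-scoring models first
  sqlite3 "$DB" "SELECT model, overall FROM comparisons ORDER BY overall DESC LIMIT 10;"
fi
```

If the helper uses a different schema, only the SELECT needs adjusting; `sqlite3 "$DB" .schema` will show the real tables.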

## Options

| Option | Default | Description |
|--------|---------|-------------|
| `--prompt` | (required) | The review prompt |
| `--models` | `sonnet,opus` | Comma-separated model tiers to compare |
| `--score` | off | Auto-score outputs via judge model |
| `--judge` | `opus` | Judge model tier (used with `--score`) |
- | `--task-type` | `general` | Scoring category: `code`, `review`, `analysis`, `text`, `general` |
| `--timeout` | `600` | Seconds per model |
| `--output` | auto | Directory for raw outputs |
| `--workdir` | `pwd` | Working directory for model context |
@@ -77,14 +66,13 @@ Parse the arguments to extract:

## Scoring Criteria (judge model, 1-10 scale)

- | Criterion | Scale | Description |
- |-----------|-------|-------------|
- | Correctness | 1-10 | Factual accuracy and technical correctness |
- | Completeness | 1-10 | Coverage of all requirements and edge cases |
- | Quality | 1-10 | Code quality / writing quality |
- | Clarity | 1-10 | Clear explanation, good formatting, readability |
- | Adherence | 1-10 | Following instructions precisely, staying on-task |
- | Overall | 1-10 | Judge's holistic assessment |
+ | Criterion | Description |
+ |-----------|-------------|
+ | correctness | Factual accuracy and technical correctness |
+ | completeness | Coverage of all requirements and edge cases |
+ | quality | Code quality, best practices, maintainability |
+ | clarity | Clear explanation, good formatting, readability |
+ | adherence | Following the original prompt instructions precisely |
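The criteria above imply a per-criterion verdict from the judge, but this diff does not show the judge's output schema. The JSON below is a hypothetical shape, used only to illustrate how such per-criterion scores could be aggregated:

```shell
# Hypothetical verdict shape — the real judge-scores schema is not shown here.
f=$(mktemp)
cat > "$f" <<'EOF'
{
  "scores": {
    "sonnet": {"correctness": 7, "completeness": 6, "quality": 7, "clarity": 8, "adherence": 7},
    "opus":   {"correctness": 9, "completeness": 8, "quality": 8, "clarity": 8, "adherence": 9}
  }
}
EOF

# Mean of the five criteria per model, one "<model> <mean>" line each
jq -r '.scores | to_entries[] | "\(.key) \(.value | add / length)"' "$f"
```

If the helper writes a different structure to its judge output, only the jq path expression needs adjusting.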

## Examples

@@ -96,25 +84,13 @@ Parse the arguments to extract:
/cross-review "Design a rate limiting strategy for a REST API" \
--models sonnet,opus,pro --score

- # Custom judge model and task type
- /cross-review "Audit this architecture" --models "sonnet,opus" --score --judge opus --task-type analysis

# Quick diff with custom timeout
/cross-review "Summarize the key changes in this diff" --models haiku,sonnet --timeout 120

# View scoring results after a cross-review
/score-responses --leaderboard
```

- ## Output
-
- - Per-model responses displayed inline
- - Diff summary (word counts, unified diff for 2-model comparisons)
- - Judge scores table (when `--score` is set)
- - Winner declaration with reasoning
- - Results saved to `~/.aidevops/.agent-workspace/tmp/cross-review-<timestamp>/`
- - Judge JSON saved to `<output_dir>/judge-scores.json`

## Related

- `/compare-models` — Compare model capabilities and pricing (no live dispatch)