
Conversation

@marcusquinn marcusquinn (Owner) commented Feb 9, 2026

Summary

  • Create model-specific subagent definitions for all 5 routing tiers (haiku, flash, sonnet, pro, opus)
  • Add cross-provider reviewer subagents (gemini-reviewer, gpt-reviewer)
  • Update model-routing.md with subagent reference table

Details

New directory: .agents/tools/ai-assistants/models/

Tier subagents (each declares model:, model-tier:, model-fallback: in frontmatter):

| File | Primary Model | Fallback | Use Case |
|------|---------------|----------|----------|
| `haiku.md` | claude-3-5-haiku | gemini-2.5-flash | Triage, classification, simple transforms |
| `flash.md` | gemini-2.5-flash | gpt-4.1-mini | Large context reads, summarization |
| `sonnet.md` | claude-sonnet-4 | gpt-4.1 | Code implementation, review (default) |
| `pro.md` | gemini-2.5-pro | claude-sonnet-4 | Large codebase analysis |
| `opus.md` | claude-opus-4 | o3 | Architecture, novel problems |
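As a concrete illustration of the three routing keys each tier file declares, a minimal sonnet-tier header might look like the following (the identifiers mirror those quoted later in this review; the `tools:` permissions shown are illustrative):

```yaml
---
model: anthropic/claude-sonnet-4-20250514
model-tier: sonnet
model-fallback: openai/gpt-4.1
tools:
  read: true
  write: true
---
```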

Cross-provider reviewers:

  • gemini-reviewer.md — Gemini-powered code review (1M context)
  • gpt-reviewer.md — GPT-powered code review (second opinion)

Integration pattern:

  • In-session: Task tool invokes subagent by name (model is advisory)
  • Headless: Supervisor reads model: from frontmatter, passes as --model flag to CLI
  • Fallback: model-fallback: field enables resilient dispatch (t132.4)
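The headless path above (supervisor reads `model:` from frontmatter, passes it as a `--model` flag) can be sketched as a small shell helper. This is a sketch only: the `fm_key` helper and the commented-out CLI invocation are hypothetical, not the framework's actual dispatch code.

```shell
#!/bin/sh
# Sketch: extract model: and model-fallback: from a subagent's YAML
# frontmatter, as a headless supervisor might before dispatching.

# Read one frontmatter key from the block between the `---` delimiters.
fm_key() {
  awk -v key="$2: " '
    /^---$/ { fence++; next }
    fence == 1 && index($0, key) == 1 { print substr($0, length(key) + 1); exit }
  ' "$1"
}

# Build a sample subagent file to run against (illustrative content).
agent=$(mktemp)
cat > "$agent" <<'EOF'
---
model: google/gemini-2.5-flash
model-tier: flash
model-fallback: openai/gpt-4.1-mini
---
# Flash Tier Model
EOF

model=$(fm_key "$agent" model)
fallback=$(fm_key "$agent" model-fallback)
echo "primary=$model"
echo "fallback=$fallback"
# A real dispatcher would then try the primary and retry on the fallback:
#   agent-cli --model "$model" "$prompt" || agent-cli --model "$fallback" "$prompt"
rm -f "$agent"
```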

Task

Implements t132.1 from the Cross-Provider Model Routing plan.

Testing

  • All files are markdown documentation (no runtime changes)
  • Frontmatter follows existing conventions (model:, mode:, tools:)
  • model-routing.md updated with cross-references
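Since the frontmatter conventions above are the only machine-readable contract in this PR, a quick conformance check could verify that each tier file declares the required keys. The checker below is illustrative, not part of the framework; it runs against a generated sample rather than the real repo files.

```shell
#!/bin/sh
# Sketch: verify a subagent file declares the routing keys described in
# this PR (model:, model-tier:, model-fallback:).
check_frontmatter() {
  file="$1"
  for key in model model-tier model-fallback; do
    if ! grep -q "^${key}: " "$file"; then
      echo "$file: missing ${key}:"
      return 1
    fi
  done
  echo "$file: ok"
}

# Demo against a generated sample file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
---
model: anthropic/claude-sonnet-4-20250514
model-tier: sonnet
model-fallback: openai/gpt-4.1
---
EOF
check_frontmatter "$sample"   # prints "<tmpfile>: ok"
rm -f "$sample"
```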

Summary by CodeRabbit

  • Documentation
    • Added AI model tier documentation (Haiku, Flash, Sonnet, Pro, Opus) including capabilities, constraints, context windows, and pricing details.
    • Documented cross-provider code reviewer agents (Gemini and GPT-based) with review criteria and severity-based findings format.
    • Expanded model routing guide with tier mappings, subagent definitions, and configuration instructions for headless dispatch.

…2.1)

- Create models/ directory with tier-based subagents:
  haiku.md, flash.md, sonnet.md, pro.md, opus.md
- Create cross-provider reviewer subagents:
  gemini-reviewer.md, gpt-reviewer.md
- Add README.md documenting tier mapping, dispatch patterns,
  and supervisor integration
- Each subagent declares model: with full provider/model ID,
  model-tier: for tier lookup, and model-fallback: for resilience
- Update model-routing.md with subagent reference table
- Enables headless dispatch with --model flag from frontmatter

@coderabbitai coderabbitai bot (Contributor) commented Feb 9, 2026

Walkthrough

This PR introduces comprehensive documentation for model-specific subagents within the AI-assistants tools framework. It adds individual configuration files for five model tiers (haiku, flash, sonnet, pro, opus) and two specialized reviewers (gemini-reviewer, gpt-reviewer), along with a README and model-routing guidance explaining tier mapping, workflow details, and supervisor integration patterns.

Changes

| Cohort / File(s) | Summary |
|------------------|---------|
| **Model Tier Documentation**<br>`models/haiku.md`, `models/flash.md`, `models/sonnet.md`, `models/pro.md`, `models/opus.md` | Five new files documenting lightweight to high-capability model tiers with YAML frontmatter metadata (model identifier, tier, fallback), capabilities summaries, operational constraints, and tabular model details (provider, context window, costs). |
| **Specialized Reviewer Models**<br>`models/gemini-reviewer.md`, `models/gpt-reviewer.md` | Two new cross-provider code reviewer specifications with tool configurations, review focus areas (Correctness, Security, Performance, Maintainability, Conventions), structured output formats, and severity-level guidance. |
| **Model Routing & Reference**<br>`models/README.md`, `context/model-routing.md` | New overview documentation mapping tiers to subagent files, detailing In-Session Task Tool and Headless Dispatch workflows, supervisor integration steps, and instructions for adding new models. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes


Poem

📚 Five tiers rise tall, from swift haiku to opus grand,
Each model finds its place with flair and metadata close at hand,
Reviewers gemini and gpt stand ready for critique,
While routing docs guide supervisors to dispatches sleek,
Zero debt maintained—our docs shine bright! ✨

🚥 Pre-merge checks | ✅ 3 passed

| Check name | Status | Explanation |
|------------|--------|-------------|
| Description Check | ✅ Passed | Check skipped - CodeRabbit's high-level summary is enabled. |
| Title Check | ✅ Passed | The title 'feat: define model-specific subagents for cross-provider routing (t132.1)' directly and clearly describes the main change: introducing model-specific subagent definitions for cross-provider model routing with tier mappings, reviewer agents, and routing documentation. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |


github-actions bot commented Feb 9, 2026

🔍 Code Quality Report

[MONITOR] Code Review Monitoring Report

[INFO] Latest Quality Status:
SonarCloud: 0 bugs, 0 vulnerabilities, 50 code smells

[INFO] Recent monitoring activity:
Mon Feb 9 04:32:59 UTC 2026: Code review monitoring started
Mon Feb 9 04:33:00 UTC 2026: SonarCloud - Bugs: 0, Vulnerabilities: 0, Code Smells: 50

📈 Current Quality Metrics

  • BUGS: 0
  • CODE SMELLS: 50
  • VULNERABILITIES: 0

Generated on: Mon Feb 9 04:33:02 UTC 2026


Generated by AI DevOps Framework Code Review Monitoring

sonarqubecloud bot commented Feb 9, 2026

@marcusquinn marcusquinn merged commit 6dddc70 into main Feb 9, 2026
10 of 11 checks passed
@coderabbitai coderabbitai bot left a comment
Actionable comments posted: 7

🤖 Fix all issues with AI agents
In @.agents/tools/ai-assistants/models/flash.md:
- Line 32: Fix the typo in the constraint phrase by replacing "throughness" with
"thoroughness" in the markdown line that currently reads "Prioritize throughness
of coverage over depth of analysis" inside
.agents/tools/ai-assistants/models/flash.md so the sentence becomes "Prioritize
thoroughness of coverage over depth of analysis".
- Around line 4-46: Update the frontmatter model identifier from
"google/gemini-2.5-flash-preview-05-20" to the GA name
"google/gemini-2.5-flash", correct the pricing values in the "Model Details"
section to Input $0.30/1M and Output $2.50/1M, keep the context value at 1M
tokens, and review/replace the "model-fallback: openai/gpt-4.1-mini" entry
(consider "openai/gpt-4o") to avoid using the retired fallback; ensure you
update both the frontmatter keys (model, model-fallback) and the pricing table
entries to match these changes.

In @.agents/tools/ai-assistants/models/opus.md:
- Line 35: Update the incorrect cost multiplier text: replace the phrase "Cost
is approximately 3x sonnet" with "Cost is approximately 5x sonnet" (or adjust
the pricing table to match a 3x multiplier) so the narrative matches the pricing
table values; locate the string "Cost is approximately 3x sonnet" in opus.md and
ensure the Input ($15.00/1M) and Output ($75.00/1M) entries remain consistent
with the new multiplier.

In @.agents/tools/ai-assistants/models/pro.md:
- Around line 4-46: Replace the deprecated preview frontmatter value "model:
google/gemini-2.5-pro-preview-06-05" with the GA endpoint "model:
google/gemini-2.5-pro" and ensure any other references in this file (the Model
field in the Model Details table) are updated to "gemini-2.5-pro" while leaving
model-tier, model-fallback, pricing, and context values unchanged; confirm the
fallback (anthropic/claude-sonnet-4-20250514) remains intact.

In @.agents/tools/ai-assistants/models/README.md:
- Around line 48-55: Update the README.md reference to the helper script so it
uses the consistent path prefix; replace the bare script name
"compare-models-helper.sh" with "scripts/compare-models-helper.sh" in the
.agents/tools/ai-assistants/models/README.md (look for the line that currently
says "Run `compare-models-helper.sh discover --probe`" and change it to "Run
`scripts/compare-models-helper.sh discover --probe`").

In @.agents/tools/context/model-routing.md:
- Around line 110-112: The table in model-routing.md references missing files
models/flash.md and models/pro.md (rows labeled `flash` and `pro`), which will
create broken links; either add the referenced markdown files (models/flash.md
and models/pro.md) to the PR with the correct content, or remove the
corresponding `flash` and `pro` rows from the table in
.agents/tools/context/model-routing.md until the files are added; update the
table consistently so no links point to non-existent files.
- Around line 109-115: The table in model-routing.md uses short model names
(e.g., claude-3-5-haiku, gemini-2.5-pro, gpt-4.1) that do not match the
fully-qualified frontmatter identifiers in the model files (e.g.,
anthropic/claude-sonnet-4-20250514, google/gemini-2.5-pro-preview-06-05); pick
one fix: either update the table rows for each Tier (haiku, flash, sonnet, pro,
opus) to use the exact frontmatter identifiers from the corresponding model
files (models/haiku.md, models/flash.md, models/sonnet.md, models/pro.md,
models/opus.md), or add a short note in model-routing.md that documents the
naming convention and state that the supervisor’s model resolver will
canonicalize short names to full provider/date-prefixed identifiers (and link to
the resolver behavior); ensure the chosen option is applied consistently for
both Primary Model and Fallback columns.
🧹 Nitpick comments (1)
.agents/tools/ai-assistants/models/pro.md (1)

18-36: Consider referencing AGENTS.md as the authoritative source.

These instructions look good, but adding a brief note pointing to AGENTS.md helps keep a single source of truth for assistant guidance.

Based on learnings "Maintain all AI assistant instructions, documentation, and operational guidance in AGENTS.md as the single source of truth."

Comment on lines +4 to +46
model: google/gemini-2.5-flash-preview-05-20
model-tier: flash
model-fallback: openai/gpt-4.1-mini
tools:
read: true
write: false
edit: false
bash: false
glob: false
grep: false
webfetch: false
task: false
---

# Flash Tier Model

You are a fast, large-context AI assistant optimized for processing large amounts of text efficiently.

## Capabilities

- Reading and summarizing large files or codebases (50K+ tokens)
- Document, PR, and discussion summarization
- Bulk processing (many small tasks in sequence)
- Initial research sweeps before deeper analysis
- Data extraction and formatting

## Constraints

- Prioritize throughness of coverage over depth of analysis
- For complex reasoning tasks, recommend escalation to sonnet or pro tier
- Leverage your large context window (1M tokens) for comprehensive reads
- Keep output structured and scannable

## Model Details

| Field | Value |
|-------|-------|
| Provider | Google |
| Model | gemini-2.5-flash |
| Context | 1M tokens |
| Input cost | $0.15/1M tokens |
| Output cost | $0.60/1M tokens |
| Tier | flash (low cost, large context) |

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

# Check the actual file content
cat -n .agents/tools/ai-assistants/models/flash.md

Repository: marcusquinn/aidevops

Length of output: 1605


🏁 Script executed:

# List other model files to understand naming patterns
find .agents/tools/ai-assistants/models -type f -name "*.md" | head -20

Repository: marcusquinn/aidevops

Length of output: 431


🏁 Script executed:

# Search for model identifier patterns across existing agent instructions
rg "model:" .agents/ -A 2 | head -50

Repository: marcusquinn/aidevops

Length of output: 3192


🌐 Web query:

What is the current official model identifier and pricing for Google Gemini 2.5 Flash (preview-05-20) as of February 2026? Include context window size and cost per 1M tokens.

💡 Result:

As of February 2026, gemini-2.5-flash-preview-05-20 is no longer an active official endpoint on Vertex AI: Google’s Vertex AI release notes say the 2.5 Flash preview endpoints (including gemini-2.5-flash-preview-05-20) were shut down after July 15, 2025. [1]

What to use instead (official, current as of Feb 2026)

Official GA model identifier: gemini-2.5-flash (Gemini API / Vertex AI) [2][3]
Context window: 1,000,000 tokens (1M) [2]

Official pricing (USD) — per 1M tokens

(These are the published Gemini 2.5 Flash prices; the preview-05-20 endpoint itself is retired.)

  • Input (text/image/video): $0.30 / 1M tokens [2][3]
  • Output (includes thinking tokens): $2.50 / 1M tokens [2][3]
  • Audio input (if used): $1.00 / 1M tokens [2][3]
  • Context caching (text/image/video): $0.03 / 1M tokens (plus storage pricing) [2]

Sources: Vertex AI release notes (preview shutdown) [1], Gemini API pricing page [2], Vertex AI generative AI pricing page [3].


🏁 Script executed:

# Check the fallback model and other model configs for consistency
grep -r "model-fallback" .agents/tools/ai-assistants/models/

Repository: marcusquinn/aidevops

Length of output: 668


🏁 Script executed:

# Look at other model files to understand the pattern
cat -n .agents/tools/ai-assistants/models/sonnet.md | head -20

Repository: marcusquinn/aidevops

Length of output: 670


🌐 Web query:

What is the current status of OpenAI gpt-4.1-mini model as of February 2026? Is it still available or has it been deprecated/renamed?

💡 Result:

As of February 9, 2026, GPT-4.1 mini is still available in ChatGPT, but it is scheduled to be retired from ChatGPT on February 13, 2026. OpenAI says there are no API changes at this time, meaning gpt-4.1-mini remains available via the OpenAI API and has not been renamed (no replacement name announced—just a ChatGPT availability change). [1], [2], [3]

So: not deprecated/renamed in the API; being retired specifically in ChatGPT on Feb 13, 2026. [1], [2]

Sources: [1] https://help.openai.com/articles/20001051, [2] https://openai.com/index/retiring-gpt-4o-and-older-models/, [3] https://help.openai.com/en/articles/9624314-model-release-notes


Update primary model to active endpoint and fix pricing table immediately.

The frontmatter uses google/gemini-2.5-flash-preview-05-20, which was retired on July 15, 2025 and is no longer available. Update to the current GA identifier google/gemini-2.5-flash.

The pricing table is significantly incorrect:

  • Listed: Input $0.15/1M, Output $0.60/1M
  • Correct (Feb 2026): Input $0.30/1M, Output $2.50/1M

Context window of 1M tokens is correct. Also review model-fallback: openai/gpt-4.1-mini as it's being phased out (ChatGPT retirement Feb 13, 2026); consider using openai/gpt-4o for consistency with other model configs.

🧰 Tools
🪛 LanguageTool

[grammar] ~32-~32: Ensure spelling is correct
Context: ...ormatting ## Constraints - Prioritize throughness of coverage over depth of analysis - Fo...

(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)



## Constraints

- Prioritize throughness of coverage over depth of analysis

⚠️ Potential issue | 🟡 Minor

Fix typo in constraint (throughness → thoroughness).

Minor doc quality issue.

✏️ Proposed fix
-- Prioritize throughness of coverage over depth of analysis
+- Prioritize thoroughness of coverage over depth of analysis



- Only use this tier when the task genuinely requires it
- Most coding tasks are better served by sonnet tier
- Cost is approximately 3x sonnet -- justify the spend

⚠️ Potential issue | 🟡 Minor

Cost multiplier inaccurate: states "3x sonnet" but table shows 5x.

Line 35 states "Cost is approximately 3x sonnet," but the model details table shows:

  • Input: $15.00/1M (5x sonnet's $3.00)
  • Output: $75.00/1M (5x sonnet's $15.00)

Update line 35 to reflect the actual 5x multiplier or correct the pricing table.

📝 Proposed fix
-  Cost is approximately 3x sonnet -- justify the spend
+  Cost is 5x sonnet (input and output) -- justify the spend

Comment on lines +4 to +46
model: google/gemini-2.5-pro-preview-06-05
model-tier: pro
model-fallback: anthropic/claude-sonnet-4-20250514
tools:
read: true
write: true
edit: true
bash: true
glob: false
grep: true
webfetch: true
task: false
---

# Pro Tier Model

You are a high-capability AI assistant optimized for complex tasks that require both deep reasoning and large context windows.

## Capabilities

- Analyzing very large codebases (100K+ tokens of context)
- Complex reasoning that also needs large context
- Multi-file refactoring across many files
- Comprehensive code review of large PRs
- Cross-referencing documentation with implementation

## Constraints

- Use this tier when both large context AND deep reasoning are needed
- For large context with simple processing, flash tier is more cost-effective
- For deep reasoning with normal context, sonnet tier is sufficient
- For architecture decisions and novel problems, opus tier may be better

## Model Details

| Field | Value |
|-------|-------|
| Provider | Google |
| Model | gemini-2.5-pro |
| Context | 1M tokens |
| Input cost | $1.25/1M tokens |
| Output cost | $10.00/1M tokens |
| Tier | pro (high capability, large context) |

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🌐 Web query:

Google Gemini 2.5 Pro model identifier pricing context window 2025

💡 Result:

  • Model identifier (Gemini API / AI Studio): gemini-2.5-pro. (ai.google.dev)
  • Context window / token limits (2025): Max input: 1,048,576 tokens (~1M). Max output: ~65,536 tokens. (geminidocumentation.com)
  • Pricing (token-based, USD, 2025) — Gemini Developer API (paid tier):
    • Input: $1.25 / 1M tokens (prompts ≤ 200k tokens); $2.50 / 1M tokens (prompts > 200k). (ai.google.dev)
    • Output (includes thinking tokens): $10.00 / 1M tokens (≤ 200k); $15.00 / 1M tokens (> 200k). (ai.google.dev)
    • Context caching: $0.125 / 1M cached input tokens (≤ 200k); $0.25 / 1M cached input tokens (> 200k) plus $4.50 per 1,000,000 tokens per hour for cache storage. (ai.google.dev)
  • Vertex AI pricing (same headline rates): Input $1.25→$2.50 and output $10→$15 per 1M tokens depending on whether the request is ≤200k vs >200k input tokens. (cloud.google.com)
  • Google’s March 2025 announcement: Gemini 2.5 Pro shipped with a 1M-token context window (with “2M coming soon” noted at the time). (blog.google)



🌐 Web query:

gemini-2.5-pro-preview-06-05 current status pricing

💡 Result:

As of Feb 9, 2026, gemini-2.5-pro-preview-06-05 is shut down (Google ended all Gemini 2.5 Pro/Flash preview endpoints, including gemini-2.5-pro-preview-06-05, on July 15, 2025). (cloud.google.com)

What to use instead (current status)

Use the GA model endpoint: gemini-2.5-pro (Gemini 2.5 Pro is generally available). (cloud.google.com)

Current pricing (Gemini API / AI Studio) for gemini-2.5-pro

Standard (per 1M tokens, USD): (ai.google.dev)

  • Input: $1.25 (prompts ≤ 200k tokens) / $2.50 (prompts > 200k)
  • Output (incl. thinking tokens): $10.00 (≤ 200k) / $15.00 (> 200k)
  • Context caching: $0.125 (≤ 200k) / $0.25 (> 200k) + $4.50 / 1M tokens/hour storage

If you meant Vertex AI pricing specifically (it can differ by product/region), tell me your region and whether you’re using Vertex AI vs Gemini Developer API, and I’ll pull the exact table.



🌐 Web query:

Google Gemini API pricing 2025 gemini-2.5-pro

💡 Result:

Gemini API (Google AI for Developers) pricing for gemini-2.5-pro (2025): [1]

  • Standard (paid tier), per 1M tokens (USD):
    • Input: $1.25 (prompts ≤ 200k tokens); $2.50 (prompts > 200k tokens) [1]
    • Output (incl. “thinking” tokens): $10.00 (prompts ≤ 200k); $15.00 (prompts > 200k) [1]
  • Context caching (paid tier): $0.125 / 1M tokens (≤200k); $0.25 / 1M tokens (>200k) + storage $4.50 / 1M tokens/hour [1]
  • Batch (paid tier): Input $0.625 / 1M tokens (prompts ≤ 200k tokens) [1]

This matches Google’s announced 2025 API pricing breakdown (≤200k vs >200k prompt thresholds). [2]

Sources: [1] Google Gemini Developer API pricing page, [2] TechCrunch (Apr 4, 2025).


Update frontmatter to use the current GA endpoint instead of deprecated preview variant.

The google/gemini-2.5-pro-preview-06-05 model was shut down on July 15, 2025. Update the frontmatter to use google/gemini-2.5-pro (the generally available endpoint). The table's model reference and pricing ($1.25/1M input, $10.00/1M output) and context window (1M tokens) are current and correct.

🧰 Tools
🪛 LanguageTool

[style] ~24-~24: As an alternative to the over-used intensifier ‘very’, consider replacing this phrase.
Context: ... windows. ## Capabilities - Analyzing very large codebases (100K+ tokens of context) - C...

(EN_WEAK_ADJECTIVE)


Comment on lines +48 to +55
3. Add to the tier mapping in `model-routing.md`
4. Run `compare-models-helper.sh discover --probe` to verify access

## Related

- `tools/context/model-routing.md` — Cost-aware routing rules
- `compare-models-helper.sh discover` — Detect available providers
- `tools/ai-assistants/headless-dispatch.md` — CLI dispatch with model selection

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify referenced files exist in the repository

echo "=== Looking for referenced files ==="
echo "Searching for model-routing.md..."
fd -t f "model-routing.md" 2>/dev/null || echo "Not found"

echo -e "\nSearching for compare-models-helper.sh..."
fd -t f "compare-models-helper.sh" 2>/dev/null || echo "Not found"

echo -e "\nSearching for headless-dispatch.md..."
fd -t f "headless-dispatch.md" 2>/dev/null || echo "Not found"

echo -e "\n=== All .sh files (potential matches for compare-models-helper) ==="
fd -t f -e sh 2>/dev/null | head -20

echo -e "\n=== Check .agents directory structure ==="
fd -t f . .agents/ 2>/dev/null | head -30

echo -e "\n=== Verify the target file exists ==="
if [ -f ".agents/tools/ai-assistants/models/README.md" ]; then
  echo "Target file exists"
  echo "Content at lines 48-55:"
  sed -n '48,55p' ".agents/tools/ai-assistants/models/README.md"
else
  echo "Target file NOT found"
fi

Repository: marcusquinn/aidevops

Length of output: 2717


Update path reference for consistency.

All referenced files exist. However, the reference to compare-models-helper.sh should be updated to scripts/compare-models-helper.sh to match the path pattern used for other file references in this section:

  • tools/context/model-routing.md (exists)
  • scripts/compare-models-helper.sh (currently referenced without path prefix)
  • tools/ai-assistants/headless-dispatch.md (exists)

Comment on lines +109 to +115
| Tier | Subagent | Primary Model | Fallback |
|------|----------|---------------|----------|
| `haiku` | `models/haiku.md` | claude-3-5-haiku | gemini-2.5-flash |
| `flash` | `models/flash.md` | gemini-2.5-flash | gpt-4.1-mini |
| `sonnet` | `models/sonnet.md` | claude-sonnet-4 | gpt-4.1 |
| `pro` | `models/pro.md` | gemini-2.5-pro | claude-sonnet-4 |
| `opus` | `models/opus.md` | claude-opus-4 | o3 |

🛠️ Refactor suggestion | 🟠 Major

Model identifier naming convention inconsistent with frontmatter.

The tier mapping table uses abbreviated model names:

  • claude-3-5-haiku, claude-sonnet-4, claude-opus-4
  • gemini-2.5-flash, gemini-2.5-pro
  • gpt-4.1, gpt-4.1-mini, o3

However, the actual frontmatter in the model files uses fully-qualified identifiers with provider prefixes and date suffixes:

  • anthropic/claude-sonnet-4-20250514
  • google/gemini-2.5-pro-preview-06-05

Either:

  1. Update this table to match the actual frontmatter identifiers, or
  2. Document the naming convention and ensure the supervisor's model resolver canonicalizes both formats
📋 Option 1: Update table to match frontmatter
 | Tier | Subagent | Primary Model | Fallback |
 |------|----------|---------------|----------|
-| `haiku` | `models/haiku.md` | claude-3-5-haiku | gemini-2.5-flash |
-| `flash` | `models/flash.md` | gemini-2.5-flash | gpt-4.1-mini |
-| `sonnet` | `models/sonnet.md` | claude-sonnet-4 | gpt-4.1 |
-| `pro` | `models/pro.md` | gemini-2.5-pro | claude-sonnet-4 |
-| `opus` | `models/opus.md` | claude-opus-4 | o3 |
+| `haiku` | `models/haiku.md` | anthropic/claude-3-5-haiku-20241022 | google/gemini-2.5-flash-preview-05-20 |
+| `flash` | `models/flash.md` | google/gemini-2.5-flash-preview-05-20 | openai/gpt-4.1-mini |
+| `sonnet` | `models/sonnet.md` | anthropic/claude-sonnet-4-20250514 | openai/gpt-4.1 |
+| `pro` | `models/pro.md` | google/gemini-2.5-pro-preview-06-05 | anthropic/claude-sonnet-4-20250514 |
+| `opus` | `models/opus.md` | anthropic/claude-opus-4-20250514 | openai/o3 |
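The second option in the comment above (a supervisor-side resolver that canonicalizes short names to fully-qualified identifiers) could be sketched as a small shell function. The mappings below mirror the identifiers quoted in this review; the function name and its wiring into the supervisor are hypothetical (the actual resolver landed later, in t132.5).

```shell
#!/bin/sh
# Sketch: canonicalize short tier-table model names to fully-qualified
# provider/model identifiers, as the review's "option 2" describes.
canonicalize_model() {
  case "$1" in
    */*) printf '%s\n' "$1" ;;  # already provider-qualified; pass through
    claude-3-5-haiku)  echo "anthropic/claude-3-5-haiku-20241022" ;;
    claude-sonnet-4)   echo "anthropic/claude-sonnet-4-20250514" ;;
    claude-opus-4)     echo "anthropic/claude-opus-4-20250514" ;;
    gemini-2.5-flash)  echo "google/gemini-2.5-flash" ;;
    gemini-2.5-pro)    echo "google/gemini-2.5-pro" ;;
    gpt-4.1)           echo "openai/gpt-4.1" ;;
    gpt-4.1-mini)      echo "openai/gpt-4.1-mini" ;;
    o3)                echo "openai/o3" ;;
    *) echo "unknown model: $1" >&2; return 1 ;;
  esac
}

canonicalize_model gemini-2.5-pro                      # google/gemini-2.5-pro
canonicalize_model anthropic/claude-sonnet-4-20250514  # passed through unchanged
```

With this in place, both the short names in the tier table and the full frontmatter identifiers resolve to the same canonical form, so the two documents need not be kept byte-identical.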

Comment on lines +110 to +112
|------|----------|---------------|----------|
| `haiku` | `models/haiku.md` | claude-3-5-haiku | gemini-2.5-flash |
| `flash` | `models/flash.md` | gemini-2.5-flash | gpt-4.1-mini |

⚠️ Potential issue | 🟠 Major

References missing files: flash.md and pro.md not included in PR.

The table references models/flash.md and models/pro.md, but these files are not provided in this pull request. This creates broken references.

Ensure these files are added before merging, or remove these rows from the table temporarily.


alex-solovyev added a commit that referenced this pull request Feb 9, 2026
All 8 subtasks of t132 (Cross-Provider Model Routing) are now complete:
- t132.1: Model-specific subagents (PR #758)
- t132.2: Provider/model registry (PR #761)
- t132.3: Model availability checker (PR #770)
- t132.4: Fallback chain config (PR #781)
- t132.5: Supervisor model resolution (PR #787)
- t132.6: Quality gate with escalation (PR #788)
- t132.7: Multi-provider runner/cron support (PR #789)
- t132.8: Cross-model review workflow (PR #791)

Also fixed stale git conflict markers in TODO.md.
