
Conversation

@marcusquinn marcusquinn (Owner) commented Feb 9, 2026

Summary

  • Create model-specific subagent definitions for all 5 routing tiers (haiku, flash, sonnet, pro, opus)
  • Add cross-provider reviewer subagents (gemini-reviewer, gpt-reviewer)
  • Update model-routing.md with subagent reference table

Details

New directory: .agents/tools/ai-assistants/models/

Tier subagents (each declares model:, model-tier:, model-fallback: in frontmatter):

| File | Primary Model | Fallback | Use Case |
|------|---------------|----------|----------|
| `haiku.md` | claude-3-5-haiku | gemini-2.5-flash | Triage, classification, simple transforms |
| `flash.md` | gemini-2.5-flash | gpt-4.1-mini | Large context reads, summarization |
| `sonnet.md` | claude-sonnet-4 | gpt-4.1 | Code implementation, review (default) |
| `pro.md` | gemini-2.5-pro | claude-sonnet-4 | Large codebase analysis |
| `opus.md` | claude-opus-4 | o3 | Architecture, novel problems |
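As a concrete illustration of the three routing keys each tier file declares, a minimal sonnet-tier header might look like the following (the identifiers mirror those quoted later in this review; the `tools:` permissions shown are illustrative):

```yaml
---
model: anthropic/claude-sonnet-4-20250514
model-tier: sonnet
model-fallback: openai/gpt-4.1
tools:
  read: true
  write: true
---
```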

Cross-provider reviewers:

  • gemini-reviewer.md — Gemini-powered code review (1M context)
  • gpt-reviewer.md — GPT-powered code review (second opinion)

Integration pattern:

  • In-session: Task tool invokes subagent by name (model is advisory)
  • Headless: Supervisor reads model: from frontmatter, passes as --model flag to CLI
  • Fallback: model-fallback: field enables resilient dispatch (t132.4)
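The headless path above (supervisor reads `model:` from frontmatter, passes it as a `--model` flag) can be sketched as a small shell helper. This is a sketch only: the `fm_key` helper and the commented-out CLI invocation are hypothetical, not the framework's actual dispatch code.

```shell
#!/bin/sh
# Sketch: extract model: and model-fallback: from a subagent's YAML
# frontmatter, as a headless supervisor might before dispatching.

# Read one frontmatter key from the block between the `---` delimiters.
fm_key() {
  awk -v key="$2: " '
    /^---$/ { fence++; next }
    fence == 1 && index($0, key) == 1 { print substr($0, length(key) + 1); exit }
  ' "$1"
}

# Build a sample subagent file to run against (illustrative content).
agent=$(mktemp)
cat > "$agent" <<'EOF'
---
model: google/gemini-2.5-flash
model-tier: flash
model-fallback: openai/gpt-4.1-mini
---
# Flash Tier Model
EOF

model=$(fm_key "$agent" model)
fallback=$(fm_key "$agent" model-fallback)
echo "primary=$model"
echo "fallback=$fallback"
# A real dispatcher would then try the primary and retry on the fallback:
#   agent-cli --model "$model" "$prompt" || agent-cli --model "$fallback" "$prompt"
rm -f "$agent"
```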

Task

Implements t132.1 from the Cross-Provider Model Routing plan.

Testing

  • All files are markdown documentation (no runtime changes)
  • Frontmatter follows existing conventions (model:, mode:, tools:)
  • model-routing.md updated with cross-references
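Since the frontmatter conventions above are the only machine-readable contract in this PR, a quick conformance check could verify that each tier file declares the required keys. The checker below is illustrative, not part of the framework; it runs against a generated sample rather than the real repo files.

```shell
#!/bin/sh
# Sketch: verify a subagent file declares the routing keys described in
# this PR (model:, model-tier:, model-fallback:).
check_frontmatter() {
  file="$1"
  for key in model model-tier model-fallback; do
    if ! grep -q "^${key}: " "$file"; then
      echo "$file: missing ${key}:"
      return 1
    fi
  done
  echo "$file: ok"
}

# Demo against a generated sample file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
---
model: anthropic/claude-sonnet-4-20250514
model-tier: sonnet
model-fallback: openai/gpt-4.1
---
EOF
check_frontmatter "$sample"   # prints "<tmpfile>: ok"
rm -f "$sample"
```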

Summary by CodeRabbit

  • Documentation
    • Added AI model tier documentation (Haiku, Flash, Sonnet, Pro, Opus) including capabilities, constraints, context windows, and pricing details.
    • Documented cross-provider code reviewer agents (Gemini and GPT-based) with review criteria and severity-based findings format.
    • Expanded model routing guide with tier mappings, subagent definitions, and configuration instructions for headless dispatch.

…2.1)

- Create models/ directory with tier-based subagents:
  haiku.md, flash.md, sonnet.md, pro.md, opus.md
- Create cross-provider reviewer subagents:
  gemini-reviewer.md, gpt-reviewer.md
- Add README.md documenting tier mapping, dispatch patterns,
  and supervisor integration
- Each subagent declares model: with full provider/model ID,
  model-tier: for tier lookup, and model-fallback: for resilience
- Update model-routing.md with subagent reference table
- Enables headless dispatch with --model flag from frontmatter

@coderabbitai coderabbitai bot (Contributor) commented Feb 9, 2026

Walkthrough

This PR introduces comprehensive documentation for model-specific subagents within the AI-assistants tools framework. It adds individual configuration files for five model tiers (haiku, flash, sonnet, pro, opus) and two specialized reviewers (gemini-reviewer, gpt-reviewer), along with a README and model-routing guidance explaining tier mapping, workflow details, and supervisor integration patterns.

Changes

| Cohort / File(s) | Summary |
|------------------|---------|
| **Model Tier Documentation**<br>`models/haiku.md`, `models/flash.md`, `models/sonnet.md`, `models/pro.md`, `models/opus.md` | Five new files documenting lightweight to high-capability model tiers with YAML frontmatter metadata (model identifier, tier, fallback), capabilities summaries, operational constraints, and tabular model details (provider, context window, costs). |
| **Specialized Reviewer Models**<br>`models/gemini-reviewer.md`, `models/gpt-reviewer.md` | Two new cross-provider code reviewer specifications with tool configurations, review focus areas (Correctness, Security, Performance, Maintainability, Conventions), structured output formats, and severity-level guidance. |
| **Model Routing & Reference**<br>`models/README.md`, `context/model-routing.md` | New overview documentation mapping tiers to subagent files, detailing In-Session Task Tool and Headless Dispatch workflows, supervisor integration steps, and instructions for adding new models. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes


Poem

📚 Five tiers rise tall, from swift haiku to opus grand,
Each model finds its place with flair and metadata close at hand,
Reviewers gemini and gpt stand ready for critique,
While routing docs guide supervisors to dispatches sleek,
Zero debt maintained—our docs shine bright! ✨

🚥 Pre-merge checks | ✅ 3 passed

| Check name | Status | Explanation |
|------------|--------|-------------|
| Description Check | ✅ Passed | Check skipped - CodeRabbit's high-level summary is enabled. |
| Title Check | ✅ Passed | The title 'feat: define model-specific subagents for cross-provider routing (t132.1)' directly and clearly describes the main change: introducing model-specific subagent definitions for cross-provider model routing with tier mappings, reviewer agents, and routing documentation. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |


github-actions bot commented Feb 9, 2026

🔍 Code Quality Report

[MONITOR] Code Review Monitoring Report

[INFO] Latest Quality Status:
SonarCloud: 0 bugs, 0 vulnerabilities, 50 code smells

[INFO] Recent monitoring activity:
Mon Feb 9 04:32:59 UTC 2026: Code review monitoring started
Mon Feb 9 04:33:00 UTC 2026: SonarCloud - Bugs: 0, Vulnerabilities: 0, Code Smells: 50

📈 Current Quality Metrics

  • BUGS: 0
  • CODE SMELLS: 50
  • VULNERABILITIES: 0

Generated on: Mon Feb 9 04:33:02 UTC 2026


Generated by AI DevOps Framework Code Review Monitoring

sonarqubecloud bot commented Feb 9, 2026

@marcusquinn marcusquinn merged commit 6dddc70 into main Feb 9, 2026
10 of 11 checks passed
@coderabbitai coderabbitai bot left a comment
Actionable comments posted: 7

🤖 Fix all issues with AI agents
In @.agents/tools/ai-assistants/models/flash.md:
- Line 32: Fix the typo in the constraint phrase by replacing "throughness" with
"thoroughness" in the markdown line that currently reads "Prioritize throughness
of coverage over depth of analysis" inside
.agents/tools/ai-assistants/models/flash.md so the sentence becomes "Prioritize
thoroughness of coverage over depth of analysis".
- Around line 4-46: Update the frontmatter model identifier from
"google/gemini-2.5-flash-preview-05-20" to the GA name
"google/gemini-2.5-flash", correct the pricing values in the "Model Details"
section to Input $0.30/1M and Output $2.50/1M, keep the context value at 1M
tokens, and review/replace the "model-fallback: openai/gpt-4.1-mini" entry
(consider "openai/gpt-4o") to avoid using the retired fallback; ensure you
update both the frontmatter keys (model, model-fallback) and the pricing table
entries to match these changes.

In @.agents/tools/ai-assistants/models/opus.md:
- Line 35: Update the incorrect cost multiplier text: replace the phrase "Cost
is approximately 3x sonnet" with "Cost is approximately 5x sonnet" (or adjust
the pricing table to match a 3x multiplier) so the narrative matches the pricing
table values; locate the string "Cost is approximately 3x sonnet" in opus.md and
ensure the Input ($15.00/1M) and Output ($75.00/1M) entries remain consistent
with the new multiplier.

In @.agents/tools/ai-assistants/models/pro.md:
- Around line 4-46: Replace the deprecated preview frontmatter value "model:
google/gemini-2.5-pro-preview-06-05" with the GA endpoint "model:
google/gemini-2.5-pro" and ensure any other references in this file (the Model
field in the Model Details table) are updated to "gemini-2.5-pro" while leaving
model-tier, model-fallback, pricing, and context values unchanged; confirm the
fallback (anthropic/claude-sonnet-4-20250514) remains intact.

In @.agents/tools/ai-assistants/models/README.md:
- Around line 48-55: Update the README.md reference to the helper script so it
uses the consistent path prefix; replace the bare script name
"compare-models-helper.sh" with "scripts/compare-models-helper.sh" in the
.agents/tools/ai-assistants/models/README.md (look for the line that currently
says "Run `compare-models-helper.sh discover --probe`" and change it to "Run
`scripts/compare-models-helper.sh discover --probe`").

In @.agents/tools/context/model-routing.md:
- Around line 110-112: The table in model-routing.md references missing files
models/flash.md and models/pro.md (rows labeled `flash` and `pro`), which will
create broken links; either add the referenced markdown files (models/flash.md
and models/pro.md) to the PR with the correct content, or remove the
corresponding `flash` and `pro` rows from the table in
.agents/tools/context/model-routing.md until the files are added; update the
table consistently so no links point to non-existent files.
- Around line 109-115: The table in model-routing.md uses short model names
(e.g., claude-3-5-haiku, gemini-2.5-pro, gpt-4.1) that do not match the
fully-qualified frontmatter identifiers in the model files (e.g.,
anthropic/claude-sonnet-4-20250514, google/gemini-2.5-pro-preview-06-05); pick
one fix: either update the table rows for each Tier (haiku, flash, sonnet, pro,
opus) to use the exact frontmatter identifiers from the corresponding model
files (models/haiku.md, models/flash.md, models/sonnet.md, models/pro.md,
models/opus.md), or add a short note in model-routing.md that documents the
naming convention and state that the supervisor’s model resolver will
canonicalize short names to full provider/date-prefixed identifiers (and link to
the resolver behavior); ensure the chosen option is applied consistently for
both Primary Model and Fallback columns.
🧹 Nitpick comments (1)
.agents/tools/ai-assistants/models/pro.md (1)

18-36: Consider referencing AGENTS.md as the authoritative source.

These instructions look good, but adding a brief note pointing to AGENTS.md helps keep a single source of truth for assistant guidance.

Based on learnings "Maintain all AI assistant instructions, documentation, and operational guidance in AGENTS.md as the single source of truth."

Comment on lines +4 to +46
model: google/gemini-2.5-flash-preview-05-20
model-tier: flash
model-fallback: openai/gpt-4.1-mini
tools:
read: true
write: false
edit: false
bash: false
glob: false
grep: false
webfetch: false
task: false
---

# Flash Tier Model

You are a fast, large-context AI assistant optimized for processing large amounts of text efficiently.

## Capabilities

- Reading and summarizing large files or codebases (50K+ tokens)
- Document, PR, and discussion summarization
- Bulk processing (many small tasks in sequence)
- Initial research sweeps before deeper analysis
- Data extraction and formatting

## Constraints

- Prioritize throughness of coverage over depth of analysis
- For complex reasoning tasks, recommend escalation to sonnet or pro tier
- Leverage your large context window (1M tokens) for comprehensive reads
- Keep output structured and scannable

## Model Details

| Field | Value |
|-------|-------|
| Provider | Google |
| Model | gemini-2.5-flash |
| Context | 1M tokens |
| Input cost | $0.15/1M tokens |
| Output cost | $0.60/1M tokens |
| Tier | flash (low cost, large context) |

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

# Check the actual file content
cat -n .agents/tools/ai-assistants/models/flash.md

Repository: marcusquinn/aidevops

Length of output: 1605


🏁 Script executed:

# List other model files to understand naming patterns
find .agents/tools/ai-assistants/models -type f -name "*.md" | head -20

Repository: marcusquinn/aidevops

Length of output: 431


🏁 Script executed:

# Search for model identifier patterns across existing agent instructions
rg "model:" .agents/ -A 2 | head -50

Repository: marcusquinn/aidevops

Length of output: 3192


🌐 Web query:

What is the current official model identifier and pricing for Google Gemini 2.5 Flash (preview-05-20) as of February 2026? Include context window size and cost per 1M tokens.

💡 Result:

As of February 2026, gemini-2.5-flash-preview-05-20 is no longer an active official endpoint on Vertex AI: Google’s Vertex AI release notes say the 2.5 Flash preview endpoints (including gemini-2.5-flash-preview-05-20) were shut down after July 15, 2025. [1]

What to use instead (official, current as of Feb 2026)

Official GA model identifier: gemini-2.5-flash (Gemini API / Vertex AI) [2][3]
Context window: 1,000,000 tokens (1M) [2]

Official pricing (USD) — per 1M tokens

(These are the published Gemini 2.5 Flash prices; the preview-05-20 endpoint itself is retired.)

  • Input (text/image/video): $0.30 / 1M tokens [2][3]
  • Output (includes thinking tokens): $2.50 / 1M tokens [2][3]
  • Audio input (if used): $1.00 / 1M tokens [2][3]
  • Context caching (text/image/video): $0.03 / 1M tokens (plus storage pricing) [2]

Sources: Vertex AI release notes (preview shutdown) [1], Gemini API pricing page [2], Vertex AI generative AI pricing page [3].


🏁 Script executed:

# Check the fallback model and other model configs for consistency
grep -r "model-fallback" .agents/tools/ai-assistants/models/

Repository: marcusquinn/aidevops

Length of output: 668


🏁 Script executed:

# Look at other model files to understand the pattern
cat -n .agents/tools/ai-assistants/models/sonnet.md | head -20

Repository: marcusquinn/aidevops

Length of output: 670


🌐 Web query:

What is the current status of OpenAI gpt-4.1-mini model as of February 2026? Is it still available or has it been deprecated/renamed?

💡 Result:

As of February 9, 2026, GPT-4.1 mini is still available in ChatGPT, but it is scheduled to be retired from ChatGPT on February 13, 2026. OpenAI says there are no API changes at this time, meaning gpt-4.1-mini remains available via the OpenAI API and has not been renamed (no replacement name announced—just a ChatGPT availability change). [1], [2], [3]

So: not deprecated/renamed in the API; being retired specifically in ChatGPT on Feb 13, 2026. [1], [2]

Sources: [1] https://help.openai.com/articles/20001051, [2] https://openai.com/index/retiring-gpt-4o-and-older-models/, [3] https://help.openai.com/en/articles/9624314-model-release-notes


Update primary model to active endpoint and fix pricing table immediately.

The frontmatter uses google/gemini-2.5-flash-preview-05-20, which was retired on July 15, 2025 and is no longer available. Update to the current GA identifier google/gemini-2.5-flash.

The pricing table is significantly incorrect:

  • Listed: Input $0.15/1M, Output $0.60/1M
  • Correct (Feb 2026): Input $0.30/1M, Output $2.50/1M

Context window of 1M tokens is correct. Also review model-fallback: openai/gpt-4.1-mini as it's being phased out (ChatGPT retirement Feb 13, 2026); consider using openai/gpt-4o for consistency with other model configs.

🧰 Tools
🪛 LanguageTool

[grammar] ~32-~32: Ensure spelling is correct
Context: ...ormatting ## Constraints - Prioritize throughness of coverage over depth of analysis - Fo...

(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)



## Constraints

- Prioritize throughness of coverage over depth of analysis

⚠️ Potential issue | 🟡 Minor

Fix typo in constraint (throughness → thoroughness).

Minor doc quality issue.

✏️ Proposed fix
-- Prioritize throughness of coverage over depth of analysis
+- Prioritize thoroughness of coverage over depth of analysis



- Only use this tier when the task genuinely requires it
- Most coding tasks are better served by sonnet tier
- Cost is approximately 3x sonnet -- justify the spend

⚠️ Potential issue | 🟡 Minor

Cost multiplier inaccurate: states "3x sonnet" but table shows 5x.

Line 35 states "Cost is approximately 3x sonnet," but the model details table shows:

  • Input: $15.00/1M (5x sonnet's $3.00)
  • Output: $75.00/1M (5x sonnet's $15.00)

Update line 35 to reflect the actual 5x multiplier or correct the pricing table.

📝 Proposed fix
-  Cost is approximately 3x sonnet -- justify the spend
+  Cost is 5x sonnet (input and output) -- justify the spend

Comment on lines +4 to +46
model: google/gemini-2.5-pro-preview-06-05
model-tier: pro
model-fallback: anthropic/claude-sonnet-4-20250514
tools:
read: true
write: true
edit: true
bash: true
glob: false
grep: true
webfetch: true
task: false
---

# Pro Tier Model

You are a high-capability AI assistant optimized for complex tasks that require both deep reasoning and large context windows.

## Capabilities

- Analyzing very large codebases (100K+ tokens of context)
- Complex reasoning that also needs large context
- Multi-file refactoring across many files
- Comprehensive code review of large PRs
- Cross-referencing documentation with implementation

## Constraints

- Use this tier when both large context AND deep reasoning are needed
- For large context with simple processing, flash tier is more cost-effective
- For deep reasoning with normal context, sonnet tier is sufficient
- For architecture decisions and novel problems, opus tier may be better

## Model Details

| Field | Value |
|-------|-------|
| Provider | Google |
| Model | gemini-2.5-pro |
| Context | 1M tokens |
| Input cost | $1.25/1M tokens |
| Output cost | $10.00/1M tokens |
| Tier | pro (high capability, large context) |

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🌐 Web query:

Google Gemini 2.5 Pro model identifier pricing context window 2025

💡 Result:

  • Model identifier (Gemini API / AI Studio): gemini-2.5-pro. (ai.google.dev)
  • Context window / token limits (2025): Max input: 1,048,576 tokens (~1M). Max output: ~65,536 tokens. (geminidocumentation.com)
  • Pricing (token-based, USD, 2025) — Gemini Developer API (paid tier):
    • Input: $1.25 / 1M tokens (prompts ≤ 200k tokens); $2.50 / 1M tokens (prompts > 200k). (ai.google.dev)
    • Output (includes thinking tokens): $10.00 / 1M tokens (≤ 200k); $15.00 / 1M tokens (> 200k). (ai.google.dev)
    • Context caching: $0.125 / 1M cached input tokens (≤ 200k); $0.25 / 1M cached input tokens (> 200k) plus $4.50 per 1,000,000 tokens per hour for cache storage. (ai.google.dev)
  • Vertex AI pricing (same headline rates): Input $1.25→$2.50 and output $10→$15 per 1M tokens depending on whether the request is ≤200k vs >200k input tokens. (cloud.google.com)
  • Google’s March 2025 announcement: Gemini 2.5 Pro shipped with a 1M-token context window (with “2M coming soon” noted at the time). (blog.google)



🌐 Web query:

gemini-2.5-pro-preview-06-05 current status pricing

💡 Result:

As of Feb 9, 2026, gemini-2.5-pro-preview-06-05 is shut down (Google ended all Gemini 2.5 Pro/Flash preview endpoints, including gemini-2.5-pro-preview-06-05, on July 15, 2025). (cloud.google.com)

What to use instead (current status)

Use the GA model endpoint: gemini-2.5-pro (Gemini 2.5 Pro is generally available). (cloud.google.com)

Current pricing (Gemini API / AI Studio) for gemini-2.5-pro

Standard (per 1M tokens, USD): (ai.google.dev)

  • Input: $1.25 (prompts ≤ 200k tokens) / $2.50 (prompts > 200k)
  • Output (incl. thinking tokens): $10.00 (≤ 200k) / $15.00 (> 200k)
  • Context caching: $0.125 (≤ 200k) / $0.25 (> 200k) + $4.50 / 1M tokens/hour storage

If you meant Vertex AI pricing specifically (it can differ by product/region), tell me your region and whether you’re using Vertex AI vs Gemini Developer API, and I’ll pull the exact table.



🌐 Web query:

Google Gemini API pricing 2025 gemini-2.5-pro

💡 Result:

Gemini API (Google AI for Developers) pricing for gemini-2.5-pro (2025): [1]

  • Standard (paid tier), per 1M tokens (USD):
    • Input: $1.25 (prompts ≤ 200k tokens); $2.50 (prompts > 200k tokens) [1]
    • Output (incl. “thinking” tokens): $10.00 (prompts ≤ 200k); $15.00 (prompts > 200k) [1]
  • Context caching (paid tier): $0.125 / 1M tokens (≤200k); $0.25 / 1M tokens (>200k) + storage $4.50 / 1M tokens/hour [1]
  • Batch (paid tier): Input $0.625 / 1M tokens (prompts ≤ 200k tokens) [1]

This matches Google’s announced 2025 API pricing breakdown (≤200k vs >200k prompt thresholds). [2]

Sources: [1] Google Gemini Developer API pricing page, [2] TechCrunch (Apr 4, 2025).


Update frontmatter to use the current GA endpoint instead of deprecated preview variant.

The google/gemini-2.5-pro-preview-06-05 model was shut down on July 15, 2025. Update the frontmatter to use google/gemini-2.5-pro (the generally available endpoint). The table's model reference and pricing ($1.25/1M input, $10.00/1M output) and context window (1M tokens) are current and correct.

🧰 Tools
🪛 LanguageTool

[style] ~24-~24: As an alternative to the over-used intensifier ‘very’, consider replacing this phrase.
Context: ... windows. ## Capabilities - Analyzing very large codebases (100K+ tokens of context) - C...

(EN_WEAK_ADJECTIVE)


Comment on lines +48 to +55
3. Add to the tier mapping in `model-routing.md`
4. Run `compare-models-helper.sh discover --probe` to verify access

## Related

- `tools/context/model-routing.md` — Cost-aware routing rules
- `compare-models-helper.sh discover` — Detect available providers
- `tools/ai-assistants/headless-dispatch.md` — CLI dispatch with model selection

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify referenced files exist in the repository

echo "=== Looking for referenced files ==="
echo "Searching for model-routing.md..."
fd -t f "model-routing.md" 2>/dev/null || echo "Not found"

echo -e "\nSearching for compare-models-helper.sh..."
fd -t f "compare-models-helper.sh" 2>/dev/null || echo "Not found"

echo -e "\nSearching for headless-dispatch.md..."
fd -t f "headless-dispatch.md" 2>/dev/null || echo "Not found"

echo -e "\n=== All .sh files (potential matches for compare-models-helper) ==="
fd -t f -e sh 2>/dev/null | head -20

echo -e "\n=== Check .agents directory structure ==="
fd -t f . .agents/ 2>/dev/null | head -30

echo -e "\n=== Verify the target file exists ==="
if [ -f ".agents/tools/ai-assistants/models/README.md" ]; then
  echo "Target file exists"
  echo "Content at lines 48-55:"
  sed -n '48,55p' ".agents/tools/ai-assistants/models/README.md"
else
  echo "Target file NOT found"
fi

Repository: marcusquinn/aidevops

Length of output: 2717


Update path reference for consistency.

All referenced files exist. However, the reference to compare-models-helper.sh should be updated to scripts/compare-models-helper.sh to match the path pattern used for other file references in this section:

  • tools/context/model-routing.md (exists)
  • scripts/compare-models-helper.sh (currently referenced without path prefix)
  • tools/ai-assistants/headless-dispatch.md (exists)

Comment on lines +109 to +115
| Tier | Subagent | Primary Model | Fallback |
|------|----------|---------------|----------|
| `haiku` | `models/haiku.md` | claude-3-5-haiku | gemini-2.5-flash |
| `flash` | `models/flash.md` | gemini-2.5-flash | gpt-4.1-mini |
| `sonnet` | `models/sonnet.md` | claude-sonnet-4 | gpt-4.1 |
| `pro` | `models/pro.md` | gemini-2.5-pro | claude-sonnet-4 |
| `opus` | `models/opus.md` | claude-opus-4 | o3 |

🛠️ Refactor suggestion | 🟠 Major

Model identifier naming convention inconsistent with frontmatter.

The tier mapping table uses abbreviated model names:

  • claude-3-5-haiku, claude-sonnet-4, claude-opus-4
  • gemini-2.5-flash, gemini-2.5-pro
  • gpt-4.1, gpt-4.1-mini, o3

However, the actual frontmatter in the model files uses fully-qualified identifiers with provider prefixes and date suffixes:

  • anthropic/claude-sonnet-4-20250514
  • google/gemini-2.5-pro-preview-06-05

Either:

  1. Update this table to match the actual frontmatter identifiers, or
  2. Document the naming convention and ensure the supervisor's model resolver canonicalizes both formats
📋 Option 1: Update table to match frontmatter
 | Tier | Subagent | Primary Model | Fallback |
 |------|----------|---------------|----------|
-| `haiku` | `models/haiku.md` | claude-3-5-haiku | gemini-2.5-flash |
-| `flash` | `models/flash.md` | gemini-2.5-flash | gpt-4.1-mini |
-| `sonnet` | `models/sonnet.md` | claude-sonnet-4 | gpt-4.1 |
-| `pro` | `models/pro.md` | gemini-2.5-pro | claude-sonnet-4 |
-| `opus` | `models/opus.md` | claude-opus-4 | o3 |
+| `haiku` | `models/haiku.md` | anthropic/claude-3-5-haiku-20241022 | google/gemini-2.5-flash-preview-05-20 |
+| `flash` | `models/flash.md` | google/gemini-2.5-flash-preview-05-20 | openai/gpt-4.1-mini |
+| `sonnet` | `models/sonnet.md` | anthropic/claude-sonnet-4-20250514 | openai/gpt-4.1 |
+| `pro` | `models/pro.md` | google/gemini-2.5-pro-preview-06-05 | anthropic/claude-sonnet-4-20250514 |
+| `opus` | `models/opus.md` | anthropic/claude-opus-4-20250514 | openai/o3 |
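The second option in the comment above (a supervisor-side resolver that canonicalizes short names to fully-qualified identifiers) could be sketched as a small shell function. The mappings below mirror the identifiers quoted in this review; the function name and its wiring into the supervisor are hypothetical (the actual resolver landed later, in t132.5).

```shell
#!/bin/sh
# Sketch: canonicalize short tier-table model names to fully-qualified
# provider/model identifiers, as the review's "option 2" describes.
canonicalize_model() {
  case "$1" in
    */*) printf '%s\n' "$1" ;;  # already provider-qualified; pass through
    claude-3-5-haiku)  echo "anthropic/claude-3-5-haiku-20241022" ;;
    claude-sonnet-4)   echo "anthropic/claude-sonnet-4-20250514" ;;
    claude-opus-4)     echo "anthropic/claude-opus-4-20250514" ;;
    gemini-2.5-flash)  echo "google/gemini-2.5-flash" ;;
    gemini-2.5-pro)    echo "google/gemini-2.5-pro" ;;
    gpt-4.1)           echo "openai/gpt-4.1" ;;
    gpt-4.1-mini)      echo "openai/gpt-4.1-mini" ;;
    o3)                echo "openai/o3" ;;
    *) echo "unknown model: $1" >&2; return 1 ;;
  esac
}

canonicalize_model gemini-2.5-pro                      # google/gemini-2.5-pro
canonicalize_model anthropic/claude-sonnet-4-20250514  # passed through unchanged
```

With this in place, both the short names in the tier table and the full frontmatter identifiers resolve to the same canonical form, so the two documents need not be kept byte-identical.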

Comment on lines +110 to +112
|------|----------|---------------|----------|
| `haiku` | `models/haiku.md` | claude-3-5-haiku | gemini-2.5-flash |
| `flash` | `models/flash.md` | gemini-2.5-flash | gpt-4.1-mini |

⚠️ Potential issue | 🟠 Major

References missing files: flash.md and pro.md not included in PR.

The table references models/flash.md and models/pro.md, but these files are not provided in this pull request. This creates broken references.

Ensure these files are added before merging, or remove these rows from the table temporarily.


alex-solovyev added a commit that referenced this pull request Feb 9, 2026
All 8 subtasks of t132 (Cross-Provider Model Routing) are now complete:
- t132.1: Model-specific subagents (PR #758)
- t132.2: Provider/model registry (PR #761)
- t132.3: Model availability checker (PR #770)
- t132.4: Fallback chain config (PR #781)
- t132.5: Supervisor model resolution (PR #787)
- t132.6: Quality gate with escalation (PR #788)
- t132.7: Multi-provider runner/cron support (PR #789)
- t132.8: Cross-model review workflow (PR #791)

Also fixed stale git conflict markers in TODO.md.
