t1330: Rate limit tracker for provider utilisation monitoring by marcusquinn · Pull Request #2273 · marcusquinn/aidevops

marcusquinn · 2026-02-25T03:17:38Z

Summary

Adds an `aidevops stats rate-limits` command to `observability-helper.sh` that monitors requests/min and tokens/min per provider from the existing observability DB (t1307), compares them against configurable rate limits, and warns at the 80% threshold
Integrates rate limit awareness into model-availability-helper.sh resolve_tier() so throttle-risk providers are automatically deprioritised in favour of fallback providers
Integrates rate limit warning into dispatch.sh resolve_model() static fallback path

Changes

New: `.agents/configs/rate-limits.json.txt`

Configurable rate limit definitions per provider (requests/min, tokens/min, billing_type). Copy to ~/.config/aidevops/rate-limits.json and adjust for your plan. Covers: anthropic, openai, google, deepseek, openrouter, groq, opencode, xai.
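The copy step described above could be scripted roughly as follows. This is a minimal sketch: the function name `install_rate_limits_config` is hypothetical, and only the two default paths are taken from the PR description. Leaving an existing config untouched is an assumption about what an operator would want, not behaviour confirmed by the PR.

```shell
# Hypothetical helper: copy the rate-limits template into the user config dir.
# Only the two default paths come from the PR description.
install_rate_limits_config() {
	local template="${1:-.agents/configs/rate-limits.json.txt}"
	local target="${2:-$HOME/.config/aidevops/rate-limits.json}"

	if [ ! -f "$template" ]; then
		echo "template not found: $template" >&2
		return 1
	fi
	# Keep an existing config so plan-specific adjustments are not overwritten.
	if [ -e "$target" ]; then
		echo "config already exists: $target" >&2
		return 0
	fi
	mkdir -p "$(dirname "$target")" || return 1
	cp "$template" "$target"
}
```

Calling `install_rate_limits_config` with no arguments from the repository root would use the paths named in the description; the limits in the copied file still need adjusting by hand for the plan in use.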

Extended: `observability-helper.sh`

cmd_rate_limits() — new command showing rolling-window utilisation vs configured limits per provider. Supports --json, --provider, --window flags.
check_rate_limit_risk() — returns ok/warn/critical for a provider based on current utilisation vs configured limits
Supporting helpers: _get_rate_limits_config(), _get_rate_limit_value(), _get_warn_pct(), _get_window_minutes(), _get_configured_providers()
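The ok/warn/critical classification can be illustrated with a small sketch. It is not the PR's `check_rate_limit_risk()`: the function name `classify_utilisation`, its argument order, and the 100% critical threshold are assumptions made for illustration; only the 80% warn default comes from the description.

```shell
# Hypothetical sketch: classify rolling-window utilisation against a per-minute limit.
# usage: classify_utilisation <count_in_window> <window_minutes> <limit_per_min> [warn_pct] [critical_pct]
classify_utilisation() {
	local count="$1" window="$2" limit="$3"
	local warn_pct="${4:-80}"
	local critical_pct="${5:-100}" # assumption: the PR does not state this value

	# No usable limit or window: nothing to compare against.
	if [ "$limit" -le 0 ] || [ "$window" -le 0 ]; then
		echo "ok"
		return 0
	fi

	# Percentage of the limit in use: (count / window) / limit * 100.
	# Integer arithmetic truncates, so 78.3% is treated as 78%.
	local pct=$((count * 100 / (window * limit)))

	if [ "$pct" -ge "$critical_pct" ]; then
		echo "critical"
	elif [ "$pct" -ge "$warn_pct" ]; then
		echo "warn"
	else
		echo "ok"
	fi
}
```

For example, 240 requests observed over a 5-minute window against a limit of 60 requests/min works out to 48 requests/min, which is 80% of the limit and so classifies as `warn` under these assumptions.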

Extended: `model-availability-helper.sh`

_check_provider_rate_limit_risk() — queries observability-helper.sh for rate limit status
_extract_provider() — extracts provider from model spec
resolve_tier() — now checks rate limit risk before selecting primary provider; if primary is at throttle risk (≥warn_pct), prefers fallback provider
cmd_rate_limits() — now also shows observability-derived utilisation alongside API header data
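A rough sketch of the two ideas above, provider extraction and fallback preference, is shown here. It assumes model specs have the form `provider/model-name` and that a fallback is only preferred when one is actually configured; the function names `extract_provider` and `choose_model` and the model names in the usage note are illustrative, not the PR's code.

```shell
# Hypothetical: assumes model specs look like "provider/model-name".
extract_provider() {
	local spec="$1"
	case "$spec" in
	*/*) echo "${spec%%/*}" ;; # text before the first slash
	*) echo "" ;;              # no provider prefix; the caller decides what to do
	esac
}

# Prefer the fallback when the primary provider is at throttle risk
# and a fallback is configured; otherwise keep the primary.
choose_model() {
	local primary="$1" fallback="$2" risk="$3"
	case "$risk" in
	warn | critical)
		if [ -n "$fallback" ]; then
			echo "$fallback"
			return 0
		fi
		;;
	esac
	echo "$primary"
}
```

With these definitions, `choose_model "anthropic/model-a" "openai/model-b" warn` selects the fallback, while an empty fallback or a risk of `ok` leaves the primary in place.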

Extended: `dispatch.sh`

resolve_model() — adds rate limit risk check in static fallback path; logs warning when anthropic is at throttle risk, prompting operator to configure alternative providers
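The warn-only behaviour of the static fallback path might look something like the sketch below. The function name `warn_on_throttle_risk` and the message wording are hypothetical. The point is that the warning goes to stderr and the function always succeeds, so model selection itself is unaffected.

```shell
# Hypothetical sketch: log a warning for a throttle-risk provider without
# changing which model gets selected.
warn_on_throttle_risk() {
	local provider="$1" risk="$2"
	case "$risk" in
	warn | critical)
		printf 'WARN: %s rate limit status is %s; consider configuring alternative providers\n' \
			"$provider" "$risk" >&2
		;;
	esac
	return 0
}
```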

Acceptance Criteria

Rate limit definitions exist per provider (requests_per_min, tokens_per_min) in .agents/configs/rate-limits.json.txt
aidevops stats rate-limits shows current utilisation per provider
When a provider exceeds 80% of its rate limit, model routing prefers alternatives (via resolve_tier() in model-availability-helper.sh)
Rate limit data derived from existing observability SQLite DB — no new data collection
Works with both token-billed and subscription providers (billing_type field)
ShellCheck clean — zero errors/warnings on all modified .sh files

Testing

# Show rate limit utilisation
aidevops stats rate-limits

# JSON output for scripting
aidevops stats rate-limits --json

# Filter by provider
aidevops stats rate-limits --provider anthropic --json

# Via model-availability-helper
.agents/scripts/model-availability-helper.sh rate-limits

Ref: t1307 (observability DB), t1100 (budget-aware routing), GH#2263

Summary by CodeRabbit

New Features
- CLI rate-limits command to view per-provider utilization (ok/warn/critical) with JSON output and filtering.
- Automatic routing notices and fallback suggestions when a provider shows throttle risk.
- Observability now surfaces rate-limit utilization alongside API-derived status.
Documentation
- Added user-configurable rate-limits template, help text describing config location, fields, and usage.
Chores
- Package/tooling version bumped to 1.1.0.

coderabbitai · 2026-02-25T03:17:58Z

Warning

Rate limit exceeded

@marcusquinn has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 15 minutes and 49 seconds before requesting another review.

📥 Commits

Reviewing files that changed from the base of the PR and between 2807eb9 and 9cd012a.

📒 Files selected for processing (3)

.agents/scripts/model-availability-helper.sh
.agents/scripts/observability-helper.sh
.agents/scripts/supervisor-archived/dispatch.sh

Walkthrough

Adds rate-limit awareness across the agents: a new rate-limits config template, observability-helper rate-limit tracking and CLI, model-availability-helper provider extraction and throttle-risk checks, and supervisor dispatch warnings when providers (notably Anthropic) hit warn/critical thresholds.

Changes

Cohort / File(s)	Summary
Rate-limits config `.agents/configs/rate-limits.json.txt`	New template defining per-provider limits (requests/tokens per minute), global metadata (warn_pct, window_minutes, threshold), and billing types.
Observability helper `.agents/scripts/observability-helper.sh`	Adds rate-limit tracking (t1330): new config constants, config readers, provider value getters, risk scoring (check_rate_limit_risk), provider listing, `cmd_rate_limits()` CLI, help/dispatch wiring, and version bump to 1.1.0.
Model availability helper `.agents/scripts/model-availability-helper.sh`	Adds `_extract_provider()` (model → provider mapping) and `_check_provider_rate_limit_risk()` to query observability; integrates throttle-risk checks into tier resolution and rate-limit/status display.
Supervisor dispatch `.agents/scripts/supervisor/dispatch.sh`	Emits runtime warnings when Anthropic rate-limit status is warn/critical during model resolution (two code locations); no core selection changes.
Acceptance / tasks `todo/tasks/t1330-brief.md`	Adds acceptance-check items for rate-limit definitions and a shellcheck verification block; no runtime logic changes.

Sequence Diagram(s)

sequenceDiagram
    participant ModelSelector as ModelSelector
    participant RLChecker as RateLimitChecker
    participant ObsHelper as ObservabilityHelper
    participant Config as RateLimitsConfig
    participant ProviderResolver as ProviderResolver

    ModelSelector->>RLChecker: ask risk for <model_id>
    RLChecker->>Config: read provider thresholds (window, warn_pct)
    Config-->>RLChecker: return thresholds
    RLChecker->>ObsHelper: query utilization metrics (requests/tokens)
    ObsHelper-->>RLChecker: return utilization data
    RLChecker->>RLChecker: compute status (ok / warn / critical)
    RLChecker-->>ModelSelector: return risk

    alt risk == warn or critical
        ModelSelector->>ProviderResolver: prefer fallback provider
    else risk == ok
        ModelSelector->>ProviderResolver: use primary provider
    end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

fix: enforce Anthropic-only model routing for all worker dispatch #799 — Modifies Anthropic model-resolution/fallback behavior; related due to changes that affect Anthropic fallback handling.

Suggested labels

needs-review

🚥 Pre-merge checks | ✅ 3

✅ Passed checks (3 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The pull request title clearly and specifically describes the main feature being introduced—a rate limit tracker for provider utilization monitoring—which directly maps to the primary changes across observability-helper.sh, model-availability-helper.sh, and related configuration.
Docstring Coverage	✅ Passed	Docstring coverage is 94.12% which is sufficient. The required threshold is 80.00%.

gemini-code-assist · 2026-02-25T03:17:58Z

Summary of Changes

Hello @marcusquinn, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the AI model routing system by integrating comprehensive rate limit awareness. It introduces a new command for monitoring provider utilization, enabling users to track requests and tokens per minute against defined limits. Crucially, the system now intelligently deprioritizes providers at risk of throttling, ensuring more resilient and efficient model selection. This proactive approach helps avoid service interruptions due to exceeding API rate limits.

Highlights

Rate Limit Monitoring Command: Introduced a new aidevops stats rate-limits command to monitor requests/min and tokens/min per provider from the observability database, comparing against configurable rate limits and warning at an 80% threshold.
Dynamic Provider Deprioritization: Integrated rate limit awareness into model-availability-helper.sh's resolve_tier() function, automatically deprioritizing throttle-risk providers in favor of fallback providers.
Static Fallback Rate Limit Warning: Integrated a rate limit warning into dispatch.sh's resolve_model() static fallback path, logging a warning if a primary provider (e.g., Anthropic) is at throttle risk.
Configurable Rate Limit Definitions: Added a new configuration file, .agents/configs/rate-limits.json.txt, allowing users to define rate limits (requests/min, tokens/min, billing_type) for various AI providers.

Changelog

.agents/configs/rate-limits.json.txt
- Added a new JSON configuration file to define rate limits (requests/min, tokens/min, billing_type) for various AI providers, including a warning threshold and window minutes.
.agents/scripts/model-availability-helper.sh
- Introduced _check_provider_rate_limit_risk to query observability-helper.sh for a provider's rate limit status.
- Added _extract_provider to parse the provider name from a model specification.
- Modified resolve_tier to check the primary provider's rate limit risk and prefer a fallback provider if the primary is at throttle risk.
- Updated cmd_rate_limits to display observability-derived utilization alongside API header data.
.agents/scripts/observability-helper.sh
- Added rate-limits as a new command option.
- Updated the script version to 1.1.0.
- Defined new constants for rate limit configuration file paths and default values.
- Implemented helper functions: _get_rate_limits_config, _get_rate_limit_value, _get_warn_pct, _get_window_minutes, _get_configured_providers for managing rate limit settings.
- Created check_rate_limit_risk to determine a provider's risk status (ok, warn, critical) based on current usage and configured limits.
- Developed cmd_rate_limits to display detailed rate limit utilization, supporting JSON output, provider filtering, and window customization.
- Extended cmd_help with new options and examples for the rate-limits command.
- Integrated cmd_rate_limits into the main command dispatch logic.
.agents/scripts/supervisor/dispatch.sh
- Updated comments in resolve_model to reflect the new rate limit awareness in model-availability-helper.sh.
- Added a rate limit check within resolve_model's static fallback path to log warnings if Anthropic is at throttle risk, prompting users to configure alternatives.

Activity

No human activity has been recorded on this pull request yet.

Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩

github-actions · 2026-02-25T03:18:19Z

🔍 Code Quality Report

[MONITOR] Code Review Monitoring Report

[INFO] Latest Quality Status:
SonarCloud: 0 bugs, 0 vulnerabilities, 56 code smells

[INFO] Recent monitoring activity:
Wed Feb 25 03:18:14 UTC 2026: Code review monitoring started
Wed Feb 25 03:18:15 UTC 2026: SonarCloud - Bugs: 0, Vulnerabilities: 0, Code Smells: 56

📈 Current Quality Metrics

BUGS: 0
CODE SMELLS: 56
VULNERABILITIES: 0

Generated on: Wed Feb 25 03:18:17 UTC 2026

Generated by AI DevOps Framework Code Review Monitoring

gemini-code-assist

Code Review

This pull request introduces a valuable rate-limiting awareness mechanism to improve provider routing. The implementation is well-structured, adding a new rate-limits command and integrating it into the model resolution logic. My review focuses on improving the robustness and maintainability of the new shell scripts. Specifically, I've pointed out several instances where suppressing stderr can hide important errors, and a case where manually constructing JSON can be unsafe. Addressing these points will make the new functionality more reliable and easier to debug.

.agents/scripts/model-availability-helper.sh

.agents/scripts/observability-helper.sh

.agents/scripts/supervisor/dispatch.sh

augmentcode · 2026-02-25T03:22:25Z

🤖 Augment PR Summary

Summary: Introduces rate-limit utilisation monitoring per provider and makes routing aware of throttle risk.

Changes:

Adds a configurable provider rate-limit template at .agents/configs/rate-limits.json.txt (requests/min, tokens/min, billing type, thresholds).
Extends observability-helper.sh with a new rate-limits command that reads the existing observability SQLite DB and reports rolling-window utilisation, with optional JSON output and filtering.
Adds check_rate_limit_risk() and supporting config helpers to classify providers as ok/warn/critical based on utilisation vs configured limits.
Updates model-availability-helper.sh tier resolution to prefer a fallback provider when the primary provider is at throttle risk.
Enhances model-availability-helper.sh rate-limits output to include observability-derived utilisation alongside API header-derived limits.
Updates supervisor dispatch.sh static fallback path to warn when Anthropic appears throttle-risk, prompting configuration of alternative providers.

Technical Notes: Rate-limit data is derived from the existing observability DB (no new collection) and uses a configurable rolling window with a default warn threshold of 80%.

augmentcode

Review completed. 2 suggestions posted.

.agents/scripts/model-availability-helper.sh

.agents/scripts/observability-helper.sh

coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.agents/scripts/model-availability-helper.sh:
- Around line 944-945: The code is overwriting the rate-limit status returned by
_check_provider_rate_limit_risk by using "|| rl_risk=\"ok\"" which loses
"warn"/"critical" outputs; instead, capture the function output into rl_risk
regardless of exit code (call _check_provider_rate_limit_risk
"$primary_provider" and allow non-zero exits), then only apply a default if
rl_risk is empty (e.g., set rl_risk=${rl_risk:-ok}); ensure you still redirect
stderr as before and then use rl_risk in the subsequent check that compares to
"warn" or "critical".

In @.agents/scripts/observability-helper.sh:
- Around line 1266-1296: Validate and sanitize the window_minutes value returned
by _get_window_minutes before embedding it into SQL: ensure window_minutes is a
positive integer (e.g., use a regex like '^[0-9]+$' or arithmetic casting) and
fallback to a safe default if invalid; replace direct interpolation of
"${window_minutes}" in db_query calls with the validated/clamped variable (same
change for occurrences around the other db_query blocks noted) and do the same
for any effective_window variables so SQL date modifiers only ever receive a
numeric value, preventing malformed queries and injection via CLI/config.
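Both findings above can be reproduced in isolation. The sketch below uses a stand-in `fake_risk_check` in place of `_check_provider_rate_limit_risk`. The convention that the real function exits non-zero for warn/critical is an assumption inferred from the review comment. The function names and the fallback window of 1 minute are hypothetical.

```shell
# Stand-in for _check_provider_rate_limit_risk: prints a status and exits
# non-zero for anything other than "ok" (exit-code convention is assumed).
fake_risk_check() {
	echo "$1"
	[ "$1" = "ok" ]
}

# Flagged pattern: the || branch fires on a non-zero exit and overwrites
# the captured "warn"/"critical" output with "ok".
risk_buggy() {
	local rl_risk
	rl_risk="$(fake_risk_check "$1" 2>/dev/null)" || rl_risk="ok"
	echo "$rl_risk"
}

# Suggested pattern: keep the output regardless of exit code and
# default to "ok" only when nothing was printed.
risk_fixed() {
	local rl_risk
	rl_risk="$(fake_risk_check "$1" 2>/dev/null)" || true
	echo "${rl_risk:-ok}"
}

# Accept only positive integers for the window before it reaches SQL;
# anything else falls back to a safe value.
sanitise_window_minutes() {
	local value="$1" fallback="${2:-1}"
	case "$value" in
	'' | *[!0-9]*) echo "$fallback" ;; # empty or contains a non-digit
	*)
		if [ "$value" -gt 0 ]; then
			echo "$value"
		else
			echo "$fallback"
		fi
		;;
	esac
}
```

Here `risk_buggy warn` prints `ok` while `risk_fixed warn` prints `warn`, which is the difference the first finding describes. For the second finding, an input such as `5; DROP TABLE x` contains non-digit characters and is replaced by the fallback, so only a plain number is ever interpolated into the query.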

ℹ️ Review info

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 76d1493 and 35be5c0d1822cba1d8c0e317927e3f71b7709e98.

📒 Files selected for processing (4)

.agents/configs/rate-limits.json.txt
.agents/scripts/model-availability-helper.sh
.agents/scripts/observability-helper.sh
.agents/scripts/supervisor/dispatch.sh

.agents/scripts/model-availability-helper.sh

.agents/scripts/observability-helper.sh

marcusquinn · 2026-02-25T05:45:36Z

Flagged for Human Review

Reason: PR #2273 (t1330) is OPEN, but task t1330 was cancelled in the supervisor DB. The PR has 1 review requesting changes and 2 comments. Decide whether to close this PR or merge the useful work and update the DB state.