feat: model-level backoff instead of provider-level #4928

Closed
alex-solovyev wants to merge 1 commit into main from feat/model-level-backoff


@alex-solovyev
Collaborator

@alex-solovyev alex-solovyev commented Mar 15, 2026

Summary

  • Backoff is now per-model, not per-provider — same-provider model fallback works
  • Auth errors still block all models of the same provider (shared credentials)

Problem

AIDEVOPS_HEADLESS_MODELS="anthropic/claude-sonnet-4-6,anthropic/claude-opus-4-6"

When Sonnet was rate-limited (HTTP 429), the entire anthropic provider was backed off, so Opus was skipped too even though it has its own separate rate limit. Same-provider fallback did not work.

Changes

Schema migration (init_state_db)

  • provider_backoff table: PK changed from provider to model (full provider/model string)
  • Added provider column for provider-level auth_error lookups
  • DROP TABLE + CREATE migration — backoff rows are ephemeral (15-60 min TTL), no data loss
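
For illustration only, the migration described above could look roughly like the sketch below. It is not the code from this PR: the function name `migrate_backoff_schema` and the `retry_after` column are assumptions, and only `model`, `provider` and `reason` are named in the description. The table is dropped only when it lacks a `model` column, so repeated init calls keep live backoff rows.

```shell
# Hypothetical sketch of a DROP + CREATE migration for provider_backoff.
# retry_after is an assumed column name for the backoff expiry (UTC).
migrate_backoff_schema() {
  local db_path="$1" has_model

  # 1 when the table already has a model column, 0 for the old schema
  # or when the table does not exist yet.
  has_model=$(sqlite3 "$db_path" \
    "SELECT COUNT(*) FROM pragma_table_info('provider_backoff') WHERE name = 'model';")

  if [ "$has_model" = "0" ]; then
    # Old rows are short-lived, so dropping them loses nothing durable.
    sqlite3 "$db_path" "DROP TABLE IF EXISTS provider_backoff;"
  fi

  sqlite3 "$db_path" "CREATE TABLE IF NOT EXISTS provider_backoff (
    model       TEXT PRIMARY KEY,
    provider    TEXT NOT NULL,
    reason      TEXT NOT NULL,
    retry_after TEXT NOT NULL
  );"
}
```

Usage would be along the lines of `migrate_backoff_schema "$STATE_DB"`, where `STATE_DB` is a placeholder for wherever the helper keeps its SQLite file.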

model_backoff_active() (new, replaces provider_backoff_active)

  1. Check model-level backoff first (WHERE model = ?)
  2. If no model-level hit, check provider-level auth_error (WHERE provider = ? AND reason = 'auth_error')
  3. provider_backoff_active() kept as backward-compatible alias
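
A minimal sketch of the two-step lookup above, assuming a `retry_after` column that stores the expiry as an ISO-8601 UTC string and a helper `sql_quote` for escaping (both hypothetical, neither is confirmed by this PR). The real function may take different arguments.

```shell
# Double single quotes so a value can sit inside a SQL string literal.
sql_quote() {
  printf '%s' "$1" | sed "s/'/''/g"
}

# Returns 0 when the model should be skipped, 1 when it may be used.
# retry_after is an assumed column holding a UTC expiry timestamp.
model_backoff_active() {
  local db_path="$1" model="$2"
  local provider m p hits
  provider="${model%%/*}"
  m=$(sql_quote "$model")
  p=$(sql_quote "$provider")

  # Step 1: unexpired backoff recorded for this exact model
  hits=$(sqlite3 "$db_path" "SELECT COUNT(*) FROM provider_backoff
    WHERE model = '$m'
      AND retry_after > strftime('%Y-%m-%dT%H:%M:%SZ', 'now');")
  [ "$hits" -gt 0 ] && return 0

  # Step 2: unexpired auth_error on any model of the same provider
  hits=$(sqlite3 "$db_path" "SELECT COUNT(*) FROM provider_backoff
    WHERE provider = '$p' AND reason = 'auth_error'
      AND retry_after > strftime('%Y-%m-%dT%H:%M:%SZ', 'now');")
  [ "$hits" -gt 0 ]
}
```

With a rate_limit row for `anthropic/claude-sonnet-4-6`, this returns 0 for Sonnet and 1 for `anthropic/claude-opus-4-6`. If the row's reason is `auth_error`, it returns 0 for both.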

record_provider_backoff()

  • Now accepts full model ID (e.g. anthropic/claude-sonnet-4-6), extracts provider internally
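
As a rough sketch of what "extracts provider internally" can mean: the provider is the part of the model ID before the first slash, and the row is upserted on the `model` key. The argument order, the `minutes` TTL parameter, the `retry_after` column and the `sql_quote` helper are assumptions made for this example. The upsert syntax needs SQLite 3.24 or newer.

```shell
# Double single quotes so a value can sit inside a SQL string literal.
sql_quote() {
  printf '%s' "$1" | sed "s/'/''/g"
}

# Hypothetical signature: record_provider_backoff <db> <model> <reason> [minutes]
record_provider_backoff() {
  local db_path="$1" model="$2" reason="$3" minutes="${4:-15}"
  local provider m p r

  # Reject non-numeric TTLs before they reach the SQL text
  case "$minutes" in
  '' | *[!0-9]*) return 2 ;;
  esac

  provider="${model%%/*}" # anthropic/claude-sonnet-4-6 -> anthropic
  m=$(sql_quote "$model")
  p=$(sql_quote "$provider")
  r=$(sql_quote "$reason")

  # model is the conflict target, so a repeat failure refreshes the row
  sqlite3 "$db_path" "
    INSERT INTO provider_backoff (model, provider, reason, retry_after)
    VALUES ('$m', '$p', '$r',
            strftime('%Y-%m-%dT%H:%M:%SZ', 'now', '+$minutes minutes'))
    ON CONFLICT(model) DO UPDATE SET
      provider = excluded.provider,
      reason = excluded.reason,
      retry_after = excluded.retry_after;"
}
```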

choose_model()

  • Passes full model to model_backoff_active(), not just provider name
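
To show how passing the full model ID enables same-provider fallback, here is a simplified sketch of the selection loop. The `model_backoff_active` below is a stand-in that reads a hypothetical `BACKED_OFF` variable instead of the SQLite table, and the real `choose_model()` likely does more than this.

```shell
# Stand-in for the real backoff check (hypothetical): a model counts as
# backed off when it appears in the space-separated BACKED_OFF list.
model_backoff_active() {
  case " ${BACKED_OFF:-} " in
  *" $1 "*) return 0 ;;
  *) return 1 ;;
  esac
}

# Prints the first model in a comma-separated list that is not backed off.
# Returns 1 when every model is currently blocked.
choose_model() {
  local old_ifs="$IFS" model
  IFS=','
  # Intentional word splitting on commas
  set -- $1
  IFS="$old_ifs"

  for model in "$@"; do
    # The full provider/model string is checked, not just the provider
    if ! model_backoff_active "$model"; then
      printf '%s\n' "$model"
      return 0
    fi
  done
  return 1
}
```

With `BACKED_OFF="anthropic/claude-sonnet-4-6"`, calling `choose_model "$AIDEVOPS_HEADLESS_MODELS"` on the list from the Problem section prints `anthropic/claude-opus-4-6`.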

cmd_backoff()

  • status: shows model column
  • clear: accepts model ID or provider name (clears accordingly)
  • set: accepts model ID
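
One plausible way for `clear` to accept either form is to treat any argument containing a slash as a model ID and anything else as a provider name. The sketch below assumes that rule and only builds the DELETE statement. `backoff_clear_sql` and `sql_quote` are hypothetical names, not functions from this PR.

```shell
# Double single quotes so a value can sit inside a SQL string literal.
sql_quote() {
  printf '%s' "$1" | sed "s/'/''/g"
}

# Prints the DELETE statement for a clear target (assumed rule: a slash
# means a full model ID, no slash means a provider name).
backoff_clear_sql() {
  local target="$1" quoted
  quoted=$(sql_quote "$target")
  case "$target" in
  */*) printf "DELETE FROM provider_backoff WHERE model = '%s';\n" "$quoted" ;;
  *) printf "DELETE FROM provider_backoff WHERE provider = '%s';\n" "$quoted" ;;
  esac
}
```

Clearing by provider name removes every row for that provider, which includes any `auth_error` row that was blocking its other models.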

Behaviour matrix

| Scenario | Sonnet | Opus | Correct? |
|---|---|---|---|
| Sonnet rate_limit | blocked | available | Yes, fallback works |
| Sonnet auth_error | blocked | blocked | Yes, auth is shared |
| Clean | available | available | Yes |

Verification

  • bash -n: pass
  • shellcheck: clean (SC1091 info only)
  • Migration tested: old schema → DROP → new schema → queries verified
  • All three scenarios verified with SQLite test DB

Note from maintainer

exclude openai from pulse models, because it's shit at that

OpenAI was already removed from DEFAULT_HEADLESS_MODELS in a prior commit. No change needed.

Closes #4925

Summary by CodeRabbit

  • Refactor

    • Restructured backoff management to track and clear retry state at the model level instead of provider level, enabling more granular control over model performance.
  • Documentation

    • Updated help text and usage messages to reflect model-oriented backoff operations.

Backoff was keyed on provider name — when sonnet hit a rate limit,
all anthropic models (including opus) were backed off together.
Same-provider model fallback didn't work.

Changes:
- provider_backoff table: PK changed from provider to model (full
  provider/model string). DROP+CREATE migration (backoff rows are
  ephemeral, 15-60 min TTL).
- record_provider_backoff(): accepts model ID, extracts provider
- model_backoff_active(): checks model-level first, then falls back
  to provider-level for auth_error (auth is shared per provider)
- choose_model(): passes model to backoff check, not provider
- cmd_backoff(): status/clear/set work with model IDs
- provider_backoff_active() kept as backward-compatible alias

Auth errors still block all models of the same provider — OpenCode
uses one OAuth token / API key per provider, no multi-account.

Closes #4925
@gemini-code-assist

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

github-actions bot added the enhancement label (Auto-created from TODO.md tag) on Mar 15, 2026
@coderabbitai
Contributor

coderabbitai bot commented Mar 15, 2026

Caution

Review failed

Pull request was closed or merged during review

Walkthrough

Refactored backoff tracking in headless-runtime-helper.sh from provider-level to model-level granularity. The provider_backoff table now uses model as the primary key, enabling independent rate-limit handling for multiple models from the same provider while preserving provider-level auth error backoff semantics.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Backoff System Refactor: .agents/scripts/headless-runtime-helper.sh | Redesigned backoff data model from provider-keyed to model-keyed storage; renamed provider_backoff_active() to model_backoff_active() with backward-compatible alias; updated record_provider_backoff() and clear_provider_backoff() to operate on model identifiers; modified all call sites in choose_model(), cmd_run(), cmd_backoff() to check model-level backoff first with provider-level fallback for auth errors; adjusted SQL INSERT/UPDATE statements to use model as conflict target; updated help text and status messages to reflect model-oriented backoff semantics. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

Multiple models, one provider strong,
No more skipped fallbacks when rates go wrong,
Sonnet hits limits, opus breaks free,
Model-level backoff—per-query harmony! 🚀

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 9.09% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
✅ Passed checks (4 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately captures the main change: shifting from provider-level to model-level backoff tracking, which is the core architectural improvement of this PR. |
| Linked Issues check | ✅ Passed | The PR fully addresses all coding requirements from #4925: model-keyed backoff storage, per-model rate-limit checks, preserved provider-level auth blocking, updated functions, and schema migration. |
| Out of Scope Changes check | ✅ Passed | All changes are directly scoped to the backoff system refactoring; no unrelated modifications to other systems or functionality are present. |


@marcusquinn
Owner

Closing as superseded. PR #4927 (same feature: model-level backoff) was merged at 03:37 UTC today. Issue #4925 is already closed with status:done. This PR has a merge conflict and is no longer needed.

marcusquinn added the already-fixed label (Already fixed by another change) on Mar 16, 2026


Development

Successfully merging this pull request may close these issues.

feat: model-level backoff and fallback instead of provider-level

2 participants