t233: Pre-dispatch model availability check with distinct exit codes by marcusquinn · Pull Request #959 · marcusquinn/aidevops

marcusquinn · 2026-02-10T17:50:04Z

Summary

Enhanced check_model_health() to propagate distinct exit codes from model-availability-helper.sh (0=healthy, 1=unavailable, 2=rate-limited, 3=invalid-key) instead of collapsing all failures to exit 1
Updated cmd_dispatch() and cmd_reprompt() to handle distinct codes with specific log messages and appropriate defer/block behavior
Added missing health check to dispatch_review_fix_worker() (previously spawned workers with no availability check at all)
Added secondary provider check after model resolution in cmd_dispatch() to catch cases where the dispatch model's provider differs from the health tier provider

Problem

9 wasted worker failures from ambiguous_ai_unavailable + backend_quota_error caused by workers being spawned against unavailable/rate-limited providers. The root cause was check_model_health() collapsing exit codes 2 (rate-limited) and 3 (invalid-key) from the availability helper into a single return 1, preventing callers from distinguishing transient rate-limits from permanent key failures.

Additionally, dispatch_review_fix_worker() had no health check at all, spawning review-fix workers blindly regardless of provider status.
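
The difference between the collapsed and the propagating behaviour can be sketched as follows. This is a minimal illustration, not the PR's code: `availability_helper` is a hypothetical stand-in for model-availability-helper.sh, driven by a made-up `FAKE_HELPER_EXIT` variable, and both `check_model_health_*` names are invented for the comparison.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for model-availability-helper.sh. It exits with the
# code in FAKE_HELPER_EXIT (0=healthy, 1=unavailable, 2=rate-limited,
# 3=invalid-key), which is an assumption made for this sketch only.
availability_helper() {
    return "${FAKE_HELPER_EXIT:-0}"
}

# Old behaviour as described in the Problem section: any failure becomes 1,
# so the caller cannot tell a rate limit from a bad key.
check_model_health_collapsed() {
    if ! availability_helper; then
        return 1
    fi
    return 0
}

# New behaviour: capture the helper's exit code and pass 2 and 3 through.
# Any other non-zero code is reported as generic unavailability (1).
check_model_health_propagating() {
    local helper_exit=0
    availability_helper || helper_exit=$?
    case "$helper_exit" in
        0)
            return 0
            ;;
        2)
            return 2
            ;;
        3)
            return 3
            ;;
        *)
            return 1
            ;;
    esac
}
```

With `FAKE_HELPER_EXIT=2`, the collapsed variant returns 1 while the propagating variant returns 2, which is the information callers need to choose between deferring and blocking.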

Changes

Function	Change
`check_model_health()`	Propagates exit 2 (rate-limited) and exit 3 (invalid-key) from availability helper; slow-path probe now distinguishes credits/billing (exit 3) from quota (exit 2) from generic errors (exit 1)
`cmd_dispatch()`	Handles distinct exit codes with case statement; adds secondary availability check for resolved model's provider when it differs from health tier
`cmd_reprompt()`	Handles distinct exit codes with case statement; still returns EX_TEMPFAIL (75) to avoid double-incrementing retry counter
`dispatch_review_fix_worker()`	New: pre-dispatch availability check (was completely missing)
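
As a rough sketch of the caller-side handling in the table, the function below logs a different message per failure type but always returns EX_TEMPFAIL (75), mirroring what the PR says about cmd_reprompt(). The stubs (`log_warn`, `log_error`, `check_model_health`, `FAKE_HEALTH_EXIT`) and the name `reprompt_sketch` are assumptions so the example runs on its own; the real function bodies are not shown in this PR page.

```shell
#!/usr/bin/env bash
EX_TEMPFAIL=75  # value named in the PR description

# Hypothetical stubs, present only so the sketch is self-contained.
log_warn()  { echo "[WARN] $*" >&2; }
log_error() { echo "[ERROR] $*" >&2; }
check_model_health() { return "${FAKE_HEALTH_EXIT:-0}"; }

# Reprompt-style caller: the log message depends on the failure type, but
# the return value is always EX_TEMPFAIL so that a retry counter kept by the
# caller is not incremented twice.
reprompt_sketch() {
    local task_id="$1"
    local health_exit=0
    check_model_health || health_exit=$?
    if [ "$health_exit" -ne 0 ]; then
        case "$health_exit" in
            2)
                log_warn "Provider rate-limited for $task_id, deferring"
                ;;
            3)
                log_error "API key invalid or credits exhausted for $task_id"
                ;;
            *)
                log_warn "Provider unavailable for $task_id, deferring"
                ;;
        esac
        return "$EX_TEMPFAIL"
    fi
    echo "dispatching $task_id"
}
```

Keeping the return value uniform while varying only the log line means existing retry accounting is untouched and operators still see which failure occurred.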

Testing

bash -n syntax check: PASS
shellcheck --severity=warning: No new warnings (all existing are pre-existing SC2034 unused vars in unrelated functions)
No functional test changes needed — the exit code propagation is backward-compatible (callers that used if ! check_model_health still work since non-zero is still falsy)
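
The backward-compatibility claim can be checked with a small sketch. `legacy_dispatch` is a hypothetical caller written in the older `if ! check_model_health` style, and the stubbed `check_model_health` driven by `FAKE_HEALTH_EXIT` is an assumption for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical stub: returns whatever exit code the test asks for.
check_model_health() { return "${FAKE_HEALTH_EXIT:-0}"; }

# Legacy-style caller that only distinguishes zero from non-zero.
legacy_dispatch() {
    if ! check_model_health; then
        echo "deferred"
        return 1
    fi
    echo "dispatched"
}
```

Because `!` treats every non-zero status the same way, exit codes 1, 2, and 3 all take the "deferred" branch, so callers that were not updated behave as before.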

…(t233)

- Enhanced check_model_health() to propagate exit codes from model-availability-helper.sh: exit 0=healthy, 1=unavailable, 2=rate-limited, 3=invalid-key/credits (previously collapsed all failures to exit 1)
- Updated cmd_dispatch() to handle distinct codes with specific log messages
- Updated cmd_reprompt() to handle distinct codes (still returns EX_TEMPFAIL=75)
- Added missing health check to dispatch_review_fix_worker() (previously had none)
- Added secondary provider check after model resolution in cmd_dispatch() (catches case where health tier provider differs from dispatch model provider)
- Slow-path probe now distinguishes credits/billing (exit 3) from quota/rate-limit (exit 2) from generic provider errors (exit 1)

Prevents wasted dispatch attempts when provider is rate-limited or down. Root cause: 9 failures from ambiguous_ai_unavailable + backend_quota_error were caused by workers spawned against unavailable providers.

coderabbitai · 2026-02-10T17:50:11Z

Warning

Rate limit exceeded

@marcusquinn has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 15 minutes and 59 seconds before requesting another review.

gemini-code-assist · 2026-02-10T17:50:22Z

Summary of Changes

Hello @marcusquinn, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly improves the robustness of AI model dispatch and retry mechanisms by introducing more granular availability checks. Previously, various provider failures were collapsed into a single error, leading to wasted worker attempts against unavailable or rate-limited AI services. The changes now allow the system to distinguish between transient issues like rate limiting, persistent problems like invalid API keys, and general unavailability, enabling more intelligent deferral or blocking of tasks. This reduces unnecessary resource consumption and provides clearer diagnostics for operational issues.

Highlights

Enhanced Model Health Check: The check_model_health() function now propagates distinct exit codes (0=healthy, 1=unavailable, 2=rate-limited, 3=invalid-key) from model-availability-helper.sh, providing more granular status information.
Improved Dispatch and Reprompt Handling: The cmd_dispatch() and cmd_reprompt() functions have been updated to interpret these distinct exit codes, allowing for specific log messages and appropriate deferral or blocking behavior based on the availability status.
Added Health Check for Review Fix Workers: A previously missing pre-dispatch availability check has been added to dispatch_review_fix_worker(), preventing the blind spawning of workers against unavailable or rate-limited providers.
Secondary Provider Availability Check: A new check was introduced in cmd_dispatch() to verify the availability of the resolved model's specific provider, especially when it differs from the initial health tier provider.

Changelog

.agents/scripts/supervisor-helper.sh
- Updated check_model_health() to propagate distinct exit codes (2 for rate-limited, 3 for invalid API key) from the availability helper.
- Modified check_model_health()'s slow path to differentiate between billing/credits exhaustion (exit 3), quota/rate limiting (exit 2), and generic provider errors (exit 1).
- Refactored cmd_dispatch() to handle the new distinct exit codes from check_model_health() using a case statement for specific logging and return behavior.
- Added a secondary availability check in cmd_dispatch() to verify the resolved model's provider if it differs from the health tier provider.
- Updated cmd_reprompt() to process the distinct exit codes from check_model_health() with tailored log messages, while maintaining EX_TEMPFAIL return for transient issues.
- Implemented a pre-dispatch availability check in dispatch_review_fix_worker() to prevent spawning workers when the AI provider is unavailable.

Activity

No human activity has occurred on this pull request yet.

github-actions · 2026-02-10T17:50:43Z

🔍 Code Quality Report

[MONITOR] Code Review Monitoring Report

[INFO] Latest Quality Status:
SonarCloud: 0 bugs, 0 vulnerabilities, 36 code smells

[INFO] Recent monitoring activity:
Tue Feb 10 17:50:38 UTC 2026: Code review monitoring started
Tue Feb 10 17:50:39 UTC 2026: SonarCloud - Bugs: 0, Vulnerabilities: 0, Code Smells: 36

📈 Current Quality Metrics

BUGS: 0
CODE SMELLS: 36
VULNERABILITIES: 0

Generated on: Tue Feb 10 17:50:41 UTC 2026

Generated by AI DevOps Framework Code Review Monitoring

gemini-code-assist

Code Review

This pull request is a significant improvement to the supervisor's robustness by introducing a pre-dispatch model availability check with distinct exit codes. Propagating specific failure reasons like rate-limiting and invalid API keys from check_model_health() will prevent wasting worker resources on doomed dispatches. The updates to cmd_dispatch(), cmd_reprompt(), and the addition of a missing check in dispatch_review_fix_worker() are well-executed and clearly address the problems outlined. My review includes a few suggestions to improve consistency and fully align the changes with the PR's goal of handling distinct exit codes, particularly in the new secondary provider check. I've also noted a few violations of the repository's shell scripting style guide regarding error stream handling. Overall, this is a valuable change that hardens the dispatch logic.

gemini-code-assist · 2026-02-10T17:53:31Z

.agents/scripts/supervisor-helper.sh

+                "$availability_helper" check "$resolved_provider" --quiet 2>/dev/null || resolved_avail_exit=$?
+                if [[ "$resolved_avail_exit" -ne 0 ]]; then
+                    log_warn "Resolved model provider '$resolved_provider' unavailable (exit $resolved_avail_exit) for $task_id — deferring dispatch"
+                    return 3
+                fi


This secondary availability check has two issues:

It uses 2>/dev/null for blanket error suppression, which violates the repository style guide (rule #50).

It doesn't differentiate between failure types from the availability helper (e.g., rate-limit vs. invalid key), treating all non-zero exits as "unavailable". This undermines the main goal of this PR, which is to propagate and handle distinct exit codes.

The logic should be expanded to handle distinct exit codes with specific logging, and stderr should not be blanket suppressed.

Suggested change

-                "$availability_helper" check "$resolved_provider" --quiet 2>/dev/null || resolved_avail_exit=$?
-                if [[ "$resolved_avail_exit" -ne 0 ]]; then
-                    log_warn "Resolved model provider '$resolved_provider' unavailable (exit $resolved_avail_exit) for $task_id — deferring dispatch"
-                    return 3
-                fi
+                "$availability_helper" check "$resolved_provider" --quiet || resolved_avail_exit=$?
+                if [[ "$resolved_avail_exit" -ne 0 ]]; then
+                    case "$resolved_avail_exit" in
+                        2)
+                            log_warn "Resolved model provider '$resolved_provider' is rate-limited (exit $resolved_avail_exit) for $task_id — deferring dispatch"
+                            ;;
+                        3)
+                            log_error "Resolved model provider '$resolved_provider' has invalid key/credits (exit $resolved_avail_exit) for $task_id — blocking dispatch"
+                            ;;
+                        *)
+                            log_warn "Resolved model provider '$resolved_provider' unavailable (exit $resolved_avail_exit) for $task_id — deferring dispatch"
+                            ;;
+                    esac
+                    return 3
+                fi

References

Rule #50: 2>/dev/null is acceptable ONLY when redirecting to log files, not blanket suppression. The code uses 2>/dev/null to suppress potential errors from the helper script without logging.

In shell scripts, capture a command's exit code in a variable instead of using $? directly in conditionals. This aligns with ShellCheck SC2181 and improves clarity for multi-way branches.

Fixed in ed1fc41: Removed blanket 2>/dev/null and expanded the secondary availability check to handle distinct exit codes (2=rate-limited, 3=invalid-key/credits, *=generic unavailable) with specific logging for each. This now correctly propagates the exit code semantics that are the core goal of t233.
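
One way to keep helper diagnostics without blanket suppression is to append stderr to a log file while still capturing the exit code. The sketch below is an assumption-laden illustration rather than what ed1fc41 does (the fix simply removed the redirection): `run_check_logged` and the `SUPERVISOR_LOG` path are hypothetical names.

```shell
#!/usr/bin/env bash
# Hypothetical log location; override it in the environment as needed.
SUPERVISOR_LOG="${SUPERVISOR_LOG:-/tmp/supervisor-sketch.log}"

# Run the given command, append its stderr to the log file, and return the
# command's own exit code. Initialising rc to 0 matters: "|| rc=$?" only
# assigns on failure, so an uninitialised variable would be empty on success.
run_check_logged() {
    local rc=0
    "$@" 2>>"$SUPERVISOR_LOG" || rc=$?
    return "$rc"
}
```

A call such as `run_check_logged "$availability_helper" check "$resolved_provider" --quiet || resolved_avail_exit=$?` would then preserve both the distinct exit code and any error text the helper printed.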

gemini-code-assist · 2026-02-10T17:53:32Z

.agents/scripts/supervisor-helper.sh

+    if echo "$probe_result" | grep -qiE 'CreditsError|Insufficient balance' 2>/dev/null; then
+        log_warn "Model health check FAILED: billing/credits exhausted (slow path)"
+        return 3  # t233: credits = invalid key equivalent (won't resolve without human action)
+    fi
+    if echo "$probe_result" | grep -qiE 'Quota protection|over[_ -]?usage|quota reset|429|too many requests|rate.limit' 2>/dev/null; then
+        log_warn "Model health check FAILED: quota/rate limited (slow path)"
+        return 2  # t233: rate-limited = defer dispatch, retry soon
+    fi
+    if echo "$probe_result" | grep -qiE 'endpoints failed|"status":[[:space:]]*503|HTTP 503|503 Service|service unavailable' 2>/dev/null; then
+        log_warn "Model health check FAILED: provider error detected (slow path)"
        return 1
    fi


The grep commands in this block use 2>/dev/null to suppress stderr. This is a violation of the repository style guide (rule #50), which prohibits blanket error suppression and allows 2>/dev/null only when redirecting to log files. While the risk of grep erroring here is low, adhering to the style guide improves script robustness and debuggability. Please consider removing the suppression or redirecting stderr to a log if appropriate.

References

Rule #50: 2>/dev/null is acceptable ONLY when redirecting to log files, not blanket suppression. The code uses 2>/dev/null to suppress potential errors from grep without logging.

Fixed in ed1fc41: Removed 2>/dev/null from all three grep commands in the failure-pattern detection block. These pipe from echo so stderr errors are unlikely, but removing the suppression aligns with the style guide and improves debuggability.
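
The pattern checks in this thread can be read as an ordered classifier: credits first, then rate limits, then generic provider errors. The sketch below reuses the three regular expressions from the diff without the stderr suppression. The function name `classify_probe_result` and the final `return 0` for unmatched output are assumptions, since the surrounding function body is not shown here.

```shell
#!/usr/bin/env bash
# Classify probe output into the exit codes used by t233:
#   3 = billing/credits exhausted, 2 = quota/rate-limited,
#   1 = generic provider error, 0 = no failure pattern found (assumed).
classify_probe_result() {
    local probe_result="$1"
    if printf '%s\n' "$probe_result" | grep -qiE 'CreditsError|Insufficient balance'; then
        return 3
    fi
    if printf '%s\n' "$probe_result" | grep -qiE 'Quota protection|over[_ -]?usage|quota reset|429|too many requests|rate.limit'; then
        return 2
    fi
    if printf '%s\n' "$probe_result" | grep -qiE 'endpoints failed|"status":[[:space:]]*503|HTTP 503|503 Service|service unavailable'; then
        return 1
    fi
    return 0
}
```

The order of the checks decides ties: output that mentions both a credits error and a 429 is classified as 3, the code that needs human action. Note that the bare `429` alternative matches any output containing those digits, so a stricter pattern may be worth considering if probe output can include unrelated numbers.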

gemini-code-assist · 2026-02-10T17:53:32Z

.agents/scripts/supervisor-helper.sh

+        case "$health_exit" in
+            2) log_warn "Provider rate-limited for $task_id review-fix — deferring to next pulse" ;;
+            3) log_error "API key invalid/credits exhausted for $task_id review-fix" ;;
+            *) log_error "Provider unavailable for $task_id review-fix — deferring" ;;
+        esac


This case statement uses a compact, single-line format for each branch. For consistency with other case statements in this file and for better readability, please format it using multiple lines.

Suggested change

-        case "$health_exit" in
-            2) log_warn "Provider rate-limited for $task_id review-fix — deferring to next pulse" ;;
-            3) log_error "API key invalid/credits exhausted for $task_id review-fix" ;;
-            *) log_error "Provider unavailable for $task_id review-fix — deferring" ;;
-        esac
+        case "$health_exit" in
+            2)
+                log_warn "Provider rate-limited for $task_id review-fix — deferring to next pulse"
+                ;;
+            3)
+                log_error "API key invalid/credits exhausted for $task_id review-fix"
+                ;;
+            *)
+                log_error "Provider unavailable for $task_id review-fix — deferring"
+                ;;
+        esac

Fixed in ed1fc41: Reformatted the compact single-line case branches to multi-line format, consistent with the rest of the file (e.g., the dispatch_worker case statement at line ~4615).

github-actions · 2026-02-10T18:01:00Z

🔍 Code Quality Report

[MONITOR] Code Review Monitoring Report

[INFO] Latest Quality Status:
SonarCloud: 0 bugs, 0 vulnerabilities, 36 code smells

[INFO] Recent monitoring activity:
Tue Feb 10 18:00:55 UTC 2026: Code review monitoring started
Tue Feb 10 18:00:56 UTC 2026: SonarCloud - Bugs: 0, Vulnerabilities: 0, Code Smells: 36

📈 Current Quality Metrics

BUGS: 0
CODE SMELLS: 36
VULNERABILITIES: 0

Generated on: Tue Feb 10 18:00:58 UTC 2026

Generated by AI DevOps Framework Code Review Monitoring

sonarqubecloud · 2026-02-10T18:01:20Z

Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarQube Cloud

gemini-code-assist bot reviewed Feb 10, 2026

fix: address review feedback for t233

ed1fc41

marcusquinn merged commit 0516cfb into main Feb 10, 2026
18 checks passed

marcusquinn deleted the feature/t233 branch February 10, 2026 18:05

marcusquinn added a commit that referenced this pull request Feb 10, 2026

chore: mark t233 complete in TODO.md (#959)

e2f6697
