-
Notifications
You must be signed in to change notification settings - Fork 5
feat: model availability checker - probe before dispatch (t132.3) #770
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
- New model-availability-helper.sh: lightweight HTTP probes (~1-2s vs ~8-15s CLI) - Tests API key validity, rate limits, model availability per provider - SQLite cache with short TTL (5min health, 60s rate limits) - Tier resolution with automatic cross-provider fallback - Integrated into supervisor resolve_model() and check_model_health() - Direct HTTP /models endpoint probes over CLI invocations - Bash 3.2-compatible patterns (no associative arrays) for macOS - Test suite: 29 tests (20 pass, 9 skip without keys, 0 fail)
🔍 Code Quality Report�[0;35m[MONITOR]�[0m Code Review Monitoring Report �[0;34m[INFO]�[0m Latest Quality Status: �[0;34m[INFO]�[0m Recent monitoring activity: 📈 Current Quality Metrics
Generated on: Mon Feb 9 15:00:52 UTC 2026 Generated by AI DevOps Framework Code Review Monitoring |
|
Summary of ChangesHello @marcusquinn, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request introduces a model availability checker to proactively verify the health and availability of AI model providers before dispatching tasks. This enhancement reduces unnecessary token usage, improves task execution speed, and provides a more robust and reliable model selection process within the AI DevOps framework. Highlights
Changelog
Activity
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request introduces a significant performance improvement by adding a model-availability-helper.sh script for fast, cached health checks, and integrating it into the supervisor. The implementation is thorough and includes a new test suite, which is excellent. My review has identified a critical security vulnerability (SQL injection) in the new helper script that must be addressed. I've also found a few high-severity issues related to argument parsing and test isolation, and a medium-severity issue concerning error handling. The overall architecture and integration with the existing supervisor logic are well-designed.
| _record_health() { | ||
| local provider="$1" | ||
| local status="$2" | ||
| local http_code="$3" | ||
| local duration_ms="$4" | ||
| local error_msg="$5" | ||
| local models_count="$6" | ||
|
|
||
| db_query " | ||
| INSERT INTO provider_health (provider, status, http_code, response_ms, error_message, models_count, checked_at, ttl_seconds) | ||
| VALUES ( | ||
| '$(sql_escape "$provider")', | ||
| '$(sql_escape "$status")', | ||
| $http_code, | ||
| $duration_ms, | ||
| '$(sql_escape "$error_msg")', | ||
| $models_count, | ||
| strftime('%Y-%m-%dT%H:%M:%SZ', 'now'), | ||
| $DEFAULT_HEALTH_TTL | ||
| ) | ||
| ON CONFLICT(provider) DO UPDATE SET | ||
| status = excluded.status, | ||
| http_code = excluded.http_code, | ||
| response_ms = excluded.response_ms, | ||
| error_message = excluded.error_message, | ||
| models_count = excluded.models_count, | ||
| checked_at = excluded.checked_at, | ||
| ttl_seconds = excluded.ttl_seconds; | ||
| " || true | ||
| return 0 | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function is vulnerable to SQL injection. It constructs an SQL query by embedding variables directly into the string. This is unsafe and violates the repository style guide (line 25), which recommends parameterized queries. An attacker who can control variables like provider or status could execute arbitrary SQL. This vulnerability is present in most database interactions in this script.
_record_health() {
local provider="$1"
local status="$2"
local http_code="$3"
local duration_ms="$4"
local error_msg="$5"
local models_count="$6"
sqlite3 "$AVAILABILITY_DB" >/dev/null 2>&1 <<EOF || true
PRAGMA busy_timeout=5000;
.parameter set @provider "$provider"
.parameter set @status "$status"
.parameter set @http_code $http_code
.parameter set @duration_ms $duration_ms
.parameter set @error_msg "$error_msg"
.parameter set @models_count $models_count
.parameter set @ttl $DEFAULT_HEALTH_TTL
INSERT INTO provider_health (provider, status, http_code, response_ms, error_message, models_count, checked_at, ttl_seconds)
VALUES (@provider, @status, @http_code, @duration_ms, @error_msg, @models_count, strftime('%Y-%m-%dT%H:%M:%SZ', 'now'), @ttl)
ON CONFLICT(provider) DO UPDATE SET
status = excluded.status,
http_code = excluded.http_code,
response_ms = excluded.response_ms,
error_message = excluded.error_message,
models_count = excluded.models_count,
checked_at = excluded.checked_at,
ttl_seconds = excluded.ttl_seconds;
EOF
return 0
}References
- Use parameterized queries to prevent SQL injection. (link)
| # ============================================================================= | ||
|
|
||
| readonly AVAILABILITY_DIR="${HOME}/.aidevops/.agent-workspace" | ||
| readonly AVAILABILITY_DB="${AVAILABILITY_DIR}/model-availability.db" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test script tests/test-model-availability.sh attempts to override the database path for testing via the AVAILABILITY_DB_OVERRIDE environment variable. However, this script does not respect that variable, causing the tests to run against the production database (~/.aidevops/.agent-workspace/model-availability.db). This can lead to test flakiness and pollution of the real cache.
| readonly AVAILABILITY_DB="${AVAILABILITY_DIR}/model-availability.db" | |
| readonly AVAILABILITY_DB="${AVAILABILITY_DB_OVERRIDE:-${AVAILABILITY_DIR}/model-availability.db}" |
| --force) force=true; shift ;; | ||
| --quiet) quiet=true; shift ;; | ||
| --json) json_flag=true; shift ;; | ||
| --ttl) custom_ttl="${2:-}"; shift 2 ;; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The argument parsing for --ttl is unsafe. shift 2 is called without checking if a value for the option was provided. If a user runs the command with --ttl as the last argument, set -e will cause the script to fail.
| --ttl) custom_ttl="${2:-}"; shift 2 ;; | |
| --ttl) | |
| if [[ -z "${2:-}" ]]; then | |
| print_error "Error: --ttl option requires an argument." | |
| return 1 | |
| fi | |
| custom_ttl="$2"; shift 2 ;; |
| db_query() { | ||
| local query="$1" | ||
| sqlite3 -cmd ".timeout 5000" "$AVAILABILITY_DB" "$query" 2>/dev/null | ||
| return $? | ||
| } | ||
|
|
||
| db_query_json() { | ||
| local query="$1" | ||
| sqlite3 -cmd ".timeout 5000" -json "$AVAILABILITY_DB" "$query" 2>/dev/null | ||
| return $? | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The db_query and db_query_json functions suppress all stderr output from sqlite3 using 2>/dev/null. This can hide important errors (e.g., SQL syntax errors, file permissions issues) and makes debugging difficult. This goes against the style guide's recommendation (line 50) to avoid blanket error suppression.
| db_query() { | |
| local query="$1" | |
| sqlite3 -cmd ".timeout 5000" "$AVAILABILITY_DB" "$query" 2>/dev/null | |
| return $? | |
| } | |
| db_query_json() { | |
| local query="$1" | |
| sqlite3 -cmd ".timeout 5000" -json "$AVAILABILITY_DB" "$query" 2>/dev/null | |
| return $? | |
| } | |
| db_query() { | |
| local query="$1" | |
| sqlite3 -cmd ".timeout 5000" "$AVAILABILITY_DB" "$query" | |
| return $? | |
| } | |
| db_query_json() { | |
| local query="$1" | |
| sqlite3 -cmd ".timeout 5000" -json "$AVAILABILITY_DB" "$query" | |
| return $? | |
| } |
References
- Avoid blanket error suppression. (link)
All 8 subtasks of t132 (Cross-Provider Model Routing) are now complete: - t132.1: Model-specific subagents (PR #758) - t132.2: Provider/model registry (PR #761) - t132.3: Model availability checker (PR #770) - t132.4: Fallback chain config (PR #781) - t132.5: Supervisor model resolution (PR #787) - t132.6: Quality gate with escalation (PR #788) - t132.7: Multi-provider runner/cron support (PR #789) - t132.8: Cross-model review workflow (PR #791) Also fixed stale git conflict markers in TODO.md.



Summary
model-availability-helper.sh: Lightweight HTTP probes (~1-2s) to check provider health, API key validity, rate limits, and model availability before dispatch — replacing the slow CLI probe path (~8-15s that burns tokens)resolve_model()andcheck_model_health()now use the availability helper as a fast path, with the existing CLI probe retained as a slow-path fallbackDetails
model-availability-helper.sh (~1150 lines)
check,probe,status,rate-limits,resolve,invalidate,help~/.aidevops/.agent-workspace/model-availability.db(5min health TTL, 60s rate limit TTL)/modelsendpoints (free, fast)Supervisor changes
resolve_model(): Added availability-helper fast path before static defaultscheck_model_health(): Two-tier probe — fast HTTP path, slow CLI fallbackDesign decisions
check_model_health()spawned a fullopencode runsession (~8-15s, burns tokens). New helper calls/modelsendpoints directly via curl (~1-2s, free)casestatement functions instead ofdeclare -Aassociative arraysTest results
(9 skipped = tier resolution tests that require API keys, gracefully handled)
Smoke tests: 373 total, 309 passed, 0 failed — no regressions.