Skip to content

Conversation

@marcusquinn
Copy link
Owner

@marcusquinn marcusquinn commented Feb 7, 2026

Summary

  • Bare 503 pattern in check_model_health() matched inside JSON timestamps (e.g. 1770503633269), causing every health probe to report provider error even when the model responded correctly
  • This blocked all supervisor dispatch — no tasks could be dispatched because the health gate always failed
  • Replaced with anchored patterns (HTTP 503, 503 Service, "status": 503) that only match actual HTTP errors
  • Also tightened over.*usage to over[_ -]?usage to prevent greedy matching across JSON fields

Root Cause

opencode run --format json returns JSON lines with Unix timestamps. The timestamp 1770503633269 contains the substring 503, which the old grep -qi '503' pattern matched.

Testing

5 test cases verified:

  • JSON timestamp 1770503... → no match (fixed)
  • HTTP 503 Service Unavailable → match (correct)
  • All Antigravity endpoints failed → match (correct)
  • Quota protection: over usage → match (correct)
  • Full opencode JSON response → no match (fixed)

Summary by CodeRabbit

  • Bug Fixes
    • Improved system health monitoring reliability by refining the failure detection pattern. The change reduces false positives that could occur when parsing JSON monitoring data, while simultaneously expanding detection to capture a broader range of failure conditions including quota constraints, HTTP errors, and service unavailability scenarios.

The bare '503' pattern in check_model_health() matched inside JSON
timestamps (e.g. 1770503633269), causing every health probe to fail
with 'provider error detected' even when the model responded correctly.

Replaced with anchored patterns ('HTTP 503', '503 Service',
'"status": 503') that only match actual HTTP 503 errors. Also
tightened 'over.*usage' to 'over[_ -]?usage' to prevent greedy
matching across JSON fields.

Verified: 5 test cases (2 true positives, 3 true negatives).
@gemini-code-assist
Copy link

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Feb 7, 2026

Walkthrough

Enhanced the model health probe failure detection pattern in supervisor-helper.sh to use more robust regex with anchored patterns and multiple failure indicators (HTTP 503, quota issues, service unavailable). This reduces false positives from JSON payloads while maintaining existing error detection behavior.

Changes

Cohort / File(s) Summary
Health Probe Pattern Enhancement
.agents/scripts/supervisor-helper.sh
Upgraded regex pattern for model health probe failure detection from brittle plain string grep to anchored regex with alternative patterns. Added support for broader failure indicators including quota protection, usage limits, HTTP 503, and service unavailable messages while preventing false matches within JSON fields.

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~3 minutes

Possibly related PRs

Poem

🔍 A healthier probe with sharper eyes,
Regex anchored to spot false cries,
JSON fields no longer deceive,
Quotas and errors we'll now perceive! ✨

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately describes the main fix: replacing a bare '503' pattern that falsely matched JSON timestamps with anchored patterns for actual HTTP 503 errors.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch bugfix/health-check-503-false-positive

No actionable comments were generated in the recent review. 🎉


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions
Copy link

github-actions bot commented Feb 7, 2026

🔍 Code Quality Report

�[0;35m[MONITOR]�[0m Code Review Monitoring Report

�[0;34m[INFO]�[0m Latest Quality Status:
SonarCloud: 0 bugs, 0 vulnerabilities, 31 code smells

�[0;34m[INFO]�[0m Recent monitoring activity:
Sat Feb 7 22:54:01 UTC 2026: Code review monitoring started
Sat Feb 7 22:54:02 UTC 2026: SonarCloud - Bugs: 0, Vulnerabilities: 0, Code Smells: 31
Sat Feb 7 22:54:02 UTC 2026: Qlty - 0 issues found, auto-formatting applied
Sat Feb 7 22:54:04 UTC 2026: Codacy analysis completed with auto-fixes

📈 Current Quality Metrics

  • BUGS: 0
  • CODE SMELLS: 31
  • VULNERABILITIES: 0

Generated on: Sat Feb 7 22:54:04 UTC 2026


Generated by AI DevOps Framework Code Review Monitoring

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
TODO.md (1)

58-64: ⚠️ Potential issue | 🟡 Minor

Task hierarchy inconsistency: parent marked complete while subtasks remain pending.

Task t148 is marked as completed ([x] with completed:2026-02-07), but all six subtasks (t148.1 through t148.6) remain uncompleted ([ ]). This violates task tracking hierarchy rules where a parent task should only be marked complete when all its subtasks are done.

Either the subtasks should be marked complete if the work is finished, or the parent task should remain uncompleted until the subtasks are addressed.

🔧 Proposed fix

Option 1: If subtasks are actually complete, mark them:

-- [ ] t148.1 Add check_review_threads() to fetch unresolved threads via GraphQL ~1h blocked-by:none
-- [ ] t148.2 Add triage_review_feedback() to classify threads by severity ~1.5h blocked-by:t148.1
-- [ ] t148.3 Add review_triage state to supervisor state machine ~30m blocked-by:t148.1
-- [ ] t148.4 Modify cmd_pr_lifecycle to include triage before merge ~1h blocked-by:t148.2,t148.3
-- [ ] t148.5 Add worker dispatch for fixing valid review feedback ~1.5h blocked-by:t148.4
-- [ ] t148.6 Add --skip-review-triage emergency bypass flag ~15m blocked-by:t148.4
+- [x] t148.1 Add check_review_threads() to fetch unresolved threads via GraphQL ~1h blocked-by:none completed:2026-02-07
+- [x] t148.2 Add triage_review_feedback() to classify threads by severity ~1.5h blocked-by:t148.1 completed:2026-02-07
+- [x] t148.3 Add review_triage state to supervisor state machine ~30m blocked-by:t148.1 completed:2026-02-07
+- [x] t148.4 Modify cmd_pr_lifecycle to include triage before merge ~1h blocked-by:t148.2,t148.3 completed:2026-02-07
+- [x] t148.5 Add worker dispatch for fixing valid review feedback ~1.5h blocked-by:t148.4 completed:2026-02-07
+- [x] t148.6 Add --skip-review-triage emergency bypass flag ~15m blocked-by:t148.4 completed:2026-02-07

Option 2: If subtasks are not complete, unmark the parent:

-- [x] t148 Supervisor: add review-triage phase before PR merge `#plan` `#orchestration` `#quality` → [todo/PLANS.md] ~6h (ai:4h test:1.5h read:30m) logged:2026-02-07 ref:GH#437 completed:2026-02-07
+- [ ] t148 Supervisor: add review-triage phase before PR merge `#plan` `#orchestration` `#quality` → [todo/PLANS.md] ~6h (ai:4h test:1.5h read:30m) logged:2026-02-07 ref:GH#437
🤖 Fix all issues with AI agents
In @.agents/scripts/supervisor-helper.sh:
- Around line 1810-1813: The grep pattern in the probe check that inspects
"$probe_result" uses the non-portable escape \s* (see the if ... grep -qiE
line); change that to the POSIX-portable character class [[:space:]]* so the
pattern becomes portable on BSD/macOS and GNU grep while preserving the intended
matching of optional whitespace around the "status" JSON field.

Address CodeRabbit review: \s is GNU grep only, [[:space:]] works
on macOS/BSD/Linux. Also resolves merge conflict with main.
@github-actions
Copy link

github-actions bot commented Feb 7, 2026

🔍 Code Quality Report

�[0;35m[MONITOR]�[0m Code Review Monitoring Report

�[0;34m[INFO]�[0m Latest Quality Status:
SonarCloud: 0 bugs, 0 vulnerabilities, 31 code smells

�[0;34m[INFO]�[0m Recent monitoring activity:
Sat Feb 7 23:14:37 UTC 2026: Code review monitoring started
Sat Feb 7 23:14:37 UTC 2026: SonarCloud - Bugs: 0, Vulnerabilities: 0, Code Smells: 31
Sat Feb 7 23:14:37 UTC 2026: Qlty - 0 issues found, auto-formatting applied
Sat Feb 7 23:14:40 UTC 2026: Codacy analysis completed with auto-fixes

📈 Current Quality Metrics

  • BUGS: 0
  • CODE SMELLS: 31
  • VULNERABILITIES: 0

Generated on: Sat Feb 7 23:14:40 UTC 2026


Generated by AI DevOps Framework Code Review Monitoring

@sonarqubecloud
Copy link

sonarqubecloud bot commented Feb 7, 2026

@marcusquinn marcusquinn merged commit 2ccc604 into main Feb 7, 2026
11 checks passed
marcusquinn added a commit that referenced this pull request Feb 7, 2026
…agnostic subtask newline parsing

Bug 1: evaluate_worker() checked backend_error_count before exit_code==0,
causing tasks with clean exits but backend error strings in log content
to be classified as retry:backend_infrastructure_error. Moved backend
error check inside the non-zero exit code guard (Tier 2 heuristics).
Also moved backend_error_count grep to tail-only search with anchored
503 pattern (consistent with health check fix in PR #488).

Bug 2: create_diagnostic_subtask() embedded raw log tail (with newlines)
into the task description stored in SQLite. When cmd_next returned this
as tab-separated output, embedded newlines caused EXIT:0 and
DIAGNOSTIC_CONTEXT_END to be parsed as separate task IDs. Fixed by
replacing newlines/tabs with spaces in failure_context before storage.

Defense: Added task ID validation guard in both pulse dispatch loops to
skip malformed IDs (empty, containing spaces/colons, non-alphanumeric).
marcusquinn added a commit that referenced this pull request Feb 7, 2026
…agnostic subtask newline parsing (#490)

Bug 1: evaluate_worker() checked backend_error_count before exit_code==0,
causing tasks with clean exits but backend error strings in log content
to be classified as retry:backend_infrastructure_error. Moved backend
error check inside the non-zero exit code guard (Tier 2 heuristics).
Also moved backend_error_count grep to tail-only search with anchored
503 pattern (consistent with health check fix in PR #488).

Bug 2: create_diagnostic_subtask() embedded raw log tail (with newlines)
into the task description stored in SQLite. When cmd_next returned this
as tab-separated output, embedded newlines caused EXIT:0 and
DIAGNOSTIC_CONTEXT_END to be parsed as separate task IDs. Fixed by
replacing newlines/tabs with spaces in failure_context before storage.

Defense: Added task ID validation guard in both pulse dispatch loops to
skip malformed IDs (empty, containing spaces/colons, non-alphanumeric).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant