fix(supervisor): health check 503 false positive on JSON timestamps #488

marcusquinn · 2026-02-07T22:53:30Z

Summary

- A bare `503` pattern in `check_model_health()` matched inside JSON timestamps (e.g. `1770503633269`), causing every health probe to report a provider error even when the model responded correctly.
- This blocked all supervisor dispatch: no tasks could be dispatched because the health gate always failed.
- Replaced it with anchored patterns (`HTTP 503`, `503 Service`, `"status": 503`) that only match actual HTTP errors.
- Also tightened `over.*usage` to `over[_ -]?usage` to prevent greedy matching across JSON fields.

Root Cause

`opencode run --format json` returns JSON lines with Unix timestamps. The timestamp `1770503633269` contains the substring `503`, which the old `grep -qi '503'` pattern matched.
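The false positive can be reproduced outside the supervisor. This is a minimal sketch: the JSON line and the function name `old_pattern_matches` are hypothetical, and only the timestamp value and the `grep -qi '503'` check come from the description above.

```shell
# Hypothetical JSON line; only the timestamp value comes from the PR description.
sample='{"type":"step_start","timestamp":1770503633269}'

# Same check as the old probe: case-insensitive search for a bare 503.
old_pattern_matches() {
  printf '%s\n' "$1" | grep -qi '503'
}

if old_pattern_matches "$sample"; then
  echo "provider error detected (false positive)"
fi

# Show which substring grep latched onto: the digits inside the timestamp.
printf '%s\n' "$sample" | grep -o '503'
```

Because grep matches substrings by default, any numeric field that happens to contain those three digits is enough to trip the gate.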

Testing

5 test cases verified:

- JSON timestamp `1770503...` → no match (fixed)
- `HTTP 503 Service Unavailable` → match (correct)
- `All Antigravity endpoints failed` → match (correct)
- `Quota protection: over usage` → match (correct)
- Full opencode JSON response → no match (fixed)
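The five cases can be approximated with a small table-driven check. This is a sketch under assumptions: `FAILURE_PATTERN` is assembled from the alternatives named in this PR and is not copied from `supervisor-helper.sh`, the helper names `is_provider_error` and `check` are hypothetical, and the JSON inputs are made-up stand-ins for real opencode output.

```shell
# Assumed pattern assembled from the PR description; not copied from supervisor-helper.sh.
FAILURE_PATTERN='endpoints failed|quota protection|over[_ -]?usage|HTTP 503|503 Service|"status":[[:space:]]*503'

# Returns 0 when the probe output looks like a provider failure.
is_provider_error() {
  printf '%s\n' "$1" | grep -qiE "$FAILURE_PATTERN"
}

# Usage: check <expected: match|clean> <input>
check() {
  if is_provider_error "$2"; then actual=match; else actual=clean; fi
  if [ "$actual" = "$1" ]; then echo "PASS: $2"; else echo "FAIL: $2"; fi
}

check clean '{"timestamp":1770503633269}'
check match 'HTTP 503 Service Unavailable'
check match 'All Antigravity endpoints failed'
check match 'Quota protection: over usage'
check clean '{"type":"text","timestamp":1770503633269,"part":{"text":"OK"}}'
```

Keeping the cases next to the pattern makes it cheap to add a new negative case whenever another JSON field turns out to collide with an error keyword.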

Summary by CodeRabbit

Bug Fixes
- Improved system health monitoring reliability by refining the failure detection pattern. The change reduces false positives that could occur when parsing JSON monitoring data, while simultaneously expanding detection to capture a broader range of failure conditions including quota constraints, HTTP errors, and service unavailability scenarios.

The bare '503' pattern in check_model_health() matched inside JSON timestamps (e.g. 1770503633269), causing every health probe to fail with 'provider error detected' even when the model responded correctly. Replaced with anchored patterns ('HTTP 503', '503 Service', '"status": 503') that only match actual HTTP 503 errors. Also tightened 'over.*usage' to 'over[_ -]?usage' to prevent greedy matching across JSON fields. Verified: 5 test cases (3 true positives, 2 true negatives).

gemini-code-assist · 2026-02-07T22:53:34Z

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

coderabbitai · 2026-02-07T22:53:57Z

Walkthrough

Enhanced the model health probe failure detection pattern in supervisor-helper.sh to use more robust regex with anchored patterns and multiple failure indicators (HTTP 503, quota issues, service unavailable). This reduces false positives from JSON payloads while maintaining existing error detection behavior.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Health Probe Pattern Enhancement `.agents/scripts/supervisor-helper.sh` | Upgraded regex pattern for model health probe failure detection from brittle plain string grep to anchored regex with alternative patterns. Added support for broader failure indicators including quota protection, usage limits, HTTP 503, and service unavailable messages while preventing false matches within JSON fields. |

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~3 minutes

Possibly related PRs

fix: supervisor self-healing -- macOS timeout, PR detection, model names, stale PID cleanup #429: Also modifies .agents/scripts/supervisor-helper.sh health probe logic, including timeout and failure detection behavior adjustments.

Poem

🔍 A healthier probe with sharper eyes,
Regex anchored to spot false cries,
JSON fields no longer deceive,
Quotas and errors we'll now perceive! ✨

🚥 Pre-merge checks | ✅ 3

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately describes the main fix: replacing a bare '503' pattern that falsely matched JSON timestamps with anchored patterns for actual HTTP 503 errors. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |


No actionable comments were generated in the recent review. 🎉


github-actions · 2026-02-07T22:54:06Z

🔍 Code Quality Report

[MONITOR] Code Review Monitoring Report

[INFO] Latest Quality Status:
SonarCloud: 0 bugs, 0 vulnerabilities, 31 code smells

[INFO] Recent monitoring activity:
Sat Feb 7 22:54:01 UTC 2026: Code review monitoring started
Sat Feb 7 22:54:02 UTC 2026: SonarCloud - Bugs: 0, Vulnerabilities: 0, Code Smells: 31
Sat Feb 7 22:54:02 UTC 2026: Qlty - 0 issues found, auto-formatting applied
Sat Feb 7 22:54:04 UTC 2026: Codacy analysis completed with auto-fixes

📈 Current Quality Metrics

BUGS: 0
CODE SMELLS: 31
VULNERABILITIES: 0

Generated on: Sat Feb 7 22:54:04 UTC 2026

Generated by AI DevOps Framework Code Review Monitoring

coderabbitai

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

TODO.md (1)

58-64: ⚠️ Potential issue | 🟡 Minor

Task hierarchy inconsistency: parent marked complete while subtasks remain pending.

Task t148 is marked as completed ([x] with completed:2026-02-07), but all six subtasks (t148.1 through t148.6) remain uncompleted ([ ]). This violates task tracking hierarchy rules where a parent task should only be marked complete when all its subtasks are done.

Either the subtasks should be marked complete if the work is finished, or the parent task should remain uncompleted until the subtasks are addressed.

🔧 Proposed fix

Option 1: If subtasks are actually complete, mark them:

-- [ ] t148.1 Add check_review_threads() to fetch unresolved threads via GraphQL ~1h blocked-by:none
-- [ ] t148.2 Add triage_review_feedback() to classify threads by severity ~1.5h blocked-by:t148.1
-- [ ] t148.3 Add review_triage state to supervisor state machine ~30m blocked-by:t148.1
-- [ ] t148.4 Modify cmd_pr_lifecycle to include triage before merge ~1h blocked-by:t148.2,t148.3
-- [ ] t148.5 Add worker dispatch for fixing valid review feedback ~1.5h blocked-by:t148.4
-- [ ] t148.6 Add --skip-review-triage emergency bypass flag ~15m blocked-by:t148.4
+- [x] t148.1 Add check_review_threads() to fetch unresolved threads via GraphQL ~1h blocked-by:none completed:2026-02-07
+- [x] t148.2 Add triage_review_feedback() to classify threads by severity ~1.5h blocked-by:t148.1 completed:2026-02-07
+- [x] t148.3 Add review_triage state to supervisor state machine ~30m blocked-by:t148.1 completed:2026-02-07
+- [x] t148.4 Modify cmd_pr_lifecycle to include triage before merge ~1h blocked-by:t148.2,t148.3 completed:2026-02-07
+- [x] t148.5 Add worker dispatch for fixing valid review feedback ~1.5h blocked-by:t148.4 completed:2026-02-07
+- [x] t148.6 Add --skip-review-triage emergency bypass flag ~15m blocked-by:t148.4 completed:2026-02-07

Option 2: If subtasks are not complete, unmark the parent:

-- [x] t148 Supervisor: add review-triage phase before PR merge `#plan` `#orchestration` `#quality` → [todo/PLANS.md] ~6h (ai:4h test:1.5h read:30m) logged:2026-02-07 ref:GH#437 completed:2026-02-07
+- [ ] t148 Supervisor: add review-triage phase before PR merge `#plan` `#orchestration` `#quality` → [todo/PLANS.md] ~6h (ai:4h test:1.5h read:30m) logged:2026-02-07 ref:GH#437
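The hierarchy rule described in this comment could be checked mechanically. The sketch below is hypothetical and not part of the repository: the function name `check_todo_hierarchy` is invented, and it assumes top-level lines of the form `- [x] tNNN ...` with subtasks named `tNNN.M`, as in the excerpt above.

```shell
# Hypothetical checker: report parents marked [x] that still have [ ] subtasks.
# Reads TODO.md-style lines from the given files, or from stdin when none are given.
check_todo_hierarchy() {
  awk '
    /^- \[[ x]\] t[0-9]/ {
      state = substr($0, 4, 1)            # "x" when complete, " " when pending
      split(substr($0, 7), parts, " ")    # text after the "- [x] " prefix
      id = parts[1]
      parent = id
      sub(/\..*/, "", parent)             # t148.1 -> t148
      if (id == parent) {
        if (state == "x") done[id] = 1
      } else if (state == " ") {
        open[parent]++
      }
    }
    END {
      for (p in done)
        if (open[p] > 0)
          printf "%s: complete with %d open subtasks\n", p, open[p]
    }
  ' "$@"
}
```

Run as `check_todo_hierarchy TODO.md`; it prints nothing when every completed parent has only completed subtasks.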

🤖 Fix all issues with AI agents

In @.agents/scripts/supervisor-helper.sh:
- Around line 1810-1813: The grep pattern in the probe check that inspects
"$probe_result" uses the non-portable escape \s* (see the if ... grep -qiE
line); change that to the POSIX-portable character class [[:space:]]* so the
pattern becomes portable on BSD/macOS and GNU grep while preserving the intended
matching of optional whitespace around the "status" JSON field.

.agents/scripts/supervisor-helper.sh

Address CodeRabbit review: \s is GNU grep only, [[:space:]] works on macOS/BSD/Linux. Also resolves merge conflict with main.
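A short sketch of the portable form follows. The function name `has_status_503` is hypothetical; the pattern fragment is the `"status"` alternative described in this PR, written with the POSIX character class rather than `\s`.

```shell
# Portable: [[:space:]] is a POSIX character class, so it does not rely on
# the \s shorthand that the review flagged as GNU-specific.
has_status_503() {
  printf '%s\n' "$1" | grep -qE '"status":[[:space:]]*503'
}

has_status_503 '{"status": 503}' && echo "spaced: match"
has_status_503 '{"status":503}' && echo "compact: match"
has_status_503 '{"status": 200, "ts": 1770503633269}' || echo "timestamp only: no match"
```

The `*` quantifier keeps the whitespace optional, so both compact and pretty-printed JSON are covered.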

github-actions · 2026-02-07T23:14:41Z

🔍 Code Quality Report

[MONITOR] Code Review Monitoring Report

[INFO] Latest Quality Status:
SonarCloud: 0 bugs, 0 vulnerabilities, 31 code smells

[INFO] Recent monitoring activity:
Sat Feb 7 23:14:37 UTC 2026: Code review monitoring started
Sat Feb 7 23:14:37 UTC 2026: SonarCloud - Bugs: 0, Vulnerabilities: 0, Code Smells: 31
Sat Feb 7 23:14:37 UTC 2026: Qlty - 0 issues found, auto-formatting applied
Sat Feb 7 23:14:40 UTC 2026: Codacy analysis completed with auto-fixes

📈 Current Quality Metrics

BUGS: 0
CODE SMELLS: 31
VULNERABILITIES: 0

Generated on: Sat Feb 7 23:14:40 UTC 2026

Generated by AI DevOps Framework Code Review Monitoring

sonarqubecloud · 2026-02-07T23:15:08Z

Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code


…agnostic subtask newline parsing

Bug 1: evaluate_worker() checked backend_error_count before exit_code==0, causing tasks with clean exits but backend error strings in log content to be classified as retry:backend_infrastructure_error. Moved backend error check inside the non-zero exit code guard (Tier 2 heuristics). Also moved backend_error_count grep to tail-only search with anchored 503 pattern (consistent with health check fix in PR #488).

Bug 2: create_diagnostic_subtask() embedded raw log tail (with newlines) into the task description stored in SQLite. When cmd_next returned this as tab-separated output, embedded newlines caused EXIT:0 and DIAGNOSTIC_CONTEXT_END to be parsed as separate task IDs. Fixed by replacing newlines/tabs with spaces in failure_context before storage.

Defense: Added task ID validation guard in both pulse dispatch loops to skip malformed IDs (empty, containing spaces/colons, non-alphanumeric).
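The Bug 2 fix and the defensive guard can be sketched roughly as below. Both helpers are hypothetical illustrations, not the code in `supervisor-helper.sh`; in particular, allowing dots, underscores, and hyphens in IDs is an assumption based on IDs such as `t148.1` appearing in this thread.

```shell
# Hypothetical helper: turn newlines and tabs into spaces so the text
# stays on a single row of tab-separated output.
sanitize_context() {
  printf '%s' "$1" | tr '\n\t' '  '
}

# Hypothetical guard: reject empty IDs and IDs containing anything other
# than letters, digits, dots, underscores, or hyphens (assumed ID alphabet).
is_valid_task_id() {
  case "$1" in
    ''|*[!A-Za-z0-9._-]*) return 1 ;;
    *) return 0 ;;
  esac
}

raw=$(printf 'worker failed\nEXIT:0\tDIAGNOSTIC_CONTEXT_END')
sanitize_context "$raw"; echo

for id in t148.1 'EXIT:0' '' 'bad id'; do
  if is_valid_task_id "$id"; then echo "dispatch: $id"; else echo "skip: '$id'"; fi
done
```

Sanitizing at write time addresses the cause, while the ID guard limits the damage if a malformed row reaches the dispatch loop by another route.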


marcusquinn added 3 commits February 7, 2026 22:44

chore: mark t147.7 blocked in TODO.md

3b1c722

chore: mark t148 complete in TODO.md (no_pr)

be3d7fe

coderabbitai bot requested changes Feb 7, 2026


.agents/scripts/supervisor-helper.sh (outdated, resolved)

fix: use POSIX [[:space:]] instead of \s for BSD grep portability

c26e11b


coderabbitai bot approved these changes Feb 7, 2026


marcusquinn merged commit 2ccc604 into main Feb 7, 2026
11 checks passed

marcusquinn mentioned this pull request Feb 7, 2026

fix(supervisor): evaluator misclassifies EXIT:0 as backend error + diagnostic subtask newline parsing #490

Merged
